Characterization of the human germ cell nuclear factor gene


BIOCHIMICA ET BIOPHYSICA ACTA

ELSEVIER

Biochimica et Biophysica Acta 1309 (1996) 179-182

BBA

Short sequence-paper

Ute Süsens, Uwe Borgmeyer *

Zentrum für Molekulare Neurobiologie, Universität Hamburg, Falkenried 94, D-20251 Hamburg, Germany

Received 28 June 1996; accepted 12 August 1996

Abstract

A cDNA clone encoding the germ cell nuclear factor (GCNF), a member of the nuclear receptor superfamily, has been isolated from the human embryonal carcinoma cell line NT2/D1. Sequencing of this clone reveals an open reading frame encoding a 476 amino acid protein. A comparison of the amino acid sequence of human GCNF with its mouse homologue shows only six amino acid exchanges in the whole protein and a deletion in the amino-terminal region. Northern blot analysis demonstrates that expression in the testis is conserved.

Keywords: Germ cell nuclear factor; Nuclear receptor superfamily; Expression; Testis; Neuronal

The receptors for steroid, retinoid, vitamin D, and thyroid hormones are intracellular transcription factors that bind to cis-acting elements of target genes and modulate gene expression in response to ligands [1-3]. They comprise a superfamily of nuclear receptors with a variable amino-terminal region, a highly conserved DNA-binding domain that contains eight cysteines coordinating zinc to form two zinc fingers, and a moderately conserved carboxy-terminal part that contains the ligand-binding and the DNA-independent dimerization functions. A growing number of related proteins have been isolated that possess these features but lack known ligands [4,5]. These orphan receptors may be constitutive transcription factors, exert their function as heterodimerization partners, or be targets for novel signaling molecules. The mouse nuclear receptor mGCNF belongs to the orphan nuclear receptors [6-8]. The retinoic acid

* Corresponding author. Fax: 01149 40 4717 5101. E-mail: [email protected],de

receptors and retinoid X receptors are among its closest homologues. In vitro translated mGCNF binds to DR-0, a direct repeat of the sequence -AGGTCA- without spacer nucleotides. It is expressed during gametogenesis and during early embryogenesis [6-8]. The expression pattern indicates that GCNF and its putative ligand are involved in aspects of development and differentiation. In view of these putative functions, it is important to analyze whether the structure and the expression of GCNF are conserved in other mammals. With respect to a possible link to human diseases, we decided to study GCNF in man. Here, we describe the cloning and sequencing of a human GCNF gene expressed in a neuronal precursor cell line. We further demonstrate, by Northern blot analysis of several endocrine tissues, high expression in the human testis.

Approximately 1 × 10⁶ plaques from an uninduced NT2/D1 neuronal precursor cell λ ZAP Express cDNA library (Stratagene) were screened for human homologues of the mouse GCNF. Filters were hybridized to a random-primed restriction fragment (2447 bp) that included the whole coding region of the mouse GCNF gene. Hybridization was carried out at

0167-4781/96/$15.00 Copyright © 1996 Elsevier Science B.V. All rights reserved. PII S0167-4781(96)00157-1

U. Süsens, U. Borgmeyer / Biochimica et Biophysica Acta 1309 (1996) 179-182


42°C in 42% formamide, 5 × Denhardt's solution, 5 × SSPE (1 × SSPE: 150 mM NaCl, 10 mM Na2HPO4, 1 mM EDTA, pH 7.4), 0.1% sodium dodecyl sulfate (SDS), 0.1 mg ml⁻¹ herring sperm DNA, 0.25 mg ml⁻¹ tRNA, 5% dextran sulfate. The final wash was 20 min at 65°C in 2 × SSC, 0.1% SDS (1 × SSC: 150 mM NaCl, 15 mM sodium citrate, pH 7.0). The screen yielded 45 positive plaques. Thirty-three positive clones were converted to plasmids by the automatic excision process. By partial sequencing of 18 clones from both ends, 14 could be identified as members of the nuclear receptor superfamily. Six clones revealed high homology to the probe. One of these clones, pBK-CMV hGCNF-1, was restriction-mapped and sequenced by the dideoxy method with a Thermo Sequenase fluorescent-labelled primer cycle sequencing kit (Amersham) on a Li-Cor automatic sequencing system. Sequences were analyzed with the University of Wisconsin Genetics Computer Group programs [9].

Fig. 1. Nucleotide sequence and predicted amino acid sequence of human GCNF-1. The putative start codon is at nucleotide 245. The core DNA-binding domain is boxed. The asterisk (*) marks the termination codon.

[Sequence listing not reproduced.]

Fig. 2. Sequence comparison of GCNF from human and mouse. A: Alignment of the amino acid sequence of hGCNF-1 (h) with mGCNF (m). A putative dimerization interface in the ligand-binding domain is boxed. The lengths of the proteins are indicated. The amino acids conserved with human RTR are indicated by an asterisk. The bars indicate the position of 15 additional amino acids found in the mouse protein. B: Alignment of the amino acids in the amino-terminal domain of mGCNF with the two hGCNF isoforms. The lower row shows a hypothetical splice product based on the exon structure of the mouse gene. Asterisks indicate sequence identities; bars represent gaps with respect to the published mouse sequence.

[Alignment not reproduced. Protein lengths shown in the figure: 476 (h) and 495 (m) amino acids.]

Fig. 3. Tissue specificity of GCNF expression. Poly(A)+ RNA from different human tissues was analyzed by Northern blot analysis using a 32P-labeled GCNF probe. Fujix exposure was for 16 h. Different endocrine tissues are as indicated. RNA size markers (in kb) are indicated at the left side; the approximate size of GCNF-specific transcripts is shown on the right side of the blot.

[Blot image not reproduced. Size markers at left: 9.5, 7.5, 4.4, 2.4 and 1.35 kb. Transcript sizes at right: 7.5 and 2.2 kb.]

Analysis of pBK-CMV hGCNF-1 revealed a 1916 bp insert and an open reading frame that starts at nucleotide 187 and terminates with a stop codon at nucleotide 1615. On this basis, the cDNA encodes a protein of 476 amino acids with a predicted relative molecular weight of 53 990 (Fig. 1). The amino acid sequences of hGCNF-1 and mGCNF are compared in Fig. 2A. The overall amino acid identity is 98.7% (91.6% identity at the DNA level, including the untranslated region). The amino-terminal domain exhibits only one exchange. This domain is 19 amino acids (57 bp)


shorter than the mouse protein. Partial sequencing from the amino terminus of the six independently isolated hGCNF cDNAs shows that three of them are 45 bp (15 amino acids) shorter and three are 57 bp shorter (Fig. 2B). Murine cDNAs for GCNF have been isolated from testis and embryonic libraries without any variation in the predicted protein sequence [6-8]. In mouse, the additional amino acids are encoded by a 45 bp exon, raising the possibility of alternative splicing in human cells (Fig. 2B; our unpublished results). The DNA-binding domains are identical. Therefore, the DNA-binding characteristics found for the mouse protein can be expected to apply to the human protein. The ligand-binding domains are 98.6% identical, without any insertion or deletion. A stretch of amino acids homologous to an α-helix that mediates dimerization in the retinoid X receptor [10] is identical to that of the mouse clone (Fig. 2A).

The NT2/D1 cell line is derived from a human teratocarcinoma, and these cells represent a committed neuronal precursor stage of differentiation [11,12]. The isolation of several GCNF clones from cells with many markers characteristic of neuroectodermal cells of the embryo suggests a conserved function of GCNF during embryonic development. The discovery of a cell line expressing GCNF will be important for the study of GCNF functions.

To analyze whether the expression during spermatogenesis is conserved, we performed a Northern blot analysis on poly(A)+ RNA from a variety of human endocrine tissues. A human multiple tissue Northern blot (denaturing formaldehyde 1.2% agarose gel) was purchased from Clontech. The blot contains 2 μg of poly(A)+ RNA from different endocrine tissues, including testis. It was hybridized to the radiolabelled cDNA insert of pBK-CMV hGCNF-1 in ExpressHyb buffer (Clontech) at 68°C, and the final washes were in 0.1 × SSC, 0.1% SDS at 50°C. Radioactivity was visualized using the Fujix BAS 2000 technology.

The full-length hGCNF probe hybridized strongly to a 7.5 kb and to a 2.2 kb transcript in testis. This result is reminiscent of the findings in mouse [6-8]. A weak signal at 7.5 kb can be detected in thyroid. Expression in mouse thyroid has not been investigated. However, high levels of the 7.5 kb message have been found in mouse embryos, and low levels in liver, kidney, lung and brain of the adult mouse [6,7].
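The sequence figures quoted in the text (open reading frame from nucleotide 187 to the stop codon at nucleotide 1615, 476 amino acids, six exchanges, 45 bp and 57 bp deletions) can be cross-checked with simple arithmetic. The following Python sketch is not part of the original work. It assumes 1-based nucleotide numbering and that both coordinates refer to the first base of the respective codon; the function names are illustrative only.

```python
def orf_codons(start_first_base, stop_first_base):
    """Count the sense codons between a start codon and a stop codon.

    Assumption: both arguments are 1-based positions of the first base
    of the codon, which is how the coordinates in the text are read here.
    """
    span = stop_first_base - start_first_base
    if span % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return span // 3


def percent_identity(aligned_length, exchanges):
    """Percent identity over aligned positions; gaps are not counted."""
    return 100.0 * (aligned_length - exchanges) / aligned_length


def deleted_residues(deleted_bp):
    """Convert an in-frame deletion length in bp to amino acid residues."""
    if deleted_bp % 3 != 0:
        raise ValueError("deletion is not in frame")
    return deleted_bp // 3


codons = orf_codons(187, 1615)        # coordinates quoted in the text
identity = percent_identity(476, 6)   # six exchanges, as in the abstract
print(codons, round(identity, 1))     # prints: 476 98.7
print(deleted_residues(45), deleted_residues(57))  # prints: 15 19
```

Under these assumptions the numbers are mutually consistent: the reading frame holds 476 codons, six exchanges in 476 residues correspond to 98.7% identity, and the 57 bp deletion accounts for the 19-residue difference between the 476 (human) and 495 (mouse) amino acid proteins.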


In conclusion, the isolation of several GCNF clones from a human cell line that displays many characteristics of primitive neuroectoderm argues that expression during embryonic development is conserved. The deduced primary sequence exhibits even higher conservation than has been found among isoforms of the human and mouse retinoic acid receptors [13]. The Northern blot analysis indicates that GCNF has the same function during gametogenesis in man and mouse (Fig. 3).

We are grateful to Professor H. Chica Schaller for support during this work in her institute. We thank J. Kuhl for her help with the figures. This work was supported by the Deutsche Forschungsgemeinschaft through the SFB 232 to U.B.

References

[1] Beato, M. (1989) Cell 56, 335-344.
[2] Green, S. and Chambon, P. (1988) Trends Genet. 4, 309-314.
[3] Evans, R.M. (1988) Science 240, 889-895.
[4] Mangelsdorf, D.J., Thummel, C., Beato, M., Herrlich, P., Schütz, G., Umesono, K., Blumberg, B., Kastner, P., Mark, M., Chambon, P. and Evans, R.M. (1995) Cell 83, 835-839.
[5] Mangelsdorf, D.J. and Evans, R.M. (1995) Cell 83, 841-850.
[6] Chen, F., Cooney, A.J., Wang, Y., Law, S.W. and O'Malley, B.W. (1994) Mol. Endocrinol. 8, 1434-1444.
[7] Hirose, T., O'Brien, D.A. and Jetten, A.M. (1995) Gene 152, 247-251.
[8] Süsens, U., Aguiluz, J.B., Evans, R.M. and Borgmeyer, U. (1996) Eur. J. Neurosci., in preparation.
[9] Devereux, J., Haeberli, P. and Smithies, O. (1984) Nucleic Acids Res. 12, 387-395.
[10] Bourguet, W., Ruff, M., Chambon, P., Gronemeyer, H. and Moras, D. (1995) Nature 375, 377-382.
[11] Andrews, P.W., Damjanov, I., Simon, D., Banting, G.S., Carlin, C., Dracopoli, N.C. and Fogh, J. (1984) Lab. Invest. 50, 147-162.
[12] Pleasure, S.J. and Lee, V.M.-Y. (1993) J. Neurosci. Res. 35, 585-602.
[13] Zelent, A., Krust, A., Petkovich, M., Kastner, P. and Chambon, P. (1989) Nature 339, 714-717.