Biochimica et Biophysica Acta, 1129 (1992) 331-334 © 1992 Elsevier Science Publishers B.V. All rights reserved 0167-4781/92/$05.00
BBAEXP 90311
Short Sequence-Paper
Cloning and sequencing of a COUP transcription factor gene expressed in Xenopus embryos
Philip J. Matharu and Glen E. Sweeney
Department of Biochemistry, University of Wales College of Cardiff, Cardiff (UK)
(Received 26 November 1991)
Key words: COUP transcription factor; Steroid/thyroid receptor superfamily; Xenopus laevis
A cDNA clone encoding COUP transcription factor, a member of the steroid/thyroid receptor superfamily, has been isolated from a Xenopus neurula (stage 17 embryo) library. Sequencing of this clone reveals an open reading frame encoding a 397 amino acid protein. The amino acid sequence of Xenopus COUP has been compared with its human and Drosophila homologues, showing that there are few similarities within the amino-terminal region, whereas the remainder of the protein, including the putative DNA and ligand binding domains, is very well conserved.
Many aspects of development, differentiation and metabolism in higher eukaryotes are regulated by small, biologically active, molecules such as steroid hormones, thyroid hormones, retinoic acid and vitamin D-3 [1]. These molecules regulate transcription via a family of intracellular receptor proteins collectively known as the steroid/thyroid receptor superfamily. These proteins are transcription factors which activate the expression of (different) sets of target genes, but only in the presence of the appropriate hormone/ligand. The members of the superfamily all consist of a variable amino-terminal region, a well conserved DNA binding domain comprising two zinc fingers, and a large, less well conserved, ligand binding domain located towards the carboxyl end of the protein [1-3]. Recently, a number of novel members of the superfamily have been identified. These possess well conserved DNA and ligand binding domains, similar to those of other superfamily members, but the ligands with which they presumably interact remain to be identified. Amongst these novel receptors are the human COUP protein [4], the rat NGF-IB protein [5], and several Drosophila proteins known to play crucial roles in embryonic development. The latter include the products of the gap
The nucleotide sequence data reported here have been submitted to the EMBL database under the accession number X63092. Correspondence: G.E. Sweeney, Department of Biochemistry, University of Wales College of Cardiff, P.O. Box 903, Cardiff, CF1 1ST, UK.
genes knirps and tailless [6,7] and the seven-up protein [8].
COUP was initially identified as a transcription factor required for expression of the chicken ovalbumin gene [9]. The human COUP gene has been cloned [4,10], but almost nothing is known about the biological role of COUP in vertebrate adults and embryos. The only other COUP-like gene described to date is the seven-up gene of Drosophila, which has greater than 90% amino acid homology to human COUP within the putative DNA and ligand binding domains [8]. Mutations in the seven-up gene cause inappropriate differentiation of photoreceptor cells in the developing eye. Seven-up must have additional developmental roles since mutant embryos also have defects in their central nervous systems and die early in development [8]. In view of the significance of seven-up in Drosophila development, COUP protein may be similarly involved in vertebrate development. We therefore decided to study the function of COUP in the frog Xenopus laevis, since the ease with which Xenopus embryos can be obtained and manipulated makes it an ideal system in which to analyse developmental processes. Here we describe the cloning and sequencing of a Xenopus COUP gene expressed during early embryonic development.
Approx. 10⁵ plaques from a λgt10 cDNA library constructed from Xenopus neurulae (stage 17 embryos) [11] were hybridised to a nick translated probe prepared from a fragment of the human COUP gene. Low stringency conditions were used, with hybridisation being carried out at 60°C in 6 × SSC, 0.1% SDS, 100
Fig. 1. The open reading frame from XCOUP1. The nucleotides encoding the putative DNA binding domain and ligand binding domain are indicated by double and single underlining, respectively. [Sequence figure not reproduced; the nucleotide sequence is available from the EMBL database under accession number X63092.]
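As a rough illustration of how the reading frame at the 5' end of the clone can be checked, the sketch below translates the 60 nucleotides that span the proposed start codon, transcribed from Fig. 1, and applies a simplified reading of the consensus described in [12] (a purine at position -3 and a G at position +4 relative to the A of the ATG). The function names, the simplified rule and the use of Python are assumptions made for illustration only; this is not the authors' procedure.

```python
# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Nucleotides 61-120 of the XCOUP1 fragment, transcribed from Fig. 1.
FRAGMENT = (
    "CTGCACTGTATT"
    "ATGGCCATGGTGGTTAACCCTTGGCAGGAGGACATTCCTGGTGTTCCA"
)


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def has_strong_start_context(dna: str, atg_index: int) -> bool:
    """Simplified check (an assumption): purine at -3 and G at +4."""
    if atg_index < 3 or atg_index + 3 >= len(dna):
        return False
    return dna[atg_index - 3] in "AG" and dna[atg_index + 3] == "G"


start = FRAGMENT.find("ATG")          # first ATG in this fragment
peptide = translate(FRAGMENT[start:])  # residues encoded from that ATG
print(start, peptide, has_strong_start_context(FRAGMENT, start))
```

The translated residues can be compared by eye with the first residues of XCOUP in Fig. 2.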
µg/ml yeast tRNA, 5 × Denhardt's solution. The final wash conditions were 20 min at 42°C in 2 × SSC, 0.1% SDS. Thirty positive plaques resulted from this screen. DNA was prepared from three of these plaques and restriction fragments from the clone with the largest insert, designated XCOUP1, were subcloned into M13 vectors and sequenced by the dideoxy method using T7 DNA polymerase (Pharmacia). The sequence of a 1409 bp fragment of this clone, containing an open reading frame encoding a protein of 397 amino acids (predicted molecular weight 44000), is shown in Fig. 1. The proposed start codon is likely to be the natural initiator since it matches the consensus sequence for an active eukaryotic start codon [12]. Additionally, the first few amino acids downstream from the initiation codon are identical to those seen in the human COUP protein [10].
The amino acid sequences of Xenopus COUP, human COUP and Drosophila seven-up are compared in Fig. 2. Within the putative DNA binding domain,
Fig. 2. Alignment of the Xenopus COUP protein (XCOUP), the human COUP protein (hCOUP) and the Drosophila seven-up (svp) protein. Amino acids in the human COUP protein and the Drosophila seven-up protein which are identical to those at the corresponding position in the Xenopus COUP protein are indicated by asterisks (*). Where necessary, gaps, indicated by hyphens (-), have been inserted in the sequences to maximise homology. The putative DNA binding domains and ligand binding domains are indicated by double and single underlining, respectively. [Alignment figure not reproduced.]
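The asterisk convention of Fig. 2 lends itself to a simple identity count, and the XCOUP row of the alignment allows the proline content of the amino-terminal region to be recounted. The sketch below does both. The 60-residue string is the XCOUP row of Fig. 2 with gap characters removed; the short carboxyl-terminal excerpt of the hCOUP row is taken from the figure as extracted and is only meant to show the calculation, not to reproduce the domain-wide percentages given in the text. The function name and the handling of gap columns (skipped) are assumptions made for illustration.

```python
# Amino-terminal 60 residues of XCOUP (Fig. 2, gaps removed).
XCOUP_N_TERM = "MAMVVNPWQEDIPGVPGSQMNNPPGLCNQDPGGTPQTPTTPKGGIPGQDPVHSGDKGVPN"


def percent_identity(reference: str, annotated: str) -> float:
    """Percent identity for a row written in the Fig. 2 convention.

    '*' in `annotated` means "same residue as the reference";
    columns with a gap ('-') in either row are skipped (an assumption).
    """
    if len(reference) != len(annotated):
        raise ValueError("rows must have the same aligned length")
    compared = 0
    identical = 0
    for ref, other in zip(reference, annotated):
        if ref == "-" or other == "-":
            continue
        compared += 1
        if other == "*" or other == ref:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Proline content of the XCOUP amino-terminal region.
prolines = XCOUP_N_TERM.count("P")
proline_fraction = prolines / len(XCOUP_N_TERM)
print(prolines, len(XCOUP_N_TERM), proline_fraction)

# Illustrative excerpt: XCOUP residues 381-397 and the aligned part of hCOUP.
xcoup_c_term = "DMLLSGSSFNWPYMPMQ"
hcoup_c_term = "**************SI*"
print(percent_identity(xcoup_c_term, hcoup_c_term))
```

The proline count obtained this way can be set against the figure quoted in the text for the Xenopus amino-terminal region.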
Xenopus COUP shows 95% and 91% homology with its human and Drosophila homologues, respectively. In view of this high level of conservation it seems likely that the Xenopus COUP protein will bind to the well characterised DNA motif recognised by human COUP protein [4]. The putative ligand binding domain is also well conserved, with this region of the Xenopus protein showing 90% and 88% amino acid homology with its human and Drosophila counterparts, respectively. In view of this similarity, it seems likely that all three proteins interact with a common ligand. Within the ligand binding domain is a potential leucine zipper (residues 221-235), which is also found in other members of the steroid/thyroid receptor superfamily.
The amino-terminal region of COUP is poorly conserved, with homology between this region of the Xenopus and human proteins being confined to three short sections, whilst there is no significant homology between the amino-terminal regions of Xenopus COUP and seven-up (Fig. 2). Furthermore, the amino-terminal region of the Xenopus COUP protein is considerably shorter (60 residues) than those of its human and Drosophila counterparts (83 and 199 residues, respectively). However, it is noteworthy that the amino-terminal regions of the Xenopus and human COUP proteins are both proline rich (12 residues out of 60 in the Xenopus protein and 12 residues out of 83 in the human protein). A number of transcription factors, for example the CAT box binding proteins of the CTF/NF1 family [13], have proline rich trans-activation domains, thus suggesting such a role for the amino-terminal regions of the COUP proteins.
It is significant that the Xenopus COUP clone was isolated from a neurula cDNA library. As 30 out of
approx. 100000 clones (0.03%) hybridised to the probe it seems that COUP mRNA, and probably COUP protein, is moderately abundant in early Xenopus embryos. This observation is consistent with the suggestion that COUP may play an equivalent role in the development of vertebrate embryos to that of seven-up in Drosophila. We will further investigate this possibility.
We thank Professor D. Melton for the Xenopus neurula cDNA library and Professor B. O'Malley for the human COUP cDNA clone. P.J.M. acknowledges the receipt of an SERC studentship.
References
1 Evans, R.M. (1988) Science 240, 889-895.
2 Beato, M.G. (1991) FASEB J. 5, 2044-2051.
3 Green, S. and Chambon, P. (1988) Trends Genet. 4, 309-314.
4 Wang, L.-H., Tsai, S.Y., Cook, R.G., Beattie, W.G., Tsai, M.-J. and O'Malley, B.W. (1989) Nature 340, 163-166.
5 Milbrandt, J. (1988) Neuron 1, 183-188.
6 Nauber, U., Pankratz, M.J., Kienlin, A., Seifert, E., Klemm, U. and Jäckle, H. (1988) Nature 336, 489-492.
7 Pignoni, F., Baldarelli, R.M., Steingrimsson, E., Diaz, R.J., Patapoutian, A., Merriam, J.R. and Lengyel, J.A. (1990) Cell 62, 151-163.
8 Mlodzik, M., Hiromi, Y., Weber, U., Goodman, C.S. and Rubin, G.M. (1990) Cell 60, 211-224.
9 Bagchi, M.K., Tsai, S.Y., Tsai, M.-J. and O'Malley, B.W. (1987) Mol. Cell. Biol. 7, 4151-4158.
10 Miyajima, N., Kadowaki, Y., Fukushige, S., Shimizu, S., Semba, K., Yamanashi, Y., Matsubara, K., Toyoshima, K. and Yamamoto, T. (1988) Nucleic Acids Res. 16, 11057-11074.
11 Kintner, C.R. and Melton, D.A. (1987) Development 99, 311-325.
12 Kozak, M. (1986) Cell 44, 283-292.
13 Mermod, N., O'Neill, E.A., Kelly, T.J. and Tjian, R. (1989) Cell 58, 741-753.