Biochimica et Biophysica Acta 1396 (1998) 158–162
Short sequence-paper
Structure and expression of the human ubiquitin fusion–degradation gene (UFD1L)
Giuseppe Novelli a,*, Aldo Mari a,b, Francesca Amati a,b, Alessia Colosimo a, Federica Sangiuolo a, Mario Bengala a, Emanuela Conti a,c, Antonia Ratti d, Roberta Bordoni d, Antonio Pizzuti a,d, Antonio Baldini e, Rita Crinelli f, Franco Pandolfi g, Mauro Magnani f, Bruno Dallapiccola a
a Dipartimento di Sanità Pubblica e Biologia Cellulare, Cattedra di Genetica Umana e Medica, Università di Roma 'Tor Vergata', via di Tor Vergata 135, 00133 Roma, Istituto C.S.S. 'Mendel', Roma, Italy
b Istituto di Biologia e Genetica, Università di Chieti, Chieti, Italy
c III Divisione Pediatrica, Istituto 'G. Gaslini', Genova, Italy
d Istituto di Clinica Neurologica, Università di Milano, Ospedale Policlinico I.R.C.S.S., Milano, Italy
e Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
f Istituto di Chimica Biologica 'G. Fornaini', Università di Urbino, Urbino, Italy
g Cattedra di Semeiotica Medica, Università Cattolica del Sacro Cuore, Roma, Italy
Received 26 September 1997; accepted 19 November 1997
Abstract
We report the genomic organization and the RNA and protein expression patterns of the gene encoding the human homolog of the yeast ubiquitin fusion–degradation protein-1 (UFD1L). This enzyme is involved in a ubiquitin-dependent proteolytic pathway (UFD), first described in yeast. The human UFD1L gene is organized into 12 exons ranging in size from 33 to 161 bp. Sequence analysis of the 5'-flanking region of the gene revealed a high GC content, multiple CCAAT-binding motifs, and CREB, CFT, and AP-2 sites. RNA transcripts were detected in all tissues and cell lines examined, including thymus, thymocytes, T- and B-cells, fibroblasts, chorionic villi, and amniocytes. In Western blots, the UFD1L antibody detected multiple protein isoforms in all the tested tissues. The expression profile and promoter characteristics suggest that UFD1L is a housekeeping gene with implications for the pathogenesis of DiGeorge/velo–cardio–facial syndrome, which is due to 22q11.2 deletions. © 1998 Elsevier Science B.V.
Keywords: Ubiquitin; DiGeorge syndrome; Chromosome 22; Expression
* Corresponding author. Fax: +39-6-20427313; E-mail: [email protected]
0167-4781/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0167-4781(97)00211-X

Ubiquitin-mediated proteolysis is a well-known pathway of cellular protein degradation [1]. Ligation of ubiquitin to substrate proteins requires a reaction cascade involving different ubiquitin-conjugating enzymes [1,2]. Usually, protein ubiquitination is a post-translational process. Ubiquitin is joined to proteins in different configurations, depending on the number of ubiquitin moieties and on whether ubiquitin–ubiquitin linkages are formed [1,2]. The ubiquitin-conjugated proteins can be degraded to small peptides by a large 26S ATP-dependent protease complex [1,2]. Ubiquitin is also found linked to the N-terminus of certain proteins [2]. These ubiquitin-containing proteins are linear rather than branched in form, and are produced as primary translation products [2]. In Saccharomyces cerevisiae, the ubiquitin-fusion proteins are degraded by a ubiquitin-fusion–degradation (UFD) pathway [1], involving at least four different enzymes whose genes (UFD1, UFD2, UFD4, UFD5) have been isolated and characterized [1]. However, the exact cellular function of the UFD enzymes is unknown. UFD1 appears to be the most important of the UFD pathway enzymes, being essential for the viability of vegetative yeast cells [1]. A UFD2 homolog that is important during the development of Dictyostelium discoideum has been isolated [S. Pukatzki, personal communication].

We recently isolated the human and mouse UFD1 homolog cDNAs (UFD1L and Ufd1l, respectively) [3]. In mammals, Ufd1l is primarily expressed in the eyes and inner ear primordia during development. The human UFD1L gene maps to 22q11.2 and is deleted in more than 85% of the patients with DiGeorge syndrome/velo–cardio–facial syndrome (DGS/VCFS; MIM *188400; 192430), a developmental defect involving the derivatives of the 3rd and 4th pharyngeal pouches [3,4]. Haploinsufficiency of this gene might therefore disturb the embryogenetic mechanisms responsible for the pathogenesis of these disorders. We report here the complete intron–exon organization of the human UFD1L gene and its expression in several tissues.

We subcloned into cosmids a single 400-kb YAC (YAC 706b10 of the CEPH Library) containing the TUPLE1/HIRA gene [5]. Five cosmids containing the whole UFD1L gene were isolated by hybridization of the library to the human full-length cDNA [3]. The selected cosmids were subcloned into the pBluescript vector, and the UFD1L gene fragments were isolated and sequenced automatically using a ThermoSequenase cycle sequencing kit (Amersham, UK) and infrared dye IRD41 5'-labelled primers (LICOR, USA) [6].
Several sequencing primers were designed on the cDNA sequence, pointing in both the 3' and 5' directions. Intron–exon junction positions (Table 1) were determined by comparing the human UFD1L genomic and cDNA sequences. PCR reactions were performed to amplify each exon separately from the cosmid clones and
total genomic human DNA. The gene is composed of 12 exons and 11 introns extending over 30 kilobases (kb). The exon sizes range from 33 bp (exon 3) to 161 bp for the last exon, which contains 121 bp of coding sequence plus 40 bp of 3'-untranslated region (UTR). The intron length varies between 218 bp (intron 8) and 7057 bp (intron 6). The donor and acceptor splice sites at each exon–intron junction conform to the consensus (Table 1). The translation start codon (ATG) is located in exon 1.

Subcloning and sequence analysis of about 1 kb of the 5'-flanking region of the UFD1L gene revealed a high GC content and multiple CCAAT-binding factor sites. In addition, three CREB, three CFT (consensus ATTGG), and one AP-2 sites were identified using the TRANSFAC database (http://transfac.gbf-braunschweig.de). The presence of these sequences suggests that this region is the putative promoter. We found no TATA box within 1 kb upstream of exon 1, indicating that the UFD1L gene promoter belongs to the category of GC-rich housekeeping promoters. A single polyadenylation signal (AATAAA) was found 27 bp downstream of the termination codon.

The expression pattern of the UFD1L gene was examined by RT–PCR and Western blot analysis. RT–PCR was performed on RNA extracted by the acid guanidinium thiocyanate–phenol–chloroform method of Chomczynski and Sacchi [7]. About 8 μg of total RNA was converted to cDNA using a first-strand cDNA synthesis kit (Clontech Laboratories, USA), following the manufacturer's protocol. Each cDNA synthesis included a control reaction without RNA. As a positive control for reverse transcription and amplification we used a coding fragment of the human hexokinase type 3 cDNA [8]. UFD1L-specific transcripts were amplified by PCR using 20 pmol each of forward UFa (5'-AGG CAT CTG TAC CTC CCA CAC ACT GG-3') and reverse UFb (5'-AGA GAA AGC GCG GAA GCC AAG C-3') primers.
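The placement of the two RT–PCR primers can be cross-checked against the exon boundaries listed in Table 1. The following Python sketch is only an illustration and not part of the original analysis: the function names are hypothetical, the boundary hexamers and cDNA coordinates are copied from Table 1, and it assumes that UFa anneals with its 3' end on the last base of exon 4 (only its final six bases can be compared with the table).

```python
# Boundary hexamers copied from Table 1.
EXON4_END = "CACTGG"      # last six bases of exon 4
EXON9_END = "TTCCGC"      # last six bases of exon 9
EXON10_START = "GCTTTC"   # first six bases of exon 10

# RT-PCR primers as printed in the text (spaces removed).
UFA = "AGGCATCTGTACCTCCCACACACTGG"  # forward primer
UFB = "AGAGAAAGCGCGGAAGCCAAGC"      # reverse primer

# cDNA start positions and exon sizes from Table 1.
EXON4_START, EXON4_SIZE = 248, 122
EXON9_START, EXON9_SIZE = 709, 48


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def primer_report() -> dict:
    """Compare both primers with the Table 1 exon boundaries."""
    ufb_sense = reverse_complement(UFB)
    junction = EXON9_END + EXON10_START
    offset = ufb_sense.find(junction)

    exon4_end = EXON4_START + EXON4_SIZE - 1
    exon9_end = EXON9_START + EXON9_SIZE - 1

    # ASSUMPTION: the 3' end of UFa sits on the last base of exon 4.
    ufa_start = exon4_end - len(UFA) + 1

    ufb_end = None
    span = None
    if offset >= 0:
        # Bases of the sense-strand UFb sequence that lie in exon 10.
        bases_in_exon10 = len(ufb_sense) - (offset + len(EXON9_END))
        ufb_end = exon9_end + bases_in_exon10
        span = ufb_end - ufa_start

    return {
        "ufa_ends_with_exon4_end": UFA.endswith(EXON4_END),
        "ufb_spans_exon9_exon10_junction": offset >= 0,
        "ufa_start": ufa_start,
        "ufb_end": ufb_end,
        "span": span,
    }


if __name__ == "__main__":
    for key, value in primer_report().items():
        print(f"{key}: {value}")
```

In this sketch UFa ends with the last six bases of exon 4, and the reverse complement of UFb contains the exon 9/exon 10 junction. Under the stated assumption the outer primer ends lie 421 nt apart in Table 1 coordinates, the same distance as between the quoted positions nt 266 and nt 687.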
These primers encompass a 421-bp coding sequence extending from part of exon 4 to part of exon 10 (nt 266 and nt 687, respectively). Fig. 1 shows the results of RT–PCR studies in different human tissues and cell lines. A single PCR product was evident in all adult and fetal human tissues tested, in agreement with Northern blot results [3].

Table 1. The exon boundaries and intronic primers used to amplify each exon of UFD1L

Exon  Size  Exon boundaries  cDNA  Primer sequence              PCR size  Denat.  Ann.  Exten.  Cycles  Intron
      (bp)                   pos.                               (bp)      (°C)    (°C)  (°C)            size (bp)
1     81    GGCACG...ATCATG  1     F AGTGGTCAAGGGTCCTGCTTGTGCC  392       94      55    72      35      3493
                                   R TTAGAGCCTAGCGTCTTCGCCTGGC
2     133   TTCTCT...GGAAGA  82    F GGCTACAGCATTGGAGACTTAGGGC  426       94      55    72      35      369
                                   R GCAAGAGGTTGGGCCTTGATCC
3     33    TAATTA...AACTCA  215   F CCCTCCACGGACAATTCACAACC    296       94      55    72      35      3259
                                   R GATCACTGGGACATTCTGGCTACCC
4     122   GCCGAC...CACTGG  248   F TAAGACGCTGAGCAGGAGGAGTAGG  425       94      55    72      35      3681
                                   R TGTGTTACCAAGTTGCCAGGTTCC
5     131   ATGATG...AGCCGT  370   F TACATGATGTTTAACAACCGCCAGC  464       94      55    72      35      2597
                                   R TAGTGAGTAATCCGGGCAGCCC
6     73    ATTAGA...GAAAAG  501   F AAGATCCAGGCATACGCTGC       293       94      55    72      35      7057
                                   R TCGTGTTTCAAGTACCATCATGG
7     69    ATCTAC...AACGTG  574   F CAGCATCTTCCTCCTCTTTACCC    267       94      55    72      35      1102
                                   R ACATGGACAGCGACAGCTCC
8     66    GACTTT...TCGACA  643   F GGGACGCAGAGGATATGACTACACC  405       94      55    72      35      218
                                   R TGAATAAGAGCAGTGGGTTGTGCC
9     48    GAAGGT...TTCCGC  709   F GGCTGTTGGGAAGATTCCAGG      380       94      55    72      35      818
                                   R TGTTGGTGTAGTCATATCCTCTGCG
10    89    GCTTTC...TAAAAG  757   F CCACCAAGGAGAGCAGCTAAGG     300       94      55    72      35      850
                                   R TCCAAGCAAAGGACATATCGTCC
11    82    AGGAAT...GAAGAG  846   F TGTAAATTGCCCAGTCTCGGG      361       94      57    72      35      4004
                                   R TGCCAGGAAGCACCAAAGG
12    121   GATGAA...TTTGCA  928   F GTATTCTCTGATGAGGCTCGTCC    374       94      55    72      35      –
                                   R GGCTGGTCTTGAACTCCTGG

Fig. 1. RT–PCR expression of the human UFD1L gene. Lanes 1–8: thymus, thymocytes, mature T-cells, mature B-cells, Jurkat T-cells, fibroblasts, chorionic villi, and amniocytes, respectively; lane 9, RT control; lane 10, substrate negative control (water); lane 11, positive control (plasmid containing UFD1L cDNA); M, DNA molecular weight marker (φX174/HaeIII).

In this study, we demonstrated that UFD1L is expressed starting from week 10 of gestational age, as documented by its presence in human chorionic villi, and continues to be transcribed throughout the second trimester of gestation (Fig. 1). UFD1L is transcribed in thymus, as shown by the positive signal obtained with homogenized thymic tissue from patients undergoing cardiac surgery for unrelated pathologies (kindly provided by Dr. Bruno Marino, Bambino Gesù Hospital, Rome). To further investigate gene expression in lymphoid cells, we isolated thymocytes by gently mincing thymic tissue, followed by separation by density gradient centrifugation. Thymocytes were characterized by double expression of the CD4 and CD8 surface markers, as assessed by flow cytometry. Other T- and B-cell lines were also evaluated. These included: (1) the Jurkat T-cell line, derived from a patient with T-cell leukemia and expressing a more mature CD4+, CD8− phenotype; (2) mature T-cell cultures derived from peripheral blood lymphocytes (PBL) of normal donors; PBL were stimulated with phytohemagglutinin and then cultured in medium supplemented with 100 IU of recombinant interleukin-2, and comprised a mixture of both CD4 and CD8 populations; (3) B-cell lines derived by infecting normal donors' PBL with Epstein–Barr virus; the cultured cells showed a mature B-cell phenotype.

The expression of UFD1L in thymus and thymus-derived cells is intriguing, since the thymic rudiment is formed from ectoderm of the third pharyngeal cleft and endoderm of the third pharyngeal pouch [9]. Therefore, a reduction in the dosage of UFD1L activity could have a role in the abnormal development of this organ in DGS/VCFS patients.

RNA results were confirmed by protein expression analysis using a polyclonal antibody to UFD1L. The
antibody was raised in rabbits immunized with a mix of three synthetic peptides (UFD1L protein residues 11–25, 121–133, and 260–275, respectively) [3]. The expected UFD1L molecular weight is about 32 kDa, based on the full-length cDNA sequence. Northern blot analysis has shown a single RNA form hybridizing to the UFD1L cDNA, excluding alternatively spliced RNA forms. The antibody specifically reacts with the original peptides and with the UFD1L:GST fusion protein. In Western blot assays using several protein lysates, the antibody recognizes three major proteins of 32, 48, and 77 kDa (Fig. 2). Antibodies from the pre-immune sera did not react with these proteins. As noted above, markedly different alternative UFD1L forms were not expected. Thus, the two additional immunoreactive signals may represent either multimeric protein aggregates or proteins with close amino acid homology. So far, no other homologs of the UFD pathway proteins, or proteins with amino acid similarity to them, have been isolated from mammalian cells. A T-blast GenBank database search using the yeast proteins as queries revealed only a mouse EST clone (AA260973) similar to Ufd2p, together with UFD1L (U64444) and Ufd1l (U64445) reported by our group [3]. However, the presence of other sequences with high homology to the UFD1L gene has been excluded by Southern blot experiments using the UFD1L full-length cDNA (data not shown).

Fig. 2. Representative Western blot of UFD1L protein in whole-cell lysates from thymus (lane 1), thymocytes (lane 2), mature T-cells (lane 3), mature B-cells (lane 4), and amniocytes (lane 5). The 77-kDa, 48-kDa and 32-kDa protein bands are indicated by arrows.

Although the biological functions of the UFD enzymes are poorly understood, studies performed in yeast suggest that, similar to the N-end rule pathway, the UFD system targets a relatively small fraction of short-lived intracellular proteins [1]. The ubiquitin-mediated proteolytic system is involved in a multitude of processes, such as cell growth and differentiation, signal transduction, DNA repair, transmembrane traffic, and the response to stress, including the immune response [2]. The mapping of the UFD1L gene within a chromosomal region frequently deleted in DGS/VCFS patients makes it a potential candidate for some of the anomalies associated with this disorder. The major clinical features of DGS/VCFS include conotruncal cardiac defects, aplasia or hypoplasia of the thymus and parathyroid glands, palatal anomalies, developmental delay, and craniofacial dysmorphia [4]. Other, less common manifestations comprise conductive hearing loss, renal abnormalities, learning problems and skeletal malformations [10]. Mouse Ufd1l is expressed during development in the majority of the tissues involved in these anomalies, including the cardiac outflow tract, brain, lung and, particularly, the otic vesicle. These data, together with the human expression pattern described here, suggest that deletion of the UFD1L gene may contribute to the 22q11.2 deletion syndrome phenotype. UFD1L represents the second example of the involvement of a ubiquitin-dependent enzyme in human genetic pathology. In fact, loss of enzyme function due to truncating mutations in the coding region of the E6-AP ubiquitin protein ligase 3A gene (UBE3A) was found in patients with Angelman syndrome [10,11]. The determination of the genomic structure of the UFD1L gene is an important step towards a detailed investigation of its intracellular function and the elucidation of the role of this gene in mammalian development.
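The genomic structure summarized in Table 1 lends itself to a simple arithmetic cross-check. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the exon sizes, cDNA start positions and intron sizes are copied from Table 1, and the 40 bp of 3'-untranslated sequence in the last exon is taken from the text. It verifies that each cDNA start position equals the previous start plus the previous exon size, and sums the exon and intron lengths.

```python
# (exon number, exon size in bp, cDNA start position), copied from Table 1.
EXONS = [
    (1, 81, 1), (2, 133, 82), (3, 33, 215), (4, 122, 248),
    (5, 131, 370), (6, 73, 501), (7, 69, 574), (8, 66, 643),
    (9, 48, 709), (10, 89, 757), (11, 82, 846), (12, 121, 928),
]

# Sizes in bp of introns 1 to 11, copied from Table 1.
INTRONS = [3493, 369, 3259, 3681, 2597, 7057, 1102, 218, 818, 850, 4004]

# 3'-UTR length in the last exon, as stated in the text.
UTR3_LAST_EXON = 40


def inconsistent_exons(exons):
    """Return exon numbers whose start differs from previous start + size."""
    mismatches = []
    expected = exons[0][2]
    for number, size, start in exons:
        if start != expected:
            mismatches.append(number)
        expected = start + size
    return mismatches


def summarize(exons, introns, utr3):
    """Sum exon and intron lengths and locate the extreme introns."""
    tabulated = sum(size for _, size, _ in exons)
    exon_total = tabulated + utr3
    intron_total = sum(introns)
    shortest = introns.index(min(introns)) + 1  # 1-based intron number
    longest = introns.index(max(introns)) + 1
    return {
        "tabulated_exon_bp": tabulated,
        "exon_bp_with_utr": exon_total,
        "intron_bp": intron_total,
        "summed_span_bp": exon_total + intron_total,
        "shortest_intron": shortest,
        "longest_intron": longest,
    }


if __name__ == "__main__":
    print("inconsistent exons:", inconsistent_exons(EXONS))
    for key, value in summarize(EXONS, INTRONS, UTR3_LAST_EXON).items():
        print(f"{key}: {value}")
```

With the values of Table 1, the start positions are mutually consistent, the shortest and longest introns are introns 8 and 6, as stated in the text, and the summed exon and intron lengths come to 28,536 bp, of the same order as the 30 kb quoted for the gene.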
We are indebted to the UK HGMP Resource Centre for computer assistance. This work was supported by funds provided by Italian Telethon (Grants E.399 and D27) and Ministero dell'Università e della Ricerca Scientifica (MURST), and in part by grants from the National Institutes of Health (HL51524) and from the American Heart Association.

References
[1] E.S. Johnson, P.C.M. Ma, I.M. Ota, A. Varshavsky, J. Biol. Chem. 270 (1995) 17442–17456.
[2] M. Hochstrasser, Annu. Rev. Genet. 30 (1996) 405–439.
[3] A. Pizzuti, G. Novelli, A. Ratti, F. Amati, A. Mari, G. Calabrese, S. Nicolis, V. Silani, B. Marino, G. Scarlato, S. Ottolenghi, B. Dallapiccola, Hum. Mol. Genet. 6 (1997) 259–265.
[4] E.J. Lammer, J.M. Opitz, Am. J. Med. Genet. 29 (1986) 113–127.
[5] A. Pizzuti, G. Novelli, A. Mari, A. Ratti, A. Colosimo, F. Amati, D. Penso, F. Sangiuolo, G. Calabrese, G. Palka, G. Silani, M. Gennarelli, R. Mingarelli, G. Scarlato, P.J. Scambler, B. Dallapiccola, Am. J. Hum. Genet. 58 (1996) 722–729.
[6] P. Maceratesi, F. Sangiuolo, G. Novelli, P. Ninfali, M. Magnani, J.K.V. Reichardt, B. Dallapiccola, Hum. Mutat. 8 (1996) 369–372.
[7] P. Chomczynski, N. Sacchi, Anal. Biochem. 162 (1987) 156–159.
[8] A. Colosimo, G. Calabrese, M. Gennarelli, A.M. Ruzzo, F. Sangiuolo, M. Magnani, G. Palka, G. Novelli, B. Dallapiccola, Cytogenet. Cell Genet. 74 (1996) 187–188.
[9] B.F. Haynes, Thymus 16 (1990) 143.
[10] T. Kishino, M. Lalande, J. Wagstaff, Nat. Genet. 15 (1997) 70–73.
[11] T. Matsuura, J.S. Sutcliffe, P. Fang, R.J. Galjaard, Y. Jiang, C.S. Benton, J.M. Rommens, A.L. Beaudet, Nat. Genet. 15 (1997) 74–77.