Molecular cloning and characterization of the mouse and human TUSP gene, a novel member of the tubby superfamily


Gene 273 (2001) 275–284

www.elsevier.com/locate/gene

Quan-Zhen Li a, Cong-Yi Wang a, Jin-Da Shi a, Qing-Guo Ruan a, Sarah Eckenrode a, Abdoreza Davoodi-Semiromi a, Thomas Kukar b, Yunrong Gu b, Wei Lian b, Donghai Wu b, Jin-Xiong She a,*

a Department of Pathology, Immunology and Laboratory Medicine, Center for Mammalian Genetics and Diabetes Center of Excellence, College of Medicine, University of Florida, Gainesville, FL 32610, USA
b Department of Medicinal Chemistry, College of Pharmacy and The McKnight Brain Institute, University of Florida, Gainesville, FL 32610, USA

Received 6 February 2001; received in revised form 29 May 2001; accepted 22 June 2001
Received by J.A. Engler

Abstract

We report here the cloning and characterization of a novel gene belonging to the tubby superfamily proteins (TUSP) in mouse and human. The mouse Tusp cDNA is 9120 bp in length and encodes a deduced protein of 1547 amino acids, while the human TUSP cDNA is 11,127 bp and encodes a deduced protein of 1544 amino acids. The human and mouse genes are 87% identical in their nucleotide sequences and 85% identical in their amino acid sequences. The protein sequences of these genes are 40–48% identical to other tubby family proteins in the conserved C-terminal `tubby domain'. In addition, the TUSP proteins contain a tubby signature motif (FXGRVTQ), two bipartite nuclear localization signals (NLSs) at the C-terminus, two proline-rich regions, one WD40 repeat region and one suppressor of cytokine signaling (SOCS) domain. Transfection assays with green fluorescent protein-tagged TUSP expression constructs showed that the complete TUSP protein and the N-terminal portion of TUSP are localized in the cytoplasm, whereas the C-terminal portion containing the two NLSs produced distinct dots or spots localized in the cytoplasm. Northern blotting analysis showed that the major transcript with the complete coding sequence is expressed mainly in the brain, skeletal muscle, testis and kidney. Radiation hybrid mapping localized the mouse gene to chromosome 17q13 and the human TUSP gene to chromosome 6q25-q26 near the type 1 diabetes gene IDDM5. However, association analysis in diabetic families with a polymorphic microsatellite marker did not show any evidence for association between TUSP and type 1 diabetes. The precise biological function of the tubby superfamily genes is still unknown; the highly conserved tubby domain in different species, however, suggests that these proteins must have fundamental biological functions in a wide range of multi-cellular organisms. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Tubby; Cloning; Expression; IDDM; Obesity; Gene family; Mapping

Abbreviations: NLS, nuclear localization signal; PCR, polymerase chain reaction; RACE, rapid amplification of cDNA ends; RT, reverse transcription; TDT, transmission disequilibrium test; Tusp, tubby superfamily protein; UTR, untranslated region

* Corresponding author. Department of Pathology, Immunology and Laboratory Medicine, Box 100275, College of Medicine, University of Florida, Gainesville, FL 32610, USA. Tel.: +1-352-392-0667; fax: +1-352-392-3053. E-mail address: she@ufl.edu (J.-X. She).

0378-1119/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0378-1119(01)00582-0

1. Introduction

Tubby, an autosomal recessive disease in mice, is characterized by maturity-onset obesity, insulin resistance and progressive cochlear and retinal degeneration (Coleman and Eicher, 1990; Heckenlively et al., 1995; Ohlemiller et al., 1995). The tubby gene (Tub), which is responsible for the tubby phenotype, is the prototype of a family of genes (the tubby family) that includes three tubby-like proteins (Tulp1, Tulp2 and Tulp3) (Kleyn et al., 1996; Noben-Trauth et al., 1996; North et al., 1997; Nishina et al., 1998). The most striking feature of the tubby family proteins is that all family members share a highly conserved carboxyl terminus, whereas the N-terminus of these proteins is poorly conserved. Mutation analysis in tubby mice identified a G to T transversion in the 3′ coding region of the Tub gene that abolishes a donor splice site and results in a larger transcript containing the unspliced intron. This mutation disrupts the highly conserved C-terminus and generates a Tub mutant protein in which the last 44 amino acids are replaced with 20 amino acids not found in the wild-type protein (Kleyn et al., 1996; Noben-Trauth et al., 1996). A similar mutation was also found in the Tulp1 gene, where a splice site mutation in the C-terminal region caused early-onset, severe retinal degeneration (Lewis et al., 1999; Hagstrom et al., 1998).

In order to identify other members of the tubby family, we screened the EST database and identified a clone that might contain a tubby-related gene. Cloning of the full-length cDNA revealed a gene encoding a protein with 40–48% identity to the tubby family proteins in its C-terminal `tubby domain', suggesting that this gene is a distantly related member of the tubby protein family. Since the sequence homology between this new protein and the known tubby members is lower than the homologies among the tubby family proteins themselves, we named the new protein Tubby Superfamily Protein (TUSP). In this paper, we report the molecular cloning and characterization of the mouse and human TUSP genes and proteins.

2. Materials and methods

2.1. 5′-RACE and 3′-RACE

5′-RACE and 3′-RACE were conducted using the SMART RACE kit (Clontech, K1811-1) according to the manufacturer's instructions with some modifications. Mouse and human total RNA were used for reverse transcription. For 5′-RACE, first-strand cDNA synthesis was primed with a gene-specific primer, and a SMART oligo was also present in the reaction. After reverse transcription reached the end of the mRNA template, several dC residues were added to the end of the cDNA. The SMART oligo, which carries three dG residues at its 3′-end, anneals to this tail of the newly synthesized cDNA and then serves as a template for further extension of the cDNA by RT. After the RT reaction, an internal gene-specific reverse primer and the UP primer, which is complementary to the SMART oligo, were used to perform PCR with the RT products as templates. To increase the specificity and product yield of 5′-RACE, nested PCR was then performed using another internal gene-specific primer and the NUP primer (an internal primer of UP). For 3′-RACE, first-strand cDNA was synthesized using a modified oligo-dT with an UP oligonucleotide tail. The UP primer and a gene-specific forward primer were used for the first-round PCR. Nested PCR was performed using NUP and an internal gene-specific forward primer.

PCR reactions were carried out in a final volume of 35 μl. After the RT reaction, the samples were denatured at 94°C for 5 min. Amplifications were carried out with five cycles of 30 s of denaturing at 94°C, 30 s of annealing at 68°C and 4 min of extension at 72°C, followed by 30 cycles of 30 s of denaturing at 94°C, 30 s of annealing at 62°C and 4 min of extension at 72°C. PCR products obtained from nested PCR were loaded onto a 2% agarose gel and individual bands were excised from the gel for direct sequencing.
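The two-stage cycling program in Section 2.1 can be written down as data and checked arithmetically. The following Python sketch is only an illustration: the class and function names are hypothetical, and it sums the programmed hold times stated in the text while ignoring ramp times between temperatures, which depend on the instrument and are not reported.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: int    # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: Tuple[Step, ...]

    def hold_seconds(self) -> int:
        # Total hold time of this stage: cycles x (sum of step times).
        return self.cycles * sum(step.seconds for step in self.steps)


def cycle(anneal_temp_c: int) -> Tuple[Step, ...]:
    # One amplification cycle as described in Section 2.1.
    return (
        Step("denature", 94, 30),
        Step("anneal", anneal_temp_c, 30),
        Step("extend", 72, 240),
    )


# Program taken from the text: 5 min at 94 C, 5 cycles annealing at 68 C,
# then 30 cycles annealing at 62 C.
RACE_PROGRAM = (
    Stage(1, (Step("initial denaturation", 94, 300),)),
    Stage(5, cycle(68)),
    Stage(30, cycle(62)),
)


def total_hold_seconds(program) -> int:
    return sum(stage.hold_seconds() for stage in program)


total = total_hold_seconds(RACE_PROGRAM)
print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```

Under these assumptions the program amounts to 300 + 35 × 300 = 10,800 s, i.e. about 3 h of hold time per PCR round, most of it spent in the 4 min extension steps chosen for long RACE products.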

2.2. DNA sequencing

For direct sequencing of PCR products, the expected bands were excised from the gel and transferred into 1.5 ml Eppendorf tubes. The tubes were frozen at −80°C for 5–10 min, and the gel was then crushed while frozen. DNA fragments were eluted from the gel by brief centrifugation. The supernatant was used directly for sequencing with an ABI 310 automated DNA sequencer as described previously (Wang et al., 1998, 1999).

2.3. Sequence assembly and analysis

Sequence assembly and analysis were performed using the programs AssemblyLign and MacVector (Oxford). Protein and DNA homology searches were conducted using the tblastn, tblastx and blastn programs (http://www.ncbi.nlm.nih.gov/BLAST/blast_program.html). Multiple sequence alignments were performed using the GeneBee program (http://www.genebee.msu.su/services/malign_reduced.html). Multiple programs including PROSITE (http://www.expasy.ch/prosite/), PFSCAN (http://www.isrec.isb-sib.ch/software/PFSCAN_FORM.HTML), and PIR (http://www-nbrf.georgetown.edu/cgi-bin/pirwww/fasta.pl) were used to search for sequence features of known protein domains and motifs. The phylogenetic tree was generated with programs in the Vector NTI Suite (InforMax).

2.4. Northern blot analysis

Nylon membranes containing 2 μg poly(A)+ RNA samples from 12 human tissues and eight mouse tissues were hybridized with human or mouse cDNA probes. The human cDNA probe was a 449 bp fragment (positions 201–650) and the mouse cDNA probe was a 313 bp fragment (positions 1346–1659). cDNA probes were labeled with [32P]dCTP using a random priming labeling kit (Amersham Pharmacia Biotech, Uppsala, Sweden). The blots were pre-hybridized at 42°C for 3 h and then hybridized with probes at 42°C overnight. After hybridization, the blots were washed three times for 5 min each in 2× SSC, 0.1% SDS at room temperature, followed by three washes of 20 min each in 0.1× SSC, 0.1% SDS at 65°C. Signals were visualized by autoradiography.

2.5. Subcellular localization with green fluorescent protein fusion proteins

Human TUSP cDNA fragments were cloned into the green fluorescent protein (GFP) expression vector pEGFP-N1 (Clontech). HEK293 cells were cultured in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal calf serum (FCS). Cells were washed in Hanks' buffer before transfection. Recombinant constructs containing various TUSP cDNA fragments were transfected into the HEK293 cells under standard conditions for mammalian cells using LipofectAMINE reagent (Life Technologies, Inc.). After transfection, cells were cultured in chamber slides (LAB-TEK) for 12–18 h. Transfected cells were then fixed with 4% paraformaldehyde for 20 min and mounted with coverslips. Fluorescent images were obtained using a confocal microscope (BioRad) equipped with an Ar laser, with excitation at 488 nm and detection through a 510–530 nm bandpass for GFP.

2.6. Chromosomal localization

The GeneBridge 4 radiation hybrid panels for human (RH02.05) and mouse (RH04.05) were purchased from Research Genetics. For the human TUSP gene, the HtuspF12 and HtuspR12 primers (which amplify a 437 bp fragment in the 3′-untranslated region (UTR) of the TUSP cDNA) were used to amplify 93 radiation hybrid clones of the whole human genome. For the mouse Tusp gene, the MtuspF10 and MtuspR10 primers (which amplify a 213 bp fragment in the 3′-UTR of Tusp) were used to amplify 100 radiation hybrid clones of the whole mouse genome. The data were submitted to the GeneBridge 4.0 mapping server at the Whitehead Institute (http://carbon.wi.mit.edu:8000/cgi-bin/contig/rhmapper.pl) for analysis.

2.7. Association analyses in diabetic families

A polymorphic CA repeat located in intron 14 of TUSP was identified by sequencing and then genotyped using radioactive labeling of PCR primers (forward primer, HTUSPCA-F; reverse primer, HTUSPCA-R) and denaturing polyacrylamide gel electrophoresis as previously reported (Luo et al., 1996). A total of 265 American Caucasian families with two diabetic siblings (Luo et al., 1996) were typed and analyzed by the affected sibpair method and the transmission disequilibrium test (TDT).

3. Results

3.1. Cloning and sequencing of the mouse Tusp and human TUSP genes

A cDNA clone (GenBank Accession number: AA288745) containing a 389 bp fragment of the mouse Tusp gene (positions 5593–5981 in the Tusp sequence) was initially sequenced and used as a starting point to clone the full-length Tusp gene through several rounds of 5′- and 3′-RACE.
The primers used for amplification and sequencing are listed in Table 1. The full-length Tusp cDNA (9123 bp) contains a long open reading frame of 4641 bp, beginning at nucleotide 201 and ending at nucleotide 4842. The encoded protein contains 1547 amino acids (Fig. 1) with an estimated molecular weight of 169.6 kDa. Three in-frame stop codons were found upstream of the first putative start codon. Five polyadenylation signals (AATAAA) were found in the 3′ non-coding region at positions 6807, 6907, 7013, 8110 and 9105. The mouse Tusp gene sequence has been deposited in GenBank under the Accession number AF219945.

To identify the human homologue of the mouse Tusp gene, a BLAST search was conducted using the mouse Tusp cDNA sequence against the human EST database. Three human EST clones (AA682942, AA984088 and AA984070) showed 86–89% sequence identity with the mouse Tusp cDNA sequence at positions 2253–2696 (AA286942), 167–730 (AA984088) and 167–496 (AA984070). Primers (Table 1) were designed based on these EST sequences, and PCR was performed on human brain cDNA to amplify the sequence between these ESTs. The 5′ and 3′ end sequences of the human TUSP gene were obtained by 5′-RACE and 3′-RACE. The longest TUSP cDNA contains 11,127 bp and has an open reading frame (ORF) of 4632 bp with the first ATG at base 1358 and the stop codon at base 5989 (Fig. 1). Multiple in-frame stop codons are present upstream of the first putative start codon. The longest peptide is predicted to contain 1544 amino acids (169.2 kDa). There are four polyadenylation signals (AATAAA) in the 3′-UTR at positions 6518, 7091, 7169 and 11,104. The human TUSP gene sequence has been deposited in GenBank under the Accession number AF219946.

A database search identified three assembled genomic DNA fragments: RP3-442A17, RP11-732M18 and RP1-301F24 (AL353800, AL360169 and AL161761). These fragments were mapped to chromosome 6q25-q26. There is no homology between the genomic DNA and cDNA beyond the poly(A) tail. The coding region of the human TUSP gene spans over 100 kb and is organized into 15 exons (Fig. 2). The shortest exon (exon 12) is 93 bp and the longest exon (exon 14) is 2500 bp. All introns contain the consensus donor and acceptor splice sites.

3.2. Expression, tissue distribution and alternative splicing

Northern blotting was performed to determine the expression patterns of these genes in mouse and human tissues. A 7.0 kb transcript was detected in mouse brain and testis.
This transcript is not the full-length transcript but corresponds well with the potential polyadenylation sites at position 6907 and/or 7013. It contains the full coding sequence. Another major transcript (about 2 kb) was also detected in the brain (Fig. 3A). On the human Northern blot, an 11 kb mRNA transcript, which corresponds well with the full-length transcript, was detected in brain, skeletal muscle, kidney and placenta (Fig. 3B). A 2 kb transcript was strongly expressed in the heart and kidney (Fig. 3B). A third band on the human Northern blot is a 5.2 kb transcript that was weakly expressed in kidney, skeletal muscle and heart.

Table 1
Primers used in this study

Primer name   Sequence (5′–3′)            Position

Human primers
Htuspf1       ggactagtggtgtggtgtttg       128–148
Htuspr1       GCGCACCCCTCAGTGCCATAG       548–568
Htuspf2       caggctgaatgagaataaccaa      976–997
Htuspr2       CAGGCACAGGATGTTGGAGTC       1397–1416
Htuspf3       CAACCTCCGGGGCCACAATA        1585–1604
Htuspr3       GGCAATCCATGACAATCACCT       1926–1946
Htuspf4       CCTCGGGAGACATCAGCTTAA       2145–2165
Htuspr4       GCCATCAACAGCCTCGAATCC       2416–2436
Htuspf5       gcaccatgaagcgcacagag        2682–2701
htuspr5       CTTGGGAGTTTGGGTGATTTGC      2967–2988
Htuspf6       TGCTCAGGTCACGTCTAATATCT     3145–3167
Htuspr6       TGGGGGACATAGCCTGAGCCGAG     3445–3467
Htuspf7       aatgggccgcatcattcagaac      3646–3667
Htuspr7       CCGGGGAGGGTGTAGGTGTT        4094–4113
Htuspf8       tcaagatggcccagctggccga      4347–4368
Htuspr8       CTCAGGAGGCTGTAGGAGGACT      4554–4575
Htuspf9       ggacgtgtcccgactgcccttc      4843–4864
Htuspr9       CGGAGCTGTCCTGGAACTCGTTT     5557–5579
Htuspf10      atgaaccagagccagggcagcag     5618–5640
Htuspr10      CCAACCAAATAAATGACGACCAC     6148–6170
Htuspf11      GCCAAAAATTCTCAAGATGCAGG     6423–6445
Htuspr11      GAGAAAACTGCTACCACTACCAC     6871–6893
Htuspf12      tacagactgtttactaaggagcta    7252–7275
Htuspr12      TTGGAACCCAACGCCAGCACAT      7559–7580
Htuspf13      GGAGACCTATTAGAATGAGAG       8694–8714
HtuspR13      attaagcaatagctggacaactc     11,084–11,106
HTUSPCA-F     ACGTTTCTACTTCTGTTCATTG      gDNA
HTUSPCA-R     GGAAATGTAACTAAGCCGTAAT      gDNA

Mouse primers
Mtuspf1       GACCTAATTGGACTCTAGATCAG     58–81
Mtuspr1       tgcccatctgctgtgccaaacaac    746–769
Mtuspf2       TGGTATATGGACTCCCGATGA       716–736
Mtuspr2       AGCCCTCCCAGGTACTCCAAGT      1582–1603
Mtuspf3       CACTCTACTTGGAGTACCTGG       1576–1596
Mtuspr3       AGGATTGGAGTACCGACTGGAG      2438–2459
Mtuspf4       ACCTGCCTGTCTTCAACCCAAAC     2161–2183
Mtuspr4       ACACCTCTTCCACATTGCCACT      2892–2911
Mtuspf5       ACTAGGCTATGAGAGGATTACCA     2858–2880
Mtuspr5       ACAGGGCATCCTTAGGGGAGCA      3831–3852
Mtuspf6       GCCATATCCAGGCAGCTATAACA     3779–3801
Mtuspr6       TGCCATCAATCCGTCCAAACTGC     4733–4755
Mtuspf7       CAGGAGTCTGCCAAGAATTTCCA     4683–4705
Mtuspr7       ggctctatcttttagtatgctagg    5909–5932
Mtuspf8       CTTCCCCCTTTGTTTGACTCTG      5714–5735
Mtuspr8       CACTAACTCTGAACATACTGCTC     6514–6536
Mtuspf9       atatgagagctgattcttaggtgg    6468–6491
Mtuspr9       CTCAGGCTATCCATTCAAATAAGTA   7645–7668
Mtuspf10      ACCCATAGGAAAAGAGTATATTAAC   7553–7577
Mtuspr10      CACCTTTGGTAAATAATACAGTGCC   8560–8584
Mtuspf11      agtcaattttaaggggaagggta     8462–8484
Mtuspr11      aagcaatagctggacaaatcttac    9081–9004

Primers for RACE
Smart oligo   AAGCAGTGGTAACAACGCAGAGTACGCGGG
5′-CDS        (T)25VN (N = A, C, G, or T; V = A, G, or C)
3′-CDS        AAGCAGTGGTAACAACGCAGAGTAC(T)30VN (N = A, C, G, or T; V = A, G, or C)
UP            CTAATACGACTCACTATAGGGCAAGCAGTGGTAACAACGCAGAGT
NUP           AAGCAGTGGTAACAACGCAGAGT

Fig. 1. Deduced amino acid sequences of the human TUSP and mouse Tusp genes. Identical amino acids are shaded. The conserved tubby domain at the C-terminus is enclosed in the large box. The small open boxes indicate the proline-rich domains. The two nuclear localization signals (NLS) are double underlined. The WD40 repeat region is single underlined. The SOCS domains are in boldface and italic. The tubby signature motifs are in italic boldface and underlined. The putative first methionine is bold and underlined. An asterisk marks the position of the stop codon. The sequence data for the mouse Tusp and human TUSP have been deposited in the GenBank database with Accession numbers AF219945 and AF219946.

Fig. 2. Genomic structure of the human TUSP gene. Open boxes indicate exons and the hatched boxes indicate 5′- and 3′-UTRs. The solid boxes indicate exons and the lines indicate introns for the genomic sequence.

3.3. Sequence homology analysis

The mouse Tusp and human TUSP cDNA sequences share 87% nucleotide sequence identity throughout the deduced coding region, while the sequence homology in the 5′-UTR and 3′-UTR is relatively poor, further supporting the ORF assignment. The protein sequences in the two species are also highly conserved: they are 85% identical and 92% similar (Fig. 1). Significant sequence homologies were observed between TUSP and tubby protein family members. The region of homology was restricted to the C-terminal `tubby domain' (250 amino acids). Within this domain, the TUSP and Tusp proteins shared 40–48% identity and 55–65% similarity with the tubby family members (Fig. 4). A tubby signature motif (FXGRVTQ) was also found in tubby and TUSP proteins by the GeneFind software (http://www-nbrf.georgetown.edu/gf-cgi/genefind.pl). The tubby domain is conserved in a wide range of species including Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, mouse and human. Phylogenetic analysis using the C-terminal tubby domain protein sequences indicates that the inferred protein sequence of TUSP is a distant relative of the tubby family members (Fig. 5).

Using the Prosite and Profile Scan programs, we identified two bipartite NLSs at positions 1387 and 1438, two proline-rich regions at positions 775 and 838, a WD40 repeat region at positions 82–208 (23.4–27.5% homology), a suppressor of cytokine signaling (SOCS) domain at positions 373–414 (45.5% homology) and an EBNA-2 homology region at positions 25–521 (23.6% homology) (Fig. 1). A protein motif search also revealed multiple potential phosphorylation, myristoylation, and glycosylation sites.
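Section 3.3 reports both percent identity and percent similarity for aligned protein sequences. The Python sketch below illustrates how the two figures differ when computed from a pairwise alignment. It is an assumption-laden illustration, not the procedure used by the alignment programs named in the paper: the conservative-substitution groups are a simple hypothetical scheme, gap columns are excluded from the denominator, and the two short sequences are toy strings rather than TUSP or tubby sequences.

```python
# Conservative-substitution groups. This grouping is an illustrative
# assumption; real alignment tools typically use a scoring matrix instead.
SIMILAR_GROUPS = ("AVLIM", "FYW", "KRH", "DE", "ST", "NQ")


def _group_of(residue):
    """Return the index of the group containing residue, or None."""
    for index, group in enumerate(SIMILAR_GROUPS):
        if residue in group:
            return index
    return None


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Percent identity and similarity over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # columns with a gap are not counted
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        else:
            group_a = _group_of(a)
            if group_a is not None and group_a == _group_of(b):
                similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment (hypothetical sequences, 9 ungapped columns).
identity, similarity = identity_and_similarity("FSGRVTQ-KL", "FAGRITQSRL")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Because every identical column also counts as similar, similarity is never lower than identity, which matches the pattern of the values reported above (85% versus 92%, and 40–48% versus 55–65%).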

3.4. Subcellular localization of the human TUSP gene

We determined the subcellular localization of the TUSP protein using a transient transfection assay in HEK293 cells with GFP-tagged TUSP protein. Since the carboxyl terminal sequences of both human and mouse TUSP contain two potential bipartite NLSs, and the putative NLSs are absent from the 5.2 kb splice form which does not contain exon 13, we made three constructs representing the full coding sequence of TUSP (amino acids 1–1544), the N-terminal portion (amino acids 1–700) and the C-terminal portion (amino acids 701–1544) of TUSP, respectively. These constructs produce a full-length GFP fusion protein (TUSP-GFP), an N-terminal GFP fusion protein (TUSPN-GFP) and a C-terminal GFP fusion protein (TUSPC-GFP). HEK293 cells were transfected with each of the recombinant constructs and analyzed for the green fluorescence signal 18 h post-transfection. As shown in Fig. 6, TUSP-GFP and TUSPN-GFP produced a cytoplasmic signal. Interestingly, TUSPC-GFP produced distinct dots or spots localized in the cytoplasm. These dots, which are similar to the promyelocytic (PML) bodies or the so-called nuclear bodies (Doucas et al., 1999), represent some intracellular protein assembly which may be functionally relevant. The localization result for the N-terminal portion is consistent with the absence of a functional NLS in its sequence. The cytoplasmic localization of the whole protein suggests that the two putative NLSs are not functional or may be inaccessible in the absence of some external stimuli. The cytoplasmic localization of the full-length TUSP protein is in clear contrast to the Tub and TULP1 proteins, which are localized in the nucleus (He et al., 2000). Like the TUSP protein, the TUB C-terminal portion is also localized in the cytoplasm (He et al., 2000).

Fig. 3. Northern blots for mouse and human TUSP. (A) Eight mouse tissues; (B) 12 human tissues.

Fig. 4. Protein sequence alignment of human TUSP and mouse Tusp C-terminal region with the tubby domain of human and mouse tubby family members. HTUB, HTULP1, HTULP2 and HTULP3 represent human protein TUB (Accession number: AAB53494), TULP1 (AAB53700), TULP2 (AB53701) and TULP3 (AAC95431). mTub, mTulp1, mTulp2 and mTulp3 represent mouse protein Tub (AAB53495), Tulp1 (AAD38451), Tulp2 (AAD38452) and Tulp3 (AAC95430). Black boxes indicate identical amino acids.

3.5. Chromosomal localization and association analysis in IDDM

The chromosomal locations of the human TUSP and mouse Tusp genes were determined with the human and mouse radiation hybrid mapping panels (Research Genetics) using primers specific to the 3′-UTR. The human TUSP gene mapped to chromosome 6q25-q26 between markers D6S442 and D6S281, while the mouse Tusp gene mapped to chromosome 17q23 between markers D17Mit196 and D17Mit113 (Fig. 7). Since the human TUSP gene resides within a type 1 diabetes susceptibility interval (IDDM5 on 6q25-q26; Davies et al., 1994), it is a candidate gene for type 1 diabetes. We identified a (CA)7 repeat located in TUSP intron 14. This polymorphic marker was used to analyze a total of 265 diabetic sibpair families. The TDT showed no evidence for association between diabetes and the TUSP polymorphism.
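For readers unfamiliar with the TDT used in Section 3.5, the following Python sketch shows the standard biallelic form of the test statistic: among heterozygous parents, the number of times a given allele is transmitted to affected offspring (b) is compared with the number of times it is not transmitted (c), giving χ² = (b − c)²/(b + c) with one degree of freedom. This is a simplified illustration under stated assumptions. The marker in this study is a multi-allelic microsatellite, the paper does not report transmission counts, and the counts used below are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Biallelic TDT: return (chi-square statistic, p-value) with 1 df.

    transmitted / untransmitted are counts of a given allele passed or not
    passed from heterozygous parents to affected offspring.
    """
    informative = transmitted + untransmitted
    if informative <= 0:
        raise ValueError("need at least one informative transmission")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper-tail probability of a chi-square variable with 1 degree of
    # freedom: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts chosen only to illustrate a non-significant result.
chi2, p_value = tdt(transmitted=52, untransmitted=48)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

With near-equal transmission and non-transmission counts, as in this made-up example, the statistic is close to zero and the p-value is far above conventional significance thresholds, which is the kind of outcome summarized in the text as "no evidence for association".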

Fig. 5. Phylogenetic tree showing relationships among members of the tubby superfamily in various species. The tree was constructed using amino acid sequences from the C-terminal tubby domain. HTUSP, human TUSP; MTUSP, mouse TUSP; HTUB, human TUB; MTUB, mouse TUB; HTULP 1, human TULP 1; MTULP 1, mouse TULP 1; HTULP 2, human TULP 2; MTULP 2, mouse TULP 2; HTULP 3, human TULP 3; MTULP 3, mouse TULP 3; CETULP, C. elegans tubby-like protein; RATTULP, rat tubby-like protein.


Fig. 6. Subcellular localization of human TUSP by transfection assay with GFP-tagged expression constructs. (A) TUSP-GFP, full-length TUSP fused with GFP; (B) TUSPN-GFP, N-terminal part of TUSP (amino acids 1–700) fused with GFP; (C) TUSPC-GFP, C-terminal part of TUSP (amino acids 701–1544) fused with GFP; (D) GFP, GFP control.

4. Discussion

In this study, we have cloned the full-length cDNA of a novel gene (TUSP) from both human and mouse. The most prominent feature of the TUSP protein is the strong homology of its C-terminal region with the conserved `tubby domain' found in the tubby family proteins. Furthermore, it contains a tubby signature motif (FXGRVTQ). Previous studies have identified four tubby family members, i.e. TUB, TULP1, TULP2 and TULP3, which share 55–90% protein sequence identity in the tubby domain. However, the sequence identity between tubby proteins and TUSP is only 40–48%, less than the identities shared by the four tubby family members. Therefore, TUSP is a distant member of the tubby family.

The precise biological function of the tubby gene is still unknown. The highly conserved tubby domain in different species suggests that these proteins must have fundamental biological functions in multi-cellular organisms ranging from A. thaliana, C. elegans and D. melanogaster to mouse and human. Several possible functions for TUB have recently been postulated. TUB may be phosphorylated and constitute an intermediate in the insulin signal transduction pathway (Kapeller et al., 1999). Furthermore, it has been suggested that TUB interacts, in a yeast two-hybrid screen, with a novel protein that shows homology to the stress/mitogen-activated protein kinases (Zheng et al., 1997). Recently the three-dimensional structure of the highly conserved tubby domain has been determined by X-ray crystallography (Boggon et al., 1999), which reveals a unique barrel structure with a long, positively charged putative DNA binding groove. Furthermore, it was shown that the tubby domain can bind double-stranded DNA and that the N-terminal domains of TUB and TULP1 can activate transcription from a GAL4 promoter when fused to the GAL4 DNA binding domain in a heterologous system.


Fig. 7. Chromosomal localization of the TUSP genes by radiation hybrid mapping. (A) Radiation hybrid map for mouse chromosome 17q23. (B) Radiation hybrid map for human chromosome 6q25-q26.

Therefore, it was suggested that TUB is a novel transcription factor. Like other tubby gene superfamily members, the predicted amino acid sequence of TUSP contains two potential NLSs (Garcia-Bustos et al., 1991). Additionally, TUSP shares 24% identity with the transactivator EBNA-2 (Ling et al., 1993). These characteristics suggest that TUSP may also function as a transcription factor to regulate the expression of other genes. Interestingly, the full-length TUSP protein tagged with GFP was localized in the cytoplasm, suggesting that these two NLSs are not sufficient to target the TUSP protein to the nuclear compartment. It is possible that other accessory protein factors and/or post-translational modifications such as phosphorylation or dephosphorylation are required to mediate nuclear localization of the TUSP protein in response to certain environmental stimuli. In that case, TUSP may function, like the STATs, as a transcription factor regulated by external signals, a process that involves both chemical modification and protein–protein interactions. The formation of distinct intracellular protein assemblies (or cytoplasmic bodies) by the GFP-tagged TUSP C-terminal portion in a transient transfection assay indicates that such protein–protein interactions are possible.

The N-terminal region of TUSP has a WD40 repeat region (82–208) and a SOCS domain (373–414) (Nicholson and Hilton, 1998; Vasiliauskas et al., 1999). The SOCS box was first identified in SH2 domain-containing proteins of the SOCS family and was also found in the WSB (WD40 repeat-containing proteins with a SOCS box) family and other proteins (Starr and Hilton, 1998; Kamura et al., 1998). The functions of these domains are still unknown.

However, it has been found that some SOCS domaincontaining proteins have immune regulation functions. For instance, mammalian cytokine-inducible SH2-containing (CIS) protein inhibits erythropoietin and interleukin 3 receptor signaling by competing with signaling molecules such as STATs for binding to phosphorylated receptor cytoplasmic domains (Hilton et al., 1998; Alexander et al., 1999; Zhang et al., 1999; Haque et al., 2000); mammalian SOCS-1 protein is a negative regulator of IL-6 signaling (Nicholson et al., 1999). However, further studies are needed to ®nd out whether these domains are functionally important to the TUSP protein. Acknowledgements The authors wish to thank Drs Predeep G. Kumar, Malini Laloraya and Richard McIndoe for helpful discussions. This research was supported by NIH grants P01 AI-42288 (J.X.S.). References Alexander, W.S., Starr, R., Metcalf, D., Nicholson, S.E., Farley, A., Elefanty, A.G., Brysha, M., Kile, B.T., Richardson, R., Baca, M., Zhang, J.G., Willson, T.A., Viney, E.M., Sprigg, N.S., Rakar, S., Corbin, J., Mifsud, S., DiRago, L., Cary, D., Nicola, N.A., Hilton, D.J., 1999. Suppressor of cytokine signaling (SOCS): negative regulators of signal transduction. J. Leukoc. Biol. 66, 588±592. Boggon, T.J., Shan, W.S., Santagata, S., Myers, S.C., Shapiro, L., 1999. Implication of tubby proteins as transcription factors by structure-based functional analysis. Science 286, 2119±2125. Coleman, D.L., Eicher, E.M., 1990. Fat (fat) and tubby (tub): two autoso-

284

Q.-Z. Li et al. / Gene 273 (2001) 275±284

mal recessive mutations causing obesity syndromes in the mouse. J. Hered. 81, 424±427. Davies, J.L., Kawaguchi, Y., Bennett, S.T., Copeman, J.B., Cordell, H.J., Pritchard, L.E., Reed, P.W., Gough, S.C., Jenkins, S.C., Palmer, S.M., Balfour, K.M., Rowe, B.R., Farrall, M., Barnett, A.H., Bain, S.C., Todd, J.A., 1994. A genome-wide search for human type 1 diabetes susceptibility genes. Nature 371, 130±136. Doucas, V., Tini, M., Egan, D., Evans, R., 1999. Modulation of CREB binding protein function by the promyelocytic (PML) oncoprotein suggests a role for nuclear bodies in hormone signaling. Proc. Natl. Acad. Sci. USA 96, 2627±2632. Garcia-Bustos, J., Heitman, J., Hall, M.N., 1991. Nuclear protein localization. Biochem. Biophys. Acta 1071, 83±101. Hagstrom, S.A., North, M.A., Nishina, P.M., Berson, E.L., Dryja, T.P., 1998. Recessive mutations in the gene encoding the tubby-like protein TULP1 in patients with retinitis pigmentosa. Nat. Genet. 18, 174±176. Haque, S.J., Harbor, P.C., Williams, B.R., 2000. Identi®cation of critical residues required for suppressor of cytokine signaling (SOCS)-speci®c regulation of IL-4 signaling. J. Biol. Chem. 275 (34), 26500±26506. He, W., Ikeda, S., Bronson, R.T., Yan, G., Nishina, P.M., North, M.A., Naggert, J.K., 2000. GFP-tagged expression and immunohistochemical studies to determine the subcellular localization of the tubby gene family members. Brain Res. Mol. Brain Res. 81 (1±2), 109±117. Heckenlively, J.R., Chang, B., Erway, L.C., Peng, C., Hawes, N.L., Hageman, G.S., Roderich, T.H., 1995. Mouse model for Usher syndrome: linkage mapping suggests homology to Usher type I reported at human chromosome 11p15. Proc. Natl. Acad. Sci. USA 92, 11100±11104. Hilton, D.J., Richardson, R.T., Alexander, W.S., Viney, E.M., Willson, T.A., Sprigg, N.S., Starr, R., Nicholson, S.E., Metcalf, D., Nicola, N.A., 1998. Twenty proteins containing a C-terminal SOCS box form ®ve structural classes. Proc. Natl. Acad. Sci. USA 95, 114±119. 
Kamura, T., Sato, S., Haque, D., Liu, L., Kaelin Jr., W.G., Conaway, R.C., Conaway, J.W., 1998. The Elongin BC complex interacts with the conserved SOCS-box motif present in members of the SOCS, ras, WD-40 repeat, and ankyrin repeat families. Genes Dev. 12, 3872–3881.
Kapeller, R., Moriarty, A., Strauss, A., Stubdal, H., Theriault, K., Siebert, E., Chickering, T., Morgenstern, J.P., Tartaglia, L.A., Lillie, J., 1999. Tyrosine phosphorylation of tub and its association with Src homology 2 domain-containing proteins implicate tub in intracellular signaling by insulin. J. Biol. Chem. 274 (35), 24980–24986.
Kleyn, P.W., Fan, W., Kovats, S.G., Lee, J.J., Pulido, J.C., Wu, Y., Berkemeier, L.R., Misumi, D.J., Holmgren, L., Charlat, O., Woolf, E.A., Tayber, O., Brody, B., Ebeling, C., Alperin, G.D., Deeds, J., Lakey, N.D., Culpepper, J., Chen, H., Glucksmann-Kuis, M.A., Carlson, G.A., 1996. Identification and characterization of the mouse obesity gene tubby: a member of a novel gene family. Cell 85, 281–290.
Lewis, C.A., Batlle, I.R., Batlle, K.G.R., Banerjee, P., Cideciyan, A.V., Huang, J., Aleman, T.S., Huang, Y., Ott, J., Gilliam, T.C., Knowles, J.A., Jacobson, S.G., 1999. Tubby-like protein 1 homozygous splice-site mutation causes early-onset severe retinal degeneration. Invest. Ophthalmol. Vis. Sci. 40, 2106–2114.

Ling, P.D., Ryon, J.J., Hayward, S.D., 1993. EBNA-2 of herpesvirus papio diverges significantly from the type A and type B EBNA-2 proteins of Epstein-Barr virus but retains an efficient transactivation domain with a conserved hydrophobic motif. J. Virol. 67, 2990–3003.
Luo, D.F., Buzzetti, R., Rotter, J.I., Maclaren, N.K., Raffel, L.J., Nistico, L., Giovannini, C., Pozzilli, P., Thomson, G., She, J.X., 1996. Confirmation of three susceptibility genes to insulin-dependent diabetes mellitus: IDDM4, IDDM5 and IDDM8. Hum. Mol. Genet. 5, 693–698.
Nicholson, S.E., Hilton, D.J., 1998. The SOCS proteins: a new family of negative regulators of signal transduction. J. Leukoc. Biol. 63, 665–668.
Nicholson, S.E., Willson, T.A., Farley, A., Starr, R., Zhang, J.G., Baca, M., Alexander, W.S., Metcalf, D., Hilton, D.J., Nicola, N.A., 1999. Mutational analyses of the SOCS proteins suggest a dual domain requirement but distinct mechanisms for inhibition of LIF and IL-6 signal transduction. EMBO J. 18, 375–385.
Nishina, P.M., North, M.A., Ikeda, A., Yan, Y., Naggert, J.K., 1998. Molecular characterization of a novel tubby gene family member, TULP3, in mouse and humans. Genomics 54, 215–220.
Noben-Trauth, K., Naggert, J.K., North, M.A., Nishina, P.M., 1996. A candidate gene for the mouse mutation tubby. Nature 380, 534–538.
North, M.A., Naggert, J.K., Yan, Y., Noben-Trauth, K., Nishina, P.M., 1997. Molecular characterization of Tub, Tulp1, and Tulp2, members of the novel tubby gene family and their possible relation to ocular disease. Proc. Natl. Acad. Sci. USA 94, 3128–3133.
Ohlemiller, K.K., Hughes, R.M., Mosinger-Ogilvie, J., Speck, J.D., Grosof, D.H., Silverman, M.S., 1995. Cochlear and retinal degeneration in the tubby mouse. NeuroReport 6, 845–849.
Starr, R., Hilton, D.J., 1998. SOCS: suppressors of cytokine signalling. Int. J. Biochem. Cell Biol. 30, 1081–1085.
Vasiliauskas, D., Hancock, S., Stern, C.D., 1999. SwiP-1: novel SOCS box containing WD-protein regulated by signalling centres and by Shh during development. Mech. Dev. 82, 79–94.
Wang, C.Y., Davoodi-Semiromi, A., Huang, W., Connor, E., Shi, J.D., She, J.X., 1998. Characterization of mutations in patients with autoimmune polyglandular syndrome type 1 (APS1). Hum. Genet. 103, 681–685.
Wang, C.Y., Shi, J.D., Davoodi-Semiromi, A., She, J.X., 1999. Cloning and genomic structure of Aire, the mouse homologue of the autoimmune regulator (AIRE) gene responsible for autoimmune polyglandular syndrome type 1 (APS1). Genomics 55, 322–326.
Zhang, J.G., Farley, A., Nicholson, S.E., Willson, T.A., Zugaro, L.M., Simpson, R.J., Moritz, R.L., Cary, D., Richardson, R., Hausmann, G., Kile, B.J., Kent, S.B., Alexander, W.S., Metcalf, D., Hilton, D.J., Nicola, N.A., Baca, M., 1999. The conserved SOCS box motif in suppressors of cytokine signaling binds to elongins B and C and may couple bound proteins to proteasomal degradation. Proc. Natl. Acad. Sci. USA 96, 2071–2076.
Zheng, P., Liu, Z., Ng, J., Wang, J., Lee, F., Kadia, C., et al., 1997. Obesity gene tubby interacts with a stress activated protein kinase. Soc. Neurosci. Abstr. 23, 1671.