GENOMICS
36, 316–319 (1996)
ARTICLE NO. 0467
Isolation of Three Testis-Specific Genes (TSA303, TSA806, TSA903) by a Differential mRNA Display Method
KOUICHI OZAKI, TAMOTSU KUROKI, SEITAKU HAYASHI, AND YUSUKE NAKAMURA1
Laboratory of Molecular Medicine, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
Received November 27, 1995; accepted June 18, 1996
We isolated three human testis-specific genes by a differential mRNA display method. The cDNAs contained open reading frames of 1620, 453, and 333 nucleotides, encoding 540, 151, and 111 amino acids, respectively. The first of these genes, designated TSA303, encodes a novel protein homologous to TCP20, one of the subunits of the human TRiC chaperonin complex that can bind newly synthesized or unstable folding intermediates of polypeptides and assist substrate proteins in folding, assembly, and transport. The second, TSA806, encodes a novel protein containing 3.3 contiguous repeats of the cdc10/SWI6 (ankyrin) motif that was originally found in products of cell cycle control genes of yeast and cell fate determination genes in Drosophila and Caenorhabditis elegans. The third gene, TSA903, encodes a protein homologous to the C-terminal region of murine uridine monophosphate kinase. Northern blot analysis confirmed that in 16 human adult tissues examined, each of these genes was expressed specifically in the testis. From the results of cDNA screening of nearly 1 million plaques, the abundance of each transcript in a preparation of total mRNA was estimated as 0.0004% (TSA303), 0.0006% (TSA806), and 0.0002% (TSA903). Our results imply that the differential display method is a powerful tool for isolation of tissue-specific genes even if they are expressed at a level as low as 1 in several hundred thousand to a million molecules of total mRNA. © 1996 Academic Press, Inc.

Sequence data from this article have been deposited with the GenBank/EMBL Data Libraries under Accession Nos. D78333–D78335.
1 To whom correspondence should be addressed. Telephone: 8135449-5372. Fax: 81-35449-5433.
0888-7543/96 $18.00 Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

INTRODUCTION

A PCR-based technique termed differential mRNA display is a unique and powerful tool for identification of genes expressed differently among various tissues or cell lines. This approach is designed to screen a defined subpopulation of transcripts through RT-PCR, using arbitrarily selected primers, and to display the results as bands on a gel after electrophoretic separation of the amplified cDNAs (Liang and Pardee, 1992; Welsh et al., 1992). Parallel comparison of the displays allows one to identify differentially expressed species with ease. Unlike so-called subtractive hybridization techniques that are designed to compare two samples, this method permits comparisons among multiple samples. Furthermore, it provides access to transcripts that are too rare to be detected with conventional differential hybridization techniques, and, since the products are tagged at both ends with primers, they can be easily recovered from the gel, reamplified, and used for various additional analyses.

In recent years significant efforts have been made to isolate and characterize testis- and spermatogenesis-specific genes (Propst et al., 1988). The spermatogenic cell lineage demonstrates unique cellular and developmental characteristics; spermatogenic precursor cells constantly differentiate into mature spermatozoa in adult males (Bellve et al., 1977), a process involving a transition from mitotically to meiotically dividing cells.

Using a differential mRNA display method, we isolated three novel human genes that were expressed specifically in testis. We suggest that the differential display method is a powerful technique for isolating tissue-specific genes even if the abundance of the transcript is as low as 0.0001–0.001% of the total population of mRNA molecules.

MATERIALS AND METHODS
Differential mRNA display. The differential display procedure was performed essentially as described by Liang and Pardee (1992). Poly(A) RNA (0.2 μg) isolated from each of nine human tissues (brain, lung, liver, stomach, pancreas, breast, testis, ovary, heart) (Clontech) was mixed with 25 pmol of 3′-anchored oligo(dT) primer G(T)15MA (M represents a mixture of G, A, and C) in 8 μl of diethylpyrocarbonate-treated water and heated at 65°C for 5 min. To this solution 4 μl of 5× first-strand buffer (BRL), 2 μl of 0.1 M DTT (BRL), 1 μl of 250 μM dNTPs (BRL), 1 μl of ribonuclease inhibitor (40 units; TOYOBO), and 1 μl of Superscript II reverse transcriptase (200 units; BRL) were added. The final volume of each reaction mixture was 20 μl. After each solution was incubated at 37°C for 1 h, it was diluted 2.5-fold by addition of 30 μl of distilled water and stored at −20°C until use. The cDNAs were amplified by PCR in the presence of [α-35S]dATP (10 mCi/ml; Amersham). PCR amplification of cDNAs was carried out as follows: each 20-μl PCR mixture contained 2 μl of the RT reaction mixture, 2 μl of 10× PCR buffer (Boehringer), 1.2 μl of 25 μM dNTPs, 1 μl of [α-35S]dATP, 0.25 μl of Taq DNA polymerase (5 units/μl; Boehringer), 25 pmol 3′-anchored oligo(dT) primer, and 25 pmol 5′-primer (10-mer deoxyoligonucleotide primer with arbitrary sequences: No. 3, 5′-CTTGATTGCC-3′; No. 8, 5′-GGAACCAATC-3′; or No. 9, 5′-AAACTCCGTC-3′). Reactions were performed under the following conditions: one cycle of 3 min at 95°C, 5 min at 40°C, and 5 min at 72°C, then 40 cycles of 0.5 min at 95°C, 2 min at 40°C, and 1 min at 72°C followed by 5 min at 72°C. Samples were precipitated with ethanol, resuspended in formamide sequencing dye, and run on a 6% acrylamide/7.5 M urea sequencing gel. The gel was dried without fixation and subjected to overnight autoradiography.

Subcloning of amplified cDNA fragments. The autoradiogram and the dried gel were oriented with radioactive ink, so that tissue-specific cDNA bands could be located by marking with a pencil. The desired gel slice, along with the 3 MM paper, was kept in 300 μl of dH2O for 1 h with shaking. After the polyacrylamide gel and the paper were removed, cDNA was recovered by ethanol precipitation in the presence of 0.3 M NaOAc, with 1 μl of 10 mg/ml glycogen as a carrier, and redissolved in 10 μl of dH2O. For reamplification, 5 μl of this solution was used. PCR conditions and the primers were the same as for the first PCR. Reamplified products of the appropriate sizes were recovered as in the first PCR experiment and then cloned into an EcoRV site of the pBluescript SK(−) vector. Nucleotide sequences were determined by the dideoxy chain-termination method with T7 DNA polymerase.

Northern blot analysis. Northern blot analysis was performed using human multiple tissue Northern blots I and II (Clontech). The cDNA fragments were labeled with [α-32P]dCTP by PCR, using the primer set of T3 and T7 promoter sequences. The membranes were prehybridized and then hybridized according to the manufacturer’s protocol. Washed membranes were autoradiographed for 48 h at −80°C.

Screening of cDNA. A human normal testis cDNA library was constructed using oligo(dT)-primed human normal testis cDNA and Uni-ZAP XR (Stratagene). A total of 1 × 10^6 clones were screened with [α-32P]dCTP-labeled cDNA fragments that had been isolated by differential display. Positive clones were selected and their insert DNAs were excised in vivo in pBluescript II SK(−) according to the supplier’s recommendations.
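The screening depth described above sets how rare a transcript can be and still be recovered from the library. As a rough sketch (not part of the original protocol), the abundance figures quoted in this article follow from dividing the number of positive clones for each probe by the number of clones screened. The function name `abundance_percent` and the variable names are illustrative assumptions; the clone counts (four, six, and two) are those reported in Results and Discussion.

```python
def abundance_percent(positive_clones: int, clones_screened: int) -> float:
    """Return positive clones as a percentage of all clones screened."""
    if clones_screened <= 0:
        raise ValueError("clones_screened must be positive")
    if not 0 <= positive_clones <= clones_screened:
        raise ValueError("positive_clones must lie between 0 and clones_screened")
    return 100.0 * positive_clones / clones_screened


# Values taken from the article: 1 x 10^6 clones screened per probe.
CLONES_SCREENED = 1_000_000
POSITIVE_CLONES = {"TSA303": 4, "TSA806": 6, "TSA903": 2}

# Point estimate of transcript abundance for each gene, in percent.
estimates = {
    gene: abundance_percent(count, CLONES_SCREENED)
    for gene, count in POSITIVE_CLONES.items()
}

for gene, percent in estimates.items():
    print(f"{gene}: {percent:.4f}% of clones screened")
```

With only two to six positives per probe, these values are best read as order-of-magnitude estimates. The sketch also assumes that the library represents transcripts roughly in proportion to their abundance in the mRNA preparation.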
RESULTS AND DISCUSSION
Differential mRNA Display

To identify human genes expressed in a tissue-specific manner, we compared differential display patterns using mRNAs isolated from brain, lung, liver, stomach, pancreas, testis, ovary, breast, and heart. We performed PCR amplifications with three primer combinations (see Materials and Methods) and identified three PCR products that were expressed specifically in testis among the nine tissues examined (data reviewed, but not shown). These products, designated TSA303, TSA806, and TSA903, were cloned and sequenced; they consisted of 190, 138, and 133 nucleotides, respectively. A search among DNA sequences in the EMBL database (Release 42.0), using the FASTA program (Pearson and Lipman, 1988), revealed no significant homologies between any of these PCR products and known DNA sequences.

Northern Blot Analysis with Cloned Fragments

To confirm the expression patterns observed in the differential display profiles, we examined expression of the three novel genes in various human adult tissues by Northern blot analysis. As shown in Fig. 1, each cDNA clone revealed a band specifically in testis; the transcript corresponding to TSA303 (a) was approximately 2.3 kb, that of TSA806 (b) was 1.4 kb, and that of TSA903 (c) was 0.9 kb. These results were consistent with the expression patterns seen in the differential display experiments.

FIG. 1. Northern analysis of TSA303 (a), TSA806 (b), and TSA903 (c) in various human tissues. Fragments obtained from the differential mRNA display were amplified and used as probes. The β-actin level was measured as a control.

Cloning and Sequencing of Three Testis-Specific cDNAs

In the course of screening a human testis cDNA library (1 × 10^6 plaques) using cDNA fragments isolated through the differential display experiments as probes, we identified four positive clones of TSA303, six of TSA806, and two of TSA903. On the basis of these results we estimated the abundance of each transcript among total mRNA to be approximately 0.0004% (TSA303), 0.0006% (TSA806), and 0.0002% (TSA903).

The cDNA clone corresponding to TSA303 contained 1759 nucleotides, including a 1620-nucleotide open reading frame (data not shown; GenBank Accession No. D78333), which would encode a protein of 540 amino acids with a calculated molecular mass of 59,493 Da (Fig. 2a). The predicted amino acid sequence of TSA303 is highly homologous (83% identity) to that of human TCP20 (Fig. 2a), a ubiquitously expressed subunit of the TRiC chaperonin complex that in eukaryotes participates in functional folding of actin, α-, β-, and γ-tubulins, centractin, luciferase, and phytochrome in ATP-dependent processes (Gao et al., 1993; Melki et al., 1993; Mummert et al., 1993). TRiC, a hetero-oligomer, appears to be composed of at least seven different subunits (Frydman et al., 1992). Hence, the similarity between the TSA303 protein and TCP20 suggests that TSA303 is a member of the widespread family of TRiC chaperonin subunits and is likely to mediate protein folding in testicular cells.

FIG. 2. (a) Alignment of the deduced amino acid sequence of TSA303 with human TCP20. Identities are indicated by black background. (b) Deduced amino acid sequence of TSA806. (c) Alignment of the cdc10/SWI6 motif in TSA806 with related proteins. A 34-amino-acid sequence, repeated 3.3 times in TSA806, is shown; residues that are common to more than two repeats are shaded and used to derive the TSA806 consensus sequence. Amino acid sequences shown below the TSA806 sequence are consensus sequences for other members of the protein family containing the cdc10/SWI6 motif, where the consensus sequences are defined as residues present in more than 50% of the individual consensus repeat sequences. Dashes indicate residues that are not common in the individual consensus repeat sequences. (d) Alignment of the deduced amino acid sequences of TSA903 with murine UMP kinase (M. UMK). Identities are indicated by black background.

The cDNA clone corresponding to TSA806 contained 1067 nucleotides with a 453-nucleotide open reading frame (data not shown; GenBank Accession No.
D78334); the predicted protein of 151 amino acids would have a calculated molecular mass of 16,897 Da (Fig. 2b). This protein contains 3.3 copies of a 34-amino-acid repeat that was previously named the “cdc10/SWI6 motif” (Borus et al., 1990; Lux et al., 1990); the alignment shown in Fig. 2c indicates that the TSA806 protein is a novel member of the cdc10/SWI6 motif superfamily. However, in comparison with other proteins of this superfamily, the TSA806 protein has some unique structural characteristics. First, TSA806 is relatively small, and its arrangement of the cdc10/SWI6 motif is simple: 114 of 151 amino acids of the predicted protein constitute 3.3 tandem repeats of this motif. The cdc10/SWI6 motif is followed by a short stretch of relatively hydrophobic sequence that is apparently unrelated to the cdc10/SWI6 motif. This is in contrast with ankyrin, for example, which has a molecular mass of 206,144 Da and contains 22 repeats of the cdc10/SWI6 motif that account for one-third of the protein molecule (Lux et al., 1990). Second, as the deduced protein sequence of TSA806 contains neither signal peptides nor hydrophobic segments, it is a typical cytosolic protein; most members of the cdc10/SWI6 motif superfamily possess hydrophobic transmembrane domains (Wharton et al., 1985; Yochem et al., 1988, 1989).

Certain members of the cdc10/SWI6 motif-containing superfamily of proteins have been studied in detail with respect to functional roles of the cdc10/SWI6 motif in the signal transduction cascade through interactions with other proteins. For example, IκB (Bauerle and Baltimore, 1988; Haskill et al., 1991) and the β-subunit of GA-binding protein (LaMarco et al., 1991; Thompson et al., 1991), both of which possess the cdc10/SWI6 motif, play crucial roles in the regulation of gene transcription through interaction with NF-κB and the α-subunit of GA-binding protein (LaMarco et al., 1991), respectively. The proteins require the cdc10/SWI6 motif to interact with their partner proteins (Blank et al., 1992; Inoue et al., 1992). The TSA806 protein’s structural homology to IκB and the GA-binding protein β-subunit suggests that it may function through protein–protein interactions in testicular cells, although other members of the cdc10/SWI6 motif superfamily appear to have relatively diverse physiological functions.

The cDNA corresponding to TSA903 was composed of 831 nucleotides, including a 333-nucleotide open reading frame (data not shown; GenBank Accession No. D78335) encoding a protein of 111 amino acids with a calculated molecular mass of 12,616 Da (Fig. 2d). The TSA903 protein shared 76, 48, and 43% identity, respectively, with the C-terminal regions of the UMP kinases of mouse (Fig. 2d) (EC 2.7.1.48), yeast (Liljelund and Lacroute, 1986), and Escherichia coli (Serina et al., 1995). The C-terminal region of UMP kinase is considered to provide the catalytic site of this enzyme (Traut, 1994). The proposed structure of UMP kinase in mouse indicates that 10 amino acid residues (Arg138, Leu139, Phe140, Val141, Asp142, Arg151, Arg152, Val153, Leu154, and Arg155) bind to ATP (Traut, 1994). All of these residues are conserved in TSA903 protein; we
found 9 of the 10 to be identical, while the other (Arg138) was substituted with the relatively conserved Lys. The structural similarity suggests that TSA903 plays a similar role, i.e., conversion of UMP and other monophosphates into corresponding diphosphates in the presence of ATP.

Unlike subtractive hybridization-based approaches for the identification of transcripts present in one sample but not in another, the method reported here enabled us to compare more than two samples simultaneously. Differential mRNA display can also be applied to identify transcripts that show other patterns of differential expression, such as induction, reduction, and transient changes, in addition to the tissue specificity shown here. A human prostate carcinoma-specific gene (Shen et al., 1995), human breast cancer-specific genes (Watson and Fleming, 1994; Swisshelm et al., 1995), and a mouse developmental stage-specific gene (Zimmermann and Schultz, 1994) have been isolated by means of differential mRNA display. The work reported here also supports the differential mRNA display technique as a tool for discovering a variety of novel tissue-specific genes.

ACKNOWLEDGMENTS

We thank Kiyoshi Noguchi and Keiko Kogawa for their kind assistance. This work was supported by grants from the Ministry of Education, Culture, Sports and Science of Japan.
REFERENCES

Bauerle, P. A., and Baltimore, D. (1988). I kappa B: A specific inhibitor of the NF-κB transcription factor. Science 242: 540–546.
Bellve, A. R., Cavicchia, J. C., Millette, C. F., O’Brien, D. A., Bhatnagar, Y. M., and Dym, M. (1977). Spermatogenic cells of the prepubertal mouse: Isolation and morphological characterization. J. Cell Biol. 74: 68–85.
Blank, V., Kourilsky, P., and Israel, A. (1992). NF-κB and related proteins: Rel/dorsal homologies meet ankyrin-like repeat. Trends Biochem. Sci. 17: 135–140.
Borus, V., Villalobos, J., Burd, R., Kelly, K., and Siebenlist, U. (1990). Cloning of a mitogen-inducible gene encoding a kappa B DNA-binding protein with homology to the rel oncogene and to cell-cycle motifs. Nature 348: 76–80.
Frydman, J., Nimmesgern, E., Erdjument-Bromage, H., Wall, J. S., Tempst, P., and Hartl, F. U. (1992). Function in protein folding of TRiC, a cytosolic ring complex containing TCP-1 and structurally related subunits. EMBO J. 11: 4767–4778.
Gao, Y., Vainberg, I. E., Chow, R. L., and Cowan, N. J. (1993). Two cofactors and cytoplasmic chaperonin are required for the folding of alpha- and beta-tubulin. Mol. Cell. Biol. 13: 2478–2485.
Haskill, S., Beg, A. A., Tompkins, S. M., Morris, J. S., Yurochko, A. D., Johannes, A. S., Mondal, K., Ralph, P., and Baldwin, A. S. (1991). Characterization of an immediate-early gene induced in adherent monocytes that show I kappa B-like activity. Cell 65: 1281–1289.
Inoue, J., Kerr, L. D., Rashid, D., Bose, H. R., and Verma, I. M. (1992). Direct association of pp40/I kappa B beta with rel/NF-kappa B transcription factors: Role of ankyrin repeats in the inhibition of DNA binding activity. Proc. Natl. Acad. Sci. USA 89: 4333–4337.
LaMarco, K., Thompson, C. C., Byers, B. P., Walton, E. M., and McKnight, S. L. (1991). Identification of Ets- and notch-related subunits in GA binding protein. Science 253: 789–792.
Liang, P., and Pardee, A. B. (1992). Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257: 967–971.
Liljelund, P., and Lacroute, F. (1986). Genetic characterization and isolation of the Saccharomyces cerevisiae gene coding for uridine monophosphokinase. Mol. Gen. Genet. 205: 74–81.
Lux, S. E., John, K. M., and Bennett, V. (1990). Analysis of cDNA for human erythrocyte ankyrin indicates a repeated structure with homology to tissue-differentiation and cell-cycle control protein. Nature 344: 36–42.
Melki, R., Vainberg, I. E., Chow, R. L., and Cowan, N. J. (1993). Chaperonin-mediated folding of vertebrate actin-related protein and gamma-tubulin. J. Cell Biol. 122: 1301–1310.
Mummert, E., Grimm, R., Speth, V., Eckerskorn, C., Schilz, E., Gatenby, A., and Schafer, E. (1993). A TCP1-related molecular chaperone from plants refolds phytochrome to its photoreversible form. Nature 363: 644–648.
Pearson, W. R., and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85: 2444–2448.
Propst, F., Rosenberg, M. P., and van de Woude, G. F. (1988). Proto-oncogene expression in germ cell development. Trends Genet. 4: 183–187.
Serina, L., Blondin, C., Krin, E., Sismeiro, O., Danchin, A., Sakamoto, H., Gilles, A. M., and Barzu, O. (1995). Escherichia coli UMP-kinase, a member of the aspartokinase family, is a hexamer regulated by guanine nucleotides and UTP. Biochemistry 34: 5066–5074.
Shen, R., Su, Z. Z., Olsson, C. A., and Fisher, P. B. (1995). Identification of the human prostatic carcinoma oncogene PTI-1 by rapid expression cloning and differential RNA display. Proc. Natl. Acad. Sci. USA 92: 6778–6792.
Swisshelm, K., Ryan, K., Tsuchiya, K., and Sager, R. (1995). Enhanced expression of an insulin growth factor-like binding protein (mac25) in senescent human mammary epithelial cells and induced expression with retinoic acid. Proc. Natl. Acad. Sci. USA 92: 4472–4476.
Thompson, C. C., Brown, T. A., and McKnight, S. L. (1991). Convergence of Ets- and notch-related structural motifs in a heteromeric DNA binding complex. Science 253: 762–768.
Traut, T. W. (1994). The functions and consensus motifs of nine types of peptide segments that form different types of nucleotide-binding sites. Eur. J. Biochem. 222: 9–19.
Watson, M. A., and Fleming, T. P. (1994). Isolation of differentially expressed sequence tags from human breast cancer. Cancer Res. 54: 4598–4602.
Welsh, J., Chada, S., Dalal, S., Cheng, R., Ralph, D., and McClelland, M. (1992). Arbitrarily primed PCR fingerprinting of RNA. Nucleic Acids Res. 20: 4965–4970.
Wen-Zhuo, L., Paul, L., Judith, F., Thomas, R. B., Thomas, S. C., Lynn, M. R., David, T., Marshall, A. L., Franz-Ulrich, H., Fred, S., and George, B. S. (1994). Tcp20, a subunit of the eukaryotic TRiC chaperonin from humans and yeast. J. Biol. Chem. 269: 18616–18622.
Wharton, K. A., Johansen, K. M., Xu, T., and Artavanis-Tsakonas, S. (1985). Nucleotide sequence from the neurogenic locus notch implies a gene product that shares homology with proteins containing EGF-like repeat. Cell 43: 567–581.
Yochem, J., Weston, K., and Greenwald, I. (1988). The Caenorhabditis elegans lin-12 gene encodes a transmembrane protein with overall similarity to Drosophila notch. Nature 335: 547–550.
Yochem, J., and Greenwald, I. (1989). glp-1 and lin-12, genes implicated in distinct cell–cell interactions in C. elegans, encode similar transmembrane proteins. Cell 58: 553–563.
Zimmermann, J. W., and Schultz, R. M. (1994). Analysis of gene expression in the preimplantation mouse embryo: Use of mRNA differential display. Proc. Natl. Acad. Sci. USA 91: 5456–5460.