Genomics 61, 210 –218 (1999) Article ID geno.1999.5951, available online at http://www.idealibrary.com on
Human Eukaryotic Initiation Factor EIF2C1 Gene: cDNA Sequence, Genomic Organization, Localization to Chromosomal Bands 1p34 –p35, and Expression Robert Koesters,* ,1 Volker Adams,* ,2 David Betts,† Rita Moos,* Mirka Schmid,* Anja Siermann,‡ Shabbir Hassam,* Sandra Weitz,§ Peter Lichter,§ Philipp U. Heitz, ¶ Magnus von Knebel Doeberitz,‡ and Jakob Briner* ,3 *Institute of Clinical Pathology and ¶Department of Pathology, University Hospital of Zurich, Schmelzbergstrasse 12, 8091 Zurich, Switzerland; ‡Division of Molecular Diagnostics and Therapy, Department of Surgery, University of Heidelberg, Im Neuenheimer Feld 110, 69120 Heidelberg, Germany; †Children’s Hospital of Zurich, Steinwiesstrasse 75, 8032 Zurich, Switzerland; and §Division of the Organization of Complex Genomes, Deutsches Krebsforschungszentrum Heidelberg, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany Received March 16, 1999; accepted August 3, 1999
We report the cloning and characterization of the human eukaryotic protein translation initiation factor EIF2C1 gene. The human EIF2C1 gene consists of 19 exons and 18 introns that span a region of almost 50 kb. It is located on the short arm of chromosome 1 in the region 1p34–p35. This genomic region is frequently lost in human cancers such as Wilms tumors, neuroblastoma, and carcinomas of the breast, liver, and colon. The human EIF2C1 gene is ubiquitously expressed at low to medium levels. Differential polyadenylation and splicing result in a complex transcriptional pattern. The cDNA sequence is 7478 bp long and contains an extremely large 3′ untranslated region of 4799 bp with multiple short repeated segments composed of mono-, tri-, or tetranucleotides interspersed throughout. The human EIF2C1 gene belongs to a multigene family in humans. It is highly conserved during evolution, sharing about 90% identity with rabbit eIF2C and 70% identity with plant AGO1 at the amino acid level. These facts suggest that human EIF2C1 might play an important physiological role. © 1999 Academic Press

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession Nos. AF093097 and AF121255.
1 To whom correspondence should be addressed at present address: Division of Molecular Diagnostics and Therapy, Department of Surgery, University of Heidelberg, Im Neuenheimer Feld 110, 69120 Heidelberg, Germany. Telephone: (011) 49 6221 422467. Fax: (011) 49 6221 422417. E-mail: [email protected].
2 Present address: University of Leipzig, Heart Center, Russenstrasse 19, 04289 Leipzig, Germany.
3 Present address: Institute of Histological and Cytological Diagnostics, Dammweg 1, 5001 Aarau, Switzerland.

0888-7543/99 $30.00 Copyright © 1999 by Academic Press All rights of reproduction in any form reserved.

INTRODUCTION

The initiation of protein translation is a major way in which eukaryotic cells control gene expression (Pain, 1996). The initiation pathway itself consists of a number of discrete steps that involve at least 12 unique proteins, which are referred to as eukaryotic initiation factors. The three steps thought to be most important in regulating the process in vivo are the formation of a ternary complex composed of eukaryotic initiation factor 2 (eIF2), initiator methionyl-tRNAi (Met-tRNA), and GTP; the transfer of the Met-tRNA to the 40S ribosomal subunit; and the formation of the Met-tRNA–40S–mRNA complex. Evidence has accumulated that several accessory protein factors are involved in this process. Among these protein factors, eIF2C has been shown to stimulate ternary complex formation and to stabilize the ternary complex and the Met-tRNA–40S–mRNA complex in reticulocyte lysates. eIF2C-like activity has been demonstrated in widely divergent eukaryotic sources such as mouse ascitic tumor cells (Dasgupta et al., 1978), wheat germ (Osterhout et al., 1983; Seal et al., 1983), Artemia salina (Woodley et al., 1981), and yeast (Ahmad et al., 1985). Recently, a 94-kDa protein displaying eIF2C-like activity was purified from rabbit reticulocyte lysates, which finally led to the cloning of rabbit eIF2C cDNA (Zou et al., 1998).

The AGO1 gene, which displays high similarity to rabbit eIF2C, has been isolated from Arabidopsis thaliana (Bohmert et al., 1998) through a functional cloning method that involved T-DNA tagging techniques. Plants homozygous for T-DNA insertions into the AGO1 gene were macroscopically identified through their significantly different overall morphology. They form unexpanded, pointed cotyledons and narrow leaves that lack blades and possess an abnormal inflorescence that bears infertile flowers. T-DNA insertions into the AGO1 locus could be demonstrated in two different mutant lines. Both insertions disrupted the AGO1 transcript independently, which suggests a causal relationship between AGO1 gene activity and proper plant development. The high sequence similarity to rabbit eIF2C (50% identical amino acids) suggests that AGO1 encodes the plant homologue of eIF2C.

In the course of experiments carried out to isolate candidate genes that might be involved in the development of Wilms tumor (a childhood kidney tumor), we used differential cDNA cloning to identify the Q99 gene, which is more strongly expressed in tumors harboring a mutation of the Wilms tumor suppressor gene WT1 than in tumors expressing wild-type WT1 (Koesters et al., manuscript in preparation). From sequence comparisons, it became clear that the Q99 gene represents the human homologue of rabbit eIF2C and plant AGO1. Since there is unambiguous evidence for the existence of multiple Q99-related genes in humans, we refer to the Q99 gene as human EIF2C1. In this report, we describe the cDNA sequence, genomic structure, chromosomal localization, and mRNA expression of the human EIF2C1 gene.

FIG. 1. Nucleotide and deduced amino acid sequence of human EIF2C1 cDNA. This sequence is available from GenBank under Accession No. AF093097. The two different polyadenylation signals are underlined. A homotypic T-stretch is double-underlined, and two repeats, one of eight CCTG quadruplets and one of seven CAG triplets, are indicated by dots beneath the sequence. Individual repeat units alternate between boldface and normal type.

KOESTERS ET AL.

FIG. 2. The structure of the human EIF2C1 genomic locus, showing the positions and sizes of the introns and exons drawn approximately to scale. Translated exons are indicated by black boxes; the large open box indicates the untranslated part of exon 19. The central part of the figure shows the array of overlapping plasmid clones that were used to determine the exon–intron boundaries. The plasmid clones are individually numbered and were generated by subcloning the P1 genomic clone P1-Q99 using EcoRI (E), PstI (Pst), SacI (S), SphI (Sph), or SacI + XbaI (SX). F-O6-4/6-5 refers to a genomic fragment created by PCR; this procedure is outlined under Materials and Methods. The exact locations of exons 9, 10, and 11 have not been refined within the limits of the respective plasmid clones.

MATERIALS AND METHODS

Procedures for isolation and sequencing of cDNA and genomic clones. A human fetal kidney-derived λDR2 cDNA library was purchased from Clontech (Palo Alto, CA). Screening approximately 6 × 10^5 plaques with the 1.6-kb insert of the previously isolated cDNA clone Q99 as probe yielded seven clones harboring inserts of various sizes. The inserts were rescued into the pDR2 plasmid by in vivo excision according to the manufacturer’s protocol. A genomic P1 clone (P1-Q99) was provided by Genome Systems (St. Louis, MO) after a human P1 library was screened by PCR using two gene-specific primers (sense, position 2513–2533, 5′-GGGATGACAACCGTTTCACAG-3′; antisense, position 2691–2708, 5′-CTCTGCCCCGATATGTGG-3′).
The combination of the same antisense primer and a different sense primer (position 2386–2405, 5′-AGTGGTAACATCCCAGCTGG-3′) gave rise to the genomic PCR fragment F-O6-4/6-5, which spans exon 18 (Fig. 2). Plasmid libraries were established from P1-Q99 by shotgun subcloning of P1 DNA restricted with various enzymes into pBluescript (Stratagene, La Jolla, CA) or pZero (Invitrogen, San Diego, CA). Overlapping clones were partially sequenced and arranged either by identifying exonic sequences or by sequential hybridization using defined subfragments that lacked repetitive sequences.

For RT-PCR cloning experiments, poly(A)+ RNA was prepared from the human colon cancer cell line HCT116 and converted to cDNA using Superscript II reverse transcriptase (Life Technologies, Gaithersburg, MD). PCR was typically carried out for 35 cycles (30 s at 94°C, 30 s at the primer-specific annealing temperature, and 1 min at 72°C) using ExTaq Polymerase (Takara, Japan). For the rapid amplification of cDNA ends (RACE), cDNA synthesis was carried out as described above and usually primed using a gene-specific primer. The resulting cDNA was purified using the High Pure PCR Purification Kit (Roche Diagnostics, Basel, Switzerland) and was tailed in a total volume of 50 µl of tailing buffer using terminal transferase and 200 µM dATP according to the manufacturer’s instructions (Roche Diagnostics). Subsequent PCR was carried out for 40 cycles with an oligo(dT) anchor primer (5′-GACCACGCGTATCGATGTCGAC(T)16V-3′) and a nested gene-specific primer. Nested PCR was carried out for 25 cycles with the anchor primer (5′-GACCACGCGTATCGATGTCGAC-3′, Roche Diagnostics) and a third nested gene-specific primer. PCR products were cloned into pCRII (TA Cloning, Invitrogen). PCRs performed on the extremely GC-rich 5′ end of the EIF2C1 cDNA were strictly dependent on the presence of 10 mM DMSO in the PCR buffer.

DNA sequencing was carried out by the dideoxy chain termination method with cycle sequencing on an ABI sequencing unit (ABI Prism 310, Perkin–Elmer). Nucleotide and amino acid sequence analyses were carried out using the Heidelberg Unix Sequence Analysis Resources (HUSAR) software package.

Size determination of introns. Exon–intron positions were determined by comparing the genomic sequence with the cDNA sequence.
The smallest introns were systematically sequenced, while the sizes of larger introns were estimated from the sizes of the inserts from the corresponding genomic plasmid clones.

Fluorescence in situ hybridization (FISH). Total DNA from the P1 clone P1-Q99 was labeled with biotin by nick-translation and hybridized to human metaphase chromosomes as previously described (Lichter et al., 1990). Briefly, 60 ng of labeled probe was combined with 3 µg of human Cot1 DNA and 7 µg of salmon sperm DNA in a 10-µl hybridization cocktail and hybridized overnight at 37°C. After the posthybridization washes, the biotin-labeled probe was detected by incubation with streptavidin conjugated to Cy3. Chromosomes were banded with 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI). Digitized images of Cy3 and DAPI fluorescence were recorded with a CCD camera (Photometrics), electronically overlaid, and aligned. Photographs were taken from the monitor.
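The primer coordinates quoted in the cloning procedures above lend themselves to a quick consistency check. The following is a minimal sketch, not part of the original methods: it confirms that each primer sequence is as long as its quoted coordinate range, derives the reverse complement of the antisense primer, and computes the distance the primer pairs cover at the cDNA level. The dictionary keys, the helper names, and the use of the Wallace rule of thumb for a rough melting temperature are illustrative assumptions; the paper does not state how annealing temperatures were chosen. On genomic DNA a product would be longer than the cDNA-level span by the length of any intervening intron.

```python
# Primer sequences and 1-based, inclusive cDNA positions as quoted in the text.
# The dictionary keys are illustrative labels, not names used in the paper.
PRIMERS = {
    "sense_screen": ("GGGATGACAACCGTTTCACAG", 2513, 2533),
    "sense_fragment": ("AGTGGTAACATCCCAGCTGG", 2386, 2405),
    "antisense": ("CTCTGCCCCGATATGTGG", 2691, 2708),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def wallace_tm(seq):
    """Rough melting temperature in deg C by the rule 2*(A+T) + 4*(G+C).

    This rule of thumb is an assumption of this sketch, not the paper's method.
    """
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def cdna_span(sense_name, antisense_name):
    """cDNA-level distance from the sense 5' end to the antisense 5' end."""
    sense_start = PRIMERS[sense_name][1]
    antisense_end = PRIMERS[antisense_name][2]
    return span_length(sense_start, antisense_end)


for name, (seq, start, end) in PRIMERS.items():
    # Each quoted sequence should be as long as its quoted coordinate range.
    assert len(seq) == span_length(start, end), name
    print(f"{name}: {len(seq)} nt, Wallace Tm about {wallace_tm(seq)} C")

print("antisense binds to:", reverse_complement(PRIMERS["antisense"][0]))
print("screening pair spans", cdna_span("sense_screen", "antisense"), "bp of cDNA")
print("fragment pair spans", cdna_span("sense_fragment", "antisense"), "bp of cDNA")
```

All three primer lengths agree with their quoted positions, which is a useful sanity check on the coordinates after text extraction.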
HUMAN EIF2C1 GENE
FIG. 3. Chromosomal localization of the human EIF2C1 gene by FISH. Human metaphase chromosome spread after in situ hybridization with the biotin-labeled probe P1-Q99 detected via Cy3. Specific hybridization signals are found in 1p34–p35. Chromosome bands were visualized with DAPI. Both panels show digitized images that result from the electronic overlay of two fluorescence images obtained from the hybridization signal and the DAPI stain, respectively.

Northern blot analysis. Northern blot analysis was carried out with human Multiple Tissue Northern (MTN) blots (Clontech) according to the manufacturer’s instructions. A partial cDNA clone spanning 1 kb of the carboxy-terminal end and 0.8 kb of the 3′ untranslated cDNA sequence was used as a probe. Blots were prehybridized for 12 h and then hybridized for 24 h at 65°C in 7% SDS/1% BSA/1 mM EDTA/0.5 M NaPi (pH 7.2) containing 100 µg/ml denatured salmon sperm DNA. The blot was washed with 2× SSC/0.05% SDS at room temperature for 1 h and then with 0.1× SSC/0.1% SDS at 65°C for 1 h. The filters were exposed for 1–3 days at −80°C with an intensifying screen.
RESULTS
cDNA and Genomic Cloning of the Human EIF2C1 Gene

Using the technique of differential display, we previously isolated several sequences that were differentially expressed between Wilms tumors harboring or lacking mutations of the Wilms tumor suppressor gene WT1 (Koesters et al., manuscript in preparation). Among the sequences found to be preferentially expressed in Wilms tumors containing mutant WT1, clone Q99 represented a partial cDNA of a previously unknown gene, which could subsequently be mapped to human chromosome 1p34–p35. Since the same genomic region has been found to be frequently lost in a variety of human malignancies including Wilms tumors (Grundy et al., 1994; Steenman et al., 1997), neuroblastoma, melanoma, pheochromocytoma, and carcinomas of the breast, liver, and colon (reviewed in Weith et al., 1996), we proceeded to isolate the full-length cDNA sequence of this gene and to characterize it in detail.

The sequence of the initial cDNA clone Q99 displayed striking similarity (70% identical at the protein level) to a gene from Caenorhabditis elegans. The sequence of this gene, F48F7.1, had been determined in the course of the C. elegans sequencing project, but because of the lack of functional data, the possible function of the predicted protein remained elusive. Recently, Zou et al. (1998) reported the cDNA cloning of the rabbit protein translation initiation factor eIF2C gene and, almost simultaneously, Bohmert et al. (1998) reported the functional cloning of the AGO1 gene from A. thaliana, the plant counterpart of rabbit eIF2C. The high degree of similarity among these four different sequences (see below) indicated that C. elegans F48F7.1 probably represents the nematode homologue and that Q99 represents part of the human homologue of rabbit initiation factor eIF2C and plant AGO1. Rabbit eIF2C, which encodes a protein translation initiation factor, was the only one of those genes to which a biochemical function had been assigned. Therefore, we referred to the human gene, represented by the cDNA clone Q99, as human EIF2C. Since we also had strong evidence that EIF2C is encoded by a multigene family at least in humans (see below) and C. elegans (more than 10 members are present in the C. elegans genome databases), and the gene represented by Q99 was only the first one to be described here, we referred to it as human EIF2C1.

The cDNA sequence (Fig. 1) of the human EIF2C1 gene was assembled using a combinatorial approach. A human fetal kidney-derived cDNA library was screened using clone Q99 as the initial probe, and clones with additional 5′ end sequence information were isolated. In the course of these experiments, we also accidentally isolated a cross-hybridizing cDNA clone that was found to represent a gene very similar (85% identical at the amino acid level) to human EIF2C1. We referred to it as human EIF2C2 (Fig. 5). However, the human EIF2C2 gene was not studied in further detail. To obtain genomic sequence information about the EIF2C1 gene, a P1 genomic clone (P1-Q99) that contained the EIF2C1 genomic locus was isolated. P1-Q99 was subcloned into several overlapping fragments that were partly sequenced, and exon-containing fragments were identified through their strong sequence similarity to C. elegans F48F7.1. Predicted exons were then connected by RT-PCR. Finally, RACE was applied to isolate 5′ untranslated sequences. Subsequently, the entire cDNA sequence was independently confirmed at the genomic and cDNA levels. The complete cDNA sequence of human EIF2C1 and the partial cDNA sequence of human EIF2C2 are available from GenBank under Accession Nos. AF093097 and AF121255, respectively.

Structure of the Human EIF2C1 Gene

The nucleotide and deduced amino acid sequence of the transcribed region of the EIF2C1 gene is shown in Fig. 1, and a schematic representation of the gene structure is given in Fig. 2. The EIF2C1 gene consists of 19 exons spanning a region of about 50 kb. Exon 1 contains the start methionine preceded by an in-frame stop codon and lies within a highly GC-rich region. 5′ RACE experiments failed to amplify any sequences extending further upstream so that the transcriptional start site could not be precisely determined. Exon 1 is followed by 17 relatively small exons, ranging from 88 to 235 bp in size, and an extremely large last exon 19, which is 4799 bp long and is almost exclusively noncoding. All exon–intron boundaries conform to the GT/AG rule (Breathnach and Chambon, 1981). The cDNA sequence is 7478 bp long and defines a single open reading frame of 2571 bp flanked by 238 bp of 5′ sequence and a rather extended 3′ untranslated region of 4694 bp. From primer extension experiments, we could deduce that about 500 bp are still missing at the extreme 5′ end of our sequence so that the full-length cDNA can be estimated to be 8.0 kb in size (data not shown). Characteristic features of the cDNA sequence include (see also Fig. 1) an extremely GC-rich (60–80%) 5′ end, a start methionine that conforms closely to the Kozak initiation site (Kozak, 1989), two confirmed alternative polyadenylation sites that are both of noncanonical sequence (cagaaa at position 4115–4120 and tataaa at 7449–7454), a homotypic stretch of 20 thymidine residues (position 4464–4483), a repeat of eight CCTG quadruplets (position 5779–5810), and a stretch of seven consecutive CAG triplets that are located immediately upstream from the last polyadenylation site. The encoded protein of 857 amino acids is rich in arginine, lysine, and histidine, which render it extremely basic (pI = 10.1). A search for protein motifs covered by the PROSITE database revealed the presence of a multitude of putative phosphorylation, amidation, myristylation, and glycosylation sites (not shown), which suggests that the EIF2C1 protein is subject to extensive posttranslational modification.

Chromosomal Localization

For chromosomal localization, biotin-labeled DNA from the P1 clone was hybridized to normal human metaphase chromosome spreads and detected via Cy3. Hybridization signals were detected on human chromosome 1 in the region 1p34–p35. Examples are shown in Figs. 3a and 3b. The fluorescent signals were highly specific. More than 90% of the target sequences were labeled, and no additional fluorescence signals were observed on other chromosomes.

FIG. 4. Northern blot analyses of human EIF2C1 in 16 adult and 4 fetal tissues. Each lane contained approximately 2 µg of poly(A)+ RNA. Hybridization was carried out with a partial cDNA clone that contained 1 kb of coding sequence and 0.8 kb of 3′ untranslated sequence. Exposure to an X-ray film lasted for 4 days in the case of the two blots containing RNA derived from adult tissues. The blot containing fetal RNA was exposed for 2 days.
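The sequence-feature coordinates quoted in the description of the cDNA can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it verifies that each quoted coordinate range has the length implied by the feature, that the 2571-bp open reading frame corresponds to 857 codons (stop codon not counted), and it shows one simple way such tandem repeats could be located in a sequence. The function names and the short test string are hypothetical; the real cDNA sequence (GenBank AF093097) is not reproduced here.

```python
import re

# 1-based, inclusive cDNA coordinates and implied lengths quoted in the text.
FEATURES = {
    "polyadenylation signal cagaaa": (4115, 4120, 6),
    "polyadenylation signal tataaa": (7449, 7454, 6),
    "stretch of 20 thymidines": (4464, 4483, 20),
    "eight CCTG quadruplets": (5779, 5810, 8 * 4),
}
ORF_BP = 2571      # open reading frame length quoted in the text
PROTEIN_AA = 857   # length of the encoded protein quoted in the text


def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def inconsistent_features(features):
    """Return names whose coordinate span differs from the implied length."""
    return [
        name
        for name, (start, end, expected) in features.items()
        if span_length(start, end) != expected
    ]


def find_tandem_repeats(seq, unit, min_copies):
    """Return (start, end, copies) for runs of `unit`, 1-based inclusive."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    hits = []
    for match in pattern.finditer(seq):
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start() + 1, match.end(), copies))
    return hits


print("inconsistent features:", inconsistent_features(FEATURES))
print("ORF matches protein length:", ORF_BP == 3 * PROTEIN_AA)

# Hypothetical toy sequence, used only to demonstrate the repeat search.
TOY = "AAA" + "CCTG" * 3 + "TT"
print(find_tandem_repeats(TOY, "CCTG", 3))
```

With the real sequence loaded as a string, the same `find_tandem_repeats` call could be pointed at the CCTG, CAG, and T repeats described in the text.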
FIG. 5. Multiple amino acid sequence alignment of human EIF2C1 (GenBank Accession No. AF093097), human EIF2C2 (AF121255), rabbit eIF2C (AF005355), C. elegans F48F7.1 (Z69661), and A. thaliana AGO1 (U91995). Gaps were introduced to optimize the alignment. The most highly conserved residues are indicated by dark boxes; less conserved residues are shaded in gray. The amino-terminal parts of the C. elegans and A. thaliana sequences show no similarity to the other sequences and are therefore not shown.
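Percent-identity figures such as those quoted for the aligned sequences depend on how gapped columns are counted. As a rough illustration, the sketch below computes pairwise identity from two rows of an alignment, counting only columns in which neither sequence has a gap. That convention is an assumption made here; the paper does not state which denominator was used. The two short sequences are hypothetical and are not taken from Fig. 5.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over alignment columns without gaps.

    Both inputs must be rows of the same alignment and therefore have
    equal length. Columns where either row has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped column: not counted in the denominator
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment rows, for demonstration only.
ROW_A = "MKV-LTGA"
ROW_B = "MRVQLT-A"
print(f"{percent_identity(ROW_A, ROW_B):.1f}% identity")
```

Using the full alignment length or the shorter sequence length as the denominator instead would give lower values, which is one reason identity figures from different reports are not always directly comparable.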
TABLE 1
Exon–Intron Boundaries of the Human EIF2C1 Gene

Exon  Size      Donor                                   Intron sequence               Intron size  Acceptor
1     >238 bp   GGA Gly 7; GCA Ala 8; G                 gtaagggtcc . . . tgtcttgtag   4962 bp      CT Ala 9; GCG Ala 10; GGC Gly 11
2     184 bp    GTC Val 68; AAC Asn 69; CG              gtaagtgatg . . . tcccccacag   ~3.4 kb      G Arg 70; GAA Glu 71; GTG Val 72
3     121 bp    AAC Asn 108; GAA Glu 109; CGG Arg 110   gtaaggttgg . . . cctcctgaag   420 bp       GTC Val 111; GAC Asp 112; TTT Phe 113
4     173 bp    TCC Ser 169; ATG Met 170; AG            gtgagtgggg . . . gtccccacag   226 bp       G Arg 171; TAC Tyr 172; ACC Thr 173
5     136 bp    ATT Ile 215; GAT Asp 216; G             gtgaggaccc . . . tgtatctcag   143 bp       TC Val 217; TCA Ser 218; GCC Ala 219
6     235 bp    ATC Ile 260; AAG Lys 261; G             gtaagttagc . . . ttctgaacag   718 bp       GC Gly 262; CTG Leu 263; AAG Lys 264
7     88 bp     CAT His 289; CAG Gln 290; AC            gtgagattgc . . . gtttcctcag   ?            A Thr 291; TTC Phe 292; CCC Pro 293
8     148 bp    CCC Pro 338; CTA Leu 339; GAG Glu 340   gtattgggtg . . . tccctgccag   415 bp       GTC Val 341; TGT Cys 342; AAC Asn 343
9     120 bp    AGT Ser 378; CGC Arg 379; CTG Leu 380   gtcagtgggc . . . tgtcctgcag   ?            ATG Met 381; AAG Lys 382; AAT Asn 383
10    123 bp    GGC Gly 419; GGC Gly 420; CGG Arg 421   gtgagcaggg . . . ctatccccag   133 bp       AAC Asn 422; CGG Arg 423; GCC Ala 424
11    134 bp    GTG Val 464; CTC Leu 465; AA            gtaaggaggg . . . gtgtgtacag   ?            G Lys 466; AAG Asn 467; TTC Phe 468
12    185 bp    GTG Val 526; TAT Tyr 527; G             gtacagttct . . . ctgcccttag   ~10.0 kb     CT Ala 528; GAG Glu 529; GTG Val 530
13    160 bp    CAC His 579; CAG Gln 580; CG            gtatgaactc . . . ccttgctcag   243 bp       C Arg 581; TCT Ser 582; GCC Ala 583
14    91 bp     ATC Ile 609; ACA Thr 610; GCA Ala 611   gtgagtgata . . . tgactgtcag   1080 bp      GTG Val 612; GTA Val 613; GGC Gly 614
15    195 bp    CTA Leu 674; CCC Pro 675; CAG Gln 676   gtagggccca . . . ttgtgcctag   ~1.6 kb      ATA Ile 677; CTD Leu 678; CAT His 679
16    135 bp    AAT Asn 719; GAG Glu 720; CGA Arg 721   gtgagtgagg . . . ctctatacag   ~0.6 kb      ATT Ile 722; GGG Gly 723; AAG Lys 724
17    102 bp    GGC Gly 753; ATC Ile 754; CAG Gln 755   gtagctgggc . . . atcttcccag   ~1.0 kb      GGC Gly 756; ACC Thr 757; AGC Ser 758
18    200 bp    CAT His 820; GAC Asp 821; AG            gtgaggcctg . . . tccttttcag   245 bp       T Ser 822; GGA Gly 823; GAG Glu 824
19    4799 bp   TTC Phe 856; GCT Ala 857; TGA Stop
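The statement that all exon–intron boundaries conform to the GT/AG rule can be checked directly against the intron ends listed in Table 1. The following minimal sketch, which is not part of the original analysis, tests each of the 18 introns for a 5′ gt and a 3′ ag, and also rejoins the codons that are split by an intron to confirm that they encode the residue given in the table. The intron and codon data are transcribed from Table 1 in row order; the function names are illustrative, and the codon lookup is the standard genetic code.

```python
# (intron number, first 10 nt, last 10 nt), transcribed from Table 1.
INTRONS = [
    (1, "gtaagggtcc", "tgtcttgtag"), (2, "gtaagtgatg", "tcccccacag"),
    (3, "gtaaggttgg", "cctcctgaag"), (4, "gtgagtgggg", "gtccccacag"),
    (5, "gtgaggaccc", "tgtatctcag"), (6, "gtaagttagc", "ttctgaacag"),
    (7, "gtgagattgc", "gtttcctcag"), (8, "gtattgggtg", "tccctgccag"),
    (9, "gtcagtgggc", "tgtcctgcag"), (10, "gtgagcaggg", "ctatccccag"),
    (11, "gtaaggaggg", "gtgtgtacag"), (12, "gtacagttct", "ctgcccttag"),
    (13, "gtatgaactc", "ccttgctcag"), (14, "gtgagtgata", "tgactgtcag"),
    (15, "gtagggccca", "ttgtgcctag"), (16, "gtgagtgagg", "ctctatacag"),
    (17, "gtagctgggc", "atcttcccag"), (18, "gtgaggcctg", "tccttttcag"),
]

# (intron number, donor part, acceptor part, residue listed in Table 1)
# for the codons that are interrupted by an intron.
SPLIT_CODONS = [
    (1, "G", "CT", "Ala"), (2, "CG", "G", "Arg"), (4, "AG", "G", "Arg"),
    (5, "G", "TC", "Val"), (6, "G", "GC", "Gly"), (7, "AC", "A", "Thr"),
    (11, "AA", "G", "Lys"), (12, "G", "CT", "Ala"), (13, "CG", "C", "Arg"),
    (18, "AG", "T", "Ser"),
]

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
_CODONS = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}
_THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Stop",
}


def follows_gt_ag(intron_start, intron_end):
    """True if the intron begins with gt and ends with ag."""
    return intron_start.lower().startswith("gt") and intron_end.lower().endswith("ag")


def translate_codon(codon):
    """Three-letter residue name for one DNA codon (standard code)."""
    return _THREE_LETTER[_CODONS[codon.upper()]]


violations = [n for n, start, end in INTRONS if not follows_gt_ag(start, end)]
print("introns violating GT/AG:", violations)

for number, donor, acceptor, listed in SPLIT_CODONS:
    rejoined = donor + acceptor
    print(f"intron {number}: {donor}|{acceptor} -> {rejoined} "
          f"{translate_codon(rejoined)} (table: {listed})")
```

Introns 3, 8 to 10, and 14 to 17 fall between complete codons and therefore do not appear in `SPLIT_CODONS`.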
mRNA Expression Pattern of EIF2C1 in Normal Human Adult and Fetal Tissues

Northern blot analysis was performed to analyze the distribution of EIF2C1-specific transcripts among different adult and fetal human tissues. Hybridization was carried out using a partial cDNA clone of EIF2C1 as the probe, which spanned 1 kb of the carboxy-terminal coding sequence and 0.8 kb of the 3′ untranslated cDNA sequence. Low-abundance but ubiquitous expression of multiple transcripts was detected (Fig. 4). A major band appeared at approximately 8.0 kb, a second major band appeared at about 11–12 kb, and several additional bands that ranged from about 4.4 kb to 7 kb in size were also visible. All transcripts displayed only slight variations in their abundance among adult tissues, which suggests a housekeeping-like function of the EIF2C1 gene. However, the expression of the EIF2C1 gene in fetal tissues was not as uniform. Fetal lung and fetal kidney exhibited a level of expression that was five times higher than that in fetal liver and brain. The complex pattern of different mRNA species recognized by the EIF2C1 cDNA is due to differential polyadenylation (see Fig. 1), extensive splicing of the EIF2C1 mRNA (data not shown), and the concomitant expression of genes that are highly similar to the EIF2C1 gene (see below). The results from Northern blot experiments and the estimated length of the full-length EIF2C1 cDNA sequence are consistent with the 8.0-kb message representing the full-length EIF2C1 transcript. In later experiments, the 11- to 12-kb message was found to be the result of cross-hybridization of EIF2C1 to human EIF2C2, which is about 75% identical at the nucleotide level within the coding region (see also Fig. 5).

DISCUSSION
We report on the isolation and structure of the human translation initiation factor EIF2C1 gene. This gene corresponds to a previously isolated sequence Q99, which was found to be differentially expressed in Wilms tumors, depending on the presence or absence of mutations within the WT1 gene. The sequence of the human EIF2C1 cDNA was deduced from a series of cDNA clones and from a genomic P1 clone. The human EIF2C1 gene resides at chromosome 1p34 –p35 and consists of 19 exons and 18 introns that span a genomic region of about 50 kb. The full-length EIF2C1 transcript is about 8 kb and is expressed ubiquitously at low to medium levels of abundance. The coding region of the EIF2C1 gene contains no known protein motifs, and despite the fact that no specific data about the biochemical function of the encoded protein have been determined, the very high sequence similarity to rabbit eIF2C (see Fig. 5) provides indirect evidence about the biological function of human EIF2C1 as a possible protein translation initiation cofactor. This conclusion is supported by several additional and independent observations. First, the ubiquitous and modest level of human EIF2C1 expression agrees well with its putative role as a general and nonstructural protein. Second, human EIF2C1 is a highly basic protein (pI 10.1); this makes it a potentially good binding partner for ribonucleic acids. Third, there are related genes described at the nucleotide level in C. elegans that possess a so-called RGG box, a stretch of repeating arginine, glycine, and tyrosine residues (data not shown). RGG boxes are typically found as accessory RNA-binding domains in a wide number
of bona fide RNA-binding proteins such as hnRNP U, FMR1, and EWS (for review see Burd and Dreyfuss, 1994). Fourth, as one would predict for proteins that compose the general machinery of eukaryotic protein translation, human EIF2C1 is highly conserved during evolution and shares striking sequence similarity with proteins from such divergent species as A. thaliana and C. elegans (Fig. 5).

AGO1, the putative plant homologue of EIF2C1, was recently isolated from A. thaliana by a functional cloning approach. Plants homozygous for an inactivated AGO1 allele are severely defective in their development. They form only rudimentary leaves and infertile flowers, whereas heterozygotes are apparently normal. Two different plant lines defective for the AGO1 gene were isolated independently; this suggests that defects in the AGO1 gene were directly responsible for the observed phenotype. Furthermore, sense expression of AGO1 restored a normal phenotype in AGO1 −/− homozygous knockouts, whereas antisense expression of AGO1 was able to impose the AGO1 −/− phenotype on heterozygous plants. These data suggest a major role for the AGO1 gene in plant development. But how do the developmental defects observed in plants lacking functional AGO1 protein fit in with the presumed biochemical role of AGO1 as a general protein translation cofactor? Furthermore, is EIF2C1-like activity essential for proper development in animals? At least in humans (e.g., human EIF2C2; additional genes inferred from EST sequences) and in C. elegans (several genes sequenced in the course of the C. elegans sequencing project), good evidence exists of related, hypermorphic genes that upon mutational inactivation could possibly functionally complement one another. In contrast, AGO1 seems to be a true single-copy gene in plants, and its inactivation could therefore indeed negatively affect general plant architecture.
We studied the structure of the human EIF2C1 gene because of our interest in Wilms tumors, a childhood renal tumor. EIF2C1 expression was found to be elevated in Wilms tumors that lack functional copies of the Wilms tumor suppressor gene WT1 (not shown). Then, the human EIF2C1 gene was mapped to human chromosome 1p35, a genomic region that is commonly lost in Wilms tumors (Steenman et al., 1997; Grundy et al., 1994). Furthermore, Wilms tumors are caused by molecular defects in embryonic kidney development that render primitive metanephrogenic precursor cells defective in their capacity to differentiate into normal kidney cells. This may correlate with the observed disturbance of growth in AGO1 knockout plants, despite the fact that in human there exist closely related genes that may display some functional redundancy. Taken together, these findings could make human EIF2C1 an interesting candidate gene for potential involvement in Wilms tumorigenesis. However, we have so far been unable to find any evidence of EIF2C1 gene mutations in Wilms tumors (data not shown). Since our mutational analysis has not yet covered the entire EIF2C1
coding region, we cannot completely rule out the occurrence of EIF2C1 gene mutations in Wilms tumors.

ACKNOWLEDGMENTS

We thank Malek Ajmo and Dieter Zimmermann for their reliable sequencing services. The excellent support provided by Reinhold Schäfer, the former Abteilung für Krebsforschung, and Ruediger Ridder is also gratefully acknowledged. This study was financially supported by the Krebsliga des Kantons Zürich, Fonds für medizinische Forschung der Universität Zürich, and Ciba-Geigy Jubiläumsstiftung.
REFERENCES

Ahmad, M. F., Nasrin, N., Banerjee, A. C., and Gupta, N. K. (1985). Purification and properties of eukaryotic initiation factor 2 and its ancillary protein factor (Co-eIF-2A) from yeast Saccharomyces cerevisiae. J. Biol. Chem. 260: 6955–6959.
Bohmert, K., Camus, I., Bellini, C., Bouchez, D., Caboche, M., and Benning, C. (1998). AGO1 defines a novel locus of Arabidopsis controlling leaf development. EMBO J. 17: 170–180.
Breathnach, R., and Chambon, P. (1981). Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50: 349–383.
Burd, C. G., and Dreyfuss, G. (1994). Conserved structures and diversity of functions of RNA-binding proteins. Science 265: 615–621.
Dasgupta, A., Das, A., Roy, R., Ralston, R., Majumdar, A., and Gupta, N. K. (1978). Protein synthesis in rabbit reticulocytes XXI: Purification and properties of a protein factor (Co-eIF-1) which stimulates Met-tRNA binding to EIF-1. J. Biol. Chem. 253: 6054–6059.
Grundy, P. E., Telzerow, P. E., Breslow, N., Moksness, J., Huff, V., and Paterson, M. C. (1994). Loss of heterozygosity for chromosomes 16q and 1p in Wilms’ tumors predicts an adverse outcome. Cancer Res. 54: 2331–2333.
Kozak, M. (1989). The scanning model for translation: An update. J. Cell Biol. 108: 229–241.
Lichter, P., Tang, C. C., Call, K., Hermanson, G., Evans, G. A., Housman, D., and Ward, D. C. (1990). High resolution mapping of human chromosome 11 by in situ hybridization with cosmid clones. Science 247: 64–69.
Osterhout, J. J., Lax, S. R., and Ravel, J. M. (1983). Factors from wheat germ that enhance the activity of eukaryotic initiation factor eIF-2. Isolation and characterization of Co-eIF-2alpha. J. Biol. Chem. 258: 8285–8289.
Pain, V. M. (1996). Initiation of protein synthesis in eukaryotic cells. Eur. J. Biochem. 236: 747–771.
Seal, S. N., Schmidt, A., and Marcus, A. (1983). Wheat germ eIF2 and CoeIF2: Resolution and functional characterization in in vitro protein synthesis. J. Biol. Chem. 258: 10573–10576.
Steenman, M., Redeker, B., de Meulemeester, M., Wiesmeijer, K., Voute, P. A., Westerveld, A., Slater, R., and Mannens, M. (1997). Comparative genomic hybridization analysis of Wilms tumors. Cytogenet. Cell Genet. 77: 296–303.
Weith, A., Brodeur, G. M., Bruns, G. A., Matise, T. C., Mischke, D., Nizetic, D., Seldin, M. F., van Roy, N., and Vance, J. (1996). Report of the Second International Workshop on Human Chromosome 1 Mapping. Cytogenet. Cell Genet. 72: 114–144.
Woodley, C. L., Roychowdhury, M., MacRae, T. H., Olsen, K. W., and Wahba, A. J. (1981). Protein synthesis in brine shrimp embryos: Regulation of the formation of the ternary complex (Met-tRNAf · eIF-2 · GTP) by two purified protein factors and phosphorylation of Artemia eIF-2. Eur. J. Biochem. 117: 543–551.
Zou, C., Zhang, Z., Wu, S., and Osterman, J. C. (1998). Molecular cloning and characterization of a rabbit eIF2C protein. Gene 211: 187–194.