Gene 338 (2004) 257 – 265 www.elsevier.com/locate/gene
Human gene MOB: structure specification and aspects of transcriptional activity Irina P. Vladychenskaya *, Lyudmila V. Dergunova, Veronica G. Dmitrieva, Svetlana A. Limborska Department of Human Molecular Genetics, Institute of Molecular Genetics RAS, Kurchatov sq., 2, 123182 Moscow, Russia Received 27 October 2003; received in revised form 15 April 2004; accepted 1 June 2004 Available online 21 July 2004 Received by V. Larionov
Abstract

Prior investigation of human brain cDNA libraries revealed an evolutionarily conserved gene, MOB, that has been cloned in silico on chromosome 10. To elucidate its biological role, we performed structural and functional analysis of its transcripts. Applying an expressed sequence tag (EST) approach, we specified the sequence of the predicted MOB transcript and found that another four exons belong to the 5′ end of the MOB gene; the newly constructed MOB transcript was detected in vitro. Here, we report that MOB comprises at least 11 exons and 10 introns and spans more than 320 kb of genomic sequence. We propose complex regulation of MOB gene activity at the transcriptional level, based on its expression pattern. Thus, in the human cerebellum, we discovered multiple alternatively spliced products of MOB differing in their coding portion; one of the alternative transcripts was demonstrated to lack the longest coding exon, VII. MOB was expressed at very low levels in a wide spectrum of human tissues, most abundantly in the brain and in the kidney. Two transcription initiation sites were found for MOB, and two alternative promoters were suggested to govern its expression. We believe that MOB activity is also regulated at the posttranscriptional level. In the constructed MOB transcript, the extended multiexon 5′-untranslated region (UTR) together with the weak context of the translation start ATG codon are considered potent inhibitors of translation. Modulation of MOB translation efficiency is proposed based on the appropriate alternative splicing events within the 5′-UTR. The MOB 3′-UTR is anticipated to mediate message instability. We thus suggest that this MOB transcript may be a labile short-lived molecule with strong regulation of its translational efficiency. We believe that MOB gene activity is controlled at least at the transcriptional and the posttranscriptional levels, strictly regulating the amount of the encoded protein product. © 2004 Elsevier B.V. All rights reserved.

Keywords: Gene of unknown function; Expression; 5′- and 3′-untranslated regions; Alternative transcript; Promoter; In silico analysis
Abbreviations: cDNA, DNA complementary to RNA; mRNA, messenger RNA; EST, expressed sequence tag; Hmob, human medulla oblongata; A, adenosine; T, thymidine; G, guanosine; C, cytidine; ORF, open reading frame; uORF, upstream open reading frame; uATG, upstream ATG; β2MG, β2-microglobulin gene; UTR, untranslated region; TM, transmembrane domain; SAM, sterile alpha motif; RT-PCR, reverse transcriptase-polymerase chain reaction; ARE, adenylate/uridylate-rich element.
* Corresponding author. Tel.: +7-095-196-1858; fax: +7-095-196-0221. E-mail address: [email protected] (I.P. Vladychenskaya).
0378-1119/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2004.06.003

1. Introduction

The massive increase in the amount of DNA sequence information, together with the development of bioinformatics to exploit its use, has remarkably changed modern biological and biomedical research. Among others, the problem of identification of novel human genes has undergone a significant transition. The earliest studies in molecular genetics identified genes through their peptide products; then messenger RNA (mRNA) became the primary target for investigation. A current powerful strategy of gene discovery named ‘in silico cloning’ relies upon the idea that, in theory, all human genes are now represented in nucleotide sequence databases as genomic and expressed sequences. The subjects of in silico analysis are the clusters of the expressed sequences (cDNAs and/or ESTs) representing the transcripts of the genes; the main tool is computational sequence similarity search. Hybridizing the sequence of interest (‘virtual probe’) with the total accumulated expressed sequences, one hopes to reveal a
I.P. Vladychenskaya et al. / Gene 338 (2004) 257–265
distinct cluster or clusters of anonymous overlapping ESTs/cDNAs related to the virtual probe and then to reconstruct the transcript of the corresponding novel gene or genes. Such a strategy makes it possible to clone in silico novel homologues of already known genes, novel genes located in certain chromosomal regions, novel genes derived from anonymous expressed sequences, etc.

Working on the identification of human brain-specific genes, we screened cDNA libraries of various human brain structures and selected several clones. One of these was a 1.5-kb Hmob33 clone (GenBank accession no. Y14155) from medulla oblongata (Buiakova et al., 1992; Dergunova et al., 1998). Initial in silico analysis did not reveal any significant similarity between the human medulla oblongata (Hmob) clone Hmob33 and previously described genes, but showed it to be part of the 3′-untranslated region (UTR) of an unknown gene. A BLAST search of the human expressed sequence databases identified a set of anonymous sequences overlapping with Hmob33 and extending it in the 5′ direction; a 3.2-kb contig, HMOB33, was constructed using these data. This contig contains an open reading frame (ORF) flanked by untranslated regions and is assumed to represent one of the possible transcription products of an unknown gene named MOB. Seven exons and six introns spanning about 155.5 kb of human chromosome 10 were predicted to constitute the hypothetical MOB gene (Vladychenskaya et al., 2002).

Here, we refine our previous in silico results for the MOB gene structure and present new data on the structure and functional aspects of MOB expression. These data bring us closer to elucidating the biological role of this gene. Note that as we have used cDNA rather than mRNA sequences in our investigation, henceforth U will be referred to as T (AUG codons as ATG, UAA as TAA, etc.). For the same reason, the term ‘transcript’ is used only conditionally with reference to the cDNA-based contig.
2. Materials and methods

2.1. Bioinformatics

Verification of the 5′ end of the hypothetical MOB transcript was accomplished by BLASTn search against the human EST database (http://www.ncbi.nlm.nih.gov/blast). Initial analysis of the genomic structure of the extended MOB transcript was also carried out with BLAST software using the Human Genomic BLAST Pages division (http://www.ncbi.nlm.nih.gov/genome/seq/page.cgi?F=HsBlast.html&&ORG=Hs). Possible ORFs were revealed using the DNASIS software package. The 5′-flanking sequence of MOB exon I was analysed for predicted CpG islands using GrailEXP software available at http://compbio.ornl.gov. For recognition of the putative MOB gene promoter within the same region, we used GENOMATIX software (http://www.genomatix.de).
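The ORF search mentioned above was done with a commercial package whose algorithm is not described here. Purely as an illustration of the general idea, the following minimal sketch scans the three forward reading frames of a cDNA sequence for ATG-to-stop stretches. The function name, the `min_codons` parameter and the toy sequence are assumptions made for this example and are not taken from DNASIS or from the MOB data.

```python
# Generic forward-strand ORF scan (illustrative; not the DNASIS algorithm).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples with 1-based inclusive coordinates.

    `end` includes the stop codon; `frame` is 1, 2 or 3.
    Only the first ATG of each open stretch is reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # coding codons, stop excluded
                if n_codons >= min_codons:
                    orfs.append((start + 1, i + 3, frame + 1))
                start = None
    return orfs

# Hypothetical toy sequence: ATG AAA TTT TAA embedded in frame 3.
toy = "CCATGAAATTTTAAGG"
print(find_orfs(toy))
```

Because the text uses cDNA notation, the sketch works on T rather than U and on the sense strand only.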
2.2. RNA isolation and treatment with DNAse I

Total RNA was isolated from human brain and other tissues using guanidine thiocyanate (Chomczynski and Sacchi, 1987). RNA integrity was assessed by analysing the ratio between rRNA bands after agarose gel electrophoresis under denaturing conditions. RNA samples were stored at −70 °C under ethanol. To remove residual genomic DNA from the total RNA samples, we treated each sample (100 µg) with DNAse I (MBI Fermentas, Lithuania) in accordance with the supplier’s recommendations; RNA was then extracted with a 1:1 phenol–chloroform mix and precipitated with sodium acetate (3.0 M, pH 5.2).

2.3. RT-PCR

DNAse I-treated total RNA samples from human cerebellum, forebrain cortex, hippocampus, spleen, lymph node, liver, kidney and lung were taken for cDNA synthesis. This was carried out using a RevertAid™ First Strand cDNA Synthesis Kit (MBI Fermentas) in accordance with the supplier’s recommendations. First strand cDNA was used as a template for PCR with gene-specific primers. All amplifications were performed in a Tercyc MC2 thermocycler (DNA Technology, Russia). Oligonucleotide primers were designed using OLIGO software; their sequences and nucleotide coordinates according to MOB mRNA (GenBank accession no. BN000143) and to human β2-microglobulin gene (β2MG) mRNA (GenBank accession no. NM_004048) are summarized in Table 2. Aliquots of the PCR products were size-fractionated on silver-stained 6% nondenaturing polyacrylamide gels in Tris–Borate–EDTA buffer (Budowle et al., 1991).

For detection of MOB transcripts, we analysed human cerebellum RNA. A 5-µg aliquot of total RNA was taken for reverse transcription. First strand cDNA was used in amplifications combining the same upstream primer F1 with different downstream primers R1, R2, R3, R5, R6, R7 and R8 (Table 2).
PCR was performed in 20 µl of the reaction mix described previously (Dergunova et al., 2003) containing 1 µl of reverse transcription reaction mix. PCR parameters were as follows: 4 min at 94 °C followed by 30 cycles of 94 °C for 1 min, 65 °C for 1 min and 72 °C for 1 min, followed by a 10-min final elongation at 72 °C.

Expression patterns of the major MOB transcript and a transcript lacking exon VII were analysed in human cerebellum, forebrain cortex, hippocampus, spleen, lymph node, liver, kidney and lung. For that, 10 µg of each total RNA sample was subjected to reverse transcription. All resulting cDNA samples were used in two sets of amplifications: one with primer pair F6/R4 to analyse the expression of the major transcript and one with primer pair F5/R5 to examine the expression of both alternative and major transcripts. For analysis of all the cDNAs with primer pair F6 and R4, we used 25 cycles and 0.2 µl of the RT mix/cDNA template; for the primer pair F5 and R5, we used 30 cycles and 1 µl of the RT mix. Under these conditions, the yield of the amplification products of the gene under study and the control gene was found to be within the exponential growth phase of PCR amplification.

Amplification with primers F6/R4 was performed in 20 µl of the reaction mix (see above) containing 0.2 µl of the reverse transcription mix, 10 pM of each gene-specific primer and 0.5 pM of the primers specific to β2MG. Cycling parameters were as follows: 4 min at 94 °C followed by 25 cycles of 94 °C for 1 min, 65 °C for 1 min and 72 °C for 1 min, followed by a 10-min final elongation at 72 °C. Amplification with F5/R5 was performed in 20 µl of the same reaction mix containing 1 µl of the reverse transcription mix, 5 pM of each gene-specific primer and 0.5 pM of the primers specific to β2MG. Cycling parameters were as follows: 3 min at 94 °C followed by 30 cycles of 94 °C for 30 s, 65 °C for 30 s and 72 °C for 30 s, followed by a 10-min final elongation at 72 °C.

Expression activity of MOB transcripts was estimated in three independent PCR experiments for each tissue type; the amplification products were analysed in triplicate on separate polyacrylamide gels.

2.4. Southern blot hybridization

Reverse transcriptase-polymerase chain reaction (RT-PCR) products obtained with the primer pair F1/R8 (exons I–XI) were size-fractionated in 1% agarose gels in Tris–Acetate–EDTA buffer and then transferred onto Hybond N nylon membranes (Amersham, UK). Recombinant plasmid pUC 19 with an Hmob33 insert was labelled with [α-32P] dATP using a random-primed labelling kit (Amersham) and used as a probe. As a negative control and a molecular weight marker, we used λ phage DNA digested with PstI endonuclease. Hybridization was performed according to standard protocols (Sambrook et al., 1989).

2.5. Primer extension analysis

For the identification of MOB transcription start sites by primer extension, we used primers designed using OLIGO software. Two primers were complementary to MOB exon I, and one (R2) to exon VI. The assay was performed using the Primer Extension System-AMV Reverse Transcriptase kit (Promega, USA) according to the manufacturer’s protocol. The primers were end-labelled with [γ-32P] dATP; 10 fM of each primer was annealed to 5 µg of the human cerebellum poly (A)+ fraction in a 30-µl volume of the PIPES-formamide annealing buffer. The hybridization was carried out at 45 °C for 16 h. The radioactive hybrids were precipitated with ethanol and then transferred into the primer extension reaction. Half of the reaction volume was subjected to electrophoretic separation through an 8% denaturing polyacrylamide gel containing 7 M urea. [γ-32P] dATP-labelled φX174/HinfI was also run through the gel as a molecular weight marker.
2.6. Sequence analysis

Amplification products obtained from human cerebellar cDNA with primers F1/R7 and corresponding to the major and shortest silver-detectable transcripts were sequenced using an ABI 373A sequencer (Applied Biosystems, USA) with ABI Prism™ Dye Terminator Cycle Sequencing kits (Perkin Elmer, USA).
3. Results

3.1. In silico construction of the extended MOB transcript, and verification of the MOB exon–intron structure

To verify the ambiguous 5′ end of the previously constructed in silico MOB gene transcript, we used an additional round of BLASTn searches against human nr and EST databases. The 480-nucleotide 5′-end fragment of the hypothetical MOB transcript (HMOB33 contig) was used as a virtual probe. The search returned two overlapping EST clones (GenBank accession nos. BI752115 and BM764008, originating from human brain and myeloma cDNA libraries, respectively) extending the hypothetical MOB transcript in the 5′ direction by 531 nucleotides. These clones showed 99% identity with each other and 99% and 100% nucleotide identity with the MOB transcript over 200 and 98 nucleotides, respectively, and thus were attributed to this gene. These new EST data allowed us to reconstruct the updated contig (GenBank accession no. BN000143; Fig. 1A) representing the improved primary structure of the MOB transcript.

A BLAST comparison of BN000143 with the sequences from human genomic databases split the analysed sequence into 11 fragments representing the novel exon distribution pattern of the MOB gene (Fig. 1B). Previously, MOB was reported to consist of seven exons (Vladychenskaya et al., 2002), represented by exons V–XI of the newly reported gene structure. Exons I, II, III and IV emerged from the newly discovered part of the transcript. Exon–intron boundaries of the corresponding gene fragment were found to be flanked by canonical AG/GT dinucleotides. MOB is expected to contain at least 11 exons and 10 introns and to span about 320 kb of genomic sequence. The exons range from 44 bp (exon IV) to 1718 bp (exon XI), the introns from 0.7 to 89.4 kb (Table 1). The coding region of the gene is predicted to span exons VII–XI (Fig. 1B).

Fig. 1. MOB gene in silico cloning. (A) Assembly of the contig (GenBank accession no. BN000143) representing the hypothetical MOB transcript. Nucleotide positions of the cDNA overlapping regions are indicated. (B) Predicted genomic organization of human MOB gene. Exons are shown as numbered rectangles; black regions represent the coding part and white regions represent the UTRs. Shadowed regions represent upstream ORFs. Arrows mark the polyadenylation sites and ovals indicate possible AREs. Introns are shown as interrupted lines. The approximate size of the MOB gene is indicated on the bottom. (C) Structural organization of the hypothetical MOB transcript. The black box represents the coding region; white boxes represent 5′- and 3′-untranslated regions; shadowed regions represent upstream ORFs. Arrows mark the polyadenylation sites and ovals indicate possible AREs. The size of the transcript is indicated on the bottom.

3.2. Analysis of the constructed MOB transcript

Fig. 1C shows the structural organization of the newly constructed MOB transcript. It is 3734 nucleotides long and contains a hypothetical coding region of 1239 nucleotides flanked by extended 5′- and 3′-UTRs. The 5′-UTR of the transcript is 954 nucleotides long; it comprises exons I–VI and part of exon VII (Fig. 1B). Three potential upstream ORFs (uORFs), all in-frame with the main ORF, were found within the 5′-UTR sequence (Fig. 1C). According to the MOB genomic structure, these uORFs are located within exons I, V and VI (Fig. 1B); the corresponding upstream ATG (uATG) start codons are located at nucleotide positions 13–15 (uATGI), 523–525 (uATGV) and 646–648 (uATGVI). The GC content of the whole 5′-UTR is 53%, whereas the GC content of exon I is 73%. The coding region of the transcript was predicted to start with a ‘weak’ ATG codon (the nucleotide context is AGTACAATGA) at nucleotide positions 955–957 and to stop with the TAA codon at positions 2194–2196. It codes for a putative transmembrane protein ‘Mob’ of 413 amino acids (Vladychenskaya et al., 2002). The 3′-UTR of the transcript is 1538 nucleotides long. It contains six canonical AATAAA polyadenylation signals.
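The region lengths quoted above follow directly from the start and stop codon coordinates, and the in-frame status of the three uATGs can be checked from their positions alone. The short sketch below reproduces this arithmetic; it assumes only the coordinates given in the text (1-based, inclusive), and the variable names are illustrative.

```python
# Coordinates as quoted in the text for contig BN000143 (1-based, inclusive).
TRANSCRIPT_LENGTH = 3734
MAIN_ATG = 955            # first nucleotide of the predicted start codon
STOP_CODON_START = 2194   # first nucleotide of the TAA stop codon
UATG_STARTS = {"uATGI": 13, "uATGV": 523, "uATGVI": 646}

utr5_len = MAIN_ATG - 1                                # everything upstream of the ATG
cds_len = STOP_CODON_START - MAIN_ATG                  # stop codon excluded
utr3_len = TRANSCRIPT_LENGTH - (STOP_CODON_START + 2)  # everything after the stop codon
protein_len = cds_len // 3

# An upstream ATG is in-frame with the main ORF when its distance
# to the main ATG is a multiple of three.
in_frame = {name: (MAIN_ATG - pos) % 3 == 0 for name, pos in UATG_STARTS.items()}

print(utr5_len, cds_len, protein_len, utr3_len)
print(in_frame)
```

Note that the 1239-nucleotide coding region in this calculation excludes the stop codon, which is what yields a 413-residue product.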
Human EST dataset analysis of the MOB 3′-UTR indicates functional activity of three polyadenylation sites located at nucleotide positions 2382–2387, 3597–3601 and 3714–3719 (Fig. 1B and C). The 3′-UTR is enriched in adenosine (A) and thymidine (T) residues and contains many poly-T and poly-A stretches. Six ATTTA motifs (nucleotide positions 2393–2397, 2440–2444, 2857–2861, 2897–2901, 2991–2995 and 3049–3053) are found within the 3′-UTR sequence (Fig. 1B and C).

3.3. Detection of the constructed MOB transcript in human tissues

To detect the newly predicted MOB transcript in human tissues, we reverse-transcribed total RNA isolated from human cerebellum; the resulting cDNA was used as a
Table 1
Exon–intron structure of the human MOB gene

Exon no.  Exon size (bp)  cDNA       Splice donor           Intron size (kb)  Splice acceptor
I         271             1–271      TCCAGTGCAA/gtgagtcgtg  6.1               attcttgcag/CGTGACGACA
II        95              272–366    CATTTGAAAG/gtaagtaaat  52.9              attattccag/GTAAAGCTTC
III       91              367–457    TAAAATACAG/gtaagtcaga  70.2              cttcttctag/ATTGGAAAGA
IV        44              458–500    CATGATTCAG/gtaaactgat  33.8              tctgtcccag/GCACACCATT
V         142             501–642    GCTGTTGCAG/gtatgtattg  27                cttatgacag/GTGATGGAAC
VI        81              643–723    TAAATAATCC/gtaagttaat  89.4              gctttcacag/AAGGAAGAAT
VII       854             724–1577   TAAAATACAA/gtaagtcaag  16.2              ctttttctag/GTCTATTATT
VIII      118             1578–1695  TTCTCCGAAG/gtaaactatt  16.8              tttctgccag/CTTTTCGGAG
IX        154             1696–1849  ATCAAAGAGT/gtaagtctaa  3.1               cccctttcag/ATTCCCCTCG
X         167             1850–2016  CAATCAGCAA/gtgagtttcc  0.7               gttatcctag/GTGCTAAAGG
XI        1718            2017–3734

Exon and intron sequences are shown in upper- and lower-case letters, respectively. Canonical AG/GT nucleotides are underlined.
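The entries of Table 1 can be cross-checked with simple arithmetic: each exon size should equal the length of its cDNA coordinate range, and the exon and intron sizes together should add up to roughly the stated gene span of about 320 kb. The sketch below is one way to do this, using the values as they appear in the table here; the data structure and names are illustrative.

```python
# Table 1 values as transcribed: exon -> (size in bp, cDNA start, cDNA end).
EXONS = {
    "I": (271, 1, 271), "II": (95, 272, 366), "III": (91, 367, 457),
    "IV": (44, 458, 500), "V": (142, 501, 642), "VI": (81, 643, 723),
    "VII": (854, 724, 1577), "VIII": (118, 1578, 1695),
    "IX": (154, 1696, 1849), "X": (167, 1850, 2016), "XI": (1718, 2017, 3734),
}
INTRONS_KB = [6.1, 52.9, 70.2, 33.8, 27.0, 89.4, 16.2, 16.8, 3.1, 0.7]

# Rows whose listed size differs from the length of the coordinate range.
mismatches = [name for name, (size, first, last) in EXONS.items()
              if last - first + 1 != size]

exon_total_bp = sum(size for size, _, _ in EXONS.values())
intron_total_kb = sum(INTRONS_KB)
gene_span_kb = intron_total_kb + exon_total_bp / 1000

print("rows with size/coordinate mismatch:", mismatches)
print("exons: %d bp, introns: %.1f kb, span: %.1f kb"
      % (exon_total_bp, intron_total_kb, gene_span_kb))
```

With the values as transcribed, only the exon IV row is flagged (listed as 44 bp, while coordinates 458–500 cover 43 nucleotides); the table alone does not show which of the two figures is intended. The summed span comes to just under 320 kb, in line with the estimate in Section 3.1.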
template for PCR. All the amplifications were carried out by combining the same upstream primer F1, located in MOB exon I, with different downstream primers R1, R2, R3, R5, R6, R7 and R8, corresponding to exons V, VI, VII, VIII, IX, X and XI, respectively (Table 2). The location of the primers on the MOB transcript and the expected lengths of the amplification fragments are represented schematically in Fig. 2A. The RT-PCR products obtained were of the expected size: F1/R1, 402 bp; F1/R2, 580 bp; F1/R3, 718 bp; F1/R5, 1497 bp; F1/R6, 1647 bp; F1/R7, 1771 bp and F1/R8, 2338 bp (Fig. 2B).

For the primer pairs F1/R5, F1/R6, F1/R7 and F1/R8, additional amplification fragments were detected. Bands of the expected size are referred to as major bands and the additional fragments as minor ones. The distribution pattern of the minor bands on the polyacrylamide gels and their locations relative to the major bands were roughly similar for all the corresponding primer pairs. In lanes 4–7 (Fig. 2B) one can distinguish a group of minor bands clustering tightly around the major fragment and a single minor band located significantly lower than the others. The difference in size between the expected major fragment and the shortest fragment was estimated at about 800 bp for each corresponding primer pair. Negative controls in which human genomic DNA was used as a template produced no amplification fragments (data not shown).

For the positive control, Southern blot hybridization of the RT-PCR products obtained with primers F1/R8 (exons I–XI; Fig. 2B) was carried out with P32-labelled recombinant plasmids bearing the 1.5-kb Hmob33 insert. Four distinct radioactive signals were detected (Fig. 2C). Two signals (one between 2.1 and 2.4 kb; one less than 1.7 kb) corresponded to the bands visualized on the polyacrylamide gel (Fig. 2B, lane 7). Two other signals (one between 1.09 and 1.16 kb; one less than 0.80 kb) had no visible counterparts on the polyacrylamide gels.
Table 2
Nucleotide sequences and location of the primers corresponding to human MOB and β2MG transcripts (GenBank accession nos. BN000143 and NM_004048, respectively)

Primer  Nucleotide sequence, 5′-3′       Location, nucleotides

MOB transcript
F1      GGC TGA CTG CTC TCC CCT C        126–144
F5      TAC TGG AAC AAT GGG AGA AGC      655–675
F6      GCC AAA CAA GTC TCT GCT CAT      823–843
R1      GGC ATC CCA AAG AAC TCA AT       508–527
R2      CTG GTC ATC TTG GCT TCC TAT      685–705
R3      ATG AGC AGA GAC TTG TTT GGC      823–843
R4      GGT GGG GAT GTC TAC GCC          1216–1233
R5      TAC AGC GTG CCA ACT ATG C        1604–1622
R6      GAG CCA GTG ATA GAC AAG CCA      1752–1772
R7      GAG AAG CCA GCA AAT CCA GTG      1876–1896
R8      ATG GTG GTT GCG GGT TAT GTA      2443–2463

β2MG transcript
F299    GAT GAG TAT GCC TGC CGT G        346–364
R593    CTA AGT TGC CAG CCC TCC T        640–658
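The expected RT-PCR product sizes reported in Section 3.3 can be derived from the primer coordinates in Table 2: the amplicon runs from the first nucleotide of the forward primer to the last nucleotide of the reverse primer on the transcript. The sketch below illustrates this for the F1 primer pairs and for F5/R5, and also checks that two primers sharing the same coordinates (F6 and R3) are reverse complements of each other. The helper names are illustrative; the coordinates and sequences are those of Table 2.

```python
# Primer coordinates on the MOB transcript (1-based, inclusive), from Table 2.
PRIMER_POS = {
    "F1": (126, 144), "F5": (655, 675),
    "R1": (508, 527), "R2": (685, 705), "R3": (823, 843),
    "R5": (1604, 1622), "R6": (1752, 1772),
    "R7": (1876, 1896), "R8": (2443, 2463),
}
# Product sizes reported in the text for F1 combined with each reverse primer.
REPORTED_BP = {"R1": 402, "R2": 580, "R3": 718, "R5": 1497,
               "R6": 1647, "R7": 1771, "R8": 2338}
EXON_VII_BP = 854  # exon VII size from Table 1

def amplicon_size(forward, reverse):
    """Length from the forward primer's first base to the reverse primer's last base."""
    return PRIMER_POS[reverse][1] - PRIMER_POS[forward][0] + 1

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

f1_sizes = {rev: amplicon_size("F1", rev) for rev in REPORTED_BP}
major_f5_r5 = amplicon_size("F5", "R5")
alt_f5_r5 = major_f5_r5 - EXON_VII_BP  # product expected if exon VII is skipped

F6 = "GCCAAACAAGTCTCTGCTCAT"
R3 = "ATGAGCAGAGACTTGTTTGGC"

print(f1_sizes)
print(major_f5_r5, alt_f5_r5)
print(reverse_complement(F6) == R3)
```

The same subtraction of the exon VII length gives the size expected for an exon VII-lacking product of any primer pair that spans this exon.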
3.4. Sequence analysis of two MOB transcripts

We determined the primary structure of two RT-PCR products obtained with the primers F1/R7 (exons I–X) on the cerebellar RNA template. One was the product of the expected size (major fragment, 1771 bp) and the other was the shortest product visualized by silver staining (minor fragment, 918 bp). The sequence of the major fragment (GenBank accession no. AY332650) was identical with that of the corresponding region of the predicted in silico MOB transcript sequence. The shortest fragment (GenBank accession no. AY364088) appears to lack the exon VII sequence.

3.5. In silico analysis of a hypothetical protein encoded by alternative MOB transcript lacking exon VII

We deduced a hypothetical 214 amino acid product (Mob1) encoded by a MOB transcript devoid of exon VII. The coding region, which spans exons VI and VIII–XI, is in-frame with that determined for the major MOB transcript. The initiation ATG codon, distinct from uATGVI, is located at nucleotide positions 698–700 within a weak nucleotide context (GCCAAGATGA). Nine amino acids encoded by exon VI distinguish Mob1 from Mob. The Mob1 sequence encoded by exons VIII–XI is identical with that deduced for Mob; three transmembrane (TM) domains corresponding to TM3–TM5 of Mob (Vladychenskaya et al., 2002) are localized within this sequence fragment. Mob1 protein was found to lack a sterile alpha motif (SAM) domain.

3.6. Identification of MOB transcription start sites

We used primer extension for detection of the MOB transcription start sites. Different primers specific to exon I failed to generate any product in the assay with human cerebellum RNA. Use of the primer complementary to exon VI, known for its relatively low GC content, resulted in two radioactive signals (Fig. 3). One was of low intensity and corresponded to a fragment of approximately 700 nucleotides. A stronger signal represented a fragment of about 350 nucleotides.
Reproducibility of these results in experiments with other human brain tissues (hippocampus and frontal cortex) indirectly supported their specificity (data not shown).

3.7. In silico analysis of the region flanking the MOB 5′ end

A total of 15 kb of continuous genomic sequence obtained from GenBank (GenBank accession no. NT_008583) and flanking the supposed MOB 5′ end was analysed using GENOMATIX software. Two putative promoter regions were revealed within the 1-kb genomic fragment adjacent to the 5′ end of exon I. Neither of these regions contained canonical TATA boxes. On the genomic sequence, these regions were located at nucleotide positions from −754 to
Fig. 2. In vitro detection of the constructed MOB transcript. (A) Location of the primers on the predicted MOB gene transcript and the expected RT-PCR products: schematic model. The shadowed box corresponds to the coding portion of the transcript. The exons are numbered. Primers are indicated as black circles; arrows show their 5′ to 3′ orientation. Arrow-flanked bars represent the expected length of the amplified fragments. (B) Products amplified from human cerebellar cDNA with primers specific to the different regions of the constructed MOB transcript. Size fractionation was performed in 6% nondenaturing polyacrylamide gel. (1) Primer pair F1/R1 (exons I–V); (2) F1/R2 (exons I–VI); (3) F1/R3 (exons I–VII); (4) F1/R5 (exons I–VIII); (5) F1/R6 (exons I–IX); (6) F1/R7 (exons I–X); (7) F1/R8 (exons I–XI); M, marker (λ/PstI); the size of the marker bands is indicated. (C) Southern blot hybridization of the products amplified with primers F1/R8 (exons I–XI) from the cerebellar cDNA with P32-labelled recombinant plasmids bearing the 1.5-kb Hmob33 insert. Size fractionation was performed in 1% agarose. Black arrows indicate the amplification fragments detected both by silver staining on the polyacrylamide gel and by hybridization with the radioactive probe; empty arrows indicate the amplification fragments detectable only by hybridization. (1) amplification products; (2) negative control (λ/PstI). Bars indicate the arrangement and sizes of the molecular weight marker bands (λ/PstI).
−482 and from −209 to +26 relative to the 5′-end nucleotide of MOB exon I. No promoters were predicted within the rest of the analysed genomic sequence. According to GrailEXP software predictions, MOB exon I is located within a 2.8-kb CpG island (nucleotide positions from −1734 to +1293 relative to the 5′-end nucleotide of MOB exon I) with a GC content of 67.18% and a CpG ratio of 0.81.

Fig. 3. Determination of the MOB transcription start sites. The products were generated from human cerebellar poly (A)+ RNA by extension of the primer R2 specific to MOB exon VI and separated on 8% polyacrylamide denaturing gel. (1) products of the primer extension reaction; M, marker (φX174/HinfI); size of the marker fragments is indicated. Arrows point to the transcription start sites.

3.8. MOB expression analysis
Expression patterns of the major MOB transcript and the alternative transcript lacking exon VII were analysed in human cerebellum, forebrain cortex, hippocampus, spleen, lymph node, liver, kidney and lung by RT-PCR. In one set of experiments, we examined the major transcript expression using the primer pair F6/R4; both primers corresponded to MOB exon VII and generated a single amplification product of 410 bp. Primers F5 and R5, used in another set of experiments for the investigation of the alternative transcript expression, were specific to exons VI and VIII, respectively. They generated two products corresponding to alternative (114 bp) and major transcript (968 bp) fragments. Primers F299 and R593, specific to distinct exons of human β2MG, were used as internal controls in both sets of experiments (Fig. 2A, Table 2).

RT-PCR conditions were optimized to allow comparisons between the expression levels of each type of MOB transcript in the different tissues. Fig. 4 shows the electrophoretic distribution of the RT-PCR products amplified from all the cDNAs with the primer combinations F6/R4, F299/R593 (major transcript fragment) and F5/R5, F299/R593 (major and alternative transcript fragments). The expression profiles of the major and alternative transcripts appeared to be very similar. Both MOB transcripts were expressed more actively in cerebellum, hippocampus, forebrain cortex and kidney, whereas the lowest expression level was observed in spleen, lymph node and liver. However, taking into consideration the disproportion between MOB and β2MG primer concentrations (see Section 2.3), the general expression level of both MOB transcripts was assessed as being very low in all the tissues investigated.

Fig. 4. Expression patterns of the major MOB transcript and of a transcript lacking exon VII. (A) Fragments of the major transcript amplified from different human tissue cDNAs with the primers F6/R4 specific to exon VII. (B) Fragments of the alternative and major transcripts amplified from the same cDNAs with the primers F5/R5 specific to exons VI and VIII. In both experiments, the PCR products amplified with the primers F299/R593 specific to β2MG were used as internal controls. PCR products were fractionated on nondenaturing 6% polyacrylamide gels. Arrows point to the products amplified from MOB major transcript (410 and 968 bp), MOB alternative transcript (114 bp) and β2MG transcript (294 bp). (1) Cerebellum; (2) forebrain cortex; (3) hippocampus; (4) kidney; (5) lung; (6) liver; (7) lymph node; (8) spleen; M, marker (λ/PstI; the size is indicated).
4. Discussion

The human gene MOB, expressed predominantly in the brain and located on human chromosome 10, was proposed on the basis of our previous results; its transcript was constructed in silico (Dergunova et al., 1998; Vladychenskaya et al., 2002). Having now specified the ambiguous 5′ end of this sequence, we constructed the updated contig BN000143 representing the improved primary structure of the MOB transcript. Our data indicate that MOB comprises at least 11 exons and 10 introns spanning about 320 kb of genomic sequence.

In silico analysis of the newly constructed MOB transcript has shown some structural features that we believe will be very helpful in elucidating the biological function of the MOB gene. First, the organization of the 5′-UTR is worthy of notice. The long 5′-UTR (954 nucleotides vs. the mean human mRNA 5′-UTR length of 210.2 nucleotides; Pesole et al., 2001) is distributed along six exons and partially covers the seventh exon; among these, exon I is highly rich in guanosine (G) and cytidine (C) residues and can possibly form specific secondary structures near the transcript cap region. Moreover, the 5′-UTR harbours three uORFs located in three distinct exons. In theory, such structures are able to affect translation efficiency by severely hampering the scanning process of the 40S ribosome complex (Kozak, 1989; Van der Velden and Thomas, 1999; Pesole et al., 2001; Meijer and Thomas, 2002). Because these tentative translationally negative cis modulators are localized within separate MOB exons, we expect that alternative splicing will prove to be one of the mechanisms controlling MOB translational efficiency. Second, the presumed coding region of the MOB transcript starts with a weak ATG codon, and it is well known that an unsuitable context for the ATG start codon also impairs mRNA translation efficiency (Kozak, 2000).
Thus, all the structural features proposed for the 5V-UTR and for the coding region of the MOB transcript belong to the hallmarks of a strong repression of translational activity, and, in general, are the tools of very specific posttranscriptional control of eukaryotic mRNA activity. Such negative regulation is typical for the mRNAs encoding regulatory proteins, such as proto-oncogenes, growth factors, their receptors and homeodomain proteins. Third, the 3V-UTR is half as long again as the mean 3V-UTR of 1027 nucleotides estimated for humans (Pesole et al., 2001) and is ATenriched so that AT content is 70% vs. the general mean of 55% (Pesole et al., 2001). Six ATTTA motifs were found within the 3V-UTR in the vicinity of the poly-T and poly-A tracts; these motifs may represent the functional domains
of adenylate/uridylate-rich elements (AREs) known as determinants of mRNA instability. Such AREs are found in numerous labile, short-lived mRNAs encoding regulatory proteins, such as proto-oncoproteins, growth factors and their receptors, inflammatory mediators and cytokines (Chen and Shyu, 1995; Guhaniyogi and Brewer, 2001). Taken together, the structural features revealed by examination of the newly constructed MOB transcript suggest that this messenger may be a short-lived molecule with strong regulation of its translational efficiency.

The RT-PCR assay allowed us to detect the predicted MOB transcript in human tissues. Electrophoretic separation of the amplified products revealed fragments of the expected size, seen as major bands. The primary structure of the major RT-PCR product amplified with the primers to exons I and X (GenBank accession no. AY332650) was fully identical to that of the corresponding region of the MOB transcript predicted in silico. In addition, a number of minor amplification fragments were detected on the same gel for the primer pairs for exons I–VIII, I–IX, I–X and I–XI. We consider this evidence for additional, less abundant MOB splice forms. For each primer pair, the difference in size between the shortest minor fragment and the major one was estimated as approximately 800 bp, similar to the size of the longest coding exon, VII. Therefore, we propose that the shortest fragment represents an alternative MOB transcript resulting from excision of exon VII. Indeed, sequencing of the shortest silver-detectable RT-PCR product obtained with the primers to exons I and X (GenBank accession no. AY364008) demonstrated that the corresponding transcript lacked the exon VII sequence.
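The inference above, from an observed size difference to a skipped exon, can be illustrated as a small search over exon lengths. The Python fragment below is a sketch under stated assumptions: the only figure taken from the text is the size of roughly 800 bp for exon VII; all other exon lengths are hypothetical placeholders, the function names are invented for this example, and primer positions within the terminal exons are ignored.

```python
def amplicon_size(exon_lengths, skipped=()):
    """Sum the lengths of all exons retained between the two primers.
    Simplification: whole-exon lengths are used, so primer offsets within
    the first and last exons are ignored."""
    return sum(length for name, length in exon_lengths.items()
               if name not in skipped)


def candidate_skips(exon_lengths, observed_diff, tolerance):
    """Return every run of contiguous internal exons whose combined length
    lies within `tolerance` of the observed size difference. The first and
    last exons carry the primers and are therefore never skipped."""
    internal = list(exon_lengths)[1:-1]
    hits = []
    for i in range(len(internal)):
        total = 0
        for j in range(i, len(internal)):
            total += exon_lengths[internal[j]]
            if abs(total - observed_diff) <= tolerance:
                hits.append(tuple(internal[i:j + 1]))
    return hits


# HYPOTHETICAL exon lengths in bp for an exon I to exon X amplicon; only the
# value of about 800 bp for exon VII is taken from the text.
exons = {"I": 150, "II": 90, "III": 120, "IV": 100, "V": 110,
         "VI": 130, "VII": 800, "VIII": 140, "IX": 160, "X": 200}

full = amplicon_size(exons)
short = amplicon_size(exons, skipped=("VII",))
print(full - short)                                   # size difference in bp
print(candidate_skips(exons, observed_diff=800, tolerance=50))
```

With real exon lengths, more than one run of exons may match a gel-estimated difference within the chosen tolerance, which is why sequencing of the product, as reported above, is needed to confirm the assignment.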
The distribution pattern of the clustered minor fragments indicates that they could result from splicing events within the gene region comprising exons VII and VIII (a fragment of the coding portion of the gene) and intron VII. Moreover, alternative MOB transcripts different from those described above were detected by Southern blot analysis within the same tissue. Upon hybridization of the amplification products obtained with the primers to exons I and XI with the 32P-labelled Hmob33 insert, four distinct radioactive signals instead of two were detected. As expected, the most intense signal (between 2.1 and 2.4 kb) corresponded to the major PCR product, which ran as one band with the adjacent alternative products on the agarose gel, and another (less than 1.7 kb) corresponded to the shortest alternative PCR product. Two additional signals (one between 1.09 and 1.16 kb and one less than 0.80 kb) revealed amplification products that were not detectable on the polyacrylamide gel by silver staining. These extra signals could represent extremely rare alternative MOB transcripts resulting from as yet unknown exon combinations. Alternatively, we cannot exclude that these signals represent degradation products of MOB transcripts. We thus demonstrate a wide spectrum of alternative MOB splice forms in human cerebellum. Under given
conditions, most of the alternative changes were found to affect the MOB coding region, thus providing for a diversity of functional isoforms of the corresponding protein within the same tissue. Compared with the hypothetical protein Mob (product of the major MOB transcript), protein Mob1 (product of the transcript lacking exon VII) has a truncated N-terminus devoid of a SAM domain and the first two transmembrane domains. Instead, a short nine-nucleotide amino acid sequence of as yet unknown function represents the N-terminus of Mob1.

The primer extension assay performed to determine MOB transcription initiation sites failed to generate any product from primers specific to exon I in experiments with human cerebellum RNA. We believe that the GC-rich nature of this region could hinder primer annealing. In another set of experiments, we used a primer specific to exon VI, which has a relatively low GC content. Two signals of different intensity were revealed. One, of low intensity and approximately 700 nucleotides long, was considered to represent the expected fragment; the other, a highly intense signal of about 350 nucleotides, was presumed to represent a novel type or types of transcript synthesized from an alternative start site. This site could either initiate the synthesis of a transcript or transcripts far downstream from MOB exon I or be responsible for transcripts bearing as yet unknown alternative 5′-end exons. Some cDNA clones probably representing such alternative 5′-end fragments were revealed by in silico analysis (data not shown). These results demonstrate that at least two different initiation sites for MOB gene transcription are functionally active in human cerebellum. Most probably, their activity is governed by two distinct promoters, each specific to different cerebellar cell types.
Taking the primer extension results as evidence that the 5′ end of the constructed MOB transcript is correct, we performed an in silico search for putative promoters within the genomic sequence flanking the 5′ end of MOB exon I. Two possible promoter regions were predicted just upstream of exon I within the extended CpG island. Their TATA-less structure, together with their location within the CpG island, shows that these putative promoters resemble those of genes encoding housekeeping enzymes, growth factors and their receptors, transcription factors and oncogenes (Azizkhan et al., 1993; Cross and Bird, 1995). We therefore propose that the MOB expression pattern could be similar to that of housekeeping genes; that is, MOB could be expressed ubiquitously. This hypothesis is consistent with multiple EST data demonstrating that MOB is expressed in a wide spectrum of human tissues (Vladychenskaya et al., 2002). Our RT-PCR results confirm that the MOB expression pattern is not tissue specific. Maximum expression of the major MOB transcript, as well as of the transcript lacking exon VII, was detected in the brain samples and in the kidney, with minimal expression in spleen, lymphatic tissue and liver. Although differing between the tissue samples, the overall
MOB expression level was very low in all the tissues examined.

Summarizing our observations, we conclude that the in silico cloned MOB gene is functionally active and is expressed in a wide spectrum of human tissues. Under given conditions, MOB is able to produce alternatively spliced transcripts that differ in their coding portions and thus generate a diversity of encoded proteins within the same tissue. We suspect that two promoters govern MOB expression. We presume that MOB functional activity is regulated by simultaneous transcriptional and posttranscriptional control that can strictly regulate the yield of the encoded protein products. Such combined control allows fast adaptation to a changing cell context and is typical of some genes that have to be activated only under very specific conditions, for example, during changing stages of cell differentiation or embryonic development (Bernstein et al., 1995; Meijer et al., 2000). We believe MOB might be such an inducible gene, activated under as yet unknown cell contexts.
Acknowledgements

This study was supported by the MCB program of the Russian Academy of Sciences, by Federal Support of Leading Scientific Schools of the Russian Ministry of Science and Technology (project no. 1430.2003.4) and by the Russian Foundation for Basic Research (project no. 0204-48809).
References

Azizkhan, J.C., Jensen, D.E., Pierce, A.J., Wade, M., 1993. Transcription from TATA-less promoters: dihydrofolate reductase as a model. Crit. Rev. Eukaryot. Gene Expr. 3, 229–254.
Bernstein, J., Shefler, I., Elroy-Stein, O., 1995. The translational repression mediated by the platelet-derived growth factor 2/c-sis mRNA leader is relieved during megakaryocytic differentiation. J. Biol. Chem. 270, 10559–10565.
Budowle, B., Chakraborty, R., Giusti, A.M., Eisenberg, A.J., Allen, R.C., 1991. Analysis of the VNTR locus D1S80 by the PCR followed by high-resolution PAGE. Am. J. Hum. Genet. 48, 137–144.
Buiakova, O.I., Barinova, O.I., Dergunova, L.V., Khaspekov, G.L., Chivilev, I.V., Limborskaia, S.A., 1992. Isolation and analysis of brain-specific sequences from cDNA libraries for various segments of the human brain. Genetika 28, 40–46.
Chen, C.Y., Shyu, A.B., 1995. AU-rich elements: characterization and importance in mRNA degradation. Trends Biochem. Sci. 20, 465–470.
Chomczynski, P., Sacchi, N., 1987. Single-step method of RNA isolation by acid guanidinium thiocyanate–phenol–chloroform extraction. Anal. Biochem. 162, 156–159.
Cross, S.H., Bird, A.P., 1995. CpG islands and genes. Curr. Opin. Genet. Dev. 5, 309–314.
Dergunova, L.V., Vladychenskaia, I.P., Polukarova, L.G., Raevskaia, N.M., Lelikova, G.P., Limborskaia, S.A., 1998. Features of the structure, expression and chromosomal mapping of the nucleotide sequences of Hmob3 and Hmob33, obtained from a human medulla oblongata cDNA library. Mol. Biol. (Mosk.) 32, 249–254.
Dergunova, L.V., Raevskaya, N.M., Vladychenskaya, I.P., Limborska, S.A., 2003. Hmob3 brain-specific sequence is a part of phylogenetically conserved human MAP1B gene 3′-untranslated region. Biomol. Eng. 20, 91–96.
Guhaniyogi, J., Brewer, G., 2001. Regulation of mRNA stability in mammalian cells. Gene 265, 11–23.
Kozak, M., 1989. Circumstances and mechanisms of inhibition of translation by secondary structure in eucaryotic mRNAs. Mol. Cell. Biol. 9, 5134–5142.
Kozak, M., 2000. Do the 5′ untranslated domains of human cDNAs challenge the rules for initiation of translation (or is it vice versa)? Genomics 70, 396–406.
Meijer, H.A., Dictus, W.J., Keuning, E.D., Thomas, A.A., 2000. Translational control of the Xenopus laevis connexin-41 5′-untranslated region by three upstream open reading frames. J. Biol. Chem. 275, 30787–30793.
Meijer, H.A., Thomas, A.A., 2002. Control of eukaryotic protein synthesis by upstream open reading frames in the 5′-untranslated region of an mRNA. Biochem. J. 367, 1–11.
Pesole, G., Mignone, F., Gissi, C., Grillo, G., Licciulli, F., Liuni, S., 2001. Structural and functional features of eukaryotic mRNA untranslated regions. Gene 276, 73–81.
Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, New York, sections 9.38–9.58.
van der Velden, A.W., Thomas, A.A., 1999. The role of the 5′ untranslated region of an mRNA in translation regulation during development. Int. J. Biochem. Cell Biol. 31, 87–106.
Vladychenskaya, I.P., Dergunova, L.V., Limborska, S.A., 2002. In vitro and in silico analysis of the predicted human MOB gene encoding a phylogenetically conserved transmembrane protein. Biomol. Eng. 18, 263–268.