Molecular and Biochemical Parasitology, 56 (1992) 181-184 © 1992 Elsevier Science Publishers B.V. All rights reserved. / 0166-6851/92/$05.00
MOLBIO 01851
Short Communication
Identification of a cryptic intron in the Plasmodium vivax Duffy binding protein gene

John H. Adams b, Xiangdong Fang a, David C. Kaslow a and Louis H. Miller a

a Laboratory of Malaria Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA; and b Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA

(Received 6 July 1992; accepted 30 July 1992)
Key words: Plasmodium; Malaria; Invasion; Adhesin; Duffy blood group antigen; Microneme; Exon
Some genes in Plasmodium falciparum have no introns (CSP, MSP-1, RAP-2 [1-3]). Others have introns identified by comparison of cDNA to genomic sequence (reviewed in ref. 4). Introns were bounded by consensus donor and acceptor splice sites, were more AT-rich than the coding regions, and usually failed to maintain an open reading frame. However, the actual open reading frame of a gene may be misinterpreted when it is deduced from genomic DNA alone if introns are not suspected and the introns do not disrupt the open reading frame. We describe a revised deduced amino acid sequence of the Plasmodium vivax Duffy binding protein based on new data from PCR-amplified reverse-transcribed mRNA.

Correspondence address: John H. Adams, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA.

Abbreviations: cDNA, complementary DNA; CSP, circumsporozoite protein; EBA-175, erythrocyte binding antigen 175; MSP-1, major surface protein 1; PCR, polymerase chain reaction; RAP-2, rhoptry associated protein 2.

Note: The GenBank accession number of the Plasmodium vivax Duffy binding protein gene is M37514 and has been amended from the original submission [6]. The Plasmodium knowlesi β Duffy binding protein gene GenBank accession number is M90694.
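The opening paragraph lists the features that usually mark an intron (consensus GT...AG splice sites, higher AT content) and notes that an intron can escape notice when it does not disrupt the open reading frame. A minimal sketch of that idea follows, assuming a toy sequence invented for illustration (it is not the P. vivax gene), a hypothetical minimum intron length of 30 nt, and a deliberately simple definition of "cryptic": a GT...AG span whose length is a multiple of three and whose unspliced read-through from the ATG contains no stop codon.

```python
# Toy illustration only: the sequence is invented, and min_len is an
# arbitrary assumption, not a value taken from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def at_fraction(seq):
    """Fraction of A/T bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq) if seq else 0.0


def has_in_frame_stop(seq):
    """True if any complete codon, read from position 0, is a stop codon."""
    seq = seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq) - 2, 3))


def cryptic_intron_candidates(genomic, min_len=30):
    """Return (start, end) spans that look like introns but keep the ORF open.

    genomic is assumed to begin at the initiation codon. A span qualifies if
    it starts with GT, ends with AG, has a length divisible by three, and the
    unspliced sequence up to the end of the span has no in-frame stop codon.
    """
    genomic = genomic.upper()
    hits = []
    for start in range(len(genomic) - 1):
        if genomic[start:start + 2] != "GT":
            continue
        for end in range(start + min_len, len(genomic) + 1):
            if genomic[end - 2:end] != "AG" or (end - start) % 3:
                continue
            if not has_in_frame_stop(genomic[:end]):
                hits.append((start, end))
    return hits


# Invented example: 9-nt exon, 30-nt in-frame "intron", 12-nt exon.
toy_exon1 = "ATGAAAGGA"
toy_intron = "GTATCATATAAGGATGATTTTTCTATCCAG"
toy_exon2 = "AACAACCTCCTC"
toy_genomic = toy_exon1 + toy_intron + toy_exon2

candidates = cryptic_intron_candidates(toy_genomic)
print(candidates)
print(round(at_fraction(toy_intron), 2), round(at_fraction(toy_exon2), 2))
```

A scan like this only flags candidates; as the text argues, confirmation requires comparison with sequence derived from mRNA.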
An intron in the region of the signal sequence of the Plasmodium knowlesi β gene encoding a Duffy binding family protein was delineated by comparing cDNA to genomic sequence [5]. When the nucleotide sequences of the genes encoding the Duffy binding proteins of P. vivax [6] and P. knowlesi β were aligned (Fig. 1A), the sequences were similar before and after, but not within, the region where the P. knowlesi β intron is found. In P. vivax this sequence was originally interpreted as an open reading frame [6]. Since the P. vivax nucleotide sequence has potential intron splice donor and acceptor sites at each end of this region (Fig. 1A), we suspected that the P. vivax gene contained the intron even though the region maintained an open reading frame. To determine whether the P. vivax gene had an intron in this region, cDNA was generated for sequencing. The cDNA was synthesized from total cellular RNA with an antisense oligonucleotide 100 bp downstream from the possible intron and then was amplified by PCR using the cDNA primer and a sense oligonucleotide 500 bp upstream of the possible intron (based on the genomic sequence; ref. 6). The major cDNA PCR product was 135 bp smaller than the PCR product of a genomic clone, indicating that the intron was spliced (Fig. 2, lanes 1 and 5, left panel). An
[Fig. 1. Panel A: aligned nucleotide sequences of the P. vivax (Pv) and P. knowlesi β (Pkβ) Duffy binding protein genes, from the ATG through the start of exon 2. Panel B: aligned deduced amino acid sequences.]
Fig. 1. (A) Each nucleotide sequence, P. vivax (Pv) and P. knowlesi β (Pkβ) Duffy binding protein genes, begins with the putative initiation of translation codon (ATG) and continues through the first 27 residues of exon 2. Spaces have been inserted to give the best alignment of the sequences and conserved bases are indicated with a vertical bar (|). Nucleotides from exons are in upper case; nucleotides from introns are in lower case, including the consensus splice sites (gta...yag). (B) The deduced amino acid sequences of the cDNA are aligned with one space inserted. A slash mark (/) indicates the location of exon junctions. An asterisk (*) identifies conserved residues and a period (.) identifies semiconserved changes in residues.
[Fig. 2. Diagram of the P. vivax gene showing the 5' PCR and 3' PCR regions and the oligonucleotide probe, with genomic DNA and cDNA drawn 5' to 3'; two gel panels, each with lanes 1-6.]
Fig. 2. Intron spliced from mRNA in the 5' end of the gene encoding the P. vivax Duffy binding protein. The diagram depicts the open reading frame in the exons as boxes and the untranslated regions as a line. The cysteine-rich regions of exon 2, the erythrocyte binding domain, are black. Using primers (antisense oligonucleotide 5'-CTCAACACAGTGGTGTCC-3'; sense oligonucleotide 5'-CGAAACATTGCCACAAAAATTTT-3') that bracketed the predicted 5' intron (5' PCR) of the P. vivax gene, late schizont cDNA (lane 1) and genomic DNA (lane 5) were amplified by PCR and size fractionated by agarose gel electrophoresis (left panel, ethidium bromide stained gel with HaeIII-digested X DNA standards, lanes 3,4). As a control for the presence of genomic DNA, PCR was done using primers inside (5'-AACTTCACAAATGGAAT-3') and outside (5'-CAGAGATGTCAAAGCAACAAGGGA-3') a 3' intron (3' PCR). A Southern blot of the PCR products was hybridized with an oligonucleotide (5'-CTATCACACTAATAAAT-3') located in the predicted intron. This oligonucleotide hybridized with the major PCR product amplified from genomic DNA (right panel, lane 5) but only a faint hybridization band of that size (not visible on the reproduction) was detectable in the cDNA (left panel, lane 1). The Southern blot hybridization was washed in 6X SSC, 0.5% SDS at 44°C. The 5' PCR product was directly sequenced (dsDNA Cycle Sequencing System, Life Technologies, Inc.) without cloning using oligonucleotide sequence primers from upstream (5'-TGATATATGCGCTTAAATT-3') and inside (5'-CTATCACACTAATAAAT-3') the predicted intron. Sequence was generated using a primer upstream of the intron, but none was generated using the primer from the intron (data not shown).
oligonucleotide from the intron used to probe the Southern blot of the cDNA PCR product hybridized with only a minor band in the region where the intron was still present; the major ethidium bromide band did not hybridize with the probe (Fig. 2, right panel, lane 1). The minor band could have resulted from one of the following: (a) the intron was not spliced in rare transcripts; (b) amplification of unspliced nuclear mRNA; or (c) minor contamination by genomic DNA or a plasmid containing this region of genomic DNA. We have not to date excluded a minor unspliced transcript. The cDNA PCR product was sequenced directly using radiolabeled internal primers by Taq polymerase-based cycle sequencing. The cDNA sequence confirmed that the intron was removed in the major P. vivax transcript, giving rise to an amino acid sequence homologous to the P. knowlesi β gene (Fig. 1B).

This study raises the question of whether amino acid sequences deduced from genomic sequence alone should be confirmed, when possible, by PCR amplification of transcribed mRNA in order to exclude introns. Errors in identifying the complete open reading frame of a gene can also occur at the presumed end of the open reading frame where introns may have gone unrecognized. Such was the case with the 175-kDa erythrocyte binding protein of P. falciparum (EBA-175/sialic acid binding protein), which was recently corrected [5,7].
Failure to predict the correct open reading frame may complicate efforts to study functional molecules in other expression systems.
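The size comparison used above can be sketched in code. The example below is a minimal illustration, assuming invented filler sequence between the two published 5' PCR primers and assuming the intron length equals the reported 135-bp size difference; the helper names are hypothetical and the flank lengths do not reproduce the 500-bp and 100-bp primer distances. It shows why a 135-bp intron leaves the reading frame intact: 135 is divisible by three, so an unspliced reading would add 45 codons without shifting the frame.

```python
# Published 5' PCR primers (Fig. 2); everything else below is invented filler.
SENSE_PRIMER = "CGAAACATTGCCACAAAAATTTT"
ANTISENSE_PRIMER = "CTCAACACAGTGGTGTCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template, sense_primer, antisense_primer):
    """Length of the product bounded by both primers, or None if one is absent.

    The antisense primer anneals to the top strand at the reverse complement
    of its own sequence, so that is the site searched for downstream.
    """
    template = template.upper()
    fwd = template.find(sense_primer.upper())
    if fwd == -1:
        return None
    rev_site = reverse_complement(antisense_primer)
    rev = template.find(rev_site, fwd + 1)
    if rev == -1:
        return None
    return rev + len(rev_site) - fwd


# Hypothetical templates: identical except for a 135-nt GT...AG intron.
toy_upstream = "ACCA" * 10
toy_downstream = "ACCA" * 5
toy_intron = "GT" + "A" * 131 + "AG"
antisense_site = reverse_complement(ANTISENSE_PRIMER)

toy_genomic = SENSE_PRIMER + toy_upstream + toy_intron + toy_downstream + antisense_site
toy_cdna = SENSE_PRIMER + toy_upstream + toy_downstream + antisense_site

genomic_len = amplicon_length(toy_genomic, SENSE_PRIMER, ANTISENSE_PRIMER)
cdna_len = amplicon_length(toy_cdna, SENSE_PRIMER, ANTISENSE_PRIMER)
size_difference = genomic_len - cdna_len
extra_codons, frame_shift = divmod(size_difference, 3)

print(size_difference, extra_codons, frame_shift)
```

A remainder of zero is what makes such an intron cryptic in genomic sequence: the size shift on the gel reveals it, whereas the reading frame alone does not.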
References

1 Dame, J.B., Williams, J.L., McCutchan, T.F., Weber, J.L., Wirtz, R.A., Hockmeyer, W.T., Maloy, W.L., Haynes, J.D., Schneider, I., Roberts, D., Sanders, G.S., Reddy, E.P., Diggs, C.L. and Miller, L.H. (1984) Structure of the gene encoding the immunodominant surface antigen on the sporozoite of the human malaria parasite Plasmodium falciparum. Science 225, 593-599.
2 Holder, A.A., Lockyer, M.J., Odink, K.G., Sandhu, J.S., Riveros, M.V., Davey, L.S., Tizard, M.L.V., Schwarz, R.T. and Freeman, R.R. (1985) Primary structure of the precursor to the three major surface antigens of Plasmodium falciparum merozoites. Nature 317, 270-273.
3 Saul, A., Cooper, J., Hauquitz, D., Irving, D., Cheng, Q., Stowers, A. and Limpaiboon, T. (1992) The 42-kilodalton rhoptry-associated protein of Plasmodium falciparum. Mol. Biochem. Parasitol. 50, 139-150.
4 Brown, H.J. and Coppel, R.L. (1991) Primary structure of a Plasmodium falciparum rhoptry antigen. Mol. Biochem. Parasitol. 49, 99-110.
5 Adams, J.H., Sim, B.K.L., Dolan, S.A., Fang, X., Kaslow, D.C. and Miller, L.H. (1992) A family of erythrocyte binding proteins of malaria parasites. Proc. Natl. Acad. Sci. USA, in press.
6 Fang, X., Kaslow, D.C., Adams, J.H. and Miller, L.H. (1991) Cloning of the Plasmodium vivax Duffy receptor. Mol. Biochem. Parasitol. 44, 125-132.
7 Sim, B.K., Orlandi, P.A., Haynes, J.D., Klotz, F.W., Carter, J.M., Camus, D., Zegans, M.E. and Chulay, J.D. (1990) Primary structure of the Plasmodium falciparum erythrocyte binding antigen and identification of a peptide which elicits antibodies that inhibit malaria merozoite invasion. J. Cell Biol. 111, 1877-1884.