International Journal for Parasitology 30 (2000) 723–728
www.elsevier.nl/locate/ijpara
Research note
Inter- and intra-strain variation in the 5.8S ribosomal RNA and internal transcribed spacer sequences of Entamoeba histolytica and comparison with Entamoeba dispar, Entamoeba moshkovskii and Entamoeba invadens ¹

Indrani Som (a), Amir Azam (c), Alok Bhattacharya (b), Sudha Bhattacharya (a,*)

(a) School of Environmental Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
(b) School of Life Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
(c) Department of Chemistry, Jamia Milia Islamia, New Delhi, 110025, India

Received 10 January 2000; received in revised form 3 April 2000; accepted 3 April 2000
Abstract

The ribosomal RNA genes in Entamoeba histolytica are located on circular DNA molecules in about 200 copies per genome equivalent. Nucleotide sequence analysis of the 5.8S rRNA gene and the flanking internal transcribed spacers was carried out to determine the degree of sequence divergence in the multiple rRNA gene copies of a given strain; amongst three different E. histolytica strains (HM-1:IMSS, Rahman and HK-9); and amongst four species of Entamoeba (Entamoeba histolytica, Entamoeba dispar, Entamoeba moshkovskii and Entamoeba invadens). The results show that all rRNA gene copies of a given strain are identical. Few nucleotide positions varied between strains of a species but the differences were very pronounced amongst species. In general, the internal transcribed spacer 2 sequence was more variable and may be useful for strain and species identification. The 5.8S rRNA gene and the internal transcribed spacer 2 of E. invadens were unusually small in size. © 2000 Australian Society for Parasitology Inc. Published by Elsevier Science Ltd. All rights reserved.

Keywords: E. histolytica rRNA; Entamoeba rRNA; Internal transcribed spacers; Ribosomal RNA genes; rRNA sequence divergence
* Corresponding author. Tel.: +91-011-6107676/2308; fax: +91-011-6172438. E-mail address: [email protected] (S. Bhattacharya).
¹ New sequence data reported here are deposited in GenBank with the following accession numbers: X89635, Y12249, Y12250 and Y12251.
0020-7519/00/$20.00 © 2000 Australian Society for Parasitology Inc. Published by Elsevier Science Ltd. All rights reserved. PII: S0020-7519(00)00050-3

Ribosomal RNA genes are amongst the most ubiquitous and conserved DNA sequences found in nature [1]. For this reason they have been extensively used for phylogenetic analysis [2] and as diagnostic probes [3]. These genes have been analysed from a large variety of organisms, ranging from protozoa to higher plants [4–11]. In general, while the rRNA gene sequences change very slowly due to selection pressure, the internal transcribed spacer (ITS) regions which separate the 18S rRNA from 5.8S rRNA (ITS-1) and 5.8S rRNA from 28S rRNA (ITS-2) are much more variable both in length and sequence. The nature of these variations, when better understood, may provide insights into the mechanisms by which DNA sequences evolve.

The extent of conservation of rRNA genes and ITS sequences seems to vary, depending on the organism. In most organisms, the rRNA gene sequences are well conserved within strains of the same species. However, there are instances of distinct sets of rRNA genes existing in the same organism. For example, in Plasmodium there are two clearly defined sets of rRNA genes, the A-type (expressed during the asexual phase of the life cycle) and the S-type (expressed in the mosquito during sporogony). The 18S rRNA genes in members of the same set are completely identical but
show about 10% variation between the two sets. The ITS sequences are identical at 80% to 91% of positions among the A-type genes and at 75% among the S-type genes, with only 42% to 57% identity between the two sets [12]. This developmental expression of rRNA genes may be controlled by specific sequences in the spacer regions. Studies with human rRNA show the existence of 28S rRNA variants in a single individual [13]. The proportion in which each variant is expressed is the same in all tissues of a given individual.

In the present paper we have made a comparative analysis of rRNA gene sequences in the protozoan parasite E. histolytica. The rRNA genes of this organism are located extrachromosomally on circular DNA molecules [14,15] with no evidence of a chromosomal copy [16]. Depending upon the strain of E. histolytica, each extrachromosomal element has either one or two rRNA transcription units. For example, the strain HM-1:IMSS has two units arranged in an inverted fashion, while the strain HK-9 has only one unit per circle. Knowledge of the extent of variation of rRNA gene sequences is important both for understanding the evolution of these sequences and for their potential use in strain identification within a given species, especially for epidemiological purposes. For this reason we undertook the analysis of these sequences from different species of Entamoeba, from different strains of E. histolytica and from the transcription units within a given strain.

Entamoeba histolytica, strain HM-1:IMSS and strain Rahman; E. dispar, strain CDC 0784; E. moshkovskii, strain Laredo and E. invadens, strain IP-1 were grown axenically in TYI-S-33 medium [17]. The rDNAs were analysed by nucleotide sequencing of PCR-amplified fragments of rDNA transcription units. The primers used for amplification were designed from the known sequence of E.
histolytica rDNA and were derived from the 3' end of the 18S rRNA gene (Primer P1: 5'-AGGTGAACCTGCGGAAGGATCATTA-3') and the 5' end of the 28S rRNA gene (Primer P2: 5'-TCATTCGCCATTACTTAAGAAATCATTGTT-3'). These primers amplify a DNA fragment of around 0.5 kb containing the ITS-1, 5.8S and ITS-2 regions. PCR amplification of genomic DNA was carried out for 30 cycles with denaturation at 94°C for 1 min, annealing at 50°C for 2 min and elongation at 72°C for 2 min. The primers were purchased from GIBCO-BRL and the PCR kit from Perkin-Elmer-Cetus. The amplified products were cloned into the plasmid vector pBluescript II KS+ (Stratagene) by blunt-end ligation. Clones with inserts of the expected size were selected for sequencing.

Strain HM-1:IMSS of E. histolytica contains two copies of rRNA genes located on a 24.5 kb extrachromosomal circular plasmid molecule named EhR1 [18]. The two copies are arranged as inverted repeats. They
are separated by an upstream spacer of ~10 kb and a downstream spacer of ~4 kb, both containing several classes of tandem repeat units. On analysing the restriction enzyme pattern of the two rRNA transcription units of HM-1:IMSS, the two units were found to be identical. However, this does not exclude the possibility of minor sequence differences between the two units. The complete nucleotide sequence of one of the units (rDNA I) was already known. It was therefore decided to determine the partial sequence of the second unit (rDNA II), especially the ITS regions (which are generally much less conserved in sequence than the rRNA coding regions).

The strategy for cloning rDNA II was based on some unique restriction enzyme sites present in the upstream sequence of the two rDNA units. As shown in Fig. 1, ScaI produces two fragments of different sizes corresponding to the two units. The 5' half of rDNA II was contained in the 4.2 kb ScaI fragment. This fragment contains the entire 18S rRNA gene, the 5.8S rRNA gene, both ITS regions and a part of the 28S rRNA gene. To clone this ScaI fragment, E. histolytica genomic DNA from strain HM-1:IMSS and the plasmid vector pBluescript II KS+ were digested with ScaI and EcoRV, respectively. The vector molecules were dephosphorylated at their 5' termini with calf intestinal alkaline phosphatase and mixed in a molar ratio of 1:2 with the digested genomic DNA. Ligation was carried out with 6 Weiss units of T4 DNA ligase in ligase buffer [50 mM Tris-Cl (pH 7.8), 10 mM MgCl2, 10 mM dithiothreitol, 25 µg/ml BSA and 1 mM rATP] at 16°C for 16 h. An aliquot of 4–6 µl of the ligation mixture was used for transforming E. coli competent cells. The recombinant white colonies were further screened by colony hybridisation and by plasmid DNA isolation and restriction mapping. The identity of the clones was confirmed by Southern hybridisation with appropriate DNA probes.
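The two PCR primers P1 and P2 quoted in the amplification protocol can be sanity-checked with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the melting-temperature estimate uses a common GC-content rule of thumb, Tm ≈ 64.9 + 41 × (G + C − 16.4)/N, which is an assumption of this example and not a calculation reported in the paper.

```python
# Primer sequences exactly as given in the text (5' to 3').
P1 = "AGGTGAACCTGCGGAAGGATCATTA"        # derived from the 3' end of the 18S rRNA gene
P2 = "TCATTCGCCATTACTTAAGAAATCATTGTT"   # derived from the 5' end of the 28S rRNA gene


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tm_gc_estimate(seq: str) -> float:
    """Rough melting temperature (deg C) from GC content.

    Assumed rule of thumb for oligonucleotides longer than about 13 nt;
    it ignores salt and primer concentration.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    for name, primer in (("P1", P1), ("P2", P2)):
        print(f"{name}: {len(primer)} nt, GC {gc_fraction(primer):.0%}, "
              f"Tm approx. {tm_gc_estimate(primer):.1f} C")
```

Under this approximation both primers have estimated melting temperatures in the mid-50s °C, a few degrees above the 50°C annealing step, although a rule of thumb of this kind is coarse and should not be over-interpreted.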
Unidirectional deletions of the 4.2 kb clone were generated using exonuclease III [19] and clones containing ITS-1 and ITS-2 were selected for sequencing. Nucleotide sequences were determined for the individual clones by the dideoxy chain termination method [20]. Upon alignment with the already known sequence, the sequence of ITS-1 and ITS-2 in the 4.2 kb ScaI clone was found to be identical to the reported sequence. Further sequence data from parts of the 18S and 28S regions (>1.0 kb) also confirmed that the two units are identical. Therefore, it is inferred that the two transcription units of EhR1 are identical in nucleotide sequence.

Since EhR1 exists in as many as 200 copies per genome, it was of interest to see whether any minor sequence differences exist among the various copies. A library of cloned fragments containing ITS-1, 5.8S and ITS-2 was generated by PCR amplification of total DNA using primers homologous to the 3' end of the 18S
Fig. 1. Identity of the 4.2 kb ScaI clone containing a portion of rDNA II. The linear maps of the two ScaI fragments corresponding to the two units are shown. The numbers 1 and 2 flanking the 5.8S indicate the ITS-1 and ITS-2, respectively. In rDNA II, Tr codes for a 0.7 kb transcript [18].
and the 5' end of the 28S rDNA, as derived from the known sequence. Ten random recombinant clones carrying the 0.5 kb insert of interest were selected and analysed by complete nucleotide sequencing. All the clones were found to be identical in nucleotide sequence. Therefore, it was concluded that the ITS-1, 5.8S and ITS-2 regions were fully conserved not only between the two transcription units of EhR1, but also amongst EhR1 molecules in general.

Since this region of the transcription unit is subject to greater variation, it can be used to generate species-specific probes and for phylogenetic analysis. To check for the pattern of sequence variation, if any, the nucleotide sequence from another strain of E. histolytica and three other species of Entamoeba was determined after PCR amplification of this region. The sequences of E. histolytica strain HK-9 and isolate 43 of E. dispar were already reported in the database [21] and these were also included. Nucleotide sequences were aligned using Clustal W, and the alignment output was refined manually to obtain a better consensus. Phylogenetic analysis was done with the PHYLIP package as implemented in Data Analysis in Molecular Biology and Evolution (DAMBE) version 3.7.49. The neighbor joining method was used to calculate distance values and bootstrap analysis was done with 1000 replicates.

A multiple alignment of all the sequences is shown in Fig. 2 and the results are summarised in Table 1. The boundaries of ITS-1, 5.8S rRNA and ITS-2 shown in Fig. 2 have been deduced from the multiple alignment. The 3' end of the 5.8S rRNA sequence was complementary to the 5' end of the 28S rRNA sequence in each case (including E. invadens). The 5.8S rRNA gene sequence was the most conserved of all, as expected, with the E. histolytica strains closer to each other than to the other species of Entamoeba. The 5.8S rRNA gene sequence of E. histolytica strains HM-1:IMSS and Rahman was
100% identical, and strain HK-9 showed only one nucleotide change. Isolate 43 and the CDC 0784 strain of E. dispar showed a variation of 6% and 2.7% in the nucleotide sequence and 2% and 0.7% in their length, respectively, with respect to HM-1:IMSS. Entamoeba moshkovskii strain Laredo and E. invadens strain IP-1 were quite divergent, with a variation of as much as 18.7% and 39.3% in the nucleotide sequence, respectively. While the 5.8S rRNA of Laredo was only one nucleotide longer than that of HM-1:IMSS, the IP-1 5.8S rRNA was 20 nucleotides shorter. As expected, the ITS sequences were quite different, with ITS-2 being more variable than ITS-1.

Table 1
The length in bp of the internal transcribed spacers and the 5.8S rRNA region of various Entamoeba strains and species. The figures in brackets refer to the number of nucleotide changes including additions, deletions and mismatches with respect to the E. histolytica strain HM-1:IMSS sequence

Spp./Strains        ITS-1       5.8S        ITS-2       Total length
E. histolytica
  HM-1:IMSS         123 (0)     150 (0)     121 (0)     394
  Rahman            124 (1)     150 (0)     121 (1)     395
  HK-9              124 (1)     149 (1)     126 (11)    399
E. dispar
  CDC 0784          124 (6)     149 (4)     113 (28)    386
  isolate 43        123 (7)     153 (9)     113 (29)    389
E. moshkovskii
  Laredo            118 (42)    151 (28)    113 (65)    382
E. invadens
  IP-1              106 (57)    130 (59)    66 (75)     302

A dendrogram of the multiple alignment showing phylogenetic distances is given in Fig. 3. The species and strains included in the analysis separated into three major groups consisting of E. histolytica-type where strain Rahman is overall closer to HM-1:IMSS than to strain HK-9; E. dispar-type consisting of isolate 43 and CDC
0784; and the other Entamoeba, E. moshkovskii and E. invadens. These results are in agreement with phylogenetic analysis of Entamoeba using other gene sequences [22]. These sequences can also be used for epidemiological purposes, as one can identify the strains HM-1:IMSS and HK-9 of E. histolytica in a mixed population based on their restriction enzyme patterns. For example, an AluI site in the ITS-2 of HK-9 is absent
Fig. 2. Multiple alignment of the nucleotide sequence of the internal transcribed spacers and 5.8S rRNA region from different strains and species of Entamoeba. The names are abbreviated as: 1, HM-1 (E. histolytica, strain HM-1:IMSS); 2, RMN (E. histolytica, strain Rahman); 3, HK-9 (E. histolytica, strain HK-9); 4, DISP (E. dispar, strain CDC 0784); 5, NP (E. dispar, isolate 43); 6, LAR (E. moshkovskii, strain Laredo) and 7, INV (E. invadens, strain IP-1). The 5.8S sequence is shown in bold and the start of the ITS-1 and end of the ITS-2 are shown by vertical arrows.
in HM-1:IMSS, while a SalI site present in the ITS-2 of HM-1:IMSS is absent in HK-9.

In the present study the degree of polymorphism in the rDNA loci between different EhR1 molecules was determined by cloning and complete sequencing of 10 random clones from a library of PCR-amplified DNA containing the ITS-1, 5.8S and ITS-2 regions. All the clones were identical in sequence, implying that the rRNA sequence may be fully conserved amongst all gene copies. While the possibility remains that variants exist at a level too low to be detected in our sample of 10 clones, the majority of rRNA genes in E. histolytica clearly belong to one sequence class, unlike the situation in Plasmodium [12] and humans [23]. Since the literature available on the comparison between different rRNA transcription units from the same cell or strain is quite limited, we cannot say whether our observation of sequence homogeneity in rDNA transcription units is unique. It would be interesting to study the mechanism by which these rRNA gene copies are kept so well conserved, even in the ITS regions, in spite of the large number of copies present in a single cell. Any random mutations occurring in the rDNA loci must be eliminated by an active process like gene conversion. Unequal crossing over is also thought to maintain the high copy number and identity of these genes [24].

The nucleotide sequence of E. invadens strain IP-1, which was most distant from E. histolytica, was unusual due to the small size of its ITSs and 5.8S rRNA. The reduction in size of the 5.8S rRNA of IP-1 was mainly due to deletions clustered in the 3' half of the molecule. This fact can be used to study the functional significance of the apparently "dispensable" nucleotides of the 5.8S rRNA molecule in ribosomal function. Small-sized rRNA genes have been reported in Giardia too [25], where the reduction in size does not seem to affect the functional status of the rRNAs. In order to accommodate the changes in rRNA, ribosomal proteins may also be modified in these organisms (concerted evolution) to ultimately conserve function.
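The restriction-site test suggested above for telling HM-1:IMSS from HK-9 can be sketched in Python. The AluI (AGCT) and SalI (GTCGAC) recognition sequences are standard; everything else here is an assumption made for illustration: the function names are hypothetical, the short test sequences are invented placeholders and not the ITS-2 sequences of Fig. 2, and the classification rule presumes that only the ITS-2 region is being examined.

```python
# Recognition sequences of the two enzymes named in the text.
RECOGNITION_SITES = {"AluI": "AGCT", "SalI": "GTCGAC"}


def find_sites(seq: str, site: str) -> list:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # step by one so overlapping hits are kept
    return positions


def classify_its2(seq: str) -> str:
    """Classify an ITS-2 sequence by presence/absence of AluI and SalI sites.

    Simplified rule based on the statement in the text:
    AluI present and SalI absent -> HK-9-like;
    SalI present and AluI absent -> HM-1:IMSS-like.
    """
    has_alu = bool(find_sites(seq, RECOGNITION_SITES["AluI"]))
    has_sal = bool(find_sites(seq, RECOGNITION_SITES["SalI"]))
    if has_alu and not has_sal:
        return "HK-9-like"
    if has_sal and not has_alu:
        return "HM-1:IMSS-like"
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical toy sequences, NOT real Entamoeba data.
    toy_sequences = {"toy_A": "TTAGCTTT", "toy_B": "AAGTCGACAA"}
    for name, seq in toy_sequences.items():
        print(name, classify_its2(seq))
```

In practice the digest would be read from fragment sizes on a gel; scanning the sequence, as here, only predicts which pattern to expect.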
Fig. 3. A dendrogram showing the phylogenetic relationship among the different strains and species of Entamoeba under study. The names are abbreviated as indicated in the legend to Fig. 2. Bootstrap values (percent) are given at each branch-point.
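The percentage figures quoted in the text for the 5.8S rRNA gene can be recomputed from Table 1 by dividing the number of nucleotide changes by the length of the corresponding HM-1:IMSS region. The following Python sketch does this; the dictionary layout and function names are illustrative choices, and this simple ratio is only a crude proxy for divergence, not the neighbor joining distance calculation behind Fig. 3.

```python
# (length in bp, nucleotide changes relative to HM-1:IMSS), transcribed from Table 1.
TABLE1 = {
    "HM-1:IMSS":  {"ITS-1": (123, 0),  "5.8S": (150, 0),  "ITS-2": (121, 0)},
    "Rahman":     {"ITS-1": (124, 1),  "5.8S": (150, 0),  "ITS-2": (121, 1)},
    "HK-9":       {"ITS-1": (124, 1),  "5.8S": (149, 1),  "ITS-2": (126, 11)},
    "CDC 0784":   {"ITS-1": (124, 6),  "5.8S": (149, 4),  "ITS-2": (113, 28)},
    "isolate 43": {"ITS-1": (123, 7),  "5.8S": (153, 9),  "ITS-2": (113, 29)},
    "Laredo":     {"ITS-1": (118, 42), "5.8S": (151, 28), "ITS-2": (113, 65)},
    "IP-1":       {"ITS-1": (106, 57), "5.8S": (130, 59), "ITS-2": (66, 75)},
}
REFERENCE = "HM-1:IMSS"


def percent_changed(strain: str, region: str) -> float:
    """Nucleotide changes as a percentage of the reference region length."""
    reference_length = TABLE1[REFERENCE][region][0]
    changes = TABLE1[strain][region][1]
    return 100.0 * changes / reference_length


def total_length(strain: str) -> int:
    """Sum of the ITS-1, 5.8S and ITS-2 lengths for one strain."""
    return sum(length for length, _changes in TABLE1[strain].values())


if __name__ == "__main__":
    for strain in TABLE1:
        row = ", ".join(
            f"{region} {percent_changed(strain, region):.1f}%"
            for region in ("ITS-1", "5.8S", "ITS-2")
        )
        print(f"{strain:<11} total {total_length(strain)} bp; {row}")
```

For the 5.8S region this reproduces the values given in the text (6%, 2.7%, 18.7% and 39.3% for isolate 43, CDC 0784, Laredo and IP-1, respectively), and the summed lengths match the "Total length" column of Table 1.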
Acknowledgements

I Som acknowledges the Dept of Atomic Energy for a research fellowship; A Azam acknowledges the C.S.I.R for a research associateship. This work was funded by a grant to S Bhattacharya from Council of Scientific and Industrial Research, India.

References

[1] Raue HA, Klootwijk J, Musters W. Evolutionary conservation of structure and function of high molecular weight ribosomal RNA. Prog Biophys Mol Biol 1988;51:77–129.
[2] Wagele JW, Rodding F. Origin and phylogeny of metazoans as reconstructed with rDNA sequences. Prog Mol Subcell Biol 1998;21:45–70.
[3] Stackebrandt E, Liesack W, Witt D. Ribosomal RNA and rDNA sequence analysis. Gene 1992;115:255–60.
[4] Porter CH, Collins FH. Species-diagnostic differences in a ribosomal DNA internal transcribed spacer from the sibling species Anopheles freeborni and Anopheles hermsi (Diptera: Culicidae). Am J Trop Med Hyg 1991;5:271–9.
[5] Cai J, Collins MD, McDonald V, Thompson DE. PCR cloning and nucleotide sequence determination of the 18S rRNA genes and internal transcribed spacer 1 of the protozoan parasites Cryptosporidium parvum and Cryptosporidium muris. Biochim Biophys Acta 1992;1131:317–20.
[6] Nielsen H, Engberg J. Sequence comparison of the rDNA introns from six different species of Tetrahymena. Nucl Acids Res 1985;13:7445–55.
[7] Chambers C, Dutta SK, Crouch RJ. Neurospora crassa ribosomal DNA: sequence of internal transcribed spacer and comparison of N. intermedia and N. sitophila. Gene 1986;44:159–64.
[8] DeJonckheere JF. Sequence variation in the ribosomal internal transcribed spacers, including the 5.8S DNA of Naegleria spp. Protist 1998;149:221–8.
[9] Navajas M, Lagnel J, Fauvel G, deMoraes G. Sequence variation of ribosomal internal transcribed spacers (ITS) in commercially important Phytoseiidae mites. Exp Appl Acarol 1999;11:851–9.
[10] VanHerwerden L, Blair D, Agatsuma T. Intra- and inter-specific variation in nuclear ribosomal internal transcribed spacer 1 of the Schistosoma japonicum species complex. Parasitology 1998;116:311–7.
[11] Jobst J, King K, Hemleben V. Molecular evolution of the internal transcribed spacers and phylogenetic relationships among species of the family Cucurbitaceae. Mol Phylogenet Evol 1998;9:204–19.
[12] Rogers JM, McConkey GA, Li J, McCutchan TF. The ribosomal DNA loci in Plasmodium falciparum accumulate mutations independently. J Mol Biol 1995;254:881–91.
[13] Kuo BA, Gonzalez IL, Gillespie DA, Sylvester JE. Human ribosomal RNA variants from a single individual and their expression in different tissues. Nucl Acids Res 1996;24:4817–24.
[14] Bhattacharya S, Bhattacharya A, Diamond LS, Soldo AT. Circular DNA of Entamoeba histolytica encodes ribosomal RNA. J Protozool 1989;36:455–8.
[15] Huber M, Koller B, Gitler C, et al. Entamoeba histolytica ribosomal RNA genes are carried on extrachromosomal palindromic circular molecules. Mol Biochem Parasitol 1989;32:285–96.
[16] Bagchi A, Bhattacharya A, Bhattacharya S. Lack of a chromosomal copy of the circular rDNA plasmid of Entamoeba histolytica. Int J Parasitol 1999;29:1775–83.
[17] Diamond LS, Harlow DR, Cunnick C. A new medium for axenic cultivation for Entamoeba histolytica. Trans R Soc Trop Hyg 1978;72:431–2.
[18] Sehgal D, Mittal V, Ramachandran S, Dhar SK, Bhattacharya A, Bhattacharya S. Nucleotide sequence organisation and analysis of the nuclear ribosomal DNA circle of the protozoan parasite Entamoeba histolytica. Mol Biochem Parasitol 1994;67:205–14.
[19] Henikoff S. Unidirectional digestion with exonuclease III in DNA sequence analysis. Methods Enzymol 1987;155:156–65.
[20] Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain terminating inhibitors. Proc Natl Acad Sci USA 1977;74:5463–7.
[21] Que X, Reed S. Nucleotide sequence of a small subunit ribosomal RNA (16S-like rRNA) gene from Entamoeba histolytica: differentiation of pathogenic from non-pathogenic isolates. Nucl Acids Res 1991;19:5438.
[22] Clark CG, Diamond LS. Intraspecific variation and phylogenetic relationships in the genus Entamoeba as revealed by riboprinting. J Eukaryot Microbiol 1997;44:142–8.
[23] Gonzalez IL, Gorski JL, Campen TJ, et al. Variation among human 28S ribosomal RNA genes. Proc Natl Acad Sci USA 1985;82:7666–70.
[24] Petes TD. Unequal meiotic recombination within tandem arrays of yeast ribosomal DNA genes. Cell 1980;19:765–72.
[25] van Keulen H, Horvat S, Erlandsen SL, Jarroll EL. Nucleotide sequence of the 5.8S and large sub-unit rRNA genes and the internal transcribed spacer and part of the external spacer from Giardia ardeae. Nucl Acids Res 1991;19:6050–2.