Cloning and bioinformatic identification of small RNAs in the filarial nematode, Brugia malayi


Molecular & Biochemical Parasitology 169 (2010) 87–94


Catherine B. Poole ∗, Paul J. Davis, Jingmin Jin, Larry A. McReynolds
New England Biolabs, 240 County Rd, Ipswich, MA 01938, USA

Article info

Article history: Received 30 July 2009; received in revised form 15 October 2009; accepted 16 October 2009; available online 27 October 2009.

Keywords: Brugia malayi; microRNA; Filarial nematode; Small RNAs

Abstract

Characterization of small RNAs from the filarial nematode Brugia malayi is the initial step in understanding their role in gene silencing. Both RNA cloning and bioinformatics were used to identify 32 microRNAs (miRNAs) belonging to 24 families. One family, miR-36, occurs only in helminths, including B. malayi. Several of the miRNAs are arranged in clusters and are coordinately expressed, as determined by northern blot analysis. In addition, small RNAs were identified from Pao/Bleo retrotransposons and their associated repeat sequences, indicating that B. malayi uses an RNAi mechanism to maintain genome integrity. Analysis of these data provides a first glimpse into how small RNA-mediated silencing pathways regulate the parasitic life cycle of B. malayi. © 2009 Elsevier B.V. All rights reserved.

1. Introduction

Recent discoveries have greatly expanded the role of small RNAs in the regulation of endogenous gene expression and in the protection of genomes from mobile transposable elements as well as from viruses. Small RNAs guide argonaute proteins to complementary sequences in gene transcripts, affecting multiple cellular functions through mRNA instability, translational suppression and chromatin modification [1]. Small RNAs have been intensely studied in Caenorhabditis elegans and currently consist of three major families: endogenous small interfering RNAs (endo-siRNAs) [2], piwi-interacting RNAs (piRNAs) [3] and microRNAs (miRNAs). C. elegans lin-4 was the first gene demonstrated to encode a small RNA and was shown to post-transcriptionally regulate lin-14 protein levels by binding to complementary sequences in the 3′ UTR of its mRNA [4,5]. Since the original description of lin-4 in 1993, miRNAs have been found in a wide range of eukaryotic organisms, from sponges to mammals [6,7].

The biogenesis of miRNAs is well understood [1]. In the nucleus, miRNAs are usually transcribed by RNA polymerase II as long primary (pri-) transcripts containing an imperfect stem loop or

Abbreviations: miRNA, microRNA; siRNA, small interfering RNA; mf, microfilariae; WGS, whole genome sequence; RISC, RNA-induced silencing complex; MSA, multiple sequence alignment. ∗ Corresponding author. Tel.: +1 978 380 7293; fax: +1 978 921 1350. E-mail address: [email protected] (C.B. Poole). 0166-6851/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.molbiopara.2009.10.004

“hairpin” structure of ∼70 nucleotides. The RNA hairpin is cleaved from the pri-miRNA transcript by the dsRNA-specific endonuclease Drosha. After being transported to the cytoplasm, the pre-miRNA stem loop structure is cleaved by the ribonuclease Dicer, generating a ∼22 nt duplex consisting of the mature miRNA and its complementary strand, designated miRNA*. The mature miRNA is loaded into an RNA-induced silencing complex (RISC) containing an argonaute protein. MicroRNAs guide RISC to sequences in the 3′ UTR of mRNAs that are complementary to nucleotides 1–6 or 2–7 of the miRNA, known as the “seed” sequence [8]. MicroRNA-RISC regulates mRNA stability and translational suppression through its interaction with members of the GW182 protein family. GW182 proteins serve as a molecular link tethering decapping and deadenylation complexes to miRNA-RISC, resulting in mRNA degradation [9] and translational suppression [10].

Little is known about small RNAs in filarial parasites. However, sequencing of the Brugia malayi genome enabled identification of many of the components required for miRNA processing, including argonaute proteins, the riboendonucleases Drosha and Dicer and a GW182 orthologue [11]. In this article, we describe our initial attempts to characterize small RNAs in B. malayi. Using bioinformatic and cloning approaches, 32 miRNAs belonging to 24 different families were identified. Genomic analysis identified six clusters, each containing two miRNAs located within 2 kb of one another. In addition, small RNAs were identified from Pao/Bleo retrotransposons and their associated repeat sequences. These findings will be discussed in relation to what is currently understood about small RNAs in C. elegans as well as in humans and mosquitoes, the host organisms of B. malayi.

2. Materials and methods

2.1. Bioinformatic identification of miRNAs

To identify putative miRNAs, the B. malayi genome was searched with miRNA hairpin sequences conserved in C. elegans and other protostomes [6]. Hairpin sequences were downloaded from miRBase (http://microrna.sanger.ac.uk/) [12,13] as well as from the supplemental data of [6] and [14]. Hairpin sequences for a particular miRNA from different species were aligned using the multiple sequence alignment (MSA) program Clustal W [15]. Consensus sequences generated from these MSAs were used to search the whole genome sequence (WGS) as well as the trace archive of B. malayi at NCBI (http://www.ncbi.nlm.nih.gov/) with BLASTN (default settings). Positive hits contained an absolutely conserved seed sequence, defined as an exact match to bases 1–6 or 2–7 of the putative mature miRNA [8]. To be considered a solid candidate, the sequence adjacent to and including the putative mature miRNA had to fold into a hairpin with a ΔG ≤ −25 kcal/mol [6,16]. Two windows were folded for each candidate, placing the putative mature miRNA on the 5′ arm (20 nt 5′ and 50 nt 3′ of the putative miRNA) or on the 3′ arm (50 nt 5′ and 20 nt 3′ of the putative miRNA) of the hairpin; each was folded with Mfold v. 3.2 using default settings [17].

2.2. RNA isolation

B. malayi males, females and microfilariae (mf, TRS Laboratories, Athens, GA) were homogenized in Trizol (Invitrogen) with a steel ball bearing [18]. Following phase separation with 1-bromo-3-chloropropane, the RNA-containing upper aqueous phase was removed to a fresh tube, then precipitated and washed as described by Invitrogen and resuspended in water.

2.3. Small RNA cloning

Small RNAs were cloned from B. malayi using a 5′ ligation-independent method that takes advantage of the ability of reverse transcriptases to add non-templated nucleotides to cDNAs and to template switch [19].
A mixture of male, female and mf total RNA (∼150 µg) was separated on a 15% polyacrylamide gel under denaturing conditions alongside miRNA markers (17, 21 and 25 nt, New England Biolabs). Following recovery from the gel, the 3′ ends of the 17–25 nt RNA fraction (33.3 pmol) were ligated to the 5′-adenylated miRNA cloning linker, rAppCTGTAGGCACCATCAAT-NH2 (200 pmol, New England Biolabs), in 50 mM Tris–HCl (pH 7.5 at 25 °C), 2 mM MgCl2, 1 mM DTT and 12% PEG 8000 with 200 units of truncated T4 RNA ligase 2 (New England Biolabs) for 2 h at RT [20,21]. Ligated small RNAs were reverse transcribed with Superscript II (Invitrogen) using the reverse transcription primer (5′ ATTGATGGTGCCTACAG 3′) as instructed by the manufacturer, in the presence of a second oligonucleotide (5′ GCGTATCGGGCACCACGTATGCTATCGATCGTGAGATGGG 3′) called the 5′ adapter. The terminal deoxynucleotidyl transferase activity found in some reverse transcriptases incorporates non-templated nucleotides at the 3′ ends of first strand cDNA [19,22]. By taking advantage of this characteristic, the 5′ adapter can be incorporated without a second ligation step. The three 3′-terminal guanosine nucleotides of the 5′ adapter (GGG) anneal to any non-templated deoxycytidines added by the reverse transcriptase, and the enzyme polymerizes through to the end of the 5′ adapter. The products of this reaction were amplified using the forward (5′ AGCUCGCGTATCGGGCACCACGTATGC 3′) and reverse (5′ AGCUCATTGATGGTGCCTACAG 3′) amplification primers, each of which contains a single uracil, in a reaction containing Pfu Turbo Cx Hotstart DNA Polymerase (Stratagene). Following an initial denaturation step at 95 °C for 2 min, the reaction was cycled 30

times at 95 °C for 30 s and 55 °C for 30 s. This was followed by a 5 min extension at 72 °C. The RT-PCR products were separated on a 10% polyacrylamide gel and a band of ∼90 bp (when compared to dsDNA markers) was excised. The DNA recovered from the gel was treated with Uracil-Specific Excision Reagent (USER, New England Biolabs) for 15 min at 37 °C in T4 DNA ligase buffer (50 mM Tris–HCl (pH 7.5), 10 mM MgCl2, 10 mM DTT and 1 mM ATP). USER excises uracil from the amplification products, enabling the nucleotides between the ends of the fragment and the excised uracil (the 5′-terminal AGC of each amplification primer) to dissociate, creating SacI overhangs [23]. This approach eliminates the possibility of losing inserts containing internal SacI restriction sites. SacI-cleaved LITMUS 28i (New England Biolabs) was added directly to the USER reaction for a final molar ratio of 1 vector for every 10 insert molecules to encourage concatamerization. T4 DNA ligase was added and the reaction was incubated at 16 °C overnight.

2.4. Nucleotide sequencing and analysis

After transformation into 10-beta competent cells (New England Biolabs), plasmid DNA was purified from ∼500 recombinant colonies using the R.E.A.L. Prep 96 Plasmid Kit (Qiagen) as instructed by the manufacturer. Nucleotide sequences were determined by capillary sequencing using the BigDye Terminator v. 3.1 cycle sequencing kit and a 3130 XL genetic analyzer (Applied Biosystems). Nucleotide sequence data were analyzed using the Lasergene software suite (DNAstar). Only inserts with 100% identity to the B. malayi WGS and a length of ≥17 and ≤24 nt were considered for analysis as miRNAs. The genomic sequence on either side of an insert was extracted for folding with Mfold as described in Section 2.1. Those predicted to fold into a stem loop with a ΔG ≤ −25 kcal/mol were considered putative miRNAs.
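The insert filter and the window extraction described in Sections 2.1 and 2.4 can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names, the toy genome with invented flanking sequence, and the use of plain substring search in place of BLASTN are all assumptions. For simplicity the windows are taken on the forward strand only, and the folding step itself (Mfold) is not reproduced.

```python
# Hypothetical helpers; names and the toy genome are illustrative, not from the paper.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_dna(seq: str) -> str:
    """Upper-case a sequence and write it in the DNA alphabet (U -> T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_candidate_insert(insert: str, genome: str,
                        min_len: int = 17, max_len: int = 24) -> bool:
    """Keep inserts of 17-24 nt with an exact match on either genome strand."""
    dna = to_dna(insert)
    if not min_len <= len(dna) <= max_len:
        return False
    return dna in genome or reverse_complement(dna) in genome


def folding_windows(genome: str, start: int, end: int,
                    short: int = 20, long: int = 50):
    """Return (five_arm_window, three_arm_window) around genome[start:end].

    Coordinates are 0-based and end-exclusive. The first window places the
    insert on the 5' arm (20 nt upstream, 50 nt downstream); the second
    places it on the 3' arm (50 nt upstream, 20 nt downstream).
    """
    five_arm = genome[max(0, start - short): end + long]
    three_arm = genome[max(0, start - long): end + short]
    return five_arm, three_arm


mir36a = "UCACCGGAGACAUUGUUCCGCA"  # mature bma miR-36a, copied from Table 1
toy_genome = "A" * 60 + to_dna(mir36a) + "C" * 60  # invented flanks
start = toy_genome.find(to_dna(mir36a))
five_arm, three_arm = folding_windows(toy_genome, start, start + len(mir36a))
print(is_candidate_insert(mir36a, toy_genome), len(five_arm), len(three_arm))
```

Each window would then be passed to a folding program, and only windows whose predicted stem loop has a ΔG at or below −25 kcal/mol would be retained.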
To ascertain the identity of these miRNAs, the inserts were blasted against miRBase (parameters: −W4 −S1 −C F). The stem loop sequences of inserts that were not identified by miRBase were searched against the NCBI nr database using BLASTN to determine whether they exhibited strong homology with other genes or might be unique B. malayi miRNAs. The sequences that did not fold into stem loops were also searched against the NCBI nr nucleotide database using BLASTN to determine their identity.

2.5. Northern blot analysis

Twenty µg of total B. malayi male, female or mf RNA was loaded into the lanes of a 15% polyacrylamide denaturing gel. Following electrophoresis, the RNA was electroblotted onto Genescreen Plus membrane (Perkin Elmer Life and Analytical Sciences) using a Semi-Dry Electroblotter (Continental Lab Products) in 1× TBE as described by the manufacturer, then crosslinked to the membrane (Spectrolinker XL-1000 UV crosslinker, Spectronics Corporation) at 120 mJ/cm^2. Membranes were prehybridized and then hybridized overnight at 42 °C in Ultrahyb-Oligo hybridization buffer (Ambion) with 32P-labeled oligonucleotide probes (1 × 10^6 cpm/ml hybridization solution). After hybridization, blots were washed twice at 42 °C and once at RT for 30 min each in 2× SSC, 0.1% SDS. Oligonucleotide probes were labeled with [α-32P]dATP (10 mCi/ml, 6000 Ci/mmol, Perkin Elmer/NEN radiochemicals) using StarFire (Integrated DNA Technologies) [24]. Following exposure to Hyperfilm MP (GE Healthcare) for 1–7 days, the blots were stripped, tested for residual radioactivity by autoradiography, and then rehybridized to a biotinylated B. malayi U6 snRNA probe and washed as described above to assay for equal lane loading. The U6 probe was derived from the sequence of B. malayi U6 snRNA (DS239431 bp 749799–749902). It was labeled with Biotin-

Fig. 1. Northern blot analysis of miRNA expression. Twenty µg of B. malayi mf, male or female total RNA was resolved on 15% TBE urea gels and electroblotted onto Genescreen Plus membrane. The blots were hybridized with 32P-labeled probes complementary to the miRNAs. NP = not probed; stars (*) indicate that only higher MW bands were detected. Following hybridization, blots were stripped and reprobed with a B. malayi U6 snRNA oligonucleotide to demonstrate equal lane loading.

14-dATP (Invitrogen) using StarFire. Hybridized biotinylated probe was detected using the Phototope-Star Detection Kit (New England Biolabs). The oligonucleotide probes used for StarFire labeling and hybridization are as follows:

U6 probe: 5′ TCCTTGCGCAGGGGCCATGCTAATCTTCTCTGTAT 3′
U6 template: 5′ TTTTTTTTTTATACAGAGA-NH2-3′
Let-7 probe: 5′ AACTATACAACCTACTACCTCA 3′
Let-7 template: 5′ TTTTTTTTTTTGAGGTAGT-NH2-3′
miR-36a probe: 5′ TGCGGAACAATGTCTCCGGTGACGGGCGG 3′
miR-124 probe: 5′ TGGCATTCACCGCGTGCCTTACGGGCGG 3′
miR-153 probe: 5′ ATCACTTTTGTGACTATGCAACGGGCGG 3′
miR-2c probe: 5′ GAGCCGAATCGTGCTGTGATGCGGGCGG 3′
miR-2b probe: 5′ TCGCTGCATCGGGTCTGTCGGGCGG 3′

The template for the miR-36a, -124, -153, -2b and -2c probes is 5′ TTTTTTTTTTCCGCCCG-NH2-3′. The 3′ end of each probe is complementary to the 3′ end of its template, which enables annealing for StarFire labeling.

3. Results

3.1. B. malayi miRNA identification

Using a combination of bioinformatic and cloning methods, 32 miRNAs have been identified in B. malayi (Table 1). MicroRNAs were identified bioinformatically using consensus sequences derived from MSAs. Only miRNA families conserved among protostomes or more widely conserved across the bilateria were chosen for mining the B. malayi genome [6]. Typically, many hits were obtained when consensus sequences were blasted against the B. malayi genome at NCBI. Only those hits that contained an absolutely conserved seed sequence in what would be the mature miRNA were considered for analysis with Mfold. Sequence for folding was chosen so that the putative mature miRNA was located on the 5′ or the 3′ arm of the hairpin. Hits that folded into hairpins with a ΔG ≤ −25 kcal/mol were considered solid miRNA candidates. The first miRNA found in B. malayi using this method was let-7. The sequence of mature B. malayi let-7 (bma let-7) is identical to the sequence of C. elegans let-7, and the stem loop identified has an excellent ΔG (−36.6 kcal/mol) (Table 1).
Northern analysis showed that bma let-7 is expressed in adult males and females but not mf (Fig. 1). This is the same expression pattern found for C. elegans let-7 [25]. Using this bioinformatic protocol, a total of 24 miRNAs were successfully identified in B. malayi (Table 1).
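The two acceptance criteria used in this protocol (a conserved seed and a hairpin ΔG of −25 kcal/mol or lower) can be expressed as a short check. The sketch below is illustrative only: the function names are hypothetical, the ΔG value is assumed to come from a separate folding run, and the reference sequence is simply the let-7 sequence from Table 1, which the text states is identical in C. elegans.

```python
# Hypothetical helper names; the cutoff and seed windows follow Section 2.1.
DELTA_G_CUTOFF = -25.0  # kcal/mol


def to_rna(seq: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def seed_conserved(candidate: str, reference: str) -> bool:
    """True if bases 1-6 or bases 2-7 (1-based) match exactly."""
    c, r = to_rna(candidate), to_rna(reference)
    return c[0:6] == r[0:6] or c[1:7] == r[1:7]


def is_solid_candidate(candidate: str, reference: str, delta_g: float) -> bool:
    """Apply both criteria: conserved seed and a sufficiently stable hairpin."""
    return seed_conserved(candidate, reference) and delta_g <= DELTA_G_CUTOFF


let7 = "UGAGGUAGUAGGUUGUAUAGUU"    # bma let-7 (Table 1)
mir36a = "UCACCGGAGACAUUGUUCCGCA"  # bma miR-36a (Table 1)

print(is_solid_candidate(let7, let7, -36.6))  # seed matches, hairpin stable
print(seed_conserved(mir36a, let7))           # unrelated seeds
```

Exact string comparison is used on purpose, since the protocol requires the seed to be absolutely conserved rather than merely similar.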


Candidate genes for mir-1 and mir-124, two miRNAs conserved across the bilateria, were not identified in the B. malayi WGS using the bioinformatic approach [6]. However, a search of the B. malayi trace archive yielded two likely mir-124 candidates, TI|1174796747 and TI|1174094970. The 3′ end of TI|1174094970 overlaps by ∼500 bp with the 5′ end of TI|1174796747. No significant matches were found when these trace sequences were blasted against the non-redundant nucleotide database at NCBI, indicating that they are of parasite origin; this check is needed because vertebrate sequences are known to contaminate the trace archive [11]. In addition, northern blot analysis confirmed the expression of bma miR-124 in female RNA (Fig. 1). Good candidates for mir-1 were not found in the B. malayi trace archive. It is possible that B. malayi lacks a gene for mir-1, but more likely the sequence encoding miR-1 is missing from the B. malayi genome assembly, as it is only ∼75% complete [11].

Although bioinformatic identification is useful for finding conserved miRNAs, identifying less conserved and unique miRNAs in filarial nematodes necessitated cloning small RNAs for sequencing. Small RNAs were cloned from B. malayi using a 5′ ligation-independent method that takes advantage of the ability of reverse transcriptases to add non-templated nucleotides to cDNAs and to template switch [19]. Of the 500 sequenced inserts, 170 were considered for analysis as miRNAs because they were identical to the B. malayi WGS and fell within a specific nucleotide size range (≥17 and ≤24 nt long; Table 2). Of these, 23 were identified as putative B. malayi miRNAs. No unique B. malayi miRNAs were identified in this population of clones. The remaining 147 inserts were derived from messenger, ribosomal, transfer and small nuclear RNA sources as well as from retrotransposon and repeat-related transcripts (Table 2). The 23 inserts identified 12 distinct miRNAs (Table 1).
Four of the cloned miRNAs (bantam, miR-2a, -71 and -87a) had previously been identified bioinformatically. The newly identified miRNAs are miR-2b-1, -2b-2, -2c, -36a, -100a, -100d, -153 and -279 (Table 1). The miR-2 orthologues were the most abundant miRNAs cloned from B. malayi. They accounted for 6 of the 23 clones (26%).

The 32 distinct miRNAs identified in B. malayi fall into 24 different families (Table 1). Twenty-two of these appear to be highly conserved and are found in a wide range of animals including helminths, arthropods and vertebrates. Only three families, miR-36, -87 and -153, exhibit some restricted species expression (Table 1). Two orthologues of mir-36 were identified on separate scaffolds in the B. malayi genome (miR-36a and b, Table 1). This family appears to be limited to helminths. Orthologues have been identified in Caenorhabditis spp. and the planarian Schmidtea mediterranea [26]. Clustal W alignment shows conservation of the seed sequence, UCACCGG, at the 5′ end of each miR-36 family member, whereas the remaining sequence of the mature miRNAs is quite variable (Fig. 2). B. malayi miRNAs 36a and b are identical except for a single G:A nucleotide change. In addition, the stem loops for mir-36a and b exhibit 89% (81/91) nucleotide identity (data not shown). Expression of mature bma miR-36a was detected in adult female RNA by northern blot analysis (Fig. 1). In addition, higher MW bands of ∼150 and 230 bases were detected in mf (data not shown).

Expression of the miR-87 family appears to be limited to protostomes including nematodes and arthropods [6,27]. Two orthologues of miR-87 have been identified on separate scaffolds in B. malayi (Table 1). The mature miRNAs are 90.5% identical with two U:C substitutions (Fig. 2). Alignment of mature miR-87 from different protostomes shows high sequence conservation throughout the mature miRNA and not just within the seed region (Fig. 2).

Two of the 23 miRNA clones from B. malayi were identified as miR-153. MicroRNA-153 has been identified in a variety of organisms including humans as well as the invertebrates Lottia gigantea (sea snail) and Capitella capitata (marine polychaete worm), but it appears to be absent from C. elegans and Drosophila melanogaster

Table 1
MicroRNAs from B. malayi.

miRNA     Sequence                  Acc. no. or TI(a)  Position(b)      Strand  ΔG (kcal/mol)  Genomic context(c)  Identification(d)  Conservation(e)
bantam    UGAGAUCAUUGUGAAAGCUAU     DS237873           148736–148826    −       −37.5          Intergenic          B,C                A,H,V
let-7     UGAGGUAGUAGGUUGUAUAGUU    DS238124           25271–25394      −       −36.6          Intergenic          B,N                A,H,V
lin-4     UCCCUGAGACCUCUGCUGCGA     DS239215           197979–198068    −       −38.8          Intergenic          B                  A,H,V
miR-2a    UAUCACAGCCAGCUUUGAUGU     DS237074           2477–2567        −       −31.4          Intergenic          B,C                A,H,V
miR-2b-1  AUCACAGACCCGAUGCAGCGA     DS237458           1770–1861        +       −34.2          Intergenic          C,N                A,H,V
miR-2b-2  AUCACAGACCCGAUGCAGCGA     DS239143           2496–2587        +       −30.8          Intergenic          C,N                A,H,V
miR-2c    CAUCACAGCACGAUUCGGCUC     DS237916           21520–21610      +       −32.6          Intergenic          C,N                A,H,V
miR-7     UGGAAGACUUGUGAUUUUGUUGU   DS239429           1553908–1553998  −       −30.6          Intergenic          B                  A,H,V
miR-9     CUUUGGUUAUCUAGCUGUAUGA    DS239429           2204563–2204647  −       −31.6          Intergenic          B                  A,H,V
miR-34    UGGCAGUGUGGUUAGCUGGUUGU   DS239411           5570685–5570795  −       −44.6          Intergenic          B                  A,H,V
miR-36a   UCACCGGAGACAUUGUUCCGCA    DS239143           2626–2716        +       −35.7          Intergenic          C,N                H
miR-36b   UCACCGGAGACAUUAUUCCGCA    DS237458           1894–1984        +       −34.5          Intergenic          B                  H
miR-50    UGAUAUGUCUGAUAUUCUUGGGUU  DS239794           1719–1809        −       −36.2          Intronic            B                  A,H,V
miR-57    UACCCUGUGGUACCGAGCUGUGU   DS239506           2045–2135        +       −32.7          Intergenic          B                  A,H,V
miR-71    UGAAAGACAUGGGUAGUGA       DS238944           40648–40738      +       −32.9          Intergenic          B,C                A,H,V
miR-72    AGGCAAGAUGUUGGCAUAGCUGA   DS239414           760056–760170    −       −50.5          Intergenic          B                  A,H,V
miR-79    AUAAAGCUAGGUUACCAAAGCU    DS239429           2204563–2204647  −       −31.6          Intergenic          B                  A,H,V
miR-87a   GUGAGCAAAGUUUCAGGUGUU     DS238776           19799–19889      +       −38.6          Intergenic          B,C                A,H
miR-87b   GUGAGCAAAGUCUCAGGUGU      DS239419           150148–150234    +       −32.5          Intergenic          B                  A,H
miR-92    UAUUGCACUCGUCCCGGCCU      DS239429           1144116–1144220  −       −31.8          Intergenic          B                  A,H,V
miR-100a  AACCCGUAGUUUCGAACAUGUG    DS238870           84696–84786      −       −30.0          Intronic            C                  A,H,V
miR-100b  AACCCGUAGAUCCGAACUUGUG    DS238124           25479–25569      −       −36.6          Intergenic          B                  A,H,V
miR-100c  AACCCGUAGAACUGAAAUCGUG    DS239383           110900–110975    −       −35.1          Intronic            B                  A,H,V
miR-100d  UACCCGUAGCUCCGAAUAUGUG    DS238870           85613–85703      −       −35.3          Exonic              C                  A,H,V
miR-124   UAAGGCACGCGGUGAAUGCCAA    TI|1174796747      271–366          −       −34.4          NA                  B,N                A,H,V
                                    TI|1174094970      911–1001
miR-133   UUGGUCCCCUUCAACCAGCUA     DS239429           2739013–2739142  +       −32.1          Intergenic          B                  A,H,V
miR-137   UUAUUGCUUGAGAAUACA        DS239416           1007521–1007621  +       −35.2          Intergenic          B                  A,H,V
miR-153   UUGCAUAGUCACAAAAGUGAUG    TI|1172814984      91–180           −       −33.2          NA                  C,N                A,H,V
                                    TI|1173557506      532–621
                                    DS260136           469–530
miR-228   AAUGGCACUAGAUGAAUUCACGG   DS239411           62361–62448      +       −29.7          Intergenic          B                  A,H,V
miR-236   UAAUACUGUCAGGUAAUGACGAU   DS257052           106–182          −       −33.6          Intergenic          B                  A,H,V
miR-279   UGACUAGAACCAUACUCAGC      DS237916           19512–19602      +       −36.5          Intergenic          C                  A,H,V
miR-281   UGUCAUGGAAUUGCUCUCUUUA    DS238467           1178–1268        +       −34.6          Intergenic          B                  A,H,V

(a) Genbank accession no. or trace identifier is listed. Traces are accessed at NCBI’s trace archive: www.ncbi.nlm.nih.gov/Traces/trace.cgi?
(b) The nucleotide position defining a hairpin for each miRNA is shown.
(c) Context of the miRNA hairpins in the genome. NA = not annotated.
(d) Methods of miRNA identification and validation: B, bioinformatic; C, cloned; N, northern blot.
(e) Organisms expressing orthologous miRNAs: A, arthropods; H, helminths; V, vertebrates.

[7,13]. The sequence of mature bma miR-153 is located from bases 488 to 509 on DS260136, a short 530 bp fragment of DNA (Table 1). Because the mature miRNA sits so close to the end of this fragment, there was not enough sequence to fold. To identify overlapping sequence, DS260136 was used to blast the B. malayi trace archive. Two overlapping sequences, TI|1172814984 and TI|1173557506, were identified that, when folded, generated a stem loop with a ΔG of −33.2 kcal/mol (Table 1). The sequence of bma miR-153 is identical to human, L. gigantea and C. capitata miR-153 except for a single C to G substitution at the 5′ end (Fig. 2). Despite cloning this miRNA twice, the mature miRNA was not detected by northern blot

Table 2
Classification of small RNAs cloned from B. malayi.

RNA species               Number of inserts  % total inserts
miRNAs                    23                 4.6
rRNA, tRNA, snRNA         81                 16.1
mRNA                      27                 5.4
Retrotransposon related   20                 4
Non-annotated(a)          19                 3.8
No exact genome hits(b)   43                 8.5
<17 or >24 bps long(c)    290                57.6
Total                     503                100

(a) Blast searches with these small RNAs hit non-annotated regions of the B. malayi genome. Some of them are repeats.
(b) These inserts are not an exact match to the B. malayi genome.
(c) Inserts less than 17 or greater than 24 nucleotides were not evaluated.

after a 5-day exposure to X-ray film (Fig. 1). However, higher MW bands of ∼145 and 210 bases were detected in microfilariae after an overnight exposure (data not shown). The sizes of these bands approximate the expected sizes of two (145 bases) to three (210 bases) adjacent miRNA hairpins, and the bands may represent miRNA processing intermediates. Microfilariae are likely the source of the miR-153 clones identified in this study, but the mature miRNA is probably expressed below the level that can be detected by northern blot.

3.2. MicroRNA clusters

Twelve of the 32 B. malayi miRNAs identified in this report are located in clusters consisting of two miRNAs each, separated by up to 2 kb (Table 3). Bma mir-9 and mir-79 comprise the smallest cluster, with the miRNAs located on opposite arms of the same hairpin (Table 3 and Fig. 3). Both mir-36a and -36b are located ∼40 bp downstream of mir-2b. Although mature miR-36a and miR-2b were detected only in adult female and not mf RNA, higher MW bands of ∼150 and 230 nt were detected in mf by northern blot analysis (Fig. 1). These bands may be miRNA processing intermediates consisting of two clustered hairpins and, in the case of the 230 nt band, possibly a third as yet unidentified miRNA. The similar expression patterns of bma miR-36a and miR-2b in mf and adult females suggest that these miRNAs are the product of a single polycistronic transcript (Fig. 1 and [1]). Three of the four miR-100 members identified in this study are found in clusters.

Fig. 2. Clustal W alignments of mature B. malayi miRNAs with orthologous family members. All family members may not be shown. Species abbreviations are as follows: bma, B. malayi; cel, C. elegans; sme, Schmidtea mediterranea; hsa, Homo sapiens; cin, Ciona intestinalis; csa, Ciona savignyi; aga, Anopheles gambiae; lgi, Lottia gigantea; cap, Capitella capitata.

Bma miR-100b is clustered ∼300 bp downstream of let-7, and miR-100d is located ∼1 kb downstream of miR-100a (Tables 1 and 3). Both miR-100a and -100d are located within the B. malayi GdRAD54 gene (genbank accession no. EDP35808) and all 3 are transcribed from the same strand. Bma miR-100a is located within an intron, whereas the stem loop encoding miR-100d overlaps with an exon of the GdRAD54 gene (Table 1). The largest cluster contains bma miR-279 and -2c separated by ∼2 kb (Table 3). Mature bma miR-2c was detected in mf as well as adult males and females by northern blot analysis (Fig. 1). Its expression pattern is quite different from that of its paralogue, bma miR-2b, suggesting that although these miRNAs have the same seed sequence, they likely repress different targets and their expression patterns reflect the expression patterns of their targets.
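The clustering criterion (two hairpins on the same scaffold within 2 kb) can be checked directly against the hairpin coordinates in Table 1. The sketch below is a hedged illustration: the function name is hypothetical, only a subset of Table 1 is included, and the gap is measured between the hairpin intervals, which is an assumption and need not match the distances drawn in Table 3.

```python
from itertools import combinations

# (miRNA, scaffold, hairpin start, hairpin end), copied from Table 1 (subset).
HAIRPINS = [
    ("let-7", "DS238124", 25271, 25394),
    ("miR-100b", "DS238124", 25479, 25569),
    ("miR-2b-1", "DS237458", 1770, 1861),
    ("miR-36b", "DS237458", 1894, 1984),
    ("miR-2b-2", "DS239143", 2496, 2587),
    ("miR-36a", "DS239143", 2626, 2716),
    ("miR-2c", "DS237916", 21520, 21610),
    ("miR-279", "DS237916", 19512, 19602),
    ("miR-100a", "DS238870", 84696, 84786),
    ("miR-100d", "DS238870", 85613, 85703),
    ("miR-9", "DS239429", 2204563, 2204647),
    ("miR-79", "DS239429", 2204563, 2204647),
    ("miR-7", "DS239429", 1553908, 1553998),
    ("miR-92", "DS239429", 1144116, 1144220),
    ("miR-34", "DS239411", 5570685, 5570795),
    ("miR-228", "DS239411", 62361, 62448),
]


def find_clusters(hairpins, max_gap=2000):
    """Return (name1, name2, gap) for same-scaffold hairpin pairs within max_gap.

    The gap is the distance between the two intervals; overlapping or
    identical intervals (two miRNAs on one hairpin) get a gap of 0.
    """
    clusters = []
    for (n1, s1, a1, b1), (n2, s2, a2, b2) in combinations(hairpins, 2):
        if s1 != s2:
            continue
        gap = max(0, max(a1, a2) - min(b1, b2))
        if gap <= max_gap:
            clusters.append((n1, n2, gap))
    return clusters


for name1, name2, gap in find_clusters(HAIRPINS):
    print(f"{name1} / {name2}: {gap} bp apart")
```

With this subset the function reports six pairs, the same six clusters named in the text, while miRNAs that merely share a large scaffold (for example miR-7 and miR-92 on DS239429) are not grouped.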

3.3. Repeat-associated siRNAs

In addition to miRNAs, a small collection of RNAs derived from repetitive elements was also identified in B. malayi. Analysis of the 500 inserts identified ∼40 repeat-associated siRNAs. These fall into two groups: retrotransposon-related and non-annotated repeats (Table 2). The majority of retrotransposon-related small RNAs are derived from genes annotated as Pao/Bleo long terminal repeat retrotransposons. Most of these retrotransposon-related small RNAs have a guanosine as the 5′ nucleotide and are derived from the antisense strand (Table 4, sequence #1, 4–15). Many of the sequences are derived from regions of the transposons annotated as exons in the assembled genome. In addition, a number of small RNAs derived from repeats were also identified in B.

Fig. 3. Structure of the B. malayi miR-9/79 hairpin as predicted by Mfold. On the 5′ arm of the hairpin, miR-9 is boxed with a solid line. On the 3′ arm of the hairpin, miR-79 is boxed with a dashed line.

Table 3
B. malayi miRNA clusters. [Cluster diagram not reproduced.]

(a) The relative distance and orientation of each miRNA in a cluster is shown. MicroRNAs on the forward strand are shown above and those on the reverse strand are depicted below the line.
(b) MicroRNA-9 and -79 are on opposite arms of the same hairpin.

malayi. For example, the sequence GATATTATTATATTTGCACACAA (Table 4, sequence #15) occurs throughout the B. malayi genome. It is annotated as repeat sequence downstream of retrotransposon-related proteins on at least two contigs, including DS238081 and DS239382, suggesting that it may be derived from the long terminal repeats of the retrotransposons. It also occurs multiple times on non-annotated contigs such as DS240662.

4. Discussion

As an initial step in understanding the role of small RNAs in parasitic nematodes, a combination of bioinformatic and cloning methods was used to identify miRNAs in B. malayi (Table 1). Through bioinformatic identification, it was determined that B. malayi expresses a group of conserved miRNAs found in a wide range of protostomes and bilateria. The small RNA cloning project was initiated to identify less conserved and possibly unique miRNAs in the parasite. Given that the genome sizes of B. malayi (90–95 Mb) and C. elegans (100 Mb) are roughly similar [11], these nematodes may encode similar numbers of miRNAs. The current version of miRBase (release 14, September 2009) contains 174 C. elegans miRNAs [12,13]. Based on this assumption, our identification of 32 miRNAs represents an estimated 20% of the miRNAs in B. malayi. At least 84

of the 174 C. elegans miRNAs sort into 20 different family groups based on sequence identity at the 5′ end [28]. Twenty-two of the 32 miRNAs identified in B. malayi sort into one of these 20 C. elegans family groups [28].

Of the 24 different miRNA families identified in B. malayi, 22 are conserved across evolution. For example, the miR-100 family is found from the sea anemone Nematostella vectensis to humans, suggesting that it fulfills an essential function [6]. Many organisms express multiple miR-100 paralogues. C. elegans expresses six (miR-51 through 56) and we have identified four B. malayi miR-100 paralogues in this study. Two were identified bioinformatically (miR-100b and c) and two were identified through cloning (miR-100a and d, Table 1). Clustal W alignment of the B. malayi miR-100s with orthologues from human, mosquito and C. elegans highlights the conserved seed sequence, ACCCGUA (Fig. 2). The alignment also shows that bma miR-100b is 100% identical with human and mosquito miR-100, but the highest identity with a C. elegans miR-100 orthologue is 73%, with miR-51.

Mir-100 orthologues are often found clustered with let-7. In humans, mosquito and D. melanogaster, mir-100/let-7 clusters range in size from ∼500 to 4000 bp [29]. This is also true in B. malayi. Bma mir-100b is clustered within ∼300 bp of bma let-7 (Table 3). In contrast, C. elegans let-7 is separated by 1.6 million bp on the X chromosome from a mir-100 cluster containing mir-54, -55 and -56 [29]. Clustering of miRNAs suggests that they are coordinately expressed. In D. melanogaster, expression of miR-100 and let-7 is upregulated in larvae in response to a pulse of ecdysone and a decrease in juvenile hormone. These miRNAs are involved in the heterochronic gene pathway that regulates the developmental transition from the larval to the pupal stage [29]. In C. elegans, let-7 also regulates developmental timing; it promotes the transition from the L4 to the adult stage [30]. It is unclear, however, whether miR-100 members function in a coordinated fashion with let-7 in C. elegans, although mutant strains lacking multiple mir-100 genes exhibited defects in growth rate and development [31].

Both bma mir-100a and -100d are clustered within the GdRAD54 gene. This is an orthologue of RAD-54 in C. elegans, a DNA damage repair protein involved in repairing double-stranded breaks [32]. Because of their location within the GdRAD54 gene, the expression of bma mir-100a and -100d is likely dependent on and coordinated with the expression of this gene [1]. The genomic location of bma miR-100a and -100d may be unique to B. malayi, as none of the mir-100 orthologues from C. elegans, humans or mosquito are located within their respective RAD54 genes. This suggests that B. malayi may have evolved novel routes for the regulation and functioning of some members of this conserved miRNA family.
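The statement that the four B. malayi miR-100 paralogues share the seed ACCCGUA while diverging elsewhere can be verified from the sequences in Table 1. The following is a small sketch under stated assumptions: the helper names are hypothetical, the seed is allowed to start at position 1 or 2 of the mature sequence, and mismatches are counted position by position without gaps, which is simpler than a Clustal W alignment.

```python
# Mature sequences copied from Table 1; helper names are illustrative.
BMA_MIR100 = {
    "miR-100a": "AACCCGUAGUUUCGAACAUGUG",
    "miR-100b": "AACCCGUAGAUCCGAACUUGUG",
    "miR-100c": "AACCCGUAGAACUGAAAUCGUG",
    "miR-100d": "UACCCGUAGCUCCGAAUAUGUG",
}
FAMILY_SEED = "ACCCGUA"  # conserved seed reported for the miR-100 family


def has_family_seed(seq: str, seed: str = FAMILY_SEED, max_offset: int = 1) -> bool:
    """True if the seed begins at position 1 or 2 (1-based) of the sequence."""
    return any(seq[i:i + len(seed)] == seed for i in range(max_offset + 1))


def mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length sequences (no gaps)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


for name, seq in BMA_MIR100.items():
    print(name, has_family_seed(seq))

print(mismatches(BMA_MIR100["miR-100a"], BMA_MIR100["miR-100b"]))
```

An ungapped comparison is adequate here because all four mature sequences in Table 1 are 22 nt long; comparing against orthologues of different lengths would need a proper alignment.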

Table 4 B. malayi repeat associated siRNAs.

| No. | Sequence | GenBank accession no. | Position^a (bp) | Strand | Description |
|---|---|---|---|---|---|
| 1 | GAGACUGAAAUCUGCUGGA | XM_001897268 | 248–266 | − | Integrase core domain containing protein |
| 2 | AGATTCATTGAGGCTT | XM_001902919 | 20–35 | − | Pao retrotransposase peptidase |
| 3 | AACCAGGTTTATCAACAAA | XM_001899301 | 100–118 | − | Pao retrotransposase peptidase |
| 4 | GTAACATTTTATATTCTCGTTGGA | XM_001895153 | 118–141 | − | Pao retrotransposase peptidase |
| 5 | GGAATCGCTTGTGTGTGGAATA | XM_001894642 | 408–428 | − | gag protein |
| 6 | GAAATATTCCTTGGTCACCTT | XM_001897584 | 284–304 | − | gag protein |
| 7 | GAGACTTCTGATATAGTGATT | XM_001902919 | 930–950 | − | Pao retrotransposase peptidase |
| 8 | GGAGCTATATCATATCCAGA | XM_001897584 | 442–461 | − | gag protein, putative |
| 9 | GGAATTGAGTGGCATTGTCAC | XM_001902251 | 1388–1408 | − | Integrase core domain containing protein |
| 10 | GAAACTGTCAACATTCACTCA | XM_001902240 | 533–553 | + | gag protein |
| 11 | GATCCTTGAATATTCGCGAGA | XM_001899142 | 3707–3727 | − | Pao retrotransposon peptidase |
| 12 | GATTCTGTATAAATTTTGGTA | XM_001897268 | 1352–1372 | − | Integrase core domain containing protein |
| 13 | GGAACTAAAAATCCTAGTGGGT | XM_001900963 | 203–224 | − | Pao retrotransposon peptidase |
| 14 | GAGTGAGTGAATGTTGACAGT | XM_001902240 | 536–556 | − | gag protein |
| 15 | GATATTATTATATTTGCACACAA | DS238081 | Multiple | − | This sequence occurs 15 times between bps 1747 and 2951 downstream of a gag protein |

^a Nucleotide position of the cloned sequence in the gene defined by the GenBank accession no.
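
The position ranges in Table 4 can be cross-checked against the cloned sequences with simple arithmetic: an inclusive span start–end covers end − start + 1 nucleotides. The Python sketch below does this for three rows copied from the table. It also tests whether entries 10 and 14, which map to opposite strands of XM_001902240 at overlapping coordinates, are reverse complements over the shared region. The helper names are illustrative, and reading the strand column as orientation relative to the GenBank entry is an assumption.

```python
def span_length(start, end):
    """Length in nt of an inclusive coordinate range start..end."""
    return end - start + 1


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (RNA 'U' is read as 'T')."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    dna = seq.upper().replace("U", "T")
    return "".join(pairs[base] for base in reversed(dna))


# Rows copied from Table 4: number -> (sequence, start, end, strand).
rows = {
    1: ("GAGACUGAAAUCUGCUGGA", 248, 266, "-"),
    10: ("GAAACTGTCAACATTCACTCA", 533, 553, "+"),
    14: ("GAGTGAGTGAATGTTGACAGT", 536, 556, "-"),
}

# Compare each sequence length with the length implied by its coordinates.
for number, (seq, start, end, strand) in rows.items():
    print(number, len(seq), span_length(start, end), strand)

# Entries 10 and 14 share coordinates 536-553 on opposite strands.
seq10, start10, end10, _ = rows[10]
seq14, start14, end14, _ = rows[14]
overlap = span_length(start14, end10)          # number of shared positions
plus_part = seq10[start14 - start10:]          # 536..553 read from entry 10
minus_part = reverse_complement(seq14)[:overlap]
print(overlap, plus_part == minus_part)
```

Under these assumptions the two entries are complementary over their 18 shared positions, which is what one would expect if both small RNAs derive from the same region of the gag transcript in opposite orientations.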

Only three of the families identified in B. malayi, miR-36, -87 and -153, exhibit some restricted species expression (Table 1). The miR-36 family appears to be specific for helminths. Members of this family have been previously identified in Caenorhabditis spp. and the flatworm, S. mediterranea. We have identified two B. malayi orthologues clustered with mir-2 family members on separate scaffolds (Tables 1 and 3). Northern blot analysis indicates that bma miR-36 and miR-2 are coordinately expressed in adult females (Fig. 1). In C. elegans, the mir-36 family consists of eight members arranged in two clusters on chromosome II. In one of these clusters, a family member (mir-42) is located within 100 bp of a mir-2 orthologue. C. elegans miR-36 family members are only expressed in embryos and in young adults carrying embryos [33]. Deletion of the seven member mir-36 cluster on chromosome II caused embryonic and larval lethality [31]. The mir-2 orthologue found clustered with mir-36 in B. malayi and C. elegans may also regulate embryogenesis. In D. melanogaster, miR-2 family members have been found to post-transcriptionally suppress apoptosis during embryogenesis by binding to the 3′ UTRs of the mRNAs encoding the proapoptotic proteins hid, grim, reaper and sickle [34]. The apparent specificity of the miR-36 family for helminths and its putative role in embryogenesis suggest that this miRNA family, as well as any helminth specific targets in B. malayi, would be novel targets for a filaricide.

MicroRNA-87 appears to be limited to protostomes. It is found in nematodes, the platyhelminth S. mediterranea [26] and a variety of arthropods. Two orthologues have been identified in B. malayi that differ by a single nucleotide (Table 1 and Fig. 2). Little is known about this microRNA family apart from its expression pattern in C. elegans and Drosophila. It is expressed at high levels throughout the C. elegans life cycle [35], and in D. melanogaster miR-87 is expressed in embryos, pupae and adults [29]. It is particularly abundant in the heads of adult flies, suggesting that it may function in the nervous system [36].

Unlike the miR-87 and -36 families, miR-153 is found throughout the Bilateria, including the free living polychaete worm, C. capitata, but so far it has not been found in C. elegans or D. melanogaster [7]. The sequence of this miRNA is highly conserved between invertebrates, including B. malayi, and humans, suggesting that some of its targets are also conserved and essential to the Bilateria (Fig. 2). Microfilariae are likely the source of the miR-153 clones identified in B. malayi as determined by northern blot, but additional work is required to confirm this result (Fig. 1). In addition, it would be interesting to determine whether C. elegans has lost miR-153 or whether, as the northern results with B. malayi indicate, it is only expressed at very low levels. In mice and humans, miR-153 is expressed predominantly in embryonic tissue but it is also found in endocrine and nerve tissue [37]. Identifying the targets of miR-153 in B. malayi, and determining whether these targets are conserved in C. elegans and how they are regulated by miRNAs, may provide some insight into the parasitic evolution of nematodes.

A collection of the cloned small RNAs is derived from the Pao/Bleo family of retrotransposons in B. malayi (Tables 2 and 4, [38,39]). Transposons have been described as nucleic acid parasites. Because of their ability to transpose to new locations and propagate within a genome, they have the potential to cause catastrophic damage by disrupting essential genes. Many organisms have been shown to employ endogenous RNAi to regulate transposons [40]. For example, C. elegans RNAi deficient mutants mut-7 [41] and rde-2 [42] exhibit increased levels of transposition. In addition, Piwi and piRNAs have been found to act upstream of an siRNA pathway to suppress Tc3 transposon mobility in the C. elegans germline [43]. The cloning of the small antisense RNAs from the Pao/Bleo retrotransposon family suggests that B. malayi also uses endogenous RNAi to control transposon mobility.

A deep sequencing project for small RNAs has been initiated to identify the remaining miRNAs in B. malayi. An understanding of miRNAs and their targets will help elucidate basic regulatory pathways in the parasite. MicroRNAs unique to B. malayi are particularly interesting as they may help identify previously uncharacterized filarial specific pathways that could be targeted for chemotherapeutic or vaccine development. In the genome, 20% of the predicted proteins were found to be B. malayi specific [11] and a portion of these might be regulated by B. malayi specific miRNAs. In addition, deep sequencing will enable the identification and characterization of other small RNA families in B. malayi, including piRNAs [3] and endo-siRNAs [2]. Comparison with what is known in C. elegans may inform our understanding of RNA regulation in parasitic nematodes.

An understanding of the small RNA families may lead to the development of RNAi based therapeutics against filarial parasites. The appeal of this approach is that small RNAs could be designed to target parasite specific mRNAs. The difficulty lies in delivering small RNAs to the parasite, but recent advances in delivering chemically modified nucleic acids to cells make this approach increasingly viable [44]. One option is using antagomirs, modified antisense RNAs that effectively compete with potential mRNA targets for microRNAs [45]. Recently, researchers have been successful in treating hepatitis C infections by targeting human miR-122, a liver specific miRNA that facilitates hepatitis C replication, using miR-122 antagomirs [44]. By targeting parasite specific miRNAs, the host’s miRNA regulatory pathways would not be disrupted. Characterization of the small RNA families in B. malayi will aid our understanding of filarial molecular biology and hopefully will highlight options that researchers may manipulate for the development of effective therapies against an insidious disease.

Acknowledgments

We would like to thank Donald G. Comb for his generous support of parasitology at New England Biolabs. We thank Sanjay Kumar for helpful discussions about mining genomic databases for microRNAs and Tilde Carlow, Bill Jack, Lise Raleigh and Brett Robb for their review of this manuscript and comments.

References

[1] Kim VN, Han J, Siomi MC. Biogenesis of small RNAs in animals. Nat Rev Mol Cell Biol 2009;10:126–39.
[2] Lee RC, Hammell CM, Ambros V. Interacting endogenous and exogenous RNAi pathways in Caenorhabditis elegans. RNA 2006;12:589–97.
[3] Batista PJ, Ruby JG, Claycomb JM, et al. PRG-1 and 21U-RNAs interact to form the piRNA complex required for fertility in C. elegans. Mol Cell 2008;31:67–78.
[4] Lee RC, Feinbaum RL, Ambros V. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 1993;75:843–54.
[5] Wightman B, Ha I, Ruvkun G. Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 1993;75:855–62.
[6] Prochnik SE, Rokhsar DS, Aboobaker AA. Evidence for a microRNA expansion in the bilaterian ancestor. Dev Genes Evol 2007;217:73–7.
[7] Wheeler BM, Heimberg AM, Moy VN, et al. The deep evolution of metazoan microRNAs. Evol Dev 2009;11:50–68.
[8] Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, Burge CB. Prediction of mammalian microRNA targets. Cell 2003;115:787–98.
[9] Behm-Ansmant I, Rehwinkel J, Doerks T, Stark A, Bork P, Izaurralde E. mRNA degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2 decapping complexes. Genes Dev 2006;20:1885–98.
[10] Eulalio A, Huntzinger E, Izaurralde E. GW182 interaction with Argonaute is essential for miRNA-mediated translational repression and mRNA decay. Nat Struct Mol Biol 2008;15:346–53.
[11] Ghedin E, Wang S, Spiro D, et al. Draft genome of the filarial nematode parasite Brugia malayi. Science 2007;317:1756–60.
[12] Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res 2006;34:D140–4.
[13] Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA genomics. Nucleic Acids Res 2008;36:D154–8.
[14] Hertel J, Lindemeyer M, Missal K, et al. The expansion of the metazoan microRNA repertoire. BMC Genomics 2006;7:25.
[15] Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22:4673–80.
[16] Ambros V, Bartel B, Bartel DP, et al. A uniform system for microRNA annotation. RNA 2003;9:277–9.
[17] Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 2003;31:3406–15.
[18] Laney SJ, Buttaro CJ, Visconti S, et al. A reverse transcriptase-PCR assay for detecting filarial infective larvae in mosquitoes. PLoS Negl Trop Dis 2008;2:e251.
[19] Takada S, Berezikov E, Yamashita Y, et al. Mouse microRNA profiles determined with a new and sensitive cloning method. Nucleic Acids Res 2006;34:e115.
[20] Lau NC, Lim LP, Weinstein EG, Bartel DP. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 2001;294:858–62.
[21] Ho CK, Wang LK, Lima CD, Shuman S. Structure and mechanism of RNA ligase. Structure 2004;12:327–39.
[22] Zhu YY, Machleder EM, Chenchik A, Li R, Siebert PD. Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. Biotechniques 2001;30:892–7.
[23] Bitinaite J, Nichols NM. DNA cloning and engineering by uracil excision. Curr Protoc Mol Biol 2009 [Chapter 3: Unit 3.21].
[24] Behlke MA, Dames SA, McDonald WH, Gould KL, Devor EJ, Walder JA. Use of high specific activity StarFire oligonucleotide probes to visualize low-abundance pre-mRNA splicing intermediates in S. pombe. Biotechniques 2000;29:892–7.
[25] Bracht J, Hunter S, Eachus R, Weeks P, Pasquinelli AE. Trans-splicing and polyadenylation of let-7 microRNA primary transcripts. RNA 2004;10:1586–94.
[26] Palakodeti D, Smielewska M, Graveley BR. MicroRNAs from the Planarian Schmidtea mediterranea: a model system for stem cell biology. RNA 2006;12:1640–9.
[27] Ruby JG, Jan C, Player C, et al. Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell 2006;127:1193–207.
[28] Ibanez-Ventoso C, Vora M, Driscoll M. Sequence relationships among C. elegans, D. melanogaster and human microRNAs highlight the extensive conservation of microRNAs in biology. PLoS One 2008;3:e2818.
[29] Sempere LF, Sokol NS, Dubrovsky EB, Berger EM, Ambros V. Temporal regulation of microRNA expression in Drosophila melanogaster mediated by hormonal signals and broad-Complex gene activity. Dev Biol 2003;259:9–18.
[30] Reinhart BJ, Slack FJ, Basson M, et al. The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 2000;403:901–6.
[31] Miska EA, Alvarez-Saavedra E, Abbott AL, et al. Most Caenorhabditis elegans microRNAs are individually not essential for development or viability. PLoS Genet 2007;3:e215.
[32] Boulton SJ, Gartner A, Reboul J, et al. Combined functional genomic maps of the C. elegans DNA damage response. Science 2002;295:127–31.
[33] Alvarez-Saavedra EA, Miska EA, Abbott AL, et al. The miR-35 family of microRNAs acts redundantly in embryonic development in C. elegans. In: 15th Biennial International C. elegans Conference. 2005.
[34] Leaman D, Chen PY, Fak J, et al. Antisense-mediated depletion reveals essential and specific functions of microRNAs in Drosophila development. Cell 2005;121:1097–108.
[35] Lim LP, Lau NC, Weinstein EG, et al. The microRNAs of Caenorhabditis elegans. Genes Dev 2003;17:991–1008.
[36] Ruby JG, Stark A, Johnston WK, Kellis M, Bartel DP, Lai EC. Evolution, biogenesis, expression, and target predictions of a substantially expanded set of Drosophila microRNAs. Genome Res 2007;17:1850–64.
[37] Landgraf P, Rusu M, Sheridan R, et al. A mammalian microRNA expression atlas based on small RNA library sequencing. Cell 2007;129:1401–14.
[38] Xiong Y, Burke WD, Eickbush TH. Pao, a highly divergent retrotransposable element from Bombyx mori containing long terminal repeats with tandem copies of the putative R region. Nucleic Acids Res 1993;21:2117–23.
[39] Whitton C, Daub J, Quail M, et al. A genome sequence survey of the filarial nematode Brugia malayi: repeats, gene discovery, and comparative genomics. Mol Biochem Parasitol 2004;137:215–27.
[40] Girard A, Hannon GJ. Conserved themes in small-RNA-mediated transposon control. Trends Cell Biol 2008;18:136–48.
[41] Ketting RF, Haverkamp TH, van Luenen HG, Plasterk RH. Mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog of Werner syndrome helicase and RNaseD. Cell 1999;99:133–41.
[42] Tabara H, Sarkissian M, Kelly WG, et al. The rde-1 gene, RNA interference, and transposon silencing in C. elegans. Cell 1999;99:123–32.
[43] Das PP, Bagijn MP, Goldstein LD, et al. Piwi and piRNAs act upstream of an endogenous siRNA pathway to suppress Tc3 transposon mobility in the Caenorhabditis elegans germline. Mol Cell 2008;31:79–90.
[44] Castanotto D, Rossi JJ. The promises and pitfalls of RNA-interference-based therapeutics. Nature 2009;457:426–33.
[45] Krutzfeldt J, Rajewsky N, Braich R, et al. Silencing of microRNAs in vivo with ‘antagomirs’. Nature 2005;438:685–9.