Gene 330 (2004) 159 – 168 www.elsevier.com/locate/gene
Characterization of a major outer membrane protein multigene family in Ehrlichia ruminantium
Henriette van Heerden a, Nicola E. Collins b, Kelly A. Brayton c, Celia Rademeyer d, Basil A. Allsopp b,*
a Department of Chemistry and Biochemistry, Rand Afrikaans University, P.O. Box 524, Auckland Park 2006, South Africa
b Department of Veterinary Tropical Diseases, Faculty of Veterinary Science, University of Pretoria, Private Bag X04, Onderstepoort 0110, South Africa
c Department of Veterinary Microbiology and Pathology, Washington State University, Pullman, WA 99164-7040, USA
d Institute of Infectious Diseases and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Anzio Road, Observatory, 7925 Cape Town, South Africa
Received 30 July 2003; received in revised form 26 November 2003; accepted 26 January 2004 Received by A.M. Campbell
Abstract
Ehrlichia ruminantium is a tick-transmitted rickettsial pathogen, which causes heartwater or cowdriosis in wild and domestic ruminants. A dominant antibody response of animals infected with E. ruminantium is directed against the outer membrane protein MAP1 (major antigenic protein 1). Part of the locus containing map1 has been characterized and consists of four map1 paralogs, designated map1-2, map1-1, map1 and map1 + 1, indicating that map1 is encoded by a multigene family. The purpose of this study was to determine the total number of map1 paralogs and their transcriptional activities. Using genome walking and data from an ongoing E. ruminantium genome sequencing project at the Onderstepoort Veterinary Institute, we found 16 paralogs of the map1 gene tandemly arranged in a 25-kb region of the E. ruminantium genome. The map1 multigene family is downstream of a hypothetical transcriptional regulator gene and upstream of the secA gene. Thirteen paralogs at the 5′ end of the 25-kb locus were connected by short intergenic spaces (ranging from 0 to 42 bp) and the remaining three paralogs at the 3′ end were connected by longer intergenic spaces (ranging from 375 to 1612 bp). All 16 map1 paralogs were transcriptionally active in E. ruminantium grown in endothelial cells and paralogs with short intergenic spaces were co-transcribed with their adjacent genes.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Heartwater; Ehrlichiosis; Major antigenic protein 1 (map1); transcriptional activity
Abbreviations: AP1, AP2, adapter-specific primers; BA, bovine aorta endothelial cell line; BLAST, basic local alignment search tool; bp, base pairs of DNA; Da, daltons; dNTP, 2′-deoxyribonucleoside 5′-triphosphate; E value, expected chance of a random hit during a database (BLAST) search; kb, kilo base pairs of DNA; LFT, lamb foetal testis endothelial cell line; map, MAP, major antigenic protein, outer membrane protein family of Ehrlichia ruminantium; omp, OMP, outer membrane protein; omp-1, OMP-1, p28, P28, outer membrane protein family of Ehrlichia chaffeensis; ORF, open reading frame; p30, P30, outer membrane protein family of Ehrlichia canis; PCR, polymerase chain reaction; RT-PCR, reverse transcriptase-PCR; LA-PCR, "long and accurate" PCR; SBE, sheep brain endothelial cell line; SDS, sodium dodecyl sulfate; SSC, standard saline citrate; Taq, Thermus aquaticus; TR, transcriptional regulator; U, unit of enzyme activity; un, unknown ORF.
* Corresponding author. Tel.: +27-12-529-8426; fax: +27-12-529-8312. E-mail address: [email protected] (B.A. Allsopp).
0378-1119/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2004.01.020

1. Introduction
Heartwater, or cowdriosis, is a rickettsial disease causing major economic losses in wild and domestic ruminants. The disease is endemic in sub-Saharan Africa and is also present in the Caribbean (Uilenberg, 1983) where it poses a threat to livestock on the American continents. The causative agent is an obligate intracellular gram-negative bacterium, Ehrlichia ruminantium (formerly Cowdria ruminantium; Dumler et al., 2001), which is transmitted by ticks in the genus Amblyomma. Phylogenetic studies based on 16S ribosomal DNA and outer membrane protein sequence comparisons have revealed a close relationship between E. ruminantium, E. chaffeensis and E. canis (Dame et al., 1992; van Vliet et al., 1992; Reddy et al., 1998; McBride et al., 1999). In
infected animals and humans, the dominant serological immune responses are directed against the outer membrane proteins of Ehrlichia spp. These immunodominant outer membrane (omp) protein genes have been designated the major antigenic protein 1 (map1) of E. ruminantium (van Vliet et al., 1994; Sulsona et al., 1999), outer membrane protein p28 (omp-1) of E. chaffeensis, and the p30 multigene family of E. canis (Ohashi et al., 1998a, b; Reddy et al., 1998; Yu et al., 1999, 2000). The OMP-1 and P30 protein families are each encoded by a single polymorphic multigene family consisting of 22 genes tandemly arranged in a cluster (Ohashi et al., 2001). The multigene clusters of E. canis and E. chaffeensis are upstream from the secA gene and downstream of a hypothetical transcriptional regulator gene. The 5′ end of the omp cluster in E. canis and E. chaffeensis each consists of paralogs linked by short intergenic spaces, while the paralogs at the 3′ end are connected by longer intergenic spaces (Ohashi et al., 2001).

Previously only four map1 paralogs of the E. ruminantium multigene family had been identified. Three map1 paralogs, map1-1 (orf2), a partial sequence of map1-2 (orf3), and map1 itself, were characterized by Sulsona et al. (1999). In a preliminary study, we obtained further sequence data from the map1-2 gene, and characterized another map1-like gene downstream of map1, designated map1 + 1 (van Heerden et al., 2002). Additionally, the large open reading frame (ORF) at the 3′ end of the locus, downstream of map1 + 1, was identified as a secA gene (van Heerden et al., 2002).

Various studies investigating the transcriptional activity of the omp genes in E. canis and E. chaffeensis have produced contradictory results which may be due to differences in PCR primer specificities (Reddy et al., 1998; McBride et al., 2000; Yu et al., 2000; Ohashi et al., 2001; Unver et al., 2001). Ohashi et al. (2001) and Unver et al. (2001) reported that all p30 paralogs (22 genes) in E. canis were transcriptionally active in monocyte culture when analyzed by reverse transcriptase-PCR (RT-PCR) using gene-specific primers. The p30 paralogs with short intergenic spaces were co-transcriptionally active in monocyte culture (Ohashi et al., 2001). The transcriptional activities of map1-2, map1-1 and map1 of E. ruminantium, grown in bovine endothelial cells, tick cell lines and in A. variegatum ticks, were investigated by Bekker et al. (2002). The map1 gene was always transcribed, whereas transcription of map1-2 was not detected under any of the conditions used. The map1-1 gene was reported to be transcribed in A. variegatum ticks and tick cell lines, but not in bovine endothelial cells.

In this study we used genome walking and genome sequence data from an ongoing E. ruminantium sequencing project at the Onderstepoort Veterinary Institute (http://www.arc-ovi.agric.za/main/divisions/genome.htm) to identify additional map1 paralogs upstream from map1-2. We also investigated the transcriptional activities of all the map1 paralogs identified in the map1 multigene family using RT-PCR of E. ruminantium grown in endothelial cell lines at both 30 and 37 °C. In addition, we examined the cistronic nature of adjacent map1 paralogs.
2. Materials and methods
2.1. Organisms, cultures and DNA isolation
The Welgevonden stock of E. ruminantium (du Plessis, 1985) was grown in a bovine aorta endothelial (BA) cell line, a lamb fetal testis endothelial (LFT) cell line and a sheep brain endothelial (SBE) cell line (Zweygarth et al., 1998). Separate infected cultures of each cell line were incubated and passaged at 30 and 37 °C. Organisms of the Welgevonden stock grown in BA cells at 37 °C were purified by discontinuous Percoll density gradient centrifugation (Mahan et al., 1995) and genomic DNA was isolated using phenol/chloroform extraction (Sambrook et al., 1989).
2.2. Genomic library construction and screening
A large insert E. ruminantium genomic library was constructed in the LambdaGEM®-11 (Promega, Madison, USA) vector as follows: E. ruminantium organisms (Welgevonden stock) were purified by Percoll™ density gradient centrifugation and the DNA isolated from these cells was assayed to ensure a bovine DNA content of less than 5% (de Villiers et al., 1998). DNA (113 µg) was partially digested with 0.125 units of Sau3AI, and 1/3 of the sample was removed at 3, 5 or 8 min, after which the samples were pooled and loaded on a 1.25–5.0 M NaCl gradient and centrifuged at 39,000 rpm for 3.5 h at 18 °C in a Beckman SW41 rotor (DiLella and Woo, 1987). Fractions (1 ml) were collected, the DNA was ethanol-precipitated, and the yield from fraction 5 was ligated into prepared LambdaGEM®-11 arms and packaged as described by the manufacturer (Promega). When transduced into KW251 cells, the library had an initial titer of 2.4 × 10^5 pfu/ml, which equates to a total of approximately 5.1 × 10^4 pfu in the library. Assuming an average insert size of 17 kb, this represents ~500 genome equivalents. A map1 probe (Brayton et al., 1997) was used to screen the library to select clones containing the map1 gene. Plaques were transferred in duplicate onto nylon membranes (Hybond N+, Amersham International).
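The genome-equivalent estimate for the library can be reproduced with a short calculation. The sketch below uses the total pfu and average insert size given in the text; the genome size is not stated in this section, so the value of 1.5 Mb used here is an assumption made purely for illustration, and the function names are hypothetical. With that assumed genome size the estimate comes out in the same range as the roughly 500 genome equivalents quoted above.

```python
# Back-of-envelope estimate of library coverage in genome equivalents.
TOTAL_PFU = 5.1e4          # total pfu in the library (from the text)
AVG_INSERT_BP = 17_000     # assumed average insert size (from the text)
ASSUMED_GENOME_BP = 1.5e6  # ASSUMPTION: genome size is not given in this section


def genome_equivalents(total_pfu, insert_bp, genome_bp):
    """Total cloned DNA divided by the size of one genome."""
    return total_pfu * insert_bp / genome_bp


def implied_genome_size(total_pfu, insert_bp, equivalents):
    """Genome size that would give exactly the stated number of equivalents."""
    return total_pfu * insert_bp / equivalents


if __name__ == "__main__":
    cov = genome_equivalents(TOTAL_PFU, AVG_INSERT_BP, ASSUMED_GENOME_BP)
    print(f"Genome equivalents at the assumed genome size: {cov:.0f}")
    size = implied_genome_size(TOTAL_PFU, AVG_INSERT_BP, 500)
    print(f"Genome size implied by 500 equivalents: {size / 1e6:.2f} Mb")
```

The second function simply inverts the relationship, which shows how sensitive the quoted figure is to the genome size that is assumed.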
The membranes were hybridized overnight at 65 °C with the map1 probe, which was labelled with [α-32P]dATP using the Megaprime kit (Amersham) as described by Brayton et al. (1997). Excess probe was removed by two washes at room temperature for 20 min in 2× SSC, 0.1% SDS, followed by a wash at 65 °C for 30 min in 2× SSC, 0.1% SDS, and a final wash at 65 °C for 30 min in 0.2× SSC, 0.1% SDS. Results were visualized by autoradiography. Two clones from the LambdaGEM-11 library hybridized with the map1
probe. One clone of 12 kb was selected for further characterization as described in van Heerden et al. (2002).
2.3. PCR amplification, cloning and sequencing of map1 multigene family
The LambdaGEM-11 clone contained map1-2 (partial), map1-1, map1, map1 + 1 and secA (van Heerden et al., 2002; Fig. 1). Sequences upstream of map1-2 were obtained by genome walking as described by Siebert et al. (1995). Briefly, E. ruminantium (Welgevonden) genomic DNA was digested with PvuI, EcoRV, HaeIII, DraI and StuI and ligated with adaptors (Siebert et al., 1995). The ligation mixture of adaptor and E. ruminantium genomic fragments was used as the template in a primary PCR amplification using a map1-specific primer and an adaptor-specific primer (AP1). These primary PCR reactions contained 0.2 µM of each primer, 0.25 mM of each dNTP, 1.25 mM MgCl2, 1 U Taq polymerase (Roche) and 20 ng digested Welgevonden DNA in a volume of 50 µl. The reaction conditions were: 6 min initial denaturation at 94 °C; 35 cycles of 30 s at 94 °C, 5 min at 65 °C, and 30 s at 72 °C; final extension at 72 °C for 10 min; final hold at 4 °C using a GeneAmp PCR System 9700 (Perkin Elmer Applied Biosystems). A secondary PCR was performed using 1 µl of the primary PCR product (diluted 1/50) with the map1-specific primer and a second adaptor-specific primer (AP2) using the PCR conditions as described above. Secondary PCR products were inserted into pGEM-T Easy Vector (Promega) and recombinant plasmids were introduced into E. coli XL1 Blue. Sequencing was performed using universal primers on an ABI Prism 377 DNA sequencer (Perkin Elmer Applied Biosystems).

Contigs from the E. ruminantium (Welgevonden) genome sequencing project at the Onderstepoort Veterinary Institute (http://www.arc-ovi.agric.za/main/divisions/genome.htm) were searched, using BLAST (Altschul et al., 1990), for homology to the hypothetical transcriptional regulator genes at the 5′ ends of the omp loci of E. canis and E. chaffeensis. A forward primer (9092F) was designed in a contig showing such homology, and this primer was used, together with a reverse primer (7537R), designed at the 5′ end of the available map1 multigene family sequence, in a PCR reaction to amplify the remainder of the map1 locus (Fig. 1). A "Long and Accurate" PCR kit (LA-PCR, Takara Biomedicals, Ohtsu, Japan) was used, and the PCR conditions were: 2 min initial denaturation at 94 °C; 10 cycles of 10 s at 94 °C, 30 s at 52 °C and 15 min at 68 °C; 15 cycles of 10 s at 94 °C, 30 s at 52 °C and 15 min at 68 °C with 20 s increases per cycle; final extension at 68 °C for 7 min. The PCR products were nebulized in a Medel jet nebulizer reservoir (Medel, Italy) for 5 min at 100 kPa and fragments in the 600–1500-bp range were selected using the Takara Chip for DNA recovery (Recochip, Takara Biomedicals). Size-selected nebulized PCR products were subcloned in pMOSBlue, using a blunt end cloning kit (Amersham Biosciences, USA) according to the manufacturer's instructions, and sequenced.
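The long-template PCR programme described above has an extension step that lengthens by 20 s per cycle in its second block, so its total hold time is not obvious at a glance. The following minimal sketch adds up the programmed hold times only (ramping between temperatures is ignored). The text does not say whether the first cycle of the second block already carries a 20 s increment, so both readings are offered as assumptions; the function name is hypothetical.

```python
def la_pcr_hold_seconds(increment_from_first_cycle=True):
    """Sum of programmed hold times (s) for the long-template PCR programme.

    ASSUMPTION: if increment_from_first_cycle is True, cycle k of the second
    block (k = 1..15) extends for 15 min + 20*k s; otherwise the first cycle
    has no increment and cycle k extends for 15 min + 20*(k-1) s.
    """
    initial_denaturation = 2 * 60
    denature, anneal, extend = 10, 30, 15 * 60

    # Block 1: 10 cycles with a fixed 15-min extension
    block1 = 10 * (denature + anneal + extend)

    # Block 2: 15 cycles, extension lengthened by 20 s per cycle
    start = 1 if increment_from_first_cycle else 0
    block2 = sum(denature + anneal + extend + 20 * k
                 for k in range(start, start + 15))

    final_extension = 7 * 60
    return initial_denaturation + block1 + block2 + final_extension


if __name__ == "__main__":
    for flag in (True, False):
        total = la_pcr_hold_seconds(flag)
        hours, rem = divmod(total, 3600)
        print(f"increment from first cycle={flag}: {total} s "
              f"({hours} h {rem // 60} min {rem % 60} s)")
```

Under either reading the programme holds for a little over seven hours, which is dominated by the 15-min extension steps needed for the 11-kb product.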
Fig. 1. Schematic representation of the E. ruminantium map1 multigene family. The genes are represented as open boxes with the gene orientation indicated by arrows. The schematic also indicates the construction of the 25-kb sequence using cloning, genome walking, long-template PCR (indicated by dashed lines), and consensus sequences from the E. ruminantium genome project. The ORF names are indicated by numbers below the open boxes with the hypothetical transcription regulator gene indicated by (TR) and unknown genes indicated by (un). PCR primers critical to the cloning scheme are shown with labels above the arrow indicating the direction of the primer. Sequences of the primers are also shown.
Table 1
Characteristics of the 24,993-bp map1 locus of E. ruminantium

Genes and gaps   bp (a)     Molecular mass (Da)   pI      Primers for RT-PCR (5′–3′)
gap              187
TR               606        22,982                9.44    TR-F: AGACCTCACCCTGTTGATGA; TR-R: GCACACTTGCAAGCTGGTAT
gap              278
map1-14          930        34,400                6.45    map1-14F: TTCATCACCACTTCCTGTTG; map1-14R: TCTTTTTCAAGCTCATGCTG
gap              883
map1-13          885        32,964                8.88    map1-13F: CATTTCTGGTGCTTTAGGGT; map1-13R: ACCCGTGGTAGTAACCTTCA
gap              88
un1              711        27,816                9.7     UN1F: ATTAGCAGCACTCCCAATCC; UN1R: TGGAAAACAACACTTTTGTGG
overlap          4
map1-12          828        31,330                7.74    map1-12F: TACAAGCCAAGCATTTCGTA; map1-12R: TTTGCTGATGATGAATCTGG
gap              15
map1-11          882        33,759                9.08    map1-11F: TTTGCCTTTTCAACATTTCA; map1-11R: GTCTCCACCAATACCGAAAC
gap              27
map1-10          774        28,990                8.63    map1-10F: TTACCAGCCAACTTAAGCCT; map1-10R: GGGACTGCTGATGAATTACC
gap              16
map1-9           870        32,910                9.38    map1-9F: CGGTTTTAGCGGAGCACTTGG; map1-9R: GAAATCTCCGCCAATTCCTA
gap              25
map1-8           849        31,356                6.23    map1-8F: CGCTAAAGAAAGCAACCTTC; map1-8R: AAATCGTCTAACGCGAAATC
gap              10
map1-7           852        31,895                8.68    map1-7F: TGTAGGTAGACGTGGCTGGC; map1-7R: AGCTCCACAGGTTGAAGTACG
gap              19
map1-6           888        33,426                9.14    map1-6F: TTTGATGCAGAAATCCCTGA; map1-6R: TCACCAAAATGTGTTGTGGC
gap              8
map1-5           618        24,338                9.96    map1-5F: AATAGCGTCAACTTGCCTGTC; map1-5R: TCACCAAAATGTGTTGTGGC
gap              39
map1-4           894        33,751                9.06    map1-4F: CCTCAGCATTTTACAACACCA; map1-4R: CATCCGTTTAGGAGAACAGACA
gap              20
map1-3           948        35,633                9.27    map1-3F: GAAATCCAAATCCTGGACCT; map1-3R: TGGTGCATTGTGTAAATTGG
gap              23
map1-2           921        34,552                8.9     map1-2F: ACTCCTAAACCTCCGCAGAA; map1-2R: ACCGATGTGCATCGTGTAGT
gap              1392
map1-1           849        31,100                6.83    map1-1F: GTGCAAAATACAACCCAAGCA; map1-1R: CCGCCAATAAATGCAGAAAT
gap              372
map1             873        31,552                5.74    map1sF: CATTAGCGCAAAATACATGC; map1sR: AAAACCTGGATTGGCTACAG
gap              1606
map1 + 1         858        32,108                6.95    map1 + 1F: TAATCCCCACCAATACCAGC; map1 + 1R: GCTTCAGAAACAATCCCTGG
gap              487
un2              336        17,957                4.09
gap              1081
un3              534        14,110                10.33
gap              324
secA             2191 (b)   NA                    NA
Total            24,993

(a) Includes stop codons for genes.
(b) Incomplete gene.
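As a consistency check on Table 1, the gene and gap lengths can be summed and compared with the stated locus size, and the lengths of the map1 ORFs can be converted to predicted protein lengths. The sketch below is illustrative only: it counts the 4-bp overlap as a negative gap (an assumption about how the total was computed), lists the last six length values in the order in which they appear in the table, and subtracts one codon per ORF because footnote (a) states that the stop codon is included. The variable names are hypothetical.

```python
# (label, length in bp) in 5'->3' order as listed in Table 1.
# ASSUMPTION: the 4-bp overlap before map1-12 is counted as -4.
LOCUS = [
    ("gap", 187), ("TR", 606),
    ("gap", 278), ("map1-14", 930),
    ("gap", 883), ("map1-13", 885),
    ("gap", 88), ("un1", 711),
    ("overlap", -4), ("map1-12", 828),
    ("gap", 15), ("map1-11", 882),
    ("gap", 27), ("map1-10", 774),
    ("gap", 16), ("map1-9", 870),
    ("gap", 25), ("map1-8", 849),
    ("gap", 10), ("map1-7", 852),
    ("gap", 19), ("map1-6", 888),
    ("gap", 8), ("map1-5", 618),
    ("gap", 39), ("map1-4", 894),
    ("gap", 20), ("map1-3", 948),
    ("gap", 23), ("map1-2", 921),
    ("gap", 1392), ("map1-1", 849),
    ("gap", 372), ("map1", 873),
    ("gap", 1606), ("map1+1", 858),
    ("gap", 487), ("un2", 336),
    ("gap", 1081), ("un3", 534),
    ("gap", 324), ("secA (incomplete)", 2191),
]

# Total length of the locus; independent of how the last rows are assigned
total_bp = sum(length for _, length in LOCUS)

# Predicted protein length: codons in the ORF minus the stop codon
paralog_aa = {
    name: length // 3 - 1
    for name, length in LOCUS
    if name.startswith("map1")
}

if __name__ == "__main__":
    print("Total locus length (bp):", total_bp)
    print("Number of map1 paralogs:", len(paralog_aa))
    print("Predicted protein length range (aa):",
          min(paralog_aa.values()), "to", max(paralog_aa.values()))
```

The total agrees with the 24,993 bp given in the table title, and the shortest and longest predicted proteins (map1-5 and map1-3) match the 205 to 315 amino acid range reported in the Results.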
2.4. Preparation of cDNA
Total RNA was isolated from E. ruminantium-infected endothelial cells grown at either 30 °C or 37 °C using a RNeasy mini column kit (Qiagen). DNA was eliminated by treatment with RNase-free DNase I (RQ1, Promega) and the isolated RNA was quantified, and its purity checked, by UV spectrophotometry (A260/A280 = 1.8–2.0). Total RNA (0.5 µg) was reverse-transcribed using the Superscript II first strand cDNA synthesis system (Invitrogen Life Technologies, Gaithersburg, USA) and random hexamer primers (Promega) according to the manufacturer's instructions. Negative control samples were also prepared in an identical manner, except that reverse transcriptase was omitted, for use in the subsequent RT-PCR reactions.
2.5. Reverse transcriptase (RT) and DNA PCR of map1 paralogs
Eighteen gene-specific primer pairs for the 16 map1 paralogs, the hypothetical transcription regulator gene, and the unknown gene (Table 1) were designed using the Primer 3 program (Rozen and Skaletsky, 1996, 1997, 1998). These primers were also used to investigate co-transcription of adjacent genes separated by intergenic spaces (Table 1, Fig. 4). PCR reactions contained 0.4 µM of each gene-specific primer, 0.25 mM of each dNTP, 1.25 mM MgCl2, 1 U Taq polymerase (Takara Biomedicals) and 25 ng genomic Welgevonden DNA or 2 µl cDNA product in a final volume of 50 µl. The PCR conditions were: 1 min of denaturation at 94 °C; 35 cycles of 30 s at 94 °C, 30 s at 55 °C, 30 s at 72 °C; final extension at 72 °C for 8 min. In each case positive and negative control reactions were performed. The negative controls, where reverse transcriptase had been omitted, were included to confirm the absence of DNA contamination in the RNA, and the positive controls consisted of an amplification of each map1 paralog with gene-specific primers from 25 ng E. ruminantium genomic DNA. To confirm the presence of E. ruminantium RNA in each specimen, PCR was performed with a primer pair specific for the 16S rRNA gene of E. ruminantium (Bekker et al., 2002) with reaction mixture and conditions similar to those described above (2 µl cDNA or 25 ng DNA template, 55 °C annealing temperature). We also performed RT-PCRs and genomic DNA PCRs under the same conditions to investigate co-transcription (Fig. 4). In these reactions the primers were, for each pair of adjacent genes, the forward primer of the first gene and the reverse primer of the second gene.
2.6. Sequence analysis
Database homology searches were carried out using the BLAST program (Altschul et al., 1990). Dot plot analyses of the omp gene clusters of E. canis (AF078553), E. chaffeensis (U72291) and E. ruminantium were performed using the program Omiga 2.0 (Oxford Molecular Group, Hunt Valley, Madison, USA) (Fig. 2). Multiple alignments of nucleic acid and amino acid sequences were carried out using CLUSTAL W (Thompson et al., 1994). Protein comparisons of the amino acid sequences of the gene clusters of E. canis (AF078553), E. chaffeensis (U72291) and E. ruminantium were performed using the global alignment algorithm NEEDLE (implemented in EMBOSS version 2.7.1) (Needleman and Wunsch, 1970). The consensus sequence of the 25-kb contig containing the map1 multigene family of E. ruminantium has been assigned the GenBank accession number AY343331.

3. Results
3.1. Genome walking
A 12-kb clone was obtained using a map1 probe from the large insert E. ruminantium LambdaGEM-11 library. Sequencing of this clone revealed map1-2 (partial), map1-1, map1 and map1 + 1 (van Heerden et al., 2002). A reverse primer (walk1R) was designed from those data and, by genome walking using DNA digested with DraI, a PCR product was obtained which extended the data approximately 500 bp upstream (Fig. 1). The remainder of the map1-2 and a partial map1-3 gene were identified in the new sequence. A second reverse primer (walk2R) was designed in the newly characterized map1-3 sequence and, using DNA digested with HaeIII, a further genome walking product of approximately 3 kb was obtained. The complete map1-3, map1-4 and map1-5 genes, and a partial map1-6 gene, were identified in the sequence of this product.
3.2. Long-template PCR
A contig from the E. ruminantium genome sequencing project was found which had homology (E < 10^-4) with ehrlichial omp genes and with the hypothetical transcriptional regulator of E. chaffeensis. A forward primer, 9092F, was designed at the 5′ end of this contig, and a reverse primer, 7573R, was designed at the 5′ end of the genome walking product containing map1-6. An 11-kb PCR product was obtained using these primers and eight more map1 paralogs (designated map1-7 to map1-14) were identified in the sequence of this product.
3.3. map1 multigene family
The 24,993-bp contig containing the map1 multigene family had 21 ORFs (Fig. 1) and the G + C content was 27.19%. Sixteen of the twenty-one ORFs had homologies to map1 (including map1 itself) and 15 of them, from map1 upstream to map1-14, were tandemly organized in a head to tail arrangement. The ORF downstream from map1 (map1 + 1) was on the opposite strand. The ORF at the 5′ end of the locus had 73.6% identity to the hypothetical transcriptional regulator of E. chaffeensis and another incomplete ORF at the 3′ end was 48.3% identical to the secA gene of Rickettsia prowazekii; SECA is known to be required for outer membrane protein transport (Bernstein, 2000). Three ORFs of unknown function were identified in the locus: un1 is located between map1-13 and map1-12 and the other two, un2 and un3, are located downstream from map1 + 1. The unknown genes u2 and un2 of E. canis and E. chaffeensis were homologous to un1 of E. ruminantium (10.5% and 15.8% identity, respectively); un2 of E. ruminantium was homologous to u4 of E. canis (27.8% identity) and un4 of E. chaffeensis (23.0% identity), while un3 of E. ruminantium was homologous to u5 of E. canis (31.4% identity) and un5 of E. chaffeensis (35.2% identity).

In the upstream section of the locus, from map1-13 to map1-2, the intergenic spaces varied in size from a 4-bp overlap to 39 bp, while those towards the 3′ end, from map1-2 to secA, were longer, at 372 to 1606 bp (Table 1). The 16 map1 paralogs are predicted to encode proteins of between 205 and 315 amino acids, with molecular masses of 24,338 to 35,633 Da (Table 1). The estimated isoelectric points of the predicted proteins in the upstream section of the locus were more basic (6.45 to 9.96) than those in the downstream section (5.74 to 6.95) (Table 1).

Orthologs with E. canis or E. chaffeensis OMPs were defined on the basis of gene location within the locus and protein sequence identities. The map1 paralogs from map1-2 upstream to map1-7 appeared to lack closely related proteins in the equivalent region of E. canis or E. chaffeensis. The remaining E. ruminantium map1 paralogs had homology with genes from an equivalent region of the locus in either E. canis or E. chaffeensis. The map1 gene has homology with the genes in the α region of E. canis (p30-4, p30a, p30-3, p30-2, p30-1 and p30) and E. chaffeensis (omp-1C, omp-1D, omp-1E, omp-1F and p28). The percentage identities between map1 and the genes in the α region of E. canis and E. chaffeensis ranged from 48.5% to 63.1% (Table 2).

Dot plot comparison of the map1 multigene locus with itself, and with the omp clusters of E. chaffeensis and E. canis, showed that there were three repetitive regions (Fig. 2). The α region contains five and six paralogs in E. chaffeensis and E. canis, respectively. However, there is only one orthologous gene, map1, in this region in E. ruminantium. In E. ruminantium the β region consisted of map1-2 and map1-3 and the γ region was the largest, consisting of five genes (map1-10 to map1-6). The four non-repetitive regions consisted of the area upstream of map1-11, the region from map1-5 to map1-4, map1-1 and map1 + 1 (Fig. 2).

Table 2
The percentage identities and similarities of outer membrane protein orthologs between E. ruminantium and E. canis or E. chaffeensis

E. ruminantium map1 paralog   E. canis or E. chaffeensis paralog (a)   Percent identity of orthologs   Percent similarity of orthologs (b)
MAP1-14                       OMP-1M                                   50.8                            70.6
MAP1-13                       OMP-1N                                   48.2                            63.3
MAP1-12                       OMP-1Q                                   41.9                            59.1
MAP1-11                       P30-16                                   53.2                            66.1
MAP1-10                       OMP-1T                                   46.3                            68.7
MAP1-9                        OMP-1U                                   51.7                            65.3
MAP1-8                        OMP-1V                                   59.8                            73.4
MAP1-1                        P30-10                                   75.6                            84.8
MAP1 (c)                      OMP-1E                                   63.1                            77.2
                              OMP-1C                                   62.8                            77.9
                              P28                                      62.4                            75.5
                              P30                                      61.4                            74.4
                              P30-2                                    59.5                            74.9
                              OMP-1F                                   58.8                            73.3
                              OMP-1D                                   57.1                            71.3
                              P30-1                                    55.6                            68.4
                              P30-4                                    50.8                            63.5
                              P30a                                     49.2                            62.6
                              P30-3                                    48.5                            67.6
MAP1 + 1                      P28-1                                    58.2                            75.1

(a) Paralogs termed P30 for E. canis and OMP / P28 for E. chaffeensis.
(b) Similarities were calculated using the EBLOSUM62 comparison matrix.
(c) MAP1 has high similarity to the genes in the α region of E. canis and E. chaffeensis.

Fig. 2. Dot plot analysis of the map1 multigene cluster of E. ruminantium with (A) the map1 cluster of E. ruminantium, (B) the omp cluster of E. chaffeensis and (C) the omp cluster of E. canis using the Omiga 2.0 program. The three repetitive regions (α, β and γ) are indicated with bars.

3.4. Transcription of the multigene family of E. ruminantium
The gene-specific primers designed for the 16 map1 paralogs (Table 1), when used in the positive control PCR reactions with genomic DNA, specifically amplified their target sequences (Fig. 3A). The transcriptional activities of the individual genes were inferred from the RT-PCR amplifications of cDNA prepared from total RNA isolated
from E. ruminantium (Welgevonden) cultures in a range of different endothelial cells grown at 30 and 37 °C. RNA transcripts were detected with all 16 map1 gene-specific primer pairs for each of the cell cultures. RNA transcripts were also detected for the transcriptional regulator (TR) and the unknown ORF (un1), although only a very faint RT-PCR product was detected for the latter. Typical results are shown in Fig. 3B for E. ruminantium-infected BA cells grown at 37 °C. E. ruminantium 16S-specific primers (EHR primer pair) were used to test for E. ruminantium genomic DNA in the RNA preparation (Fig. 3A and B). No amplicon was detected in any of the RT-PCR negative controls, indicating the absence of DNA contamination (data not shown).

In the experiments investigating the polycistronic transcription of map1 genes, the positive control amplifications of all pairs of adjacent map1 paralogs (Fig. 4A) produced amplicons of the correct size. Two PCR products (one of the correct size and a smaller band) were produced for adjacent genes map1-10 and map1-9. The results of the RT-PCR amplifications indicated that the hypothetical transcriptional regulator and map1-14 are polycistronically transcribed, and so too are the map1 paralogs from map1-12 to map1-2 (Fig. 4B). We were not able to demonstrate the polycistronic transcription of map1-13 and un1, nor of any paralogs downstream from map1-2 (Fig. 4B).

Fig. 3. Transcriptional analysis of the map1 gene cluster of E. ruminantium Welgevonden isolate grown in a bovine aorta endothelial cell line at 37 °C using (A) PCR from genomic DNA template (positive control) and (B) RT-PCR. Lane M is the φX174 HaeIII marker, lane 16S indicates the PCR product generated using E. ruminantium 16S primers (EHR-F and EHR-R), lane TR indicates the PCR product generated from the hypothetical transcriptional regulator gene using TR-F and TR-R primers, and lane un1 is the PCR product generated from the unknown gene.

Fig. 4. Schematic representation of the results of RT-PCRs and DNA PCRs to investigate co-transcription of adjacent E. ruminantium map1 paralogs, using the forward primer of the first gene and the reverse primer of the second gene. The schematic indicates (A) the amplification of adjacent genes using genomic DNA as positive controls and (B) the co-transcription of adjacent genes from the RNA template. The map1-like genes are indicated by numbers in the open boxes with the hypothetical transcription regulator gene indicated by (TR) and unknown genes indicated by (un).
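The primer pairing used in the co-transcription assays (forward primer of the upstream gene with the reverse primer of the adjacent downstream gene) can be sketched as follows for the first five ORFs of the locus, using the primer sequences and lengths listed in Table 1. The `Gene` structure, the function name and the `max_span_bp` value are illustrative assumptions: the exact primer positions within each ORF are not given here, so the span (upstream ORF + gap + downstream ORF) is only an upper bound on the size of the genomic amplicon, and the sketch assumes all five ORFs lie on the same strand.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    length_bp: int      # ORF length from Table 1 (includes stop codon)
    gap_before_bp: int  # space upstream of the ORF (negative = overlap)
    fwd: str            # gene-specific forward primer, 5'->3'
    rev: str            # gene-specific reverse primer, 5'->3'


# First five ORFs of the locus, in 5'->3' order (values from Table 1)
GENES = [
    Gene("TR", 606, 187, "AGACCTCACCCTGTTGATGA", "GCACACTTGCAAGCTGGTAT"),
    Gene("map1-14", 930, 278, "TTCATCACCACTTCCTGTTG", "TCTTTTTCAAGCTCATGCTG"),
    Gene("map1-13", 885, 883, "CATTTCTGGTGCTTTAGGGT", "ACCCGTGGTAGTAACCTTCA"),
    Gene("un1", 711, 88, "ATTAGCAGCACTCCCAATCC", "TGGAAAACAACACTTTTGTGG"),
    Gene("map1-12", 828, -4, "TACAAGCCAAGCATTTCGTA", "TTTGCTGATGATGAATCTGG"),
]


def adjacent_pairs(genes):
    """Yield one primer pair per pair of neighbouring genes."""
    for up, down in zip(genes, genes[1:]):
        yield {
            "pair": f"{up.name}/{down.name}",
            "forward": up.fwd,    # forward primer of the first gene
            "reverse": down.rev,  # reverse primer of the second gene
            # Upper bound only: real amplicons are shorter because the
            # primers sit inside the ORFs.
            "max_span_bp": up.length_bp + down.gap_before_bp + down.length_bp,
        }


if __name__ == "__main__":
    for p in adjacent_pairs(GENES):
        print(p["pair"], p["forward"], p["reverse"], p["max_span_bp"])
```

An RT-PCR product from such a pair indicates that a single transcript spans both genes, whereas a product from genomic DNA alone (the positive control) only confirms that the primers work.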
4. Discussion We report for the first time the complete sequence of the entire map1 multigene locus of E. ruminantium with the adjacent genes, a hypothetical transcriptional regulator and secA. We also demonstrate that all 16 map1 paralogs, the hypothetical transcription regulator gene, and un1, are transcribed in in vitro culture conditions. The map1 paralogs from map1-12 to map1-2, connected by short intergenic spaces, are polycistronically transcribed, as are the hypothetical transcriptional regulator and map1-14 (Fig. 4B). Three repetitive regions (a, h and g) first identified within the omp cluster of E. chaffeensis and E. canis (Ohashi et al., 2001) were also present in the map1 multigene cluster of E. ruminantium. Sulsona et al. (1999) first demonstrated that map1 is encoded by a polymorphic multigene family when they found map1 located in tandem with map1-1 (orf2) and the partial map1-2 (orf3). We later characterized map1 + 1 and secA downstream of map1 (van Heerden et al., 2002). The omp multigene families of E. canis and E. chaffeensis had 22 paralogs arranged in tandem, with a hypothetical transcriptional regulator upstream and secA downstream of the omp cluster (Ohashi et al., 2001). In the syntenic E. ruminantium locus we find that the map1 multigene family consists of only 16 paralogs, bounded as in the other omp families by a hypothetical transcriptional regulator upstream and secA downstream. Bekker et al. (2002) investigated the transcription of map1-2, map1-1 and map1 genes and found that only the map1 gene was transcribed in E. ruminantium grown in bovine endothelial cells at 30 and 37 jC. We, however, found that all 16 paralogs of the map1 cluster of E. ruminantium were transcriptionally active in all cell lines tested at both temperatures. In the present study, genespecific primers for each map1 paralog were used, whereas Bekker et al. 
(2002) used a general forward primer and a gene-specific reverse primer in the transcriptional analysis of map1-2, map1-1 and map1. Their failure to detect the transcription of map1-1 and map1-2 could be due to a lack
of specificity of the general forward primer. This primer was identical to the map1 sequence, but had only 96% and 75% identities with the Senegal map1-1 and map1-2 sequences, respectively. Discrepancies have also shown up amongst different transcriptional studies of the multigene families of E. canis and E. chaffeensis and they have been attributed to a number of possible causes, including the use of different primer pairs in different laboratories (Reddy et al., 1998; McBride et al., 2000; Yu et al., 2000). Other possibilities are that RNA isolated from cultured organisms may contain RNA derived from several different life cycle stages, and also that expression may be influenced by different culture conditions and techniques (Ohashi et al., 2001; Unver et al., 2001; Long et al., 2002; Cheng et al., 2003). In our hands all 16 paralogs of E. ruminantium were found to be transcriptionally active in culture, and the map1-12 to map1-2 paralogs with short intergenic spaces were polycistronically transcribed. These results correspond with the demonstration that all 22 paralogs of E. canis omp genes were transcriptionally active in monocyte cultures, and that the paralogs with short intergenic spaces were also polycistronically transcribed (Ohashi et al., 2001). Unver et al. (2001) studied the expression of p30-10 of E. canis (orthologs: omp-1B in E. chaffeensis and map1-1 in E. ruminantium) grown in the dog monocyte cell line DH82 and found it to be higher at 25 jC than at 37 jC. The p30-10 gene was also the only p30 paralog expressed in infected ticks (Unver et al., 2001). A similar situation was observed in ticks infected with E. chaffeensis: omp-1B, the ortholog of p30-10, was the only omp-1 transcript detected in ticks (Unver et al., 2002). The authors concluded that the temperature differential between the tick vector and the mammalian host may play a role in the transcriptional regulation of p30 and omp-1 paralogs. 
In our study we showed that the map1 paralogs were transcriptionally active at 30 and 37 °C, but we did not investigate up- or down-regulation of the map1 paralogs at the lower temperature. This would require quantitative experiments to determine whether transcript level differences exist. While up- or down-regulation might be observed at temperatures lower than 30 °C, such low temperatures restrict the growth of E. ruminantium in endothelial cells. The expression of the map1 multigene family of E. ruminantium, therefore, needs to be investigated in vivo in its mammalian and tick hosts.
The repetitive regions (α, β, and γ) in the E. ruminantium map1 cluster are smaller, and the non-repetitive regions larger, than those in E. canis and E. chaffeensis. In E. ruminantium, map1 was the only ortholog of the genes in the α region of E. canis and E. chaffeensis; the α region in E. canis consisted of six genes (p30-4, p30a, p30-3, p30-2, p30-1 and p30), and in E. chaffeensis of five genes (omp-1C, omp-1D, omp-1E, omp-1F and p28) (Ohashi et al., 2001). The map1 gene is highly variable amongst different E. ruminantium isolates, while other map1 paralogs (map1-1, map1-2, map1-3 and map1 + 1) are more conserved (Reddy et al., 1996; Sulsona et al., 1999; Allsopp et al., 2001; Bekker et al., 2002; van Heerden, unpublished). This is in contrast to the situation in E. canis and E. chaffeensis where the individual gene sequences of various genes in the α region (p30 of E. canis as well as omp-1B, omp-1C, omp-1D, omp-1E, omp-1F and p28 of E. chaffeensis) are relatively conserved amongst different geographical isolates (McBride et al., 1999; Yu et al., 1999; Long et al., 2002). Recently Cheng et al. (2003) found gene deletion and insertion mutations within the α region in a comparison of 10 E. chaffeensis isolates. The variability of map1, and the insertion and deletion of genes in the α region of E. chaffeensis, may indicate that genes in this region could be involved in mechanisms to evade the immune response in the mammalian host, especially perhaps in carrier animals with long-standing persistent infections. Despite the fact that there is a strong immunodominant response to MAP1 in infected ruminants, this response provides no protection against E. ruminantium challenge (van Kleef et al., 1993). Allsopp et al. (2001) therefore analyzed the map1 sequences of 21 different E. ruminantium isolates looking for evidence of positive selection pressure, such as would be expected if the protein were involved in evading attack by the host’s immune system.
There was no statistically significant evidence to support this hypothesis, and the authors suggested that other factors, such as conformational constraints on the MAP1 molecule, might be acting to mask the effect of selection pressure. The possibility exists, therefore, that the function of MAP1 is to block an immune response to paralogs which are important to survival of the organism. The constraint would arise because the gene needs to remain sufficiently similar to the putatively important paralog(s) that its ability to interfere with an immune reaction against them is not lost, and also sufficiently different from them that a response against MAP1 would not also recognize the other paralogs. While this speculation fits the currently known facts, the exact biological functions of the ehrlichial omp families remain to be elucidated.
Acknowledgements
This study was conducted at the Onderstepoort Veterinary Institute, Private Bag X05, Onderstepoort 0110, South Africa, and was supported by a European Union INCO DEV grant (ICA4-CT-2000-30026). We are grateful to Dr. Erich Zweygarth and Ms. Antoinette Josemans of the Onderstepoort Veterinary Institute for the propagation of E. ruminantium in different endothelial cell lines.
References
Allsopp, M.T.E.P., Dorfling, C.M., Maillard, J.C., Bensaid, A., Haydon, D.T., van Heerden, H., Allsopp, B.A., 2001. Ehrlichia ruminantium major antigenic protein gene (map1) variants are not geographically constrained and show no evidence of having evolved under positive selection pressure. J. Clin. Microbiol. 39, 4200–4203.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Bekker, C.P.J., Bell-Sakyi, L., Paxton, E.A., Martinez, D., Bensaid, A., Jongejan, F., 2002. Transcriptional analysis of the major antigenic protein 1 multigene family of Cowdria ruminantium. Gene 285, 193–201.
Bernstein, H.D., 2000. The biogenesis and assembly of bacterial membrane proteins. Curr. Opin. Microbiol. 3, 203–209.
Brayton, K.A., Fehrsen, J., de Villiers, E.P., van Kleef, M., Allsopp, B.A., 1997. Construction and initial analysis of a representative λZAPII expression library of the intracellular rickettsia Cowdria ruminantium: cloning of map1 and three other genes. Vet. Parasitol. 72, 185–199.
Cheng, C., Paddock, C.D., Ganta, R.R., 2003. Molecular heterogeneity of Ehrlichia chaffeensis isolates determined by sequence analysis of the 28-kilodalton outer membrane protein genes and other regions of the genome. Infect. Immun. 71, 187–195.
Dame, J.B., Mahan, S.M., Yowell, C.A., 1992. Phylogenetic relationship of Cowdria ruminantium, agent of heartwater, to Anaplasma marginale and other members of the order Rickettsiales determined on the basis of the 16S rRNA sequence. Int. J. Syst. Bacteriol. 42, 270–274.
de Villiers, E.P., Brayton, K.A., Zweygarth, E., Allsopp, B.A., 1998. Purification of Cowdria ruminantium organisms for use in genome analysis by pulsed-field gel electrophoresis. Ann. N.Y. Acad. Sci. 849, 313–320.
DiLella, A.G., Woo, S.L.C., 1987. Cloning large segments of genomic DNA using cosmid vectors. Methods Enzymol. 152, 199–212.
Dumler, J.S., Barbet, A.F., Bekker, C.P., Dasch, G.A., Palmer, G.H., Ray, S.C., Rikihisa, Y., Rurangirwa, F.R., 2001. Reorganization of genera in the families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: unification of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and Ehrlichia with Neorickettsia, descriptions of six new species combinations and designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of Ehrlichia phagocytophila. Int. J. Syst. Evol. Microbiol. 51, 2145–2165.
du Plessis, J.L., 1985. A method for determining the Cowdria ruminantium infection rate of Amblyomma hebraeum: effects in mice injected with tick homogenates. Onderstepoort J. Vet. Res. 52, 55–61.
Long, S.W., Zhang, X.-F., Qi, H., Standaert, S., Walker, D.H., Yu, X.-J., 2002. Antigenic variation of Ehrlichia chaffeensis resulting from differential expression of the 28-kilodalton protein gene family. Infect. Immun. 70, 1824–1831.
Mahan, S.M., Andrew, H.R., Tebele, N., Burridge, M.J., Barbet, A.F., 1995. Immunization of sheep against heartwater with inactivated Cowdria ruminantium. Res. Vet. Sci. 58, 46–49.
McBride, J.W., Yu, X.-J., Walker, D.H., 1999. Molecular cloning of the gene for a conserved major immunoreactive 28-kilodalton protein of Ehrlichia canis: a potential serodiagnostic antigen. Clin. Diagn. Lab. Immunol. 6, 392–399.
McBride, J.W., Yu, X.-J., Walker, D.H., 2000. A conserved, transcriptionally active p28 multigene locus of Ehrlichia canis. Gene 254, 245–252.
Needleman, S.B., Wunsch, C.D., 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453.
Ohashi, N., Unver, A., Zhi, N., Rikihisa, Y., 1998a. Cloning and characterization of multigenes encoding the immunodominant 30-kilodalton major outer membrane proteins of Ehrlichia canis and application of the recombinant protein for serodiagnosis. J. Clin. Microbiol. 36, 2671–2680.
Ohashi, N., Zhi, N., Zhang, Y., Rikihisa, Y., 1998b. Immunodominant major outer membrane proteins of Ehrlichia chaffeensis are encoded by a polymorphic multigene family. Infect. Immun. 66, 132–139.
Ohashi, N., Rikihisa, Y., Unver, A., 2001. Analysis of transcriptionally active gene clusters of major outer membrane protein multigene family in Ehrlichia canis and E. chaffeensis. Infect. Immun. 69, 2083–2091.
Reddy, G.R., Sulsona, C.R., Harrison, R.H., Mahan, S.M., Burridge, M.J., Barbet, A.F., 1996. Sequence heterogeneity of the major antigenic protein 1 genes from Cowdria ruminantium isolates from different geographical areas. Clin. Diagn. Lab. Immunol. 3, 417–422.
Reddy, G.R., Sulsona, C.R., Barbet, A.F., Mahan, S.M., Burridge, M.J., Alleman, A.R., 1998. Molecular characterization of a 28-kDa surface gene family of the tribe Ehrlichiae. Biochem. Biophys. Res. Commun. 247, 636–643.
Rozen, S., Skaletsky, H.J., 1996, 1997, 1998. Primer3. Code available at: http://www-genome.wi.mit.edu/genome_software/other/primer3.html.
Sambrook, J., Fritsch, E.H., Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Siebert, P.D., Chenchik, A., Kellogg, D.E., Lukyanov, K.A., Lukyanov, S.A., 1995. An improved method for walking in uncloned genomic DNA. Nucleic Acids Res. 23, 1087–1088.
Sulsona, C.R., Mahan, S.M., Barbet, A.F., 1999. The map1 gene of Cowdria ruminantium is a member of a multigene family containing both conserved and variable genes. Biochem. Biophys. Res. Commun. 257, 300–305.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Uilenberg, G., 1983. Heartwater (Cowdria ruminantium infection): current status. Adv. Vet. Sci. Comp. Med. 27, 427–480.
Unver, A., Ohashi, N., Tajima, T., Stich, R.W., Grover, D., Rikihisa, Y., 2001. Transcriptional analysis of p30 major outer membrane multigene family of Ehrlichia canis in dogs, ticks, and cell culture at different temperatures. Infect. Immun. 69, 6172–6178.
Unver, A., Rikihisa, Y., Stich, R.W., Ohashi, N., Felek, S., 2002. The omp-1 major outer membrane multigene family of Ehrlichia chaffeensis is differentially expressed in canine and tick hosts. Infect. Immun. 70, 4701–4704.
van Heerden, H., Collins, N.E., Allsopp, M.T.E.P., Allsopp, B.A., 2002. Major outer membrane proteins of Ehrlichia ruminantium encoded by multigene family. Ann. N.Y. Acad. Sci. 969, 131–134.
van Kleef, M., Neitz, A.W., De Waal, D.T., 1993. Isolation and characterization of antigenic proteins of Cowdria ruminantium. Rev. Élev. Méd. Vét. Pays Trop. 46, 157–164.
van Vliet, A.H.M., Jongejan, F., van der Zeijst, B.A.M., 1992. Phylogenetic position of Cowdria ruminantium (Rickettsiales) determined by analysis of amplified 16S ribosomal DNA sequences. Int. J. Syst. Bacteriol. 42, 494–498.
van Vliet, A.H.M., Jongejan, F., van Kleef, M., van der Zeijst, B.A.M., 1994. Molecular cloning, sequence analysis, and expression of the gene encoding the immunodominant 32-kilodalton protein of Cowdria ruminantium. Infect. Immun. 62, 1451–1456.
Yu, X.-J., McBride, J.W., Walker, D.H., 1999. Genetic diversity of the 28-kilodalton outer membrane protein gene in human isolates of Ehrlichia chaffeensis. J. Clin. Microbiol. 37, 1137–1143.
Yu, X.-J., McBride, J.W., Zhang, X.-F., Walker, D.H., 2000. Characterization of the complete transcriptionally active Ehrlichia chaffeensis 28 kDa outer membrane protein multigene family. Gene 248, 59–68.
Zweygarth, E., Josemans, A.I., Horn, E., 1998. Serum-free media for the in vitro cultivation of Cowdria ruminantium. Ann. N.Y. Acad. Sci. 849, 307–312.