In silico analysis of potential diagnostic targets from Burkholderia pseudomallei


Transactions of the Royal Society of Tropical Medicine and Hygiene (2008) 102/S1, S61–S65

available at www.sciencedirect.com

journal homepage: www.elsevierhealth.com/journals/trst

Denis B. Thompson (a), Kerianne Crandall (a), Sarah V. Harding (b), Sophie J. Smither (b), G. Barrie Kitto (a), Richard W. Titball (c), Katherine A. Brown (a,d,e,*)

(a) Department of Chemistry and Biochemistry, University of Texas at Austin, Austin, TX 78712, USA
(b) Defence Science and Technology Laboratory, Porton Down, Salisbury, Wiltshire SP4 0JQ, UK
(c) School of Biosciences, Geoffrey Pope Building, University of Exeter, Exeter EX4 4QD, UK
(d) Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
(e) Department of Life Sciences, Division of Cell and Molecular Biology, Centre for Molecular Microbiology and Infection, Imperial College London, London SW7 2AZ, UK

KEYWORDS: Melioidosis; Burkholderia; Bioinformatics; Diagnostics; Bacterial genome; Proteins

Summary

Administration of appropriate therapeutic regimes for infections arising from pathogenic species of Burkholderia is critically dependent upon rapid and accurate diagnoses. The purpose of this work is to establish a bioinformatic pipeline to assess protein sequences for their potential as diagnostic targets for the detection of Burkholderia species. Data are presented showing both a bioinformatic methodology for prediction of surface-associated and secreted proteins and its application to a test dataset of proteins from the pathogen B. pseudomallei. A subset of proteins, known to be produced by the organism, is identified which represents potential targets for development of new diagnostic reagents. In addition, a ‘reverse diagnostics’ bioinformatics approach has been established which can now be extended to whole genome analyses.

© 2008 Published by Elsevier Ltd on behalf of Royal Society of Tropical Medicine and Hygiene.

1. Introduction

Burkholderia pseudomallei is the causative agent of melioidosis, a debilitating and sometimes deadly disease endemic in Southeast Asia and northern Australia. Acquisition of the organism involves inhalation, ingestion or inoculation. [1] Burkholderia pseudomallei and the related organism B. mallei, the causative agent of glanders, are both classified as Category B pathogens by the US Centers for Disease Control and Prevention. Rapid diagnosis is important to ensure that appropriate treatments are administered, as B. pseudomallei is intrinsically resistant to many antibiotics. [2,3] Although a number of diagnostic methods exist for detection of the organism (reviewed by Peacock, 2006 [4]), it would be useful to develop diagnostic reagents with improved sensitivities and specificities which could be used in multiplexed systems.

As part of an effort to develop improved assays for detecting pathogenic Burkholderia species, we are developing high-affinity molecular reagents such as aptamers using proteins known to be produced by B. pseudomallei that may have potential as diagnostic targets. [5] The availability of genomic information for B. pseudomallei, B. mallei and B. thailandensis (a related organism with very low levels of virulence) raises the possibility of using bioinformatic methods to predict suitable diagnostic targets. These methods would be similar to those used in bioinformatic ‘reverse vaccinology’ approaches [6,7] that focus on predictions of surface-associated and secreted proteins. In this study we have used a number of bioinformatic prediction tools to assess a test set of B. pseudomallei proteins as diagnostic targets. The test set is derived from recent proteomic or protection studies. [8,9] The suitability of proteins for use in diagnostic assay development is discussed, as well as the potential of expanding this method to larger gene sets and combining it with other experimentally derived information.

* Corresponding author. Tel.: +44 20 75945298; fax: +44 20 75945207. E-mail address: [email protected] (K.A. Brown).

0035-9203/$ see front matter © 2008 Published by Elsevier Ltd on behalf of Royal Society of Tropical Medicine and Hygiene.

Table 1. Proteins from 2D-gel rated by programs to get indication of suitability. Those proteins not classified as cytoplasmic location by P-CLASSIFIER

| GenBank accession no. | Protein (a) | Function | P-CLASSIFIER | SignalP | Phobius | BOMP | Immunoreactive (b) |
|---|---|---|---|---|---|---|---|
| 53721903 | BPSS0879 | Porin protein | O | Y | N | Y | Y |
| 53722869 | BPSS1850 | Hypothetical protein | O | Y | N | Y | Y |
| 53722699 | BPSS1679 | Porin-related exported protein | O | Y | N | Y | Y |
| 53722763 | BPSS1742 | Outer membrane copper receptor | O | Y | N | N | Y |
| 53718186 | Ssb | Single-strand DNA-binding protein | P | N | N | N | N |
| 53721247 | BPSS0212 | Hypothetical protein | P | N | N | N | N |
| 53721253 | BPSS0218 | Isomerase | P | N | N | N | N |
| 53721248 | BPSS0213 | Hypothetical protein | P | N | N | N | N |
| 53720622 | BPSL3012a | Hypothetical protein | P | N | N | N | N |
| 53720358 | BPSL2748 | Putative oxidoreductase | P | N | N | N | N |
| 53718815 | LolC | ABC transport system protein | I | N | Y | N | Y |
| 53719346 | BPSL1732 | Putative methyl-accepting chemotaxis citrate transducer | I | Y | Y | N | N |
| 53720012 | BPSL2406 | Hypothetical protein | I | Y | Y | N | N |

I: inner membrane; O: outer membrane; P: periplasmic; Y: yes; N: no.
(a) Proteins selected were from Harding et al. (2007), [8] with the exception of LolC, which came from Harland et al. (2007). [9]
(b) Immunoreactivity was determined in Harding et al. (2007). [8]

2. Materials and methods

The set of 40 protein sequences was prepared in FASTA format. The sequences were obtained from proteomic studies of B. pseudomallei, [8] including the protein sequence from BPSL2748. The sequence of the B. pseudomallei protective antigen LolC [9] was also included in this test set. All 40 sequences were analyzed in silico using four web-based programs: P-CLASSIFIER, SignalP 3.0, Phobius and BOMP. P-CLASSIFIER uses multiple support vector machines to predict the subcellular localization of proteins in Gram-negative bacteria. [10] The five cellular location classes are: extracellular, outer membrane, periplasmic, inner membrane and cytoplasmic. The SignalP 3.0 Server [11,12] uses a neural net model and a hidden Markov model to predict the presence or absence of a type II secretion pathway signal. The two prediction algorithms of SignalP agreed 100% of the time for the 40 sequences submitted. Phobius, which is based on a hidden Markov model, predicts the presence or absence of transmembrane helices and distinguishes between transmembrane helices and signal peptides. [13] The BOMP program predicts beta-barrel outer membrane proteins based on the presence of two sequence recognition features typical of this class of proteins. [14]

Seven proteins (four outer membrane and three inner membrane) were subsequently analyzed using the program Phyre [15] to predict secondary structure and three-dimensional folds. Data were inspected for the presence of non-membrane regions or domains.
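One way to picture how the outputs of the four programs fit together is a per-protein record holding one call from each tool. The Python sketch below is illustrative only: the `Prediction` class, its field names and the `assess` rule are assumptions introduced here, not the authors' implementation, and the programs themselves were run through their web servers rather than from code. The three example records use the calls listed in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Calls from the four predictors for one protein (hypothetical layout)."""
    protein: str
    location: str          # P-CLASSIFIER class: "E", "O", "P", "I" or "C"
    signal_peptide: bool   # SignalP 3.0 call
    tm_helices: bool       # Phobius: transmembrane helices predicted
    beta_barrel: bool      # BOMP call


def assess(p: Prediction) -> str:
    """Return a coarse label; this rule is an assumption made for illustration."""
    if p.location == "C":
        return "cytoplasmic"
    if p.location == "O":
        if p.beta_barrel:
            return "outer membrane, beta-barrel supported"
        return "outer membrane, check fold with Phyre"
    if p.location == "I":
        if p.tm_helices:
            return "inner membrane, helices supported"
        return "inner membrane, helices not supported"
    return "periplasmic or extracellular"


# Calls copied from Table 1 for three of the proteins.
examples = [
    Prediction("BPSS0879", "O", True, False, True),
    Prediction("BPSS1742", "O", True, False, False),
    Prediction("LolC", "I", False, True, False),
]

for p in examples:
    print(p.protein, "->", assess(p))
```

Keeping the four calls side by side makes disagreements between tools visible, such as an outer membrane assignment that BOMP does not support, which is the kind of case passed on to Phyre in this study.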

3. Results

Of the 40 protein sequences analyzed in silico for properties suitable for diagnostic development, P-CLASSIFIER predicts 13 to be localized to the outer membrane, periplasmic space or inner membrane (Table 1). Three of the four outer membrane proteins are predicted to be beta-barrel structures using the sequence recognition program BOMP. The fourth protein, BPSS1742 (Table 1), with a predicted function as a copper receptor, does not appear to fit the BOMP criteria for a beta-barrel. However, using the structure-based homology program Phyre, BPSS1742 and the other three outer membrane proteins (Table 1) all show similar patterns of beta-strands in the secondary structure prediction, and the top hits for Phyre’s fold homology search are beta-barrel structures for all four proteins.

The three proteins predicted to be inner membrane proteins, LolC, BPSL1732 and BPSL2406 (Table 1), are also the only proteins predicted to contain transmembrane helices using the Phobius server. Secondary structure predictions of these three proteins using Phyre all show the presence of alpha-helices at positions in the sequence that correlate well with the positions of the transmembrane helices predicted by Phobius.

In comparison, the remaining 27 proteins from this test set are predicted to be located in the cytoplasm (Table 2). None of these proteins, nor any of the six proteins predicted to be periplasmically located (Table 1), show evidence of containing transmembrane alpha-helices or of being folded as a beta-barrel. In the current study, proteins were classified as containing a signal sequence or not based on the ‘D-score’ generated by the SignalP 3.0 server, as recommended by Bendtsen et al. [11] By this method, none of these 33 proteins is classified here as containing a signal sequence. However, BPSS0839 and SdhA (Table 2) gave ‘S-scores’ (another SignalP 3.0 measure of the likelihood of having a signal sequence) above the S-score threshold, suggesting that the absence of a signal sequence is less certain for these two proteins than for the other proteins in this test set.
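The counts for the non-cytoplasmic proteins can be checked directly against the calls in Table 1. The Python sketch below is illustrative only: the tuple layout and variable names are assumptions, and the Y/N values are transcribed from Table 1 rather than regenerated by the prediction servers.

```python
from collections import Counter

# (protein, P-CLASSIFIER location, SignalP, Phobius, BOMP), transcribed from Table 1
TABLE_1 = [
    ("BPSS0879", "O", "Y", "N", "Y"),
    ("BPSS1850", "O", "Y", "N", "Y"),
    ("BPSS1679", "O", "Y", "N", "Y"),
    ("BPSS1742", "O", "Y", "N", "N"),
    ("Ssb", "P", "N", "N", "N"),
    ("BPSS0212", "P", "N", "N", "N"),
    ("BPSS0218", "P", "N", "N", "N"),
    ("BPSS0213", "P", "N", "N", "N"),
    ("BPSL3012a", "P", "N", "N", "N"),
    ("BPSL2748", "P", "N", "N", "N"),
    ("LolC", "I", "N", "Y", "N"),
    ("BPSL1732", "I", "Y", "Y", "N"),
    ("BPSL2406", "I", "Y", "Y", "N"),
]

# Number of proteins per predicted location.
by_location = Counter(row[1] for row in TABLE_1)

# Outer membrane proteins that BOMP also calls beta-barrels.
bomp_supported = [row[0] for row in TABLE_1 if row[1] == "O" and row[4] == "Y"]

# Proteins with Phobius transmembrane helices, and proteins placed in the inner membrane.
with_tm_helices = {row[0] for row in TABLE_1 if row[3] == "Y"}
inner_membrane = {row[0] for row in TABLE_1 if row[1] == "I"}

print(dict(by_location))
print("BOMP-supported outer membrane proteins:", bomp_supported)
print("Phobius helices only in inner membrane proteins:", with_tm_helices == inner_membrane)
```

The tally gives four outer membrane, six periplasmic and three inner membrane proteins, with BOMP supporting three of the four outer membrane assignments and Phobius helices found only in the three inner membrane proteins.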

Table 2. Proteins from 2D-gel rated by programs to get indication of suitability. Those proteins classified as cytoplasmic location by P-CLASSIFIER

| GenBank accession no. | Protein (a) | Function | P-CLASSIFIER | SignalP | Phobius | BOMP | Immunoreactive (b) |
|---|---|---|---|---|---|---|---|
| 53720836 | Tuf | Elongation factor Tu | C | N | N | N | Y |
| 53720307 | GroEL | Chaperonin | C | N | N | N | Y |
| 53718519 | SodB | Putative superoxide dismutase | C | N | N | N | N |
| 53719566 | ScoB | Succinyl-CoA:3-ketoacid-coenzyme A transferase subunit B | C | N | N | N | N |
| 53721865 | BPSS0839 | Hypothetical protein | C | N | N | N | Y |
| 53717828 | MreB | Putative rod shape-determining protein | C | N | N | N | N |
| 53720795 | RpoA | DNA-directed RNA polymerase alpha subunit | C | N | N | N | N |
| 53722739 | SdhA | Succinate dehydrogenase | C | N | N | N | N |
| 53718843 | Pnp | Polyribonucleotide nucleotidyltransferase | C | N | N | N | Y |
| 53720436 | DnaK | Molecular chaperone | C | N | N | N | Y |
| 53720551 | Fur | Ferric uptake regulator | C | N | N | N | N |
| 53719392 | BPSL1778 | Putative siderophore related non-ribosomal peptide synthase | C | N | N | N | N |
| 53721350 | BPSS0315 | ABC transport system ATP-binding protein | C | N | N | N | N |
| 53720465 | PhnG | Putative phosphonate metabolism PhnG protein | C | N | N | N | N |
| 53719707 | BPSL2096 | Putative hydroperoxide reductase | C | N | N | N | N |
| 53719908 | PhaP | Phasin-like protein | C | N | N | N | Y |
| 53723306 | BPSS2288 | HSP20/alpha crystallin family protein | C | N | N | N | N |
| 53719852 | CysS | Cysteinyl-tRNA synthetase | C | N | N | N | N |
| 53720869 | BPSL3259 | Putative plasmid conjugal transfer protein | C | N | N | N | N |
| 53720809 | RplE | 50S ribosomal protein L5 | C | N | N | N | N |
| 53719900 | BPSL2290 | Hypothetical protein | C | N | N | N | N |
| 53719067 | BPSL1431 | Putative esterase/lipase | C | N | N | N | N |
| 53717821 | BPSL0179 | 6-Pyruvoyl tetrahydropterin synthase | C | N | N | N | N |
| 53722408 | BPSL1382 | Endonuclease/exonuclease/phosphatase family protein | C | N | N | N | N |
| 53722620 | PilQ | Type IV pilus biosynthesis protein | C | N | N | N | N |
| 53721003 | AtpD | ATP synthase subunit B | C | N | N | N | Y |
| 53721005 | AtpA | ATP synthase subunit A | C | N | N | N | Y |

C: cytoplasmic; Y: yes; N: no.
(a) Proteins selected were from Harding et al. (2007). [8]
(b) Immunoreactivity was determined in Harding et al. (2007). [8]
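As a consistency check on Tables 1 and 2, the rows can be tallied by predicted location and immunoreactivity. The Python sketch below is illustrative only: the compact `name:location:immunoreactive` encoding and the variable names are assumptions made here, and the values are transcribed from the two tables (for DnaK, the immunoreactivity entry is read as Y).

```python
from collections import Counter

# "name:location:immunoreactive" for every row of Tables 1 and 2 (transcribed by hand)
ROWS = """
BPSS0879:O:Y BPSS1850:O:Y BPSS1679:O:Y BPSS1742:O:Y
Ssb:P:N BPSS0212:P:N BPSS0218:P:N BPSS0213:P:N BPSL3012a:P:N BPSL2748:P:N
LolC:I:Y BPSL1732:I:N BPSL2406:I:N
Tuf:C:Y GroEL:C:Y SodB:C:N ScoB:C:N BPSS0839:C:Y MreB:C:N RpoA:C:N SdhA:C:N
Pnp:C:Y DnaK:C:Y Fur:C:N BPSL1778:C:N BPSS0315:C:N PhnG:C:N BPSL2096:C:N
PhaP:C:Y BPSS2288:C:N CysS:C:N BPSL3259:C:N RplE:C:N BPSL2290:C:N
BPSL1431:C:N BPSL0179:C:N BPSL1382:C:N PilQ:C:N AtpD:C:Y AtpA:C:Y
""".split()

records = [tuple(item.split(":")) for item in ROWS]

# Proteins per predicted location across both tables.
location_counts = Counter(loc for _name, loc, _imm in records)

# Immunoreactive proteins per predicted location.
immuno_by_location = Counter(loc for _name, loc, imm in records if imm == "Y")

# LolC came from a separate study, so it is excluded from the proteomic-study count.
immuno_proteomic = [name for name, _loc, imm in records if imm == "Y" and name != "LolC"]

print(len(records), dict(location_counts))
print(dict(immuno_by_location))
print(len(immuno_proteomic))
```

On this transcription the test set contains 40 proteins, 27 of them cytoplasmic, and 12 proteins from the proteomic study are marked immunoreactive: the four outer membrane proteins and eight cytoplasmic proteins.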

4. Discussion

Bioinformatics methods that predict extracellular proteins in bacterial genomes are increasingly being used to identify potential candidates for vaccine development (reviewed by Davies and Flower, 2007 [6]). The process, known as reverse vaccinology, was first successfully used to discover proteins from Neisseria meningitidis that proved to be protective against infection by this organism, which causes meningococcal meningitis. [16] The method was subsequently extended to other bacterial pathogens (see Davies and Flower, 2007 [6]; Mora et al., 2006 [7]). In this study we have used a test set of proteins from B. pseudomallei to establish an initial bioinformatics-based strategy to identify candidate proteins or protein domains that have the potential to be used in the development of diagnostic reagents. The test set used in this study is principally derived from surface proteins identified using proteomics methods [8] and also includes the LolC protein, which has recently been shown to be a protective protein antigen against challenge with B. pseudomallei in a mouse model of the infection. [9] The overall strategy for this ‘reverse diagnostic’ approach is summarized in Figure 1.

Fig. 1. Schematic outline of steps in the reverse diagnostic analysis of potential Burkholderia pseudomallei diagnostic targets. The flow charts show (A) the general strategy and (B) the implementation in this study, illustrated with the outer membrane proteins identified at steps in the process.
(A) Strategy: starting set of sequences of potential target proteins; analyze sequences with P-CLASSIFIER, SignalP, Phobius and BOMP; further analyze most promising sequences with Phyre; final set of ‘best’ potential target proteins.
(B) Implementation: set of possible surface proteins identified in Harding et al., 2007 [8]; promising outer membrane protein targets: BPSS0879, BPSS1850, BPSS1679, BPSS1742; Phyre hit for BPSS1742 and Phyre hit for BPSS1850.

P-CLASSIFIER was used to predict the cellular location of each of the proteins in the test set. Tables 1 and 2 are sorted according to this output: first outer membrane proteins, then periplasmic and inner membrane proteins (Table 1), and finally cytoplasmic proteins (Table 2). This classification broadly agrees with the analysis done by Harding et al. [8] Proteins previously classified as ‘unknown’ with regard to cellular location could be classified here as either periplasmic or cytoplasmic. Only one protein, BPSL1732, a putative methyl-accepting chemotaxis citrate transducer, appears to have been misclassified in the prior study as a cytoplasmic protein rather than an inner membrane protein. Interestingly, BPSL1732 was predicted to contain a signal sequence, [8] but it is also predicted here to contain two transmembrane helices using the Phobius server. Further analysis using Phyre also identified fold features consistent with the structural organization of a bacterial chemotaxis receptor, which includes a large helical-bundle cytoplasmic domain used for signaling and a periplasmic domain used for sensing. [17]

The order in which the proteins are presented in Tables 1 and 2 represents an approximate ranking of which proteins should be considered most likely to be good targets of diagnostic reagents. Of course, this prioritization is only a prediction based on sequence motifs, and other criteria need to be taken into account in choosing the most promising targets. For example, the four proteins predicted to be located in the outer membrane were all predicted to have a signal sequence and to have beta-barrel folds. The three proteins predicted to be located in the inner membrane, including LolC, are all predicted to contain transmembrane helices using Phobius. Notably, all other proteins predicted to be either periplasmically or cytoplasmically located showed no beta-barrel or transmembrane structural features. Taken together, these predictions provide a more confident indication of cellular location but also need to be examined in the context of experimental data.

In this regard it is worth considering that the proteins in this test set, with the exception of LolC, were derived from proteomic studies that aimed to identify proteins associated with the surface of B. pseudomallei. Harding et al. [8] biotinylated proteins from the most hydrophobic fraction of the extraction process used for the proteomic analysis and then immunoblotted the resulting two-dimensional gel with a streptavidin-antibody conjugate. Interestingly, the proteins identified in that proteomic analysis included a substantial number that were predicted to be cytoplasmically located. It is possible that a number of these proteins were carried along in the extraction process because of their high abundance in the cytoplasm, but it is also possible that some may be associated with the surface (see Harding et al., 2007 [8]).
Furthermore, 12 proteins in that study were shown to be immunoreactive to human convalescent sera. Nine of these 12 proteins were also biotinylated, including all four outer membrane proteins. The remaining immunoreactive proteins are predicted to be cytoplasmically located but could still potentially be useful for serological diagnosis. These proteins may also prove useful in detection of B. pseudomallei, particularly if they are in high abundance and samples for testing include some lysed organisms.

It is also worth noting that LolC, which is predicted to be an inner membrane protein, would not necessarily be chosen as a good diagnostic or even a vaccine target according to its localization. The fact that it did prove to be a protective antigen [9] further emphasizes that one may need to assess potential targets beyond the outer membrane in order to obtain proteins with properties suitable for development of diagnostic reagents and vaccines.

Finally, we are hoping to use the information derived from this type of bioinformatic and literature analysis to select protein targets that we can produce recombinantly in a soluble form for the development of high-affinity molecular recognition molecules, such as aptamers [5] and antibodies, to improve detection methods for B. pseudomallei. Implementation of the current strategy (shown in Figure 1) suggests that a number of our best candidates are integral outer membrane beta-barrel proteins. Purification of these membrane proteins in sufficient quantities for development of aptamers and antibodies is technically demanding and may not yield enough folded material. This was also true of the LolC protein, which eventually proved to be a suitable vaccine candidate for protection from a melioidosis infection. In this case a large periplasmically located domain of LolC was identified from analyses of the protein sequence and subsequently expressed as a soluble protein. [9] Figure 1 shows two known beta-barrel structures, obtained using the Phyre server, that may be representative of the fold for two of these B. pseudomallei outer membrane proteins. Inspection of these images shows that, although they contain a substantial amount of integral beta-barrel structure, there is the potential to generate peptides or small domains that can be used to derive high-affinity aptamers or antibodies.
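The approximate ranking discussed in this section, in which outer membrane proteins come first, followed by periplasmic, inner membrane and cytoplasmic proteins, can be expressed as a simple sort. The Python sketch below is illustrative only: `LOCATION_RANK`, `rank_targets` and the use of immunoreactivity as a tie-breaker are assumptions introduced here and are not part of the published method. As the LolC example shows, such a ranking is a starting point rather than a final selection.

```python
# Location priority implied by the ordering of Tables 1 and 2 (lower sorts first).
LOCATION_RANK = {"O": 0, "P": 1, "I": 2, "C": 3}


def rank_targets(proteins):
    """Sort (name, location, immunoreactive) tuples by location priority.

    Within a location, immunoreactive proteins are placed first; this
    tie-breaker is an assumption made for illustration.
    """
    return sorted(proteins, key=lambda p: (LOCATION_RANK[p[1]], not p[2]))


# A few proteins taken from Tables 1 and 2, deliberately out of order.
candidates = [
    ("GroEL", "C", True),
    ("LolC", "I", True),
    ("Ssb", "P", False),
    ("BPSS1742", "O", True),
    ("SodB", "C", False),
]

for name, location, immunoreactive in rank_targets(candidates):
    print(name, location, "immunoreactive" if immunoreactive else "not immunoreactive")
```

Because Python's sort is stable, proteins that tie on both keys keep their input order, so any experimentally motivated ordering supplied by the user is preserved within each group.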

5. Conclusions

Selection of suitable targets for the development of new reagents for improving the diagnosis of melioidosis is challenging. Bioinformatic methods can help guide the selection of potential candidates by predicting cellular location, as well as functional and structural properties. When combined with experimental data such as expression levels, localization and immunoreactivity, the quality of the predictions that arise from these bioinformatic or reverse diagnostic methods may be improved. This combined bioinformatic and experimental strategy can ultimately lead to better target prioritization. Our future work in this area includes applying this strategy to the whole genomes of B. pseudomallei and related species such as B. mallei and B. thailandensis. Using this process we hope to select additional protein targets for the development of new diagnostic reagents to identify and discriminate between these organisms.

Authors’ contributions: KAB conceived and designed the study; DBT, KC and KAB analysed and interpreted the data; SVH, SJS and RWT provided input on target selection; GBK provided laboratory support; KAB obtained financial support; DBT and KAB prepared and revised the manuscript. All authors read and approved the final manuscript. KAB is guarantor of the paper.

Funding: This study was supported by the Welch Foundation (TI-3D grant from the University of Texas at Austin), and the Defence Science and Technology Laboratory, Porton Down, UK.

Conflicts of interest: None declared.

Ethics approval: Not required.

References

1. Cheng AC, Currie BJ. Melioidosis: epidemiology, pathophysiology, and management. Clin Diagn Lab Immunol 2005;5:225–9.
2. Dance DA, Wuthiekanun V, Naigowit P, White NJ. The antimicrobial susceptibility of Pseudomonas pseudomallei: emergence of resistance in vitro and during treatment. J Antimicrob Chemother 1989;24:295–309.
3. Jenney AW, Lum G, Fisher DA, Currie BJ. Antibiotic susceptibility of Burkholderia pseudomallei from tropical northern Australia and implications for therapy of melioidosis. Int J Antimicrob Agents 2001;17:109–13.
4. Peacock SJ. Melioidosis. Curr Opin Infect Dis 2006;19:421–8.
5. Gnanam AJ, Hall B, Shen X, Piasecki S, Vernados A, Galyov EE, et al. Development of aptamers specific for potential diagnostic targets in B. pseudomallei. Trans R Soc Trop Med Hyg 2008;102(Suppl 1):S55–7.
6. Davies MN, Flower DR. Harnessing bioinformatics to discover new vaccines. Drug Discov Today 2007;12:389–95.
7. Mora M, Donati C, Medini D, Covacci A, Rappuoli R. Microbial genomes and vaccine design: refinements to the classical reverse vaccinology approach. Curr Opin Microbiol 2006;9:532–6.
8. Harding SV, Sarkar-Tyson M, Smither SJ, Atkins TP, Oyston PC, Brown KA, et al. The identification of surface proteins of Burkholderia pseudomallei. Vaccine 2007;25:2664–72.
9. Harland DN, Chu K, Haque A, Nelson M, Walker NJ, Sarkar-Tyson M, et al. Identification of a LolC homologue in Burkholderia pseudomallei, a novel protective antigen for melioidosis. Infect Immun 2007;75:4173–80.
10. Wang J, Sung WK, Krishnan A, Li KB. Protein subcellular localization prediction for Gram-negative bacteria using amino acid subalphabets and a combination of multiple support vector machines. BMC Bioinformatics 2005;6:174.
11. Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 2004;340:783–95.
12. Nielsen H, Engelbrecht J, Brunak S, von Heijne G. A neural network for identification of prokaryote and eukaryote signal peptides and prediction of their cleavage sites. Int J Neural Syst 1997;8:581–99.
13. Käll L, Krogh A, Sonnhammer ELL. Advantages of combined transmembrane topology and signal peptide prediction: the Phobius web server. Nucleic Acids Res 2007;35:W429–32.
14. Berven FS, Flikka K, Jensen HB, Eidhammer I. BOMP: a program to predict integral β-barrel outer membrane proteins encoded within genomes of Gram-negative bacteria. Nucleic Acids Res 2004;32:W394–9.
15. Bennett-Lovsey RM, Herbert AD, Sternberg MJ, Kelley LA. Exploring the extremes of sequence/structure space with ensemble fold recognition in the program Phyre. Proteins 2007; PMID: 17876813.
16. Pizza M, Scarlato V, Masignani V, Giuliani MM, Arico B, Comanducci M, et al. Identification of vaccine candidates against serogroup B meningococcus by whole-genome sequencing. Science 2000;287:1816–20.
17. Stock J, Levit M. Signal transduction: hairbrains in bacterial chemotaxis. Curr Biol 2000;10:R11–4.