A novel DNA microarray design for accurate and straightforward identification of Escherichia coli safety and laboratory strains


ARTICLE IN PRESS

Systematic and Applied Microbiology 31 (2008) 50–61 www.elsevier.de/syapm

Andreas Peter Bauer, Wolfgang Ludwig, Karl-Heinz Schleifer
Lehrstuhl für Mikrobiologie, Technische Universität München, Am Hochanger 4, 85350 Freising, Germany
Received 21 November 2007

Abstract

Escherichia coli K-12, B, C and W strains and their derivatives are classified as risk group 1 organisms in biological safety guidelines because they are unable to colonise the human gut. Differentiation and identification of these safety strains is mainly based on pulsed-field gel electrophoresis (PFGE), phage sensitivity tests or PCR-based methods. However, these methods are either tedious and time consuming (phage sensitivity, PFGE) or based on single specific fragments (PCR) or patterns (PFGE), and so lack additional information for further differentiation of the strains. In the current study, subtractive hybridisation techniques were applied to detect specific DNA fragments, which were used to design a microarray (chip) for accurate and simple identification of these organisms and for differentiating them from other E. coli strains. The chip can be used to identify E. coli safety strains and to monitor them during ongoing experiments for changes in their genome and for culture purity. The hybridisation layout of the microarray was arranged in such a way that the respective lineages of safety strains could be easily identified as distinct letters (K, B, C or W). Differentiation of single strains or subtyping was possible with further probes. In addition, a set of probes targeting genes coding for common virulence factors was included, both to differentiate safety strains from pathogenic variants and to make sure that no transfer of these genes happens during handling or storage. The reliability of the approach was tested on a comprehensive selection of E. coli laboratory strains and pathogenic representatives.

© 2008 Elsevier GmbH. All rights reserved.

Keywords: Escherichia coli safety strains; Laboratory strains; Microarray; Identification

Corresponding author. Tel.: +49 8161 715456. E-mail address: [email protected] (A.P. Bauer).
0723-2020/$ - see front matter © 2008 Elsevier GmbH. All rights reserved. doi:10.1016/j.syapm.2008.01.001

Introduction

Escherichia coli safety strains are best known as workhorses [15] in molecular laboratories and companies worldwide. They are the main host strains for cloning or expression experiments in molecular research and are used as genetically engineered production strains for enzymes (e.g. restriction enzymes), amino acids [18,28], vitamins [17] or even chemicals [6,20]. As these strains are non-hazardous and easy to manipulate, they are classified as risk group 1 organisms in biological safety guidelines. The most popular representative of these strains is E. coli K-12 and its derivatives. In addition, derivatives of E. coli B, C and W strains are also specified in several statutes for genetic engineering and are commercially available.

E. coli K-12 was originally isolated from a convalescent diphtheria patient [5] in 1922 and became prominent through Lederberg and Tatum's investigations on its genetic and biochemical properties [26]. The W strain was originally isolated from the soil of a graveyard by Selman A. Waksman [9], although it is not known in which year. The B and C strains were first used for studying their phages (T-phages and ΦX174, respectively), but the authors did not mention where their host strains were originally isolated from [1,25]. Each lineage possesses different properties and is therefore used for different applications. E. coli K-12 strains are mainly used for cloning and sequencing, B/BL21 strains are the major protein expression strains, Stratagene's (La Jolla, USA) C-derivatives are able to deal with proteins that are toxic for other E. coli strains, and the W mutant Mach 1 from Invitrogen (Carlsbad, USA) is the fastest growing laboratory strain to date.

Despite the popularity of these organisms, accurate identification methods for single strains are still missing. Published PCR-based detection techniques are designed either for a single lineage [16,24] or for the differentiation of the four lineages [2] and, therefore, can neither identify single strains nor differentiate the commercially available strains from the original strains. Another drawback of PCR-based methods is the relatively limited number of genes that can be screened simultaneously. Pulsed-field gel electrophoresis (PFGE) analysis would be able to differentiate single strains (data not shown) and is proposed for the detection of K-12 strains [4] (http://www.lag-gentechnik.de/dokumente). However, this technique is time consuming and requires a reference database containing all possible patterns.

To overcome these problems, we developed a DNA microarray for accurate and straightforward identification of E. coli safety strains. DNA microarrays have been successfully employed for species identification [13,14] and virulence factor screening [7,8]. Specific DNA fragments were detected either by subtractive hybridisation techniques [27,29] or by in silico analysis of all available E. coli genomes and literature research.

Materials and methods

Bacterial strains and growth conditions

All investigated E. coli strains and their sources are listed in Table 1. Genomic DNA was purified as described previously [2].

Table 1. E. coli strains investigated in this study and their sources

K-12 strains: MG 1655, W 3110, XL1 Blue, DH5α, HB 101, TOP F′, TOP 10 pHis17 btubA, EN 99, BMH, WK6, 5K, C600, LE 392, J53, 678-54, DH1, E. coli 35, M15, JM 83, W3350, AN92, AN260, TH2, DSM492, DE, NovaBlue, M28
Sources of the K-12 strains, as listed: (1), (1), (1), (1), (1), (1), Pilhofer, unpublished, (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), Qiagen, Germany, (2), (2), (3), (3), Takara, Japan, (3), Novagen, Germany, (2)

B strains: B (DSM 613); B (1); B/r (DSM 500); Bs-1 (DSM 501); BL21 (1); BL21 pLys (Invitrogen, USA); C41 ([21]); C41 pHis17 btubA (Pilhofer, unpublished)

C strains: C (DSM 13127); ABLE C (Stratagene, USA); ABLE K (Stratagene, USA)

W strains: W (ATCC 9637); W-mutant (DSM 2607); Mach 1 pCR2.1 (Invitrogen, USA)

Pathogenic E. coli strains: EHEC O157:H7 EDL 933 (1); EAEC O42 (1); ETEC H10407 (1); EPEC E2348/69 (1); EIEC EDL1284 (1); UPEC 536 (1), J96 (1), CFT073 (3); MENEC IHE3034 (1); Sepsis RS218 (1)

Others: Nissle 1917 (Mutaflor) (3)

Uncharacterized field isolates: 1: Faeces, pork (4); 2: Faeces, cattle (4); 3: Organ, cattle (4); 4: Faeces, dog (4); 5: Faeces, cattle (4); 6: Milk, pork (4); 7: Faeces, cattle (4); 8: Faeces, pork (4); 9: Faeces, pork (4); 10: Faeces, pork (4)

(1) Dr. Ulrich Dobrindt, University of Wuerzburg. (2) Dr. Wolfgang Schwarz, TU Muenchen. (3) Dr. Sören Schubert, Pettenkofer Institut Muenchen. (4) Dr. Ulrich Busch, Bavarian Health and Food Safety Authority. ATCC: American Type Culture Collection. DSM: German collection of microorganisms and cell cultures.

Subtractive hybridisation

Subtractive hybridisation was performed according to a previously published protocol [29], with the modifications listed by Bauer et al. [2]. In parallel, the method of Wassill et al. [27] was applied with the following modifications: after hybridisation, the reaction mix was cleaned by ethanol precipitation [23] and the pellet was resuspended in 75 µl of 1× ASH buffer (from a 5× ASH stock containing 250 mM Hepes, 2.5 mM NaCl and 1 mM EDTA, pH 8.3). This solution was mixed with 100 µl of MagneSphere magnetic particles (Promega GmbH, Germany), prewashed according to the manufacturer's manual, and incubated for 1 h at 4 °C in the dark. Biotinylated DNA bound to the particles was removed in the appropriate magnetic separation stand. In all, 1 µl of flow-through was used for the PCR, using primer P1 with the reaction mixture and the cycling programme described by Zwirglmaier et al. [29]. Both methods successfully produced DNA fragments specific for the investigated strain, but the efficiency of the modified method of Wassill et al. [27] was slightly higher.

All PCR fragments produced by subtractive hybridisation were cloned using the TOPO TA Cloning kit (Invitrogen, USA). Inserts of randomly picked clones were sequenced using the labelled primers M13for and M13rev in a Licor 4200 IR2 system. Sequence analysis was performed as described previously [2].

Oligonucleotide primers

All oligonucleotide primers used were synthesised by MWG Biotech AG (Ebersberg, Germany). Primer sequences, annealing temperatures, amplicon lengths and references (if previously published) are listed in Table S2 as supplementary material.

Construction of the array

DNA probes were generated with the primers listed in Table S2. PCR was performed in a total volume of 50 µl with 0.25 mM of each deoxyribonucleotide, 1× ExTaq buffer (Takara Shuzo Co., Otsu, Japan), 1.25 U ExTaq (Takara), 0.5 µM of each primer, 38.25 µl ultrapure H2O and 10 ng (1 µl) of the appropriate template DNA (see Table S2). Cycling conditions were as follows: initial denaturation at 94 °C for 3 min, followed by 35 cycles of 94 °C for 30 s, x °C for 30 s (see Table S2 for the TM of each oligonucleotide primer) and 72 °C for 0.5–1.5 min (depending on the length of the amplicon, see Table S2) in an Eppendorf epgradient S thermocycler (Eppendorf, Germany). The fragments were checked by agarose gel electrophoresis to verify their size and quantity. Specific PCR products were cleaned using the AccuPrep PCR Purification Kit (Bioneer, UK) and evaporated to dryness using a vacuum manifold. The pellets were resuspended in 50 µl of 50% DMSO, transferred into a 384-well microplate (Greiner Bio-one GmbH, Germany) and stored at −20 °C until used. The concentration of PCR products was between 100 and 200 ng/µl.

Corning GAPSII coated slides were spotted using a GMS 417 arrayer (Genetic MicroSystems Inc., USA) at 45% humidity and 20 °C. Each fragment was spotted with one replicate. After the spotting process, slides were held over a bath of boiling purified water for 5 s and subsequently placed on a hot plate (80 °C) for 2 s. Afterwards, the slides were stored at 20 °C in the dark until used.
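The PCR composition above can be checked with a short calculation. The following is a minimal sketch, reading the volumes as microlitres and the primer concentration as 0.5 µM; the final concentrations and the fixed water and template volumes come from the text, while every stock concentration (10× buffer, 2.5 mM dNTPs, 100 µM primers, 5 U/µl polymerase) is an assumption chosen for illustration and not stated by the authors.

```python
# Final concentrations and the water/template volumes come from the protocol text.
# All stock concentrations are ASSUMPTIONS made only for this illustration.
TOTAL_UL = 50.0

def volume_ul(final_conc, stock_conc, total_ul=TOTAL_UL):
    """Solve C1 * V1 = C2 * V2 for the stock volume V1 (same units for C1 and C2)."""
    return final_conc * total_ul / stock_conc

mix = {
    "ExTaq buffer (assumed 10x stock)": volume_ul(1.0, 10.0),
    "dNTPs (assumed 2.5 mM each stock)": volume_ul(0.25, 2.5),
    "forward primer (assumed 100 uM stock)": volume_ul(0.5, 100.0),
    "reverse primer (assumed 100 uM stock)": volume_ul(0.5, 100.0),
    "ExTaq (assumed 5 U/ul stock)": 1.25 / 5.0,
    "template DNA": 1.0,        # 10 ng in 1 ul, from the text
    "ultrapure water": 38.25,   # from the text
}

total = sum(mix.values())
for component, vol in mix.items():
    print(f"{component:40s} {vol:6.2f} ul")
print(f"{'total':40s} {total:6.2f} ul")
```

Under these assumed stocks the components add up to the stated 50 µl, which suggests the listed water volume is consistent with the final concentrations given.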

DNA labelling

The labelling reaction was based on a slightly modified protocol from the University of Wisconsin (www.genome.wisc.edu/resources/protocols/genomiclabeling.htm). In all, 5 µg of genomic DNA were digested with Bsp143I (Fermentas, Germany) and the reaction mix was purified with the AccuPrep PCR Purification Kit (Bioneer, UK), according to the manufacturers' protocols. Two microlitres of random hexamers (5 µg/µl) (GE Healthcare, USA) were added to the purified DNA (30 µl) and the mixture was incubated at 94 °C for 10 min and immediately cooled to 4 °C. The reaction mixture was completed by addition of 5 µl dAGC (5 mM), 5 µl buffer 2 (New England Biolabs GmbH, Germany), 1 µl [50 U] Klenow exo- (New England Biolabs), 1 µl Cy3/Cy5 dUTP (GE Healthcare, USA) and 7 µl ultrapure H2O. The labelling mixture was incubated at 37 °C for 3–16 h and purified by ethanol precipitation, and the pellet was dissolved in 10 µl of ultrapure H2O. Labelled DNA was stored at −20 °C in the dark until further use. One labelling reaction was used for 4–5 hybridisations.

Array hybridisation and washing

Slides were prehybridised for 10 min at 42 °C in prehybridisation solution (5× SSC (20× SSC stock containing 3 M NaCl and 0.3 M sodium citrate, pH 7), 0.1 mg/ml BSA and 0.1% SDS), then rinsed twice in 1× SSC and once in ultrapure H2O. Hybridisation was performed immediately after drying. Hybridisation buffer consisted of 0.5× PerfectHyb buffer (Sigma, Germany), 25% formamide and 500 ng/µl sheared salmon sperm DNA (Roche, Germany). In all, 2 µl of each labelled DNA (Cy3/Cy5) were added to 25 µl of hybridisation buffer, and the mixture was denatured for 10 min at 94 °C and immediately cooled to 4 °C. The solution was applied to a coverslip (24 × 50 mm); the array was placed on the coverslip and incubated in a hybridisation chamber (array side up, chamber moistened with 20 µl of ultrapure H2O in each corner) at 42 °C for 16 h in a hybridisation oven.

Array washing was performed using the AdvaWash slide washing station (Implen, Germany), starting with 2× SSC/0.1% SDS for 2.5 min at 50 °C (twice), followed by 1× SSC/0.1% SDS for 10 min at room temperature with pulsed fluid supply, 1× SSC for 5 min at room temperature with pulsed fluid supply and finally 0.05× SSC for 30 s. Slides were dried in a mini-centrifuge (Slide spinner, Roth, Germany) and scanned immediately using a GMS 418 array scanner with 100% laser power and 60–80% laser gain at the appropriate wavelength. Slides were stored at 20 °C in the dark for further analysis.
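The salt concentration behind each SSC strength in the prehybridisation and wash series follows directly from the 20× stock composition given above. The sketch below is only illustrative arithmetic; the function name and the 100 ml working volume are hypothetical, while the stock composition and the SSC strengths are taken from the text.

```python
# 20x SSC stock composition (mol/l), as stated in the protocol text.
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}

def ssc_molarity(fold):
    """Molar composition of an n-fold SSC solution prepared from the 20x stock."""
    return {salt: conc * fold / 20.0 for salt, conc in SSC_20X.items()}

# SSC strengths used for prehybridisation (5x) and the successive washes.
for fold in (5, 2, 1, 0.05):
    comp = ssc_molarity(fold)
    print(f"{fold:>5}x SSC: {comp['NaCl']:.4f} M NaCl, "
          f"{comp['sodium citrate']:.5f} M sodium citrate")

# Hypothetical example: stock needed for 100 ml of 5x SSC.
stock_ml = 100.0 * 5 / 20.0
print(f"{stock_ml:.1f} ml of 20x stock per 100 ml of 5x SSC")
```

The falling salt concentration from 2× to 0.05× SSC is what makes the later washes progressively more stringent.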

Image and data analysis

Imagene 4.0 (Biodiscovery, Inc., USA) software was used to generate false colour images of Cy3 and Cy5 signals and to quantify the arrays. Signal-to-noise ratios were calculated and the signals normalised according to the formula developed by Loy et al. [19], with 50% DMSO as the negative control and a 16S rDNA fragment as the positive control. All signals above a value of 0.1 were considered positive.
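The positive/negative call described above can be sketched as follows. This is not the normalisation formula of Loy et al. [19], which is not reproduced in this section; the sketch assumes the values have already been normalised and only applies the stated cut-off of 0.1. The function names and the example values are hypothetical.

```python
THRESHOLD = 0.1  # cut-off stated in the text: signals above 0.1 count as positive

def call_spots(normalised):
    """Map each spot name to True (positive) or False (negative)."""
    return {spot: value > THRESHOLD for spot, value in normalised.items()}

def controls_ok(calls, positive="16S rDNA", negative="DMSO"):
    """An array is only interpretable if the controls behave as expected."""
    return calls.get(positive, False) and not calls.get(negative, True)

# Hypothetical, already-normalised signals for four spots.
example = {"16S rDNA": 0.92, "DMSO": 0.01, "IS1": 0.45, "stx": 0.03}
calls = call_spots(example)
print(calls)
print("controls ok:", controls_ok(calls))
```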


Results

Development of the array

Subtractive hybridisation and literature research revealed a set of 125 specific DNA fragments useful for the differentiation and identification of E. coli safety strains by PCR (data not shown). The whole set of PCR products was spotted on a prototype array and hybridised with genomic DNA from a small selection of strains (two of each lineage (K-12, B, C and W) and two pathogenic strains) to check the specificity under different stringency settings and hybridisation buffer compositions. The best results were achieved with hybridisation at 42 °C in 25% formamide and a high-stringency washing step at 50 °C. All probes producing unspecific or no hybridisation signals were removed from the array layout. Out of the remaining probes, 92 were selected to design the low-density array presented in this study (Table 2). Careful probe selection and adjustment of hybridisation conditions permitted straightforward interpretation of hybridisation patterns without complicated data analysis and quantification.

To allow simple and accurate identification, the hybridisation patterns were arranged to generate the letter of the respective lineage (K, B, C or W) on the array (Fig. 1). The visible letters on the array were based on lineage- or strain-specific DNA fragments. Each letter contained, as a first spot, a 16S rDNA fragment representing the positive control. In addition, an ATP-dependent protease (lon) and type I fimbriae (fimH) served as further positive controls and were each printed three times (see Fig. 1A and Table 2). Type I fimbriae were originally described as an epidemiological marker associated with extraintestinal infections [10]; however, the use of this gene as a marker for virulence should be reconsidered due to its almost ubiquitous appearance in E. coli (except E. coli K-12 TOP F′, K-12 TOP10, K-12 678-54 and IHE3034).

The remaining spots of each letter consisted of PCR products specific to either one lineage or strain or to several lineages or strains. Unpredicted hybridisation signals from individual probes will inevitably occur when a large selection of different E. coli strains is tested, so the combination of probes was crucial. Reliable identification of a safety strain required the hybridisation of all DNA fragments belonging to the respective letter (see Fig. 1A–E). Hybridisation results for all tested organisms are shown in Table S1 as supplementary material.
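The all-spots-required rule can be written down compactly. The sketch below is an illustration and not the authors' analysis software: the grouping of grid locations into letters is read from the blocks of Table 2 (10 spots for K, 9 for C, 13 for B, 11 for W) and should be treated as an assumption, and the function name is hypothetical.

```python
# Grid locations per letter, as read from Table 2 (assumed grouping).
LETTER_SPOTS = {
    "K": {"A1", "B1", "C1", "D1", "E1", "B2", "C2", "A3", "D3", "E4"},
    "C": {"A7", "B7", "C7", "D7", "E7", "A8", "E8", "A9", "E9"},
    "B": {"A11", "B11", "C11", "D11", "E11", "A12", "C12", "E12",
          "A13", "B13", "C13", "D13", "E13"},
    "W": {"A17", "B17", "C17", "D17", "E17", "D18",
          "A19", "B19", "C19", "D19", "E19"},
}

def complete_letters(positive_spots):
    """Return the letters for which every spot is positive.

    A partly lit letter does not count: an empty result corresponds to the
    diffuse pattern described for pathogenic and wild-type strains.
    """
    positive = set(positive_spots)
    return [letter for letter, spots in LETTER_SPOTS.items() if spots <= positive]

# A full K pattern plus one extra subtyping spot is still called as K.
print(complete_letters(LETTER_SPOTS["K"] | {"G1"}))
# One missing spot is enough to reject the letter.
print(complete_letters(LETTER_SPOTS["K"] - {"E4"}))
```

Requiring the complete set makes the call robust against single cross-hybridising probes, at the cost of rejecting a strain if any one letter spot fails.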

Table 2. The arrangement of DNA fragments on the microarray

GL(a) | Short name | BLAST result | Probe length (bp) | Accession number
A1 | 16S rDNA | 16S rRNA gene | 1465 | NC_000913
B1 | IS1 | Insertion sequence IS1 | 668 | U91745.1
C1 | IS2 | Insertion sequence IS2 | 1240 | 54398625
D1 | IS3 | Insertion sequence IS3 | 250 | M55511
E1 | gltF | gltBDF operon gltF gene | 705 | M74162
B2 | IS5 | Insertion sequence IS5 | 1146 | J01734
C2 | ISL | Insertion sequence:IS5I in wbbL gene | 928 | U00096
A3 | yi83 | Insertion sequence element IS186 | 1076 | X03123.1
D3 | IS4 | Insertion sequence IS4 | 1272 | J01733
E4 | IS150 | Insertion element IS150 | 1369 | X07037
A7 | 16S rDNA | 16S rRNA gene | 1465 | NC_000913
B7 | W824 | Insertion sequence:IS1222 | 577 | EU250022
C7 | W826 | Putative dicarboxylate-binding periplasmic protein | 922 | EU250023
D7 | pcoD | Copper resistance protein | 456 | DQ517526
E7 | rtlD | Ribitol dehydrogenase, ribitol kinase | 715 | AY005817
A8 | PRP | Putative transcriptional regulator | 355 | EU250024
E8 | 310706 | Hypothetical protein | 289 | EU250025
A9 | CRT | Reverse transcriptase like protein | 917 | D37918
E9 | NIS | Truncated transposase | 215 | EU250026
A11 | 16S | 16S rRNA gene | 1465 | NC_000913
B11 | 893HP | Intergenic region | 210 | EU250027?
C11 | EPI | Putative MFS superfamily hexuronate transporter; similar to c4495 from E. coli CFT073 | 758 | EU250028
D11 | TYPII | Hypothetical type II secretion protein | 174 | EU250029
E11 | REVTRA | Transposon:retron EC86 | 340 | EU250030
A12 | vioA | Synthesis of dTDP-4-amino-4,6-dideoxyglucose | 969 | AF125322
C12 | 21_1 | Hypothetical protein | 208 | EU250031
E12 | CABC | Putative ATP binding protein of ABC transporter | 411 | EU250032
A13 | 224 | Strain B- and derivatives-specific genomic sequence | 232 | EF121002
B13 | GALA | Putative 2-keto-3-deoxygalactokinase | 312 | EU250033
C13 | REPRESS | Repressor protein | 295 | EU250034
D13 | HEL | Helicase-related protein | 512 | EU250035
E13 | 914SPEC | Putative fimbrial protein | 174 | EU250036
H13 | maoA | Copper amine oxidase | 475 | L47571
A17 | 16S rDNA | 16S rRNA gene | 1465 | NC_000913
B17 | PAI | Phosphopantetheine-binding, NAD-dependent epimerase/dehydratase | 757 | NZ_AAWW01000001
C17 | B1134 | Putative adhesin | 1185 | AE005174
D17 | agaF | pts dependent N-acetyl-galactosamine- and galactosamine IIA component | 382 | AF228498
E17 | Q | Bacteriophage protein Q | 529 | EU250037
D18 | P27 | Phage-related tail fiber protein gene | 804 | EF121000
A19 | 2.2 | Hypothetical protein | 366 | EU250038
B19 | T3443 | Putative bacteriophage tail protein | 342 | EU250039
C19 | SAMP5 | Unknown DNA fragment, no similarity | 607 | EU250040
D19 | FLAG02 | Lateral flagellar hook protein (FlgE-like) | 445 | EU250041
E19 | pac | pac gene for penicillin G acylase | 2556 | X04114
G1 | gpt | Guanine-hypoxanthine phosphoribosyltransferase | 368 | U00096
H1 | argF | CP4-6 prophage; ornithine carbamoyltransferase 2, chain F | 244 | U00096
G2 | hsds | Specificity gene of EcoK restriction enzyme | 1324 | V00288
H2 | glf | Galf synthesis pathway protein | 1012 | ECU09876
G3 | araA | L-arabinose isomerase | 660 | U00096
H3 | yffs | CPZ-55 prophage; predicted protein | 918 | NP_416945
G4 | mcra | Methyl cytosine restriction enzyme | 749 | Z19104
H4 | int | CPZ-55 prophage; predicted integrase | 1330 | U00096
G5 | gplit | lit gene encoding a bacteriophage T4 late gene expression blocking protein (gplit) | 846 | M19634
H5 | tn10 | Transposon Tn10 | 3072 | AY528506
G7 | hpaB | Component B of the 4HPA-hydroxylase | 297 | Z37980
H7 | OXIDO | Putative oxidoreductase | 776 | EU250042
G8 | hpaD | Homoprotocatechuate dioxygenase | 857 | Z37980
H8 | ACOA | Putative acetyl-CoA:acetoacetyl-CoA transferase | 268 | EU250043
G9 | fimH | Minor component of type 1 fimbriae | 1373 | NC_000913
H9 | LON | ATP-dependent protease | 1445 | L20572
G11 | GS8 | Putative membrane protein precursor, host specificity protein | 847 | EU250044
G12 | EPI2 | Similar to hypothetical protein c3665 from E. coli CFT073 | 261 | EU250045
G13 | PUINV | Putative invasin | 492 | EU250046
G17 | 21_9 | Hypothetical protein | 210 | EU250047
H17 | 1306 | Similar to hypothetical protein VV1862 (Vibrio vulnificus YJ016) | 676 | EU250048
G18 | 1310 | Hypothetical protein | 158 | EU250049
H18 | prk2 | Cryptic plasmid pRK2 | 740 | AY639886
G19 | fimH | Minor component of type 1 fimbriae | 1373 | NC_000913
H19 | LON | ATP-dependent protease | 1445 | L20572
A21 | 16S rDNA | 16S rRNA gene | 1465 | NC_000913
A22 | pks left | Polyketide biosynthesis gene cluster, bacteriophage integrase, hypothetical protein | 1785 | AM229678
A23 | pks right | Polyketide biosynthesis gene cluster, IS1400 transposase A+B | 1373 | AM229678
A24 | pks ORF 6 | Polyketide biosynthesis gene cluster, putative polyketide synthase | 2265 | AM229678
A25 | pks ORF 17 | Colibactin polyketide biosynthesis gene cluster, putative hybrid polyketide-non-ribosomal peptide synthetase | 2382 | AM229678
A28 | PAI III | Hemin receptor precursor | 639 | AJ586887
A29 | fyua | Similar to FyuA precursor (Yersinia pestis) | 609 | AE014075
A30 | iroN | Siderophore receptor IroN | 2001 | AJ586887
A31 | sfa | F1C periplasmic chaperone | 359 | AJ586887
A32 | papA | PapA protein | 3140 | NC_004431
B21 | sat | Secreted autotransporter toxin (sat) gene | 565 | AF289092
B22 | ompT | Outer membrane protein 3b (a) | 702 | NC_000913
B23 | aerJ | Ferric aerobactin receptor precursor IutA | 260 | AJ586888
B24 | tir | Hypothetical protein, intimin receptor | 341 | NC_004431
B25 | bma | M-agglutinin subunit (bmaE) | 462 | M15677
B26 | kpsII | Polysialic acid transport protein KpsM protein | 227 | AJ586888
B27 | kpsIII | Capsule transport protein KpsT | 347 | AF007777
B29 | focG | F1C minor fimbrial subunit protein G precursor | 321 | AJ586887
B30 | cnf | cnf1 gene for cytotoxic necrotizing factor 1 | 450 | X70670
C22 | hlya | Plasmid-DNA for EHEC-hemolysin operon | 1508 | X86087
C23 | stx | Shiga toxin A-subunit | 307 | AJ251325
C31 | traT | Plasmid F genomic DNA | 247 | AP001918
C32 | eagg | Plasmid pCVD 432 DNA | 589 | X81423
D21 | plys | Plasmid pLysS genomic DNA | 410 | www.embl-hamburg.de
D22 | traE | Plasmid F genomic DNA | 504 | AP001918
D23 | crm | Plasmid pLysS genomic DNA | 600 | www.embl-hamburg.de
D24 | bla | Plasmid pCR 2.1, Invitrogen | 801 | www.invitrogen.com
E21 | btubA | Prosthecobacter dejongeii BtubA | 1376 | AY186783
E22 | DMSO | Negative control | – | –
D32 | fimH | Minor component of type 1 fimbriae | 1373 | NC_000913
E32 | LON | ATP-dependent protease | 1445 | L20572

(a) Gene location on the microarray according to Fig. 1(A).

Fig. 1. Setup of the array and different hybridisation patterns. Panels A–F display the setup of the chip as well as different hybridisation patterns for several E. coli safety strains and a pathogenic representative. Differences between the strains are framed.
(A): Schematic drawing of the array pattern. Each spotted PCR product is listed in Table 2 according to the numbering of the drawing. Black spots are positive controls. Grey spots indicate the patterns of the safety strain letters K, B, C and W. White spots are additional spots for subtyping of strains, virulence factors (A22–A32, B21–B30, and C22/23/31/32) and plasmid-related genes (D21–24). Negative controls are located in fields E21–22.
(B1): One representative K-12 pattern (E. coli K-12 MG1655). The "K"-spots all show positive signals in all tested K-12 strains, and the frame below the K (G1–5, H1–5) can be used for subtyping of different K-12 strains.
(B2): K-12 pattern (E. coli K-12 WK6). The "K"-spots again all show positive signals, and framed spots show differences in K-12 strains. In addition, the strain contains the genes coding for resistance to ampicillin and chloramphenicol (D22–24), as well as the F-plasmid (D21, C31).
(C1): C-pattern (E. coli C). All "C"-spots show positive signals, and framed spots below the C-pattern are specific to C and B strains (G7–8, H7–8), whereas spots in the right frame are designated as non-K-12 specific (BCDE17).
(C2): C-pattern (E. coli ABLE C, Stratagene). Again, all "C"-spots show positive signals. Differentiation is possible by the presence of different IS elements (C1, B2, C2 and H5), as well as the presence of the F-plasmid (D21, C31).
(D1): B-pattern (E. coli B). All "B"-spots show positive signals. The frame below the B-pattern is specific to E. coli B (G11–G13). The right frame shows the non-K-12 specific spots (BCDE17).
(D2): B-pattern (E. coli BL21pLysS). All "B"-spots show positive signals. The E. coli B-specific frame is negative for BL21 strains. The presence of the pLysS plasmid is detected by two spots in the virulence factor section (D21, D24) and, additionally, BL21 strains are ompT (B21) negative, whereas B strains are positive.
(E1): W-pattern (E. coli W). All "W"-spots show positive signals. Spots in the frame below the W-pattern are specific to strains ATCC 9637 and DSM 2607. Additionally, the two spots containing fragments of the 4-hydroxyphenylacetate catabolic pathway operon (G7–8) are present in these two strains, as well as in all B and C strains.
(E2): W-pattern (E. coli Mach 1, Invitrogen). All "W"-spots show positive signals. Differentiation of W strains is possible by the presence of IS elements (C1, B2, C2 and H5). The strain was transformed with the TOPO vector (pCR2.1), which is detected by the ampicillin resistance gene (D23).
(F): Diffuse pattern (E. coli J96). All pathogenic or wild-type E. coli strains show diffuse hybridisation signals all over the chip. None of the letters is complete. In addition, several of the spotted virulence factors show positive signals for this strain (right frame).

Identification of K-12 strains

The 10-spot "K"-pattern on the chip was mainly based on different insertion elements, namely IS1 to IS5, IS50, IS150 and IS186 (yi81). Additionally, a PCR product of UDP-galactopyranose mutase (glf) completed the K. This set of PCR products allowed the accurate identification of all 27 tested K-12 derivatives. In addition, several PCR products for subtyping E. coli K-12 strains are present on the array (gpt, hsdS, ara, mcrA, gplit, arg, glf, yffs, int and tn10). This selection was mainly based on the genotypes of different K-12 strains. Only completely deleted genes were chosen for subtyping, so that silent mutations could not cause false-positive hybridisation signals. To complete the screening, plasmid-specific probes were included (plysS, traE, cam, bla). TraE, previously described as a virulence gene associated with extraintestinal infections [3], is present on the F-plasmid, and the use of this gene as a marker for extraintestinal virulence should therefore be reconsidered. All probes were derived from the available genomes of different E. coli strains and their specificity was tested by PCR amplification. Interestingly, two K-12 strains (K-12 678-54 and K-12 TOP10) were fimH negative, whereas all other tested strains (except some uncharacterised isolates) were positive.

Identification of B strains

Thirteen E. coli B-specific PCR products were arranged on the chip to generate a "B"-pattern. All of these DNA fragments were detected by subtractive hybridisation against K-12 strains. The set contained genes for metabolism (GALA, vioA), DNA processing (HEL), regulation (REPRESS), and transport and secretion systems (CABC, TYPII), as well as hypothetical proteins (893HP, 21_1), mobile genetic elements (REVTRA) and so far unknown fragments (224, 914SPEC) with no similarities to any sequences in the databases. These 13 DNA fragments allowed the identification of all seven tested B strains and derivatives. For differentiation of B and BL21 strains, three additional DNA fragments, which are present in all B strains but not in BL21 strains, were printed below the "B"-pattern. These DNA fragments were specific for a bacteriophage protein (GS8), a hypothetical protein (EPI2) and a putative invasin gene (PUINV). Again, the presence of plasmids could be verified using the PCR products mentioned in the K-12 section (e.g. identification of BL21 pLysS; see Fig. 1D). In addition, the B strains were ompT positive, whereas BL21 strains and their derivatives were negative.
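The B versus BL21 distinction described above amounts to a small decision rule. The following sketch assumes the "B" letter has already been found complete and works on probe short names; the function name and the "undetermined" fallback for mixed results are hypothetical additions, not part of the published procedure.

```python
# Probes printed below the "B"-pattern: present in B strains, absent in BL21.
B_ONLY = {"GS8", "EPI2", "PUINV"}

def subtype_b_lineage(positive_probes):
    """Distinguish B from BL21 once the 'B' letter itself is complete."""
    positive = set(positive_probes)
    all_b_only = B_ONLY <= positive          # every B-only probe is positive
    none_b_only = not (B_ONLY & positive)    # no B-only probe is positive
    has_ompT = "ompT" in positive
    if all_b_only and has_ompT:
        return "B"
    if none_b_only and not has_ompT:
        return "BL21"
    return "undetermined"  # mixed result: hypothetical fallback category

print(subtype_b_lineage({"GS8", "EPI2", "PUINV", "ompT", "fimH"}))
print(subtype_b_lineage({"fimH", "plys"}))
```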

Identification of C strains

Nine C strain-specific DNA fragments were arranged on the chip to form a "C". Again, all these fragments were detected by subtractive hybridisation against K-12 strains and their specificity was evaluated by PCR. The DNA fragments consisted of mobile genetic elements (NIS, W824), a hypothetical protein (310706), and metabolic or structural genes (CRT, W826, pcoD, rtl, PRP). Interestingly, some of these genes (pcoD, PRP) are present on a virulence plasmid of the avian pathogenic E. coli strain APEC O1 [11]. The C strains were screened by PCR for further fragments of this virulence plasmid, using the primers described by Johnson et al. [12], but they were negative (data not shown). It can be assumed that some parts of the plasmid are integrated into the chromosome of E. coli C strains. For differentiation of the original C strain and its derivatives (ABLE C and ABLE K, Stratagene, USA), IS elements present in the "K"-pattern and the occurrence of the F-plasmid in the ABLE strains could be used (see Fig. 1C). Additionally, a set of genes present in all C and B strains was arranged below the "C"-pattern. It consisted of two genes of the 4-hydroxyphenylacetate catabolic pathway (hpaB/hpaD), an oxidoreductase and an acetyl-CoA-transferase.

Identification of W strains

The "W"-pattern comprised 11 DNA fragments, which were detected by subtractive hybridisation. The left part of the "W" consisted of four non-K-12 specific fragments, two of which were metabolic enzymes (PAI, aga), one a putative adhesin (B1134) and one a bacteriophage protein (Q). These DNA fragments were positive for all B, C and W strains (see Fig. 1C–E). To complete the "W"-pattern, fragments specific for W and its derivatives were spotted; these coded for putative bacteriophage proteins (T3443, SAMP5), a flagellar hook protein (FLAG02), a hypothetical protein (2.2) and the penicillin acylase (pac). Differentiation of the W strains from their commercially available derivative Mach 1 (Invitrogen) is possible by the presence of different IS elements (in the "K"-pattern, Mach 1 positive) and by the absence of four additional spots located below the "W"-pattern, which contain hypothetical proteins (21_9, 1306 and 1310) and a part of the W-specific plasmid pRK2 (Mach 1 negative; see Fig. 1E). Additionally, the two genes of the 4-hydroxyphenylacetate catabolic pathway (hpaB/hpaD) are present in the W strains ATCC 9637 and DSM 2607 but not in the commercial derivative Mach 1 (Invitrogen).

Separation of pathogenic E. coli strains and detection of foreign DNA

A set of prevalent virulence factors was included to separate the safety strains from pathogenic variants, as well as to screen the strains for uptake of these genes. The set consisted of genes coding for different types of fimbriae (papA, sfa, focG), siderophores (iron, fyuA), parts of the colibactin polyketide biosynthesis gene cluster (pks-left, pks-right, pks ORF6, pksORF17) [22], capsule proteins (kpsII, kpsIII), toxin genes (hlyA, cnf, sa, stx), and different receptor types (aerJ, PAI III, tir). Hybridisation characteristics of these fragments were tested with genomic DNA from a small but broad-range selection of pathogenic strains (see Table 1). All of these samples were negative in generating hybridisation signals forming a complete letter specific for E. coli safety strains (see Table S1 in supplementary material). The plasmid-associated fragments already mentioned in the K-12 section could also provide information about the uptake of foreign DNA, based on the presence of plasmid-specific fragments (pLysS, traE, traT) or plasmid encoded antibiotic resistance genes (bla, cam).
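The monitoring idea described above can be sketched in a few lines of Python. This is only an illustration and not part of the published protocol: the function name `flag_unexpected`, the grouping in `MONITOR_PROBES` and the notion of a per-strain `baseline` are assumptions made for the example, and only a subset of the probes named in the text is listed. The input is assumed to be the set of probe names already scored as positive after hybridisation.

```python
from typing import AbstractSet, Dict, List, Set

# Subset of the monitoring probes named in the text, grouped by type.
# The grouping and the selection are illustrative assumptions.
MONITOR_PROBES: Dict[str, Set[str]] = {
    "fimbriae": {"papA", "sfa", "focG"},
    "capsule": {"kpsII", "kpsIII"},
    "plasmid": {"pLysS", "traE", "traT"},
    "antibiotic resistance": {"bla", "cam"},
}


def flag_unexpected(
    positive: AbstractSet[str],
    baseline: AbstractSet[str] = frozenset(),
) -> Dict[str, List[str]]:
    """Return monitoring probes that are positive but not expected.

    positive: probe names scored as positive on the chip.
    baseline: probes known to be positive in the reference stock
              (hypothetical input, e.g. F-plasmid probes in a strain
              that is known to carry the F-plasmid).
    """
    hits: Dict[str, List[str]] = {}
    for group, probes in MONITOR_PROBES.items():
        found = sorted((probes & positive) - baseline)
        if found:
            hits[group] = found
    return hits
```

An empty result means that no monitored probe deviates from the reference stock; any entry would prompt a closer look at the culture for contamination or uptake of foreign DNA.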

Discussion

We describe a simple and cost-effective PCR-based DNA microarray for the identification of E. coli safety and laboratory strains. The designed array allows an accurate and straightforward detection of E. coli safety and laboratory strains by merely looking at the hybridisation pattern. It has to be stressed that only a hybridisation pattern comprising all spots present in one letter (K, B, C or W) is sufficient for the identification of a safety strain lineage. Furthermore, the presence or absence of some additional spots allows single strain identification in the majority of cases (see Table S1 in supplementary material). The addition of DNA fragments hybridising to genes coding for common virulence factors is important for monitoring the uptake of foreign DNA. This new method can be used to control the reference stocks of institutes and companies, and to monitor safety and laboratory strains during long-term experiments. It has been evaluated by hybridising a selection of 40 different E. coli safety strains, 10 pathogenic variants, 10 uncharacterised field isolates and the probiotic strain E. coli Nissle 1917 (Mutaflor, ArdeyPharm, Germany). Although the safety strains were obtained from various sources, they were all correctly identified, which underlines the reliability of the newly designed chip. All other (non-safety) strains and isolates could be clearly separated, as they generated a diffuse pattern on the chip without a complete letter.

We also developed a new kind of microarray setup by making use of the possibility of printing the pattern most suitable for any experimental request. This creative capability of microarray chips has not yet been adopted by other researchers, although the analysis of experimental data can be considerably simplified and automated by using this technique. The major advantage of the new method is that it requires neither many PCR reactions for accurate identification nor tedious and time-consuming techniques, such as PFGE. Following this new protocol, identification can be achieved within 24 h, starting with a fully grown culture. Moreover, it allows a better and more reliable differentiation and identification than traditional techniques, such as phage tests, which can only be used for the identification of some E. coli safety strains (e.g. E. coli K-12 or C).
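The decision rule stressed in the Discussion (a lineage is called only when every spot of one letter is positive) lends itself to automation. The following Python sketch illustrates that rule under stated assumptions: the probe sets contain only the fragment names mentioned in the text for the “C” and “W” patterns, which is fewer than the nine and 11 fragments actually spotted, the “K” and “B” patterns are omitted, and the function name `call_lineage` is hypothetical. Scoring of individual spots as positive or negative is assumed to have been done beforehand.

```python
from typing import AbstractSet, Dict, Optional, Set

# Fragment names taken from the text; the lists are incomplete and
# serve only to illustrate the rule.
LETTER_PROBES: Dict[str, Set[str]] = {
    "C": {"NIS", "W824", "310706", "CRT", "W826", "pcoD", "rtl", "PRP"},
    "W": {"PAI", "aga", "B1134", "Q", "T3443", "SAMP5", "FLAG02", "2.2", "pac"},
}


def call_lineage(positive: AbstractSet[str]) -> Optional[str]:
    """Return the letter whose spots are all positive, else None.

    A partial letter never counts. If more than one letter is
    complete, the result is treated as ambiguous and None is returned.
    """
    complete = [
        letter
        for letter, probes in LETTER_PROBES.items()
        if probes <= positive  # every spot of the letter is positive
    ]
    return complete[0] if len(complete) == 1 else None
```

Note that a C strain is also positive for the left part of the “W” (PAI, aga, B1134, Q); because the remaining “W” spots stay negative, the sketch still returns "C" for such a pattern.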

Acknowledgements

We thank Dr. Ulrich Dobrindt from the Institute for Molecular Infectious Biology at the University of Wuerzburg for providing a selection of safety and laboratory strains and DNA samples from a broad-range selection of pathogenic E. coli strains, Dr. Sören Schubert from the Pettenkofer Institute in Munich for a selection of strains and primers, and Dr. Ulrich Busch from the Bavarian Health and Food Safety Authority for providing diagnostic samples. We are grateful to the Bavarian State Ministry of the Environment, Public Health and Consumer Protection for financial support.

Appendix A. Supplementary materials

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.syapm.2008.01.001.

References

[1] S.T. Abedon, The murky origin of Snow White and her T-even dwarfs, Genetics 155 (2000) 481–486.
[2] A.P. Bauer, S.M. Dieckmann, W. Ludwig, K.H. Schleifer, Rapid identification of Escherichia coli safety and laboratory strain lineages based on multiplex-PCR, FEMS Microbiol. Lett. 269 (2007) 36–40.
[3] S. Bekal, R. Brousseau, L. Masson, G. Prefontaine, J. Fairbrother, J. Harel, Rapid identification of Escherichia coli pathotypes by virulence gene detection with DNA microarrays, J. Clin. Microbiol. 41 (2003) 2113–2125.
[4] G. Blum, I. Muhldorfer, P. Kuhnert, J. Frey, J. Hacker, Comparative methodology to investigate the presence of Escherichia coli K-12 strains in environmental and human stool samples, FEMS Microbiol. Lett. 143 (1996) 77–82.
[5] G. Blum, M. Schmittroth, J. Hacker, Escherichia coli K-12: Herkunft, Nachweiskriterien und Ausbreitung, Biospektrum 1 (1995) 11–16.
[6] J. Bongaerts, R. Bovenberg, M. Krämer, U. Müller, R. Raeven, M. Wubbolts, Metabolic engineering to produce fine chemicals in Escherichia coli: DSM Biotech GmbH, Chemie Ingenieur Technik 74 (2002) 694.
[7] S. Chen, S. Zhao, P.F. McDermott, C.M. Schroeder, D.G. White, J. Meng, A DNA microarray for identification of virulence and antimicrobial resistance genes in Salmonella serovars and Escherichia coli, Mol. Cell Probes 19 (2005) 195–201.
[8] V. Chizhikov, A. Rasooly, K. Chumakov, D.D. Levy, Microarray analysis of microbial virulence factors, Appl. Environ. Microbiol. 67 (2001) 3258–3263.
[9] E. Diaz, A. Ferrandez, M.A. Prieto, J.L. Garcia, Biodegradation of aromatic compounds by Escherichia coli, Microbiol. Mol. Biol. Rev. 65 (2001) 523–569.
[10] J.R. Johnson, A.L. Stell, Extended virulence genotypes of Escherichia coli strains from patients with urosepsis in relation to phylogeny and host compromise, J. Infect. Dis. 181 (2000) 261–272.
[11] T.J. Johnson, S. Kariyawasam, Y. Wannemuehler, P. Mangiamele, S.J. Johnson, C. Doetkott, J.A. Skyberg, A.M. Lynne, J.R. Johnson, L.K. Nolan, The genome sequence of avian pathogenic Escherichia coli strain O1:K1:H7 shares strong similarities with human extraintestinal pathogenic E. coli genomes, J. Bacteriol. 189 (2007) 3228–3236.
[12] T.J. Johnson, Y.M. Wannemuehler, J.A. Scaccianoce, S.J. Johnson, L.K. Nolan, Complete DNA sequence, comparative genomics, and prevalence of an IncHI2 plasmid occurring among extraintestinal pathogenic Escherichia coli isolates, Antimicrob. Agents Chemother. 50 (2006) 3929–3933.
[13] K. Kakinuma, M. Fukushima, R. Kawaguchi, Detection and identification of Escherichia coli, Shigella, and Salmonella by microarrays using the gyrB gene, Biotechnol. Bioeng. 83 (2003) 721–728.
[14] G. Keramas, D.D. Bang, M. Lund, M. Madsen, S.E. Rasmussen, H. Bunkenborg, P. Telleman, C.B. Christensen, Development of a sensitive DNA microarray suitable for rapid detection of Campylobacter spp, Mol. Cell Probes 17 (2003) 187–196.
[15] P. Kuhnert, P. Boerlin, J. Frey, Target genes for virulence assessment of Escherichia coli isolates from water, food and the environment, FEMS Microbiol. Rev. 24 (2000) 107–117.
[16] P. Kuhnert, J. Nicolet, J. Frey, Rapid and accurate identification of Escherichia coli K-12 strains, Appl. Environ. Microbiol. 61 (1995) 4135–4139.
[17] B.H. Lee, W.K. Huh, S.T. Kim, J.S. Lee, S.O. Kang, Bacterial production of D-erythroascorbic acid and L-ascorbic acid through functional expression of Saccharomyces cerevisiae D-arabinono-1,4-lactone oxidase in Escherichia coli, Appl. Environ. Microbiol. 65 (1999) 4685–4687.
[18] V.A. Livshits, Production of isoleucine by Escherichia coli having isoleucine auxotrophy and no negative feedback inhibition of isoleucine production, United States patent 5534421, 1996.
[19] A. Loy, A. Lehner, N. Lee, J. Adamczyk, H. Meier, J. Ernst, K.H. Schleifer, M. Wagner, Oligonucleotide microarray for 16S rRNA gene-based detection of all recognized lineages of sulfate-reducing prokaryotes in the environment, Appl. Environ. Microbiol. 68 (2002) 5064–5081.
[20] T. Maeda, V. Sanchez-Torres, T.K. Wood, Enhanced hydrogen production from glucose by metabolically engineered Escherichia coli, Appl. Microbiol. Biotechnol. (2007).
[21] B. Miroux, J.E. Walker, Over-production of proteins in Escherichia coli: mutant hosts that allow synthesis of some membrane proteins and globular proteins at high levels, J. Mol. Biol. 260 (1996) 289–298.
[22] J.P. Nougayrede, S. Homburg, F. Taieb, M. Boury, E. Brzuszkiewicz, G. Gottschalk, C. Buchrieser, J. Hacker, U. Dobrindt, E. Oswald, Escherichia coli induces DNA double-strand breaks in eukaryotic cells, Science 313 (2006) 848–851.
[23] J. Sambrook, D. Russell, Molecular Cloning: A Laboratory Manual, third ed., Cold Spring Harbor Laboratory, New York, 2001.
[24] D. Schneider, E. Duperchy, E. Coursange, R.E. Lenski, M. Blot, Long-term experimental evolution in Escherichia coli. IX. Characterization of insertion sequence-mediated mutations and rearrangements, Genetics 156 (2000) 477–488.
[25] R.L. Sinsheimer, Purification and properties of bacteriophage-Phi-X174, J. Mol. Biol. 1 (1959) 37–53.
[26] E.L. Tatum, J. Lederberg, Gene recombination in the bacterium Escherichia coli, J. Bacteriol. 53 (1947) 673–684.
[27] L. Wassill, W. Ludwig, K.H. Schleifer, Development of a modified subtraction hybridization technique and its application for the design of strain specific PCR systems for lactococci, FEMS Microbiol. Lett. 166 (1998) 63–70.
[28] X. Zhang, K. Jantama, J.C. Moore, K.T. Shanmugam, L.O. Ingram, Production of L-alanine by metabolically engineered Escherichia coli, Appl. Microbiol. Biotechnol. (2007).
[29] K. Zwirglmaier, L. Wassill, W. Ludwig, K.H. Schleifer, Subtraction hybridization in microplates: an improved method to generate strain-specific PCR primers, Syst. Appl. Microbiol. 24 (2001) 108–115.