The Staphylococcus aureus collagen adhesin-encoding gene (cna) is within a discrete genetic element


Gene 196 (1997) 239–248

Allison F. Gillaspy a, Joseph M. Patti b, Frankie L. Pratt Jr. c, John J. Iandolo c, Mark S. Smeltzer a,*
a Department of Microbiology and Immunology, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
b Institute of Biosciences and Technology, Center for Extracellular Matrix Biology, Texas A&M University, Houston, TX 77030, USA
c Department of Microbiology and Pathobiology, Kansas State University, Manhattan, KS 66506, USA
Received 17 July 1996; accepted 3 April 1997; Received by J. Wild

Abstract

Although the gene (cna) encoding the Staphylococcus aureus (Sa) collagen adhesin is not present in all strains, the DNA both upstream and downstream of cna is present in all Sa strains. Using oligo primers corresponding to the conserved nt flanking cna and template DNA from Sa strains that do not encode cna, we amplified a 372-bp fragment. These results illustrate that the conserved regions upstream and downstream of cna are contiguous in strains that do not encode cna. Using primers corresponding to the conserved flanking DNA together with primers corresponding to the 5′ and 3′ ends of cna, we also amplified DNA fragments containing the junctions between the cna genetic element and the conserved flanking sequences. Sequence comparisons of the amplification products from four cna negative and four cna positive strains revealed that cna is within a discrete genetic element that extends 202 bp upstream from the cna start codon and 100 bp downstream of the cna stop codon. Sequence analysis of the ends of the cna element did not reveal any of the repeats characteristic of transposable elements. These results suggest that cna may be part of a larger element (e.g., a phage) that may or may not contain cna. Alternatively, cna may be subject to a precise excision event resulting in its deletion from the chromosome. Based on sequence analysis of the flanking DNA amplified from strains that do not encode cna, the presence of a cna genetic element does not disrupt an ORF. © 1997 Elsevier Science B.V.

Keywords: Gram-positive bacteria; Surface protein; Host matrix protein

* Corresponding author. Tel. +1 501 6867958; Fax +1 501 6865359; e-mail: [email protected]

Abbreviations: aa, amino acid(s); att, temperate bacteriophage attachment site; CN, conserved regions upstream and downstream of cna element in cna negative strains; cna, gene encoding Sa collagen adhesin; fib, gene encoding Sa fibrinogen-binding protein; fnbA and fnbB, genes encoding Sa fibronectin-binding proteins; FnBP, fibronectin-binding protein of Sa; hlb, gene encoding Sa β-toxin; LJ, left junction fragment of the cna genetic element; MSCRAMM, microbial surface components recognizing adhesive matrix molecules; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; pcp, gene encoding Sa pyrrolidone carboxyl peptidase; RJ, right junction fragment of the cna genetic element; Sa, Staphylococcus aureus; sak, gene encoding Sa staphylokinase; sea, gene encoding Sa enterotoxin A; seb, gene encoding Sa enterotoxin B; tst, gene encoding Sa toxic shock syndrome toxin-1.

0378-1119/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0378-1119(97)00256-4

1. Introduction

Staphylococcus aureus (Sa) is an opportunistic pathogen that causes a variety of infections in both humans and animals. The diversity of infections caused by Sa is consistent with its potential to produce a wide variety of virulence factors, some of which are encoded within discrete genetic elements that are not present in every strain. Included among these are the genes encoding enterotoxin A (sea) (Betley and Mekalanos, 1985), enterotoxin B (seb) (Johns and Khan, 1988), staphylokinase (sak) (Coleman et al., 1989), toxic-shock syndrome toxin (tst) (Kreiswirth et al., 1989), the penicillin-binding protein PBP2′ (mecA) (Kreiswirth et al., 1993) and the genes required for capsule production (Lee, 1995). Some of these genes are encoded within plasmids (seb) (Shafer and Iandolo, 1979) or lysogenic bacteriophage (sea, sak) (Betley and Mekalanos, 1985; Coleman et al., 1989); however, the genetic elements encoding other virulence factors have not been clearly defined. Sa has the ability to bind specific host matrix proteins (Patti et al., 1994a). The Sa surface proteins (adhesins) that mediate the binding of these proteins have been


A.F. Gillaspy et al. / Gene 196 (1997) 239–248

termed MSCRAMMs to denote their role as ‘microbial surface components recognizing adhesive matrix molecules’ (Patti et al., 1994a). To date, genes encoding Sa MSCRAMMs that bind collagen (Patti et al., 1992), fibronectin (Signas et al., 1989; Jonsson et al., 1991), fibrinogen (Boden and Flock, 1994; McDevitt et al., 1994; Cheung et al., 1995), elastin (Park et al., 1996) and osteopontin (Jonsson et al., 1995) have been identified. Unlike the genes encoding other Sa MSCRAMMs, the gene encoding the collagen-binding MSCRAMM (cna) is not present in every strain of Sa (Smeltzer et al., 1997). In an effort to characterize the genetic element encoding cna, we performed a series of Southern blots using probes specific for cna and for the regions flanking cna. Unlike cna, the flanking DNA both upstream and downstream of cna was present in all Sa strains. Sequence analysis confirmed that cna is encoded within a discrete genetic element that extends 202 bp upstream of the cna start codon and 100 bp downstream of the cna stop codon. The cna genetic element does not encode any other gene, and its presence does not disrupt a resident chromosomal gene.

2. Materials and methods

2.1. Bacterial strains

Sa strains FDA574 (Patti et al., 1992), Phillips (Patti et al., 1994b), UAMS-1 (Gillaspy et al., 1995), and ISP479C (Smeltzer et al., 1993) have been described. UAMS strains 639, 622, 633 and 652 are clinical isolates obtained from patients at the University of Arkansas for Medical Sciences (Smeltzer et al., 1996). Previous experiments have established that FDA574, Phillips, UAMS-1 and UAMS-639 bind collagen, whereas ISP479C and the UAMS strains 622, 633 and 652 do not (data not shown). E. coli strain DH5α was used as a host strain for all cloning experiments.

Table 1
DNA primers (a) used for PCR amplifications

Fragment          5′ primer (b)               3′ primer
cna-up            acagcttccggtttaataggtgta    cataaaaacccctcctta
Full-length cna   atgcacttgtattcgttatactg     aggccactcttagtctgcttacat
cna B             agtggttactaatactg           caggatagattggttta
pcp12             tatgtaagcagactaagagtgg      tatgttttatttatgggggagg
pcp               tttaaggaatgtattttgtta       atgcacattttagtaacagggttc
pcp34             gcattacgctcatgtttagcac      tcgatccaagacatacaactgg
CN                gcccaaataaggagacctacaacg    tgccattcacatcccacttcatgc
LJ                taaggagacctacaacgatagg      tactctatgtcatatccacagg
RJ                aacttcatggattacatggg        tctccaaactttacctgaacgtgc

(a) Relative location of each fragment is shown in Fig. 1.
(b) All primers are written 5′ to 3′ as synthesized. 5′ and 3′ primers correspond to opposite strands of the DNA molecule and have the opposite orientation.
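As a quick consistency check on the primer pairings in Table 1, the sketch below compares the 3′ primer for the full-length cna fragment with the 5′ primer for the pcp12 fragment. It assumes only the sequences as printed in the table; the helper names (`reverse_complement`, `longest_shared_run`) are illustrative and are not part of the original methods. If the two fragments abut on the chromosome (an assumption, not something stated in the text), the reverse complement of one primer should largely coincide with the other.

```python
# Primer sequences copied from Table 1 (written 5' to 3').
CNA_FULL_3P = "aggccactcttagtctgcttacat"  # 3' primer, full-length cna fragment
PCP12_5P = "tatgtaagcagactaagagtgg"       # 5' primer, pcp12 fragment

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_shared_run(a: str, b: str) -> str:
    """Return the longest substring of `a` that also occurs in `b`.

    A simple quadratic scan; adequate for primer-length sequences.
    """
    best = ""
    for i in range(len(a)):
        # Only consider candidates longer than the best run found so far.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break
    return best


overlap = longest_shared_run(reverse_complement(PCP12_5P), CNA_FULL_3P)
print(len(overlap), overlap)
```

With the sequences as listed, the two primers share a 21-nt run, which is what would be expected if the full-length cna fragment ends where the pcp12 fragment begins.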

2.2. Isolation of genomic DNA, Southern blot analysis and PCR amplification

Sa genomic DNA was purified over cesium chloride gradients as previously described (Smeltzer et al., 1992). Restriction enzyme digestion, gel electrophoresis and DNA transfer to nylon membranes were all performed using standard protocols. Southern blot analysis was carried out using DNA probes labeled with digoxigenin-11-dUTP (Smeltzer et al., 1992). DNA fragments used as probes were generated by PCR amplification of FDA574 genomic DNA (Patti et al., 1992). Primers used to amplify probe fragments are listed in Table 1. The location of each probe is illustrated in Fig. 1A. Amplified fragments were gel-purified for use as probes.

2.3. Cloning and sequencing

To define the ends of the cna element and determine whether the ends were conserved among different Sa strains, primers were synthesized for amplification of (1) the left junction (LJ) between the cna element and the conserved flanking DNA upstream of cna, (2) the right junction (RJ) between the cna element and the conserved flanking DNA downstream of cna and (3) the fragment in cna negative strains that corresponds to the conserved (CN) regions upstream and downstream of cna in cna positive strains. The location of these primers relative to cna is illustrated in Fig. 1B. The primers were synthesized based on the sequence data from FDA574. Amplified fragments were cloned into the pT7Blue T-vector (Novagen, Inc., Madison, WI) according to the manufacturer’s protocol. The cloned fragments were sequenced using the standard forward and reverse primer sites present in the vector DNA. Sequencing was done using an Applied Biosystems Model 373A automated sequencer (Applied Biosystems, Foster City, CA, USA).

3. Results and discussion

3.1. Southern blot analysis of Sa isolates

Based on the published sequences of the cna and pyrrolidone carboxyl peptidase (pcp) genes from FDA574 (Patti et al., 1992, 1995), we generated a series of DNA probes corresponding to various sites within the 6851-bp region that includes cna, pcp and the DNA flanking both genes (Fig. 1A). When genomic DNA from UAMS-1, Phillips, FDA574 and UAMS-639 was digested with HaeIII and probed with a full-length cna


Fig. 1. Location of probe fragments and generation of amplification products for sequencing. (A) Schematic representation of the chromosomal region encoding cna and pcp and the relative location of each probe used for Southern blot analysis (Section 3.1). The nt numbers are based on the sequence of Sa strain FDA574 (Patti et al., 1992, 1995). Location and direction of transcription for cna and pcp are indicated by thick, horizontal arrows. Important restriction sites are indicated by vertical lines. Only those Sau3A sites located within the cna B domains (Patti et al., 1992) are shown. Strains with a different number of B domains have a corresponding number of Sau3A sites. UAMS-1 and UAMS-639 also have HincII sites within the B domains (Fig. 2B). The location of the unsequenced 419-bp region between cna and pcp is denoted by a vertical arrow. DNA fragments amplified for use as probes are indicated by double-headed arrows. Terminal nt in the primers used for each amplification (Table 1) are noted in parentheses below each fragment. In cna negative strains, the DNA between nt 728 and 4580 is absent (Section 3.2 and Fig. 6). (B) Location of primers (Table 1) used to amplify DNA fragments for sequencing (Fig. 6). CN refers to the conserved DNA flanking cna. LJ and RJ refer to the left and right junctions between the cna element and the conserved flanking DNA. The numeric designators denote primers corresponding to opposite DNA strands (Table 1) with arrows indicating the direction of priming. The number in parentheses is the terminal (5′) nt based on the FDA574 sequence illustrated in Fig. 1A.

probe (Fig. 1A), two hybridizing fragments were observed (Fig. 2A). The size of the larger fragment in UAMS-1 (2.3 kb), Phillips (2.85 kb), FDA574 (3.4 kb) and UAMS-639 (3.99 kb) (Fig. 2A) was consistent with HaeIII sites at nt 1161 and 4586 (Fig. 1A) and the fact that these strains encode cna variants with one, two, three and four B domains, respectively (Patti et al., 1994a; Smeltzer et al., 1996). In every strain, except Phillips, the smaller of the hybridizing fragments was approximately 1.0 kb, which is consistent with the HaeIII sites at nt 276 (upstream of cna) and 1161 (within the 5′ end of cna) (Fig. 1A). Because the variation in the number of B domains was evident (Fig. 2A), the HaeIII site at nt 1161 must be conserved in all strains. The increased size of the smaller fragment in Phillips (1.3 kb) must therefore be due to the absence of the HaeIII site at nt 276. The observation that UAMS-1, Phillips, FDA574 and UAMS-639 encode cna is consistent with the fact that these strains bind collagen (data not shown). Similarly, the observation that ISP479C, UAMS-622, UAMS-633

and UAMS-652 do not bind collagen is consistent with the fact that no hybridization signal was observed when genomic DNA from these strains was digested with HaeIII and hybridized with the full-length cna probe (data not shown). However, when HincII-digested DNA was hybridized with a DNA probe corresponding to the region immediately upstream of cna (cna-up probe, Fig. 1A), a hybridization signal was observed in all strains (Fig. 2B). The smaller of the two fragments observed in strains that encode cna was approximately 0.44 kb, which is consistent with the HincII sites at nt 334 and 774 (Fig. 1A). However, a similar, although slightly smaller, fragment was also observed in strains that do not encode cna (Fig. 2B). These results demonstrate that DNA upstream of the HincII site at nt 774 is conserved in all strains. Additionally, the absence of a second hybridizing fragment in the cna negative strains suggests that the conserved DNA does not extend beyond the HincII site at nt 774 (Fig. 2B). Although a second fragment was observed in every cna positive strain (Fig. 2B), the size of this fragment in UAMS-1


Fig. 2. Southern blot analysis of cna and the region upstream of cna. (A) HaeIII-digested genomic DNA was hybridized with the full-length cna probe (Fig. 1A). Lanes: U1, UAMS-1; Ph, Phillips; 574, FDA574; 639, UAMS-639. Approximate sizes are indicated in kb. (B) HincII-digested genomic DNA was hybridized with a 472-bp fragment (cna-up, Fig. 1A) corresponding to the region immediately upstream of cna. Lanes for cna positive strains are as described for panel A. The designation 479 refers to ISP479C, whereas the designations 622, 633 and 652 refer to UAMS isolates.
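The sizes reported for the larger HaeIII fragment in Fig. 2A can be checked against the restriction map with simple arithmetic. The sketch below is illustrative only: it assumes that each B domain adds exactly 561 bp (the B-domain size stated for FDA574 in Section 3.2) and that the HaeIII sites at nt 1161 and 4586 (FDA574 numbering) bound the larger fragment in every strain. The function name and the dictionary of observed sizes are hypothetical helpers, not part of the original analysis.

```python
# Coordinates and sizes quoted in the text (FDA574 numbering, Fig. 1A).
HAEIII_SITE_IN_CNA = 1161      # HaeIII site within the 5' end of cna
HAEIII_SITE_DOWNSTREAM = 4586  # HaeIII site just downstream of cna
B_DOMAIN_BP = 561              # assumed size of one B domain in every strain
FDA574_B_DOMAINS = 3           # FDA574 encodes three B domains


def predicted_haeiii_fragment(n_b_domains: int) -> int:
    """Predicted size (bp) of the larger HaeIII fragment for a given B-domain count."""
    fda574_fragment = HAEIII_SITE_DOWNSTREAM - HAEIII_SITE_IN_CNA
    return fda574_fragment + (n_b_domains - FDA574_B_DOMAINS) * B_DOMAIN_BP


# strain -> (number of B domains, approximate observed size in kb from Fig. 2A)
observed = {
    "UAMS-1": (1, 2.3),
    "Phillips": (2, 2.85),
    "FDA574": (3, 3.4),
    "UAMS-639": (4, 3.99),
}

for strain, (n_domains, observed_kb) in observed.items():
    predicted_bp = predicted_haeiii_fragment(n_domains)
    print(f"{strain}: predicted {predicted_bp} bp, observed ~{observed_kb} kb")
```

Under these assumptions the predicted sizes (2303, 2864, 3425 and 3986 bp) agree with the approximate sizes read from the blot.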

and UAMS-639 was unexpected since these strains encode a cna gene with a different number of B domains (see above), and there are no HincII sites in the published sequence of cna (Patti et al., 1992). However, Southern blot analysis using HincII-digested genomic DNA with a cna probe defined by the HaeIII sites at nt 1161 and 4586 (Fig. 1A) suggested that this discrepancy was due to HincII sites within the B domain(s) of UAMS-1 and UAMS-639 (data not shown). These sites did not appear to be present in Phillips or FDA574 (data not shown). The presence of HincII sites within the B domain(s) of UAMS-1 and UAMS-639, and the absence of these sites in Phillips and FDA574, was subsequently confirmed by PCR amplification of the B domain region followed by digestion with HincII (data not shown). The results discussed above suggest that DNA upstream of the HincII site at nt 774 is conserved, even in strains that do not encode cna. To determine whether the DNA downstream of cna was conserved, we hybridized HaeIII-digested genomic DNA with a probe corresponding to an internal fragment of pcp. In FDA574, the pcp gene is located approximately 740 bp downstream of cna (Patti et al., 1995). When pcp was used as a probe, a single hybridizing fragment was observed in all strains (Fig. 3A). The same pattern of fragments was observed when HaeIII-digested DNA from each strain was hybridized with a DNA probe (pcp34) corresponding to the region upstream of pcp (data not

shown). Although the size of the hybridizing fragment in each strain varied (Fig. 3A), the fact that a signal was observed in all strains clearly indicates that pcp and the region upstream of pcp are conserved in all strains, including those that do not encode cna. A signal was also observed in every strain when a DNA fragment (pcp12) corresponding to the region between cna and pcp was used to probe HaeIII-digested DNA (Fig. 3B). The presence of a hybridization signal in every strain demonstrates that at least part of the intergenic region between cna and pcp is also conserved, even in strains that do not encode cna. In UAMS-1 and UAMS-639, the pcp12 probe hybridized with a single fragment (Fig. 3B) that also hybridized with the pcp and pcp34 probes (Fig. 3A). These results are consistent with the absence of HaeIII restriction sites downstream of the site at nt 4586 (Fig. 1A). However, in Phillips and FDA574, the hybridization pattern observed with the pcp12 probe differed from that seen with the pcp and pcp34 probes. Specifically, hybridizing fragments of 2.5 and 0.35 kb were detected in both strains (Fig. 3B). The 2.5-kb fragment appeared to correspond with the fragment seen with the pcp and pcp34 probes (Fig. 3A). The presence of the 0.35-kb fragment was unexpected; however, there is a gap of approximately 420 bp in the sequence of the 740-bp intergenic region between cna and pcp (J.M. Patti, unpublished data). The presence of the 0.35-kb fragment in Phillips and FDA574 suggests


Fig. 3. Southern blot analysis of the region downstream of cna. HaeIII-digested genomic DNA was hybridized with the pcp (A), pcp12 (B) or cna-up (C) probes (Fig. 1). Lanes are as noted in the legend for Fig. 2. Fragments that hybridized with both the pcp12 and cna-up probes in cna negative strains are indicated by arrowheads to the left of the relevant fragment. The approximate size of the UAMS-1 fragment that hybridized with the pcp and pcp12 probes is indicated in kb.

that there is at least one HaeIII restriction site in this unsequenced region. In FDA574, the presence of this site was confirmed by sequence analysis (Fig. 4). We addressed the presence or absence of this site in other strains by amplifying the relevant region and cutting the amplified product with HaeIII. As expected, the amplification product obtained from Phillips was cut whereas the products obtained from UAMS-1 and UAMS-639 were not (data not shown). Similar variability was observed when the pcp12 fragment was used to probe HaeIII-digested DNA from strains that do not encode cna. In UAMS-622, a single hybridizing fragment of approximately 2.5 kb was observed (Fig. 3B). A fragment of the same size also

Fig. 4. Nucleotide sequence of the chromosomal region encoding cna and pcp. Sequence determined as part of this study is noted in bold. Given the size of the relevant region and the fact that the sequence of cna and pcp has been published (Patti et al., 1992; Patti et al., 1995), gaps were introduced into the sequence of the cna and pcp genes. Bases are numbered according to Fig. 1A, with the numbers corresponding to the first and last base on each side of the gap noted above the line. Relevant restriction sites are indicated by a dashed underline. The cna and pcp genes are double-underlined with the start and stop codons for each gene indicated below the line. It should be noted that cna and pcp are transcribed in opposite directions; as written, cna (nt 931–4480 inclusive) is transcribed from left to right, whereas pcp (nt 5849–5214 inclusive) is transcribed from right to left. The 5′ and 3′ ends of the cna genetic element (nt 728–4580 inclusive) are indicated by a single underline. The 15 of 17 bp repeats centered at 578 and 4529 are shown in double-underlined italics. In cna-positive strains, the 440-bp fragment observed in Fig. 2B is defined by the HincII sites at 334 and 774. The slightly smaller fragment observed in cna-negative strains (Fig. 2B) is defined by the HincII sites at 334 and 4612; in the absence of the cna element, these sites are separated by 425 bp.
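The coordinates quoted in the Fig. 4 legend are internally consistent, which the following sketch makes explicit. It uses only the positions given in the text (Fig. 1A numbering, strain FDA574); the variable and function names are illustrative, and the only assumption is that removing the cna element shortens the distance between flanking sites by exactly the element length.

```python
# Positions quoted in the Fig. 4 legend (Fig. 1A numbering, strain FDA574).
ELEMENT_FIRST_NT = 728    # 5' end of the cna genetic element
ELEMENT_LAST_NT = 4580    # 3' end of the cna genetic element
HINCII_UPSTREAM = 334     # HincII site in conserved upstream DNA
HINCII_IN_ELEMENT = 774   # HincII site inside the cna element
HINCII_DOWNSTREAM = 4612  # HincII site in conserved downstream DNA


def inclusive_length(first_nt: int, last_nt: int) -> int:
    """Number of nucleotides from first_nt to last_nt, counting both ends."""
    return last_nt - first_nt + 1


def site_separation(site_a: int, site_b: int) -> int:
    """Distance in bp between two restriction-site coordinates."""
    return site_b - site_a


element_bp = inclusive_length(ELEMENT_FIRST_NT, ELEMENT_LAST_NT)

# cna-positive strains: both HincII sites lie on the same side of the element end.
hincii_fragment_positive = site_separation(HINCII_UPSTREAM, HINCII_IN_ELEMENT)

# cna-negative strains: the element (including the site at 774) is absent.
hincii_fragment_negative = (
    site_separation(HINCII_UPSTREAM, HINCII_DOWNSTREAM) - element_bp
)

print(element_bp, hincii_fragment_positive, hincii_fragment_negative)
```

The three values (3853, 440 and 425 bp) match the element size and the HincII fragment sizes reported for FDA574 and for cna-negative strains.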


hybridized with the pcp and pcp34 probes (Fig. 3A). The remaining cna negative strains all had two fragments that hybridized with the pcp12 probe (Fig. 3B). The presence of these two fragments is consistent with the observation that some strains contain a HaeIII site within the previously unsequenced region between cna and pcp. Of most relevance, however, is the observation that every cna negative strain appeared to contain a HaeIII fragment that hybridized with both the pcp12 and the cna-up DNA probes (Fig. 3B and C, arrowheads). These results suggest that the conserved regions upstream and downstream of cna are contiguous in the chromosome of strains that do not encode cna.

3.2. Characterization of the cna genetic element

The results discussed above suggest that cna is encoded within a discrete genetic element that is not present in all strains. To verify that hypothesis, we synthesized primers that could be used to amplify DNA fragments containing the junctions between cna and the conserved flanking DNA upstream (LJ primers) or downstream (RJ primers) of cna (Fig. 1B). We also synthesized primers (CN primers) that could be used to amplify the corresponding region in the chromosome of strains that do not encode cna (Fig. 1B). Primers were chosen to yield amplification products small enough to allow complete sequencing in both directions. Using the CN primers corresponding to conserved DNA upstream (CN5) and downstream (CN3) of cna, a fragment was amplified from every strain except UAMS-639 (Fig. 5). In the other cna positive strains, the size of the amplification product was consistent with the location of the CN primers (Fig. 1B) and the number of B domains (Fig. 5). Presumably, the failure to amplify a product from UAMS-639 was due to the increased size of the cna gene in that strain. The size of the CN fragment was identical (372 bp) in all cna negative strains (Fig. 5). These results demonstrate that the conserved DNA upstream and downstream of cna is contiguous in all cna negative strains. The LJ and RJ primer sets yielded an amplification product from every cna positive strain but did not yield an amplification product from any cna negative strain (Fig. 5). These results are consistent with the fact that each primer set includes a primer (LJ3 and RJ5) corresponding to DNA present only in strains that encode cna. The fact that the amplification product obtained with each primer set was the same size in every

Fig. 5. Confirmation of Southern blot data by PCR. Genomic DNA from UAMS-1 (U1), Phillips (Ph), FDA574 (574), UAMS-639 (639), ISP479C (479), UAMS-622 (622), UAMS-633 (633) and UAMS-652 (652) was used as template DNA in a PCR amplification using the primers shown in Table 1. The location of each primer and its directionality are indicated in Fig. 1B. The CN fragments amplified from cna negative strains and the LJ and RJ fragments amplified from cna positive strains were cloned and sequenced (Fig. 6). The lanes between 652 and U1 contain molecular size markers. Approximate sizes are indicated in kb.


strain (Fig. 5) suggests that the junctions between the cna element and the conserved DNA are identical in all strains. To define the ends of the cna element, the CN amplification products from each cna negative strain and the LJ and RJ amplification products from each cna positive strain were cloned and sequenced. Alignment of the sequence obtained from the cna negative strains (Fig. 6A) with the sequences obtained from the cna positive strains (Fig. 6B and C) allowed us to define the genetic element that contains cna. This element extends 202 bp upstream of the cna start codon and 100 bp downstream of the stop codon (Fig. 4). According to the numbering scheme shown in Fig. 1B, the 5′ nt in the cna element corresponds to position 728, whereas the 3′ nt corresponds to position 4580. In FDA574, which encodes a cna gene with three 561-bp B domains, the cna element contains 3853 bp (Fig. 4). The ends of this element are precisely defined such that nt 727 and 4581 (Fig. 1A) are conserved in all strains. Although our sequence data demonstrate that the junctions between the cna element and the flanking DNA are highly conserved in strains that encode cna, and that the chromosomal region in which the cna element resides is highly conserved in strains that do not encode cna, it is interesting to note that there were consistent sequence differences between cna-positive and cna-negative strains in the DNA flanking the cna element. Specifically, when sequence data from four cna negative and four cna positive strains were compared, four consistent differences were identified (Fig. 6). These differences were: (1) substitution of a T (cna negative) with a G (cna positive) 42 bp upstream from the left junction between the cna element and the conserved flanking DNA; (2) substitution of an A (cna negative) with a G (cna positive) in the first nt downstream of the right junction; (3) substitution of an A (cna negative) with a G (cna positive) 6 bp downstream of the right junction; and (4) the addition of 2 nt (A followed by A or T) in cna positive strains 26 bp downstream of the right junction (Fig. 6). Although these differences are relatively minor, the consistency observed in each of four cna positive and four cna negative strains suggests that they may have some relevance with respect to the presence or absence of cna. Alternatively, these differences may suggest that the acquisition or loss of cna is a rare event such that cna positive and cna negative strains are restricted to a relatively small group of clonal variants.

3.3. The origin of the cna element

The data presented here demonstrate that cna is encoded within a discrete genetic element. At present, we are unable to draw any conclusions with regard to the nature of this element. Although its size is consistent


with a relatively small transposable element, the cna element does not contain any ORFs other than the ORF that encodes cna. The absence of at least one additional ORF encoding a transposase clearly indicates that the cna element is not transposable. Additionally, we searched the junction fragments between cna and the conserved regions flanking cna and did not find any of the repeated sequences associated with transposable elements. We did identify a 17-bp sequence (TTCTATGTTTTATACAT) that occurs only once in cna negative strains and is repeated a second time at 15 of 17 bp (TTCTATGTATAATACAT) in cna positive strains (Fig. 4). Since neither of these is located at the junctions of the cna element and the flanking DNA, it is unlikely that they define a transposable element or serve a functional role with respect to the cna element. Additionally, since only 3 of 17 bp in the repeat are C or G, such a repeat would be relatively common in the A+T-rich Sa genome. It should also be noted that the overall G+C content of the cna element (33%) is consistent with the G+C content of the Sa genome (30–39%). A second possibility is that cna is encoded within a phage genome. Although the cna element is much smaller than a typical Sa phage genome (Coleman et al., 1989; Smeltzer et al., 1994), it remains possible that cna has been incorporated into a phage and that there are variants of the same phage that do not encode cna. Such a scenario is consistent with the group of phages that utilize the att site located in the Sa β-toxin gene (hlb). More directly, there are at least four Sa bacteriophages (42E, φ13, φ42 and φ15) that exhibit some degree of sequence homology and utilize the same hlb att site (Coleman et al., 1989; Smeltzer et al., 1994). Two of these (φ15 and φ42) also encode the gene for enterotoxin A (sea) (Betley and Mekalanos, 1985; Smeltzer et al., 1994).
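The repeat comparison described above (15 of 17 bp identical, 3 of 17 bp C or G) can be reproduced directly from the two sequences quoted in the text. The sketch below is a minimal illustration; the function names are hypothetical, and it assumes an ungapped, position-by-position comparison, which the text implies but does not state explicitly.

```python
# The two 17-bp sequences quoted in the text.
REPEAT_SHARED = "TTCTATGTTTTATACAT"        # present once in cna-negative strains
REPEAT_POSITIVE_ONLY = "TTCTATGTATAATACAT"  # second copy in cna-positive strains


def matching_positions(a: str, b: str) -> int:
    """Count identical positions in an ungapped alignment of equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def gc_count(seq: str) -> int:
    """Count the G and C bases in an uppercase DNA sequence."""
    return sum(base in "GC" for base in seq)


matches = matching_positions(REPEAT_SHARED, REPEAT_POSITIVE_ONLY)
print(f"{matches} of {len(REPEAT_SHARED)} positions match")
print(f"G+C bases: {gc_count(REPEAT_SHARED)} and {gc_count(REPEAT_POSITIVE_ONLY)}")
```

Both figures in the text are recovered: the copies differ at two positions, and each contains three G or C bases.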
Fig. 6. Sequence of PCR amplification products. The nt sequence of the CN (A), LJ (B) and RJ (C) fragments are shown for comparison. The CN fragments were amplified from the cna negative strains ISP479C (479), UAMS-622 (622), UAMS-633 (633) and UAMS-652 (652), whereas the LJ and RJ fragments were amplified from the cna positive strains FDA574 (574), UAMS-1 (U1), Phillips (Ph) and UAMS-639 (639). DNA present in both cna negative and cna positive strains is shown in the upper case, whereas DNA found only in cna positive strains is shown in the lower case. In all panels, gaps were introduced to highlight significant sites. In panel A, the gap indicates the site where the cna element is located in cna positive strains. In B and C, the gaps indicate the junction between the cna element (lower case) and the conserved flanking DNA (upper case). Underlined bases indicate mismatches. In A, mismatches were defined by comparison to the ISP479C sequence. The dash indicates the

Moreover, as would be expected from an imprecise excision event leading to incorporation of chromosomal DNA, sea is located near one terminus of the phage genome (Betley and Mekalanos, 1985). By analogy, cna could reside near one end of a phage genome, with the conserved region on at least one side of cna representing phage rather than chromosomal DNA. When we examined the 372-bp fragment amplified from cna negative strains for the occurrence of sequences similar to the att sites for the staphylococcal bacteriophages φ11, φ13 and L54a, we identified a 14-bp sequence (TAAATTCAAAAAAG) with 57% nt identity with the φ13 att site (Coleman et al., 1991). However, this 14-bp region extends 3 bp beyond the insertion site (Fig. 6A) and is not present in cna positive strains. These observations clearly suggest that the 14-bp sequence is not a phage att site. We have extended this analysis to include the entire 727 bp of sequenced DNA upstream of the cna element (Fig. 4) and did not identify any additional sequences with significant similarity to any of the recognized Sa att sites (data not shown). These results do not rule out the possibility that an att site exists further upstream of the cna element. In an attempt to address that possibility, we induced phage from cna-positive strains using mitomycin C. However, we were unable to document the presence of cna within the genomes of any of the induced phage (data not shown). Finally, if cna is associated with a phage genome, then it should be possible to identify strains that are not lysogenized and therefore do not carry the DNA located on at least one side of cna. To date, we have examined approximately 20 cna negative strains, all of which were known to be distinct on the basis of other genotypic markers (Smeltzer et al., 1996). In every case, the flanking DNA both upstream and downstream of cna was present (data not shown). Although these results suggest that the cna element is not associated with a phage genome, it remains possible that cna is associated with a phage and that other Sa strains are lysogenized with a similar phage that does not encode cna. However, if that is true, these lysogenic phage must be highly conserved among different Sa strains. Finally, to determine whether the presence of the cna element disrupts a gene encoded by cna negative strains, the sequence of the 372-bp CN fragment amplified from each cna negative strain was examined for the presence of ORFs. This analysis revealed a 33-bp ORF that spans the site in which the cna element is found in cna positive strains and is read in the same direction as cna. Based on the numbering scheme illustrated in Fig. 1A, this ORF extends from nt 706 to 738. The 372-bp fragment contains larger ORFs, some of which remain open at either end of the fragment. However, based on the data we present, these ORFs would be present in both cna positive and cna negative strains.
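The ORF coordinates given above can be sanity-checked with a few lines of arithmetic. This is only a sketch: it takes the positions exactly as quoted (nt 706 to 738, with nt 727 as the last conserved nt upstream of the cna element), the variable names are illustrative, and it makes no assumption about whether the 11 codons include a stop codon, since the text does not say.

```python
# Coordinates quoted in the text (numbering scheme of Fig. 1A).
ORF_FIRST_NT = 706
ORF_LAST_NT = 738
LAST_NT_BEFORE_ELEMENT = 727  # last conserved nt upstream of the cna element

# Inclusive length of the ORF and the number of complete codons it holds.
orf_bp = ORF_LAST_NT - ORF_FIRST_NT + 1
codons, leftover_nt = divmod(orf_bp, 3)

# The ORF spans the insertion site if the site lies strictly inside it.
spans_insertion_site = ORF_FIRST_NT <= LAST_NT_BEFORE_ELEMENT < ORF_LAST_NT

print(orf_bp, codons, leftover_nt, spans_insertion_site)
```

The result (33 bp, 11 codons, no leftover bases, insertion site inside the ORF) is consistent with the 33-bp ORF described in the text.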
Therefore, although we cannot rule out a functional role for the 11-aa peptide discussed above, or state definitively that the presence of the cna element does not have an effect on transcription of an adjacent ORF, the presence of the cna element in the Sa chromosome does not appear to have any genotypic effect other than the addition of cna. That is of interest in light of the suggestion that an Sa strain (PH100) in which cna was inactivated by insertion of a gentamicin resistance determinant (Patti et al., 1994b) had a reduced level of surface-associated FnBP by comparison to its isogenic parent strain (Phillips) (Hienz et al., 1996). Our results suggest that this effect is mediated at the phenotypic rather than the genetic level. One possibility is that the surface architecture of PH100 is somehow disrupted such that the localization of surface proteins is inhibited (Hienz et al., 1996). However, because there is no evidence to suggest a consistent difference between cna positive and cna negative Sa strains with respect to the ability to bind fibronectin, this phenotypic effect may be due to the nature of the cna mutation itself rather than an effect directly associated with the absence of a functional cna gene.

3.4. Conclusions

(1) The Sa collagen adhesin gene (cna) is encoded within a discrete genetic element that extends 202 bp upstream from the cna start codon and 100 bp downstream of the cna stop codon.
(2) The genetic element encoding cna does not encode any other genes and does not appear to be associated with the repeated genetic elements characteristic of transposable elements. The cna genetic element may be associated with a lysogenic bacteriophage that is highly conserved and occurs in at least one alternative form that does not include cna.
(3) The presence of the cna element does not disrupt a chromosomal gene present in strains that do not encode cna.

Acknowledgement

The excellent technical assistance of Allen Gies at Kansas State University is greatly appreciated. This work was supported by grant 95-B-45 from the Arkansas Science and Technology Association and grant A137729 from the National Institute of Allergy and Infectious Disease. The work was done in partial fulfilment of the requirements for the Ph.D. degree of A.F.G. This work was presented in part at the 1996 meeting of the American Society for Microbiology in New Orleans, LA.

location of the two additional nt that were present only in cna positive strains (C). In B, mismatches were defined by comparison to FDA574. In C, the designations 574-1 and 574-2 denote the sequence originally reported by Patti et al. (1992) (574-1) and the sequence obtained during the course of this study (574-2). Differences include (1) the presence of an additional T in 574-1 in the 3′ end of cna and (2) the absence of a nt (adenine or thymine) in 574-1 in the conserved region downstream of cna. Mismatches in C are defined relative to the 574-2 sequence. Additionally, Sa strains U1 and 639 are missing a cytosine residue (dash) by comparison to FDA574 and Phillips. Bases shown in bold denote consistent differences between cna positive and cna negative strains (Section 3.1).

References

Betley, M.J., Mekalanos, J.J., 1985. Staphylococcal enterotoxin A is encoded by phage. Science 229, 185–187.
Boden, M.K., Flock, J.I., 1994. Cloning and characterization of a gene for a 19 kDa fibrinogen-binding protein from Staphylococcus aureus. Mol. Microbiol. 12, 599–606.
Cheung, A.I., Projan, S.J., Edelstein, R.E., Fischetti, V.A., 1995. Cloning, expression, and nucleotide sequence of a Staphylococcus aureus gene (fbpA) encoding a fibrinogen-binding protein. Infect. Immun. 63, 1914–1920.
Coleman, D., Knights, J., Russell, R., Shanley, D., Birkbeck, T.H., Dougan, G., Charles, I., 1991. Insertional inactivation of the Staphylococcus aureus β-toxin by bacteriophage φ13 occurs by site- and orientation-specific integration of the φ13 genome. Mol. Microbiol. 5, 933–939.
Coleman, D.C., Sullivan, D.J., Russell, R.J., Arbuthnott, J.P., Carey, B.F., Pomeroy, H.M., 1989. Staphylococcus aureus bacteriophages mediating the simultaneous lysogenic conversion of β-lysin, staphylokinase and enterotoxin A: molecular mechanism of triple conversion. J. Gen. Microbiol. 135, 1679–1697.
Gillaspy, A.F., Hickmon, S.G., Skinner, R.A., Thomas, J.R., Nelson, C.L., Smeltzer, M.S., 1995. Role of the accessory gene regulator (agr) in the pathogenesis of staphylococcal osteomyelitis. Infect. Immun. 63, 3373–3380.
Hienz, S.A., Palma, M., Flock, J.-I., 1996. Insertional inactivation of the gene for collagen-binding protein has a pleiotropic effect on the phenotype of Staphylococcus aureus. J. Bacteriol. 178, 5327–5329.
Johns, M.B., Khan, S.A., 1988. Staphylococcal enterotoxin B gene is associated with a discrete genetic element. J. Bacteriol. 170, 4033–4039.
Jonsson, K., Signas, C., Muller, H.P., Lindberg, M., 1991. Two different genes encode fibronectin binding proteins in Staphylococcus aureus: the complete nucleotide sequence and characterization of the second gene. Eur. J. Biochem. 202, 1041–1048.
Jonsson, K., McDevitt, D., McGavin, M.H., Patti, J.M., Hook, M., 1995. Staphylococcus aureus expresses a major histocompatibility complex class II analog. J. Biol. Chem. 270, 21457–21460.
Kreiswirth, B.N., Projan, S.J., Schlievert, P.M., Novick, R.P., 1989. Toxic shock syndrome toxin 1 is encoded by a variable genetic element. Rev. Infect. Dis. 11, S83–S89.
Kreiswirth, B., Kornblum, J., Arbeit, R.D., Eisner, W., Maslow, J.N., McGeer, A., Low, D.E., Novick, R.P., 1993. Evidence for a clonal origin of methicillin resistance in Staphylococcus aureus. Science 259, 227–230.
Lee, C.Y., 1995. Association of staphylococcal type-1 capsule-encoding genes with a discrete genetic element. Gene 167, 115–119.
McDevitt, D., Francois, P., Vaudaux, P., Foster, T.J., 1994. Molecular characterization of the clumping factor (fibrinogen receptor) of Staphylococcus aureus. Mol. Microbiol. 11, 237–248.
Park, P.W., Rosenbloom, J., Abrams, W.R., Rosenbloom, J., Mecham, R.P., 1996. Molecular cloning and expression of the gene for elastin-binding protein (ebpS) in Staphylococcus aureus. J. Biol. Chem. 271, 15803–15809.
Patti, J.M., Jonsson, H., Guss, B., Switalski, L.M., Wiberg, K., Lindberg, M., Hook, M., 1992. Molecular characterization and expression of a gene encoding a Staphylococcus aureus collagen adhesin. J. Biol. Chem. 267, 4766–4772.
Patti, J.M., Allen, B.L., McGavin, M.J., Hook, M., 1994a. MSCRAMM-mediated adherence of microorganisms to host tissue. Annu. Rev. Microbiol. 48, 585–617.
Patti, J.M., Bremell, T., Krajewska-Pietrasik, D., Abdelnour, A., Tarkowski, A., Ryden, C., Hook, M., 1994b. The Staphylococcus aureus collagen adhesin is a virulence determinant in experimental septic arthritis. Infect. Immun. 62, 152–161.
Patti, J.M., Schneider, A., Garza, N., Boles, J.O., 1995. Isolation and characterization of pcp, a gene encoding a pyrrolidone carboxyl peptidase in Staphylococcus aureus. Gene 166, 95–99.
Shafer, W.M., Iandolo, J.J., 1979. Genetics of staphylococcal enterotoxin B in methicillin-resistant isolates of Staphylococcus aureus. Infect. Immun. 25, 902–911.
Signas, C., Raucci, G., Jonsson, K., Lindgren, P.E., Anantharamaiah, G.M., Hook, M., Lindberg, M., 1989. Nucleotide sequence of the gene for a fibronectin-binding protein from Staphylococcus aureus: use of this peptide sequence in the synthesis of biologically active peptides. Proc. Natl. Acad. Sci. USA 86, 699–703.
Smeltzer, M.S., Gill, S.R., Iandolo, J.J., 1992. Localization of a chromosomal mutation affecting expression of extracellular lipase in Staphylococcus aureus. J. Bacteriol. 174, 4000–4006.
Smeltzer, M.S., Hart, M.E., Iandolo, J.J., 1993. Phenotypic characterization of xpr, a global regulator of extracellular virulence factors in Staphylococcus aureus. Infect. Immun. 61, 919–925.
Smeltzer, M.S., Hart, M.E., Iandolo, J.J., 1994. The effect of lysogeny on the genomic organization of Staphylococcus aureus. Gene 138, 51–57.
Smeltzer, M.S., Pratt Jr., F.L., Gillaspy, A.F., Young, L.A., 1996. Genomic fingerprinting for the epidemiological differentiation of Staphylococcus aureus clinical isolates. J. Clin. Microbiol. 34, 1364–1372.
Smeltzer, M.S., Gillaspy, A.F., Pratt, F.L., Thames, M.D., Iandolo, J.J., 1997. Prevalence and chromosomal map location of Staphylococcus aureus adhesin genes. Gene 196, 249–259.