SWIM analysis allows rapid identification of residues involved in invasin-mediated bacterial uptake

Gene 211 (1998) 109–116

Eric S. Krukonis 1, Ralph R. Isberg *
Tufts University School of Medicine, Howard Hughes Medical Institute, Department of Molecular Biology and Microbiology, 136 Harrison Avenue M&V 409, Boston, MA 02111, USA
Received 10 October 1997; received in revised form 27 January 1998; accepted 28 January 1998; Received by A.M. Campbell

Abstract The Yersinia pseudotuberculosis invasin protein promotes bacterial uptake into normally non-phagocytic cells. Combinations of six alanine substitutions in a region of invasin previously shown to be important for bacterial internalization were analyzed using binomial and codon mutagenesis strategies. A single pool of mutants, potentially containing 64 derivatives with various combinations of alanine substitutions, was enriched by one passage through HEp2 cells. DNA was isolated from the resulting pool of internalization-competent bacteria and sequenced in a single set of reactions to determine which alanine substitutions maintained activity. Results of the single sequencing run performed on the pool indicated that strains harboring the D911A substitution were absent after enrichment, confirming the importance of an aspartate residue at this site. When single clones were subsequently isolated from the pool, those containing multiple alanine substitutions in invasin showed uptake defects that were additive, with the exception of S904A/M912A and S910A/M912A double mutants. Binomial mutagenesis combined with a pooled enrichment and sequencing strategy, called ‘SWIM’ mutagenesis (selection without isolation of mutants), could be applied to any system for which there exists an enrichment scheme, using a single oligonucleotide pool to analyze multiple residues. © 1998 Elsevier Science B.V. All rights reserved. Keywords: Mutagenesis; Additivity of mutations; Yersinia; Receptor–ligand interactions

* Corresponding author. Tel: +1 617 636 7393; Fax: +1 617 636 0337; e-mail: [email protected]
1 Present address: University of Michigan Medical School, Unit for Laboratory Animal Medicine, 104 ARF, Ann Arbor, MI 48109-0614, USA.
Abbreviations: A, alanine or adenine; Ab, antibody; BSA, bovine serum albumin; C, cysteine or cytosine; D, aspartate; E, glutamate; G, glycine or guanine; inv, gene encoding the invasin protein; M, methionine; mAb, monoclonal antibody; Q, glutamine; R, arginine; S, serine; SWIM, selection without isolation of mutants; T, threonine or thymine.
0378-1119/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII: S0378-1119(98)00087-0

1. Introduction

Invasin is a 986-amino-acid protein of Yersinia pseudotuberculosis that is able to confer upon E. coli K12 the ability to enter mammalian cells (Isberg and Falkow, 1985; Isberg et al., 1987). The N-terminal 600-amino-acid region of the protein is required for secretion, outer membrane localization and multimerization, and shows strong homology to the intimin family of proteins found in a number of pathogenic enteric bacteria (Isberg et al., 1987; Jerse et al., 1990; P. Dersch, pers. commun.).

The C-terminal region of 192 amino acids is necessary and sufficient for binding multiple integrin receptors containing the β1 chain found on the surface of cultured cells (Leong et al., 1990; Isberg and Leong, 1990; Rankin et al., 1992). All mutations in invasin that show a defect in bacterial entry without affecting protein stability cluster within a stretch of 11 amino acids between residues Ala903 (A903) and Ser913 (S913) (Leong et al., 1995). A total loss of activity occurs with proteins having a disrupted disulfide bond between C907 and C982 or a substitution at D911 (Leong et al., 1993). Even a conservative change to glutamate at D911 results in a drastic reduction in cell binding, whereas bacterial entry is completely abolished (Leong et al., 1995). Thus, D911 appears to play a critical role in recognition of mammalian cell receptors by invasin. We decided to use information regarding the ability of invasin to enter mammalian cells in order to develop a strategy that allows rapid analysis of a number of substitutions within a protein of interest by selecting for altered proteins that retain activity. To date, alanine-scanning mutagenesis has been the strategy of choice
for analyzing large numbers of amino acid substitutions, because the small side chain reduces the likelihood of structural changes resulting from the substitution (Cunningham and Wells, 1989). When dealing with large regions of a protein, this may involve the isolation and characterization of dozens of mutants, whereas only a few strongly affect protein activity (Clackson and Wells, 1995; Irie et al., 1995). This report introduces a strategy called SWIM (selection without isolation of mutants), which allows simultaneous mutagenesis of several amino acids within a short stretch of a protein of interest without isolating individual bacterial colonies or using large numbers of oligonucleotides. SWIM is derived from the binomial mutagenesis technique described by Gregoret and Sauer (1993). However, SWIM does not depend on the isolation and characterization of individual mutants. Instead, the entire mutagenized pool is subjected to an enrichment strategy, and the starting pool and enriched pool are compared by single nucleotide sequencing runs. Alanine substitutions present in the sequencing run from the starting pool that dramatically disrupt activity will not be present in the sequencing run from the enriched pool. In this report, the results of this strategy are described, as well as the analysis of several isolated clones from the mutagenized pool that carried multiple alanine substitutions. This allowed the identification of potential side-chain interactions, based on the lack of additivity of defects found in mutants harboring multiple alanine substitutions.

2. Materials and methods

2.1. Bacterial strains and bacterial and mammalian cell growth conditions

Bacterial strains DH5α (supE44 ΔlacU169 (φ80lacZΔM15) hsdR17 recA1 endA1 gyrA96 thi-1 relA1), JM109 [recA1 supE44 endA1 hsdR17 gyrA96 relA1 thi Δ(lac-proAB)], and BMH71-18 (thi supE Δ(lac-proAB) mutS::Tn10) were grown in L broth or 2XYT supplemented with 100 μg/ml ampicillin or 10 μg/ml tetracycline, as required. HEp2 cells were grown in RPMI 1640 (Irvine Scientific, Santa Ana, CA) containing 5% new-born calf serum (Sigma Chemical Co., St. Louis, MO) at 37°C in a 5% CO2 incubator. HEp2 cells were washed with PBS prior to infection, and the uptake assay was performed in 1.5 ml RPMI 1640 including 20 mM HEPES (pH=7.0) and 0.4 mg/ml BSA (‘Binding Buffer’) for 90 min at 37°C and 5% CO2.

2.2. Binomial codon mutagenesis

Single-stranded DNA (ssDNA) derived from plasmid pSelect2-8 carrying the wild-type inv gene from Yersinia
pseudotuberculosis was mutagenized using a mixture of oligonucleotides (Howard Hughes Medical Institute Biopolymer Facility, Boston, MA) designed to have a degeneracy of 64-fold (Saltman et al., 1996). Six different codons were synthesized as 50% wild-type sequence and 50% alanine substitution (binomial mutagenesis). The oligonucleotide had the sequence 5′ TTCAAGAACCGCAGA(CAT/CGC)(ATC/AGC)(TGA/TGC)ACC(TTG/TGC)GCATTG(TCT/TGC)(GCT/TGC)GGCCTCGAGACTGGA 3′ (negative strand), with those codons substituted by alanine codons in parentheses (wt/Ala). Codons in which more than one substitution was required to obtain alanine were generated by splitting the resin in half and adding precursor nucleotides corresponding to the wild-type codon to one sample or the alanine codon to the other sample (Cormack and Struhl, 1993). After completion of the codons noted in parentheses, the two samples were remixed, and the oligonucleotide was extended further until another mutagenized codon was desired. The oligonucleotide pool generated by this procedure was gel-purified, phosphorylated and annealed to the pSelect2-8 template, and a second-strand synthesis was performed in conjunction with an AmpR repair primer (Promega, Madison, WI). Gel purification of synthesized oligonucleotides was critical, as end labeling revealed several premature termination products. Double-stranded products were transformed into CaCl2-competent E. coli strain BMH71-18 (mutS−) and plated on ampicillin-containing plates. Approximately 800 colonies were pooled, and plasmid DNA was isolated (Qiagen, Chatsworth, CA). E. coli strain DH5α was then electroporated with plasmid DNA derived from the BMH71-18 transformants and plated on ampicillin plates. Approximately 5000–10 000 DH5α AmpR colonies were pooled and grown in 10 ml L broth+ampicillin for 3 h at 37°C, and DNA was again prepared.

2.3. Mutagenized pool enrichment

The mutagenized plasmid pool isolated from DH5α was retransformed into DH5α by electroporation, and 1 ml of a 1 h growout was plated on ampicillin plates. The following day, two confluent 10-cm plates were pooled in 15 ml of L broth and grown 1 h at 37°C. Two microliters of this culture (titer=7.4×10^9 bacteria/ml) were added for 90 min at 37°C, in duplicate, to ~2×10^6 HEp2 cells cultured overnight in RPMI 1640 (Irvine Scientific) containing 5% new-born calf serum in a six-well tissue culture plate. The remainder of the bacterial culture was used to isolate plasmid DNA from the starting pool (‘starting pool DNA’). In parallel, HEp2 cells were challenged with 3 μl non-transformed DH5α (titer 3.4×10^9 bacteria/ml) as a control. Infected cells were washed once with PBS, and extracellular bacteria were killed by the addition of 10 μg/ml gentamicin to the binding buffer for 45 min. Following gentamicin treatment, the cells were washed three times with PBS and lysed in 0.5 ml of H2O containing 0.1% Triton X-100. Twenty microliters of 10^−2 dilutions of cell lysate were plated on to ampicillin plates, and the uptake efficiency was determined. Entry efficiencies were 0.67% (mutant pool) and 0.0008% (DH5α without plasmid). Colonies derived from 50 μl of undiluted cell lysate from both experimental enrichments harboring the mutagenized plasmid were pooled separately and referred to as Inv+ pool-1 and Inv+ pool-2. Pooled colonies (~5000–10 000 each pool) were resuspended in 15 ml of L broth containing ampicillin, and grown for 90 min at 37°C before plasmid DNA was isolated en masse for sequencing (Qiagen; plasmid preparation from ‘enriched pools’). Two micrograms of plasmid DNA from the mutagenized starting pool and both Inv+ HEp2 cell enriched pools were sequenced using Sequenase (US Biochemical Co., Cleveland, OH) and the primer 5′ GCGAAAAGTAAAAAATTCCC 3′ at a concentration of ~5 mM. Reactions were radiolabeled using 10 μCi of α-35S-dATP (New England Nuclear). Sequencing gels were analyzed by densitometry (Molecular Dynamics Computing Densitometer utilizing ImageQuant v.3.2 software), and the percentage having the mutated codon present at each of the targeted positions relative to the wild-type sequence was calculated, with 100% being a 1:1 ratio of alanine to wild type (true binomial mutagenesis). Percentages from the starting pool were compared to the HEp2 cell enriched pools to assess the ability of invasin to tolerate particular alanine substitutions.

2.4. Activity of single and multiple mutants

Isolated invasin clones were colony-purified from either the starting mutagenized pool or the HEp2 cell enriched pool (Inv+) and sequenced using Sequenase (Sanger et al., 1977). In addition, two mutants, Q908A and M912A, were constructed and sequenced separately.
Cellular uptake assays on purified bacterial strains were performed as described above except that the assays were performed in 300 μl binding buffer using Falcon 24-well tissue culture plates, and 50 μg/ml gentamicin was used to kill extracellular bacteria. These assays were performed in triplicate, and standard deviations were determined.
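The binomial oligonucleotide design of Section 2.2 can be checked with a short script. The sketch below is only an illustration, not the authors' procedure: it takes the negative-strand sequence as printed above, expands every (wt/Ala) choice, and translates the reverse complement with the standard genetic code. The reading frame (translation starting at the first base of the reverse complement) and the window used to extract residues 904–912 are assumptions inferred from the printed sequence; all function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Negative-strand oligonucleotide from Section 2.2.
# Plain strings are fixed sequence; tuples are (wild-type codon, alanine codon).
SEGMENTS = [
    "TTCAAGAACCGCAGA",
    ("CAT", "CGC"),  # M912
    ("ATC", "AGC"),  # D911
    ("TGA", "TGC"),  # S910
    "ACC",
    ("TTG", "TGC"),  # Q908
    "GCATTG",
    ("TCT", "TGC"),  # R905
    ("GCT", "TGC"),  # S904
    "GGCCTCGAGACTGGA",
]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA sequence in frame, starting at its first base."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def enumerate_pool():
    """Map each oligonucleotide variant to the S904-M912 peptide it encodes."""
    choices = [seg if isinstance(seg, tuple) else (seg,) for seg in SEGMENTS]
    pool = {}
    for parts in product(*choices):
        oligo = "".join(parts)
        peptide = translate(reverse_complement(oligo))
        pool[oligo] = peptide[5:14]  # assumed window: residues 904-912
    return pool


if __name__ == "__main__":
    pool = enumerate_pool()
    wild_type = "SRQCQGSDM"
    multi = sum(
        1 for pep in pool.values()
        if sum(a != b for a, b in zip(pep, wild_type)) >= 2
    )
    print(len(pool), "variants;", multi, "carry two or more alanine substitutions")
```

Each variant differs from the wild-type window only by alanine at one or more of the six targeted positions, which is the property the codon-replacement synthesis is meant to guarantee.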

2.5. Invasin surface expression

To determine the amount of surface-localized invasin, 30-μl aliquots of overnight bacterial cultures used for uptake assays were immobilized on Falcon 24-well plates coated with 10 μg/ml poly-lysine. Wells were washed once with PBS and fixed for 15 min at room temperature in PBS containing 3% paraformaldehyde. Wells were then blocked for 1 h at room temperature in PBS containing 10 mg/ml bovine serum albumin (BSA) and probed with supernatants containing mouse mAb 3A2 directed against invasin (Leong et al., 1991) supplemented with a 1:2000 dilution of a rabbit polyclonal antibody directed against E. coli OmpA protein (a kind gift from Carol Kumamoto). After probing for 2 h at room temperature, the fixed bacteria were washed five times with PBS and challenged with goat anti-mouse-horseradish peroxidase (HRP) at room temperature as described by Leong et al. (1995). After probing for 1 h, the wells were washed five times with PBS and incubated with a 1:100 dilution of 2,2-azino-di(3-ethylbenzthiazoline) sulfonic acid (ABTS) (Zymed, South San Francisco, CA) in 100 mM NaCitrate (pH=4.5), containing 0.03% H2O2. The wells were then washed, reprobed with goat anti-rabbit-alkaline phosphatase (AP) for 1 h at room temperature, washed five times with PBS and developed in 100 mM Tris (pH=9.5) containing 1 mg/ml p-nitrophenylphosphate (PNPP) (Sigma Chemical Co., St. Louis, MO), 100 mM NaCl and 50 mM MgCl2 to quantify the amount of OmpA. The total invasin surface exposure was expressed as the ABTS signal/PNPP signal.

2.6. Analysis of additivity

The additivity of the effects of multiple alanine mutations was determined by comparing the uptake efficiency of a mutant containing multiple substitutions with the predicted defect, which should be the product of the relative uptake efficiencies of each of the single codon changes. Predicted standard deviations (SD_p) were calculated as

$$\mathrm{SD}_p = \left[ (\mathrm{SD}_1)^2 \left[ (\mathrm{SD}_2)^2 + (\mathrm{meas}_2)^2 \right] + (\mathrm{meas}_1)^2 (\mathrm{SD}_2)^2 \right]^{0.5},$$

where SD_N = standard deviation of a particular mutant N, meas_N = measured entry value for a particular alanine mutation N, and SD_p = predicted standard deviation of the multiple mutant (Susan Murray, pers. commun.). Measured defects were determined to be significantly different from the predicted defect by employing Chebychev’s Inequality, which states that for any distribution, at least $1 - (1/k)^2$ of the measurements in that distribution lie within k standard deviations of the mean (Casella and Berger, 1990; Pagano and Gauvreau, 1993). Using this equation, our measured defect for a mutant having multiple codon changes must be greater than 5 SD from the predicted mean to obtain a p<0.04, where the standard deviation is estimated from triplicate measurements.
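The additivity calculation of Section 2.6 can be sketched as follows. This is a minimal illustration rather than the authors' own code: it multiplies the relative uptake values of single mutants, propagates the standard deviation with the product formula given above, and converts the distance between observed and predicted uptake into the Chebychev upper bound 1/k^2. Folding the two-mutant formula over three or more substitutions, and using only the predicted SD when computing k, are assumptions of this sketch; the function names are hypothetical. The example values are the single-mutant entries of Table 2.

```python
import math
from functools import reduce


def product_mean_sd(a, b):
    """Mean and SD of the product of two independent (mean, sd) measurements."""
    (m1, s1), (m2, s2) = a, b
    mean = m1 * m2
    sd = math.sqrt(s1 ** 2 * (s2 ** 2 + m2 ** 2) + m1 ** 2 * s2 ** 2)
    return mean, sd


def predicted_defect(singles):
    """Fold the two-mutant formula over any number of single substitutions."""
    return reduce(product_mean_sd, singles)


def chebyshev_bound(observed, predicted_mean, predicted_sd):
    """Return (k, upper bound on p) for an observation k SDs from the prediction."""
    k = abs(observed - predicted_mean) / predicted_sd
    if k == 0:
        return 0.0, 1.0
    return k, min(1.0, 1.0 / k ** 2)


if __name__ == "__main__":
    s904a = (0.605, 0.107)  # relative uptake and SD, Table 2
    m912a = (0.250, 0.013)
    mean, sd = predicted_defect([s904a, m912a])
    k, p_max = chebyshev_bound(0.005, mean, sd)  # observed uptake of ARQCQGSDA
    print(f"predicted {mean:.3f} +/- {sd:.3f}; k = {k:.2f}; p <= {p_max:.3f}")
```

Because the bound is 1/k^2, the threshold of 5 SD stated in the text corresponds to p ≤ 1/25 = 0.04 without any assumption about the shape of the distribution.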

3. Results

3.1. Determination of tolerated substitutions by SWIM

We used SWIM to analyze several alanine substitutions neighboring and including residue D911 of invasin
by designing a single oligonucleotide pool that contained an equal mix of wild-type codons and alanine codons at six positions (S904, R905, Q908, S910, D911 and M912). To ensure that the only possible mutations to arise would be changes to alanine, the oligonucleotide was synthesized using the codon replacement technique for sites that required two nucleotide changes to obtain an alanine (Cormack and Struhl, 1993). This created a pool with a complexity of 64 potential mutants from a single synthesis run. To enrich for bacteria able to enter mammalian cells, the mutagenized pool was then transformed into E. coli DH5α, and the resulting bank of strains was used to challenge a HEp2 cell monolayer followed by gentamicin killing. Those derivatives capable of entering the mammalian cells were protected from the antibiotic and recovered by lysis of the cultured mammalian cells following removal of the gentamicin. Plasmid DNA from the entire heterogeneous pool prior to enrichment (‘starting pool’) was then sequenced in one set of reactions and compared to the sequence of the entire pool that survived HEp2 cell enrichment (‘enriched pool’, Fig. 1). The relative distribution of each codon in the enriched pool was then assessed by densitometry (Table 1 and Fig. 2), comparing the fraction of the pool harboring each alanine codon change after enrichment to the fraction of the pool harboring the change before enrichment. Of the six residues targeted for mutagenesis, only D911A was dramatically underrepresented in the enriched pool that survived gentamicin treatment, as determined by comparing sequencing ladders before and after enrichment (Fig. 2, Table 1). Other alanine codon changes were present in the enriched pool at ~37–100% of their starting levels, indicating that individual proteins having multiple alanine changes in this region are still competent to promote entry, as the majority of oligonucleotides in the original pool contained multiple alanine substitutions (Table 1). Results were in good agreement with the known effects of individual mutations previously isolated in this region of invasin (Leong et al., 1995). The SWIM mutagenesis technique, however, allowed us to assess the effects of six alanine mutations in the invasin protein simultaneously from a single infection, using sequencing information from only two plasmid preparations and without the isolation of individual bacterial strains.

Fig. 1. SWIM mutagenesis: comparison of input versus enriched pools. The region of invasin spanning from S904 to M912 was chosen for SWIM mutagenesis with six of nine residues simultaneously targeted for alanine substitution by a single oligonucleotide. Those codons that required more than one nucleotide change to obtain alanine were synthesized by codon mutagenesis (Materials and methods; Cormack and Struhl, 1993). The mutagenized pool was enriched by passage through HEp-2 cells, and the starting pool and enriched pool were compared by sequencing.

Fig. 2. SWIM mutagenesis reveals the loss of D911A substitution after enrichment. The SWIM starting pool and Inv+ pool were compared by sequencing for the retention of alanine substitutions following HEp-2 cell enrichment. DNA was obtained as described in Materials and methods, and sequencing reactions were performed simultaneously and run on the same polyacrylamide gel. The D911A substitution that was present in the starting pool was totally absent following enrichment.

Table 1
Loss of D911A substitution from enriched pools

Substitution (wt sequence SRQCQGSDM)    Enriched/start (±SD)
S904A                                   0.57±0.17
R905A                                   0.52±0.24
Q908A                                   1.0±0.13
S910A                                   0.78±0.28
D911A                                   <0.00004a
M912A                                   0.37±0.18

Plasmid DNA was isolated from duplicate SWIM uptake enrichment assays. The nucleotide sequence of both enriched pools was determined individually, and the mutagenized region was compared to the sequence from the starting SWIM mutagenized invasin pool. The fraction of alanine substitution at each codon targeted in the starting pool was calculated, compared to the fraction of that particular mutation appearing in the HEp-2 enriched pool, and is presented as enriched pool/starting pool (Enriched/Start). The absolute frequencies at which individual substitutions appeared in the starting pool were: 36% S904A, 29% R905A, 19% Q908A, 15% S910A, 13% D911A and 11% M912A, where 50% represents the true binomial representation. Values presented are averages from two uptake enrichments. The sample size (number of bands compared on a sequencing gel) varied depending on the number of nucleotides changed in order to obtain an alanine codon as follows: S904A n=6 (3 nucleotides in duplicate); R905A n=4; Q908A n=4; S910A n=2; D911A n=2; M912A n=4. Codons S904A, R905A, Q908A and S910A were compared based on a 2-day gel exposure as shown in Fig. 2. Codons D911A and M912A were compared based on a 10-day gel exposure due to their faintness at 2 days.
a Below the limit of densitometry detection.
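The arithmetic behind Table 1 can be made explicit with a small example. The sketch below is illustrative only: it uses the starting-pool frequencies and Enriched/Start ratios reported in Table 1, treats the D911A entry as its reported upper bound, and derives the enriched-pool fraction implied by each ratio. The dictionary and function names are hypothetical, and the implied fractions are derived values, not measurements reported in the paper.

```python
# Fraction of the starting pool carrying each alanine codon (Table 1 notes).
START_FRACTION = {
    "S904A": 0.36, "R905A": 0.29, "Q908A": 0.19,
    "S910A": 0.15, "D911A": 0.13, "M912A": 0.11,
}

# Enriched/start ratios from Table 1; D911A is the reported upper bound.
ENRICHED_OVER_START = {
    "S904A": 0.57, "R905A": 0.52, "Q908A": 1.0,
    "S910A": 0.78, "D911A": 0.00004, "M912A": 0.37,
}


def implied_enriched_fraction(substitution):
    """Enriched-pool fraction implied by the starting fraction and the ratio."""
    return START_FRACTION[substitution] * ENRICHED_OVER_START[substitution]


def percent_of_binomial(substitution):
    """Starting incorporation as a percentage of the ideal 50% representation."""
    return 100.0 * START_FRACTION[substitution] / 0.5


def rank_by_retention():
    """Substitutions ordered from best to worst retained after enrichment."""
    return sorted(ENRICHED_OVER_START, key=ENRICHED_OVER_START.get, reverse=True)


if __name__ == "__main__":
    for sub in rank_by_retention():
        print(
            f"{sub}: start {START_FRACTION[sub]:.2f} "
            f"({percent_of_binomial(sub):.0f}% of binomial), "
            f"implied enriched fraction {implied_enriched_fraction(sub):.4f}"
        )
```

Ranking by the ratio rather than by the raw enriched fraction corrects for the uneven incorporation of each codon in the starting pool.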

3.2. Characterization of single amino acid substitutions

To determine whether the degree of underrepresentation of a particular codon change in the SWIM enrichment strategy was consistent with the known defect caused by that single-point mutation, strains harboring individual alanine substitutions were isolated. This also allowed us to compare the magnitude of the defects caused by single substitutions with those caused by multiple alanine changes. Colony-purified mutants were tested for internalization by HEp2 cells, and activities were expressed as efficiency of uptake relative to strains expressing wild-type invasin (Table 2). All proteins having a single alanine change, including D911A, were expressed on the bacterial cell surface at levels comparable to wild-type invasin (Table 2). In agreement with previous results, strains harboring D911A had uptake levels ~300–400-fold below that seen for strains encoding wild-type invasin (Leong et al., 1995). The remainder of strains harboring single substitutions ranged from a fourfold defect in uptake efficiency (M912A) to no observable defect (Q908A). As would be expected, the substitution having the least effect on entry (Q908A) was also the most highly represented in the enrichment pool, based on SWIM (SRQCAGSDM, Tables 1 and 2). Similarly, the substitution causing the largest defect in uptake that allowed some activity (M912A; Table 2, SRQCQGSDA) was the least well represented of the entry-competent strains in the HEp2 cell enriched pool (Table 1). These results are remarkably similar to the less quantitative analysis using SWIM and densitometry in which D911A was completely absent from the enriched pool, but the remaining five mutants were found to be tolerated (Table 1).

Table 2
Bacterial entry efficiencies of six single alanine substitutions

Substitution        Uptake (±SD)      Surface expression
SRQCQGSDM (wt)      1.00±0.191        1.000±0.089
ARQCQGSDM           0.605±0.107       1.246±0.132
SAQCQGSDM           0.725±0.037       1.338±0.103
SRQCAGSDM           1.02±0.094        1.077±0.080
SRQCQGADM           0.669±0.121       0.985±0.132
SRQCQGSAM           0.003±0.002       1.123±0.081
SRQCQGSDA           0.250±0.013       0.940±0.051
DH5α (inv−)         0.002±<0.001      0.073±0.009

Uptake efficiencies of strains harboring single alanine point mutations relative to strains encoding wild-type invasin (n=3). Surface expression corresponds to amount of invasin on the bacterial surface (Materials and methods) determined on identical cultures used in the uptake assay (n=3).

3.3. Characterization of multiple substitutions and additivity

Analysis of strains encoding invasin derivatives having multiple alanine substitutions allowed us to assess the additivity of various amino acid changes. In theory, amino acid substitutions at residues that neither interact with one another nor affect the ability of one another to bind substrate show additivity in their defects (Wells, 1990; Gregoret and Sauer, 1993). To assess the additivity of the six alanine mutations, 11 colony-purified strains were analyzed for entry efficiencies: six double, four triple, and one quadruple substitution. If the defects exhibited by the substitutions were independent, then the magnitude of the defect observed in an individual strain should be equal to the product of defects caused by individual substitutions. The predicted magnitude of defects for each of the strains harboring multiple alanine mutations was calculated, and this was compared to the observed levels of entry for each mutant. Since multiplication of two normal distributions rarely gives a product that is also normal, we assessed whether the predicted vs. observed values were significantly different by employing Chebychev’s Inequality (Table 3, p values; Casella and Berger, 1990; Pagano and Gauvreau, 1993). Most strains containing multiple alanine substitutions were not significantly different in their entry efficiencies from that predicted by the analysis of single substitutions (Table 3). Three strains, however, had activities significantly different from the predicted values (Table 3). Two of these carried both the S904A and M912A substitutions (ARQCQGSDA and AAQCQGSDA). Both of these mutants were more than 100-fold defective for bacterial uptake relative to the wild type (Table 3). These results suggest that S904 and M912 may have side chains that interact with one another within the cell binding domain of invasin, or their combined absence may significantly affect the ability of D911 to recognize a receptor. The other non-additive substitution was SRQCQGADA (S910A/M912A), again involving residue M912. In this case, the double change was approximately fivefold less defective for uptake than predicted from the single substitutions (Table 3). Presence of the S910A mutation alleviated the defect of an invasin molecule carrying the M912A mutation, suggesting that M912 does not directly bind receptor. As this residue is contiguous to D911, which is critical for bacterial uptake, it may affect the presentation of residues crucial for binding the mammalian integrin receptor.

Table 3
Additive effects of invasin molecules carrying multiple substitutions

Substitution        Experimental uptake (±SD)   Predicted uptake (±SD)   Experimental/predicted   Surface expression
SRQCQGSDM (wt)      1.00±0.191                  1.00                     1.00                     1.000±0.089
ARQCQGADM           0.800±0.048                 0.405±0.103              1.98                     1.257±0.060b
ARQCQGSDA           0.005±0.002                 0.151±0.028              0.034b                   1.224±0.015
SAQCQGSDA           0.263±0.013                 0.191±0.013              1.38                     1.314±0.024b
SRQCAGADM           0.693±0.069                 0.682±0.139              1.02                     1.107±0.051
SRQCAGSDA           0.103±0.027                 0.255±0.027              0.404                    1.037±0.037
SRQCQGADA           0.805±0.029                 0.167±0.032              4.82c                    1.352±0.186b
AAQCQGADM           0.067±0.014                 0.293±0.076              0.229d                   1.186±0.033b
AAQCQGSDA           0.007±<0.001                0.110±0.021              0.060e                   1.257±0.034b
SAQCAGADM           0.580±0.105                 0.495±0.104              1.17                     1.262±0.057
SAQCAGSAM           0.001±<0.001                0.002±0.001              0.500                    1.799f
ARQCQGAAA           0.002±0.001                 <0.001±<0.001            NA                       0.779f
DH5α (inv−)         0.002±<0.001                NA                       NA                       0.073±0.009

Determination of additivity of defects in mutants containing multiple alanine substitutions. Observed uptake efficiencies (n=3) were compared to predicted efficiencies based on entry of mutants containing single alanine substitutions (Table 2), and predicted SDs were determined (Materials and methods). The degree of divergence from predicted efficiencies was expressed as the ratio of observed to predicted uptake (Experimental/predicted). The p value determination is described in the Materials and methods (Pagano and Gauvreau, 1993). The p value range was based on the upper limit determined using Chebychev’s inequality while assuming a known SD, and the lower limit assuming both distributions to be normal with an estimated SD based on three measurements.
a Surface expression determined on a different culture to the uptake assay (n=3).
b 0.040
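The nine-letter names used in Tables 2 and 3 encode each mutant as the sequence of residues 904–912, so the substitutions and the predicted uptake of any multiple mutant can be read off mechanically. The sketch below illustrates this under stated assumptions: the position numbering starts at S904, the single-mutant uptake values are those of Table 2, and the helper names are hypothetical. Small differences from the tabulated ratios can arise because the table uses rounded predicted values.

```python
import math

WILD_TYPE = "SRQCQGSDM"   # residues 904-912 of invasin
FIRST_RESIDUE = 904

# Relative uptake of each single alanine substitution (Table 2).
SINGLE_UPTAKE = {
    "S904A": 0.605, "R905A": 0.725, "Q908A": 1.02,
    "S910A": 0.669, "D911A": 0.003, "M912A": 0.250,
}


def substitutions(mutant):
    """List the substitutions (e.g. 'S904A') encoded by a nine-letter name."""
    if len(mutant) != len(WILD_TYPE):
        raise ValueError("mutant name must have nine letters")
    return [
        f"{wt}{FIRST_RESIDUE + i}{mut}"
        for i, (wt, mut) in enumerate(zip(WILD_TYPE, mutant))
        if wt != mut
    ]


def predicted_uptake(mutant):
    """Product of single-mutant uptake values, assuming independent defects."""
    return math.prod(SINGLE_UPTAKE[s] for s in substitutions(mutant))


def experimental_over_predicted(mutant, observed):
    """Ratio used in Table 3 to express divergence from additivity."""
    return observed / predicted_uptake(mutant)


if __name__ == "__main__":
    for name, observed in [("ARQCQGSDA", 0.005), ("SRQCQGADA", 0.805)]:
        subs = "/".join(substitutions(name))
        ratio = experimental_over_predicted(name, observed)
        print(f"{name} ({subs}): predicted {predicted_uptake(name):.3f}, ratio {ratio:.3f}")
```

A ratio well below 1 (as for S904A/M912A) indicates a defect larger than additive, while a ratio well above 1 (as for S910A/M912A) indicates partial suppression.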

4. Discussion

Determination of amino acids critical for protein–protein interaction often involves the isolation and characterization of dozens of substitutions when alteration of only a few residues results in strong binding defects (Clackson and Wells, 1995; Irie et al., 1995; Gaal et al., 1996). The method of choice for large-scale, unselected mutagenesis is alanine scanning, because this introduces the shortest side chain that does not show a predilection for structural changes (Zubay, 1988). SWIM mutagenesis is an alternate strategy that expands on the binomial mutagenesis technique of Gregoret and Sauer (1993). Using SWIM, we generated pools of plasmids carrying the Y. pseudotuberculosis inv gene, in which six amino acid positions were targeted to be 50% wild type and 50% alanine (Fig. 1). The region that we selected for mutagenesis was a short sequence of invasin known to be involved in receptor recognition surrounding the critical residue D911 (Leong et al., 1995). Substitutions in residues that potentially cause defects by structurally altering the protein were avoided, such as C907 and G909. The former residue is known to participate in a disulfide
bond with C982 (Leong et al., 1993). This pool was then introduced into DH5α and enriched by passaging once through HEp2 cells. After enrichment, both the starting plasmid pool and the enriched plasmid pool were sequenced and compared for the presence of alanine at each of the six positions (Fig. 2). Of the six targeted residues, D911A was the most dramatic in its defect for uptake as determined by SWIM. Whereas D911A was present prior to HEp2 cell enrichment, no clones carrying this mutation were represented in the enriched pool (Table 1). This was consistent with the entry efficiencies of individual alanine mutants determined independently (Table 2) and previous results (Leong et al., 1995). Thus, SWIM is a rapid way in which to analyse regions of a protein for the amino acids that are most critical for activity. Using this technique, a protein of interest can be divided into manageable blocks of amino acids, and these amino acids can be mutagenized simultaneously. We have used six amino acids, but others have targeted as many as 11 amino acids at once for binomial mutagenesis (Gregoret and Sauer, 1993). One is limited by the ability of the protein to maintain stability while harboring multiple mutations and the feasibility of generating long mutagenic oligonucleotides. Since we used oligonucleotide primer extension rather than cassette replacement, our frequency of incorporation of alanine substitutions ranged from 11 to 36% alanine codon substitution at each mutagenized residue
(Fig. 2 and Table 1). The low incorporation of mutations may in part be due to poor annealing of oligonucleotides containing multiple substitutions to a wild-type single-stranded DNA template. Replacement with alanine codons occurred in a gradient-like fashion, with those substitutions encoded by the 3′ end of the oligo being incorporated better than those at the 5′ end (the oligos were in the negative strand). Addition of a longer 5′ unmutagenized sequence (clamp) may give a more equivalent incorporation in the future. Additionally, those oligonucleotides with fewer mutations may outcompete those with multiple mutations during the annealing step of generation of the mutagenized pool. This hypothesis is supported by the fact that, upon sequencing randomly picked clones from the starting pool, several single, double, and triple mutants were obtained, but only one quadruple mutant was isolated, and no mutants carrying five or six substitutions were found among 23 sequenced. Replacing the region of interest with a double-stranded DNA cassette containing the alanine substitutions should give a ratio closer to 50% mutant, but relies on the availability of convenient restriction sites throughout a gene, if one is to cover a large region with multiple cassettes. Levels of incorporation of alanine substitutions were sufficient using oligonucleotide primer extension to assess whether an alanine substitution was detrimental to the activity of the protein. This was clearly demonstrated with the D911A substitution (Fig. 2 and Table 1, SRQCQGSAM). Following the generation of a mutagenized pool, a single enrichment was all that was required to assess the location of critical residues within the region of interest. The rapidity, simplicity and affordability of SWIM make it a technique with great potential for characterizing regions of interaction between proteins.
In addition to SWIM, we took this opportunity to determine the additivity of several isolated mutants containing multiple alanine substitutions. Of 11 isolated clones, eight appeared to be additive in their defects for HEp2 cell entry, as their observed defect was predicted by multiplying the observed defects for the individual alanine substitutions represented in that particular clone (Table 3). Two of the non-additive mutants carried both the S904A and M912A mutations and were severely defective for HEp2 cell uptake (Table 3, ARQCQGSDA and AAQCQGSDA), whereas the S904A and M912A single substitution mutants were each capable of efficiently entering mammalian cells (Table 2). S904 and M912 may be involved in proper presentation of residues within invasin critical for integrin binding, and elimination of one side chain could be compensated by the presence of the other. Upon truncation of both side chains to alanine, crucial residues such as D911 may no longer be able to interact functionally with the receptor. Alternatively, S904 and M912 may interact with the receptor directly, but rely on D911 for delivery to the proper site on the integrin. The presence of either residue may allow sufficient receptor interaction to establish binding, but the loss of both side chains disrupts the interaction surfaces between invasin and the β1 integrin receptor. The remaining non-additive mutant contained S910A and M912A. This mutant had a higher activity than expected (Table 3), indicating that removal of the S910 side chain somehow suppressed the defect imparted by the M912A mutation. One possibility is that the substitution of M912 with alanine leaves S910 in a conformation where it interferes with receptor binding. The second substitution would then alleviate the defect caused by M912A by removing the interfering side chain of S910. These residues are separated on the invasin molecule by D911, the most critical amino acid in this region. The fact that all three non-additive mutants contain the M912A substitution suggests that this residue plays a previously unappreciated role in bacterial uptake. Because M912 is adjacent to D911, it could affect the presentation of D911 to β1 integrins.
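The additivity test described above amounts to comparing each multiple mutant with the product of the activities of its single substitutions. The following is a minimal sketch of that comparison. The activity values are hypothetical placeholders rather than the data of Tables 2 and 3, and the two-fold tolerance is an assumed cutoff chosen for illustration, not the statistical criterion used in this study.

```python
from math import prod


def predicted_activity(single_activities, substitutions):
    """Multiplicative prediction: the product of the relative uptake
    activities (wild type = 1.0) of each single substitution."""
    return prod(single_activities[s] for s in substitutions)


def is_additive(observed, predicted, fold_tolerance=2.0):
    """True if observed activity lies within fold_tolerance of the
    prediction. The fold cutoff is an illustrative assumption."""
    if observed <= 0 or predicted <= 0:
        raise ValueError("activities must be positive")
    ratio = observed / predicted
    return 1 / fold_tolerance <= ratio <= fold_tolerance


if __name__ == "__main__":
    # Hypothetical single-mutant activities relative to wild type.
    singles = {"S904A": 0.9, "S910A": 0.8, "M912A": 0.5}
    # Hypothetical observed activities for two double mutants.
    doubles = {
        ("S904A", "M912A"): 0.02,  # far below prediction
        ("S904A", "S910A"): 0.70,  # close to prediction
    }
    for subs, observed in doubles.items():
        expected = predicted_activity(singles, subs)
        verdict = "additive" if is_additive(observed, expected) else "non-additive"
        print("/".join(subs), f"predicted={expected:.2f}",
              f"observed={observed:.2f}", verdict)
```

A mutant that falls well below its prediction corresponds to the S904A/M912A pattern, whereas one that falls well above it corresponds to the suppression seen with S910A/M912A.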

Acknowledgement

This work was supported by grant R01-AI23538 to RI, training grant T-32AI07422 to EK, and the Center for Gastroenterology Research on Absorptive and Secretory Processes, PHS grant 1 P30DK39428 awarded by NIDDK. We thank Dr John Coffin for discussions involving the development of SWIM, John Rush at the Howard Hughes Biopolymer Facility for generating the codon-mutagenized SWIM oligonucleotides, the Isberg lab for helpful discussions, Dr Petra Dersch for development of the invasin surface expression assay, Dr Joan Mecsas for proposing the name SWIM, Dr Carol Kumamoto for anti-OmpA antiserum, and Dr Susan Murray for assistance in statistical analysis.

References

Casella, G., Berger, R.L., 1990. Statistical Inference. Wadsworth and Brooks/Cole, Pacific Grove, CA.
Clackson, T., Wells, J.A., 1995. A hot spot of binding energy in a hormone–receptor interface. Science 267, 383–386.
Cormack, B.P., Struhl, K., 1993. Regional codon randomization: defining a TATA-binding protein surface required for RNA polymerase III transcription. Science 262, 244–248.
Cunningham, B.C., Wells, J.A., 1989. High-resolution epitope mapping of hGH-receptor interactions by alanine-scanning mutagenesis. Science 244, 1081–1085.
Gaal, T., Ross, W., Blatter, E.E., Tang, H., Jia, X., Krishnan, V.V., Assa-Munt, N., Ebright, R.H., Gourse, R.L., 1996. DNA-binding determinants of the α subunit of RNA polymerase: novel DNA-binding domain architecture. Genes Dev. 10, 16–26.
Gregoret, L.M., Sauer, R.T., 1993. Additivity of mutant effects assessed by binomial mutagenesis. Proc. Natl. Acad. Sci. USA 90, 4246–4250.
Irie, A., Kamata, T., Puzon-McLaughlin, W., Takada, Y., 1995. Critical amino acid residues for ligand binding are clustered in a predicted β-turn of the third N-terminal repeat in the integrin α4 and α5 subunits. EMBO J. 14, 5550–5556.
Isberg, R.R., Falkow, S., 1985. A single genetic locus encoded by Yersinia pseudotuberculosis permits invasion of cultured animal cells by Escherichia coli K-12. Nature 317, 262–264.
Isberg, R.R., Leong, J.M., 1990. Multiple beta 1 chain integrins are receptors for invasin, a protein that promotes bacterial penetration into mammalian cells. Cell 60, 861–871.
Isberg, R.R., Voorhis, D.L., Falkow, S., 1987. Identification of invasin: a protein that allows enteric bacteria to penetrate cultured mammalian cells. Cell 50, 769–778.
Jerse, A.E., Yu, J., Tall, B.D., Kaper, J.B., 1990. A genetic locus of enteropathogenic Escherichia coli necessary for the production of attaching and effacing lesions on tissue culture cells. Proc. Natl. Acad. Sci. USA 87, 7839–7843.
Leong, J.M., Fournier, R.S., Isberg, R.R., 1990. Identification of the integrin binding domain of the Yersinia pseudotuberculosis invasin protein. EMBO J. 9, 1979–1989.
Leong, J.M., Fournier, R.S., Isberg, R.R., 1991. Mapping and topographic localization of epitopes of the Yersinia pseudotuberculosis invasin protein. Infect. Immun. 59, 3424–3433.
Leong, J.M., Morrissey, P.E., Isberg, R.R., 1993. A 76-amino acid disulfide loop in the Yersinia pseudotuberculosis invasin protein is required for integrin receptor recognition. J. Biol. Chem. 268, 20524–20532.
Leong, J.M., Morrissey, P.E., Marra, A., Isberg, R.R., 1995. An aspartate residue of the Yersinia pseudotuberculosis invasin protein that is critical for integrin binding. EMBO J. 14, 422–431.
Pagano, M., Gauvreau, K., 1993. Principles of Biostatistics. Wadsworth, Belmont, CA.
Rankin, S., Isberg, R.R., Leong, J.M., 1992. The integrin-binding domain of invasin is sufficient to allow bacterial entry into mammalian cells. Infect. Immun. 60, 3909–3912.
Saltman, L.H., Lu, Y., Zaharias, E.M., Isberg, R.R., 1996. A region of the Yersinia pseudotuberculosis invasin protein that contributes to high affinity binding to integrin receptors. J. Biol. Chem. 271, 23438–23444.
Sanger, F., Nicklen, S., Coulson, A.R., 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463–5467.
Wells, J.A., 1990. Additivity of mutational effects in proteins. Biochemistry 29, 8509–8517.
Zubay, G., 1988. Biochemistry, 2nd ed. MacMillan, New York.