METHODS: A Companion to Methods in Enzymology Vol. 3, No. 3, December, pp. 183-192, 1991
Antibody Engineering Using Very Long Template-Assembled Oligonucleotides
Paul Carter,*,1 Lisa Garrard,† and Dennis Henner†
Departments of *Protein Engineering and †Cell Genetics, Genentech, Inc., 460 Point San Bruno Boulevard, South San Francisco, California 94080
We have previously constructed single-stranded DNA fragments up to 361 nucleotides long by the ligation of shorter synthetic fragments assembled on a closely homologous template and used them for site-directed mutagenesis. Here we have extended the utility of such long oligonucleotides for protein engineering in two ways. First, oligonucleotides have been assembled on a template to which they show only relatively weak sequence homology (<70% identity), allowing conversion of genes encoding rather distantly related proteins. We have converted a gene encoding human light chain constant domain from the kappa to the lambda isotype; the two isotypes share <40% amino acid identity. Second, libraries of random mutants were constructed within antibody variable domains with up to 20 targeted codons separated by up to 186 nucleotides. These libraries were created by template-directed assembly of degenerate oligonucleotides followed by PCR amplification and subcloning. Nucleotide sequence analysis of clones demonstrated that this is an effective method for the simultaneous random mutagenesis of several codons that are distant within the target gene. This should provide a useful tool for modifying the ligand binding properties of proteins, e.g., the antigen binding affinity and specificity of antibodies. © 1991 Academic Press, Inc.
1 To whom correspondence should be addressed. Fax: (415) 225-3734. 1046-2023/91 $5.00 Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

Site-directed mutations using mismatched oligonucleotides up to ~100 nucleotides in length are routinely constructed by a variety of efficient strategies (e.g., see Ref. (1) and literature cited therein), allowing mutants to be readily isolated by nucleotide sequencing without prior screening. Our aim has been to extend the scope of site-directed and targeted random mutagenesis using mismatched oligonucleotides to contiguous or discontiguous regions spanning hundreds of nucleotides. This would allow one to mutate readily and simultaneously numerous residues that are distant in the DNA sequence but are physically close in the three-dimensional structure of the corresponding protein. Mutagenesis of such distant sites is likely to be widely useful, e.g., for modifying the complementarity determining region (CDR) residues of an antibody or residues at the active site of an enzyme to modify antigen or substrate specificities, respectively. Unfortunately, it is not possible to reliably synthesize oligonucleotides that are hundreds of nucleotides in length by conventional DNA synthesis involving the stepwise addition of activated nucleotide monomers (reviewed in (2)). To overcome this limitation, we previously constructed oligonucleotides up to 361 nucleotides in length by ligation of 6 shorter oligonucleotides assembled on a closely homologous (≥84% sequence identity) template. Two such assembled oligonucleotides were then used for site-directed mutagenesis for the efficient and simultaneous humanization of an antibody variable light (VL) gene and a variable heavy (VH) gene (3). Template-directed assembly of very long oligonucleotides would have broader utility if it could be achieved on sequences that are only distantly related. This possibility is demonstrated here by creating a gene encoding human (hu) light chain constant domain (CL)
CARTER, GARRARD, AND HENNER
of the lambda (λ) isotype by mutagenesis of a hu kappa (κ) isotype CL gene, which share <70% identity at the nucleotide level and <40% identity at the amino acid level. These genes should prove useful in a mutational analysis of antibody domain-domain interactions. In particular, we are investigating how hu IgG1 CH1 domain accommodates sequence diversity to pair efficiently with either hu λ CL or hu κ CL. Preassembled synthetic DNA fragments may also be useful in the construction of diverse libraries of humanized antibody variable domains in which CDR residues are randomly mutated. In this study we chose humanized anti-lysozyme antibody D1.3 (huD1.3) (4; Foote and Winter, personal communication) for the construction of such libraries using template-directed assembly of degenerate oligonucleotides and PCR to convert them into double-stranded cassettes for efficient cloning. These libraries could be screened in the context of antibody phage (5-9) to try to isolate humanized antibodies of defined antigen specificity. This strategy will undoubtedly generate some unnatural CDR sequences and perhaps antigen specificities not represented in the natural repertoire. This may therefore provide a useful alternative strategy to the direct isolation of human antibodies by screening antibody phage of the naive human repertoire (10) or generated from immunized SCID-Hu mice (11).
METHODS

Gene Conversion Mutagenesis: Hu κ CL → Hu λ CL

Design of λ CL Oligonucleotides
A set of five contiguous oligonucleotides (LR1 to LR5) was designed to convert hu κ CL to hu λ CL in a single mutagenesis step (Fig. 1). These oligonucleotides are 60 to 108 residues long, contain 18 to 36 mismatches to the hu κ CL template, and in some cases delete or insert 1 or 2 codons. The synthetic fragments are constrained to have 3 or more perfectly matched nucleotides at each end, including as many G and C nucleotides as possible in an attempt to promote efficient annealing and ligation of adjacent oligonucleotides. At least 9 perfectly matched nucleotides were used for the regions corresponding to the 5' and 3' ends of the assembled oligonucleotide. This was an attempt to promote efficient primer extension and subsequent ligation in the in vitro construction of covalent closed circular heteroduplex DNA used for gene conversion mutagenesis. Unique EspI and AccI restriction sites in hu κ CL are removed during the mutagenesis to permit enrichment for mutants by restriction selection (12). Oligonucleotides were synthesized using phosphoramidite (13) chemistry and purified by polyacrylamide gel electrophoresis (14).
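The end-matching rule described above can be checked mechanically. The following is a minimal sketch, assuming the oligonucleotide and the template region it replaces are supplied as pre-aligned strings of equal length written in the same sense (with '-' marking gaps). The function names and the toy sequences are hypothetical and are not taken from Fig. 1.

```python
from typing import Tuple


def end_match_runs(oligo: str, template: str) -> Tuple[int, int]:
    """Return the lengths of the perfectly matched runs at the 5' and 3' ends.

    Both strings must already be aligned to the same length; a gap ('-')
    never counts as a match.
    """
    if len(oligo) != len(template):
        raise ValueError("sequences must be pre-aligned to equal length")

    def run(pairs):
        n = 0
        for a, b in pairs:
            if a != b or a == "-":
                break
            n += 1
        return n

    pairs = list(zip(oligo.upper(), template.upper()))
    return run(pairs), run(reversed(pairs))


def count_mismatches(oligo: str, template: str) -> int:
    """Count aligned positions at which the two sequences differ."""
    return sum(a != b for a, b in zip(oligo.upper(), template.upper()))


# Hypothetical toy alignment (not a sequence from this study).
toy_template = "GCCGATTACAGGC"
toy_oligo = "GCCGTTTACTGGC"

five_prime, three_prime = end_match_runs(toy_oligo, toy_template)
meets_internal_rule = min(five_prime, three_prime) >= 3   # rule for internal junctions
meets_terminal_rule = min(five_prime, three_prime) >= 9   # rule for the outermost ends
```

In this toy case the matched runs are 4 and 3 nucleotides, so the oligonucleotide would satisfy the rule for internal junctions but not the stricter rule for the two outermost ends of the assembled fragment.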
FIG. 1. Amino acid sequences of hu κ CL (REI, Ref. (28)) and hu λ CL (NEW, Ref. (29)) numbered according to Kabat et al. (20) together with genes encoding them. The 5' and 3' ends of the oligonucleotides used for gene conversion mutagenesis from κ to λ isotype are shown by single arrows. Unique restriction sites in each gene are shown in lowercase: KpnI, ggtacc; EspI, gctgagc; and AccI, gtctac. Also shown are the location of mismatches (*), insertions (@), and deletions (#).
ANTIBODY ENGINEERING USING ASSEMBLED OLIGONUCLEOTIDES
Assembly of λ CL Oligonucleotides

The five hu λ CL oligonucleotides (~100 pmol each) were separately phosphorylated by incubation for 30 min at 37°C in 20 µl of 50 mM Tris-HCl (pH 8.0), 10 mM MgCl2, 5 mM DTT, 0.25 mM ATP containing 5 units of T4 polynucleotide kinase (Pharmacia). A second aliquot (1 pmol) of the 5' most oligonucleotide, LR1, was phosphorylated under the same conditions except that [γ-32P]ATP (10 µCi, 5000 Ci/mmol, Amersham) was used in place of unlabeled ATP. The five phosphorylated oligonucleotides (30 pmol of LR1 to LR5, plus 1 pmol of 32P-labeled LR1) were annealed with 7.5 pmol of single-stranded pB11 template (light chain from humanized version 1 of the anti-CD3 antibody UCHT1, Ref. (15)) cloned into pUC119 (16) in 66 µl of 40 mM Tris-HCl (pH 8.0) and 8 mM MgCl2 by cooling from 100°C to room temperature over ~20 min. The annealed oligonucleotides were joined by incubation with T4 DNA ligase (18 Weiss units; New England Biolabs) in the presence of 3 µl of 5 mM ATP and 1.5 µl of 0.1 M DTT for 10 min at 14°C. The ligation reaction was boiled for 3 min in the presence of 90 µl of formamide loading dyes (United States Biochemicals) and electrophoresed on a 6% acrylamide sequencing gel (14). Oligonucleotide size standards are readily generated from a dideoxy sequencing reaction. The assembled 357-mer oligonucleotide was located by autoradiography and excised, and the gel slice was frozen for 30 min at -20°C. The gel slice was then incubated in the presence of 400 µl of 0.5 M sodium acetate (pH 4.5), 0.1% (w/v) SDS at 55°C for 60 min. The eluted oligonucleotide was separated from gel debris by spin filtration through a 0.2-µm filter (Millipore No. UFC3 0GV 00) in a microcentrifuge. The oligonucleotide was precipitated by the addition of 2.5 vol ethanol, washed twice with 70% (v/v) ethanol, and finally resuspended in 10 µl of 10 mM Tris-HCl (pH 7.6), 1 mM EDTA.

Mutagenesis: Hu κ CL → Hu λ CL

The assembled oligonucleotide (~0.2 pmol, based on crude estimates of recovered radioactivity) was annealed to 0.2 pmol of single-stranded pB11 template in 10 µl of 40 mM Tris-HCl (pH 7.5) and 16 mM MgCl2 as above. The molar ratio of primer to template (~1:1) was chosen in an attempt to prime as large a fraction of the available template molecules as possible but without favoring spurious priming at additional sites by using a large molar excess of primer. The precise ratio of primer:template is probably not critical to the success of gene conversion mutagenesis since wild-type clones are very efficiently eliminated by the mutagenesis procedure used (see below) and to date we have seen no evidence of spurious priming at additional sites. Heteroduplex DNA was constructed by extending the primer with T7 DNA polymerase in the presence of dNTPs and was transformed into a mismatch repair-deficient Escherichia coli host strain, BMH 71-18 mutL (17), to improve the frequency of mutant clones (18) and grown up in liquid culture as previously described (1). Wild-type clones were eliminated from the resultant phagemid DNA pool by restriction selection (12) using AccI. Resultant mutant clones were analyzed by dideoxynucleotide sequencing (19).

FIG. 2. (a) Schematic representation of template-directed assembly of five oligonucleotides (LR1 to LR5) to create a 357-mer for gene conversion mutagenesis of hu CL from κ to λ isotype: the λ CL oligomers are annealed to the κ CL template, ligated, run on a sequencing gel and autoradiographed, and the 357-mer is isolated from the gel. (b) Autoradiograph of a sequencing gel used to isolate assembled oligonucleotide. Bands: LR5 + LR4 + LR3 + LR2 + LR1*, 357-mer; LR4 + LR3 + LR2 + LR1*, 295-mer; LR3 + LR2 + LR1*, 235-mer; LR2 + LR1*, 127-mer; LR1*, 66-mer.
TABLE 1
Summary of Antibody Gene Conversion Mutagenesis

                               Synthetic DNA           No. of mutations targeted           No. of clones
Starting gene   Target gene    Fragments   Length      Mismatches  Insertions  Deletions   Sequenced  Correct   Error frequency(a)   Ref.
mu4D5 VL        hu4D5 VL(b)    6(c)        311         39          0           0           7          2         0.0042               (1)
mu4D5 VH        hu4D5 VH(b)    6(d)        361         59          0           0           8          2         0.0045               (1)
hu4D5 VL        huUCHT1 VL     4(c)        246         36          0           0           5          0         0.016                (15)
hu4D5 VH        huUCHT1 VH     4(d)        283         46          6           0           6          0         0.010                (15)
huUCHT1 VH      huH52 VH       4(c)        285         33          6           0           4          1         0.0009               (e)
hu κ CL         hu λ CL        5(c)        357         97          6           9           4          0         0.017                (f)

Note. mu, murine; hu, humanized or human; monoclonal antibodies: 4D5 (30), UCHT1 (31), and H52 (32).
(a) Average apparent error frequency per nucleotide (X) calculated as the total number of spurious untargeted mutations (replacements plus deletions, no unplanned insertions found) divided by the total number of nucleotides sequenced corresponding to synthetic DNA.
(b) mu4D5 VL and VH were simultaneously humanized: one of eight clones was perfect and another contained a single silent mutation that did not corrupt the corresponding amino acid sequence.
(c) Oligonucleotides were synthesized using phosphoramidite chemistry (13).
(d) Oligonucleotides were synthesized using H-phosphonate chemistry (33).
(e) P.C., unpublished data.
(f) This work.

Mutagenesis to Generate Antibody Libraries

Design of Oligonucleotides for VL and VH Libraries

Sets of four contiguous oligonucleotides were designed to construct libraries of hu VL and VH mutants. The templates used for assembly of oligonucleotides were based on a humanized version of the anti-lysozyme antibody D1.3 (VH (4); VL, Foote and Winter, personal communication). Residues within the three CDRs of VL and VH were targeted for mutagenesis to simultaneously randomize 10 or 20 residues for the VL library and the VH library D, respectively. The codon randomization strategy for VH library D is shown in Fig. 5. The composition of the pools in VH CDR1 and VH CDR2 was chosen to mimic as far as possible the amino acid repertoire at given residue positions in the sequence compilation of Kabat et al. (20). A different strategy was chosen for the randomization of VH CDR3, which is more diverse than the other CDRs in both length and sequence. The codon NNT (N = G or A or T or C), used for all positions in VH CDR3, was chosen as a compromise between maximizing the amino acid degeneracy (NNT encodes all amino acids except Trp, Met, Glu, Lys, and Gln) and maximizing the number of functional clones by avoiding stop codons. In addition, the diversity of this region was increased by varying the number of codons between 9, 10, 11, and 12 for hu VH libraries A, B, C, and D, respectively, in an attempt to mimic the natural length variation of VH CDR3.

Assembly of Oligonucleotides for VL and VH Libraries

The protocol for oligonucleotide assembly was modified for library construction: sets of four phosphorylated oligonucleotides (10 pmol each) were annealed to ~3 pmol of single-stranded pDH156 template (hu VL libraries) or pLG1 template (hu VH libraries) in 10 mM Tris-HCl (pH 8.0) and 1 mM MgCl2 by denaturing at 75°C for 3 min and then cooling to room temperature over ~30 min. The four oligonucleotides used to create the VH library D included restriction sites for forced directional cloning (shown by underlining and listed after the sequences):

H1, 5' ACCAGGCGGCTGGCGAACCCAAKNCATCNYGTAAKNAGAGAAAGTCGACCCAGAAACTGGACCGTATTC 3', SalI;
H2, 5' CATGGTAACTCTGCCCTTCACAGAGTCTGCGTAGTAAGTAKNAKNGCCAKNAKNGATCACTCCGATCCATTCCAGACCACG 3';
H3, 5' AGTGTCAGCAGCAGTAACAGAAGACAGACGCAGAGAGAACTGGTTTTTAGAAGTGTCAAGCAG 3';
H4, 5' ATGCTCATTGCTGGTGACCAGGGACCCCTGACCCCAANNANNANNANNANNANNANNANNANNANNANNANNANNCCGGGCACAGTAGTAAACAGC 3', BstEII;

where K = G or T, N = T or C or G or A, and Y = A or G. These oligonucleotides were synthesized using phosphoramidite chemistry (13) and purified by polyacrylamide gel electrophoresis (14). The plasmid pDH156 contains sequence encoding the huD1.3 light chain (Foote and Winter, personal communication) in a pRK5 (21)-derived vector, and the pLG1 plasmid contains the variable and first constant domains of the huD1.3 heavy chain in a pRK5-based vector. Efficient recovery of the assembled oligonucleotides is essential for library construction (but not for gene conversion mutagenesis). This was achieved by electroelution of the DNA from gel slices followed by ethanol precipitation in the presence of unrelated carrier DNA.

Construction of Antibody Libraries

The recovered assembled oligonucleotides (~0.05 pmol, based on counting the Cerenkov radiation from the band that was excised from the gel) for VH libraries A to D were used as template for PCR using the primers (20 pmol each) 5' AGTACCGCATGCGAATACGGTCCAGTTTCTGGG 3' and 5' ACTAGCAAGCCTATGCTCATTGCTGGTGACCAGGGA 3' under reaction conditions recommended by the manufacturer of the thermal cycler (Perkin Elmer Cetus). The thermal cycling protocol was as follows: 12 min at 94°C, followed by 25 cycles of 30 s at 55°C, 1 min at 68°C plus 20 s at 94°C, and then 3 min at 50°C and 12 min at 68°C. After PCR amplification the DNA was restricted using the endonucleases corresponding to the sites flanking the ends of the sequences and then ligated to corresponding fragments from phagemid vector pMY90 or pMY93 (9). The ligated DNA was precipitated and then used to transform the SR101 tonA strain of E. coli (9) using an efficient electroporation procedure (22) in an attempt to maximize the library size.
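The relationship between the two PCR primers and the ends of the assembled VH library D oligonucleotide can be checked with a few lines of code. This is a minimal sketch, assuming the primer sequences and the nondegenerate ends of H1 and H4 read as transcribed in the text; the helper names are hypothetical. It measures how much of each primer can pair with (primer 1) or is identical to (primer 2) the assembled strand, with the remainder forming a 5' tail.

```python
# Complement table for unambiguous bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap(prefix_of: str, suffix_of: str) -> int:
    """Largest k such that prefix_of[:k] equals the last k bases of suffix_of."""
    for k in range(min(len(prefix_of), len(suffix_of)), 0, -1):
        if suffix_of.endswith(prefix_of[:k]):
            return k
    return 0


# Sequences as transcribed from the Methods text (assumed free of transcription errors).
PRIMER_1 = "AGTACCGCATGCGAATACGGTCCAGTTTCTGGG"
PRIMER_2 = "ACTAGCAAGCCTATGCTCATTGCTGGTGACCAGGGA"
H1_3PRIME_END = "GTCGACCCAGAAACTGGACCGTATTC"    # nondegenerate 3' end of H1
H4_5PRIME_END = "ATGCTCATTGCTGGTGACCAGGGACCC"   # nondegenerate 5' end of H4

# Primer 1 is complementary to the 3' end of the assembled strand (end of H1).
anneal_1 = overlap(revcomp(PRIMER_1), H1_3PRIME_END)
# Primer 2 has the same sense as the assembled strand (start of H4).
anneal_2 = overlap(H4_5PRIME_END, PRIMER_2)

tail_1 = len(PRIMER_1) - anneal_1   # unpaired 5' tail of primer 1
tail_2 = len(PRIMER_2) - anneal_2   # unpaired 5' tail of primer 2
```

Under these assumptions, 21 nucleotides of the first primer pair with the 3' end of H1 and 24 nucleotides of the second primer match the 5' end of H4, leaving a 12-nucleotide 5' tail on each primer.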
RESULTS AND DISCUSSION

FIG. 3. Number of clones (N) that must be sequenced to have a 95% probability (P) of finding a perfect clone plotted as a function of the total length of the synthetic fragment (L) and the mean apparent error rate in the synthetic DNA per nucleotide (X). [Plot: N (0 to 100) versus length of DNA, L (0 to 1000), with curves for X = 0.01, 0.005, 0.0025, and 0.001.] N was calculated as

$$N \geq \frac{\log(1 - P)}{\log\left(1 - (1 - X)^{L}\right)}$$
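The expression in the legend to Fig. 3 follows from treating each nucleotide as independently error-free with probability 1 - X, so that a clone is perfect with probability (1 - X)^L. A minimal sketch of the calculation is given below; the function name is hypothetical, and the example values of L and X are taken from Table 1.

```python
import math


def clones_needed(p: float, x: float, length: int) -> int:
    """Smallest N giving probability >= p of at least one perfect clone.

    p      -- desired probability of finding a perfect clone (0 < p < 1)
    x      -- apparent error frequency per nucleotide (0 <= x < 1)
    length -- total length of the synthetic fragment in nucleotides
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    p_perfect = (1.0 - x) ** length      # chance that a single clone is error-free
    if p_perfect >= 1.0:
        return 1                         # no errors expected: one clone suffices
    if p_perfect <= 0.0:
        raise ValueError("a perfect clone is not expected at this error rate")
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - p_perfect))


# Values of L and X from Table 1: the 361-mer (X = 0.0045) and the 357-mer (X = 0.017).
n_low_error = clones_needed(0.95, 0.0045, 361)
n_high_error = clones_needed(0.95, 0.017, 357)
```

With these inputs the 361-mer requires 14 clones, whereas the 357-mer at the higher error rate requires well over a thousand, which illustrates how steeply N rises with X at these lengths.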
Gene Conversion Mutagenesis
Assembly of Oligonucleotides

A 357-mer oligonucleotide designed to convert a hu CL gene from κ to λ isotype was generated by template-directed assembly of five oligonucleotides (Fig. 2a). After electrophoresis of the assembly reaction on a sequencing gel, a ladder of five fragments was shown by autoradiography (Fig. 2b) and probably corresponds to the 32P-labeled primer, LR1, and its four possible ligation products. A fourfold molar excess of each oligonucleotide over template was used in an attempt to populate as large a fraction as possible of the template molecules with all five oligonucleotides. Thus, a minimum of 75% of the labeled LR1 oligonucleotide was anticipated to be found as the unligated 66-mer. The shorter fragments may represent molecules in which the adjacent oligonucleotide failed to anneal to the template, or the 5' end was not capable of being ligated, e.g., not phosphorylated or missing a nucleotide. Thus, quantitative phosphorylation is a prerequisite for efficient assembly of oligonucleotides. Additional shorter fragments that do not contain the labeled LR1 oligonucleotide are likely to be formed in the assembly reaction. The ligation kinetics are very rapid and the reaction reached completion within the first minute of adding ligase (not shown). This reflects that the ends to be ligated are held in close proximity by virtue of the fragments being annealed to the template.
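The 75% figure quoted above follows directly from the molar amounts given in Methods. A minimal sketch of the arithmetic, assuming that at most one copy of each oligonucleotide can anneal per template molecule (the function name is hypothetical):

```python
def min_unligated_fraction(oligo_pmol: float, template_pmol: float) -> float:
    """Lower bound on the fraction of an oligonucleotide left unannealed.

    Assumes each template molecule can hold at most one copy of the oligonucleotide.
    """
    max_annealed = min(1.0, template_pmol / oligo_pmol)
    return 1.0 - max_annealed


# Amounts from Methods: 30 pmol of each oligonucleotide, 7.5 pmol of template.
fraction = min_unligated_fraction(30.0, 7.5)
```

At a fourfold excess the bound is 0.75, matching the minimum of 75% stated in the text.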
Mutagenesis of Hu CL from κ to λ Isotype

The assembled 357-mer oligonucleotide was used to convert hu CL from κ to λ isotype by means of an efficient mutagenesis protocol involving restriction selection (12) and repair-deficient (mutL) host strains (2). The DNA sequences of all four clones analyzed were within a few nucleotides of the target sequence, with the closest differing by three nucleotide replacements. A perfect clone was obtained by one additional round of mutagenesis using short oligonucleotides and by subcloning from nearly perfect clones.
Applications and Limitations of Gene Conversion Mutagenesis

Gene conversion mutagenesis provides a convenient strategy for rapidly generating genes encoding unnatural proteins such as humanized versions of antibody VH and VL domains from other V domain sequences. It also has utility in generating genes
FIG. 4. Schematic representation of the template-directed assembly of four oligonucleotides (H1 to H4) to create a 300-mer for generation of hu VH library D. [Scheme: H1 to H4 are annealed across CDR 1, CDR 2, and CDR 3 of the template and ligated; the products H4, H4+H3, H4+H3+H2, and H4+H3+H2+H1 are resolved on a sequencing gel and autoradiographed, and the long oligonucleotide is isolated.]
TABLE 2
Summary of Antibody Library Construction

                      Synthetic DNA(b)                     No. of clones
huD1.3 library(a)     No. of fragments   Total length      Sequenced   Functional(c)   Frequency of spurious missense mutations(d)
VL                    4                  279               20          13              0.006
VH A                  4                  297               23          13              0.004
VH B                  4                  300               24          15              0.002
VH C                  4                  303               21          12              0.004
VH D                  4                  306               23          16              0.003

(a) Mutations were tabulated within the region of the restriction sites that were to be used for cloning purposes (see Methods). VH libraries are identical except for the length of CDR3: libraries A, B, C, and D have 9, 10, 11, and 12 NNT codons in CDR3, respectively.
(b) Oligonucleotides were synthesized using phosphoramidite chemistry (13).
(c) The number of functional clones was calculated as the total number sequenced minus those that contained either a stop codon or frameshift mutation (insertion or deletion) within the coding sequence.
(d) The frequency of spurious missense mutations is calculated by dividing the number of mutations resulting in nucleotide changes outside of the codons targeted for random mutagenesis by the total number of nucleotides.
encoding naturally occurring proteins for which a gene encoding a homologous protein is available. PCR-based strategies (see Ref. (23)) such as "sticky-feet" mutagenesis (24) may provide useful alternatives to gene conversion mutagenesis in cases in which the target gene is available and the aim is to install it into a new vector. Our gene conversion mutagenesis strategy requires less than half the amount of synthetic DNA needed for total gene synthesis and does not require convenient restriction sites in the target DNA. Furthermore, it has been possible to simultaneously mutate two large discontiguous regions successfully in one step using a pair
TABLE 3
Comparison of Observed with Expected Nucleotide Frequencies for Hu VL Library

              Frequency
Nucleotide    Expected(a)   Observed   Observed/expected
G             122           101        0.83
A             95            141        1.5
T             81.7          53         0.65
C             81.7          81         1.0
Total         380           376(b)

(a) The expected nucleotide frequencies were calculated on the basis of the number of codons targeted for mutagenesis and their degeneracy together with the total number of clones sequenced.
(b) A total of four deletions were also found.
of preassembled fragments (3). A quantity of assembled oligonucleotide sufficient for subsequent site-directed mutagenesis experiments was recovered at the first attempt in all cases to date (Table 1). It therefore seems likely that this strategy would be successful for the interconversion of even more divergent genes that may be only very weakly homologous at the amino acid level. Our procedure appears to be simpler and more reliable than a similar method described by Rostapshov et al. (25), and more efficient than mutagenesis using multiple mismatched oligonucleotides (26). The main limitation in the use of very long oligonucleotides for gene conversion mutagenesis is the stringent demand that it places on the quality of the synthetic DNA. All gene conversion mutants sequenced (Table 1) were either correct or differed from the target sequence by a small number of apparently random point errors, which probably reflect minor imperfections in the synthetic DNA. Additional factors associated with mutagenesis using mismatched oligonucleotides do not appear to contribute significantly to the observed error rates since these were similar to error rates observed in sequencing two synthetic genes constructed (P.C., unpublished data). Of the observed errors, 91% were point differences with respect to the target sequence and 9% were point deletions with no unplanned insertions found. The average apparent error frequency per nucleotide (X) was calculated as the total number of errors (replacements plus deletions) divided by the total number of nucleotides sequenced corresponding to synthetic
DNA. In half of the cases of gene conversion mutagenesis (Table 1), the apparent error rate was low (X ≤ 0.0045) and it was possible to find a perfect clone in sequencing a small number (4 to 8) of clones. For example, in the simultaneous humanization of the murine antibody 4D5 VL and VH genes using a 311-mer and 361-mer oligonucleotide, respectively, one of eight clones was correct and another contained a single silent mutation that did not corrupt the corresponding amino acid sequence (3). In the three cases in which the apparent error rate was high (X ≥ 0.016), no perfect clones were found in sequencing a small number (4 to 6) of clones. In these cases the target sequence was obtained by subcloning fragments from nearly perfect clones or by an additional round of mutagenesis, or a combination of both of these strategies. Sequencing of additional clones in the hope of finding a perfect one was judged to be futile on the basis of the large number of clones that one would need to sequence in order to have a 95% probability of finding a perfect clone (Fig. 3). These data underscore the importance of low error rates in the synthetic DNA to the success of gene conversion mutagenesis. Nevertheless, the relatively low error rates reported by others (reviewed in Ref. (2): 0.0005 ~

FIG. 5. Sequence analysis of 15 functional isolates from the hu VH library D. The top row (labeled bank) indicates the VH library D generated and is numbered above the sequence according to Kabat et al. (20). Numbers embedded within the bank sequence indicate the degenerate nucleotide pool used at that codon: 1 = NMT, 2 = RNG, 3 = NNT, where N = T or C or G or A, M = A or C, and R = A or G and are translated beneath the sequence (pool 1: Y H D S P A T N; pool 2: V A G M T R K E; pool 3: F L I V S P T A Y H N D C R G). Dashes denote identity between individual sequences and the bank sequence. The sequence for huD1.3 VH (4), which was used as the template to assemble the oligonucleotides, is also shown. Residues within the CDRs that are different between the bank and huD1.3 were changed from the huD1.3 sequence to conform to a consensus of hu VH group III (20). The sequence of huD1.3 VH CDR3 is four residues shorter (denoted by *) than the bank sequence. Boxed residues indicate amino acids that were not expected from the codon pool (i.e., those arising from spurious mutagenesis).
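The amino acid repertoire encoded by each degenerate codon pool named in the legend to Fig. 5 can be reproduced by expanding the pool and translating each codon. The following is a minimal sketch using the standard genetic code and the IUPAC-style symbols N, M, and R as defined in the legend; the function and variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order at each position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate symbols as defined in the legend to Fig. 5.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "M": "AC", "R": "AG",
}


def pool_amino_acids(pool: str) -> set:
    """Set of amino acids ('*' = stop) encoded by a degenerate codon."""
    choices = [DEGENERATE[symbol] for symbol in pool]
    return {CODON_TABLE["".join(codon)] for codon in product(*choices)}


pool_1 = pool_amino_acids("NMT")   # used in VH CDR1 and CDR2
pool_2 = pool_amino_acids("RNG")   # used in VH CDR1 and CDR2
pool_3 = pool_amino_acids("NNT")   # used throughout VH CDR3
missing_from_nnt = set("ACDEFGHIKLMNPQRSTVWY") - pool_3
```

The result agrees with the translations in the legend: NMT and RNG each encode 8 amino acids, and NNT encodes 15 amino acids with no stop codon, lacking only Trp, Met, Glu, Lys, and Gln.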
Construction of VL and VH Libraries

Assembly of Oligomer to Direct Synthesis of Antibody Libraries

Degenerate oligonucleotides for library construction ranging in size from 279 to 306 nucleotides were produced by template-directed assembly of four contiguous oligonucleotides (Fig. 4). The efficiency of assembly of these oligonucleotides was qualitatively similar to that for the λ CL oligonucleotide shown in Fig. 2b. The yield of assembled oligonucleotide (~0.1 pmol, i.e., ~6 × 10^10 molecules in each case) is not expected to limit the library size since the maximum diversity readily attainable by current transformation technology is 10^8 to 10^9 individual clones.
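The conversion from picomoles to molecules quoted above, and the comparison with attainable library sizes, can be reproduced as follows. This is a minimal sketch: the Avogadro constant is the standard value, the 12-codon CDR3 corresponds to VH library D, and the comparison of sequence space with transformation capacity is an illustrative calculation rather than an analysis reported in this study.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_from_pmol(pmol: float) -> float:
    """Convert an amount in picomoles to a number of molecules."""
    return pmol * 1e-12 * AVOGADRO


yield_molecules = molecules_from_pmol(0.1)      # assembled oligonucleotide yield

cdr3_codons = 12                                # NNT codons in VH library D CDR3
dna_variants = 16 ** cdr3_codons                # 4 x 4 x 1 nucleotide choices per NNT codon
protein_variants = 15 ** cdr3_codons            # NNT encodes 15 amino acids
max_transformants = 1e9                         # upper end of the range quoted in the text

yield_limits_library = yield_molecules < max_transformants
cdr3_fully_sampled = protein_variants <= max_transformants
```

Under these assumptions the oligonucleotide yield exceeds the number of attainable transformants by more than an order of magnitude, while the possible CDR3 sequences alone far outnumber the clones that can be recovered, so transformation efficiency rather than oligonucleotide yield bounds the library.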
Analysis of Libraries Produced from Oligonucleotides

At least 20 clones from each library were sequenced and analyzed for amino acid sequence diversity and frameshift mutations. All isolates sequenced were unique, with the exception of two identical sequences in the VH library A which probably reflect sibling clones. The percentage of sequences containing frameshift mutations varied from 31 to 43%, as shown by the number of functional clones in Table 2. The frequency of frameshift errors may reflect the limitations of gel purification of full-length (N) degenerate oligonucleotides, which have a range of electrophoretic mobilities, away from shorter failure sequences (N - 1), which may have an overlapping range of electrophoretic mobilities. In contrast to the libraries, deletions were rarely found during the gene conversion mutagenesis experiments (above). This probably reflects the fact that full-length oligonucleotides of unique sequence are readily resolved from N - 1 failure sequences by polyacrylamide gel electrophoresis. The frequency of spurious point mutations found for the libraries (0.002 ≤ X ≤ 0.006) was similar to that seen for the gene conversion mutagenesis experiments (Table 1). In the case of the libraries, the observed errors may include contributions from minor imperfections of the synthetic DNA as well as from the PCR. Data comparing the observed number of nucleotides versus those expected for each position were compiled for the VL library (Table 3). The observed frequencies of G and C nucleotides were very close to those expected, whereas for A and T nucleotides the observed frequencies were somewhat lower and higher than those expected, respectively. Very similar results were obtained for the four VH libraries. The deduced amino acid sequences of the open reading frames from sequences obtained from the VH library D are shown in Fig. 5. Throughout CDR1 and CDR2, at least five of the eight possible residues that could result from pool 1 were found. However, some residues, e.g., His, Asp, and Asn, appear to be overrepresented. This reflects the small nucleotide bias toward A at the expense of C in the second position of the codon pool, and further emphasizes the influence of the quality of the oligonucleotides in the final outcome of library sequences. The degree of randomization of VH CDR3 was also biased toward residues that contained adenosine. For example, Asn, Asp, and His are all overrepresented, while Ile, Val, and Phe are all underrepresented.
Additional diversity was obtained from a low frequency of spurious missense mutations, including 2 in the CDRs and an additional 11 in the intervening framework sequences. These spurious missense mutations were considered a useful mechanism for generating additional sequence diversity in the V domain libraries, analogous to the natural mechanism of somatic hypermutation of antibody V genes.
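The observed/expected column of Table 3 can be recomputed directly from the tabulated counts. A minimal sketch, using only the numbers given in Table 3 (the variable names are hypothetical):

```python
# Expected and observed nucleotide counts for the hu VL library (Table 3).
expected = {"G": 122.0, "A": 95.0, "T": 81.7, "C": 81.7}
observed = {"G": 101, "A": 141, "T": 53, "C": 81}

# Ratio of observed to expected counts for each nucleotide.
ratios = {base: observed[base] / expected[base] for base in expected}

total_expected = sum(expected.values())
total_observed = sum(observed.values())
```

The ratios reproduce the tabulated values after rounding (0.83, 1.5, 0.65, and 1.0 for G, A, T, and C), and the totals agree with the 380 expected and 376 observed nucleotides.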
Applications and Limitations of Libraries Constructed with Degenerate Preassembled Oligonucleotides

Construction of antibody libraries using degenerate preassembled oligonucleotides allows for multiple discontinuous regions to be randomized rapidly and simultaneously. Highly diverse libraries containing very large numbers (>10^8) of potentially functional clones have been generated. This is in spite of frameshift mutations, which reduce the number of in-frame proteins by about 31 to 43%, and small nucleotide biases in the degenerate codons. A low frequency of spurious mutations was also encountered, which may be beneficial rather than deleterious by virtue of increasing library diversity. This technique for generating diverse libraries is likely to be useful in modifying the antigen specificity or improving the binding affinity of antibodies. In addition, this method should be broadly useful in the construction of libraries of mutants of any protein for which simultaneous replacement of residues distant in the linear sequence is desired. This method has the advantage over conventional cassette mutagenesis (27) in that the residues to be mutated can be much further apart in the linear sequence.
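The effect of frameshift mutations on usable library size is simple arithmetic, sketched here in Python. The 31% and 43% losses are the bounds quoted above; the total number of clones is a hypothetical figure chosen for illustration and is not a value reported for these libraries.

```python
def in_frame_clones(total_clones, frameshift_fraction):
    """Number of clones expected to remain in frame after frameshift losses."""
    return total_clones * (1.0 - frameshift_fraction)


TOTAL_CLONES = 5e8  # hypothetical number of independent clones

for loss in (0.31, 0.43):  # reported range of in-frame reduction
    remaining = in_frame_clones(TOTAL_CLONES, loss)
    print(f"{loss:.0%} frameshift loss: {remaining:.2e} in-frame clones")
```

Even at the upper bound of the loss, more than half of the clones remain in frame, so the order of magnitude of the library size is preserved.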
ACKNOWLEDGMENTS
We thank Greg Winter for generously providing the sequence of VL of humanized D1.3. We also thank our colleagues at Genentech, Mark Vasser, Parkash Jhurani, Peter Ng, and Leonie Meima, for synthesis and purification of oligonucleotides; Polly Moore for the probability calculation behind Fig. 3; Wayne Anstine for preparing Figs. 2 and 3; Mark Zoller for critical review of the manuscript and helpful discussions; and Tony Kossiakoff for continued support.
REFERENCES
1. Carter, P. (1991) in Mutagenesis: A Practical Approach (McPherson, M. J., Ed.), Chap. 1, pp. 1-25, IRL Press, Oxford, UK.
2. Engels, J. W., and Uhlmann, E. (1989) Angew. Chem. Int. Ed. Engl. 28, 716-733.
3. Carter, P., Presta, L., Gorman, C. M., Ridgway, J. B. B., Henner, D., Wong, W. L. T., Rowland, A. M., Kotts, C., Carver, M. E., and Shepard, H. M. (1992) Proc. Natl. Acad. Sci. USA 89, 4285-4289.
4. Verhoeyen, M., Milstein, C., and Winter, G. (1988) Science 239, 1534-1536.
5. McCafferty, J., Griffiths, A. D., Winter, G., and Chiswell, D. J. (1990) Nature 348, 552-554.
6. Kang, A. S., Barbas, C. F., Janda, K. D., Benkovic, S. J., and Lerner, R. A. (1991) Proc. Natl. Acad. Sci. USA 88, 4363-4366.
7. Clackson, T., Hoogenboom, H. R., Griffiths, A. D., and Winter, G. (1991) Nature 352, 624-628.
8. Barbas, C. F., Kang, A. S., Lerner, R. A., and Benkovic, S. J. (1991) Proc. Natl. Acad. Sci. USA 88, 7978-7982.
9. Garrard, L. J., Yang, M., O'Connell, M. P., Kelley, R. F., and Henner, D. J. (1991) Bio/Technology 9, 1373-1377.
10. Marks, J. D., Hoogenboom, H. R., Bonnert, T. P., McCafferty, J., Griffiths, A. D., and Winter, G. (1991) J. Mol. Biol. 222, 581-597.
11. Duchosal, M. A., Eming, S. A., Fischer, P., Leturcq, D., Barbas, C. F., III, McConahey, P. J., Caothien, R. H., Thornton, G. B., Dixon, F. J., and Burton, D. R. (1992) Nature 355, 258-262.
12. Wells, J. A., Cunningham, B. C., Graycar, T. P., and Estell, D. A. (1986) Phil. Trans. R. Soc. Lond. A 317, 415-423.
13. Caruthers, M. H., Barone, A. D., Beaucage, S. L., Dodds, D. R., Fisher, E. F., McBride, L. J., Matteucci, M. D., Stabinsky, Z., and Tang, J.-Y. (1987) in Methods in Enzymology (Wu, R., and Grossman, L., Eds.), Vol. 154, pp. 287-313, Academic Press, San Diego.
14. Boyle, A. (1990) in Current Protocols in Molecular Biology (Ausubel, F. A., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A., and Struhl, K., Eds.), Chap. 2, Greene Publishing/Wiley-Interscience, New York.
15. Shalaby, M. R., Shepard, H. M., Presta, L., Rodrigues, M., Beverley, P. C. L., Feldmann, M., and Carter, P. (1992) J. Exp. Med. 175, 217-225.
16. Vieira, J., and Messing, J. (1987) in Methods in Enzymology (Wu, R., and Grossman, L., Eds.), Vol. 153, pp. 3-11, Academic Press, San Diego.
17. Kramer, B., Kramer, W., and Fritz, H.-J. (1984) Cell 38, 879-887.
18. Carter, P., Bedouelle, H., and Winter, G. (1985) Nucleic Acids Res. 13, 4431-4443.
19. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
20. Kabat, E. A., Wu, T. T., Reid-Miller, M., Perry, H. M., and Gottesmann, K. S. (1987) Sequences of Proteins of Immunological Interest, National Institutes of Health, Bethesda, MD.
21. Gorman, C. M., Gies, D., and McCray, G. (1990) DNA Protein Eng. Techn. 2, 3-10.
22. Zabarovsky, E. R., and Winberg, G. (1990) Nucleic Acids Res. 18, 5912.
23. Horton, R. M., and Pease, L. R. (1991) in Mutagenesis: A Practical Approach (McPherson, M. J., Ed.), Chap. 11, pp. 217-247, IRL Press, Oxford, UK.
24. Clackson, T., and Winter, G. (1989) Nucleic Acids Res. 17, 10163-10170.
25. Rostapshov, V. M., Chernov, I. P., Azhikina, T. L., Borodin, A. M., and Sverdlov, E. D. (1989) FEBS Lett. 249, 379-382.
26. Perlak, F. J. (1990) Nucleic Acids Res. 18, 7457-7458.
27. Wells, J. A., Vasser, M., and Powers, D. B. (1985) Gene 34, 315-323.
28. Palm, W., and Hilschmann, N. (1975) Z. Physiol. Chem. 356, 167-191.
29. Langer, B., Steinmetz-Kayne, M., and Hilschmann, N. (1968) Z. Physiol. Chem. 349, 945-951.
30. Fendly, B. M., Winget, M., Hudziak, R. M., Lipari, M. T., Napier, M. A., and Ullrich, A. (1990) Cancer Res. 50, 1550-1558.
31. Beverley, P. C. L., and Callard, R. E. (1982) Eur. J. Immunol. 11, 329-334.
32. Hildreth, J. E. K., and August, J. T. (1985) J. Immunol. 134, 3272-3280.
33. Froehler, B. C., Ng, P. G., and Matteucci, M. D. (1986) Nucleic Acids Res. 14, 5399-5407.