Molecular Cell, Vol. 16, 505–508, November 19, 2004, Copyright 2004 by Cell Press
B Cell Development Leads Off with a Base Hit: dU:dG Mismatches in Class Switching and Hypermutation

Gregory S. Lee, Vicky L. Brandt, and David B. Roth*
Skirball Institute of Biomolecular Medicine and Department of Pathology
New York University School of Medicine
New York, New York 10016
The mechanisms underlying somatic hypermutation (SHM) and class switch recombination (CSR) have been the subject of much debate. Recent studies from the Neuberger and Honjo labs have lent insight into these distinct processes, and we discuss a new, comprehensive model for how AID, uracil DNA glycosylase (UNG), and the mismatch repair system function in both SHM and CSR.

To prepare for the myriad antigens they might encounter over an individual's lifetime, B cells employ distinct genetic processes to create a diverse repertoire of high-affinity antibodies. In the fetal liver and bone marrow of both humans and mice, V(D)J recombination rearranges gene segments arrayed along the immunoglobulin (Ig) loci to generate a primary repertoire of low-affinity IgM molecules. Once a mature naive B cell encounters an antigen that can bind its IgM surface receptors, it activates two further diversification processes, somatic hypermutation (SHM) and class switch recombination (CSR). SHM introduces mutations into the variable region, which encodes the antigen binding portion of the receptor. Repeated rounds of mutation and selection yield antibodies that bind more tightly to the activating antigen. CSR, by contrast, involves the constant region of the receptor and does not entail mutation: it augments the antibody arsenal by replacing the Cμ constant region of the IgM antibody with other constant region exons (Cα, Cγ, etc.), thereby allowing the same V region (i.e., the same receptor specificity) to be deployed with different effector functions. CSR bears a gross resemblance to V(D)J recombination insofar as two switch regions adjoining different constant region exons are brought together and the intervening DNA is looped out, but the mechanisms of these two reactions appear to differ substantially (Posey et al., 2004).
The mechanisms underlying CSR and SHM were largely impenetrable until Honjo and coworkers performed subtractive hybridization experiments in 1999 in a B cell line activated to undergo class switching in vitro. Their identification of an mRNA specifically upregulated in activated B cells that encodes the activation-induced cytidine deaminase (AID) was a breakthrough for those studying CSR (reviewed in Honjo et al., 2004). The following year, it was further demonstrated that mutational inactivation of AID in knockout mice and human patients with hyper IgM syndrome type 2 prevents CSR and severely impairs SHM. Subsequent studies showed that ectopic expression of AID allows multiple cell types to perform both SHM and CSR on artificial substrates, proving the centrality of AID to each process (reviewed in Honjo et al., 2004).

How could AID be involved in such divergent activities as targeting variable regions for hypermutation, on the one hand, and initiating the joining of two switch regions on the other? The answer has been far from obvious; in fact, the mechanistic role of AID has been the subject of considerable debate. In this minireview we discuss recent work from the Honjo and Neuberger groups that refines our picture of both SHM and CSR.

*Correspondence: [email protected]
Minireview

The First Pitch: An RNA Editing Model

AID was initially thought to edit messenger RNA in part because it shares significant sequence homology with APOBEC-1 (apolipoprotein B mRNA editing catalytic polypeptide 1), an RNA editing enzyme that converts a specific cytosine in the RNA encoding apolipoprotein B to uracil to create a stop codon and produce a shorter version of the protein. It was reasonable to propose that this familial resemblance might be borne out in function as well, with AID deaminating cytosine to uracil in RNAs encoding as-yet unidentified endonucleases (or other factors) critical to SHM and CSR (see discussion and Figure 2 in Honjo et al., 2004). Since AID can stimulate switching and hypermutation in many cell types, this model further proposes that both the target mRNAs and cofactor proteins that recognize their structure are ubiquitously expressed.

The RNA editing model is consistent with the recent discovery that AID, like APOBEC-1, contains both an N-terminal nuclear localization signal and a C-terminal nuclear exclusion signal (Ito et al., 2004). Also compatible with RNA editing is the finding that de novo protein synthesis is required for class switching (Doi et al., 2003), specifically at the DNA cleavage step (Begum et al., 2004b). No targets of the putative RNA editing activity have yet been identified, however, and the paucity of direct evidence for this model left room for an intriguing alternative: perhaps AID acts not on RNA but on DNA.
Base Hit: The DNA Deamination Model

Deamination of cytosine residues in DNA would produce uracil, just as it does in RNA, but with one key difference: uracils are normal constituents of RNA, whereas (deoxy)uracils found in DNA are read as mistakes that must be repaired. Importantly, the repair can be mutagenic, a handy feature if one is seeking to diversify a genome. When DNA polymerases encounter uracil in DNA, they read it as thymine; the deamination of cytosine to uracil in DNA will therefore lead to C to T transition mutations after replication (Figure 1A, left). A major pathway used to repair the uracil lesion begins with the removal of the uracil base by enzymes such as uracil DNA glycosylase (UNG), creating an abasic site that, when used as a template, can generate either transition or transversion mutations (Figure 1A, middle).

Several considerations made a DNA deamination model attractive (Neuberger et al., 2003), not the least of which is that it makes several clear, testable predictions. The most obvious one, that AID should be able to trigger deamination of cytosine to uracil in DNA (Figure 1), was tested by Neuberger and colleagues in bacteria (Petersen-Mahrt et al., 2002). AID showed a mutator phenotype in E. coli, producing transitions with a dC/dG bias in the AID-transformed cells. As expected, this mutator phenotype was intensified by UNG deficiency. These results provided the first experimental support for the DNA deamination model.

Figure 1. The Deamination Model
(A) The DNA deamination model neatly explains how the different types of mutations occur in SHM. If AID deaminates cytosine (C) to uracil (U) in DNA, subsequent replication would produce C to T transitions (left panel), whereas removal of the uracil base by UNG could lead to either transition or transversion mutations (middle panel). Phase 2 SHM generates mutations at nearby dA:dT pairs through patch repair processes (right panel).
(B) CSR in this model begins much the same way, with AID creating a uracil in DNA. The uracil is recognized by UNG, which creates an abasic site that is then converted into a nick by an AP endonuclease. The precise DNA intermediate at this stage remains unknown, but it is conceivable that nicks could initiate CSR directly or, if closely spaced on opposite strands, they could create a double-strand break in the switch region.

Of course, what AID is capable of doing in bacteria may not be relevant to its activities in B cells. The Neuberger group therefore set out to test a corollary of the model, that UNG deficiency should alter the pattern of mutations, in a chicken B cell line that favors transversions (Figure 2) (Di Noia and Neuberger, 2002). Upon inhibiting UNG activity with the uracil-DNA glycosylase inhibitor Ugi, Di Noia and Neuberger found that the predominant mutation did indeed shift from transversions to transitions at G/C pairs. Furthermore, if AID acts on DNA, it follows that its deaminase activity should be targeted to the same sequences that form SHM hotspots in vivo. This prediction has been confirmed biochemically (Chaudhuri et al., 2003; Pham et al., 2003).

Another important piece of the puzzle came together in the past year through several studies that provided a simple and compelling reason for transcription to be required for both CSR and SHM. Chaudhuri et al. (2003) demonstrated that AID prefers to deaminate single-stranded DNA in vitro and can be targeted by transcription. Transcription allows the formation of "R loops" in which the transcript hybridizes to the template strand, forming a single-stranded region on the nontemplate strand. These loops have been implicated in class switch recombination and are thought to make single-stranded
DNA on the nontemplate strand available for AID. R loops are not thought to be formed in V regions during SHM, but recently published work from the Alt lab has revealed that replication protein A (RPA), a single-strand DNA binding protein, binds to AID specifically in activated B cells (Chaudhuri et al., 2004). RPA appears to stabilize short regions of single-stranded DNA formed as a result of transcription and directs AID to these regions; after deamination of the target cytosines, AID is released from the AID-RPA complex. Finally, AID has been found to associate with switch regions, pointing to a direct involvement with DNA (Chaudhuri et al., 2004; Nambu et al., 2003). It is difficult to account for these observations without invoking DNA deamination. But we have yet to address a central question: how might U residues in DNA initiate processes as diverse as SHM and CSR?

dU:dG Mismatches and Somatic Hypermutation

dU:dG mispairs have several possible fates that could contribute to hypermutation (see Figure 1A). SHM has, in fact, been operationally separated into several phases whose mechanistic significance has recently been validated. Replication without repair creates signature dC to dT transition mutations (phase 1a). Removal of the uracil base produces an abasic site that can be replaced with any nucleotide upon replication, giving rise to both transition and transversion mutations (phase 1b). Phase 2 hypermutation describes the generation of mutations at adjacent positions, predominantly at nearby dA:dT pairs, likely through an error-prone patch repair process (reviewed in Neuberger et al., 2003).

Figure 2. Dual Recognition Model for CSR and SHM
Recent work from Rada et al. (2004) offers an elegant explanation for the interplay between AID, UNG, and mismatch repair. The key to this "dual recognition" model is that the dU:dG mismatch provides two distinct signals that alert two different cellular repair machineries to its presence. On the one hand, the uracil can be recognized by UNG; on the other, the mismatch itself can be recognized by the mismatch repair machinery. This dual recognition allows UNG and MSH2 to compensate for one another in MSH2-deficient and UNG-deficient organisms, respectively.

UNG deficiency in mice, cultured B cells, or hyper IgM syndrome type 4 patients strongly skews the mutation spectrum toward transitions at dC:dG pairs, with a significant reduction in transversions (Imai et al., 2003; Rada et al., 2002). UNG inhibition produces similar results in cultured cells (Di Noia and Neuberger, 2002). Clearly, UNG plays a major role in generating transversion mutations. The number of phase 2 mutations, however, is not greatly affected in UNG-deficient mice (Rada et al., 2002). This observation raises an important question: what might be compensating for lost UNG activity? A number of other glycosylases (SMUG1, TDG, MBD4, NEIL1) are capable of acting on dU residues incorporated into DNA, at least in the test tube (Rada et al., 2004). Could one or more of them substitute for UNG? Or could the backup pathway be provided by mismatch repair; i.e., could the presence of the mismatch itself signal SHM by directing a mismatch repair nuclease to the site of the mispair?

An early clue came from the fact that phase 2 mutations, though largely unaffected in UNG-deficient mice, are dramatically reduced in mice deficient in the mismatch repair factor MSH2 (Rada et al., 1998). But to address the question of backup pathways directly and investigate the nature of the signal that initiates phase 2 SHM, Neuberger and colleagues examined SHM in mice doubly deficient for UNG and MSH2 (Rada et al., 2004). The result was compelling: these animals are virtually devoid of phase 1b and phase 2 mutations. (They continue to generate phase 1a mutations, C to T transitions, as is consistent with replication across dU:dG mismatches.)
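The distinction between phase 1a and phase 1b outcomes at a deaminated dC can be made concrete with a short Python sketch. It is an illustration only and is not taken from any of the cited studies; the function names `classify_substitution` and `idealized_outcomes_at_dc` are hypothetical. It also assumes a deliberately idealized cell in which, without UNG, a dU is only ever replicated over. That corresponds most closely to the doubly deficient animals, since UNG-deficient mice with intact MSH2 retain some transversions and most phase 2 mutations.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # A change within one chemical class (purine to purine, or
    # pyrimidine to pyrimidine) is a transition; otherwise a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def idealized_outcomes_at_dc(ung_active: bool) -> dict[str, str]:
    """Substitutions expected at a deaminated dC (idealized assumption).

    Phase 1a: replication across dU reads it as dT, giving C>T.
    Phase 1b: UNG excision leaves an abasic site that may be filled
    with any nucleotide, so C>A and C>G also become possible.
    """
    new_bases = {"T"}
    if ung_active:
        new_bases |= {"A", "G"}
    return {f"C>{b}": classify_substitution("C", b) for b in sorted(new_bases)}


print(idealized_outcomes_at_dc(ung_active=True))
# {'C>A': 'transversion', 'C>G': 'transversion', 'C>T': 'transition'}
print(idealized_outcomes_at_dc(ung_active=False))
# {'C>T': 'transition'}
```

In this simplified picture, losing UNG removes the only route to transversions at dC:dG pairs, which is the direction of the shift reported when UNG is deficient or inhibited. Phase 2 mutations at dA:dT pairs are intentionally left out of the sketch.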
Phase 1b thus appears to arise from UNG-mediated recognition of dU residues generated by DNA deamination at C:G pairs, whereas phase 2 targets other locations (predominantly A:T pairs) via error-prone mismatch repair signaled by the dU:dG mismatch (Figure 2). If the significant residual SHM observed in UNG-deficient mice takes place through MSH2-dependent mismatch recognition rather than the removal of the dU residues by other glycosylases, we are left with another question. Why is UNG evidently the only glycosylase that can act in SHM (and CSR, as discussed below)? Two explanations have been advanced. First, access to AID-generated dU lesions may be restricted, requiring specific recruitment mechanisms (Rada et al., 2004). An alternative and more interesting possibility is that UNG might perform some critical function in addition to its glycosylase activity, as discussed below.

dU:dG Mismatches and Class Switch Recombination

How might dU:dG mismatches stimulate CSR? In one scenario, UNG removes the uracil base, generating an abasic site that is then recognized by an AP endonuclease, which in turn produces a nick. Nicks could conceivably initiate CSR directly (recent evidence suggests that nicks may initiate recombination events in mammalian cells; Lee et al., 2004), or closely spaced nicks on opposite strands could create a double-strand break in the switch region (Figure 1B). In support of the notion that abasic sites might initiate CSR, UNG-deficient B cells display switching defects in culture (Rada et al., 2002). Similar defects are observed in vivo in UNG-deficient mice: serum IgG levels are reduced by about 50% (Rada et al., 2002). Furthermore, hyper IgM syndrome type 4 patients bearing UNG mutations suffer profound defects in CSR as well as partial disruption of SHM (Imai et al., 2003). The evidently greater phenotypic severity in humans could reflect ascertainment bias, i.e., there might be a number of UNG-deficient individuals who do not display symptoms.
Or perhaps mice possess more robust backup pathways for generating the DNA breaks that initiate CSR, such as alternative glycosylases to act on the uracil or alternative repair pathways that recognize the dU:dG mispair. Once again, evidence from the mice doubly defective for UNG and MSH2 is persuasive: they are profoundly deficient for CSR (Rada et al., 2004). Together, then, these two mutations closely phenocopy the AID knockout, with the interesting exception of phase 1a mutations, which are readily explained by AID-mediated deamination of cytosine residues in hotspot motifs followed by replication.

Thus far, the data allow us to construct a coherent and remarkably simple "dual recognition" model for the initiation of SHM and CSR (Figure 2). Transcription makes single-stranded DNA available to AID, which deaminates cytosines at hotspot motifs, creating dU:dG mismatches. These lesions provide two discrete signals for further processing. First, the presence of dU in DNA can be recognized by UNG, which creates abasic sites and can lead to error-prone patch repair in the absence of MSH2 (and may have other roles, as described below). Second, the presence of the mismatch can recruit an MSH2-dependent mismatch repair process to induce error-prone patch repair, in the case of SHM, or to induce nicking, in the case of CSR (and perhaps SHM). The limited defects observed in mice singly deficient for either UNG or MSH2 indicate that these two processes can substitute for one another to some degree. The relative contributions of these two pathways could differ from one species to the next; the precise details of B cell activation might also influence the choice of pathway and explain some of the discrepancies between animal models and cultured cells.

Switch Hitter: A New Role for UNG?

Lest anyone think that all the mysteries of CSR and SHM have been solved, recent work from the Honjo laboratory (Begum et al., 2004a) raises new questions. Begum et al. used Ugi to inhibit UNG activity in cultured mouse B cells undergoing CSR. As expected, class switching was dramatically reduced. Not so expected was evidence suggesting that DNA breakage continues unabated in the face of UNG inhibition. The authors assayed accumulation of the phosphorylated form of the histone variant H2AX (γ-H2AX) at the IgH locus as an indirect measure of double-strand breaks and observed no reduction in γ-H2AX after Ugi expression (Begum et al., 2004a).

Two explanations for this result come to mind. One is technical: even if UNG inhibition substantially diminishes the number of DNA breaks created, a reduction in CSR-induced γ-H2AX levels might not be readily apparent, because a single double-strand break can elicit accumulation of many γ-H2AX molecules distributed over megabases of adjacent chromatin (reviewed in Bassing and Alt, 2004). Indeed, as Begum et al. note, direct measurement (by LM-PCR) of CSR-associated double-strand breaks in human cells bearing UNG mutations reveals decreased breakage in the Ig S region (Imai et al., 2003). The other explanation, as put forth by Begum et al., is that DNA cleavage might occur through an UNG-independent pathway. This is eminently reasonable in light of the more recent data from Rada et al., which indicate that the mismatch repair system is responsible for inducing UNG-independent switch recombination (Rada et al., 2004).
An interesting future experiment would be to observe accumulation of γ-H2AX (or better yet, examine DNA breaks directly) at the IgH locus in UNG/MSH2 doubly deficient mice.

Not having the new work of Rada et al. to contemplate, however, Begum et al. came up with a radically different idea. They proposed that UNG is involved in the repair step of CSR rather than its initiation. This raises an interesting question. Could UNG serve additional, nonenzymatic roles in CSR? Perhaps yes. Certain UNG mutants that lack uracil glycosylase activity but retain DNA binding ability restored CSR in cultured UNG-deficient mouse cells (Begum et al., 2004a). One possible explanation for this rescue is that these mutants retain enough glycosylase activity in vivo to support CSR. But a more intriguing possibility is that UNG performs additional functions in CSR. This would provide an alternative explanation for the observation that other glycosylases, while abundant in lymphocytes and capable of cleaving uracil from DNA in the test tube, do not provide significant backup activity in UNG/MSH2 doubly deficient mice. Begum et al. (2004a) posit that the role of UNG (and repair proteins) could be to recruit error-prone polymerases during CSR and SHM and that UNG deficiency perturbs the mutation pattern in B cells because different error-prone polymerases are being recruited. It is conceivable, however, that binding of dU by catalytically dead UNG mutants poses a block to replication sufficiently severe to force resolution by the MSH2-dependent CSR pathway, e.g., by causing flipping-out of the uracil base without catalytic excision (J. Di Noia and M. Neuberger, personal communication). Put another way, in an UNG-deficient cell replication over the uracil would be the default pathway, whereas the presence of a catalytically dead UNG might force the mispair to be shunted to the backup MSH2 pathway. Further work should be done to test this hypothesis; it would be interesting, for example, to determine whether the UNG separation-of-function mutants can restore CSR in ung−/−msh2−/− B cells.

Do these new data undermine the DNA deamination model or support it? In our view, the Begum study is fully consistent with the notion that AID deaminates DNA. Why would UNG deficiency affect C/G transversions or CSR if the uracils subject to AID activity are not DNA intermediates?

It is remarkable that, a mere 5 years after the discovery of AID, it is now possible to articulate a comprehensive model for how this enzyme initiates both SHM and CSR. Many features, of course, remain to be worked out, and the molecular details of the events downstream of AID, UNG, and mismatch repair remain obscure. How do phase 2 mutations arise? What are the repair processes involved in joining the broken ends created during CSR? These and many other questions await further investigation. It appears, however, that we are beginning to get the bases covered.

Selected Reading

Bassing, C.H., and Alt, F.W. (2004). Cell Cycle 3, 149–153.
Begum, N.A., Kinoshita, K., Kakazu, N., Muramatsu, M., Nagaoka, H., Shinkura, R., Biniszkiewicz, D., Boyer, L.A., Jaenisch, R., and Honjo, T. (2004a). Science 305, 1160–1163.
Begum, N.A., Kinoshita, K., Muramatsu, M., Nagaoka, H., Shinkura, R., and Honjo, T. (2004b). Proc. Natl. Acad. Sci. USA 101, 13003–13007.
Chaudhuri, J., Tian, M., Khuong, C., Chua, K., Pinaud, E., and Alt, F.W. (2003). Nature 422, 726–730.
Chaudhuri, J., Khuong, C., and Alt, F.W. (2004). Nature 430, 992–998.
Di Noia, J., and Neuberger, M.S. (2002). Nature 419, 43–48.
Doi, T., Kinoshita, K., Ikegawa, M., Muramatsu, M., and Honjo, T. (2003). Proc. Natl. Acad. Sci. USA 100, 2634–2638.
Honjo, T., Muramatsu, M., and Fagarasan, S. (2004). Immunity 20, 659–668.
Imai, K., Slupphaug, G., Lee, W.I., Revy, P., Nonoyama, S., Catalan, N., Yel, L., Forveille, M., Kavli, B., Krokan, H.E., et al. (2003). Nat. Immunol. 4, 1023–1028.
Ito, S., Nagaoka, H., Shinkura, R., Begum, N., Muramatsu, M., Nakata, M., and Honjo, T. (2004). Proc. Natl. Acad. Sci. USA 101, 1975–1980.
Lee, G.S., Neiditch, M.B., Salus, S.S., and Roth, D.B. (2004). Cell 117, 171–184.
Nambu, Y., Sugai, M., Gonda, H., Lee, C.G., Katakai, T., Agata, Y., Yokota, Y., and Shimizu, A. (2003). Science 302, 2137–2140.
Neuberger, M.S., Harris, R.S., Di Noia, J., and Petersen-Mahrt, S.K. (2003). Trends Biochem. Sci. 28, 305–312.
Petersen-Mahrt, S.K., Harris, R.S., and Neuberger, M.S. (2002). Nature 418, 99–103.
Pham, P., Bransteitter, R., Petruska, J., and Goodman, M.F. (2003). Nature 424, 103–107.
Posey, J.E., Brandt, V.L., and Roth, D.B. (2004). Nat. Immunol. 5, 476–477.
Rada, C., Ehrenstein, M.R., Neuberger, M.S., and Milstein, C. (1998). Immunity 9, 135–141.
Rada, C., Williams, G.T., Nilsen, H., Barnes, D.E., Lindahl, T., and Neuberger, M.S. (2002). Curr. Biol. 12, 1748–1755.
Rada, C., Di Noia, J.M., and Neuberger, M.S. (2004). Mol. Cell 16, 163–171.