Cell, Vol. 35, 721-731, December 1983 (Part 2). Copyright © 1983 by MIT
0092-8674/83/130721-11 $02.00/0
Modifications of a Trypanosoma b. brucei Antigen Gene Repertoire by Different DNA Recombinational Mechanisms
Etienne Pays,* Marie-France Delauw,* Suzanne Van Assel,* Monique Laurent,* Tony Vervoort,† Nestor Van Meirvenne,† and Maurice Steinert*
* Département de Biologie Moléculaire, Université libre de Bruxelles, Rhode St Genèse, Belgium
† Laboratorium voor Serologie, Instituut voor Tropische Geneeskunde, Antwerp, Belgium
Summary
In the Trypanosoma b. brucei AnTat 1.1C clone, the gene coding for the variant-specific surface antigen is telomeric and appears as a hybrid sequence, partially modified by gene conversion. This conversion is very similar to that observed in another AnTat 1.1-expressor clone (AnTat 1.1B). This sequence is not activated by duplicative transposition, although it could be activated by duplication in another clone (AnTat 1.10). Instead, activation of the AnTat 1.1C gene appears to occur by reciprocal recombination between its own telomere and the telomere carrying the previous (AnTat 1.16) ELC. Indeed, from the switch to AnTat 1.1C onward, the AnTat 1.16 ELC becomes a new silent member of its gene family, whereas in the variant directly derived from AnTat 1.1C (AnTat 1.3B), the AnTat 1.1C-containing telomere is lost, probably replaced by a large duplicate, at least 40 kb long, of the AnTat 1.3 gene-containing telomere. The different DNA rearrangement mechanisms used by the trypanosome to change its antigenic type thus contribute, by gain and loss of genes, to the evolution of the repertoire for surface antigens.
Introduction
The mechanism of antigenic variation in African trypanosomes has been partially elucidated (for reviews see Borst and Cross, 1982; Englund et al., 1982). The extensive antigenic repertoire of these parasites reflects the existence of a large number of genes, each one coding for a different variant-specific surface antigen (VSA). Only one VSA gene seems to be expressed at any one time. In many cases, expression of the VSA gene clearly depends on a genomic rearrangement: a silent “basic copy” (BC) of the gene gives rise to an “expression-linked” copy (ELC), which is found transposed to a probably unique “expression site” (Hoeijmakers et al., 1980; Pays et al., 1981a, 1983b), only the ELC being transcribed (Pays et al., 1981b).
In other cases the change of antigenic type occurs in the absence of gene duplication (Williams et al., 1979; Young et al., 1982), yet this has been observed only in the case of telomeric genes. Two different mechanisms of VSA gene activation seem to coexist (Majiwa et al., 1982). It has been hypothesized that VSA genes located in chromosome ends could be exchanged with a former ELC by a reciprocal crossing-over (Borst and Cross, 1982; Laurent et al., 1983).
We have undertaken a detailed study of the T. b. brucei AnTat 1.1 VSA gene in different clones expressing the same variable antigen type (VAT), AnTat 1.1 (see the derivation scheme of these clones in Figure 1). The main results (Pays et al., 1981a, 1981c, 1983b, 1983c; Michiels et al., 1983) can be summarized as follows. The AnTat 1.1 gene belongs to a family of five related sequences, designated according to the size of the Pst I fragments in which they are almost entirely comprised: the “9 kb,” “7 kb,” “6.4 kb,” “4 kb,” and “2.15 kb” sequences (see the AnTat 1.1 Pst I digestion pattern in the first block of Figure 1). Two of these sequences (“6.4 kb” and “9 kb”) are located near a chromosome end, but are differently oriented toward the corresponding DNA terminus. The same two sequences are the only AnTat 1.1 gene family members reported to encode VSAs, the AnTat 1.1- and 1.10-specific antigens being specified by the “6.4 kb” and “9 kb” sequences, respectively. Both genes are expressed following duplicative transposition. Segments of the “6.4 kb” sequence are also duplicated and transposed as ELCs in the AnTat 1.1B, 1.1D, and 1.1E clones; however, the length of these ELCs is variable. In the AnTat 1.1B clone, which is derived from AnTat 1.10, a 1 kb long stretch from the “6.4 kb” ELC has replaced the homologous sequence from the previously expressed “9 kb” ELC, and we could show in this case that gene conversion is the mechanism by which the ELCs are generated (Pays et al., 1983c). Segmental gene conversion clearly adds to antigenic diversity, since, for instance, the AnTat 1.1B ELC codes for a chimeric protein with both AnTat 1.1- and 1.10-specific domains. The ELC is, however, generally lost from the genome during the antigenic switch, since it is often the target for the next gene conversion.
We report here, in the clone AnTat 1.1C, the existence of a VSA gene activation mechanism different from gene conversion, which can thus alternatively apply to the expression of the AnTat 1.1 VAT. The mechanism is speculatively interpreted as a telomeric reciprocal recombination, since it implies two predictions we could clearly verify, namely the conservation of the ELC from the previously expressed VSA gene (AnTat 1.16 in this case) and the loss of the AnTat 1.1C gene in the ensuing variant (in this case AnTat 1.3B), provided the VSA gene of the latter variant is activated by duplicative transposition. Alternation of the two VSA gene activation mechanisms thus contributes, by both gain and loss of different sequences, to the rapid evolution of the variable antigen gene repertoire.
Results
Alteration of an AnTat 1.1 Gene Family Member in the AnTat 1.1C Clone
The five members of the AnTat 1.1 multigene family have been analyzed in a series of ten trypanosome clones
[Figure 1: derivation scheme of the clones (EATRO 1125 - A 1.1 - A 1.3 - A 1.6 - A 1.16 - A 1.1C - A 1.3B, with a second branch A 1.10 - A 1.1B - A 1.1D - (fly) - A 1.1E); Southern blot panels for Pst I (P), Pst I+Hind III (P+H), and Hind III (H) digests hybridized with the 3’ probe; restriction maps of the “9 kb” sequence in AnTat 1.16 and in AnTat 1.1C, and of the AnTat 1.1, 1.10, 1.1B, 1.1D, and 1.1E ELCs.]
Figure 1. One of the AnTat 1.1 Gene Family Members (“9 kb”) Is Altered in Clone AnTat 1.1C, and Is Lost from AnTat 1.3B DNA
The restriction pattern of the AnTat 1.1 gene family members has been analyzed in ten trypanosome clones, whose derivation scheme is shown (A 1.1 is for AnTat 1.1, etc.) (see Experimental Procedures for the description of clone derivation). Southern blots of genomic DNA digested by Pst I, Pst I+Hind III, or Hind III have been hybridized with an AnTat 1.10 cDNA probe specific for the 3’ half of the gene (see AnTat 1.10 map below); the DNA digests are from AnTat 1.1, 1.3, 1.6, 1.16, 1.1C, 1.3B, 1.10, 1.1B, 1.1D, and 1.1E clones, respectively, from left to right in each block. The gene family members are defined according to the size of the specific Pst I fragments each one engenders and are labeled with different symbols (left panel); their position in the two other panels is indicated by the same symbols. The ELCs in the AnTat 1.1 homoisotypes (AnTat 1.1, 1.1B, 1.1D, and 1.1E) are indicated by arrowheads; they do not hybridize with the AnTat 1.10 cDNA probe to the same extent in all cases, because of local DNA rearrangements (Pays et al., 1983b, 1983c). The restriction maps of the sequences found to differ in some of the clones are shown below: the “9 kb” sequence as it is in AnTat 1.16 (and in all the other clones except AnTat 1.1C and 1.3B), the “9 kb” sequence in the AnTat 1.1C clone, and the AnTat 1.1, 1.10, 1.1B, 1.1D, and 1.1E ELCs. The two arrows above these maps point to an Eco RI site and a Pst I site which are conserved in all sequences. The left-hand Pst I site in the two top maps is located 9 kb from the arrowed one. When known, the extent of the corresponding cDNAs is indicated (open boxes). The extent of the converted sequence in each case is indicated by a bar under the maps (see text). Two probes have been derived from AnTat 1.10 cDNA, as indicated below the AnTat 1.10 map. Abbreviations used in each restriction map are B, Bgl I; Bg, Bgl II; E, Eco RI; H, Hind III; Hi, Hinf I; M, Msp I; P, Pst I; Pv, Pvu II; Sp, Sph I; Ss, Sst I; T, Taq I.
(AnTat 1.1, 1.3, 1.6, 1.16, 1.1C, 1.3B, 1.10, 1.1B, 1.1D, and 1.1E) whose pedigree, starting from the EATRO 1125 stock, is outlined in Figure 1. We usually define the AnTat 1.1 family members according to the size of the Pst I fragment to which they give rise: as shown in Figure 1 (left), they are represented in the “9 kb,” “7 kb,” “6.4 kb,” “4 kb,” and “2.15 kb” bands. In addition, the same panel shows the ELC-containing fragments (arrowheads) (2 kb in clones AnTat 1.1, 1.10, and 1.1B; 17.5 kb in AnTat 1.1D; and 21.5 kb in AnTat 1.1E), identified through detailed analysis presented elsewhere (Pays et al., 1981a, 1981b, 1983c). Strikingly, in AnTat 1.1C DNA, no ELC could be detected in Pst I digests. The restriction pattern of each of these sequences upon different digestions (for instance, Pst I+Hind III and Hind III in Figure 1) has been compared in the different clones. Only one of the AnTat 1.1 gene family members, the “9 kb” sequence, was found not to be conserved in all these clones, as illustrated in Figure 1 by the disappearance of the 9 kb band in AnTat 1.3B after Pst I digestion and in both AnTat 1.1C and 1.3B upon Pst I+Hind III double digestion. The restriction map of the “9 kb” sequence in AnTat 1.1C was compared with that in the other clones, and with the ELC maps of other homoisotypes (Figure 1). It appears that in a region around the arrowed Eco RI site, the “9 kb” sequence of the AnTat 1.1C clone has acquired cleavage sites (in particular a Hind III site) characteristic of the AnTat 1.1, 1.1B, 1.1D, and 1.1E ELCs, which all, in this region, are copied from the “6.4 kb” member (Pays et al., 1983c, and results not shown). This rearrangement is responsible for the duplication of the internal 510 bp Hind III-Pst I fragment in AnTat 1.1C (Figure 1, second panel); no evidence for a duplication of the “9 kb” member itself has been found, whatever restriction endonuclease was used (Figure 1 and results not shown).
Partial Gene Conversion between Two Gene Family Members
In order to analyze the alteration of the “9 kb” member more thoroughly, we sequenced the 9 kb Pst I fragment from variant AnTat 1.1C between codons 45 and 418 of the AnTat 1.1-specific sequence, and compared this sequence to those, already known, of the AnTat 1.1 and 1.10 cDNAs (transcribed from the “6.4 kb” and “9 kb” members, respectively; Pays et al., 1983c) (Figure 2). In variant AnTat 1.1C, down to codon 320, a long stretch of the “9 kb” sequence has clearly been replaced by a copy of the corresponding “6.4 kb” sequence. Since the “6.4 kb” sequence itself is conserved unaltered in AnTat 1.1C (Figure 1 and results not shown), the observed rearrangement can be attributed to a partial gene conversion, and not to a reciprocal recombination. This situation has also been observed in another clone, AnTat 1.1B (Pays et al., 1983c; sequence redrawn in Figure 2), although in this case the partial conversion affected an additional copy of the “9 kb” sequence, namely the AnTat 1.10 ELC, and not the “9 kb” sequence itself. The 3’ junction of the “9 kb” sequence with the incoming block is the same in both cases (asterisk in Figure 2), suggesting that recombinations at this site are favored. In the case of AnTat 1.1B, however, a stretch of 133 bp of unknown origin is intercalated between the “6.4 kb” and “9 kb” domains (Pays et al., 1983c); this is not the case in the AnTat 1.1C sequence (Figure 2). The 5’ junction is not precisely defined, but must be located a short distance upstream from the coding sequence in both cases, as judged from restriction maps (Pays et al., 1983c; Figure 1 and results not shown). Figure 3 presents the restriction maps of the “6.4 kb”- and “9 kb”-containing telomeres in AnTat 1.1B and 1.1C clones, as well as the AnTat 1.1B ELC and expression site: the specific “6.4 kb” and “9 kb” rearrangements are schematically represented in the corresponding shades, with in each case some uncertainty at their 5’ edge. The length variations of telomere ends have been observed in several cases (Young et al., 1982; De Lange and Borst, 1982; Williams et al., 1982; Pays et al., 1983b, 1983c; Bernards et al., 1983). Except for these variations, no differences other than the partial gene conversion can be detected in the restriction maps of the “6.4 kb”- and “9 kb”-containing telomeres in the two clones.
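The logic used above to place a conversion junction (the converted gene follows the “6.4 kb” donor on one side of a point and the “9 kb” recipient on the other) can be sketched in code. This is only a minimal illustration under stated assumptions, not the procedure used in this study: the three sequences are invented toy strings rather than data from Figure 2, they are assumed to be pre-aligned and of equal length, a single junction with donor-like sequence on the 5’ side is assumed, and the function names are hypothetical.

```python
def informative_sites(donor, recipient, chimera):
    """List (position, label) for sites where donor and recipient differ
    and the chimera matches exactly one of them ('D' donor, 'R' recipient)."""
    if not (len(donor) == len(recipient) == len(chimera)):
        raise ValueError("sequences must be pre-aligned to equal length")
    sites = []
    for i, (d, r, c) in enumerate(zip(donor, recipient, chimera)):
        if d == r:
            continue  # uninformative: both parents carry the same base
        if c == d:
            sites.append((i, "D"))
        elif c == r:
            sites.append((i, "R"))
    return sites


def junction_interval(sites):
    """Assume one junction: donor-like 5' of it, recipient-like 3' of it.
    Return (last donor-like position, first recipient-like position,
    number of sites contradicting that model)."""
    labels = [label for _, label in sites]
    best_k, best_err = 0, len(labels) + 1
    for k in range(len(labels) + 1):
        # sites before k should be 'D', sites from k onward should be 'R'
        err = labels[:k].count("R") + labels[k:].count("D")
        if err < best_err:
            best_k, best_err = k, err
    left = sites[best_k - 1][0] if best_k > 0 else None
    right = sites[best_k][0] if best_k < len(sites) else None
    return left, right, best_err


# Toy, hypothetical sequences (0-based positions), not real VSA gene data.
donor = "ACGTACGTAC"      # stands in for the "6.4 kb" member
recipient = "ATGTTCGAAG"  # stands in for the "9 kb" member
chimera = "ACGTACGAAG"    # donor-like 5' part, recipient-like 3' part

sites = informative_sites(donor, recipient, chimera)
print(junction_interval(sites))  # (4, 7, 0)
```

The junction can only be bracketed between two informative sites, which is one way to see why a junction falling in a region where donor and recipient are identical cannot be positioned more precisely from sequence alone.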
The “9 kb” Sequence Is Transcribed in the AnTat 1.1C Clone
Since no AnTat 1.1 ELC could be detected in the AnTat 1.1C clone, we speculated that the expressed AnTat 1.1C gene could be the rearranged “9 kb” sequence. In isolated nuclei, expressed sequences are selectively digested by DNAase I as a result of an altered configuration of the actively transcribing chromatin (Weintraub and Groudine, 1976). We thus exposed AnTat 1.1C nuclei to a mild DNAase I digestion, in order to see whether the “9 kb” sequence would be preferentially hydrolyzed. This was the case, as illustrated in Figure 4. Both the 2 kb Eco RI fragment of the “9 kb” sequence (A) and the 9 kb Pst I fragment itself (B) are more sensitive to DNAase I than the other AnTat 1.1-specific fragments of the AnTat 1.1C genomic DNA. The “9 kb” sequence is thus in an active chromatin configuration in the AnTat 1.1C clone; this is not true in AnTat 1.1B (Figure 4B) or in other clones (results not shown). Proof that the rearranged “9 kb” sequence is the template for AnTat 1.1C mRNA synthesis came from the analysis of the AnTat 1.1C cDNA. Indeed, the restriction map of the cloned AnTat 1.1C cDNA is matched only by the rearranged “9 kb” sequence map (represented by the box in the AnTat 1.1C map, Figure 1). Portions of the AnTat 1.1C cDNA have been sequenced at both extremities (codons 22 to 59 and codons 308 to the end). These AnTat 1.1C sequences were identical with the corresponding ones from the sequenced portion (codons 45 to 418) of the “9 kb” AnTat 1.1-specific sequence (Figure 2); the AnTat 1.1C cDNA is thus a perfect copy of the chimeric “6.4 kb”/“9 kb” sequence. We can exclude the possibility that the “6.4 kb” domain of the AnTat 1.1C cDNA was
[Figure 2: codon-by-codon alignment of the AnTat 1.1 (1), 1.10 (10), 1.1B (1B), and 1.1C (1C) nucleotide sequences, numbered from the ATG initiation codon; sequence panel not reproduced here.]
Figure 2. DNA Sequence of the AnTat 1.1C Gene
Partial DNA sequence of the modified “9 kb” family member in variant AnTat 1.1C (1C, between arrowheads) and of the AnTat 1.1C cDNA (1C, two stretches, respectively between the pairs of thick and thin arrows), compared with the AnTat 1.1 (1) and 1.10 (10) cDNAs (transcribed from ELCs of the “6.4 kb” and “9 kb” family members, respectively; see Pays et al., 1983c) and with the AnTat 1.1B cDNA (1B, redrawn from Pays et al., 1983c). In aligning these sequences, homologies are maximized by compensating for small deletions or insertions. The differences with respect to the AnTat 1.1C sequence are underlined. The asterisk marks a rearrangement junction point in both AnTat 1.1B and 1.1C DNAs. Codons are numbered starting from the known AnTat 1.1 ATG initiation codon (Pays et al., 1983c; Michiels et al., 1983). The signal peptide and hydrophobic tail of the AnTat 1.1 antigen putative precursor are made up of 29 and 23 amino acids, respectively (Michiels et al., 1983), so that the junction marked with an asterisk (codon 320 on Figure 2) is located at 160 codons from the C-terminal end of the mature protein (between the 290th and the 291st codons on a total of 451).
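The codon bookkeeping in this legend can be checked with a short calculation. The sketch below only restates the numbers given in the legend; the variable and function names are hypothetical, and the precursor length is an inference that assumes the precursor is simply signal peptide + mature protein + hydrophobic tail.

```python
# Values quoted in the Figure 2 legend.
SIGNAL_PEPTIDE_CODONS = 29    # removed from the N terminus of the precursor
HYDROPHOBIC_TAIL_CODONS = 23  # removed from the C terminus of the precursor
MATURE_PROTEIN_CODONS = 451   # total length of the mature protein
JUNCTION_CODON = 320          # junction (asterisk), numbered from the ATG


def mature_position(precursor_codon):
    """Convert a codon number counted from the ATG into mature-protein numbering."""
    return precursor_codon - SIGNAL_PEPTIDE_CODONS


junction_in_mature = mature_position(JUNCTION_CODON)
codons_to_c_terminus = MATURE_PROTEIN_CODONS - junction_in_mature
# Assumption: precursor = signal peptide + mature protein + hydrophobic tail.
precursor_codons = (SIGNAL_PEPTIDE_CODONS + MATURE_PROTEIN_CODONS
                    + HYDROPHOBIC_TAIL_CODONS)

print(junction_in_mature)    # 291
print(codons_to_c_terminus)  # 160
print(precursor_codons)      # 503
```

Codon 320 of the precursor thus corresponds to codon 291 of the mature protein, so the junction lies between mature codons 290 and 291 and leaves 160 codons to the C-terminal end, in agreement with the legend.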
AnTat 1.1C trypanosomes are serologically indistinguishable from AnTat 1.1 is also demonstrated by the immunofluorescence test on fixed trypanosome smears (Table 1).
The AnTat 1.1C Telomere Seems to Be Exchanged with the Telomere Containing the Previous (AnTat 1.16) ELC
Figure 3. Restriction Maps of the “6.4 kb” and “9 kb” Sequences in AnTat 1.1B and 1.1C DNAs, and Comparison with the AnTat 1.1B ELC Map
The sequence corresponding to the AnTat 1.1 mRNA is boxed in each map, with only the conserved Eco RI and Pst I sites indicated within these limits. The specific “6.4 kb” and “9 kb” sequences are represented in black and white, respectively (see text and Pays et al., 1983c). The wavy arrows represent the transcription of AnTat 1.1B and 1.1C mature mRNAs.
synthesized on the “6.4 kb” sequence itself, since the latter, which is the AnTat 1.1 BC, never appears to be in an active chromatin configuration, either in AnTat 1.1C or in other clones (Figure 4B and results not shown).
The AnTat 1.1C Trypanosomes Are Serologically Indistinguishable from AnTat 1.1
It has been shown, by the immune lysis test on living cells (Pays et al., 1983c), that the surface-exposed AnTat 1.1C antigenic determinants share the serological specificity of AnTat 1.1, although the AnTat 1.1C mRNA is different from the AnTat 1.1 mRNA (boxed maps in Figure 1). That the
Since the AnTat 1.1C clone was directly derived from AnTat 1.16, we also investigated the structure and expression of the AnTat 1.16 gene. The AnTat 1.16 cDNA was therefore cloned and used as a probe in Southern blots of genomic digests from six successive variants (AnTat 1.1, 1.3, 1.6, 1.16, 1.1C, and 1.3B). A map in Figure 5 summarizes the information collected about the AnTat 1.16 sequence: this gene is telomeric, as indicated by the BAL31 exonuclease sensitivity of its 3’ environment (result not shown); moreover, its activation mechanism involves the transposition of an ELC into another telomere. The AnTat 1.16 ELC is clearly seen in Hind III digestion products of the AnTat 1.16 genome (Figure 5) but, curiously, this additional copy is conserved in the two ensuing variants (AnTat 1.1C and AnTat 1.3B). The AnTat 1.16 ELC is in an active chromatin configuration in the AnTat 1.16 genome only, as revealed by its higher DNAase I sensitivity in AnTat 1.16 DNA only (Figure 5): the remaining AnTat 1.16 ELC in clones AnTat 1.1C and 1.3B is thus probably not transcribed. We have also shown that the AnTat 1.1C gene (in the rearranged “9 kb” sequence) is expressed without being duplicated (Figures 1 and 4), then is lost in the next variant, AnTat 1.3B (Figure 1). Together, these observations suggest that the telomere harboring the “9
The “Active” AnTat 1.1C Telomere Is Probably Replaced by a Large Copy of Another Telomere in the Ensuing AnTat 1.3B Variant
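[Figure 4: Southern blots of DNAase I digestion kinetics. (A) DNAase I + Eco RI, AnTat 1.1C nuclei, probe A 1.10 (5’). (B) DNAase I + Pst I, AnTat 1.1B and 1.1C nuclei, probe A 1.10 (3’).]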
Figure 4. Kinetics of DNAase I Hydrolysis of AnTat 1.1-Specific Sequences, Showing That the “9 kb” Family Member Is in an Active Chromatin Configuration in the AnTat 1.1C Clone
AnTat 1.1C or 1.1B nuclei were isolated and digested by DNAase I as described (Pays et al., 1981b), for 0, 15, 30 sec, 1, 2, 4, 6, 8, 10, 12, 14, 16, and 20 min, from left to right in A (only up to 12 min in B). After Eco RI (A) or Pst I (B) digestion, the DNA was hybridized with AnTat 1.10 cDNA probes specific for the 5’ (A) or 3’ (B) halves of the gene (see map in Figure 1). Arrowheads indicate the sequences that are preferentially hydrolyzed; the 2 kb Eco RI fragment in (A) and the 2 kb Pst I fragment in (B) belong to the “9 kb” sequence and to the AnTat 1.1B ELC, respectively. Note the fast disappearance of the 9 kb Pst I band in the AnTat 1.1C DNA, as compared to the same band in the AnTat 1.1B DNA.
Table 1. Serological Comparison of Surface-Exposed Epitopes in AnTat 1.1 Homoisotypes
The loss of the “9 kb” sequence in the AnTat 1.3B clone, which stems from AnTat 1.1C (Figure 1), is most easily explained as a replacement by the AnTat 1.3B ELC (see scheme in Figure 6). The extent of the “9 kb” sequence replacement cannot be precisely defined, but at least encompasses the AnTat 1.1C gene as well as a 1.4 kb long stretch in front of it, up to an Eco RI site located 2 kb upstream from the Eco RI site arrowed in Figure 1. We cloned this 2 kb Eco RI sequence, and when used as a probe, it failed to reveal the corresponding sequence in AnTat 1.3B genomic DNA digests (results not shown). To verify that the AnTat 1.3B gene is activated by transposition of an ELC, we cloned the AnTat 1.3B cDNA and used it as a probe in genomic DNA digests. Figure 7 shows that three AnTat 1.3-specific sequences are revealed in AnTat 1.3B DNA, one of which is an ELC. The ELC can be revealed as a distinct band only by probing its 3’ environment, up to the DNA end (Figure 7, digestion by Pvu II). When hybridizing the cDNA probe to fragments characteristic of the gene’s 5’ environment only (such as Sph I, Figure 7), the ELC fragments are perfectly superimposed on the corresponding ones from another AnTat 1.3-specific sequence. Digestion by DNAase I shows that in these cases only one of the two superimposed fragments is in an active chromatin configuration (Figure 7). We found in this way that up to a Bam HI site about 40 kb upstream from the AnTat 1.3 sequence, the restriction map of the 5’ environment of the AnTat 1.3B ELC is indistinguishable from that of this other AnTat 1.3 sequence, which is therefore considered to be the template of a large AnTat 1.3B ELC, at least 40 kb long. This ELC has thus probably replaced the “9 kb” telomere in the AnTat 1.3B clone.
Interestingly, the template used is not the AnTat 1.3 BC, but the remaining AnTat 1.3 ELC (ex-ELC) conserved in a silent form in all variants derived from AnTat 1.3, probably as a result of telomeric exchange with the AnTat 1.6 gene-containing telomere (Laurent et al., submitted).
Discussion
Trypanosome clones: AnTat 1.1, AnTat 1.10, AnTat 1.1B, AnTat 1.1C, AnTat 1.6, AnTat 1.8
Antisera: Anti-1.1, Anti-1.10, Anti-1.1B, Anti-1.1C, Anti-1.6, Anti-1.8
320 160 640 -
320 -
320 160 640640-
320 160
-
-
640 -
160
The end-titres of reaction of dday antisera from clone-infected rabbits with acetone-fixed trypanosomes have been determined by immunofluorescence test (Van Meirvenne et al., 1977). Dashes indicate no reaction at antiserum dilution of 1/80.
kb” sequence has been exchanged with the AnTat 1.16 ELC-containing telomere in the AnTat 1.1C variant, if we assume that the recombination has taken place downstream from the putative transcription promoter(s) in the “active” telomere (see scheme in Figure 6).
This paper reports that in the AnTat 1.1C clone, cDNA probes fail to detect an ELC, but that the “9 kb” sequence has been partially converted by another member of the AnTat 1.1 multigene family, the “6.4 kb” sequence. We had shown that this “6.4 kb” sequence acted as the BC for the AnTat 1.1 and 1.1B ELCs (Pays et al., 1981a), and for the AnTat 1.1D and 1.1E ELCs as well (Pays et al., 1983b). On the other hand, the “9 kb” sequence acted as the BC for the AnTat 1.10 ELC (Pays et al., 1983c). In all these previously examined cases, gene conversion took place within the expression site and with one of the BCs as donor. It might thus seem curious that, in the present case, the “9 kb” sequence behaved as receptor in a conversion event; on closer examination, however, it appears that this sequence shares more similarities with ELCs.
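[Figure 5: Southern blots of Hind III digests (H) and of DNAase I + Hind III digestion kinetics (AnTat 1.16 and AnTat 1.1C nuclei), probe A 1.16; restriction maps of the AnTat 1.16 BC and ELC, with the ELC probe indicated.]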
Figure 5. The AnTat 1.16 ELC Is Conserved, in an Inactive Chromatin Configuration, in the Ensuing AnTat 1.1C and 1.3B Clones
The Hind III digests of AnTat 1.1, 1.3, 1.6, 1.16, 1.1C, and 1.3B genomic DNAs, respectively, from left to right in the first panel, have been hybridized with an AnTat 1.16 cDNA probe, revealing an additional AnTat 1.16 gene copy in the last three variants (9.6 kb). This copy is highly susceptible to DNAase I hydrolysis in AnTat 1.16 nuclei only: the middle and right panels show kinetics of DNAase I digestion on AnTat 1.16 and 1.1C nuclei, respectively, performed as described in the legend to Figure 4. Digestion times were 0, 7, 15, 30, 60 sec, 2, 4, 6, 8, and 10 min, from left to right in each block, respectively. The 9.6 kb Hind III additional fragment (arrowhead) disappears after 2 min digestion in AnTat 1.16, whereas it is still visible after 10 min digestion in AnTat 1.1C nuclei; the internal control is the 20 kb fragment, which in both cases disappears after a 4 min digestion. The AnTat 1.16 6.3 kb and AnTat 1.1C 7.1 kb and 5.5 kb sequences are due to polymorphism: they represent telomeric fragments containing a divergent form, lacking the 3’ Hind III site, of the AnTat 1.16 BC in AnTat 1.16 DNA, and of the AnTat 1.16 BC and ex-ELC in AnTat 1.1C DNA, respectively (data not shown). The restriction maps of the AnTat 1.16 BC and ELC sequences and surroundings are presented below. The known extent of the ELC is underlined under the BC map, with uncertainties at both ends; the open box represents the extent of the cDNA. The cDNA fragment used as a probe is shown below the ELC map.
Figure 6. Possible Scheme of Gene Activation in a Series of Three Successive Variants, Accounting for the Observations Reported Here If A, B, and C Are the AnTat 1.16, 1.1C, and 1.3B Genes
The wavy arrow represents gene transcription, and the P the transcription promoter.
Figure 7. The AnTat 1.3B ELC Is Probably at Least 40 kb Long
The Pvu II and Sph I digests of AnTat 1.16, 1.1C, and 1.3B genomic DNAs, respectively, from left to right in the two left panels, have been hybridized with an AnTat 1.3B cDNA probe. An additional AnTat 1.3B gene copy is revealed as a distinct band only when probing its 3’ environment (Pvu II digestion). When probing the 5’ environment (Sph I digestion, for instance), this ELC appears superimposed on another AnTat 1.3-specific fragment, but selectively disappears under DNAase I digestion (arrowhead). The restriction maps of the three distinct AnTat 1.3-specific sequences present in AnTat 1.3B DNA are shown below: BC has been identified as the AnTat 1.3 BC (Laurent et al., 1983); ex-ELC is the AnTat 1.3 ELC, which has been conserved in a silent form in all variants derived from AnTat 1.3 (Laurent et al., submitted) and used as template for the AnTat 1.3B ELC; ELC is the additional sequence present in AnTat 1.3B DNA, also identified as the AnTat 1.3B ELC thanks to DNAase I digestion kinetics on different restriction fragments (data not shown). The extent of the AnTat 1.3B ELC is underlined under the ex-ELC map, with uncertainties at both ends; the open box represents the extent of the cDNA. The cDNA fragment used as a probe is shown below the ELC map.
It should first be recalled that this sequence is found in a telomere, as are the actively transcribed ELCs. We observed, first, that the converted “9 kb” member in clone AnTat 1.1C acquired a much increased sensitivity to DNAase I and, second, that no other sequence shares the restriction map and nucleotide sequence of the AnTat 1.1C cDNA, proving that this sequence became the active AnTat 1.1C gene. Moreover, in the switch leading to the next variant, AnTat 1.3B, the converted “9 kb” sequence disappears, probably displaced by the new AnTat 1.3B ELC, which indicates that the telomere that harbored the former “9 kb” sequence behaves as an expression site in AnTat 1.1C, and probably in 1.3B as well. Although it shares the properties of the VSA gene expression site described so far (Pays et al., 1983b), the “9 kb”-containing telomere does not exhibit the same 5’ restriction map, at least within about 20 kb upstream from the gene; on the other hand, the restriction maps of the “9 kb” telomere in the expressor clone AnTat 1.1C and in the nonexpressor clones (for example, AnTat 1.1, 1.3, 1.6, and 1.16) are identical, except for the converted region of the gene. Two explanations can be given. First, there are multiple sites where VSA genes can be expressed, controlled in such a way that only one is operating at any one time. Second, in generating the AnTat 1.1C variant, the “9 kb” telomere has been exchanged, by reciprocal recombination 3’ to the putative promoter, with the chromosome end that harbored the previous (AnTat 1.16) ELC. As predicted from such a rearrangement, the AnTat 1.16 gene family is amplified by one unit from the clone AnTat 1.1C onward. Certainly, models derived from the multiple-site hypothesis (transposable regulatory sequences, for instance) could also account for gene addition, provided that the activated sequence would act as receptor in the conversion process. We nevertheless favor the simpler crossing-over model. We thus propose that the switch from AnTat 1.16 to 1.1C was operated by the combined effects of two different recombination mechanisms: a reciprocal crossing-over
that involved at least 20 kb of telomeric DNAs, thus positioning the “9 kb” sequence downstream from the putative unique VSA gene promoter, and a partial conversion of this translocated sequence by the “6.4 kb” family member. Since the partial gene conversion is very similar in AnTat 1.1C and AnTat 1.1B, at least one of the recombination loci being identical, it seems that these interactions could not be completely random, or that a selection against other interactions has operated, if, for instance, recombinations between two protein domains were favored. In this respect, it is noteworthy that the junction point between the AnTat 1.1 and AnTat 1.10 domains in both AnTat 1.1B and 1.1C sequences is 160 codons from the C terminus of the mature protein (see legend of Figure 2), whereas it is known (Cross, 1978) that in their native state, the variant-specific antigens can be selectively attacked by trypsin at a point about 150 residues from their C-terminal extremity. These observations strongly suggest that the region around 150 amino acids from the protein end hinges two protein domains, and that homologous domains from different antigens can be exchanged. Similarly, it has been reported that intron-exon junctions correspond to boundaries between protein domains (Sakano et al., 1979; Stein et al., 1980; Eiferman et al., 1981) and that variations are facilitated in these regions, because they provide additional protein surface loops while maintaining the protein function (Craik et al., 1983). The AnTat 1.1B and 1.1C recombined sequences encode proteins with serologically indistinguishable surface-exposed determinants, though another partial conversion involving the same AnTat 1.1 gene family members could lead to antigenic variation (Pays et al., 1983c). Since in the AnTat 1.1C clone the “9 kb” sequence has been partially converted, it has irreversibly lost a fraction of its genetic information.
The lost sequence corresponds to the AnTat 1.10 coding message (Pays et al., 1983c), and it is thus likely that the ability to express an AnTat 1.10-like serotype has been lost in AnTat 1.1C. Furthermore, the “9 kb” sequence has been completely removed from AnTat 1.3B. These examples show how an antigen repertoire can evolve by loss of genes; on the other hand, since the AnTat 1.16 gene family is amplified from the AnTat 1.1C clone onward, the incoming sequence could further diversify into another antigen coding sequence by genetic alterations such as point mutations or gene conversion, as suggested by Young et al. (1983). Thus, in addition to operating the antigenic switch (Pays et al., 1983c), gene conversion, in combination with the reciprocal rearrangement mechanism, allows the trypanosomes to modify their antigenic repertoire. Since the same gene (the AnTat 1.1-specific sequence in the “9 kb” telomere, for instance) can be activated by either of the two mechanisms, a given gene could be amplified in some repertoires and lost in others, depending on the way these mechanisms alternate. Gene conversion has also been observed as a diversifying mechanism in other multigene families (Slightom et al., 1980; Schreier et al., 1981; Dildrop et al., 1982; Roeder and Fink, 1982; Ollo and Rougeon, 1983; Bentley and Rabbitts, 1983; Pease et al., 1983; Weiss et al., 1983), though also allowing for some homogenization of gene family sequences (Baltimore, 1981).
In the AnTat 1.3B variant, the gene conversion leading to gene activation seems to involve at least 40 kb of telomeric sequence. This is considerably more than the extent of gene conversions described so far in trypanosome antigenic variation: from 1 kb in AnTat 1.1B (Pays et al., 1983c) to about 3 kb in numerous cases where the limits of the converted sequences appear to be in repeated elements (Van der Ploeg et al., 1982a, 1982c; Pays et al., 1982, 1983a). Obviously, the length of the duplicated sequence is not as rigorously defined as in the “cassette” system controlling the mating type interconversion in yeast (Hicks et al., 1977), but seems to depend on the nature of the target sequence. In the AnTat 1.1B case, for instance, the homologies between the incoming (AnTat 1.1B) and outgoing (AnTat 1.10) genes are extensive (Pays et al., 1983c), so that recombination could take place all along the molecule. In most cases, however, sequence homologies between ELCs are restricted to blocks present both at the 3’ extremity of different genes (Matthyssens et al., 1981; Majumder et al., 1981; Rice-Ficht et al., 1981; Bernards et al., 1981) and in front of the gene, about 3 kb upstream from the 3’ block (Van der Ploeg et al., 1982a, 1982c; Pays et al., 1982, 1983a, 1983b). In the AnTat 1.3B clone, the target sequence (AnTat 1.1C gene) is in a telomere whose restriction map completely differs, for at least 20 kb, from either the map characteristic for most expression sites described so far in the AnTAR 1 repertoire (Pays et al., 1983b) or the very similar restriction maps of different telomeric BCs, such as AnTat 1.3 (Laurent et al., 1983) or AnTat 1.16 (this study).
It is possible that the sequence homologies between the AnTat 1.1C and AnTat 1.3B telomeres were restricted to a region far upstream from the two concerned genes, so that the duplicated sequence would have to be very long to reach the recombinational locus of homology. According to the model presented in Figure 6, this locus has to be downstream from the transcription promoter, which should therefore be located very far upstream from the gene. This conclusion is in accordance with other observations (Van der Ploeg et al., 1982b; Michiels et al., 1983), including the fact that the putative crossing-over point between the AnTat 1.16 and AnTat 1.1C gene-containing telomeres must be far from the genes, since we did not manage to map it within at least 20 kb (Figure 3). On the other hand, occasional duplications of large telomeric stretches, as in AnTat 1.3B, could account for the observation that different chromosome ends, such as those harboring the AnTat 1.3 or 1.16 BCs, have very similar restriction maps.
In conclusion, this paper emphasizes the importance of telomere interactions in the control of trypanosome antigen gene expression. It shows that the expression of the same antigen type (AnTat 1.1) can be obtained in very different ways: it can result from the activation, by different types of DNA rearrangements, of sequence combinations affecting different DNA lengths. Finally, it indicates that the same “9 kb” telomeric sequence can either give rise to an ELC (in variant AnTat 1.10) or, on the contrary, become a target for gene conversion (in variant AnTat 1.1C), probably after reciprocal recombination with another telomere. This flexibility is obviously an additional key to the parasite’s adaptability.
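The bookkeeping implied by the scheme of Figure 6 can be followed with a small sketch. It is only an illustration under stated assumptions, not part of the experimental work: the function names, the family labels, and the starting copy numbers (one silent AnTat 1.16 BC, one “9 kb” and one “6.4 kb” member, and two silent AnTat 1.3 copies, the BC and the ex-ELC) are hypothetical simplifications that ignore the polymorphic copies mentioned in the figure legends.

```python
from collections import Counter

def reciprocal_exchange(silent, active, incoming):
    """Telomere swap: `incoming` leaves the silent pool and becomes active,
    while the formerly active copy is retained as a silent gene."""
    if silent[incoming] < 1:
        raise ValueError(f"no silent copy of {incoming!r} to exchange")
    silent = silent.copy()
    silent[incoming] -= 1
    silent[active] += 1
    return +silent, incoming  # unary + drops families left with zero copies

def partial_conversion(silent, active, donor, product):
    """A silent donor overwrites part of the active gene; the donor is
    unchanged and the original active sequence no longer exists."""
    if silent[donor] < 1:
        raise ValueError(f"no silent copy of {donor!r} to act as donor")
    return silent.copy(), product

def duplicative_replacement(silent, active, donor):
    """A copy of a silent donor replaces the active gene, which is lost."""
    if silent[donor] < 1:
        raise ValueError(f"no silent copy of {donor!r} to duplicate")
    return silent.copy(), donor

def family_sizes(silent, active):
    """Copies per family, counting the silent ones plus the active one."""
    sizes = Counter(silent)
    sizes[active] += 1
    return sizes

def run_switches():
    # Assumed (illustrative) silent repertoire of the AnTat 1.16 clone.
    silent = Counter({"1.16": 1, "9 kb (1.10)": 1, "6.4 kb (1.1)": 1, "1.3": 2})
    active = "1.16"
    history = [("AnTat 1.16", family_sizes(silent, active))]

    # Switch to AnTat 1.1C: crossing-over, then partial gene conversion.
    silent, active = reciprocal_exchange(silent, active, "9 kb (1.10)")
    silent, active = partial_conversion(silent, active, "6.4 kb (1.1)", "1.1C hybrid")
    history.append(("AnTat 1.1C", family_sizes(silent, active)))

    # Switch to AnTat 1.3B: a duplicate of an AnTat 1.3 telomere takes over.
    silent, active = duplicative_replacement(silent, active, "1.3")
    history.append(("AnTat 1.3B", family_sizes(silent, active)))
    return history

if __name__ == "__main__":
    for variant, sizes in run_switches():
        print(variant, dict(sizes))
```

In this toy model the AnTat 1.16 ELC is kept as a second, silent copy from AnTat 1.1C onward, the “9 kb” coding information disappears from the repertoire, and the AnTat 1.3 family gains one copy in AnTat 1.3B, which is the pattern of gain and loss discussed above.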
Note Added
The three gene conversion endpoints identified so far in the AnTat 1.1, 1.1B, and 1.1C gene rearrangements are all located next to a CACA sequence (see Rogers, 1983, for a review on the meaning of CACA sequences).
Experimental Procedures
Trypanosomes
The different antigenic variants were cloned at the Institute of Tropical Medicine (Antwerp) from the same T. b. brucei stock (EATRO 1125). The AnTat 1.1, 1.1B, 1.3, and 1.10 bloodstream trypanosome clones have been described previously (Pays et al., 1981a). AnTat 1.1D is an independent 1.1-like variant, cloned from the same stock, and AnTat 1.1E was cloned after cyclical transmission of AnTat 1.1D by the tsetse fly, Glossina morsitans morsitans (Le Ray et al., 1977). An abridged pedigree of the different clones is given in Figure 1. Their nomenclature follows the rules recommended by Lumsden (1982). All trypanosome populations were grown in mice. Clones of successive VATs were set up from heterotypes arising in the diversifying clone of the previous VAT. Reference clones used for DNA analysis were more than 99% serologically homogeneous.
DNA Isolation
DNA isolation was carried out as described (Pays et al., 1980). Briefly, the cells were lysed in 10 mM NaCl, 250 mM EDTA, 1% SDS, 10 mM Tris-HCl (pH 8), then incubated for 1 hr at 37°C with 100 µg/ml RNAase A, followed by a 4 hr incubation at 37°C with 500 µg/ml proteinase K. After dialysis against 10 mM NaCl, 0.2 mM EDTA, 10 mM Tris-HCl (pH 8), the DNA was purified by CsCl gradient centrifugation and further dialysis.
Molecular Cloning
AnTat 1.10, 1.16, 1.1C, and 1.3B cDNAs were cloned in the Pst I site of pBR322 by the G-C tailing procedure (Humphries et al., 1977); screening was done as previously described (Pays et al., 1980). The AnTat 1.1C gene has been cloned in λgt WES·λB phage as follows: 9 kb Pst I fragments were isolated from AnTat 1.1C genomic DNA by preparative electrophoresis on low melting point agarose gels, then digested by Eco RI and inserted by ligation between the Eco RI arms of λgt WES·λB (Tiemeier et al., 1976). In vitro packaging of recombinant DNA molecules was performed as described by Hohn and Murray (1977) and screening according to Benton and Davis (1977), using two 32P-labeled probes: either the 500 bp 5’ Eco RI fragment or the 750 bp 3’ Eco RI-Pst I fragment from AnTat 1.10 cDNA (see Figure 1 for a map). We obtained several clones harboring the corresponding 2 kb 5’ Eco RI-Eco RI and 750 bp 3’ Eco RI-Pst I fragments of the AnTat 1.1C gene and neighborhood (Figure 1), cloned in tandem together with other Eco RI-Pst I fragments.
Probes
Specific parts of the cloned sequences were isolated by several rounds of preparative electrophoresis on low melting point agarose. Fragments were 32P-labeled by nick translation (Rigby et al., 1977), and the specificity of each probe was checked by back-hybridization with a mixture of different isolated fragments.
Hybridization
Hybridization of the probes with Southern blots of digested genomic DNAs was performed as in previous work (Pays et al., 1981a). As size markers we used Hind III digestion products of λ and SV40 DNAs, 32P-labeled by filling-in of the cohesive ends with the Klenow fragment of DNA polymerase.
DNA Sequencing
We followed the method of Sanger et al. (1980), the appropriate DNA restriction fragments being subcloned in M13mp9 or mp8 vectors (Messing et al., 1981) after isolation by electrophoresis in low melting point agarose gels.
Acknowledgments
We thank D. Franckx for careful preparation of the figures. This investigation received support from the Fonds de la Recherche Scientifique Medicale (FRSM, Brussels), from the ILRAD/Belgian Research Centres Agreement for Collaborative Research (Nairobi), and from the Trypanosomiases component of the UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases (Geneva). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received July 13, 1983; revised September 19, 1983
References
Baltimore, D. (1981). Gene conversion: some implications for immunoglobulin genes. Cell 24, 592-594.
Bentley, D. L., and Rabbitts, T. H. (1983). Evolution of immunoglobulin V genes: evidence indicating that recently duplicated human Vκ sequences have diverged by gene conversion. Cell 32, 181-189.
Benton, W. D., and Davis, R. W. (1977). Screening lambda gt recombinant clones by hybridization to single plaques in situ. Science 196, 180-182.
Bernards, A., Van der Ploeg, L. H. T., Frasch, A. C. C., Borst, P., Boothroyd, J. C., Coleman, S., and Cross, G. A. M. (1981). Activation of trypanosome surface glycoprotein genes involves a duplication-transposition leading to an altered 3’ end. Cell 27, 497-505.
Bernards, A., Michels, P. A. M., Lincke, C. R., and Borst, P. (1983). Growth of chromosome ends in multiplying trypanosomes. Nature 303, 592-597.
Borst, P., and Cross, G. A. M. (1982). Molecular basis for trypanosome antigenic variation. Cell 29, 291-303.
Cross, G. A. M. (1978). Antigenic variation in trypanosomes. Proc. Royal Soc. Lond. B. 202, 55-72.
De Lange, T., and Borst, P. (1982). Genomic environment of the expression-linked extra copies of genes for surface antigens of Trypanosoma brucei resembles the end of a chromosome. Nature 299, 451-453.
Dildrop, R., Brüggemann, M., Radbruch, A., Rajewsky, K., and Beyreuther, K. (1982). Immunoglobulin V region variants in hybridoma cells. II. Recombination between V genes. EMBO J. 1, 635-640.
Eiferman, F. A., Young, P. R., Scott, R. W., and Tilghman, S. M. (1981). Intragenic amplification and divergence in the mouse α-fetoprotein gene. Nature 294, 713-718.
Englund, P. T., Hajduk, S. L., and Marini, J. C. (1982). The molecular biology of trypanosomes. Ann. Rev. Biochem. 51, 695-726.
Hicks, J. B., Strathern, J. N., and Herskowitz, I. (1977). The cassette model of mating-type interconversion. In DNA Insertion Elements, Plasmids, and Episomes, A. I. Bukhari et al., eds. (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory), p. 457.
Hoeijmakers, J. H. J., Frasch, A. C. C., Bernards, A., Borst, P., and Cross, G. A. M. (1980). Novel expression-linked copies of the genes for variant surface antigens in trypanosomes. Nature 284, 78-80.
Hohn, B., and Murray, K. (1977). Packaging recombinant DNA molecules into bacteriophage particles. Proc. Nat. Acad. Sci. USA 74, 3259-3263.
Humphries, P., Cochet, M., Krust, A., Gerlinger, P., Kourilsky, P., and Chambon, P. (1977). Molecular cloning of extensive sequences of the in vitro synthesized chicken ovalbumin structural gene. Nucl. Acids Res. 4, 2389-2406.
Laurent, M., Pays, E., Magnus, E., Van Meirvenne, N., Matthyssens, G., Williams, R. O., and Steinert, M. (1983). DNA rearrangements linked to the expression of a predominant surface antigen gene of trypanosomes. Nature 302, 263-266.
Le Ray, D., Barry, J. D., Easton, C., and Vickerman, K. (1977). First tsetse fly transmission of the “AnTat” serodeme of Trypanosoma brucei. Ann. Soc. Belge Méd. Trop. 57, 369-381.
Lumsden, W. H. R. (1982). Characterization and nomenclature of trypanosome serodemes and zymodemes: report of the meeting held in Edinburgh, Sept. 1978. Sys. Parasitol. 4, 373-376.
Majiwa, P. A. O., Young, J. R., Englund, P. T., Shapiro, S. Z., and Williams, R. O. (1982). Two distinct forms of surface antigen gene rearrangement in Trypanosoma brucei. Nature 297, 514-516.
Majumder, H., Boothroyd, J. C., and Weber, H. (1981). Homologous 3’-terminal regions of mRNAs for surface antigens of different antigenic variants of Trypanosoma brucei. Nucl. Acids Res. 9, 4745-4753.
Matthyssens, G., Michiels, F., Hamers, R., Pays, E., and Steinert, M. (1981). Two variant surface glycoproteins of Trypanosoma brucei have a conserved C-terminus. Nature 293, 230-233.
Messing, J., Crea, R., and Seeburg, P. H. (1981). A system for shotgun DNA sequencing. Nucl. Acids Res. 9, 309-321.
Michiels, F., Matthyssens, G., Kronenberger, P., Pays, E., Dero, B., Van Assel, S., Darville, M., Cravador, A., Steinert, M., and Hamers, R. (1983). Gene activation and reexpression of a Trypanosoma brucei variant surface glycoprotein. EMBO J. 2, 1185-1192.
Ollo, R., and Rougeon, F. (1983). Gene conversion and polymorphism: generation of mouse immunoglobulin γ2a chain alleles by differential gene conversion by γ2b chain gene. Cell 32, 515-523.
Pays, E., Delronche, M., Lheureux, M., Vervoort, T., Bloch, J., Gannon, F., and Steinert, M. (1980). Cloning and characterization of DNA sequences complementary to messenger ribonucleic acids coding for the synthesis of two surface antigens of Trypanosoma brucei. Nucl. Acids Res. 8, 5965-5981.
Pays, E., Van Meirvenne, N., Le Ray, D., and Steinert, M. (1981a). Gene duplication and transposition linked to antigenic variation in Trypanosoma brucei. Proc. Nat. Acad. Sci. USA 78, 2673-2677.
Pays, E., Lheureux, M., and Steinert, M. (1981b). The expression-linked copy of the surface antigen gene in Trypanosoma is probably the one transcribed. Nature 292, 265-267.
Pays, E., Lheureux, M., and Steinert, M. (1981c). Analysis of the DNA and RNA changes associated with the expression of isotypic variant-specific antigens of trypanosomes. Nucl. Acids Res. 9, 4225-4238.
Pays, E., Lheureux, M., and Steinert, M. (1982). Structure and expression of a Trypanosoma brucei gambiense variant-specific antigen gene. Nucl. Acids Res. 10, 3149-3163.
Pays, E., Dekerck, P., Van Assel, S., Eldirdiri A. Babiker, Le Ray, D., Van Meirvenne, N., and Steinert, M. (1983a). Comparative analysis of a Trypanosoma brucei gambiense antigen gene family. Its potential use in epidemiology of sleeping sickness. Mol. Biochem. Parasitol. 7, 63-74.
Pays, E., Van Assel, S., Laurent, M., Dero, B., Michiels, F., Kronenberger, P., Matthyssens, G., Van Meirvenne, N., Le Ray, D., and Steinert, M. (1983b). At least two transposed sequences are associated in the expression site of a surface antigen gene in different trypanosome clones. Cell 34, 359-369.
Pays, E., Van Assel, S., Laurent, M., Darville, M., Vervoort, T., Van Meirvenne, N., and Steinert, M. (1983c). Gene conversion as a mechanism for antigenic variation in trypanosomes. Cell 34, 371-381.
Pease, L. R., Schulze, D. H., Pfaffenbach, G. M., and Nathenson, S. G. (1983). Spontaneous H-2 mutants provide evidence that a copy mechanism analogous to gene conversion generates polymorphism in the major histocompatibility complex. Proc. Nat. Acad. Sci. USA 80, 242-246.
Rice-Ficht, A. C., Chen, K. K., and Donelson, J. E. (1981). Sequence homologies near the C-termini of the variable surface glycoproteins of Trypanosoma brucei. Nature 294, 53-57.
Rigby, P. W. J., Dieckmann, M., Rhodes, C., and Berg, P. (1977). Labeling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J. Mol. Biol. 113, 237-251.
Roeder, G. S., and Fink, G. R. (1982). Movement of yeast transposable elements by gene conversion. Proc. Nat. Acad. Sci. USA 79, 5621-5625.
Rogers, J. (1983). CACA sequences: the ends and the means? Nature 305, 101-102.
Sakano, H., Rogers, J. H., Huppi, K., Brack, C., Traunecker, A., Maki, R., Wall, R., and Tonegawa, S. (1979). Domains and the hinge region of an immunoglobulin heavy chain are encoded in separate DNA segments. Nature 277, 627-633.
Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H., and Roe, B. A. (1980). Cloning in single-stranded bacteriophage as an aid to rapid sequencing. J. Mol. Biol. 143, 161-178.
Schreier, P. H., Bothwell, A. L., Mueller-Hill, B., and Baltimore, D. (1981). Multiple differences between the nucleic acid sequences of the IgG2a^a and IgG2a^b alleles of the mouse. Proc. Nat. Acad. Sci. USA 78, 4495-4499.
Slightom, J. L., Blechl, A. E., and Smithies, O. (1980). Human fetal Gγ- and Aγ-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell 21, 627-638.
Southern, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-517.
Stein, J. P., Catterall, J. F., Kristo, P., Means, A. R., and O’Malley, B. W. (1980). Ovomucoid intervening sequences specify functional domains and generate protein polymorphism. Cell 21, 681-687.
Tiemeier, D., Enquist, L., and Leder, P. (1976). Improved derivative of a phage lambda EK2 vector for cloning recombinant DNA. Nature 263, 526-527.
Van der Ploeg, L. H. T., Bernards, A., Rijsewijk, F. A. M., and Borst, P. (1982a). Characterization of the DNA duplication-transposition that controls the expression of two genes for variant surface glycoproteins in Trypanosoma brucei. Nucl. Acids Res. 10, 593-609.
Van der Ploeg, L. H. T., Liu, A. Y. C., Michels, P. A. M., De Lange, T., Borst, P., Majumder, H. K., Weber, H., Veeneman, G. H., and Van Boom, J. (1982b). RNA splicing is required to make the messenger RNA for a variant surface antigen in trypanosomes. Nucl. Acids Res. 10, 3591-3604.
Van der Ploeg, L. H. T., Valerio, D., De Lange, T., Bernards, A., Borst, P., and Grosveld, F. G. (1982c). An analysis of cosmid clones of nuclear DNA from Trypanosoma brucei shows that the genes for variant surface glycoproteins are clustered in the genome. Nucl. Acids Res. 10, 5905-5923.
Van Meirvenne, N., Magnus, E., and Vervoort, T. (1977). Comparison of variable antigenic types produced by trypanosome strains of the subgenus Trypanozoon. Ann. Soc. Belge Méd. Trop. 57, 409-423.
Weintraub, H., and Groudine, M. (1976). Chromosomal subunits in active genes have an altered conformation. Science 193, 848-856.
Williams, R. O., Young, J. R., and Majiwa, P. A. O. (1979). Genomic rearrangements correlated with antigenic variation in Trypanosoma brucei. Nature 282, 847-849.
Williams, R. O., Young, J. R., and Majiwa, P. A. O. (1982). Genomic environment of T. brucei VSG genes: presence of a minichromosome. Nature 299, 417-421.
Young, J. R., Donelson, J. E., Majiwa, P. A. O., Shapiro, S. Z., and Williams, R. O. (1982). Analysis of genomic rearrangements associated with two variable antigen genes of Trypanosoma brucei. Nucl. Acids Res. 10, 803-819.
Young, J. R., Shah, J. S., Matthyssens, G., and Williams, R. O. (1983). Relationship between multiple copies of a T. brucei variable surface glycoprotein gene whose expression is not controlled by duplication. Cell 32, 1149-1159.