Modifications of a Trypanosoma b. brucei Antigen Gene Repertoire by Different DNA Recombinational Mechanisms


Cell, Vol. 35, 721-731, December 1983 (Part 2). Copyright © 1983 by MIT. 0092-8674/83/130721-11 $02.00/0


Etienne Pays,* Marie-France Delauw,* Suzanne Van Assel,* Monique Laurent,* Tony Vervoort,† Nestor Van Meirvenne,† and Maurice Steinert*

* Département de Biologie Moléculaire, Université libre de Bruxelles, Rhode St Genèse, Belgium
† Laboratorium voor Serologie, Instituut voor Tropische Geneeskunde, Antwerp, Belgium

Summary

In the Trypanosoma b. brucei AnTat 1.1C clone, the gene coding for the variant-specific surface antigen is telomeric and appears as a hybrid sequence, partially modified by gene conversion. This conversion is very similar to that observed in another AnTat 1.1-expressor clone (AnTat 1.1B). This sequence is not activated by duplicative transposition, although it could be activated by duplication in another clone (AnTat 1.10). Instead, activation of the AnTat 1.1C gene seems to operate by reciprocal recombination between its own telomere and the telomere carrying the previous (AnTat 1.16) expression-linked copy (ELC). Indeed, from the switch to AnTat 1.1C onward, the AnTat 1.16 ELC becomes a new silent member of its gene family, whereas in the variant directly derived from AnTat 1.1C (AnTat 1.3B), the AnTat 1.1C-containing telomere is lost, probably replaced by a large duplicate, at least 40 kb long, of the AnTat 1.3 gene-containing telomere. Different DNA rearrangement mechanisms used by the trypanosome to change its antigenic type thus contribute, by gain and loss of genes, to the evolution of the repertoire for surface antigens.

Introduction

The mechanism of antigenic variation in African trypanosomes has been partially elucidated (for reviews see Borst and Cross, 1982; Englund et al., 1982). The extensive antigenic repertoire of these parasites reflects the existence of a large number of genes, each one coding for a different variant-specific surface antigen (VSA). Only one VSA gene seems to be expressed at any one time. In many cases, expression of the VSA gene clearly depends on a genomic rearrangement: a silent “basic copy” (BC) of the gene gives rise to an “expression-linked copy” (ELC), which is found transposed to a probably unique “expression site” (Hoeijmakers et al., 1980; Pays et al., 1981a, 1983b), only the ELC being transcribed (Pays et al., 1981b).
In other cases the change of antigenic type occurs in the absence of gene duplication (Williams et al., 1979; Young et al., 1982), yet this has been observed only in the case of telomeric genes. Two different mechanisms of VSA gene activation thus seem to coexist (Majiwa et al., 1982). It has been hypothesized that VSA genes located at chromosome ends could be exchanged with a former ELC by a reciprocal crossing-over (Borst and Cross, 1982; Laurent et al., 1983).

We have undertaken a detailed study of the T. b. brucei AnTat 1.1 VSA gene in different clones expressing the same variable antigen type (VAT), AnTat 1.1 (see the derivation scheme of these clones in Figure 1). The main results (Pays et al., 1981a, 1981c, 1983b, 1983c; Michiels et al., 1983) can be summarized as follows. The AnTat 1.1 gene belongs to a family of five related sequences, designated according to the size of the Pst I fragments in which they are almost entirely comprised: the “9 kb,” “7 kb,” “6.4 kb,” “4 kb,” and “2.15 kb” sequences (see the AnTat 1.1 Pst I digestion pattern in the first block of Figure 1). Two of these sequences (“6.4 kb” and “9 kb”) are located near a chromosome end, but are differently oriented toward the corresponding DNA terminus. The same two sequences are the only AnTat 1.1 gene family members reported to encode VSAs, the AnTat 1.1- and 1.10-specific antigens being specified by the “6.4 kb” and “9 kb” sequences, respectively. Both genes are expressed following duplicative transposition. Segments of the “6.4 kb” sequence are also duplicated and transposed as ELCs in the AnTat 1.1B, 1.1D, and 1.1E clones; however, the length of these ELCs is variable. In the AnTat 1.1B clone, which is derived from AnTat 1.10, a 1 kb long stretch from the “6.4 kb” ELC has replaced the homologous sequence from the previously expressed “9 kb” ELC, and we could show in this case that gene conversion is the mechanism by which the ELCs are generated (Pays et al., 1983c). Segmental gene conversion clearly adds to antigenic diversity, since, for instance, the AnTat 1.1B ELC codes for a chimeric protein with both AnTat 1.1- and 1.10-specific domains. The ELC is, however, generally lost from the genome during the antigenic switch, since it is often the target for the next gene conversion.
We report here, in the clone AnTat 1.1C, the existence of a VSA gene activation mechanism different from gene conversion, which can thus alternatively apply to the expression of the AnTat 1.1 VAT. The mechanism is speculatively interpreted as a telomeric reciprocal recombination, since it implies two predictions that we could clearly verify, namely, the conservation of the ELC from the previously expressed VSA gene (AnTat 1.16 in this case) and the loss of the AnTat 1.1C gene in the ensuing variant (in this case AnTat 1.3B), provided the VSA gene of the latter variant is activated by duplicative transposition. Alternation of the two VSA gene activation mechanisms thus contributes, by both gain and loss of different sequences, to the rapid evolution of the variable antigen gene repertoire.
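The two predictions stated above can be followed in a toy model. The sketch below is only an illustration under simplifying assumptions: each chromosome end is reduced to a single gene label, the site names (`expression_site`, `telomere_9kb`, `telomere_1_3`) are hypothetical, and partial gene conversion is ignored.

```python
# Toy model: each chromosome end ("telomere") holds one VSA gene copy.
# All site names and gene labels are illustrative placeholders.

def reciprocal_exchange(genome, silent_site):
    """Swap the gene in the expression site with the gene at a silent telomere."""
    swapped = dict(genome)
    swapped["expression_site"] = genome[silent_site]
    swapped[silent_site] = genome["expression_site"]
    return swapped

def duplicative_transposition(genome, donor_site):
    """Copy the donor gene into the expression site, overwriting the resident gene."""
    copied = dict(genome)
    copied["expression_site"] = genome[donor_site] + " (new ELC)"
    return copied

variant_1_16 = {
    "expression_site": "AnTat 1.16 ELC",
    "telomere_9kb": "9 kb gene",
    "telomere_1_3": "AnTat 1.3 gene",
}

# Switch 1.16 -> 1.1C by reciprocal exchange: nothing is lost,
# and the former ELC is kept at the other telomere.
variant_1_1C = reciprocal_exchange(variant_1_16, "telomere_9kb")

# Switch 1.1C -> 1.3B by duplicative transposition: the gene resident
# in the expression site is overwritten by the incoming copy.
variant_1_3B = duplicative_transposition(variant_1_1C, "telomere_1_3")

print(variant_1_1C)
print(variant_1_3B)
```

In this sketch the exchange leaves the AnTat 1.16 ELC in the genome as a silent copy, and the following duplicative transposition removes the “9 kb” gene. These are the two outcomes that the reciprocal recombination hypothesis predicts.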

Results

Alteration of an AnTat 1.1 Gene Family Member in the AnTat 1.1C Clone

The five members of the AnTat 1.1 multigene family have been analyzed in a series of ten trypanosome clones


[Figure 1: Southern blots of Pst I (P), Pst I+Hind III (P+H), and Hind III (H) digests hybridized with the 3’ probe, with restriction maps of the “9 kb” sequence in AnTat 1.16 and in AnTat 1.1C and of the AnTat 1.1, 1.10, 1.1B, 1.1D, and 1.1E ELCs. Derivation scheme from EATRO 1125: A1.1 - A1.3 - A1.6 - A1.16 - A1.1C - A1.3B, and A1.10 - A1.1B - A1.1D - (fly) - A1.1E.]

Figure 1. One of the AnTat 1.1 Gene Family Members (“9 kb”) Is Altered in Clone AnTat 1.1C, and Is Lost from AnTat 1.3B DNA

The restriction pattern of the AnTat 1.1 gene family members has been analyzed in ten trypanosome clones, whose derivation scheme is shown (A1.1 is for AnTat 1.1, etc.; see Experimental Procedures for the description of clone derivation). Southern blots of genomic DNA digested by Pst I, Pst I+Hind III, or Hind III have been hybridized with an AnTat 1.10 cDNA probe specific for the 3’ half of the gene (see AnTat 1.10 map below); the DNA digests are from AnTat 1.1, 1.3, 1.6, 1.16, 1.1C, 1.3B, 1.10, 1.1B, 1.1D, and 1.1E clones, respectively, from left to right in each block. The gene family members are defined according to the size of the specific Pst I fragment to which each one gives rise and are labeled with different symbols (left panel); their position in the two other panels is indicated by the same symbols. The ELCs in the AnTat 1.1 homoisotypes (AnTat 1.1, 1.1B, 1.1D, and 1.1E) are indicated by arrowheads; they do not hybridize with the AnTat 1.10 cDNA probe to the same extent in all cases, because of local DNA rearrangements (Pays et al., 1983b, 1983c). The restriction maps of the sequences found to differ in some of the clones are shown below: the “9 kb” sequence as it is in AnTat 1.16 (and in all the other clones except AnTat 1.1C and 1.3B), the “9 kb” sequence in the AnTat 1.1C clone, and the AnTat 1.1, 1.10, 1.1B, 1.1D, and 1.1E ELCs. The two arrows above these maps point to an Eco RI site and a Pst I site that are conserved in all sequences. The lefthand Pst I site in the two top maps is located 9 kb from the arrowed one. When known, the extent of the corresponding cDNAs is indicated (open boxes). The extent of the converted sequence in each case is indicated by a bar under the maps (see text). Two probes have been derived from AnTat 1.10 cDNA, as indicated below the AnTat 1.10 map. Abbreviations used in the restriction maps are B, Bgl I; Bg, Bgl II; E, Eco RI; H, Hind III; Hi, Hinf I; M, Msp I; P, Pst I; Pv, Pvu II; Sp, Sph I; Ss, Sst I; T, Taq I.


(AnTat 1.1, 1.3, 1.6, 1.16, 1.1C, 1.3B, 1.10, 1.1B, 1.1D, and 1.1E) whose pedigree, starting from the EATRO 1125 stock, is outlined in Figure 1. We usually define the AnTat 1.1 family members according to the size of the Pst I fragment to which they give rise: as shown in Figure 1 (left), they are represented in the “9 kb,” “7 kb,” “6.4 kb,” “4 kb,” and “2.15 kb” bands. In addition, the same panel shows the ELC-containing fragments (arrowheads) (2 kb in clones AnTat 1.1, 1.10, and 1.1B; 17.5 kb in AnTat 1.1D; and 21.5 kb in AnTat 1.1E), identified through detailed analysis presented elsewhere (Pays et al., 1981a, 1981b, 1983c). Strikingly, in AnTat 1.1C DNA, no ELC could be detected in Pst I digests.

The restriction pattern of each of these sequences upon different digestions (for instance, Pst I+Hind III and Hind III in Figure 1) has been compared in the different clones. Only one of the AnTat 1.1 gene family members, the “9 kb” sequence, was found not to be conserved in all these clones, as illustrated in Figure 1 by the disappearance of the 9 kb band in AnTat 1.3B after Pst I digestion and in both AnTat 1.1C and 1.3B upon Pst I+Hind III double digestion. The restriction map of the “9 kb” sequence in AnTat 1.1C was compared with that in the other clones, and with the ELC maps of other homoisotypes (Figure 1). It appears that in a region around the arrowed Eco RI site, the “9 kb” sequence of the AnTat 1.1C clone has acquired cleavage sites (in particular a Hind III site) characteristic of the AnTat 1.1, 1.1B, 1.1D, and 1.1E ELCs, which all, in this region, are copied from the “6.4 kb” member (Pays et al., 1983c, and results not shown). This rearrangement is responsible for the duplication of the internal 510 bp Hind III-Pst I fragment in AnTat 1.1C (Figure 1, second panel); no evidence for a duplication of the “9 kb” member itself has been found, whatever restriction endonuclease was used (Figure 1 and results not shown).

Partial Gene Conversion between Two Gene Family Members

In order to analyze the alteration of the “9 kb” member more thoroughly, we sequenced the 9 kb Pst I fragment from variant AnTat 1.1C between codons 45 and 418 of the AnTat 1.1-specific sequence, and compared this sequence to those, already known, of the AnTat 1.1 and 1.10 cDNAs (transcribed from the “6.4 kb” and “9 kb” members, respectively; Pays et al., 1983c) (Figure 2). In variant AnTat 1.1C, down to codon 320, a long stretch of the “9 kb” sequence has clearly been replaced by a copy of the corresponding “6.4 kb” sequence. Since the “6.4 kb” sequence itself is conserved unaltered in AnTat 1.1C (Figure 1 and results not shown), the observed rearrangement can be attributed to a partial gene conversion, and not to a reciprocal recombination. This situation has also been observed in another clone, AnTat 1.1B (Pays et al., 1983c; sequence redrawn in Figure 2), although in this case the partial conversion affected an additional copy of the “9 kb” sequence, namely the AnTat 1.10 ELC, and not the “9 kb” sequence itself. The 3’ junction of the “9 kb” sequence with the incoming block is the same in both cases (asterisk in Figure 2), suggesting that recombinations at this site are favored. In the case of AnTat 1.1B, however, a stretch of 133 bp of unknown origin is intercalated between the “6.4 kb” and “9 kb” domains (Pays et al., 1983c); this is not the case in the AnTat 1.1C sequence (Figure 2). The 5’ junction is not precisely defined, but must be located a short distance upstream from the coding sequence in both cases, as judged from restriction maps (Pays et al., 1983c; Figure 1 and results not shown).

Figure 3 presents the restriction maps of the “6.4 kb”- and “9 kb”-containing telomeres in AnTat 1.1B and 1.1C clones, as well as the AnTat 1.1B ELC and expression site: the specific “6.4 kb” and “9 kb” rearrangements are schematically represented in the corresponding shades, with in each case some uncertainty at their 5’ edge. Length variations of telomere ends have been observed in several cases (Young et al., 1982; De Lange and Borst, 1982; Williams et al., 1982; Pays et al., 1983b, 1983c; Bernards et al., 1983). Except for these variations, no differences other than the partial gene conversion can be detected in the restriction maps of the “6.4 kb”- and “9 kb”-containing telomeres in the two clones.
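The reasoning used above to place a conversion junction can be sketched in a few lines: each aligned codon of the chimeric sequence is assigned to whichever of the two family members it matches, and the junction is then bracketed by the last codon matching one donor and the first codon matching the other. This is a minimal illustration, not the procedure used in the study; the function names are hypothetical and the codons are made-up toy data, not the sequences of Figure 2.

```python
def assign_donor(chimera, donor_a, donor_b):
    """Label each aligned codon by the donor it matches ('=' where the donors agree)."""
    labels = []
    for c, a, b in zip(chimera, donor_a, donor_b):
        if a == b:
            labels.append("=")   # uninformative position
        elif c == a:
            labels.append("A")
        elif c == b:
            labels.append("B")
        else:
            labels.append("?")   # matches neither donor
    return labels

def junction_interval(labels):
    """Return (last 'A' index, first 'B' index) for a single A-to-B switch, else None."""
    a_pos = [i for i, lab in enumerate(labels) if lab == "A"]
    b_pos = [i for i, lab in enumerate(labels) if lab == "B"]
    if not a_pos or not b_pos:
        return None
    last_a, first_b = max(a_pos), min(b_pos)
    if last_a > first_b:
        return None              # more than one switch: not a simple junction
    return last_a, first_b

# Toy aligned codons (illustrative only).
donor_a = ["GCT", "TCA", "GCA", "CTG", "ACA", "CTA", "CAC", "AAA"]
donor_b = ["GGT", "GCC", "GCA", "CTA", "ACA", "CTA", "CAT", "AAG"]
chimera = ["GCT", "TCA", "GCA", "CTG", "ACA", "CTA", "CAT", "AAG"]

labels = assign_donor(chimera, donor_a, donor_b)
interval = junction_interval(labels)
print(labels)
print(interval)
```

The junction can only be narrowed down to the interval between two informative positions. Where the two donors are identical, as at the uninformative codons here, the exact crossover point stays undefined, which is why a junction may be reported as "not precisely defined".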

The “9 kb” Sequence Is Transcribed in the AnTat 1.1C Clone

Since no AnTat 1.1 ELC could be detected in the AnTat 1.1C clone, we speculated that the expressed AnTat 1.1C gene could be the rearranged “9 kb” sequence. In isolated nuclei, expressed sequences are selectively digested by DNAase I as a result of an altered configuration of the actively transcribing chromatin (Weintraub and Groudine, 1976). We thus exposed AnTat 1.1C nuclei to a mild DNAase I digestion, in order to see whether the “9 kb” sequence would be preferentially hydrolyzed. This was the case, as illustrated in Figure 4. Either the 2 kb Eco RI fragment of the “9 kb” sequence (A), or the 9 kb Pst I fragment itself (B), is more sensitive to DNAase I than the other AnTat 1.1-specific fragments of the AnTat 1.1C genomic DNA. The “9 kb” sequence is thus in an active chromatin configuration in the AnTat 1.1C clone; this is not true in AnTat 1.1B (Figure 4B) or in other clones (results not shown).

Proof that the rearranged “9 kb” sequence is the template for AnTat 1.1C mRNA synthesis came from the analysis of the AnTat 1.1C cDNA. Indeed, the restriction map of the cloned AnTat 1.1C cDNA is matched only by the rearranged “9 kb” sequence map (represented by the box in the AnTat 1.1C map, Figure 1). Portions of the AnTat 1.1C cDNA have been sequenced at both extremities (codons 22 to 59 and codons 308 to the end). These AnTat 1.1C sequences were identical with the corresponding ones from the sequenced portion (codons 45 to 418) of the “9 kb” AnTat 1.1-specific sequence (Figure 2); the AnTat 1.1C cDNA is thus a perfect copy of the chimeric “6.4 kb”/“9 kb” sequence. We can exclude the possibility that the “6.4 kb” domain of the AnTat 1.1C cDNA was


[Figure 2: codon-numbered alignment of the AnTat 1.1 (1), 1.10 (10), 1.1B (1B), and 1.1C (1C) sequences, running from about codon 20 to the stop codon and into the 3’ untranslated region; sequence image not reproduced here.]

Figure 2. DNA Sequence of the AnTat 1.1C Gene

Partial DNA sequence of the modified “9 kb” family member in variant AnTat 1.1C (1C, between arrowheads) and of the AnTat 1.1C cDNA (1C, two stretches, respectively between the pairs of thick and thin arrows), compared with the AnTat 1.1 (1) and 1.10 (10) cDNAs (transcribed from ELCs of the “6.4 kb” and “9 kb” family members, respectively; see Pays et al., 1983c) and with the AnTat 1.1B cDNA (1B, redrawn from Pays et al., 1983c). In aligning these sequences, homologies are maximized by compensating for small deletions or insertions. The differences with respect to the AnTat 1.1C sequence are underlined. The asterisk marks a rearrangement junction point in both AnTat 1.1B and 1.1C DNAs. Codons are numbered starting from the known AnTat 1.1 ATG initiation codon (Pays et al., 1983c; Michiels et al., 1983). The signal peptide and hydrophobic tail of the AnTat 1.1 antigen putative precursor are made up of 29 and 23 amino acids, respectively (Michiels et al., 1983), so that the junction marked with an asterisk (codon 320 in Figure 2) is located at 160 codons from the C-terminal end of the mature protein (between the 290th and the 291st codons on a total of 451).
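The codon arithmetic in this legend can be checked directly. The short sketch below assumes one particular reading of the numbers, namely that codon 320 of the precursor numbering is the first codon on the 3’ side of the junction; the variable names are ours, not the authors’.

```python
# Values quoted in the legend to Figure 2.
SIGNAL_PEPTIDE = 29      # amino acids removed from the N-terminus
HYDROPHOBIC_TAIL = 23    # amino acids removed from the C-terminus
MATURE_LENGTH = 451      # codons in the mature protein
JUNCTION_CODON = 320     # precursor numbering, counted from the ATG

# Assumption: codon 320 is the first codon after the junction.
precursor_length = SIGNAL_PEPTIDE + MATURE_LENGTH + HYDROPHOBIC_TAIL
mature_position = JUNCTION_CODON - SIGNAL_PEPTIDE       # position in the mature protein
codons_to_c_terminus = MATURE_LENGTH - mature_position  # distance to the last mature codon

print(precursor_length)       # 503
print(mature_position)        # 291
print(codons_to_c_terminus)   # 160
```

Under this reading, codon 320 is the 291st codon of the mature protein, so the junction falls between mature codons 290 and 291, 160 codons before the last one, in agreement with the figures given in the legend.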

synthesized on the “6.4 kb” sequence itself, since the latter, which is the AnTat 1.1 BC, never appears to be in an active chromatin configuration, either in AnTat 1.1C or in other clones (Figure 4B and results not shown).

The AnTat 1.1C Trypanosomes Are Serologically Indistinguishable from AnTat 1.1

It has been shown, by the immune lysis test on living cells (Pays et al., 1983c), that the surface-exposed AnTat 1.1C antigenic determinants share the serological specificity of AnTat 1.1, although the AnTat 1.1C mRNA is different from the AnTat 1.1 mRNA (boxed maps in Figure 1). That the AnTat 1.1C trypanosomes are serologically indistinguishable from AnTat 1.1 is also demonstrated by the immunofluorescence test on fixed trypanosome smears (Table 1).

Figure 3. Restriction Maps of the “6.4 kb” and “9 kb” Sequences in AnTat 1.1B and 1.1C DNAs, and Comparison with the AnTat 1.1B ELC Map

The sequence corresponding to the AnTat 1.1 mRNA is boxed in each map, with only the conserved Eco RI and Pst I sites indicated within these limits. The specific “6.4 kb” and “9 kb” sequences are represented in black and white, respectively (see text and Pays et al., 1983c). The wavy arrows represent the transcription of AnTat 1.1B and 1.1C mature mRNAs.

The AnTat 1.1C Telomere Seems to Be Exchanged with the Telomere Containing the Previous (AnTat 1.16) ELC

Since the AnTat 1.1C clone was directly derived from AnTat 1.16, we also investigated the structure and expression of the AnTat 1.16 gene. The AnTat 1.16 cDNA was therefore cloned and used as a probe in Southern blots of genomic digests from six successive variants (AnTat 1.1 → 1.3 → 1.6 → 1.16 → 1.1C → 1.3B). A map in Figure 5 summarizes the information collected about the AnTat 1.16 sequence: this gene is telomeric, as indicated by the BAL31 exonuclease sensitivity of its 3’ environment (result not shown); moreover, its activation mechanism involves the transposition of an ELC into another telomere. The AnTat 1.16 ELC is clearly seen in Hind III digestion products of the AnTat 1.16 genome (Figure 5) but, curiously, this additional copy is conserved in the two ensuing variants (AnTat 1.1C and AnTat 1.3B). The AnTat 1.16 ELC is in an active chromatin configuration in the AnTat 1.16 genome only, as revealed by its higher DNAase I sensitivity in AnTat 1.16 DNA only (Figure 5): the remaining AnTat 1.16 ELC in clones AnTat 1.1C and 1.3B is thus probably not transcribed. We have also shown that the AnTat 1.1C gene (in the rearranged “9 kb” sequence) is expressed without being duplicated (Figures 1 and 4), then is lost in the next variant, AnTat 1.3B (Figure 1). Together, these observations suggest that the telomere harboring the “9

kb” sequence has been exchanged with the AnTat 1.16 ELC-containing telomere in the AnTat 1.1C variant, if we assume that the recombination has taken place downstream from the putative transcription promoter(s) in the “active” telomere (see scheme in Figure 6).

[Figure 4: panel A, DNAase I + Eco RI digests of AnTat 1.1C nuclei, probe AnTat 1.10 (5’); panel B, DNAase I + Pst I digests of AnTat 1.1B and AnTat 1.1C nuclei, probe AnTat 1.10 (3’).]

Figure 4. Kinetics of DNAase I Hydrolysis of AnTat 1.1-Specific Sequences, Showing That the “9 kb” Family Member Is in an Active Chromatin Configuration in the AnTat 1.1C Clone

AnTat 1.1C or 1.1B nuclei were isolated and digested by DNAase I as described (Pays et al., 1981b), for 0, 15, 30 sec, 1, 2, 4, 6, 8, 10, 12, 14, 16, and 20 min, from left to right in A (only up to 12 min in B). After Eco RI (A) or Pst I (B) digestion, the DNA was hybridized with AnTat 1.10 cDNA probes specific for the 5’ (A) or 3’ (B) halves of the gene (see map in Figure 1). Arrowheads indicate the sequences that are preferentially hydrolyzed; the 2 kb Eco RI fragment in (A) and the 2 kb Pst I fragment in (B) belong to the “9 kb” sequence and to the AnTat 1.1B ELC, respectively. Note the fast disappearance of the 9 kb Pst I band in the AnTat 1.1C DNA, as compared to the same band in the AnTat 1.1B DNA.

Table 1. Serological Comparison of Surface-Exposed Epitopes in AnTat 1.1 Homoisotypes

Trypanosome clones: AnTat 1.1, AnTat 1.10, AnTat 1.1B, AnTat 1.1C, AnTat 1.6, AnTat 1.8
Antisera: Anti-1.1, Anti-1.10, Anti-1.1B, Anti-1.1C, Anti-1.6, Anti-1.8
End-titres (cell alignment not preserved): 320 160 640 - | 320 - | 320 160 640 640 - | 320 160 | - | - | 640 - | 160

The end-titres of reaction of dday antisera from clone-infected rabbits with acetone-fixed trypanosomes have been determined by immunofluorescence test (Van Meirvenne et al., 1977). Dashes indicate no reaction at antiserum dilution of 1/80.

The “Active” AnTat 1.1C Telomere Is Probably Replaced by a Large Copy of Another Telomere in the Ensuing AnTat 1.3B Variant

The loss of the “9 kb” sequence in the AnTat 1.3B clone, which stems from AnTat 1.1C (Figure 1), is most easily explained as a replacement by the AnTat 1.3B ELC (see scheme in Figure 6). The extent of the “9 kb” sequence replacement cannot be precisely defined, but at least encompasses the AnTat 1.1C gene as well as a 1.4 kb long stretch in front of it, up to an Eco RI site located 2 kb upstream from the Eco RI site arrowed in Figure 1. We cloned this 2 kb Eco RI sequence, and when used as a probe, it failed to reveal the corresponding sequence in AnTat 1.3B genomic DNA digests (results not shown).

To verify that the AnTat 1.3B gene is activated by transposition of an ELC, we cloned the AnTat 1.3B cDNA and used it as a probe in genomic DNA digests. Figure 7 shows that three AnTat 1.3-specific sequences are revealed in AnTat 1.3B DNA, one of which is an ELC. The ELC can be revealed as a distinct band only by probing its 3’ environment, up to the DNA end (Figure 7, digestion by Pvu II). When hybridizing the cDNA probe to fragments characteristic of the gene’s 5’ environment only (as Sph I, Figure 7), the ELC fragments are perfectly superimposed on the corresponding ones from another AnTat 1.3-specific sequence. Digestion by DNAase I shows that in these cases only one of the two superimposed fragments is in an active chromatin configuration (Figure 7). We found in this way that up to a Bam HI site about 40 kb upstream from the AnTat 1.3 sequence, the restriction map of the 5’ environment of the AnTat 1.3B ELC is indistinguishable from that of this other AnTat 1.3 sequence, which is therefore considered to be the template of a large AnTat 1.3B ELC, at least 40 kb long. This ELC has thus probably replaced the “9 kb” telomere in the AnTat 1.3B clone. Interestingly, the template used is not the AnTat 1.3 BC, but the remaining AnTat 1.3 ELC (ex-ELC) conserved in a silent form in all variants derived from AnTat 1.3, probably as a result of telomeric exchange with the AnTat 1.6 gene-containing telomere (Laurent et al., submitted).

Discussion

This paper reports that in the AnTat 1.1C clone, cDNA probes fail to detect an ELC, but that the “9 kb” sequence has been partially converted by another member of the AnTat 1.1 multigene family, the “6.4 kb” sequence. We had shown that this “6.4 kb” sequence acted as the BC for the AnTat 1.1 and 1.1B ELCs (Pays et al., 1981a), and for the AnTat 1.1D and 1.1E ELCs as well (Pays et al., 1983b). On the other hand, the “9 kb” sequence acted as the BC for the AnTat 1.10 ELC (Pays et al., 1983c). In all these previously examined cases, gene conversion took place within the expression site and with one of the BCs as donor. It might thus seem curious that, in the present case, the “9 kb” sequence behaved as the recipient in a conversion event; on closer examination, however, it appears that this sequence shares more similarities with ELCs.


Figure 5. The AnTat 1.16 ELC Is Conserved, in an Inactive Chromatin Configuration, in the Ensuing AnTat 1.1C and 1.3B Clones

The Hind III digests of AnTat 1.1, 1.3, 1.6, 1.16, 1.1C, and 1.3B genomic DNAs, respectively, from left to right in the first panel, have been hybridized with an AnTat 1.16 cDNA probe, revealing an additional AnTat 1.16 gene copy in the last three variants (9.6 kb). This copy is highly susceptible to DNAase I hydrolysis in AnTat 1.16 nuclei only: the middle and right panels show kinetics of DNAase I digestion on AnTat 1.16 and 1.1C nuclei, respectively, performed as described in the legend to Figure 4. Digestion times were 0, 7, 15, 30, 60 sec, 2, 4, 6, 8, and 10 min, from left to right in each block, respectively. The 9.6 kb Hind III additional fragment (arrowhead) disappears after 2 min digestion in AnTat 1.16, whereas it is still visible after 10 min digestion in AnTat 1.1C nuclei; the internal control is the 20 kb fragment, which in both cases disappears after a 4 min digestion. The AnTat 1.16 6.3 kb and AnTat 1.1C 7.1 kb and 5.5 kb sequences are due to polymorphism: they represent telomeric fragments containing a divergent form, lacking the 3’ Hind III site, of the AnTat 1.16 BC in AnTat 1.16 DNA, and of the AnTat 1.16 BC and ex-ELC in AnTat 1.1C DNA, respectively (data not shown). The restriction maps of the AnTat 1.16 BC and ELC sequences and surroundings are presented below. The known extent of the ELC is underlined under the BC map, with uncertainties at both ends; the open box represents the extent of the cDNA. The cDNA fragment used as a probe is shown below the ELC map.

Figure 6. Possible Scheme of Gene Activation in a Series of Three Successive Variants, Accounting for the Observations Reported Here If A, B, and C Are the AnTat 1.16, 1.1C, and 1.3B Genes

The wavy arrow represents gene transcription, and the P the transcription promoter.


Figure 7. The AnTat 1.3B ELC Is Probably at Least 40 kb Long

The Pvu II and Sph I digests of AnTat 1.16, 1.1C, and 1.3B genomic DNAs, respectively, from left to right in the two left panels, have been hybridized with an AnTat 1.3B cDNA probe. An additional AnTat 1.3B gene copy is revealed as a distinct band only when probing its 3’ environment (Pvu II digestion). When probing the 5’ environment (Sph I digestion, for instance), this ELC appears superimposed on another AnTat 1.3-specific fragment, but selectively disappears under DNAase I digestion (arrowhead). The restriction maps of the three distinct AnTat 1.3-specific sequences present in AnTat 1.3B DNA are shown below: BC has been identified as the AnTat 1.3 BC (Laurent et al., 1983); ex-ELC is the AnTat 1.3 ELC, which has been conserved in a silent form in all variants derived from AnTat 1.3 (Laurent et al., submitted) and used as template for the AnTat 1.3B ELC; ELC is the additional sequence present in AnTat 1.3B DNA, also identified as the AnTat 1.3B ELC thanks to DNAase I digestion kinetics on different restriction fragments (data not shown). The extent of the AnTat 1.3B ELC is underlined under the ex-ELC map, with uncertainties at both ends; the open box represents the extent of the cDNA. The cDNA fragment used as a probe is shown below the ELC map.

It should first be recalled that this sequence is found in a telomere, as are the actively transcribed ELCs. We observed, first, that the converted “9 kb” member in clone AnTat 1.1C acquired a much increased sensitivity to DNAase I and, second, that no other sequence shares the restriction map and nucleotide sequence of the AnTat 1.1C cDNA, proving that this sequence became the active AnTat 1.1C gene. Moreover, in the switch leading to the next variant, AnTat 1.3B, the converted “9 kb” sequence disappears, probably displaced by the new AnTat 1.3B ELC, which indicates that the telomere that harbored the former “9 kb” sequence behaves as an expression site in AnTat 1.1C, and probably in 1.3B as well. Although sharing the properties of the VSA gene expression site described so far (Pays et al., 1983b), the “9 kb”-containing telomere does not exhibit the same 5’ restriction map, at least within about 20 kb upstream from the gene; on the other hand, the restriction maps of the “9 kb” telomere in the expressor clone AnTat 1.1C and in the nonexpressor clones (for example, AnTat 1.1, 1.3, 1.6, and 1.16) are identical, except for the converted region of the gene.

Two explanations can be given. First, there are multiple sites where VSA genes can be expressed, controlled in such a way that only one is operating at any one time. Second, in generating the AnTat 1.1C variant, the “9 kb” telomere has been exchanged, by reciprocal recombination 3’ to the putative promoter, with the chromosome end that harbored the previous (AnTat 1.16) ELC. As predicted from such a rearrangement, the AnTat 1.16 gene family is amplified by one unit from the clone AnTat 1.1C onward. Certainly, models derived from the multiple site hypothesis (transposable regulatory sequences, for instance) could also account for gene addition, provided that the activated sequence would act as receptor in the conversion process. We nevertheless favor the simpler crossing-over model. We thus propose that the switch from AnTat 1.16 to 1.1C was brought about by the combined effects of two different recombination mechanisms: a reciprocal crossing-over that involved at least 20 kb of telomeric DNA, thus positioning the “9 kb” sequence downstream from the putative unique VSA gene promoter, and a partial conversion of this translocated sequence by the “6.4 kb” family member.

Since the partial gene conversion is very similar in AnTat 1.1C and AnTat 1.1B, at least one of the recombination loci being identical, it seems that these interactions could not be completely random, or that a selection against other interactions has operated, if, for instance, recombinations between two protein domains were favored. In this respect, it is noteworthy that the junction point between the AnTat 1.1 and AnTat 1.10 domains in both AnTat 1.1B and 1.1C sequences is 160 codons from the C terminus of the mature protein (see legend of Figure 2), whereas it is known (Cross, 1978) that in their native state, the variant-specific antigens can be selectively attacked by trypsin at a point about 150 residues from their C-terminal extremity. These observations strongly suggest that the region around 150 amino acids from the protein end hinges two protein domains, and that homologous domains from different antigens can be exchanged. Similarly, it has been reported that intron-exon junctions correspond to boundaries between protein domains (Sakano et al., 1979; Stein et al., 1980; Eiferman et al., 1981) and that variations are facilitated in these regions, because they provide additional protein surface loops while maintaining the protein function (Craik et al., 1983).

The AnTat 1.1B and 1.1C recombined sequences encode proteins with serologically indistinguishable surface-exposed determinants, though another partial conversion involving the same AnTat 1.1 gene family members could lead to antigenic variation (Pays et al., 1983c). Since in the AnTat 1.1C clone the “9 kb” sequence has been partially converted, it has irreversibly lost a fraction of its genetic information. 
The lost sequence corresponds to the AnTat 1.10 coding message (Pays et al., 1983c), and it is thus likely that the ability to express an AnTat 1.10-like serotype has been lost in AnTat 1.1C. Furthermore, the “9 kb” sequence has been completely removed from AnTat 1.3B. These examples show how an antigen repertoire can evolve by loss of genes; on the other hand, since the AnTat 1.16 gene family is amplified from the AnTat 1.1C clone onward, the incoming sequence could further diversify into another antigen coding sequence by genetic alterations such as point mutations or gene conversion, as suggested by Young et al. (1983). Thus, in addition to operating the antigenic switch (Pays et al., 1983c), gene conversion, in combination with the reciprocal rearrangement mechanism, allows the trypanosomes to modify their antigenic repertoire. Since the same gene (the AnTat 1.1-specific sequence in the “9 kb” telomere, for instance) can be activated by either of the two mechanisms, a given gene could be amplified in some repertoires and lost in others, depending on the way these mechanisms alternate.

Gene conversion has also been observed as a diversifying mechanism in other multigene families (Slightom et al., 1980; Schreier et al., 1981; Dildrop et al., 1982; Roeder and Fink, 1982; Ollo and Rougeon, 1983; Bentley and Rabbitts, 1983; Pease et al., 1983; Weiss et al., 1983), though it also allows for some homogenization of gene family sequences (Baltimore, 1981).

In the AnTat 1.3B variant, the gene conversion leading to gene activation seems to involve at least 40 kb of telomeric sequence. This is considerably more than the extent of gene conversions described so far in trypanosome antigenic variation: from 1 kb in AnTat 1.1B (Pays et al., 1983c) to about 3 kb in numerous cases where the limits of the converted sequences appear to be in repeated elements (Van der Ploeg et al., 1982a, 1982c; Pays et al., 1982, 1983a). Obviously, the length of the duplicated sequence is not as rigorously defined as in the “cassette” system controlling the mating type interconversion in yeast (Hicks et al., 1977), but seems to depend on the nature of the target sequence. In the AnTat 1.1B case, for instance, the homologies between the incoming (AnTat 1.1B) and outgoing (AnTat 1.10) genes are extensive (Pays et al., 1983c), so that recombination could take place all along the molecule. In most cases, however, sequence homologies between ELCs are restricted to blocks present both at the 3’ extremity of different genes (Matthyssens et al., 1981; Majumder et al., 1981; Rice-Ficht et al., 1981; Bernards et al., 1981) and in front of the gene, about 3 kb upstream from the 3’ block (Van der Ploeg et al., 1982a, 1982c; Pays et al., 1982, 1983a, 1983b). In the AnTat 1.3B clone, the target sequence (AnTat 1.1C gene) is in a telomere whose restriction map completely differs, for at least 20 kb, from either the map characteristic for most expression sites described so far in the AnTAR 1 repertoire (Pays et al., 1983b) or the very similar restriction maps of different telomeric BCs, such as AnTat 1.3 (Laurent et al., 1983) or AnTat 1.16 (this study). 
It is possible that the sequence homologies between the AnTat 1.1C and AnTat 1.3B telomeres were restricted to a region far upstream from the two genes concerned, so that the duplicated sequence would have to be very long to reach the recombinational locus of homology. According to the model presented in Figure 6, this locus has to be downstream from the transcription promoter, which should therefore be located very far upstream from the gene. This conclusion is in accordance with other observations (Van der Ploeg et al., 1982b; Michiels et al., 1983), including the fact that the putative crossing-over point between the AnTat 1.16 and AnTat 1.1C gene-containing telomeres must be far from the genes, since we did not manage to map it within at least 20 kb (Figure 3). On the other hand, occasional duplications of large telomeric stretches, as in AnTat 1.3B, could account for the observation that different chromosome ends, such as those harboring the AnTat 1.3 or 1.16 BCs, have very similar restriction maps.

In conclusion, this paper emphasizes the importance of telomere interactions in the control of trypanosome antigen gene expression. It shows that the expression of the same antigen type (AnTat 1.1) can be obtained in very different ways: it can result from the activation, by different types of DNA rearrangements, of sequence combinations affecting different DNA lengths. Finally, it indicates that the same “9 kb” telomeric sequence can either give rise to an ELC (in variant AnTat 1.10) or, on the contrary, become a target for gene conversion (in variant AnTat 1.1C), probably after reciprocal recombination with another telomere. This flexibility is obviously an additional key to the parasite’s adaptability.
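The gene bookkeeping implied by the scheme of Figure 6 can be followed with a small sketch. The Python below is an illustrative toy model only and is not part of the original study: the `Locus` class, the helper functions, and the locus names (`promoter_end`, `silent_end`, and so on) are hypothetical, each gene is reduced to a text label, and the crossing-over model favored above is assumed rather than demonstrated. The "9 kb" sequence is labeled "1.10" because it encodes the AnTat 1.10 message, and the "6.4 kb" sequence is labeled "1.1".

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Locus:
    """A gene-bearing site; all names here are hypothetical labels."""
    gene: str                   # label of the VSA gene carried at this site
    has_promoter: bool = False  # True only for the putative expression site


def reciprocal_exchange(genome, site_a, site_b):
    """Swap the gene-bearing ends of two telomeres, 3' to the promoter."""
    out = dict(genome)
    out[site_a] = replace(genome[site_a], gene=genome[site_b].gene)
    out[site_b] = replace(genome[site_b], gene=genome[site_a].gene)
    return out


def partial_conversion(genome, target, donor):
    """Overwrite part of the target gene with donor sequence (hybrid gene)."""
    out = dict(genome)
    hybrid = f"{genome[donor].gene}/{genome[target].gene}"
    out[target] = replace(genome[target], gene=hybrid)
    return out


def duplicative_replacement(genome, target, template):
    """Replace the target gene by a copy of the template; the old gene is lost."""
    out = dict(genome)
    out[target] = replace(genome[target], gene=genome[template].gene)
    return out


def copy_number(genome):
    """Count how many sites carry each gene label."""
    return Counter(locus.gene for locus in genome.values())


def expressed(genome):
    """Genes located downstream of the promoter in this toy model."""
    return [locus.gene for locus in genome.values() if locus.has_promoter]


# Starting point: the AnTat 1.16 variant, with its ELC at the promoter-bearing end.
antat_1_16 = {
    "promoter_end": Locus("1.16", has_promoter=True),  # AnTat 1.16 ELC
    "silent_end": Locus("1.10"),                       # telomeric "9 kb" sequence
    "bc_1_16": Locus("1.16"),                          # AnTat 1.16 basic copy
    "bc_6_4kb": Locus("1.1"),                          # "6.4 kb" sequence
    "bc_1_3": Locus("1.3"),                            # AnTat 1.3 basic copy
    "ex_elc_1_3": Locus("1.3"),                        # silent former AnTat 1.3 ELC
}

# Switch to AnTat 1.1C: telomere exchange, then partial conversion by "6.4 kb".
step = reciprocal_exchange(antat_1_16, "promoter_end", "silent_end")
antat_1_1c = partial_conversion(step, "promoter_end", "bc_6_4kb")

# Switch to AnTat 1.3B: the active end is replaced by a copy of the 1.3 ex-ELC.
antat_1_3b = duplicative_replacement(antat_1_1c, "promoter_end", "ex_elc_1_3")

for name, genome in [("1.16", antat_1_16), ("1.1C", antat_1_1c), ("1.3B", antat_1_3b)]:
    print(name, expressed(genome), dict(copy_number(genome)))
```

In this sketch the former AnTat 1.16 ELC persists as a silent copy after the exchange, the unconverted "1.10" label disappears once the "9 kb" sequence becomes a hybrid, and the hybrid itself is lost in the last switch while the count of "1.3" copies rises to three. The order of the two labels in the hybrid name is arbitrary and carries no information about which domain lies 5’.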

Note Added

The three gene conversion endpoints identified so far in the AnTat 1.1, 1.1B, and 1.1C gene rearrangement are all located next to a CACA sequence (see Rogers, 1983, for a review on the meaning of CACA sequences).

Experimental Procedures

Trypanosomes

The different antigenic variants were cloned at the Institute of Tropical Medicine (Antwerp) from the same T. b. brucei stock (EATRO 1125). The AnTat 1.1, 1.1B, 1.3, and 1.10 bloodstream trypanosome clones have been described previously (Pays et al., 1981a). AnTat 1.1D is an independent 1.1-like variant, cloned from the same stock, and AnTat 1.1E was cloned after cyclical transmission of AnTat 1.1D by the tsetse fly, Glossina morsitans morsitans (Le Ray et al., 1977). An abridged pedigree of the different clones is given in Figure 1. Their nomenclature follows the rules recommended by Lumsden (1982). All trypanosome populations were grown in mice. Clones of successive VATs were set up from heterotypes arising in the diversifying clone of the previous VAT. Reference clones used for DNA analysis were more than 99% serologically homogeneous.

DNA Isolation

DNA isolation was carried out as described (Pays et al., 1980). Briefly, the cells were lysed in 10 mM NaCl, 250 mM EDTA, 1% SDS, 10 mM Tris-HCl (pH 8), then incubated for 1 hr at 37°C with 100 µg/ml RNAase A, followed by a 4 hr incubation at 37°C with 500 µg/ml proteinase K. After dialysis against 10 mM NaCl, 0.2 mM EDTA, 10 mM Tris-HCl (pH 8), the DNA was purified by CsCl gradient centrifugation and further dialysis.

Molecular Cloning

AnTat 1.10, 1.16, 1.1C, and 1.3B cDNAs were cloned in the Pst I site of pBR322 by the G-C tailing procedure (Humphries et al., 1977); screening was done as previously described (Pays et al., 1980). The AnTat 1.1C gene has been cloned in λgt WES·λB phage as follows: 9 kb Pst I fragments were isolated from AnTat 1.1C genomic DNA by preparative electrophoresis on low melting point agarose gels, then digested by Eco RI and inserted by ligation between the Eco RI arms of λgt WES·λB (Tiemeier et al., 1976). In vitro packaging of recombinant DNA molecules was performed as described by Hohn and Murray (1977) and screening according to Benton and Davis (1977), using two 32P-labeled probes: either the 500 bp 5’ Eco RI fragment or the 750 bp 3’ Eco RI-Pst I fragment from AnTat 1.10 cDNA (see Figure 1 for a map). We obtained several clones harboring the corresponding 2 kb 5’ Eco RI-Eco RI and 750 bp 3’ Eco RI-Pst I fragments of the AnTat 1.1C gene and neighborhood (Figure 1), cloned in tandem together with other Eco RI-Pst I fragments.

Probes

Specific parts of the cloned sequences were isolated by several rounds of preparative electrophoresis on low melting point agarose. Fragments were 32P-labeled by nick translation (Rigby et al., 1977), and the specificity of each probe was checked by back-hybridization with a mixture of different isolated fragments.

Hybridization

Hybridization of the probes with Southern blots of digested genomic DNAs was performed as in previous work (Pays et al., 1981a). As size markers we used Hind III digestion products of λ and SV40 DNAs, 32P-labeled by filling-in of the cohesive ends with the Klenow fragment of DNA polymerase I.

DNA Sequencing

We followed the method of Sanger et al. (1980), the appropriate DNA restriction fragments being subcloned in M13mp9 or mp8 vectors (Messing et al., 1981) after isolation by electrophoresis in low melting point agarose gels.

Acknowledgments

We thank D. Franckx for careful preparation of the figures. This investigation received support from the Fonds de la Recherche Scientifique Medicale (FRSM, Brussels), from the ILRAD/Belgian Research Centres Agreement for Collaborative Research (Nairobi), and from the Trypanosomiases component of the UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases (Geneva). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received July 13, 1983; revised September 19, 1983

References

Baltimore, D. (1981). Gene conversion: some implications for immunoglobulin genes. Cell 24, 592-594.

Bentley, D. L., and Rabbitts, T. H. (1983). Evolution of immunoglobulin V genes: evidence indicating that recently duplicated human Vκ sequences have diverged by gene conversion. Cell 32, 181-189.

Benton, W. D., and Davis, R. W. (1977). Screening lambda gt recombinant clones by hybridization to single plaques in situ. Science 196, 180-182.

Bernards, A., Van der Ploeg, L. H. T., Frasch, A. C. C., Borst, P., Boothroyd, J. C., Coleman, S., and Cross, G. A. M. (1981). Activation of trypanosome surface glycoprotein genes involves a duplication-transposition leading to an altered 3’ end. Cell 27, 497-505.

Bernards, A., Michels, P. A. M., Lincke, C. R., and Borst, P. (1983). Growth of chromosome ends in multiplying trypanosomes. Nature 303, 592-597.

Borst, P., and Cross, G. A. M. (1982). Molecular basis for trypanosome antigenic variation. Cell 29, 291-303.

Cross, G. A. M. (1978). Antigenic variation in trypanosomes. Proc. Royal Soc. Lond. B 202, 55-72.

De Lange, T., and Borst, P. (1982). Genomic environment of the expression-linked extra copies of genes for surface antigens of Trypanosoma brucei resembles the end of a chromosome. Nature 299, 451-453.

Dildrop, R., Brüggemann, M., Radbruch, A., Rajewsky, K., and Beyreuther, K. (1982). Immunoglobulin V region variants in hybridoma cells. II. Recombination between V genes. EMBO J. 1, 635-640.

Eiferman, F. A., Young, P. R., Scott, R. W., and Tilghman, S. M. (1981). Intragenic amplification and divergence in the mouse α-fetoprotein gene. Nature 294, 713-718.

Englund, P. T., Hajduk, S. L., and Marini, J. C. (1982). The molecular biology of trypanosomes. Ann. Rev. Biochem. 51, 695-726.

Hicks, J. B., Strathern, J. N., and Herskowitz, I. (1977). The cassette model of mating-type interconversion. In DNA Insertion Elements, Plasmids, and Episomes, A. I. Bukhari et al., eds. (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory), p. 457.

Hoeijmakers, J. H. J., Frasch, A. C. C., Bernards, A., Borst, P., and Cross, G. A. M. (1980). Novel expression-linked copies of the genes for variant surface antigens in trypanosomes. Nature 284, 78-80.

Hohn, B., and Murray, K. (1977). Packaging recombinant DNA molecules into bacteriophage particles. Proc. Nat. Acad. Sci. USA 74, 3259-3263.

Humphries, P., Cochet, M., Krust, A., Gerlinger, P., Kourilsky, P., and Chambon, P. (1977). Molecular cloning of extensive sequences of the in vitro synthesized chicken ovalbumin structural gene. Nucl. Acids Res. 4, 2389-2406.

Laurent, M., Pays, E., Magnus, E., Van Meirvenne, N., Matthyssens, G., Williams, R. O., and Steinert, M. (1983). DNA rearrangements linked to the expression of a predominant surface antigen gene of trypanosomes. Nature 302, 263-266.

Le Ray, D., Barry, J. D., Easton, C., and Vickerman, K. (1977). First tsetse fly transmission of the “AnTat” serodeme of Trypanosoma brucei. Ann. Soc. Belge Méd. Trop. 57, 369-381.

Lumsden, W. H. R. (1982). Characterization and nomenclature of trypanosome serodemes and zymodemes: report of the meeting held in Edinburgh, Sept. 1978. Sys. Parasitol. 4, 373-376.

Majiwa, P. A. O., Young, J. R., Englund, P. T., Shapiro, S. Z., and Williams, R. O. (1982). Two distinct forms of surface antigen gene rearrangement in Trypanosoma brucei. Nature 297, 514-516.

Majumder, H., Boothroyd, J. C., and Weber, H. (1981). Homologous 3’-terminal regions of mRNAs for surface antigens of different antigenic variants of Trypanosoma brucei. Nucl. Acids Res. 9, 4745-4753.

Matthyssens, G., Michiels, F., Hamers, R., Pays, E., and Steinert, M. (1981). Two variant surface glycoproteins of Trypanosoma brucei have a conserved C-terminus. Nature 293, 230-233.

Messing, J., Crea, R., and Seeburg, P. H. (1981). A system for shotgun DNA sequencing. Nucl. Acids Res. 9, 309-321.

Michiels, F., Matthyssens, G., Kronenberger, P., Pays, E., Dero, B., Van Assel, S., Darville, M., Cravador, A., Steinert, M., and Hamers, R. (1983). Gene activation and reexpression of a Trypanosoma brucei variant surface glycoprotein. EMBO J. 2, 1185-1192.

Ollo, R., and Rougeon, F. (1983). Gene conversion and polymorphism: generation of mouse immunoglobulin γ2a chain alleles by differential gene conversion by γ2b chain gene. Cell 32, 515-523.

Pays, E., Delronche, M., Lheureux, M., Vervoort, T., Bloch, J., Gannon, F., and Steinert, M. (1980). Cloning and characterization of DNA sequences complementary to messenger ribonucleic acids coding for the syntheses of two surface antigens of Trypanosoma brucei. Nucl. Acids Res. 8, 5965-5981.

Pays, E., Van Meirvenne, N., Le Ray, D., and Steinert, M. (1981a). Gene duplication and transposition linked to antigenic variation in Trypanosoma brucei. Proc. Nat. Acad. Sci. USA 78, 2673-2677.

Pays, E., Lheureux, M., and Steinert, M. (1981b). The expression-linked copy of the surface antigen gene in Trypanosoma is probably the one transcribed. Nature 292, 265-267.

Pays, E., Lheureux, M., and Steinert, M. (1981c). Analysis of the DNA and RNA changes associated with the expression of isotypic variant-specific antigens of trypanosomes. Nucl. Acids Res. 9, 4225-4238.

Pays, E., Lheureux, M., and Steinert, M. (1982). Structure and expression of a Trypanosoma brucei gambiense variant-specific antigen gene. Nucl. Acids Res. 10, 3149-3163.

Pays, E., Dekerck, P., Van Assel, S., Babiker, E. A., Le Ray, D., Van Meirvenne, N., and Steinert, M. (1983a). Comparative analysis of a Trypanosoma brucei gambiense antigen gene family. Its potential use in epidemiology of sleeping sickness. Mol. Biochem. Parasitol. 7, 63-74.

Pays, E., Van Assel, S., Laurent, M., Dero, B., Michiels, F., Kronenberger, P., Matthyssens, G., Van Meirvenne, N., Le Ray, D., and Steinert, M. (1983b). At least two transposed sequences are associated in the expression site of a surface antigen gene in different trypanosome clones. Cell 34, 359-369.

Pays, E., Van Assel, S., Laurent, M., Darville, M., Vervoort, T., Van Meirvenne, N., and Steinert, M. (1983c). Gene conversion as a mechanism for antigenic variation in trypanosomes. Cell 34, 371-381.

Pease, L. R., Schulze, D. H., Pfaffenbach, G. M., and Nathenson, S. G. (1983). Spontaneous H-2 mutants provide evidence that a copy mechanism analogous to gene conversion generates polymorphism in the major histocompatibility complex. Proc. Nat. Acad. Sci. USA 80, 242-246.

Rice-Ficht, A. C., Chen, K. K., and Donelson, J. E. (1981). Sequence homologies near the C-termini of the variable surface glycoproteins of Trypanosoma brucei. Nature 294, 53-57.

Rigby, P. W. J., Dieckmann, M., Rhodes, C., and Berg, P. (1977). Labeling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J. Mol. Biol. 113, 237-251.

Roeder, G. S., and Fink, G. R. (1982). Movement of yeast transposable elements by gene conversion. Proc. Nat. Acad. Sci. USA 79, 5621-5625.

Rogers, J. (1983). CACA sequences: the ends and the means? Nature 305, 101-102.

Sakano, H., Rogers, J. H., Huppi, K., Brack, C., Traunecker, A., Maki, R., Wall, R., and Tonegawa, S. (1979). Domains and the hinge region of an immunoglobulin heavy chain are encoded in separate DNA segments. Nature 277, 627-633.

Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H., and Roe, B. A. (1980). Cloning in single stranded bacteriophage as an aid to rapid sequencing. J. Mol. Biol. 143, 161-178.

Schreier, P. H., Bothwell, A. L., Mueller-Hill, B., and Baltimore, D. (1981). Multiple differences between the nucleic acid sequences of the IgG2a^a and IgG2a^b alleles of the mouse. Proc. Nat. Acad. Sci. USA 78, 4495-4499.

Slightom, J. L., Blechl, A. E., and Smithies, O. (1980). Human fetal Gγ- and Aγ-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell 21, 627-638.

Southern, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-517.

Stein, J. P., Catterall, J. F., Kristo, P., Means, A. R., and O’Malley, B. W. (1980). Ovomucoid intervening sequences specify functional domains and generate protein polymorphism. Cell 21, 681-687.

Tiemeier, D., Enquist, L., and Leder, P. (1976). Improved derivative of a phage lambda EK2 vector for cloning recombinant DNA. Nature 263, 526-527.

Van der Ploeg, L. H. T., Bernards, A., Rijsewijk, F. A. M., and Borst, P. (1982a). Characterization of the DNA duplication-transposition that controls the expression of two genes for variant surface glycoproteins in Trypanosoma brucei. Nucl. Acids Res. 10, 593-609.

Van der Ploeg, L. H. T., Liu, A. Y. C., Michels, P. A. M., De Lange, T., Borst, P., Majumder, H. K., Weber, H., Veeneman, G. H., and Van Boom, J. (1982b). RNA splicing is required to make the messenger RNA for a variant surface antigen in trypanosomes. Nucl. Acids Res. 10, 3591-3604.

Van der Ploeg, L. H. T., Valerio, D., De Lange, T., Bernards, A., Borst, P., and Grosveld, F. G. (1982c). An analysis of cosmid clones of nuclear DNA from Trypanosoma brucei shows that the genes for variant surface glycoproteins are clustered in the genome. Nucl. Acids Res. 10, 5905-5923.

Van Meirvenne, N., Magnus, E., and Vervoort, T. (1977). Comparison of variable antigenic types produced by trypanosome strains of the subgenus Trypanozoon. Ann. Soc. Belge Méd. Trop. 57, 409-423.

Weintraub, H., and Groudine, M. (1976). Chromosomal subunits in active genes have an altered conformation. Science 193, 848-856.

Williams, R. O., Young, J. R., and Majiwa, P. A. O. (1979). Genomic rearrangements correlated with antigenic variation in Trypanosoma brucei. Nature 282, 847-849.

Williams, R. O., Young, J. R., and Majiwa, P. A. O. (1982). Genomic environment of T. brucei VSG genes: presence of a minichromosome. Nature 299, 417-421.

Young, J. R., Donelson, J. E., Majiwa, P. A. O., Shapiro, S. Z., and Williams, R. O. (1982). Analysis of genomic rearrangements associated with two variable antigen genes of Trypanosoma brucei. Nucl. Acids Res. 10, 803-819.

Young, J. R., Shah, J. S., Matthyssens, G., and Williams, R. O. (1983). Relationship between multiple copies of a T. brucei variable surface glycoprotein gene whose expression is not controlled by duplication. Cell 32, 1149-1159.