Cloning and transcriptional analysis of a variant surface glycoprotein gene expression site in Trypanosoma brucei


Molecular and Biochemical Parasitology, 28 (1988) 197-206


Elsevier MBP 00949

Carol P. Gibbs* and George A.M. Cross

Laboratory of Molecular Parasitology, The Rockefeller University, New York, NY, U.S.A.

(Received 2 September 1987; accepted 4 December 1987)

The variant surface glycoprotein (VSG) gene expression site in Trypanosoma brucei variant 117a has been mapped to a point about 40 kb upstream from the VSG gene. Sequences upstream from the previously identified [Cully, D.F. et al. (1985) Cell 42, 173-182] expression site associated gene (ESAG-I) have been cloned and a stable 1.3 kb transcript has been localized immediately 5' to ESAG-I. This transcript is in the same orientation and approximately as abundant as the ESAG-I message. A highly conserved region, of which at least 15 copies are present in the genome, has been identified further upstream. A stable transcript corresponding to this region was not detected in variant 117a, but a 1.7 kb transcript was detected in variant 221a. In isolated nuclei, representative sequences from the 117a expression site were transcribed unidirectionally at similar rates, and transcription was insensitive to α-amanitin.

Key words: Trypanosoma brucei; Variant surface glycoprotein expression; Expression site associated gene

Introduction

Trypanosoma brucei contains an estimated 1000 variant surface glycoprotein (VSG) genes [1], but only one is usually transcribed at any time. The active VSG gene is situated in one of several alternative telomeric expression sites (ESs) [2,3]. Activation of VSG genes is often accompanied by rearrangements, including duplication and transposition, reciprocal telomere exchange and telomere conversion [4]. The essential mechanisms regulating ES activation are obscure. ESs are characterized by distinct structural features. Telomeric CCCTAA repeats are located 3' to the VSG gene [5] and a variable-sized 'barren' region (5-30 kb), composed of imprecise 70 bp repeats, is usually present upstream of the VSG gene [6,7]. Upstream from the barren region, an open reading frame (expression site associated gene; ESAG-I) has been characterized in two distinct ESs [8,9]. ESAG-I apparently encodes an amphiphilic glycoprotein, which is coordinately expressed with the downstream VSG. In one ES, a VSG pseudogene is located within the 70 bp array [10]. ES transcription is insensitive to high levels of α-amanitin, while other gene transcripts are sensitive to low levels [11]. α-Amanitin-insensitive transcription extends over 60 kb of the 221a ES, and appears to be polycistronic [12,13]. Sequences extending upstream of the 118 VSG in a different ES are also transcribed by an α-amanitin-insensitive polymerase [14]. However, in this case a 939 bp fragment located approximately 4 kb upstream of the VSG was not transcribed in isolated nuclei, suggesting that, in this ES, the 118 VSG primary transcript is approximately 6.5 kb.

*Present address: Max Planck Institut für Biologie, Spemannstrasse 34, D-7400 Tübingen, F.R.G.

Correspondence address: G.A.M. Cross, Laboratory of Molecular Parasitology, The Rockefeller University, 1230 York Avenue, New York, NY 10021-6399, U.S.A.

Abbreviations: bp, base pairs; ES, expression site; ESAG, expression site associated gene; kb, kilobase (pairs); Pipes, 1,4-piperazinediethanesulfonic acid; SDS, sodium dodecyl sulfate; SSC, saline sodium citrate; VSG, variant surface glycoprotein.

0166-6851/88/$03.50 © 1988 Elsevier Science Publishers B.V. (Biomedical Division)


Sequences sharing homology with the T. brucei rRNA promoter have been identified in this region [14]. Our present studies extend these analyses to the ES utilized in variant 117a, where the 117 VSG gene resides in an ES located on one of the megabase chromosomes [15]. Sequences upstream of the previously identified ESAG-I [8] were cloned and used to analyze steady-state transcripts and RNA transcribed in vitro in nuclei isolated from variant clones 117a and 221a.

Materials and Methods

Parasite stocks. Bloodstream forms of T. brucei MITat (Molteno Institute Trypanozoon antigen type) 1.4 and 1.2, variant clones 117a and 221a respectively [16], were propagated in vivo in mice and rats. Cultured insect-form analogues (procyclics) derived from MITat 1.4 were maintained in vitro in a supplemented Minimal Essential Medium (ref. 17; M. Duszenko and K. Haldar, personal communication).

Nucleic acid isolation and blotting. Parasite DNA and RNA were isolated as described [8]. Genomic DNA (3 µg per lane) was digested with restriction enzymes following manufacturer's specifications and electrophoresed through 0.5% agarose gels in 40 mM Tris, 33 mM sodium acetate, 18 mM NaCl, 2 mM EDTA, pH 8.2. Gels were treated with 0.25 M HCl for 20 min and then denatured, neutralized and transferred to nitrocellulose following standard procedures [18]. Total RNA (9 µg per lane) was electrophoresed through 1.4% agarose formaldehyde gels and transferred directly to nitrocellulose [18].

Filter hybridization. ³²P-labeled double-stranded DNA probes were synthesized from gel-purified fragments, using the random priming technique [19]. Strand-specific probes were synthesized from M13 templates using the sequencing primer (New England Biolabs). Southern blots were prehybridized for a minimum of 4 h in 1 M NaCl; 10 mM Tris pH 7.4; 5 × Denhardt's solution (1 × is 0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% bovine serum albumin); 100 µg ml⁻¹ sheared, denatured salmon sperm DNA; 100 µg ml⁻¹ wheat germ RNA; 50% formamide at 42°C. Hybridizations were performed under the same conditions. Post-hybridization washes were in 0.5 × SSC (0.15 M NaCl, 15 mM sodium acetate, pH 7.0), 0.1% sodium dodecyl sulfate (SDS) at 65°C (moderate) or 0.05 × SSC, 0.1% SDS, 65°C (high stringency). For probe 5.1, moderate wash conditions were 1.0 × SSC, 0.1% SDS at 65°C, and high stringency washes were in 0.1 × SSC, 0.1% SDS at 65°C. Northern blots were prehybridized overnight in 5 × SSC; 2.5 × Denhardt's solution; 50 mM NaPO4 pH 7.0; 0.1% SDS; 250 µg ml⁻¹ sheared, denatured salmon sperm DNA; 250 µg ml⁻¹ wheat germ RNA; 20% formamide at 42°C, and hybridized under the same conditions. Post-hybridization washes were in 0.6 × SSC, 0.1% SDS at 65°C.

Isolation of genomic clones and subclones. Restriction digestions of genomic DNA (100 µg) were performed and the fragments size-fractionated through 10-40% sucrose gradients [18]. Fractions containing the desired size fragments were precipitated with ethanol and the DNA ligated into appropriate plasmid vectors. For pGE117.21, 1.6 kb HindIII fragments of genomic DNA were ligated into the HindIII site of the pAT153 vector. For pGE117.31, a pUC12 vector lacking the HindIII site (pUC12H) was constructed by end-repair of HindIII-digested plasmid DNA with the Klenow fragment of DNA polymerase I (Boehringer Mannheim), followed by recircularization. Size-fractionated PstI fragments, approximately 3.4 kb in length, were ligated into the PstI site of pUC12H. The ligated DNA was digested with EcoRV, HindIII linkers (New England Biolabs) were added, and the DNA was then digested with HindIII, thus removing a 2.7 kb fragment from the insert. The DNA was diluted to 4 µg ml⁻¹, recircularized and transformed into competent DH5 cells (Bethesda Research Laboratories) using standard protocols [20]. Colonies were screened following established protocols [18], using probe 5.1 for pGE117.21 and probe 2 for pGE117.31.
Specific fragments from the 117a ES (Fig. 1C, fragments 1-6 and 8-10) were subcloned into the M13 vectors mp18 and mp19 [21]. Restriction fragments were gel-purified and ligated into M13 vectors digested with the appropriate restriction enzymes. When necessary, fragments were end-repaired with the Klenow fragment of DNA polymerase I (Boehringer Mannheim) and PstI linkers (New England Biolabs) were added prior to ligation. DNA was transformed into competent JM101 cells. All cloned fragments were obtained in both orientations. In the 'a' series, the inserts contain the non-coding strand (relative to the ESAG and VSG messages); in the 'b' series, the inserts are in the opposite orientation. Fragment 7 was not subcloned into the M13 vectors. A subclone of the 221a ESAG containing bases 323 to 655 was constructed in the pAT153 vector. The tubulin clone (constructed by P. Hevezi) contains one copy of the T. brucei α- and β-tubulin repeat in pAT153.
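The strand logic behind the 'a' and 'b' series can be illustrated with a minimal sketch. It assumes ideal, exact base pairing, and the short sequence used is invented for illustration only (it is not taken from the 117a ES); the function names are likewise hypothetical.

```python
# Hypothetical illustration of strand-specific detection.
# The sequence below is made up; it is NOT from the 117a expression site.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """RNA has the sequence of the coding strand, with U in place of T."""
    return coding_strand.replace("T", "U")


def can_hybridize(single_stranded_dna: str, rna: str) -> bool:
    """True if the single-stranded DNA is exactly complementary to the RNA."""
    return reverse_complement(single_stranded_dna) == rna.replace("U", "T")


coding = "ATGGCTTCAGGATCC"              # invented coding-strand fragment
rna = transcribe(coding)                # transcript in the ESAG/VSG orientation

series_a = reverse_complement(coding)   # 'a' series: non-coding strand insert
series_b = coding                       # 'b' series: opposite orientation

print(can_hybridize(series_a, rna))     # True
print(can_hybridize(series_b, rna))     # False
```

In this idealized picture, only the 'a' series inserts can capture RNA transcribed in the same direction as the ESAG and VSG messages, while the 'b' series would detect transcription of the opposite strand.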

Isolation of nuclei. Whole blood was isolated from infected rats and immediately passed through a Stansted cell disruptor [11]. The homogenate was collected in Pipes (1,4-piperazinediethanesulfonic acid) buffer (20 mM Pipes pH 7.5, 15 mM NaCl, 60 mM KCl, 14 mM 2-mercaptoethanol, 0.5 mM EGTA, 4 mM EDTA, 0.15 mM spermine, 0.5 mM spermidine, 0.125 mM phenylmethylsulfonyl fluoride) and the nuclei pelleted by centrifuging for 10 min at 8000 × g. The nuclear pellet was resuspended in Pipes buffer and washed twice. Nuclei were resuspended in storage buffer (50 mM Tris pH 8.0, 75 mM NaCl, 0.5 mM EDTA pH 8.0, 0.2 mM phenylmethylsulfonyl fluoride, 5 mM dithiothreitol, 50% glycerol) at a concentration of 5 × 10⁸ nuclei ml⁻¹ and immediately frozen at -70°C (L.H.T. Van der Ploeg, personal communication).

Nascent RNA elongation. Reactions were performed as described by Kooter and Borst [11]. Frozen aliquots containing 10⁸ nuclei were thawed and the nuclei pelleted and resuspended in 90 µl reaction buffer (0.1 M Tris pH 7.9, 50 mM NaCl, 4 mM MnCl2, 2 mM MgCl2, 0.25 mM EDTA pH 8.0, 60 µM phenylmethylsulfonyl fluoride, 1.2 mM dithiothreitol, 10 mM phosphocreatine, 1 mM GTP, 1 mM CTP, 2 mM ATP, 5 µM UTP, 25% glycerol, 125 U ml⁻¹ RNasin (Promega)). In reactions containing α-amanitin, nuclei in storage buffer were first incubated on ice in the presence of α-amanitin for 15 min, pelleted and resuspended in reaction buffer containing α-amanitin (Serva). N-Lauroylsarcosine was added to 0.5% (w/v) (L.H.T. Van der Ploeg, personal communication) and the reactions initiated by the addition of 90 µCi [α-³²P]UTP (Amersham). Reactions were performed at 37°C (bloodstream forms) or 27°C (procyclics) for 10 min and terminated by incubation at 65°C for 5 min, followed by treatment with RNase-free DNaseI (Promega) for 10 min at 37°C. An equal volume of 10 mM Tris pH 8.0, 10 mM EDTA, 0.1% SDS was added and proteinase K (Boehringer Mannheim) was added to a final concentration of 100 µg ml⁻¹. After incubation at 37°C for 15 min, RNA was isolated by phenol/chloroform (1:1) extraction and passed sequentially over two Sephadex G-50 spin columns, precipitated with ethanol, resuspended in 0.1 mM EDTA, 10 mM Tris, pH 8.0, and heated at 65°C for 10 min prior to hybridization.

Slot blot hybridization. Single-stranded DNA (5 µg per sample) was diluted in 6 × SSC, heated at 65°C for 10 min and spotted onto nitrocellulose using a slot blot apparatus (Schleicher & Schuell). Linear double-stranded DNA was denatured in 0.3 M NaOH at 65°C for 60 min. Samples were cooled to room temperature, SSC was added to a final concentration of 6 × and one-tenth the volume of 1 M Tris, pH 6.8 was added. DNA was spotted onto nitrocellulose and rinsed with 6 × SSC. Filters were baked in vacuo at 80°C for 2 h. Slot blots were prehybridized overnight at 42°C in 5 × SSC, 50 mM NaPO4 pH 7.4, 0.1% SDS, 100 µg ml⁻¹ heparin, 50% formamide. Hybridizations were performed under the same conditions. After hybridization, the filters were rinsed several times in 2 × SSC, treated with 200 µg ml⁻¹ RNase in 2 × SSC for 30 min at 37°C, and again rinsed several times in 2 × SSC. The final washes were at 65°C in 0.25 × SSC, 0.1% SDS.

Results

Molecular cloning of sequences upstream of the 117a ESAG-I. The 117a ESAG-I cross-hybridized with 14-25 fragments in restriction digests of T. brucei genomic DNA [8]. In order to determine which of these fragments corresponded to the 117a ES, a more specific probe was required.

[Figure 1 image not reproduced. Panels: (A) restriction map of the ES showing the ESAG, the VSG and the telomere ('end'), with a 1 kb scale bar; (B) clones pCE117a.01, pGE117.21 and pGE117.31; (C) probe fragments 1-10 and 5.1; (D) stable transcripts I, II, III and VSG.]

Fig. 1. Transcripts and clones from the 117a ES. (A) Restriction map of the 117a ES. Downstream from the ESAG, only sites defining the fragments used as probes are indicated. Vertical arrows denote the extent of the VSG transposed segment; 'end' indicates the telomere. Abbreviations: A, AccI; B, BamHI; Bc, BclI; BgI, BglI; BgII, BglII; Bs, BstNI; C, ClaI; D, DdeI; E, BstEII; F, HinfI; H, HindIII; K, KpnI; Ml, MluI; Ms, MstII; N, NarI; Ps, PstI; PvI, PvuI; PvII, PvuII; RI, EcoRI; RV, EcoRV; Sc, SacI; Sl, SalI; Sm, SmaI; Sp, SphI; St, StuI; Xb, XbaI; Xh, XhoI; Xm, XmnI. (B) Fragments cloned in plasmid vectors. pCE117a.01 is the 1.0 kb 117a ESAG cDNA clone [8]. pGE117.21 and pGE117.31 are genomic clones described in the text. (C) Fragments used as probes. The sizes of the fragments were: 1, 450 bp; 2, 320 bp; 3, 480 bp; 4, 430 bp; 5, 260 bp; 5.1, 168 bp; 6, 638 bp; 7, 1900 bp; 8, 521 bp; 9, 306 bp; 10, 232 bp. (D) Stable transcripts derived from the ES. Arrows indicate the direction of transcription; ESAG-III was found in variant 221a only and the boundaries have not been mapped.

Comparison of the 117a and 221a ESAG-I nucleotide sequences [8] revealed two divergent regions located from nucleotides 45 to 117 and from 690 to 781. The 5' end of the 117a ESAG-I cDNA clone [8] is contained within a 178 bp RsaI fragment, which begins in the mini-exon and encompasses the first 166 bp of the coding region (Fig. 1C, probe 5.1). Under conditions of moderate or high stringency, this fragment hybridized strongly to a single band in Southern blots of genomic DNA (Fig. 2). This 117a ES-specific probe facilitated the unambiguous mapping of restriction enzyme sites further upstream in the 117a ES (Fig. 1A). A 1.6 kb HindIII fragment located within the 117a ES was identified by probe 5.1 and was cloned into the pAT153 vector. The resulting clone, pGE117.21, contained 320 nucleotides of the 5' portion of the 117a ESAG-I coding sequence and extended upstream for nearly 1.3 kb (Fig. 1B).

A HindIII-PstI fragment from the 5' end of pGE117.21 (Fig. 1C, probe 2) hybridized quite specifically to a 3.4 kb PstI fragment of genomic DNA (Fig. 2). This upstream PstI fragment was the next target for cloning. However, attempts to clone this fragment or other fragments extending upstream of pGE117.21 were unsuccessful, although a variety of restriction enzymes, plasmid vectors and bacterial hosts were utilized. A clone containing a 450 bp PstI-EcoRV fragment located approximately 2.7 kb upstream of pGE117.21 was obtained by deleting the intervening 2.7 kb segment in vitro (Fig. 1B, pGE117.31). Details of the construction of this clone are given in Materials and Methods. Several attempts to delete shorter regions did not yield any stable transformants. Although we cannot conclusively prove that the PstI-EcoRV fragment of pGE117.31 is derived from the 117a ES,

[Figure 2 image not reproduced. Southern blot panels for probes 1, 2, 3, 4, 5.1 and 6, each with lanes A and B containing HindIII (H) and PstI (P) digests.]

Fig. 2. Southern blots of 117a genomic DNA hybridized to 117a ES-derived probes. Fragments used as probes are indicated in Fig. 1C. Lanes A were washed under moderate conditions; lanes B are the same blots rewashed under stringent conditions. Digestions were PstI (P) and HindIII (H). Arrows indicate the 1.6 kb HindIII fragment cloned as pGE117.21 and the 3.4 kb PstI fragment used to generate pGE117.31. Size markers are given in kb.

we feel this fragment is indeed located within the 117a ES for the following reasons. Firstly, out of 5 × 10³ transformants screened, two identical clones were obtained and no other recombinants hybridized with probe 2. Secondly, the genomic map of the 117a ES, determined using the 117a ES-specific probe 5.1, indicated that a BstNI site should be present within the 450 bp PstI-EcoRV fragment. This site was present as predicted. Other restriction enzymes that, based on the genomic map, should not have had sites within this fragment did not have sites in the clone. Thirdly, the nucleotide sequence of the 320 bp HindIII-PstI fragment common to both pGE117.21 and pGE117.31 was identical in the two clones (data not shown). Finally, on pulsed-field-gradient gel electrophoresis, the 450 bp PstI-EcoRV fragment hybridized to the same megabase chromosome as did probes 2 and 5.1, although probe 1 also hybridized to other chromosomes (data not shown).

Two genomic libraries (Sau3A partial digests of T. brucei 117a genomic DNA in the EMBL3 vector) containing inserts with an average size of 14.5 kb and 19 kb respectively, were screened with probes 1, 2 and 3. Numerous plaques hybridizing to probes 2 and 3 were obtained, but none of the recombinants analyzed were derived from the 117a ES, as determined by restriction mapping. When probe 1 was used to screen the genomic libraries, only a few weak signals were obtained, and none of these putative recombinants could be carried through the steps of plaque purification.

Southern hybridization analyses. Specific fragments derived from pGE117.21 and pGE117.31 were used to probe restriction enzyme digests of T. brucei genomic DNA (Fig. 2). Under moderate conditions (lanes A) each probe detected several bands, suggesting that, like ESAG-I, the upstream region belongs to a family of related sequences. However, with the exception of probe 1, a single band can be seen in each lane under stringent conditions (lanes B). These bands corresponded to the expected 117a ES-specific fragments. In contrast, all of the bands detected by probe 1 produced strong signals, even under stringent conditions. These sequences are therefore highly conserved. Hybridization analyses using a variety of restriction enzymes that did not have sites within probe 1 indicated that approximately 15 copies of this sequence are present in the genome, while a PstI-EcoRV digest of genomic DNA yielded a single intense band of 450 bp (data not shown).
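As a rough consistency check on the fragment sizes quoted for the two genomic clones, the arithmetic can be written out as below. This is only a sketch: it assumes the segments are contiguous and non-overlapping as drawn in Fig. 1B, and the 150 bp tolerance is an arbitrary value chosen for illustration, since the reported sizes are approximate gel estimates.

```python
# Sizes in bp as quoted in the text; all are approximate.
ESAG_CODING_IN_21 = 320   # ESAG-I coding sequence present in pGE117.21
UPSTREAM_IN_21 = 1300     # "nearly 1.3 kb" upstream extension of pGE117.21
HINDIII_FRAGMENT = 1600   # HindIII fragment cloned as pGE117.21

PSTI_ECORV = 450          # fragment retained in pGE117.31 (probe 1)
DELETED_SEGMENT = 2700    # segment deleted in vitro
HINDIII_PSTI = 320        # fragment common to both clones (probe 2)
PSTI_FRAGMENT = 3400      # genomic PstI fragment detected by probe 2

TOLERANCE = 150           # assumed tolerance, not a value from the paper


def consistent(parts, total, tolerance=TOLERANCE):
    """True if the parts sum to the reported total within the tolerance."""
    return abs(sum(parts) - total) <= tolerance


# pGE117.21: coding portion plus upstream extension vs. the HindIII fragment
print(consistent([ESAG_CODING_IN_21, UPSTREAM_IN_21], HINDIII_FRAGMENT))

# 3.4 kb PstI fragment: PstI-EcoRV + deleted segment + HindIII-PstI
print(consistent([PSTI_ECORV, DELETED_SEGMENT, HINDIII_PSTI], PSTI_FRAGMENT))
```

Under these assumptions the two sums are 1620 bp and 3470 bp, each within about 0.1 kb of the reported fragment sizes.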

Northern hybridization analyses. Single-stranded probes derived from M13 subclones of 117a ES-specific fragments were used in Northern analyses of T. brucei total RNA (Fig. 3). The ESAG-I-specific probes 5 and 6, containing the 5' and 3' portions of the ESAG-I cDNA, detected a 1.3 kb transcript, as shown previously [8]. Probe 4 (from pGE117.21), which should encompass the ESAG-I 5' untranslated region [8], detected this transcript as well. Probes 2 and 3, containing sequences immediately upstream of the ESAG-I mature message, also hybridized to a transcript that appeared slightly smaller than the 117a ESAG-I message, and in the same orientation. Probes 2 and 3 detected a similar sized transcript in RNA from T. brucei variant 118a, which uses the same ES [22], in variant 221a, which uses a different ES [23], and in variant 060a, which has not been mapped but probably uses a different ES [8] (data not shown). However, the signal obtained with 221a and 060a RNA was much weaker than with either 117a or 118a RNA. No transcripts were detected in the opposite orientation, nor was hybridization observed, with any of the probes, to RNA isolated from in vitro cultivated procyclic forms (data not shown).

In contrast to the results with the other probes, probe 1 did not detect a specific transcript in 117a RNA (Fig. 3), even after longer exposure. However, this probe identified a 1.7 kb band in RNA isolated from variant 221a (Fig. 3B). Transcription was determined to be 5' to 3' from the PstI site towards the EcoRV site. Since this probe is part of a highly conserved family (Fig. 2), it was not possible to determine which of the many copies was transcribed in variant 221a, nor have the boundaries of the mature transcript been delineated. The locations of the ES-derived transcripts, relative to the 117a ES, are indicated in Fig. 1D.

[Figure 3 image not reproduced. Northern blots: (A) lanes 1-10; (B) probe 1 with 117a and 221a RNA.]

Fig. 3. Northern blots of total cellular RNA hybridized to single-stranded ES-derived probes. (A) RNA from variant 117a hybridized to probes 1-10 indicated in Fig. 1C. Autoradiograms were exposed for 5 days (lanes 1-8) or 1 h (lanes 9, 10). (B) Hybridization of probe 1 to RNA isolated from variants 117a and 221a. The autoradiogram was exposed for 5 days. Size markers are in kb.

[Figure 4 image not reproduced. Slot blot, lanes A-G; rows include the 117a ES subclones, the 221 ESAG clone, mp18, mp19, tubulin and pAT153.]

Fig. 4. Hybridization of in vitro labeled nascent RNA to filter-bound single-stranded 117a ES-derived clones, tubulin and 221a ESAG clones. The locations of subclones 1-10 are indicated in Fig. 1C. Lanes A-C and E-G contained the 'a' series of non-coding strand 117a ES clones and lane D contained the 'b' series of coding strand inserts. RNA was from variants 117a (lanes A-D) or 221a (lanes E-G) and labeling was performed in the absence of α-amanitin (lanes A, D and E) or in the presence of 50 µg ml⁻¹ (lanes B and F) or 1 mg ml⁻¹ (lanes C and G) α-amanitin.

Analysis of nascent RNA. Transcription from the 117a ES was examined using in vitro labeled nascent RNA to probe filter-bound single-stranded DNA in a slot blot assay (Fig. 4). RNA hybridized only to DNA substrates that would detect transcripts oriented in the same direction as the 117a VSG (Fig. 4A) and this transcription was insensitive to moderate (50 µg ml⁻¹) and high (1 mg ml⁻¹) levels of α-amanitin (Fig. 4B, C). Indeed, the amount of radioactivity incorporated into ES transcripts increased in the presence of α-amanitin, perhaps reflecting a larger available pool of UTP when Pol II was inactivated. Single-stranded DNAs cloned in the opposite orientation did not show significant hybridization to nascent RNA (Fig. 4D). Although three of the clones (2b, 4b and 6b) showed a signal, the hybridization intensity was approximately that of the mp19 vector control (Fig. 4A) and can therefore be considered non-specific. Clone 3b, which is also in the mp19 vector, did not show this background hybridization. A substantial portion of the mp19 polylinker was removed during the construction of clone 3b. This region was present in the other clones and may have contributed to the background hybridization. The mp18 vector (Fig. 4A) and the single-stranded DNA substrates cloned into mp18 (1b, 5b, 8b, 9b and 10b; Fig. 4D) did not hybridize to nascent RNA. Of the clones in Fig. 4A-C, only 1a is in the mp19 vector; the rest are in mp18.

In general, the signal intensities were roughly proportional to the sizes of the cloned inserts and the proportion of U residues (where known) in the hybridizing RNA. However, even after subtracting the background signal due to the mp19 vector, the intensity of the signal with probe 1a was greater than expected for the size of the insert. This result could indicate that the sequence represented by this probe is reiterated in the ES. The low signal with fragment 9 can be attributed to the low U content (13%) of the corresponding RNA. The sequence of fragment 5 is unknown, but may also contain a low proportion of U. More uniform signal strengths were obtained when labeled GTP was used in place of UTP (data not shown). Nascent RNA synthesized in these experiments did not hybridize to clones containing the α- and β-tubulin repeat or 221a ESAG sequences (Fig. 4A). Nascent RNA obtained from procyclic nuclei did not hybridize to any of the 117a ES-specific clones (data not shown).

When the single-stranded DNA substrates from the 117a ES were used to examine nascent RNA isolated from T. brucei variant 221a, several of the clones hybridized to the labeled RNA (Fig. 4E-G).
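The reasoning that slot-blot signal should scale with insert size and with the U content of the hybridizing RNA can be sketched as a simple normalization. Insert sizes are those listed in the legend to Fig. 1C and the 13% U content of fragment 9 is taken from the text. Everything else is an assumption made for illustration: the 25% default U fraction, the function names, and the raw signal values in the usage lines are invented, since the paper reports no densitometry figures.

```python
# Insert sizes (bp) from the Fig. 1C legend, for the M13-subcloned fragments.
INSERT_BP = {"1": 450, "2": 320, "3": 480, "4": 430, "5": 260,
             "6": 638, "8": 521, "9": 306, "10": 232}

# U fraction of the hybridizing RNA; only fragment 9 is given in the text.
U_FRACTION = {"9": 0.13}
DEFAULT_U = 0.25  # assumed value for fragments with no reported U content


def expected_relative_signal(fragment: str) -> float:
    """Expected signal with labeled UTP: proportional to the number of
    U residues in the RNA covered by the insert (length x U fraction)."""
    return INSERT_BP[fragment] * U_FRACTION.get(fragment, DEFAULT_U)


def normalized_signal(raw_signal: float, fragment: str) -> float:
    """Raw signal divided by the expected relative signal for the insert."""
    return raw_signal / expected_relative_signal(fragment)


# Invented raw values, for illustration only (arbitrary units).
hypothetical_raw = {"1": 450.0, "3": 120.0, "9": 40.0}
for fragment, raw in hypothetical_raw.items():
    print(fragment, round(normalized_signal(raw, fragment), 2))
```

With a normalization of this kind, a fragment transcribed at the same rate as its neighbours would give a similar normalized value regardless of its length or U content, so a markedly higher value (as described for probe 1a) would point to a higher transcription rate or to reiteration of the sequence.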
Weak signals were obtained with clones 2a, 3a, 4a and 6a, which is consistent with cross-hybridization due to partially conserved sequences, while probe 1a produced a very intense signal. Again, transcription was insensitive to moderate (50 µg ml⁻¹) and high (1 mg ml⁻¹) levels of α-amanitin (Fig. 4F and G). As with variant 117a nuclei (Fig. 4D), no significant hybridization was obtained with the 'b' series of clones in the opposite orientation (data not shown). Nascent RNA from variant 221a also hybridized strongly to the 221a ESAG-I clone, as expected, but did not hybridize to the 117a ES barren region probe (7a). A VSG 221 probe was not included in the experiment, so it is uncertain whether the lack of barren region transcripts was due to the transcription shut-down effect, in which transcription fades from the 3' (VSG) end of the ES unit, as described elsewhere [11,12]. However, this seems an unlikely explanation when the ESAG-I signal was so strong. Published sequence data for part of the 117a barren region extend about 1.5 kb upstream of the HinfI site on the upstream side of probe 8 [6]. If representative of the barren region covered by probe 7, this sequence is very A + T-rich, and differs sufficiently from the published sequence of a short region of the 221a barren region [10] to explain the lack of cross-hybridization of 221a barren region nascent RNA to the 117a probe.

In both 117a and 221a nuclei, the intensity of hybridization, relative to the size of the insert, was much greater for fragment 1 than for the other ES-derived fragments. This was not an artefact due to a high proportion of uridine residues in the RNA, as similar results were obtained when nascent RNA was labeled with [³²P]GTP. Two explanations are possible: this region may be transcribed at a higher rate and therefore correspond to a distinct transcription unit, or the sequence may be present in multiple copies in the ES. Sequence duplications and triplications occur within the 221a ES [12].

Discussion

ES cloning presents two major problems: repetitive sequences upstream of the VSG gene can cause regions to be unstable in commonly used bacterial host-vector systems, and the existence of many putative ESs with similar structures can lead to problems in mapping and identifying clones representing contiguous regions of a specific ES. This paper extends our initial studies of the 117a ES [8,9]. We have been particularly careful to clone segments from a single ES and have used these 117a ES-specific clones to map this ES to a point over 40 kb upstream of the VSG gene. Although we only succeeded in cloning sequences up to about 12 kb upstream from the VSG coding region, the characteristics of the newly studied region have several points of interest, and our results can be usefully compared to those obtained in recent studies of other ESs [12,25].

Two ES transcripts have been identified in addition to the previously described ESAG [8], suggesting the presence of additional ES-associated coding regions. Since our present work, and that of others [12,14,25], indicates that several stable transcripts may originate from the ES, we propose that the previously identified ESAG family [8] be designated ESAG-I. ESAG-I is recognized by genomic probes 4, 5 and 6 used in this study. According to previous work [8], the 5' end of the ESAG-I mRNA should lie within the region covered by probe 4. Thus, although the next two upstream probes (2 and 3) recognize a stable RNA of essentially indistinguishable size, this must be assumed to represent another gene product, which we tentatively designate ESAG-II. The 5' end of ESAG-II must extend into the 'unclonable' region. Finally, probe 1 recognizes a clearly distinct 1.7 kb transcript, which we tentatively designate ESAG-III.

The most striking feature of the ESAG-III region is that a stable transcript could not be detected by probe 1 in variant 117a, whereas a 1.7 kb RNA was present in variant 221a. A trivial explanation for this result could be that ESAG-III is mainly encoded upstream or in the 'unclonable' region lying just downstream from probe 1, and that a significant portion of 5' or 3' untranslated or coding regions are absent from the 117a mature transcript. However, there are several unique features about probe 1.
Firstly, unlike any of the other ES probes, including those for the well-characterized polymorphic ESAG-I family, this probe hybridized intensely to an identical set of genomic DNA bands at both low and high stringency. Secondly, in a genomic double digest using the same enzymes that flank the probe (PstI and EcoRV), a single intense 450 bp band was obtained. The copy number of this band in the genome was not determined, but the restriction patterns with other enzymes suggested a minimum of 15 copies. The HindIII digest, for example, showed many bands, indicating that the larger region in which probe 1 is embedded is certainly polymorphic. Thirdly, the size of the largest PstI band (3.4 kb) corresponds to the major PstI band seen in genomic digests of six species and strains of the genus Trypanozoon examined with an AnTat 1.3a ES probe [25]. Thus it seems possible that the largely 'unclonable' 3.4 kb PstI fragment spanned by pGE117.31 is extremely conserved.

In Southern blots at high stringency, probes 2 to 6, encompassing ESAGs I and II, recognized single bands, suggesting that ESAG-II, like ESAG-I, represents a quite polymorphic gene family, in contrast to ESAG-III. Under low stringency, each probe detected several bands, and the correspondence between some of the bands recognized by probes specific for the two ESAG families is consistent with linkage of members of these separate gene families at other sites in the genome besides the 117a ES.

In isolated nuclei, transcription of the studied region of the 117a ES was efficient, unidirectional and insensitive to α-amanitin. These results are consistent with but do not prove the existence of a single large transcription unit. Regulatory sequences could reside further upstream or in the region that proved unclonable in our hands. Evidence from elsewhere [13,14] suggests that the ES promoter may lie anywhere from 4 to 60 kb upstream of the VSG gene in different variants of this T. brucei strain. The low abundance of ESAG mRNAs compared to VSG mRNA, when all are transcribed at a high rate, is a strong argument for a central role of trans-splicing in regulating the abundance of individual mRNAs in trypanosomes [27].

Further characterization of the 117a ES, by DNA sequencing and ESAG cDNA cloning, as was previously described for ESAG-I [8,9], was beyond the scope of the current investigation. However, there are similarities in the organization of the genomic map and RNA transcripts of the 117a ES and the AnTat 1.1b ES, which has been characterized to a point about 7 kb upstream from the probable position of a member of the ESAG-I family [25]. ES probes from variants 1.1b and 1.3a (ref. 25, probes V and VI), lying upstream from the likely position of ESAG-I, also recognized stable polyadenylated RNAs of about 1.7 and 1.4 kb. In addition, several stable polyadenylated RNAs apparently containing a mini-exon [27] sequence, and therefore presumed to be mRNAs, were recognized by probes representative of the 221a ES [12]. Thus it seems likely that several genes are present and co-ordinately transcribed in different VSG ESs. The nature of these putative genes and their products has been partly elucidated only for ESAG-I [9], although in this case neither the cellular location nor the function of the encoded protein is known. Conceivably, these ESAGs could be involved in the regulation of VSG expression or glycosylation, which varies dramatically from one variant to another (D. Ashford and M.A.J. Ferguson, personal communication). Overexpression of these genes by DNA transfection may be necessary to elucidate their function.

Acknowledgements

We thank L.H.T. Van der Ploeg and C. Shea for advice on the isolated nuclei transcription experiments and P.J. Johnson, J.M. Kooter, P. Borst and L.H.T. Van der Ploeg for sharing their unpublished results. This work was supported by Public Health Service grants AI21729 (G.A.M.C.) and AI07352 (C.P.G.) from the National Institutes of Health.

References

1 Van der Ploeg, L.H.T., Valerio, D., De Lange, T., Bernards, A., Borst, P. and Grosveld, F.G. (1982) An analysis of cosmid clones of nuclear DNA from Trypanosoma brucei shows that the genes for variant surface glycoproteins are clustered in the genome. Nucleic Acids Res. 10, 5905-5923.
2 Myler, P.J., Allison, J., Agabian, N. and Stuart, K. (1984) Antigenic variation in African trypanosomes by gene replacement or activation of alternate telomeres. Cell 39, 203-211.
3 Laurent, M., Pays, E., Van der Werf, A., Aerts, D., Magnus, E., Van Meirvenne, N. and Steinert, M. (1984) Translocation alters the activation rate of a trypanosome surface antigen gene. Nucleic Acids Res. 12, 8319-8328.
4 Borst, P. (1986) Discontinuous transcription and antigenic variation in trypanosomes. Annu. Rev. Biochem. 55, 701-732.
5 Van der Ploeg, L.H.T., Liu, A.Y.C. and Borst, P. (1984) Structure of the growing telomeres of trypanosomes. Cell 36, 459-468.
6 Campbell, D.A., Van Bree, M.P. and Boothroyd, J.C. (1984) The 5' limit of transposition and upstream barren region of a trypanosome VSG gene. Tandem 76 base-pair repeats flanking (TAA)90. Nucleic Acids Res. 12, 2759-2774.
7 Liu, A.Y.C., Van der Ploeg, L.H.T., Rijsewijk, F.A.M. and Borst, P. (1983) The transposition unit of VSG gene 118 of Trypanosoma brucei: presence of repeated elements at its border and absence of promoter associated sequences. J. Mol. Biol. 167, 57-75.
8 Cully, D.F., Ip, H.S. and Cross, G.A.M. (1985) Coordinate transcription of variant surface glycoprotein genes and an expression site associated gene family in Trypanosoma brucei. Cell 42, 173-182.
9 Cully, D.F., Gibbs, C.P. and Cross, G.A.M. (1986) Identification of proteins encoded by variant surface glycoprotein expression site associated genes in Trypanosoma brucei. Mol. Biochem. Parasitol. 21, 189-197.
10 Bernards, A., Kooter, J.M. and Borst, P. (1985) Structure and transcription of a telomeric surface antigen gene of Trypanosoma brucei. Mol. Cell. Biol. 5, 545-553.
11 Kooter, J.M. and Borst, P. (1984) α-Amanitin-insensitive transcription of variant surface glycoprotein genes provides further evidence for discontinuous transcription in trypanosomes. Nucleic Acids Res. 12, 9457-9472.
12 Kooter, J.M., Van der Spek, H.J., Wagter, R., d'Oliveira, C.E., Van der Hoeven, F., Johnson, P.J. and Borst, P. (1987) The anatomy and transcription of a telomeric expression site for variant-specific surface antigens in Trypanosoma brucei. Cell 51, 261-272.
13 Johnson, P.J., Kooter, J.M. and Borst, P. (1987) Inactivation of transcription by UV irradiation of Trypanosoma brucei provides evidence for a multicistronic transcription unit that includes a variant surface glycoprotein gene. Cell 51, 273-281.
14 Shea, C., Lee, M.G.S. and Van der Ploeg, L.H.T. (1987) VSG gene 118 is transcribed from a co-transposed pol I-like promoter. Cell 50, 603-612.
15 Van der Ploeg, L.H.T., Schwartz, D.C., Cantor, C.R. and Borst, P. (1984) Antigenic variation in Trypanosoma brucei analyzed by electrophoretic separation of chromosome-sized DNA molecules. Cell 37, 77-84.
16 Cross, G.A.M. (1975) Identification, purification and properties of clone-specific glycoprotein antigens constituting the surface coat of Trypanosoma brucei. Parasitology 71, 393-418.
17 Duszenko, M., Ferguson, M.A.J., Lamont, G.S., Rifkin, M.R. and Cross, G.A.M. (1985) Cysteine eliminates the feeder cell requirement for cultivation of Trypanosoma brucei bloodstream forms in vitro. J. Exp. Med. 162, 1256-1263.
18 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
19 Feinberg, A.P. and Vogelstein, B. (1983) A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132, 6-13.
20 Hanahan, D. (1983) Studies on transformation of Escherichia coli with plasmids. J. Mol. Biol. 166, 557-580.
21 Yanisch-Perron, C., Vieira, J. and Messing, J. (1983) Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13 mp18 and pUC19 vectors. Gene 33, 103-119.
22 Michels, P.A.M., Liu, A.Y.C., Bernards, A., Sloof, P., Van der Bijl, M.M.W., Schinkel, A.H., Menke, H.M., Borst, P., Venneman, G.H., Tromp, M.C. and Van Boom, J.H. (1983) Activation of the genes for variant surface glycoproteins 117 and 118 in Trypanosoma brucei. J. Mol. Biol. 166, 537-556.
23 Bernards, A., De Lange, T., Michels, P.A.M., Liu, A.Y.C., Huisman, M.J. and Borst, P. (1984) Two modes of activation of a single surface antigen gene of Trypanosoma brucei. Cell 36, 163-170.
24 Van der Ploeg, L.H.T., Bernards, A., Rijsewijk, F.A.M. and Borst, P. (1982) Characterization of the DNA duplication-transposition that controls the expression of two genes for variant surface glycoproteins in Trypanosoma brucei. Nucleic Acids Res. 10, 593-609.
25 Murphy, N.B., Guyaux, M., Pays, E. and Steinert, M. (1987) Analysis of VSG expression site sequences in T. brucei. In: Molecular Strategies of Parasitic Invasion (Agabian, N., Goodman, H. and Nogueira, N., eds.) pp. 449-469, Alan R. Liss Inc., New York.
26 Earnshaw, D.L., Beebee, T.J.C. and Gutteridge, W.E. (1987) Demonstration of RNA polymerase multiplicity in Trypanosoma brucei. Biochem. J. 241, 649-655.
27 Van der Ploeg, L.H.T. (1986) Discontinuous transcription and splicing in trypanosomes. Cell 47, 479-480.