Structure of the gene encoding the murine SCL protein


Gene, 138 (1994) 93-99. © 1994 Elsevier Science B.V. All rights reserved. 0378-1119/94/$07.00. SSDI 0378-1119(93)E0566-V. GENE 07637



(Helix-loop-helix; transcription factor; chromosome translocation; hemopoiesis; gene structure)

C.G. Begley a,b, L. Robb a, S. Rockman b, J. Visvader a, E.O. Bockamp c, Y.S. Chan c and A.R. Green c

a The Walter and Eliza Hall Institute of Medical Research, P.O. Royal Melbourne Hospital, Victoria 3050, Australia; b Department of Diagnostic Haematology, Melbourne, Victoria 3050, Australia; and c Department of Haematology, MRC Building, Hills Road, Cambridge CB2 2QH, UK. Tel. (44-223) 333-6835

Received by P.A. Manning: 11 May 1993; Accepted: 3 August 1993; Received at publishers: 30 September 1993

SUMMARY

We have determined the molecular structure of the gene encoding the murine SCL protein (helix-loop-helix transcription factor). The gene consists of seven exons spanning approx. 20 kb. The intron/exon structure, coding region sequences and sequences present at the splice junctions were highly conserved between mouse and human. The 5’ flanking sequence contains CCAAT and TATA consensus motifs with several putative binding sites for SP-1, AP-1 and GATA-1. Multiple mRNA transcripts were generated by alternate exon usage. The transcripts differed primarily in the 5’ untranslated region (UTR), but potentially also encode a smaller SCL protein. Despite the high degree of conservation between species, the heptamer/nonamer signal sequences in the 5’ region of the human SCL gene (the frequent site of SCL disruption in human leukemia) were poorly represented in the murine sequence. In keeping with this, structural abnormalities of murine SCL were uncommon in murine leukemias that express the SCL transcript.

INTRODUCTION

The SCL protein is a member of the helix-loop-helix (HLH) family of transcription factors. This family includes proteins that play critical roles in development and differentiation in a wide variety of tissues and in species ranging from plants through to mammals (Murre et al., 1989). Several members of the family (c-MYC, SCL, LYL-1, Tal-2 and the E2A gene product) are also implicated in the development of human lymphoid tumors because of their involvement in chromosome translocations, and we have recently demonstrated that SCL protein is tumorigenic when constitutively expressed in T-lymphoid cells (Elwood et al., 1993). However, SCL mRNA (Visvader et al., 1991; Green et al., 1992) and SCL protein cannot normally be detected in lymphoid populations. Within hemopoietic cells, SCL mRNA expression is restricted to proliferation and differentiation in megakaryocytes, mast cells, erythroid cells and more primitive progenitor cells (Green et al., 1991; 1992; Mouthon et al., 1993; Aplan et al., 1992a; Tanigawa et al., 1993).

The SCL gene was first identified because of its involvement in a translocation in a human stem cell leukemia, and DNA recombinases were strongly implicated in the genesis of this event (Begley et al., 1989a,b). Subsequently, SCL (also known as TCL-5 or Tal-1; Finger et al., 1989; Brown et al., 1990) has been recognized to be aberrantly activated in up to 25% of human T-cell acute lymphoblastic leukemias as a result of site-specific rearrangements at sequences that conform well to the heptamer/nonamer signal sequences recognized by DNA recombinases (Aplan et al., 1990a).

The aim of this study was to characterize the structure of the murine SCL gene and to determine whether specific regulatory sequences and sequences targeted by DNA recombinases were conserved across species.

Correspondence to: Dr. C.G. Begley, The Walter and Eliza Hall Institute of Medical Research, Post Office, The Royal Melbourne Hospital, Victoria 3050, Australia. Tel. (61-3) 345-2555; Fax (61-3) 347-0852; e-mail: [email protected]

Abbreviations: aa, amino acid(s); bp, base pair(s); cpm, counts per minute; HLH, helix-loop-helix; kb, kilobase or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; PBGD, porphobilinogen deaminase gene; PCR, polymerase chain reaction; SCL, HLH transcription factor; SCL, stem cell leukemia gene encoding SCL; tsp, transcription start point(s); UTR, untranslated region(s).

RESULTS AND DISCUSSION

(a) Genomic clones of the murine SCL gene

Six overlapping genomic clones were isolated from a murine genomic phage library (Clontech #ML1030j) using mouse and human cDNA fragments (Aplan et al., 1990b) as hybridization probes. Exon-containing fragments were identified by hybridization and subcloned from the genomic phage clones into plasmid vectors for sequence analysis and mapping of their relative positions. Fig. 1 shows the restriction map and intron-exon structure of 20 kb of genomic DNA encompassing the murine SCL gene, created by comparison with murine cDNA clones. The cDNA clones were obtained from a normal mouse bone-marrow macrophage cDNA library (six clones with inserts ranging from 1.8 to 4.3 kb) and from a mouse erythroleukemia cell library (five clones with inserts from 0.5 to 3.0 kb; Begley et al., 1991). Additional cDNA clones were obtained by PCR using sense oligo primers situated in exons Ia or Ib and an antisense oligo in exon III. These cDNA clones displayed multiple alternate splicing patterns in the exons of the 5’ UTR (see section d). The gene is organized into seven exons (with alternate exons Ia or Ib) and has an overall structure that is very similar to that of the human SCL gene (Aplan et al., 1990b; Bernard et al., 1991).

(b) Nucleotide sequence of full-length murine SCL cDNA

It is now possible to present the composite sequence of full-length murine SCL cDNA from the genomic and cDNA clones, including two additional alternate exons seen in the 5’ UTR that were identified in cDNA clones (Fig. 2). Exons I, II, and III contain 5’ UTR sequence and are relatively G + C rich. A long ORF begins in exon IV, 15 bp downstream from an in-frame upstream TAA stop codon present in exon III. The first ATG codon and surrounding nt fit well with the consensus sequence defined for translation initiation (Kozak, 1986). There are in-frame ATG codons downstream in exon IV and two in exon V, one of which could potentially serve as a site of translation start in some SCL transcripts (see section d). Exon VI contains the HLH region, basic domain, the entire 3’ UTR and canonical polyadenylation signal and its variant (ATTAAA, nt 6245, Fig. 2).

A 297-bp deletion occurs in some transcripts within the 3’ UTR of the human SCL gene as a result of a splicing event (Begley et al., 1989b). Although in the murine 3’ UTR the equivalent splice donor (nt 3934, Fig. 2), acceptor (nt 4230, Fig. 2) and surrounding homologous sequences were also present, no murine cDNA clones were identified that displayed this splicing event. However, murine transcripts that differ in the 3’ UTR have been demonstrated (Begley et al., 1991). The 3’ UTR of human SCL was also the site of a chromosome translocation (Begley et al., 1989a). Two sequences flanking this translocation site showed a five out of seven match with the consensus heptamer sequence CACA/TGTG and likely served as recognition sequences for recombinase enzymes. Similar sequences were also present at the same site in the murine 3’ UTR: at 4408 bp (Fig. 2) the sequence GGCTGTG was identical between species and

was also present at the second site at 4450 bp in the murine sequence (versus CTTTGTG in the human sequence).

Fig. 1. Organization of the murine SCL gene. Boxes and Roman numerals indicate exons; closed boxes indicate coding region which begins at the second nt of exon IV. A, ApaI; B, BamHI; Bg, BglII; H, HindIII; N, NotI; Nd, NdeI; P, PstI; R, EcoRI; S, SacI; Sm, SmaI; X, XhoI. Scale bar, 1 kb. Six overlapping genomic bacteriophage clones were isolated from Balb/c DNA (Clontech #ML1030j, DNA source: mouse Balb/c adult liver; Palo Alto, CA, USA) after screening approx. 3 x 10’ clones. Methods: Phage DNA was immobilized on duplicate nitrocellulose filters and hybridized to a series of murine and human cDNA fragments encompassing the entire cDNA. Prehybridization was performed at 65°C for murine probes and 55°C for human probes in 2 x SSC (1 x SSC = 0.15 M NaCl/0.015 M Na3·citrate pH 7.6)/0.2% Ficoll/0.2% polyvinylpyrrolidone/0.2% bovine serum albumin/2 mM Na·pyrophosphate/1 mM ATP and 0.005% Escherichia coli tRNA. Filters were then hybridized for 16 h in fresh 2 x SSC solution containing 0.1% SDS and probe at 2 x 10^6 cpm/ml as previously described (Begley et al., 1991). Filters were washed in 0.2 x SSC/0.1% SDS at 65°C (for murine probes) or 55°C (for human probes) prior to autoradiography for 12-36 h on Kodak X-AR film exposed at -70°C with an intensifying screen. After purification of positive plaques by screening at lower density, phage clone DNA was extracted using standard procedures (Aplan et al., 1990b). Subfragments of phage DNA were identified, obtained in preparative restriction digests and subcloned into pBluescript (Stratagene, La Jolla, CA, USA) for further restriction digests and sequence analysis. The relative position of the restriction fragments was confirmed by cross-hybridization to Southern blots of genomic and phage DNA.

(c) Sequence analysis of SCL 5’ flanking region

The nt sequences of murine exons Ia and Ib and 550 nt of upstream sequence are also presented in Fig. 2. The transcript start site shown for murine exon Ib in Fig. 2 is TCCATTGC (A is tsp), which is in good agreement with the consensus start site sequence (YYCRYYYY) and the start site defined for human SCL exon Ib (Aplan et al., 1990b; Bernard et al., 1991). Furthermore, a potential TATA-like element was found at the appropriate position upstream (29 nt in the mouse and 30 nt in the human sequence). There was 83% identity between species within the 100 bp flanking this TATA element (compared with 60% [...]) but without conservation of the heptamer that is frequently the site of the recombination events upstream from human exon Ib (GGCTGTG in the human sequence compared with GGCGCTG in the mouse).

Exon Ia lies within a region having characteristics of a CpG island, with an increased G + C content (69% in a 200-bp region) and high frequency of the dinucleotide CpG (15% in a 200-bp region). A potential TATA element (ATAAT) was present upstream from mouse (and human) exon Ia in a region that showed 94% identity between species (in 115 nt) and included several potential binding sites and a GATA-1 binding site shown to be functional for the human promoter (Aplan et al., 1992). Possibly as a result of the high G + C content, the tsp for human exon Ia have been variably estimated to lie approx. 80 bp, 135 bp and 200 bp downstream from the ATAAT motif (Aplan et al., 1990b; Bernard et al., 1991). A series of eight overlapping oligos were therefore used to prime PCR reactions with an antisense oligo in exon III. Only the five most 3’ oligos successfully amplified the predicted product, thus localizing the tsp to the region indicated in Fig. 2. This was approx. 80 nt downstream from the ATAAT motif and 30 nt downstream from the GATA-1 binding site. However, since most TATA elements are 25-35 nt upstream from tsp, the functional significance of the ATAAT motif is unclear. Several non-globin erythroid genes lack a TATA box at -30 nt and still initiate transcription at a single start site (Beaupain et al., 1990 and references therein). Moreover, a GATA-1 site at -30 nt in the chicken β-globin-encoding gene has been implicated in the establishment of stable transcriptional preinitiation complexes (Fong and Emerson, 1992) and GATA-1 may interact with a factor binding to an initiator element close to the PBGD cap site (Beaupain et al., 1990).

(d) Exon-intron junctions and multiple 5’ UTR transcripts

The exon-intron boundaries all contained canonical GT splice donor and AG acceptor dinucleotides and conform well with the consensus splice sequences (Fig. 2). The 5’ end of the mouse gene demonstrated a complex pattern of mRNA splicing (Fig. 3) with multiple alternate transcripts. These alternate splicing events occurred primarily in the 5’ UTR and included exon III, therefore potentially encoding a full-length SCL protein. However, one murine transcript involved a splicing event between exons Ib and V, but the relative abundance of this transcript was not quantitated. A similar event has been observed in human SCL transcripts involving exons Ia and V (Aplan et al., 1990b; Bernard et al., 1991). These splicing events between either exon Ia or Ib and exon V result in predicted proteins that retain the HLH and upstream basic domains but lack the Pro-rich N-terminal half of the protein and, by analogy with a number of other transcription factors (Mitchell and Tijan, 1989), may therefore lack a transactivation domain. In addition, these SCL proteins would lack the Ser that serves as a substrate for ERK/MAP2 kinases (Cheng et al., 1993). While such splicing events may be responsible for the smaller SCL proteins observed in in vitro translated lysates (Bernard et al., 1991; Cheng et al., 1993; Goldfarb et al., 1992) and in hemopoietic cell extracts, there are probably additional levels of translational control that occur in specific cell types.

One murine cDNA clone was isolated that contained appropriately spliced exons IV, V, VI, but with intronic sequence immediately upstream from exon IV. A second murine cDNA clone contained intronic sequences situated between exons IIb and III. These may represent additional RNA forms that retain intronic sequences or incompletely processed pre-RNA.

Fig. 4 compares the sequence for the mouse and human

exon III and the region immediately upstream. Although an exon approx. 300 bp upstream from human exon III (human exon IIa) was identified in some human transcripts (Aplan et al., 1990b) this region was not seen in murine transcripts. In addition, oligos from within the region likely to encompass the murine counterpart of human exon IIa were unsuccessful in amplifying a PCR product from murine mRNA. It was of interest that the intronic sequence immediately upstream from exon III showed considerable homology between species, raising the possibility that it may play a role in gene regulation. However, despite this homology, neither the human hep-

tamer nor nonamer signal sequences involved in translocation events within exon III were conserved (Fig. 4).

[Fig. 2: annotated nt sequence of the murine SCL gene, with the deduced aa sequence shown beneath the coding exons.]

Fig. 2. Partial nt sequence and intron-exon structure of the murine SCL gene. Exons are boxed and aa are shown using the three letter code. Numbers on the right indicate the respective nt. The TATA motifs upstream from exons Ia and Ib are overlined, the CCAAT box is shown dotted overlined. Sp1-recognition sequences (GGGCG) are bold underlined, a GATA-1 binding site is indicated broken overlined and a potential AP-1 binding site is bold overlined. An alternative 3’ end for exon Ib at nt 1045 is shown. The six in-frame Met residues are shaded. The polyadenylation signal is indicated in lower case. Methods: The nt sequence of exons III, IV, V, VI was determined by sequencing plasmid cDNA clones in both directions using the dideoxy chain-termination method and a T7 sequencing kit (Pharmacia) as previously described (Begley et al., 1991). An anchored PCR technique was used to obtain cDNA clones that encompassed the 5’ end of the SCL gene. Poly(A)+-selected mRNA from murine F4N cells was annealed to an SCL-specific oligo primer situated in either exon IV (nt 2852-2871) or in exon VI (either oligo primer nt 3301-3317 or oligo primer nt 3415-3434) and first-strand cDNA synthesis performed with Moloney murine leukemia virus reverse transcriptase according to the manufacturer’s instructions (Boehringer-Mannheim, Mannheim, Germany). First-strand cDNA purification, tailing with dATP and PCR amplification was performed using either the 5’ oligo: 5’-GACTCGACTCGACATCGA(T)n or 5’-GGGGATCCGTCGAC(T)n and nested SCL 3’ oligos in either exon III (nt 2209-2228) or in exon V (nt 3202-3220) or spanning the exon V/VI splice junction (nt 3220-3230 and 3284-3300). Thermal cycling was carried out for 35 cycles (95°C, 2 min; 50°C, 2 min; 72°C, 2 min) using a Perkin-Elmer-Cetus thermal cycler and Taq polymerase (Perkin Elmer Cetus, Norwalk, CT, USA). The PCR products were then extracted with phenol-chloroform and subcloned into pBluescript (Stratagene) for sequence analysis. Subsequently, sense oligos within exon Ia (nt 600-619) and exon Ib (nt 877-893) were used with an antisense oligo in exon III (nt 2209-2228) to generate additional PCR products (35 cycles; 95°C, 2 min; 50°C, 2 min; 72°C, 2 min) from reverse transcribed murine F4N poly(A)+-selected mRNA. The products of these reactions were confirmed to hybridize specifically, purified, extracted with phenol-chloroform and subcloned into pBluescript for sequence analysis. Using these and other cDNA fragments as probes, genomic clones were identified and characterized as described in Fig. 1. The nt sequence shown was determined in both directions from plasmid subclones. Because PCR generated cDNA clones involving exon Ia showed heterogeneity at the 5’ end, a series of oligos were constructed and used to prime PCR reactions using cDNA from F4N cells and an antisense oligo in exon III (nt 2209-2228). These included nt 511-530; nt 521-540; nt 528-548; nt 541-560; nt 551-570; nt 561-580; nt 581-600; nt 600-617. Only the five most 3’ oligos successfully amplified a hybridizing product of the predicted size from cDNA although all oligos successfully amplified the predicted (hybridizing) product from DNA under the same PCR conditions (35 cycles; 94°C, 2 min; 55°C/60°C, 2 min; 72°C, 2 min). A similar approach was used for exon Ib. Six oligos flanking exon Ib (nt 837-856; nt 847-866; nt 857-875; nt 867-886; nt 876-893) were used with an antisense oligo in exon III (nt 2209-2228). The most 3’ oligo (nt 867-893) generated a PCR product while the overlapping oligo (nt 867-886) did only occasionally (35 cycles; 94°C, 2 min; 57°C, 2 min; 72°C, 2 min). The other oligos were negative using cDNA from F4N cells. To confirm the difference between species in the nt sequence of the TATA element upstream from exon Ib (TAGA in the mouse versus TATA in human), PCR was performed (35 cycles; 94°C, 2 min; 55°C, 2 min; 72°C, 2 min) using a 5’ oligonucleotide (nt 600-617) and 3’ antisense oligo (nt 876-893) with two additional mouse genomic libraries (Clontech ML 100gd; DNA source: mouse DBA2J adult liver; and Stratagene #946303, DNA source: mouse C57Black 6 x CBA 6-8 weeks spleen) from different mouse strains. The PCR products were confirmed to hybridize specifically, purified, subcloned into plasmid and the nt sequence determined. The GenBank accession No. is U05130.

Fig. 3. Multiple mRNA forms generated by alternate exon usage: mRNA splicing occurred as shown. Two alternate 3’ ends for exon Ib were identified (at nt 1045 or nt 1356, Fig. 2). Arrows indicate the position of Met residues. Black boxes indicate protein coding sequences. The basic domain and adjacent HLH region (bHLH) are shown. The splicing of exon Ib to V potentially encodes a smaller protein product. These transcripts were defined using an anchored PCR strategy and PCR analysis using sense oligos situated in exon Ia and Ib as outlined in Fig. 2.

Fig. 4. Comparison of murine (M) and human (H) exon III and upstream sequence. Identities are indicated with asterisks. The sequence for mouse exon III is boxed. Note the similarity between species immediately upstream from exon III. The heptamer and nonamer signal sequences involved in translocation events in human SCL are bold underlined. Human exon IIa (Aplan et al., 1990) is boxed. Oligos (sense, nt 1699-1717 and antisense, nt 1807-1826, Fig. 2) from within the comparable murine sequence (broken overline) failed to generate a PCR product from F4N cDNA when used with oligos within exon IV (antisense, nt 2852-2871) and exon Ia (sense, nt 600-619), respectively, under a variety of conditions (35 cycles; 9?°C, 2 min; 45°C/50°C/55°C/60°C, 2 min; 72°C, 2 min). Sequence M is from Fig. 2 and sequence H from Aplan et al. (1990b).

(e) The SCL gene structure in murine leukemias

The structure of the SCL gene was examined in murine leukemias. Despite specific examination, murine T-cell lines that express SCL have not been detected (Visvader et al., 1991; Green et al., 1992). Murine myeloid leukemias (WEHI 3BD+, M1, PU-5, F4N, WEHI 265, J774, 32D, NFS60, 416B, FDCP1) were, therefore, deliberately selected to determine whether the readily detectable levels of SCL expression were a consequence of gene disruption analogous to the mechanism responsible for SCL expression in human T-cell leukemia. No gross structural abnormality was detected although more subtle abnormalities (e.g., point mutations) could not be excluded. This may of course reflect the lack of DNA recombinases within myeloid cells. However, there was no abnormality of lymphoid tumors examined by Southern blot (AT2.5, WEHI 417.1, AKR thymomas). Although these data do not establish a causal relationship, the inability to detect SCL gene rearrangements is at least in keeping with the paucity of recombination signal sequences detected in the 5’ region of the murine SCL gene.

(f) Conclusions

(1) The murine SCL gene consists of 7 exons spanning approx. 20 kb of genomic DNA. This compares with 8 exons in the human gene.

(2) Despite the high degree of conservation of mouse and human SCL gene structure and nt sequence, the heptamer/nonamer signal sequences present in the human gene are poorly represented in the murine sequence.

(3) The 5’ flanking sequence contains several potentially important regulatory motifs that are conserved between species.

(4) The intronic sequence upstream from exon III is highly conserved between species.

(5) Structural abnormalities of the SCL gene are uncommon in murine myeloid leukemias that express the SCL transcript.
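The consensus comparisons quoted in the text (a five out of seven match of the 3’ UTR heptamers GGCTGTG and CTTTGTG with CACA/TGTG, and the agreement of the exon Ib start site TCCATTGC with YYCRYYYY) can be checked position by position. The Python sketch below is an illustration only and is not part of the original analysis. It assumes that CACA/TGTG denotes CAC(A/T)GTG, written here with the IUPAC code W; the names `IUPAC`, `count_matches`, `HEPTAMER` and `TSP` are hypothetical helpers introduced for this example.

```python
# Minimal sketch: count positions at which a site agrees with a consensus
# written in IUPAC nucleotide codes. Only the codes needed here are listed.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
    "W": "AT",  # A or T
}

# Assumption: the text's "CACA/TGTG" is read as CAC(A/T)GTG.
HEPTAMER = "CACWGTG"
# Start-site consensus as quoted in the text.
TSP = "YYCRYYYY"


def count_matches(site: str, consensus: str) -> int:
    """Return the number of positions where `site` fits `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(
        base in IUPAC[code]
        for base, code in zip(site.upper(), consensus.upper())
    )


for label, site, consensus in [
    ("3' UTR heptamer (both species)", "GGCTGTG", HEPTAMER),
    ("3' UTR heptamer (human, second site)", "CTTTGTG", HEPTAMER),
    ("exon Ib start site (mouse)", "TCCATTGC", TSP),
]:
    score = count_matches(site, consensus)
    print(f"{label}: {site} vs {consensus} -> {score}/{len(consensus)}")
```

Under these assumptions both 3’ UTR heptamers score five of seven, as stated in the text, and the exon Ib start site agrees with YYCRYYYY at seven of eight positions.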

ACKNOWLEDGEMENTS

We wish to thank Paula Nathan and Bette Papaevangeliou for expert technical assistance. This work was supported in part by grants from the Anti-Cancer Council of Victoria, the Victorian Health Promotion Foundation, the National Health and Medical Research Council, Canberra, the Cancer Research Campaign and the Wellcome Trust. A.R.G. is a Wellcome Senior Fellow in Clinical Science.

REFERENCES

Aplan, P.D., Lombardi, D.P., Ginsberg, A.M., Cossman, J., Bertness, V.L. and Kirsch, I.R.: Disruption of the human SCL locus by “illegitimate” V-(D)-J recombinase activity. Science 250 (1990a) 1426-1429.

Aplan, P.D., Begley, C.G., Bertness, V., Nussmeier, M., Ezquerra, A., Coligan, J. and Kirsch, I.R.: The SCL gene is formed from a transcriptionally complex locus. Mol. Cell. Biol. 10 (1990b) 6426-6435.

Aplan, P.D., Nakahara, K., Orkin, S.H. and Kirsch, I.R.: The SCL gene product: a positive regulator of erythroid differentiation. EMBO J. 11 (1992) 4073-4081.

Beaupain, D., Eleouet, J.F. and Romeo, P.H.: Initiation of transcription of the erythroid promoter of the porphobilinogen deaminase gene is regulated by a cis-acting sequence around the cap site. Nucleic Acids Res. 18 (1990) 6509-6515.

Begley, C.G., Aplan, P.D., Davey, M.P., Nakahara, K., Tchorz, K., Kurtzberg, J., Herschfield, M.S., Haynes, B.F., Cohen, D.I., Waldmann, T.A. and Kirsch, I.R.: Chromosomal translocation in a human stem-cell line disrupts the T-cell antigen receptor δ-chain diversity region and results in a previously unreported fusion transcript. Proc. Natl. Acad. Sci. USA 86 (1989a) 2031-2035.

Begley, C.G., Aplan, P.D., Haynes, B., Waldmann, T.A. and Kirsch, I.R.: The gene SCL is expressed during early hematopoiesis and encodes a differentiation-related DNA-binding motif. Proc. Natl. Acad. Sci. USA 86 (1989b) 10128-10132.

Begley, C.G., Visvader, J., Green, A.R., Aplan, P.D., Metcalf, D., Kirsch, I.R. and Gough, N.M.: Molecular cloning and chromosomal localization of the murine homolog of the human helix-loop-helix gene SCL. Proc. Natl. Acad. Sci. USA 88 (1991) 869-873.

Bernard, O., Lecointe, N., Jonveaux, P., Souyri, M., Mauchauffe, M., Berger, R., Larsen, C.J. and Mathieu-Mahul, D.: Two site-specific deletions and t(1;14) translocation restricted to human T-cell acute leukemias disrupt the 5’ part of the tal-1 gene. Oncogene 6 (1991) 1477-1488.

Brown, L., Cheng, J.-T., Chen, Q., Siciliano, M.J., Crist, W., Buchanan, G. and Baer, R.: Site-specific recombination of the tal-1 gene is a common occurrence in human T cell leukemia. EMBO J. 9 (1990) 3343-3351.

Cheng, J.-T., Cobb, M.H. and Baer, R.: Phosphorylation of the Tal-1 oncoprotein by the extracellular-signal-regulated protein kinase ERK1. Mol. Cell. Biol. 13 (1993) 801-808.

Elwood, N.J., Cook, W.D., Metcalf, D. and Begley, C.G.: The HLH gene SCL enhances tumorigenicity of a pre-leukemic T-lymphocyte cell line. Oncogene 8 (1993) 3093-3101.

Finger, L.B., Kagan, J., Christopher, G., Kurtzberg, J., Herschfield, M.S., Nowell, P.C. and Croce, C.M.: Involvement of the TCL-5 gene on chromosome 1 in T cell leukemia and melanoma. Proc. Natl. Acad. Sci. USA 86 (1989) 5039-5043.

Fong, T.C. and Emerson, B.M.: The erythroid-specific protein GATA-1 mediates distal enhancer activity through a specialized β-globin TATA box. Genes Dev. 6 (1992) 521-523.

Goldfarb, A.N., Gouel, S., Mickelson, D. and Greenberg, J.M.: T-cell acute lymphoblastic leukemia: the associated gene SCL/tal codes for a 42 kd nuclear phosphoprotein. Blood 11 (1992) 2858-2866.

Green, A.R., DeLuca, E. and Begley, C.G.: Antisense SCL suppresses self-renewal and enhances spontaneous erythroid differentiation of the human leukaemic cell line K562. EMBO J. 10 (1991) 4153-4158.

Kozak, M.: Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44 (1986) 283-293.

Mitchell, P.J. and Tijan, R.: Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245 (1989) 371-378.

Mouthon, M.A., Bernard, O., Mitjavila, M.T., Romeo, P.H., Vainchenker, W. and Mathieu-Mahul, D.: Expression of tal-1 and GATA-binding proteins during human hematopoiesis. Blood 8 (1993) 647-655.

Murre, C., McCaw, P.S. and Baltimore, D.: A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD and Myc proteins. Cell 56 (1989) 777-783.

Tanigawa, T., Elwood, N., Metcalf, D., Cary, D., De Luca, N.A. and Begley, C.G.: The SCL gene product is regulated by and differentially regulates cytokine responses during myeloid leukemic cell differentiation. Proc. Natl. Acad. Sci. USA 90 (1993) 7864-7868.

Visvader, J., Begley, C.G. and Adams, J.M.: Differential expression of the LYL, SCL and E2A helix-loop-helix genes within the hemopoietic system. Oncogene 6 (1991) 195-204.