Gene, 138 (1994) 93-99
© 1994 Elsevier Science B.V. All rights reserved. 0378-1119/94/$07.00
SSDI 0378-1119(93)E0566-V

GENE 07637

Structure of the gene encoding the murine SCL protein

(Helix-loop-helix; transcription factor; chromosome translocation; hemopoiesis; gene structure)

C.G. Begleyᵃ,ᵇ, L. Robbᵃ, S. Rockmanᵇ, J. Visvaderᵃ, E.O. Bockampᶜ, Y.S. Chanᶜ and A.R. Greenᶜ

ᵃ The Walter and Eliza Hall Institute of Medical Research, P.O. Royal Melbourne Hospital, Victoria 3050, Australia; ᵇ Department of Diagnostic Haematology, Melbourne, Victoria 3050, Australia; and ᶜ Department of Haematology, MRC Building, Hills Road, Cambridge CB2 2QH, UK. Tel. (44-223) 333-6835

Received by P.A. Manning: 11 May 1993; Accepted: 3 August 1993; Received at publishers: 30 September 1993
SUMMARY
We have determined the molecular structure of the gene encoding the murine SCL protein (helix-loop-helix transcription factor). The gene consists of seven exons spanning approx. 20 kb. The intron/exon structure, coding region sequences and sequences present at the splice junctions were highly conserved between mouse and human. The 5' flanking sequence contains CCAAT and TATA consensus motifs with several putative binding sites for SP-1, AP-1 and GATA-1. Multiple mRNA transcripts were generated by alternate exon usage. The transcripts differed primarily in the 5' untranslated region (UTR), but potentially also encode a smaller SCL protein. Despite the high degree of conservation between species, the heptamer/nonamer signal sequences in the 5' region of the human SCL gene (the frequent site of SCL disruption in human leukemia) were poorly represented in the murine sequence. In keeping with this, structural abnormalities of murine SCL were uncommon in murine leukemias that express the SCL transcript.
INTRODUCTION
The SCL protein is a member of the helix-loop-helix (HLH) family of transcription factors. This family includes proteins that play critical roles in development and differentiation in a wide variety of tissues and in species ranging from plants through to mammals (Murre et al., 1989). Several members of the family (c-MYC, SCL, LYL-1, Tal-2 and the E2A gene product) are also implicated in the development of human lymphoid tumors because of their involvement in chromosome translocations, and we have recently demonstrated that SCL protein is tumorigenic when constitutively expressed in T-lymphoid cells (Elwood et al., 1993). However, SCL mRNA (Visvader et al., 1991; Green et al., 1992) and SCL protein cannot normally be detected in lymphoid populations. Within hemopoietic cells, SCL mRNA expression is restricted to proliferation and differentiation in megakaryocytes, mast cells, erythroid cells and more primitive progenitor cells (Green et al., 1991; 1992; Mouthon et al., 1993; Aplan et al., 1992a; Tanigawa et al., 1993).

The SCL gene was first identified because of its involvement in a translocation in a human stem cell leukemia, and DNA recombinases were strongly implicated in the genesis of this event (Begley et al., 1989a,b). Subsequently, SCL (also known as TCL-5 or tal-1; Finger et al., 1989; Brown et al., 1990) has been recognized to be aberrantly activated in up to 25% of human T-cell acute lymphoblastic leukemias as a result of site-specific rearrangements at sequences that conform well to the heptamer/nonamer signal sequences recognized by DNA recombinases (Aplan et al., 1990a). The aim of this study was to characterize the structure of the murine SCL gene and to determine whether regulatory sequences and sequences targeted by DNA recombinases were conserved across species.

Correspondence to: Dr. C.G. Begley, The Walter and Eliza Hall Institute of Medical Research, Post Office, The Royal Melbourne Hospital, Victoria 3050, Australia. Tel. (61-3) 345-2555; Fax (61-3) 347-0852.

Abbreviations: aa, amino acid(s); bp, base pair(s); cpm, counts per minute; HLH, helix-loop-helix; kb, kilobase or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; PBGD, porphobilinogen deaminase gene; PCR, polymerase chain reaction; SCL, HLH transcription factor; SCL, stem cell leukemia gene encoding SCL; tsp, transcription start point(s); UTR, untranslated region(s).

RESULTS AND DISCUSSION

(a) Genomic clones of the murine SCL gene
Six overlapping genomic phage clones were isolated from a murine genomic library (Clontech #ML1030j) using murine and human cDNA fragments (Aplan et al., 1990b) as hybridization probes. Exon-containing fragments were identified by hybridization with specific cDNA probes and subcloned from the phage clones into plasmid vectors for sequence analysis and mapping of their relative positions. Fig. 1 shows the restriction map and intron-exon structure of 20 kb of genomic DNA encompassing the murine SCL gene, created by comparison with murine cDNA clones. The cDNA clones were obtained from a normal mouse bone-marrow macrophage cDNA library (six clones with inserts ranging from 1.8-4.3 kb) and from a mouse erythroleukemia cell library (five clones with inserts from 0.5-3.0 kb; Begley et al., 1991). Additional cDNA clones were obtained by PCR using sense oligo primers situated in exons Ia or Ib and an antisense oligo in exon III. These cDNA clones displayed multiple alternate splicing patterns in the exons of the 5' UTR (see section d). The gene is organized into seven exons (with alternate exons Ia or Ib) and has an overall structure that is very similar to that of the human SCL gene (Aplan et al., 1990b; Bernard et al., 1991).

(b) Nucleotide sequence of full-length murine SCL cDNA
It is now possible to present an entire composite murine SCL cDNA sequence, including upstream alternate exons seen in the 5' UTR that were identified in cDNA clones (Fig. 2). Exons I, II, and III contain 5' UTR sequence and are relatively G + C rich. A long ORF begins in exon IV, 15 bp downstream from an in-frame TAA stop codon present in exon III. The first ATG codon and the surrounding nt fit well with the consensus sequence defined for translation initiation (Kozak, 1986). There are two additional in-frame ATG codons downstream in exon IV and two in exon V, one of which could potentially serve as a site of translation start in some SCL transcripts (see section d). Exon VI contains the HLH region, basic domain, 3' UTR and canonical polyadenylation signal and its variant (ATTAAA nt 6245, Fig. 2).

A 297-bp deletion occurs in some transcripts within the 3' UTR of the human SCL gene as a result of a splicing event (Begley et al., 1989b). Although in the murine 3' UTR the equivalent splice donor (nt 3934, Fig. 2), acceptor (nt 4230, Fig. 2) and surrounding homologous sequences were also present, no murine cDNA clones were identified that displayed this splicing event. However, murine transcripts that differ in the 3' UTR have been demonstrated (Begley et al., 1991). The 3' UTR of human SCL was also the site of a chromosome translocation (Begley et al., 1989a). Two sequences flanking this translocation site showed a five out of seven match with the consensus heptamer sequence CACA/TGTG and likely served as recognition sequences for recombinase enzymes. Similar sequences were also present at the same site in the murine 3' UTR: at 4408 bp (Fig. 2) the sequence GGCTGTG was identical between species and
[Fig. 1: restriction map and exon organization of the murine SCL gene; scale bar, 1 kb.]
Fig. 1. Organization of the murine SCL gene. Boxes and Roman numerals indicate exons; closed boxes indicate coding region, which begins at the second nt of exon IV. A, ApaI; B, BamHI; Bg, BglII; H, HindIII; N, NotI; Nd, NdeI; P, PstI; R, EcoRI; S, SacI; Sm, SmaI; X, XhoI. Six overlapping genomic bacteriophage clones were isolated from Balb/c DNA (Clontech #ML1030j, DNA source: mouse Balb/c adult liver; Palo Alto, CA, USA) after screening approx. 3 x 10’ clones. Methods: Phage DNA was immobilized on duplicate nitrocellulose filters and hybridized to a series of murine and human cDNA fragments encompassing the entire cDNA. Prehybridization was performed at 65°C for murine probes and 55°C for human probes in 2 x SSC (1 x SSC = 0.15 M NaCl/0.015 M Na3·citrate pH 7.6)/0.2% Ficoll/0.2% polyvinylpyrrolidone/0.2% bovine serum albumin/2 mM Na·pyrophosphate/1 mM ATP and 0.005% Escherichia coli tRNA. Filters were then hybridized for 16 h in fresh 2 x SSC solution containing 0.1% SDS and probe at 2 x 10^6 cpm/ml as previously described (Begley et al., 1991). Filters were washed in 0.2 x SSC/0.1% SDS at 65°C (for murine probes) or 55°C (for human probes) prior to autoradiography for 12-36 h on Kodak X-AR film exposed at -70°C with an intensifying screen. After purification of positive plaques by screening at lower density, phage clone DNA was extracted using standard procedures (Aplan et al., 1990b). Subfragments of phage DNA were identified, obtained in preparative restriction digests and subcloned into pBluescript (Stratagene, La Jolla, CA, USA) for further restriction digests and sequence analysis. The relative position of the restriction fragments was confirmed by cross-hybridization to Southern blots of genomic and phage DNA.
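The hybridization and wash buffers in the legend above are stated as multiples of 1 x SSC. As a minimal sketch of that arithmetic, assuming only the 1 x SSC definition given in the legend (the function name `ssc_molarities` is illustrative and not part of the original methods), the component molarities of the 2 x and 0.2 x solutions can be computed as follows:

```python
# Component molarities (mol/L) of 1 x SSC, as defined in the Fig. 1 legend.
SSC_1X = {"NaCl": 0.15, "Na3-citrate": 0.015}


def ssc_molarities(fold):
    """Return the component molarities (mol/L) of an n-fold SSC solution."""
    return {name: conc * fold for name, conc in SSC_1X.items()}


# 2 x SSC is used for (pre)hybridization, 0.2 x SSC for the washes.
for fold in (2, 0.2):
    rounded = {name: round(c, 4) for name, c in ssc_molarities(fold).items()}
    print(f"{fold} x SSC: {rounded}")
```

The tenfold drop in salt between hybridization (2 x SSC) and washing (0.2 x SSC) is what raises the stringency of the wash at the same temperature.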
was also present at the second site at 4450 bp in the murine sequence (versus CTTTGTG in the human sequence).

(c) Sequence analysis of SCL 5' flanking region
The nt sequence upstream from exons Ia and Ib and 550 nt of intronic sequence are also presented in Fig. 2. The transcript start site shown for murine exon Ib in Fig. 2 is TCCATTGC (A is tsp), which is in good agreement with the consensus start site sequence (YYCRYYYY) and the start site defined for human SCL exon Ib (Aplan et al., 1990b; Bernard et al., 1991). Furthermore, a potential TATA-like element was found at the appropriate position upstream (29 nt in the mouse and 30 nt in the human sequence). The region around this TATA element, upstream from human exon Ib, is frequently the site of the recombination events [...]. There was 83% identity between species within 100 bp flanking that heptamer (GGCTGTG in the human sequence compared with GGCGCTG in the mouse), compared with 60% conservation of the [...] sequence.

Exon Ia lies within a region having characteristics of a CpG island, with an increased G + C content (69% in a 200-bp region) and high frequency of the dinucleotide CpG (15% in a 200-bp region). A potential TATA element (ATAAT) was present upstream from mouse (and human) exon Ia in a region that showed 94% identity between species (in 115 nt) and included several potential binding sites and a GATA-1 binding site shown to be functional for the human promoter (Aplan et al., 1992). Possibly as a result of the high G + C content, the tsp for human exon Ia have been variably estimated to lie approx. 80 bp, 135 bp and 200 bp downstream from the ATAAT motif (Aplan et al., 1990b; Bernard et al., 1991). A series of eight overlapping oligos were therefore used to prime PCR reactions with an antisense oligo in exon III. Only the five most 3' oligos successfully amplified the predicted product, thus localizing the tsp to the region indicated in Fig. 2. This was approx. 80 nt downstream from the ATAAT motif and 30 nt downstream from the GATA-1 binding site. However, since most TATA elements are 25-35 nt upstream from tsp, the functional significance of the ATAAT motif is unclear. Several non-globin erythroid genes lack a TATA box at -30 nt and still initiate transcription at a single start site (Beaupain et al., 1990 and references therein). Moreover, a GATA-1 binding site at -30 nt in the chicken β-globin-encoding gene has been implicated in the establishment of stable transcriptional preinitiation complexes (Fong and Emerson, 1992) and GATA-1 may interact with a factor binding to an initiator element close to the PBGD cap site (Beaupain et al., 1990).

(d) Exon-intron junctions and multiple 5' UTR transcripts
The exon-intron boundaries all contained canonical GT splice donor and AG acceptor splice sequences (Fig. 2). The 5' end of the mouse gene demonstrated a complex pattern of mRNA splicing (Fig. 3) with multiple transcripts. These alternate splicing events primarily occurred in the 5' UTR and included exon III, therefore potentially encoding a full-length SCL protein. However, one murine transcript involved a splicing event between exons Ib and V, but the relative abundance of this transcript was not quantitated. A similar event has been observed in human SCL transcripts involving exons Ia and V (Aplan et al., 1990b; Bernard et al., 1991). These splicing events between either exon Ia or Ib and exon V result in predicted proteins that retain the HLH and upstream basic domains but lack the Pro-rich N-terminal half of the protein and, by analogy with a number of other transcription factors (Mitchell and Tijan, 1989), may therefore lack a transactivation domain. In addition, these SCL proteins would lack the Ser that serves as a substrate for ERK/MAP2 kinases (Cheng et al., 1993). While such splicing events may be responsible for the smaller SCL proteins observed in in vitro translated lysates (Bernard et al., 1991; Cheng et al., 1993; Goldfarb et al., 1992) and in hemopoietic cell extracts, there are probably additional levels of translational control that occur in specific cell types.

One murine cDNA clone was isolated that contained exons IV, V, VI, but with intronic sequence immediately upstream from exon IV. A second murine cDNA clone contained appropriately spliced exons IIb and III together with additional intronic sequence [...]. These may represent RNA forms that retain intronic sequences or incompletely processed pre-RNA.

Fig. 4 compares the mouse and human sequence for exon III and the region immediately upstream. Although an exon approx. 300 bp upstream from human exon III (human exon IIa) was identified in some human transcripts (Aplan et al., 1990b), this region was not seen in murine transcripts. In addition, oligos from within the region likely to encompass the murine counterpart of human exon IIa were unsuccessful in amplifying a PCR product from murine mRNA. It was of interest that the intronic sequence immediately upstream from exon III showed considerable homology between species, raising the possibility that it may play a role in gene regulation. However, despite this homology, neither the human hep-
[Fig. 2: annotated nucleotide sequence of the murine SCL gene with the deduced amino acid sequence in three-letter code.]
Fig. 2. Partial nt sequence and intron-exon structure of the murine SCL gene. Exons are boxed and aa are shown using the three-letter code. Numbers on the right indicate the respective nt. The TATA motifs upstream from exons Ia and Ib are overlined; the CCAAT box is shown dotted overlined. Sp1-recognition sequences (GGGCG) are bold underlined, a GATA-1 binding site is indicated broken overlined and a potential AP-1 binding site is bold overlined. An alternative 3' end for exon Ib at nt 1045 is shown. The six in-frame Met residues are shaded. The polyadenylation signal is indicated in lower case. Methods: The nt sequence of exons III, IV, V, VI was determined by sequencing plasmid cDNA clones in both directions using the dideoxy chain-termination method and a T7 sequencing kit (Pharmacia) as previously described (Begley et al., 1991). An anchored PCR technique was used to obtain cDNA clones that encompassed the 5' end of the SCL gene. Poly(A)+-selected mRNA from murine F4N cells was annealed to an SCL-specific oligo primer situated in either exon IV (nt 2852-2871) or in exon VI (either oligo primer nt 3301-3317 or oligo primer
tamer nor nonamer signal sequences involved in translocation events within exon III were conserved (Fig. 4).

Fig. 3. Multiple mRNA forms generated by alternate exon usage: mRNA splicing occurred as shown. Two alternate 3' ends for exon Ib were identified (at nt 1045 or nt 1356, Fig. 2). Arrows indicate the position of Met residues. Black boxes indicate protein coding sequences. The basic domain and adjacent HLH region (bHLH) are shown. The splicing of exon Ib to V potentially encodes a smaller protein product. These transcripts were defined using an anchored PCR strategy and PCR analysis using sense oligos situated in exon Ia and Ib as outlined in Fig. 2.

(e) The SCL gene structure in murine leukemias
The structure of the SCL gene was examined in murine leukemias. Despite specific examination, murine T-cell lines that express SCL have not been detected (Visvader et al., 1991; Green et al., 1992). Murine myeloid leukemias (WEHI 3BD+, M1, PU-5, F4N, WEHI 265, J774, NFS60, 32D, 416B, FDCP1) were, therefore, deliberately selected to determine whether the readily detectable levels of SCL expression were a consequence of gene disruption analogous to the mechanism responsible for SCL expression in human T-cell leukemia. No gross structural abnormality was detected, although more subtle abnormalities (e.g., point mutations) could not be excluded. This may of course reflect the lack of DNA recombinases within myeloid cells. However, no abnormality was detected in the lymphoid tumors examined by Southern blot (AT2.5, WEHI 417.1, AKR thymomas). Although these data do not establish a causal relationship, the inability to detect SCL gene rearrangements is at least in keeping with the paucity of recombination signal sequences detected in the 5' region of the murine SCL gene.

(f) Conclusions
(1) The murine SCL gene consists of 7 exons spanning approx. 20 kb of genomic DNA. This compares with 8 exons in the human gene.
(2) Despite the high degree of conservation of mouse and human SCL gene structure and nt sequence, the heptamer/nonamer signal sequences present in the human gene are poorly represented in the murine sequence.
(3) The 5' flanking sequence contains several potentially important regulatory motifs that are conserved between species.
(4) The intronic sequence upstream from exon III is highly conserved between species.
(5) Structural abnormalities of the SCL gene are uncom-
nt 3415-3434) and first-strand cDNA synthesis performed with Moloney murine leukemia virus reverse transcriptase according to the manufacturer's instructions (Boehringer-Mannheim, Mannheim, Germany). First-strand cDNA purification, tailing with dATP and PCR amplification were performed using either the 5' oligo 5'-GACTCGACTCGACATCGA(T)n or 5'-GGGGATCCGTCGAC(T)n and nested SCL 3' oligos in either exon III (nt 2209-2228) or in exon V (nt 3202-3220) or spanning the exon V/VI splice junction (nt 3220-3230 and 3284-3300). Thermal cycling was carried out for 35 cycles (95°C, 2 min; 50°C, 2 min; 72°C, 2 min) using a Perkin-Elmer-Cetus thermal cycler and Taq polymerase (Perkin Elmer Cetus, Norwalk, CT, USA). The PCR products were then extracted with phenol-chloroform and subcloned into pBluescript (Stratagene) for sequence analysis. Subsequently, sense oligos within exon Ia (nt 600-619) and exon Ib (nt 877-893) were used with an antisense oligo in exon III (nt 2209-2228) to generate additional PCR products (35 cycles; 95°C, 2 min; 50°C, 2 min; 72°C, 2 min) from reverse transcribed murine F4N poly(A)+-selected mRNA. The products of these reactions were confirmed to hybridize specifically, purified, extracted with phenol-chloroform and subcloned into pBluescript for sequence analysis. Using these and other cDNA fragments as probes, genomic clones were identified and characterized as described in Fig. 1. The nt sequence shown was determined in both directions from plasmid subclones. Because PCR-generated cDNA clones involving exon Ia showed heterogeneity at the 5' end, a series of oligos were constructed and used to prime PCR reactions using cDNA from F4N cells and an antisense oligo in exon III (nt 2209-2228). These included nt 511-530; nt 521-540; nt 528-548; nt 541-560; nt 551-570; nt 561-580; nt 581-600; nt 600-617.
Only the five most 3' oligos successfully amplified a hybridizing product of the predicted size from cDNA, although all oligos successfully amplified the predicted (hybridizing) product from DNA under the same PCR conditions (35 cycles; 94°C, 2 min; 55°C/60°C, 2 min; 72°C, 2 min). A similar approach was used for exon Ib. Six oligos flanking exon Ib (nt 837-856; nt 847-866; nt 857-875; nt 867-886; nt 876-893) were used with an antisense oligo in exon III (nt 2209-2228). The most 3' oligo (nt 867-893) generated a PCR product while the overlapping oligo (nt 867-886) did only occasionally (35 cycles; 94°C, 2 min; 57°C, 2 min; 72°C, 2 min). The other oligos were negative using cDNA from F4N cells. To confirm the difference between species in the nt sequence of the TATA element upstream from exon Ib (TAGA in the mouse versus TATA in human), PCR was performed (35 cycles; 94°C, 2 min; 55°C, 2 min; 72°C, 2 min) using a 5' oligonucleotide (nt 600-617) and 3' antisense oligo (nt 876-893) with two additional mouse genomic libraries (Clontech ML 100gd; DNA source: mouse DBA2J adult liver; and Stratagene #946303, DNA source: mouse C57Black 6 x CBA 6-8 weeks spleen) from different mouse strains. The PCR products were confirmed to hybridize specifically, purified, subcloned into plasmid and the nt sequence determined. The GenBank accession No. is U05130.
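The exon Ia primer walk described in this legend brackets the tsp between the most 3' oligo that failed on cDNA and the most 5' oligo that succeeded. The following is a minimal sketch of that inference, under the simplifying assumption (ours, not stated in the legend) that a sense oligo primes from cDNA only when it lies entirely within the transcript. The oligo coordinates and the positive/negative calls are taken from the legend; the name `tsp_window` is illustrative.

```python
# Sense oligos (Fig. 2 coordinates) paired with an antisense oligo in exon III.
# True  = hybridizing product of the predicted size amplified from F4N cDNA.
# False = no product from cDNA (all oligos amplified from genomic DNA).
OLIGOS = [
    ((511, 530), False),
    ((521, 540), False),
    ((528, 548), False),
    ((541, 560), True),
    ((551, 570), True),
    ((561, 580), True),
    ((581, 600), True),
    ((600, 617), True),
]


def tsp_window(oligos):
    """Bracket the tsp as (lower, upper) in nt.

    lower: 5' end of the most 3' negative oligo (tsp lies downstream of it).
    upper: 5' end of the most 5' positive oligo (tsp lies at or upstream of it).
    """
    negative_starts = [start for (start, _end), ok in oligos if not ok]
    positive_starts = [start for (start, _end), ok in oligos if ok]
    return max(negative_starts), min(positive_starts)


print(tsp_window(OLIGOS))  # (528, 541)
```

Under that assumption the walk places the exon Ia tsp downstream of nt 528 and no further 3' than nt 541; the resolution is limited by the spacing of the oligo 5' ends.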
Fig. 4. Comparison of murine (M) and human (H) exon III and upstream sequence. Identities are indicated with asterisks. The sequence for mouse exon III is boxed. Note the similarity between species immediately upstream from exon III. The heptamer and nonamer signal sequences involved in translocation events in human SCL are bold underlined. Human exon IIa (Aplan et al., 1990b) is boxed. Oligos (sense, nt 1699-1717 and antisense, nt 1807-1826, Fig. 2) from within the comparable murine sequence (broken overline) failed to generate a PCR product from F4N cDNA when used with oligos within exon IV (antisense, nt 2852-2871) and exon Ia (sense, nt 600-619), respectively, under a variety of conditions (35 cycles; 94°C, 2 min; 45°C/50°C/55°C/60°C, 2 min; 72°C, 2 min). Sequence M is from Fig. 2 and sequence H from Aplan et al. (1990b).
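The asterisk track and the percent identity figures quoted for mouse/human comparisons can be derived from a pairwise alignment in a few lines. The sketch below is illustrative only: the two aligned strings are hypothetical placeholders, not the Fig. 4 sequences, and excluding gapped columns from the denominator is our assumption rather than the authors' stated convention.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of ungapped aligned columns in which the two bases are identical."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if gap not in (a, b)  # skip columns with a gap in either sequence
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def identity_track(seq_a, seq_b, gap="-"):
    """Return a string with '*' under identical columns, as in the figure."""
    return "".join(
        "*" if a == b and a != gap else " "
        for a, b in zip(seq_a.upper(), seq_b.upper())
    )


# Hypothetical aligned fragments, for illustration only.
mouse = "ACGT-ACGTA"
human = "ACGTTACCTA"
print(mouse)
print(human)
print(identity_track(mouse, human))
print(round(percent_identity(mouse, human), 1))
```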
mon in murine myeloid leukemias that express the SCL transcript.
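Conclusion (2) turns on how closely candidate sites match the recombinase heptamer consensus, scored in section (b) as "five out of seven". As a minimal sketch of that scoring, assuming the consensus written as CACA/TGTG denotes the two forms CACAGTG and CACTGTG (our reading of the notation), candidate heptamers quoted in the text can be compared position by position. The function name `heptamer_matches` is illustrative.

```python
# The two forms of the consensus heptamer (assumed reading of "CACA/TGTG").
CONSENSUS_HEPTAMERS = ("CACAGTG", "CACTGTG")


def heptamer_matches(candidate):
    """Best number of matching positions (0-7) against either consensus form."""
    candidate = candidate.upper()
    if len(candidate) != 7:
        raise ValueError("a heptamer must be 7 nt long")
    return max(
        sum(c == k for c, k in zip(candidate, consensus))
        for consensus in CONSENSUS_HEPTAMERS
    )


# Heptamer-like sequences quoted in section (b) for the 3' UTR sites.
for site in ("GGCTGTG", "CTTTGTG"):
    print(site, heptamer_matches(site))
# GGCTGTG 5
# CTTTGTG 5
```

Both quoted sequences score five of seven under this reading, which is consistent with the match reported in the text for the sequences flanking the human 3' UTR translocation site.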
ACKNOWLEDGEMENTS
We wish to thank Paula Nathan and Bette Papaevangeliou for expert technical assistance. This work was supported in part by grants from the Anti-Cancer Council of Victoria, the Victorian Health Promotion Foundation, the National Health and Medical Research Council, Canberra, the Cancer Research Campaign and the Wellcome Trust. A.R.G. is a Wellcome Senior Fellow in Clinical Science.
REFERENCES
Aplan, P.D., Lombardi, D.P., Ginsberg, A.M., Cossman, J., Bertness, V.L. and Kirsch, I.R.: Disruption of the human SCL locus by "illegitimate" V-(D)-J recombinase activity. Science 250 (1990a) 1426-1429.
Aplan, P.D., Begley, C.G., Bertness, V., Nussmeier, M., Azquerra, A., Coligan, J. and Kirsch, I.R.: The SCL gene is formed from a transcriptionally complex locus. Mol. Cell. Biol. 10 (1990b) 6426-6435.
Aplan, P.D., Nakahara, K., Orkin, S.H. and Kirsch, I.R.: The SCL gene product: a positive regulator of erythroid differentiation. EMBO J. 11 (1992) 4073-4081.
Beaupain, D., Eleouet, J.F. and Romeo, P.H.: Initiation of transcription of the erythroid promoter of the porphobilinogen deaminase gene is regulated by a cis-acting sequence around the cap site. Nucleic Acids Res. 18 (1990) 6509-6515.
Begley, C.G., Aplan, P.D., Davey, M.P., Nakahara, K., Tchorz, K., Kurtzberg, J., Herschfield, M.S., Haynes, B.F., Cohen, D.I., Waldmann, T.A. and Kirsch, I.R.: Chromosomal translocation in a human stem-cell line disrupts the T-cell antigen receptor δ-chain diversity region and results in a previously unreported fusion transcript. Proc. Natl. Acad. Sci. USA 86 (1989a) 2031-2035.
Begley, C.G., Aplan, P.D., Haynes, B., Waldmann, T.A. and Kirsch, I.R.: The gene SCL is expressed during early hematopoiesis and encodes a differentiation-related DNA-binding motif. Proc. Natl. Acad. Sci. USA 86 (1989b) 10128-10132.
Begley, C.G., Visvader, J., Green, A.R., Aplan, P.D., Metcalf, D., Kirsch, I.R. and Gough, N.M.: Molecular cloning and chromosomal localization of the murine homolog of the human helix-loop-helix gene SCL. Proc. Natl. Acad. Sci. USA 88 (1991) 869-873.
Bernard, O., Lecointe, N., Jonveaux, P., Souyri, M., Mauchauffe, M., Berger, R., Larsen, C.J. and Mathieu-Mahul, D.: Two site-specific deletions and t(1;14) translocation restricted to human T-cell acute leukemias disrupt the 5' part of the tal-1 gene. Oncogene 6 (1991) 1477-1488.
Brown, L., Cheng, J.-T., Chen, Q., Siciliano, M.J., Crist, W., Buchanan, G. and Baer, R.: Site-specific recombination of the tal-1 gene is a common occurrence in human T cell leukemia. EMBO J. 9 (1990) 3343-3351.
Cheng, J.-T., Cobb, M.H. and Baer, R.: Phosphorylation of the Tal-1 oncoprotein by the extracellular-signal-regulated protein kinase ERK1. Mol. Cell. Biol. 13 (1993) 801-808.
Elwood, N.J., Cook, W.D., Metcalf, D. and Begley, C.G.: The HLH gene SCL enhances tumorigenicity of a pre-leukemic T-lymphocyte cell line. Oncogene 8 (1993) 3093-3101.
Finger, L.B., Kagan, J., Christopher, G., Kurtzberg, J., Herschfield, M.S., Nowell, P.C. and Croce, C.M.: Involvement of the TCL-5 gene on chromosome 1 in T cell leukemia and melanoma. Proc. Natl. Acad. Sci. USA 86 (1989) 5039-5043.
Fong, T.C. and Emerson, B.M.: The erythroid-specific protein GATA-1 mediates distal enhancer activity through a specialized β-globin TATA box. Genes Dev. 6 (1992) 521-523.
Goldfarb, A.N., Gouel, S., Mickelson, D. and Greenberg, J.M.: T-cell acute lymphoblastic leukemia - the associated gene SCL/tal codes for a 42 kd nuclear phosphoprotein. Blood 11 (1992) 2858-2866.
Green, A.R., DeLuca, E. and Begley, C.G.: Antisense SCL suppresses self-renewal and enhances spontaneous erythroid differentiation of the human leukaemic cell line K562. EMBO J. 10 (1991) 4153-4158.
Kozak, M.: Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44 (1986) 283-293.
Mitchell, P.J. and Tijan, R.: Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245 (1989) 371-378.
Mouthon, M.A., Bernard, O., Mitjavila, M.T., Romeo, P.H., Vainchenker, W. and Mathieu-Mahul, D.: Expression of tal-1 and GATA-binding proteins during human hematopoiesis. Blood 8 (1993) 647-655.
Murre, C., McCaw, P.S. and Baltimore, D.: A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD and Myc proteins. Cell 56 (1989) 777-783.
Tanigawa, T., Elwood, N., Metcalf, D., Cary, D., De Luca, N.A. and Begley, C.G.: The SCL gene product is regulated by and differentially regulates cytokine responses during myeloid leukemic cell differentiation. Proc. Natl. Acad. Sci. USA 90 (1993) 7864-7868.
Visvader, J., Begley, C.G. and Adams, J.M.: Differential expression of the LYL, SCL and E2A helix-loop-helix genes within the hemopoietic system. Oncogene 6 (1991) 195-204.