Characterization of an SRY-like gene, DSox14, from Drosophila


Gene 272 (2001) 121–129

www.elsevier.com/locate/gene

Andrew C. Sparkes, Katherine L. Mumford, Umesh A. Patel, Sarah F. Newbury 1, Colyn Crane-Robinson*

Biophysics Laboratories, Institute of Biomedical and Biomolecular Sciences, University of Portsmouth, St. Michael's Building, White Swan Road, Portsmouth, PO1 2DT, UK

Received 20 February 2001; received in revised form 4 April 2001; accepted 1 June 2001
Received by E. Boncinelli

Abstract

We have characterized the DSox14 gene, a new member of the family of transcription factors related to the mammalian sex determining factor, SRY. It contains two exons and the intron is large for Drosophila at 2.8 kb. The encoded protein consists of 669 amino acids (72 kDa) and includes an HMG box domain, which is closely related to the mouse Sox4 DNA binding domain. Expression of the DSox14 HMG box domain in vitro shows that it binds the sequence AACAAT with a Kd of 190 nM, generating a bend angle of 48.6°. At higher protein concentrations, a second HMG box binds at the recognition sequence, increasing the bend angle by 5°. DSox14 is variably expressed throughout development as three alternative transcripts but not at all during the 1st and 2nd larval instars. The several mRNA transcripts are produced primarily from different transcriptional start sites. Analysis of the expression of DSox14 mRNAs during early development shows that they are maternally contributed at a low level and ubiquitously expressed during embryogenesis. The widespread pattern of expression suggests that DSox14 affects a large number of target genes. © 2001 Published by Elsevier Science B.V. All rights reserved.

Keywords: Sox domain; HMG box; Transcription factor; DNA bending

Abbreviations: BSA, bovine serum albumin; cDNA, complementary DNA; CNS, central nervous system; DTT, dithiothreitol; HMG box, the DNA binding domain from high mobility group proteins 1 and 2; IPTG, isopropyl-β-D-thiogalactoside; kb, kilobase pairs; PCR, polymerase chain reaction; RNaseA, ribonuclease A; rp49, ribosomal protein 49; RT-PCR, reverse transcriptase PCR; SDS-PAGE, sodium dodecyl sulphate polyacrylamide gel electrophoresis; SOX, Sry-related HMG bOX; UTR, untranslated region

* Corresponding author. Tel.: +44-23-92842055; fax: +44-23-92842053. E-mail address: [email protected] (C. Crane-Robinson).
1 Present address: Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK.

0378-1119/01/$ - see front matter © 2001 Published by Elsevier Science B.V. All rights reserved. PII: S0378-1119(01)00557-1

1. Introduction

Sox proteins form a large family of transcription factors that possess DNA binding domains closely related to that of SRY, the mammalian sex determining factor (Pevny and Lovell-Badge, 1997; Wegner, 1999). The HMG domain encoded by Sry and Sox genes binds DNA at AACAAT sites, or related sequences (Denny et al., 1992b; Harley et al., 1994), via contacts in the minor groove and induces bends of 73–90° (Ferrari et al., 1992; Connor et al., 1994). This DNA bending property has led to the suggestion that Sox proteins act as architectural transcription factors by modulating chromatin structure around transcriptional regulatory elements (Ferrari et al., 1992; Giese et al., 1992; Prior and Walter, 1996; Pevny and Lovell-Badge, 1997). In addition, a number of Sox proteins can act as classical transcription factors since they contain transactivation domains that act on downstream reporter genes (van de Wetering et al., 1993; Hosking et al., 1995; Wotton et al., 1995; Sudbeck et al., 1996). However, other proteins containing sequence-specific HMG box domains appear to modulate transcription by binding other proteins (Yuan et al., 1995; Zappavigna et al., 1996). In Drosophila, for example, dTCF (pangolin) has been shown to play a role in the Wingless/Wnt pathway by binding to Armadillo (Drosophila β-catenin). Transcription of target genes such as Ultrabithorax is blocked upon interaction of this complex with the transcriptional repressors Groucho and dCBP (Drosophila CREB binding protein) (Cavallo et al., 1997; Nollet et al., 1999).

Sox proteins have been shown to be important in a number of developmental processes, including sex determination, limb and eye formation and nervous system organization (Goodfellow and Lovell-Badge, 1993; Kamachi et al., 1995; Uwanogho et al., 1995; Kent et al., 1996). The expression pattern of many Sox family members throughout development also appears to correlate with early cell fate decisions (Wagner et al., 1994; Pevny and Lovell-Badge, 1997). Many Sox genes in mice are expressed in overlapping patterns, suggesting possible redundancy in the Sox family. For example, in the developing CNS of the mouse and chicken, Sox1–4 are co-expressed at high levels but are rapidly down-regulated upon differentiation of neural precursor cells. Mice homozygous for a targeted disruption of Sox4 display severe cardiac malformations and lack B lymphocytes, but the CNS is unaffected. This may be because Sox1–3 can compensate for Sox4 in the CNS (van de Wetering et al., 1993; Collignon et al., 1996; Schilham et al., 1996). Drosophila Sox Neuro (SoxN) is closely related to mammalian Sox1–3 and is also expressed in the developing CNS; in this case the similarity between the fly and mammalian genes extends beyond the HMG box domain (Cremazy et al., 2000). Drosophila Dichaete is essential for embryonic development and nervous system organization: mutant phenotypes are variable and this may be the result of tissue/cell-specific interactions with other transcription factors or Sox proteins (Nambu and Nambu, 1996; Russell et al., 1996; Soriano and Russell, 1998; Mukherjee et al., 2000). Drosophila Sox100B expression is prominent in the developing gut, Malpighian tubules and gonad, tissues in which the closely related vertebrate Sox9 and Sox10 are also expressed. For Sox100B the close relationship to the vertebrate homologues is restricted to the HMG box (Hui Yong Loh and Russell, 2000).

Sox genes are typically involved in critical developmental pathways and, to better understand their role in cellular processes, we have characterized a further Sox gene from Drosophila. We report that DSox14 (DSox60B) has a complex temporal pattern of expression and is variably expressed throughout development as three alternative transcripts. When expressed in vitro, the HMG box domain of DSox14 binds to and bends DNA in a similar manner to other HMG box proteins. The widespread expression of DSox14 suggests that it may be involved in modulating a range of target genes.

2. Materials and methods

2.1. cDNA and genomic cloning

A 204 bp fragment amplified from a 4–8 h Drosophila embryonic cDNA phage library (Denny et al., 1992a) was used to probe a 4–8 h Drosophila plasmid library. A total of 160,000 colonies were screened and four positives were checked by sequencing. The single cDNA clone that contained sequences identical to the original DSox14 amplicon was fully sequenced on both strands. The complete cDNA sequence was assembled using information from this cDNA clone, the genomic sequence, and the sequence of a cDNA kindly provided by Dr Christine Rushlow (CR). Sequence data were analyzed using the MacVector programme and multiple sequence alignments were generated using ClustalW (http://dot.imgen.bcm.tmc.edu:9332/multialign/multi-align.html).

For genomic cloning, a Drosophila Oregon R phage library in EMBL3 was screened with a 1 kb fragment from the CR DSox14 cDNA. Two non-overlapping clones of 2.7 and 3.0 kb were sequenced and the gap of ~300 bp was bridged by PCR amplification from genomic DNA. The most downstream part of the gene was obtained from cosmid 51D11 (European Drosophila project), known to be located in the region of 60A on chromosome 2 (our own in situ hybridization experiments using a cDNA probe had indicated 60A8-B8 as the location of the gene).

2.2. Northern blotting analysis

Total RNA was prepared from ten different life stages of Oregon R wild-type Drosophila and from 0–20 h embryos using standard techniques (Sambrook et al., 1989). PolyA+ mRNA was then purified from total RNA using oligo(dT)-cellulose spin columns (Pharmacia) according to the manufacturer's instructions. The mRNA was separated on agarose-formaldehyde gels, transferred to nylon membranes (Amersham) and probed with random primed DSox14 cDNA with minor modifications to standard techniques (Sambrook et al., 1989). Four different cDNA probes were used with the developmental Northern: a 1.8 kb NotI/AgeI fragment which includes most of the cDNA sequence; a 400 bp HincII/XbaI fragment of sequences 5′ to the HMG box; a 1.1 kb BamHI fragment; and a 1.2 kb SmaI/AgeI fragment. All gave identical results. The probe used for the 0–20 h embryo Northern was an 870 bp BamHI genomic fragment that included the HMG box and downstream sequences. The control rp49 probe was a 253 bp EcoRI/HindIII fragment coding for ribosomal protein 49 released from p720 as described previously (Myers et al., 1995). Three independent Northern blots were prepared and all gave identical results.

2.3. RT-PCR and primer extension analysis

Nested RT-PCR using Superscript II reverse transcriptase (Life Technologies) was performed according to the manufacturer's instructions using 2 μg of total 0–20 h embryonic RNA. The first round of RT-PCR was carried out in the presence of 30 pmol of primers ACS10 (5′-TCTTGCGCCGCTCCATCTGGC-3′) and KLM1 (5′-GCGTCGTCGCCTTCGCCAGC-3′). The second round used ACS10 and KLM7 (5′-GCCTGGTGTTCGGATCTGCACGG-3′). The resulting 150 bp fragment was cloned and checked by sequencing.

Primer extension was performed by 5′ end-labelling of the appropriate primer by standard techniques. The primer (2 pmol) was then added to 20 μg of total Drosophila RNA in 80% formamide, 20 mM Tris–HCl (pH 7.5), 400 mM NaCl, and 1 mM EDTA and incubated at 85°C for 10 min, and then at 30°C for 12 h. The RNA/primer complex was ethanol-precipitated and then resuspended in 10 mM DTT, 75 mM KCl, 50 mM Tris–HCl (pH 8.3), 3 mM MgCl2, 250 μM dNTPs, 2 mg/ml actinomycin D, and 30 units RNasin ribonuclease inhibitor (Pharmacia), together with 400 units of Superscript II (Life Technologies). The reaction was terminated by the addition of 10 μl of 20 mM EDTA and 10 μg of sonicated salmon sperm DNA. RNA was digested by the addition of 1 mg/ml RNaseA and incubation at 37°C for 15 min. The products were phenol/chloroform-extracted, ethanol-precipitated and visualized on 8% polyacrylamide, 7 M urea sequencing gels with sizing markers alongside. The primers used in the extension experiments were: KLM20, 5′-TAGCCGGACCAGTGGCAGT-3′; KLM21, 5′-TTGCTTTAAGTGTGTTGAT-3′; and KLM23, 5′-CGAGCGAATAAACTACGCAA-3′.

2.4. In situ hybridization to whole-mount embryos

Hybridizations were performed on wild-type (Oregon R) Drosophila embryos using digoxigenin-labelled (DIG) antisense RNA probes as described previously (Myers et al., 1995). Antisense DSox14 probes were generated by linearizing plasmids containing various DSox14 cDNA fragments and transcribing antisense RNAs with T7 polymerase. Embryos were stained and mounted in JB-4 methacrylate (Polysciences). The antisense RNA probes were transcribed from the following DSox14 cDNA sequences: probe 5, the 1.4 kb AgeI/PstI fragment; probe 6, the 1.2 kb SmaI/AgeI fragment; and probe 7, the 0.4 kb BamHI/HincII fragment. These three different antisense probes all gave identical results. The sense probe was transcribed using SP6 polymerase from the 1.4 kb AgeI/PstI cDNA fragment.

2.5. Construction of DSox14 HMG box expression plasmids and protein purification

The DSox14 HMG box, encoding residues 178–265 (88 amino acids) of the DSox14 protein, was amplified by PCR from the cDNA clone with the primers BOX1 (5′-GCGGGCGGATCCACCAAGAAACATTCGCCCGGCC-3′) and BOX2 (5′-GCCGGCGAATTCGGAGCGCGTCTGCTTCTTTTGCG-3′). After amplification, the product was restricted with BamHI and EcoRI and inserted into pGEX2T, previously linearized with BamHI and EcoRI.
The ligation mix was used to transform Escherichia coli HB101 and the resulting plasmids were checked by sequencing. Escherichia coli HB101 containing this plasmid was grown in LB broth, expression was induced with IPTG and the fusion protein was prepared and purified as described previously for other HMG boxes (Read et al., 1994).

2.6. Band shift assays

Protein concentrations were determined from their UV absorbance at 280 nm, using a molar extinction coefficient calculated on the basis of four tyrosines and two tryptophans (ε = 16,500 M⁻¹ cm⁻¹). A 27 bp duplex DNA containing a Sox binding site (AACAAT) was prepared by annealing the oligonucleotides ACS-25 (5′-CTAGCACTATAACAATACAAGCCGGCC-3′) and ACS-26 (5′-GGCCGGCTTGTATTGTTATAGTGCTAG-3′). This duplex was then 5′ end-labelled using T4 polynucleotide kinase and [γ-32P]ATP. Labelled duplex DNA was purified by gel electrophoresis and DNA duplex concentrations were determined from their UV absorbance at 260 nm. Labelled duplex DNA (50 nM) was mixed with varying concentrations (50 nM to 1 μM) of DSox14 HMG box protein in a buffer containing 5 mM HEPES (pH 7.5), 30 mM KCl, 4% Ficoll, 0.05 mM PMSF, 1 mM MgCl2, 0.5 mg/ml BSA and 1 mM DTT (10 μl final volume). Binding reactions were incubated for 30 min on ice and then electrophoresed on non-denaturing 8% polyacrylamide gels (19:1 acrylamide/bis) in 0.25× TBE buffer at 150 V for 3 h at 4°C. Gels were then fixed, dried and autoradiographed at −80°C with an intensifying screen.

2.7. Circular permutation assay for DNA bending

The duplex obtained by annealing oligonucleotides ACS-25 and ACS-26 was cloned into the HpaI site of pBend4 (Zweib and Adhya, 1994; Read et al., 1994; Lnenicek-Allen et al., 1996). Circularly permuted DNA fragments were isolated by restriction of the resulting plasmid pB4S52, purified by gel electrophoresis and end-labelled. DNA bending assays were performed as described above for band shift assays, using 500 pM DNA and 100 and 150 nM HMG box protein, except that poly(dI.dC) competitor at 1 ng/ml was added to the reaction mixture.

3. Results and discussion

3.1. Cloning of Drosophila Sox14

The HMG domain is well conserved among Sox proteins, although surrounding sequences are frequently highly diverged (Soullier et al., 1999). To isolate a full-length cDNA encoding DSox14, a 204 bp PCR amplicon (kindly provided by Alan Ashworth), derived from a 4–8 h embryonic cDNA library using degenerate primers homologous to the conserved ends of vertebrate SRY-like HMG boxes (Denny et al., 1992a), was used to screen a similar 4–8 h embryonic Drosophila cDNA library. Four different cDNA clones were isolated and one of these was found to contain sequences identical to the probe.
Sequencing this 3 kb clone (ACS cDNA) showed it to contain the complete HMG box and 2 kb of 3′ in-frame sequence up to a polyA tail. However, just 5′ of the box the frame was lost. A second 1.3 kb cDNA clone (CR cDNA), kindly provided by Dr Christine Rushlow, was sequenced and found to be identical to the ACS cDNA from the HMG box to the polyA tail but differed 5′ of the box; moreover, this 5′ sequence was in-frame with the remainder of the clone. We concluded that the 5′ part of the ACS cDNA clone was probably intronic, i.e. it was a partially processed product. Furthermore, the breakpoint between the clones corresponded to an acceptor splice site. Genomic sequencing later confirmed that this was indeed the case (see below).


Fig. 1. Position of the DSox14 gene within the sequenced segment of 6790 bp. The 5′ UTR is shown from position 667, corresponding to the 3.3 kb mRNA. The translational start ATG is at position 1668 and the 2.8 kb intron starts at position 2202 and finishes at position 5000. The HMG box domain starts six amino acids into the second exon and continues to position 5269. The TGA stop codon is at position 6474 and the polyA addition site is at position 6634. The 3′ UTR is thus 158 bp in length. A sequence comparison is shown of the Sox14 HMG box with mouse Sox4 (X70298), human Sox22 (U35612) and human SOX11 (AB028641). Identical amino acids are marked with an asterisk. The protein segment used in the DNA bending and binding experiments is underlined.
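The coordinates quoted in the Fig. 1 legend can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original analysis. It assumes that the quoted positions are 1-based and inclusive, that position 6474 is the first base of the TGA codon, and that position 667 is the first transcribed base of the 3.3 kb mRNA. All variable and function names are ours.

```python
# Coordinates quoted in the Fig. 1 legend (assumed 1-based, inclusive).
UTR5_START = 667      # start of the 5' UTR of the 3.3 kb mRNA
ATG = 1668            # first base of the start codon
INTRON_START = 2202   # first base of the intron
INTRON_END = 5000     # last base of the intron
STOP = 6474           # first base of the TGA stop codon (assumption)
POLYA_SITE = 6634     # polyA addition site


def span(first: int, last: int) -> int:
    """Length of an inclusive 1-based interval."""
    return last - first + 1


# Intron length in bp.
intron_len = span(INTRON_START, INTRON_END)

# 3' UTR: from the base after the stop codon to the polyA addition site.
utr3_len = span(STOP + 3, POLYA_SITE)

# Coding nucleotides in both exons, stop codon included.
coding_nt = span(ATG, INTRON_START - 1) + span(INTRON_END + 1, STOP + 2)

# Number of residues: total codons minus the stop codon.
protein_len = coding_nt // 3 - 1

# Spliced length from the 5' UTR start to the polyA site (no polyA tail).
spliced_len = span(UTR5_START, POLYA_SITE) - intron_len

print(f"intron: {intron_len} bp")
print(f"3' UTR: {utr3_len} bp")
print(f"protein: {protein_len} residues")
print(f"spliced transcript without polyA tail: {spliced_len} nt")
```

Under these assumptions the intron is 2799 bp, the 3′ UTR is 158 bp as stated in the legend, the open reading frame encodes 669 residues, and the longer transcript is about 3.17 kb before polyadenylation, close to the 3167 nt obtained by primer extension.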

Since the CR cDNA had an open frame right to its 5′ terminus it was possibly incomplete. We therefore cloned the DSox14 gene from an Oregon R Drosophila genomic library made in λ EMBL3 by Kim Kaiser and Steve Russell. A total of 6790 bp of sequence was obtained, showing the presence of a 2.8 kb intron and continuation of the open reading frame upstream of the 5′ end of the CR cDNA up to an ATG in a context conforming well to a Drosophila consensus Kozak sequence. Assuming this to be the translational start, the DSox14 protein contains 669 amino acids (MWt 72.7 kDa). Our genomic DNA sequence is in excellent accord with that in FlyBase at FBgn0005612; however, the translation given in FlyBase prematurely stops at S530 rather than continuing to the correct C-terminal M669. The correct nucleotide sequence and its translation have been submitted to FlyBase.

Comparison of the Drosophila Sox14 HMG box with that of other Sox proteins shows that it is most similar to mouse Sox4 and to human Sox11 and 22 (Fig. 1) (Soullier et al., 1999) and more similar to them than to any of the other nine Sox proteins in the Drosophila database. Within the HMG box, DSox14 shows 76% identity with mouse Sox4 but outside the DNA binding domain there is no significant similarity between these proteins.

3.2. Expression of DSox14 during the Drosophila life cycle

To determine the expression of DSox14 through the Drosophila life cycle, polyA+ mRNA was isolated at different developmental stages and analyzed for the expression of DSox14 transcripts by Northern blotting. Four probes from different regions of the DSox14 gene were used and all gave identical results (Fig. 2B and data not shown). DSox14 mRNA is expressed as three transcripts of 3.6, 3.3 and 2.5 kb. In embryos, the 3.6 kb transcript is barely visible and only the 3.3 and 2.5 kb transcripts are clearly seen. This was verified in a separate experiment using 0–20 h embryos (Fig. 2A). The 3.6 kb transcript becomes clearly visible only in late pupae and adult flies. The 2.5 kb transcript is expressed in embryonic stages, in early pupae and variably thereafter. The expression levels of DSox14 transcripts were low compared to the control transcript rp49. There is a striking lack of expression in 1st and 2nd instar larvae. To verify this, use was made of commercial 96-well plates containing first-strand cDNA from various stages of Drosophila development (Origene, Rockville, MD) that were probed by PCR using a forward primer close to the ATG of the DSox14 gene and a reverse primer within the HMG box (amplicon size 613 bp). This also showed no evidence of transcripts in 1st and 2nd instar larvae, maximal expression in pupae and significant expression between 8 and 24 h in embryos (data not shown). This approach does not, of course, distinguish between the different sized transcripts.

Fig. 2. Expression of DSox14 mRNA. (A) Northern blot of mRNA from 0–20 h embryos probed with a genomic fragment consisting of the HMG box plus 450 bp of downstream sequence. (B) Expression of DSox14 mRNA through the Drosophila life cycle. Northern blot of mRNAs probed with a 1.8 kb cDNA fragment of DSox14 that encompasses most of the coding sequence and with a 300 bp fragment from the ribosomal protein (rp49) gene. rp49 is known to be expressed at constant levels throughout development and is used as a loading control. Developmental stages are: (1) 0–4 h embryos; (2) 4–8 h embryos; (3) 8–24 h embryos; (4) 1st instar larvae; (5) 2nd instar larvae; (6) 3rd instar larvae; (7) early pupae; (8) late pupae; (9) adult males; (10) adult females.

The three DSox14 transcripts could arise from alternative splicing events, or from different start and/or termination sites. In order to determine whether the putative 2.8 kb intron in fact included additional exons, nested RT-PCR was used with primers in the coding sequences close to the ends of the 2.8 kb intron, using total RNA extracted from 0–20 h embryos as a template. Fig. 3A shows that a single product of 150 bp was produced, the expected length if there are no exons within the 2.8 kb region; sequencing confirmed its correct identity. Although we cannot rule out the possibility that there are alternative polyA addition sites, this seems unlikely since the sequences of the 3′ UTRs of the two cDNA clones were identical and an AATAAA polyA addition signal is located 16 bp upstream of the polyA tail (Colgan and Manley, 1997).

To establish whether the mRNA transcripts observed resulted from alternative start sites, a number of primer extension experiments were performed using three different primers (see Section 2) with RNA from 0–20 h embryos. The primer furthest upstream of the open reading frame, KLM23, gave no products, indicating that transcription starts within 1190 nt of the translation start. The KLM21 primer, located 674 nt upstream of the translation start, detected one major product corresponding to a transcriptional start point 3167 nt from the polyA addition site (Fig. 3). Assuming a polyA tail of 200 nt (Graber et al., 1999), this matches the observed size of 3.3 kb detected on the Northern blots. The primer KLM20, located 101 nt upstream of the translation start, detected a product corresponding to a transcriptional start point 2483 nt upstream of the polyA addition site (Fig. 3). This matches (within 10 nt) the start site of the EST clone LD30105. Again, if we assume a polyA tail of 200 nt, this matches the 2.5 kb band observed on the Northern blots. The 3.6 kb transcript cannot be accounted for by the primer extension data but is in any case very weak in embryos.

Fig. 3. (A) Nested RT-PCR to screen for possible exon sequences within the 2.8 kb intron. Total RNA from Drosophila embryos was reverse-transcribed and then amplified using primer ACS10 (located 82 bp downstream of the 3′ border of the intron) and primer KLM1 (located 350 bp upstream of the 5′ border of the intron). A second round of amplification used ACS10 with primer KLM7 (located 68 bp upstream of the 5′ border of the intron). Only a 150 bp product was observed after the second round. (B) Primer extensions to determine transcriptional start points. Autoradiographs of extension reactions from primer KLM20 (extension product of 218 nt) and from primer KLM21 (extension product of 327 nt). (C) Positions of the primers KLM20 and KLM21 relative to the translational ATG start codon.

Analysis of the genomic sequence in FlyBase showed that the polyA addition site of our DSox14 cDNA was only 44 bp away from the 3′ end of the 3′ UTR of the PHM (U7743) gene, which is orientated in the opposite direction to DSox14. The protein product of the PHM gene (peptidylglycine-α-amidating mono-oxygenase complex) is involved in the production of neuropeptides. The 3.6 kb transcript might therefore produce mRNA that is antisense to the PHM mRNA. Similar convergent transcripts have been found at other sites in the Drosophila genome (e.g. Spencer et al., 1986) and it is possible that these convergent transcripts are co-ordinately regulated.

3.3. Spatial and temporal expression of DSox14

To examine the spatial and temporal pattern of DSox14 expression through embryogenesis, the distribution of transcripts was analyzed by in situ hybridization with four different antisense RNA probes and one negative control sense RNA probe, all five labelled with digoxigenin. The four different antisense probes all gave identical results. DSox14 is expressed widely at low levels throughout embryogenesis (Fig. 4A–D) and the sense probe (Fig. 4E) showed that this widespread low level staining with the antisense probes was genuine and not a general background staining. The ubiquitous early low level expression of DSox14 indicates that there is a maternal contribution. DSox14 is expressed at low levels throughout the germ band and ubiquitously throughout the rest of embryonic development.

In order to follow developmental changes in the DSox14 protein itself, a 41 kDa peptide from the C-terminal region was expressed in E. coli and used to generate antisera in rabbits. These recognized the expected band of about 72 kDa in Western blots and it was found that the DSox14 protein was present in the embryo and 3rd instar larval stages but not in 1st instar larvae, in agreement with the observations made by Northern analysis in Fig. 2 (data not shown).

Fig. 4. Spatial and temporal distribution of DSox14 mRNA in wild-type embryos. Whole-mount in situ hybridization was carried out using digoxigenin-labelled antisense probes. (A) Embryo at stage 3. (B) Embryo at syncytial blastoderm, showing expression throughout the embryo. (C) Embryo at germ band extension, stage 11, showing abundant DSox14 transcripts. (D) Embryo at germ band retraction (stage 13), showing low and ubiquitous expression of DSox14 throughout the embryo. (E) Embryo at syncytial/cellular blastoderm, i.e. of an age close to (B), probed with a DSox14 sense probe.

3.4. Analysis of the DNA binding and bending activities of the DSox14 HMG box

HMG boxes have been shown to bind AT-rich sequences and bend the DNA to angles of between 30 and 130° (Ferrari et al., 1992; Giese et al., 1992; Connor et al., 1994; Read et al., 1994). To compare the binding and bending properties of the HMG box from DSox14 with other HMG boxes, we expressed and purified it for analysis of its interactions with DNA. Comparison with other Sox proteins, especially with mouse Sox4, led to the selection of a region of 87 amino acids that included all the residues conserved between the HMG boxes of DSox14 and other Sox proteins (Fig. 1). The DNA encoding the selected region was amplified by PCR and cloned into pGEX2T. The GST fusion protein was expressed, the GST was removed with thrombin and the HMG box peptide was purified using previously described methods (Read et al., 1994). The purified HMG box domain migrated as a single band with the expected mobility in SDS-PAGE and acetic acid/urea gels (data not shown).

Band shift assays were used to monitor the binding of the DSox14 HMG box to DNA using a 27 bp duplex containing the recognition site AACAAT: this is the recognition sequence determined by site selection experiments with the HMG box from the mSox-5 and human SRY proteins (Denny et al., 1992b; Harley et al., 1994). Fig. 5 shows that the DSox14 HMG box binds to this DNA fragment in a concentration-dependent manner, although at high concentrations (where no free DNA remains) a 'supershifted' complex is formed. The relative proportions of free and complexed DNA were measured for the first eight protein concentrations using a Phosphorimager system and fitted to a 1:1 binding equation, yielding a dissociation constant of 190 nM. This value is somewhat larger than that measured for the HMG box of mouse Sox5 at the same temperature (~35 nM; Privalov et al., 1999). This somewhat reduced affinity may arise because the target site was not optimal or because additional residues N- or C-terminal to the minimum HMG box domain need to be included to achieve a higher affinity.

To determine whether the DSox14 HMG box is able to bend DNA, a circular permutation assay was performed.
The plasmid pB4552, including the Sox recognition sequence AACAAT, was restricted to give seven DNA fragments of 149 bp having this recognition site at different positions along the fragment and each was then end-labeled (Read et al., 1994; Privalov et al., 1999). The products of seven binding reactions, each containing 100 nM protein, were electrophoresed on an 8% polyacrylamide gel (Fig.

Fig. 5. Band shift assays of the binding of the DSox14 Sox HMG box to a 27 bp duplex containing the Sox binding site AACAAT. All reactions contained 50 nM DNA and protein concentrations of (1) 0 nM, (2) 50 nM, (3) 100 nM, (4) 150 nM, (5) 200 nM, (6) 250 nM, (7) 300 nM, (8) 350 nM, (9) 400 nM, (10) 450 nM, (11) 500 nM, and (12) 1 mM. Products were visualized on an 8% polyacrylamide, 0.25 £ TBE gel electrophoresed at 150 V for 4 h. The initial complex forms with a Ka of 5.26 £ 10 6 M 21 (K d ˆ 190 nM).

127

6A). The relative mobilities of the shifted bands were plotted against their ¯exure displacement (position of the binding site relative to the end of the fragment) and a parabola centred at the recognition sequence (a ¯exure displacement of 0.5) was observed (Fig. 6B). The bend angle derived using the algorithm of Ferrari et al. (1992) was 48.68. Circular permutation assays carried out at a higher protein concentration (150 nM) gave rise to additional supershifted bands (Fig. 6C) due to the binding of additional molecules of protein. Since the 149 bp target duplex is much longer than a typical HMG box footprint (14±16 bp), this could be due to the HMG box binding to other sites of lower af®nity and inducing additional bends. The relative mobilities of the upper supershifted bands were therefore plotted against ¯exure displacement (Fig. 6D): the minimum of the parabola was at the same position as for the lower shifted bands and the bend angle was calculated to be 548. This indicates that the second protein molecule binds at the same position in the 149 bp duplex as the ®rst protein, rather than at a second site elsewhere, and makes only a small difference to the bend angle generated. We conclude that the second protein molecule binds directly to the ®rst by protein/protein interactions. Piggy-backing of a second HMG box onto one already bound to DNA explains the supershifted band observed in the band shift experiment of Fig. 5 that used a DNA duplex of only 27 bp, a length insuf®cient to accommodate two HMG boxes side by side. These data demonstrate that the DSox14 HMG box can bind and bend DNA. It is important to note, however, that the exact bend angle in vivo may differ from that observed here due to additional protein contacts made to ¯anking parts of the protein, for example as shown for LEF-1 (Lnenicek-Allen et al., 1996), and/or the presence of other protein factors. 
Furthermore, since the cellular targets for DSox14 are not yet known, the precise in vivo DNA recognition sequence may not have been used in the bending assay, and this could also influence the bend angle generated.

4. Conclusions

1. We have identified and sequenced the Drosophila Sox14 gene, which encodes a protein containing an Sry-like HMG box domain.
2. Analyses of cDNA clones indicate that the gene contains two exons separated by a 2.8 kb intron. The resulting protein consists of 691 amino acids with a molecular weight of 72 kDa.
3. DSox14 mRNA is expressed in a complex pattern throughout the Drosophila life cycle and is ubiquitously expressed during embryonic development. It is absent during 1st and 2nd instar larval development. This widespread pattern of expression suggests that DSox14 may affect a large number of target genes.
4. The DSox14 mRNA is expressed as three different transcripts of 3.6, 3.3 and 2.5 kb, due primarily to variations in the transcriptional start site, rather than alternative splicing or differences in termination sites.
5. The poly(A) addition site of the DSox14 cDNA is only 44 bp from the 3′ UTR of the convergent PHM1 gene.
6. The HMG box of the DSox14 protein binds the sequence AACAAT with a Kd of 190 nM, generating a bend angle of 48.6°. Both the binding and bending analyses indicate that two HMG boxes can piggy-back at the DNA recognition sequence.

Fig. 6. Circular permutation analysis of the DNA bending induced by the DSox14 HMG box domain. Plasmid pB4S52 containing the binding site used in the DNA band shift experiment (Fig. 5) was restricted to give seven 149 bp fragments, each with the binding site in a different position relative to one end (the flexure displacement). (A) The result of a circular permutation assay obtained using 100 nM protein, ~500 pM DNA, 1× binding buffer, 500 µg/ml of BSA and poly(dI.dC).poly(dI.dC) duplex competitor DNA at 0.1 ng/ml. The gel is an 8% polyacrylamide 0.25× TBE native gel electrophoresed in 0.25× TBE running buffer at 150 V at 4 °C. (B) The graph plotted with the data from the gel in (A). The parabola equation which fits these results is $y = 0.6112x^2 - 0.6496x + 0.9302$, $R^2 = 0.9813$. (C) The result of a circular permutation assay obtained using identical reaction conditions to (A) except that 150 nM protein was used. The two complexes obtained are shown. (D) The graph plotted with the data from the gel in (C). The parabola equation which fits these results is $y = 0.69192x^2 - 0.6496x + 0.9302$, $R^2 = 0.9813$.
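The Fig. 6 caption reports only the fitted parabola, and this section does not state how that fit is converted into a bend angle. The following is a minimal sketch of one way the conversion might be done. It assumes the relation often used in circular permutation analysis, a = −b = 2c(1 − cos α) for a fit of the form y = ax² + bx + c, and it assumes that the two estimates (from a and from b) are averaged. The function name `bend_angle_deg` is hypothetical and is not taken from the paper.

```python
import math


def bend_angle_deg(coeff: float, c: float) -> float:
    """Bend angle in degrees from one parabola coefficient.

    Assumes |coeff| = 2c(1 - cos(alpha)); this relation is an assumption,
    not something stated in the text above.
    """
    return math.degrees(math.acos(1.0 - abs(coeff) / (2.0 * c)))


# Coefficients of the panel (B) fit, y = a*x^2 + b*x + c, as given in the caption.
a, b, c = 0.6112, -0.6496, 0.9302

alpha_from_a = bend_angle_deg(a, c)  # estimate from the quadratic term
alpha_from_b = bend_angle_deg(b, c)  # estimate from the linear term
alpha_mean = (alpha_from_a + alpha_from_b) / 2.0  # averaging is an assumption

print(f"from a: {alpha_from_a:.1f} deg")
print(f"from b: {alpha_from_b:.1f} deg")
print(f"mean:   {alpha_mean:.1f} deg")
```

Under these assumptions the panel (B) coefficients give a mean close to the 48.6° bend angle quoted in the conclusions, which suggests, but does not prove, that a calculation of this kind underlies the reported value.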

Acknowledgements

We are grateful to Dr N. Brown (Wellcome CRC Institute, Cambridge, UK) for the Drosophila cDNA libraries, Dr S. Russell (University of Cambridge, UK) for the Drosophila genomic libraries, Dr C. Rushlow (University of Koln, Germany) for the CR cDNA clone and Dr A. Ashworth (Institute of Cancer Research, London, UK) for the DSox14 PCR fragment. We would also like to thank Dr C. Read for technical advice. A.C.S. acknowledges the award of a Wellcome Trust Prize Studentship and S.F.N. acknowledges the support of a Royal Society University Research Fellowship.

References Cavallo, R., Rubenstein, D., Pfeifer, M., 1997. Armadillo and dTCF: a marriage made in the nucleus. Curr. Opin. Genet. Dev. 7, 459±466. Colgan, D.F., Manley, J.L., 1997. Mechanism and regulation of mRNA polyadenylation. Genes Dev. 11, 2755±2766. Collignon, J., Sockanathan, S., Hacker, A., Cohen-Tannoudji, M., Norris, D., Rastan, S., Stevanovic, M., Goodfellow, P.N., Lovell-Badge, R., 1996. A comparison of the properties of Sox-3 with Sry and two related genes, Sox-1 and Sox-2. Development 122, 509±520. Connor, F., Cary, P.D., Read, C.M., Preston, N.S., Driscoll, P.C., Denny, P., Crane-Robinson, C., Ashworth, A., 1994. DNA binding and bending properties of the post-meiotically expressed Sry-related protein Sox-5 0 . Nucleic Acids Res. 22, 3339±3346. Cremazy, F., Berta, P., Girard, F., 2000. Sox neuro, a new drosophila sox gene expressed in the developing central nervous system. Mech. Dev. 93, 215±219. Denny, P., Swift, S., Brand, N., Dabhade, N., Barton, P., Ashworth, A., 1992a. A conserved family of genes related to the testis determining gene, SRY. Nucleic Acids Res. 20, 2887. Denny, P., Swift, S., Connor, F., Ashworth, A., 1992b. An SRY-related gene expressed during spermatogenesis in the mouse encodes a sequence-speci®c DNA-binding protein. EMBO J. 11, 3705±3712. Ferrari, S., Harley, V.R., Pontiggia, A., Goodfellow, P.N., Lovell-Badge, R., Bianchi, M.E., 1992. Sry, like HMG1, recognizes sharp angles in DNA. EMBO J. 11, 4497±4506. Giese, K., Cox, J., Grosschedl, R., 1992. The HMG domain of lymphoid enhancer factor-1 bends DNA and facilitates assembly of functional nucleoprotein structures. Cell 69, 185±195. Goodfellow, P.N., Lovell-Badge, R., 1993. SRY and sex determination in mammals. Annu. Rev. Genet. 27, 71±92. Graber, J.H., Cantor, C.R., Mohr, S.C., Smith, T.F., 1999. In silico detection of control signals: mRNA 3 0 -end-processing sequences in diverse species. Proc. Natl. Acad. Sci. USA 96, 14055±14060. 
Harley, V.R., Lovell-Badge, R., Goodfellow, P.N., 1994. De®nition of a DNA consensus binding site for SRY. Nucleic Acids Res. 22, 453±456. Hosking, B.M., Muscat, G.E.O., Koopman, P.A., Dowhan, D.H., Dunn, T.L., 1995. Trans-activation and DNA-binding properties of the transcription factor, Sox-18. Nucleic Acids Res. 23, 2626±2628. Hui Yong Loh, S., Russell, S., 2000. A Drosophila group E sox gene is dynamically expressed in the embryonic alimentary canal. Mech. Dev. 93, 185±188. Kamachi, Y., Sockanathan, S., Liu, Q., Brieman, M., Lovell-Badge, R., Kondoh, H., 1995. Involvement of SOX proteins in lens-speci®c activation of crystallin genes. EMBO J. 14, 3510±3519. Kent, J., Theatley, S.C., Andrews, J.E., Sinclair, A.H., Koopman, P.A., 1996. A male speci®c role for SOX9 in vertebrate sex determination. Development 122, 2813±2822. Lnenicek-Allen, M., Read, C.M., Crane-Robinson, C., 1996. The DNA bend angle and binding af®nity of an HMG box increased by the presence of short terminal arms. Nucleic Acids Res. 24, 1047±1051. Mukherjee, A., Shan, X., Mutsuddi, M., Ma, Y., Nambu, J.R., 2000. The Drosophila Sox gene, ®sh-hook, is required for postembryonic development. Dev. Biol. 217, 91±106. Myers, F.A., Francis-Lang, H., Newbury, S.F., 1995. Degradation of maternal string mRNA is controlled by proteins encoded on maternally contributed transcripts. Mech. Dev. 51, 217±226. Nambu, P.A., Nambu, J.R., 1996. The Drosophila ®sh-hook gene encodes a HMG domain protein essential for segmentation and CNS development. Development 122, 3467±3475.

129

Nollet, F., Berx, G., van Roy, F., 1999. The role of the E-Cadherin/Catenin adhesion complex in the development and progression of cancer. Mol. Cell Biol. Res. Commun. 2, 77±85. Pevny, L.H., Lovell-Badge, R., 1997. Sox genes ®nd their feet. Curr. Opin. Genet. Dev. 7, 338±344. Prior, H.M., Walter, M.A., 1996. SOX genes: architects of development. Mol. Med. 2, 405±412. Privalov, P.L., Jelesarov, I., Read, C.M., Dragan, A., Crane-Robinson, C., 1999. The energetics of HMG box interactions with DNA: thermodynamics of the DNA binding of the HMG box from mouse sox-5. J. Mol. Biol. 294, 997±1013. Read, C.M., Cary, P.D., Preston, N.S., Lnenicek-Allen, M., Crane-Robinson, C., 1994. The DNA sequence speci®city of HMG boxes lies in the minor wing of the structure. EMBO J. 13, 5639±5646. Russell, S.R.H., Sanchez-Soriano, N., Wright, C.R., Ashburner, M., 1996. The dichaete gene of Drosophila melanogaster encodes a SOX-domain protein required for embryonic segmentation. Development 122, 3669± 3676. Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual, 2nd Edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. Schilham, M.W., Oosterwegel, M.A., Moerer, P., Ya, J., de Boer, P.A.J., van de Wetering, M., Verbeek, S., Lamers, W.H., Kruisbeej, A.M., Cumano, A., Clevers, H., 1996. Defects in cardiac out¯ow tract formation and pro-B-lymphocyte expansion in mice lacking Sox-4. Nature 380, 711±714. Soriano, N.S., Russell, S., 1998. The Drosophila SOX-domain protein dichaete is required for the development of the central nervous system midline. Development 125, 3989±3996. Soullier, S., Jay, P., Poulat, F., Vanacker, J.-M., Berta, P., Laudet, V., 1999. Diversi®cation pattern of the HMG and SOX family members during evolution. J. Mol. Evol. 48, 517±527. Spencer, C.A., Gietz, R.D., Hodgetts, R.B., 1986. Overlapping transcription units in the dopa decarboxylase region of Drosophila. Nature 322, 279± 281. 
Sudbeck, P., Lienhard-Schmitz, M., Baeuerle, P.A., Scherer, G., 1996. Sexreversal by loss of the C-terminal transactivation domain of human SOX9. Nat. Genet. 13, 230±232. Uwanogho, D., Ree, M., Cartwright, E.J., Pearl, G., Healy, C., Scotting, P.J., Sharpe, P.T., 1995. Embryonic expression of the chicken Sox2, Sox3 and Sox11 genes suggests an interactive role in neuronal development. Mech. Dev. 49, 23±36. van de Wetering, M., Oosterwegel, M., Van Norren, K., Clevers, H., 1993. Sox-4, an Sry like HMG box protein, is a transcriptional activator in lymphocytes. EMBO J. 12, 3847±3854. Wagner, T., Wirth, J., Meyer, J., Zabel, B., Held, M., Zimmer, J., Pasantes, J., Dagna Bricarelli, F., Keutel, J., Hustert, E., Wolf, U., Tommerup, N., Schempp, W., Scherer, G., 1994. Autosomal sex reversal and campomelic dysplasia are caused by mutations in and around the SRY related gene SOX9. Cell 79, 1111±1120. Wegner, M., 1999. From head to toes: the multiple facets of Sox proteins. Nucleic Acids Res. 27, 1409±1420. Wotton, D., Lake, R.A., Farr, C.J., Owen, M.J., 1995. The high-mobility group transcription factor, SOX4, transactivates the human CD2 enhancer. J. Biol. Chem. 270, 7515±7522. Yuan, H., Corbi, N., Basilico, C., Dailey, L., 1995. Developmental speci®c activity of the FGF-4 enhancer requires the synergistic action of Sox2 and Oct-3. Genes Dev. 9, 2635±2645. Zappavigna, V., Faliola, L., Citterich, M.H., Mavilio, F., Bianchi, M.E., 1996. HMG1 interacts with HOX proteins and enhances their DNAbinding and transcriptional activation. EMBO J. 15, 4981±4991. Zweib, C., Adhya, S., 1994. Improved plasmid vectors for the analysis of protein-induced DNA bending. In: Kneale, G.G. (Ed.). Methods in Molecular Biology. Humana Press. Totowa, NJ, pp. 281±294.