Genomics 59, 77–84 (1999) Article ID geno.1999.5844, available online at http://www.idealibrary.com

Characterization of a Novel Chromo Domain Gene in Xp22.3 with Homology to Drosophila msl-3

Siddharth K. Prakash,* Ignatia B. Van den Veyver,*,† Brunella Franco,‡ Manuela Volta,‡ Andrea Ballabio,‡ and Huda Y. Zoghbi*,§,¶,1

*Department of Molecular and Human Genetics, †Department of Obstetrics and Gynecology, §Department of Pediatrics, and ¶Howard Hughes Medical Institute, Baylor College of Medicine, Houston, Texas 77030; and ‡TIGEM, Telethon Institute of Genetics and Medicine, Via Olgettina 58, 20132 Milan, Italy

Received January 20, 1999; accepted April 7, 1999
The Drosophila male-specific lethal (MSL) genes regulate transcription from the male X chromosome in a dosage compensation pathway that equalizes X-linked gene expression in males and females. The members of this gene family, including msl-1, msl-2, msl-3, mle, and mof, encode proteins with no sequence homology. However, mutations in each of these genes produce a similar phenotype: sex-specific lethality of male embryos caused by the failure of mutants to increase transcription from the single male X chromosome. The MSL gene products assemble into a multiprotein transcriptional activation complex at hundreds of sites along the chromatin of the X chromosome. Here we report the isolation and characterization of a human gene, named MSL3L1, that encodes a protein with significant homology to Drosophila MSL-3 in three distinct regions, including two putative chromo domains. MSL3L1 was identified by database queries with genomic sequence from BAC GS-590J6 (GenBank AC0004554) in Xp22.3 and was evaluated as a candidate gene for several developmental disorders mapping to this region, including OFD1 and SED tarda, as well as Aicardi syndrome and Goltz syndrome. © 1999 Academic Press

Sequence data from this article have been deposited with the GenBank Data Library under Accession Nos. AF117065 (MSL3L1) and AF117066 (Msl3l1).
1 To whom correspondence and reprint requests should be addressed at the Department of Molecular and Human Genetics, Howard Hughes Medical Institute, Baylor College of Medicine, One Baylor Plaza, Mail Stop BCM 225, Room T-807, Houston, TX 77030. Telephone: (713) 798-6558. Fax: (713) 798-8728. E-mail: [email protected].
0888-7543/99 $30.00 Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

INTRODUCTION
In Drosophila, the dosage of X-linked gene products in males and females is equalized by transcription of the single male X chromosome at twice the rate of each of the two female X chromosomes (Lucchesi, 1998). This is achieved by the selective stabilization of the male-specific lethal (MSL) group of proteins in male embryos. MSL proteins are autosomally encoded trans-acting factors that bind to chromatin in a multiprotein complex at hundreds of sites along the X chromosome and enhance transcription of single-copy genes in males (Copps et al., 1998). Hypertranscription is critical for survival, because msl mutant males die as third-instar larvae or early pupae. Sex-lethal (Sxl) inhibits the formation of MSL complexes in females, and in the absence of functional interactions the individual subunits are rapidly degraded (Kelley et al., 1997). The limiting component of the MSL complex is MSL-2, a RING finger protein that nucleates the assembly of the complex and recruits additional proteins, including MOF (males absent on the first), MLE (maleless), and MSL-3 (Copps et al., 1998; Lucchesi, 1998). MOF, a putative histone acetyltransferase, is implicated in the specific hyperacetylation of histone H4 at lysine 16 on the male X chromosome (Hilfiker et al., 1997). The MLE protein is homologous to a family of ATP-dependent DNA helicases, such as the Werner syndrome protein, that displace histones and remodel chromatin (Lee et al., 1997). MLE's role in altering chromatin structure may explain why the male X chromosome adopts a characteristic expanded morphology (Gorman et al., 1995). In addition to the core components of the complex, two noncoding RNA molecules, named roX1 and roX2 (RNA on X), paint the male X chromosome and colocalize with the MSL proteins (Meller et al., 1997). MSL-3 is not required for the formation of the complex but is essential for both hyperacetylation and increased transcription. The msl-3 gene encodes a 512-amino-acid protein of unknown function with two putative chromo domains, one at the N terminus and one at the C terminus (Koonin et al., 1995).

Chromo domains, which were originally identified as a recurrent motif in the Drosophila Polycomb-related proteins, occur in evolutionarily conserved proteins with roles in chromatin organization and transcriptional regulation (Cavalli and Paro, 1998). Using sequence analysis tools, we have identified a human homolog of Drosophila msl-3 in Xp22.3, named MSL3L1, determined its genomic structure, and analyzed its pattern of expression. MSL3L1 is located 300 kb centromeric to the critical region for microphthalmia with linear skin defects (MLS) syndrome, a dominantly inherited, male-lethal disorder, which is caused by terminal deletions of Xp including Xp22.3 (Wapenaar et al., 1993). Because of its proximity to the MLS region, we evaluated MSL3L1 as a candidate gene for two other X-linked dominant disorders that share many features with MLS syndrome, Aicardi syndrome and Goltz syndrome (Lindsay et al., 1994), and two other developmental disorders that were recently linked to this region: oral–facial–digital syndrome 1 (OFD1) and spondyloepiphyseal dysplasia tarda (SED tarda) (Feather et al., 1997; Heuertz et al., 1995).

MATERIALS AND METHODS
Sequence analysis. Sequencing reactions were performed using a PRISM BigDye Terminator Cycle Sequencing kit and an ABI 377 automated DNA sequencer (Perkin–Elmer Applied Biosystems, Foster City, CA). cDNA contigs were assembled using the Sequencing Project Manager in the Lasergene Navigator suite (DNASTAR, Inc., Madison, WI). Assembled sequences were analyzed with the FASTA, LINEUP, and PILEUP programs in the Wisconsin software package (GCG, Inc., Madison, WI). Using a shotgun strategy, the 196-kb insert of BAC clone 590J6 (Genome Systems, Inc., St. Louis, MO), which includes the entire MSL3L1 genomic locus, was completely sequenced in the Human Genome Sequencing Center at Baylor College of Medicine. The assembled sequence was analyzed with the following programs: RepeatMasker (A. F. A. Smit and P. Green, unpublished results), BLAST (NCBI) with the dbEST and nonredundant databases (Altschul et al., 1990), GRAIL (Uberbacher and Mural, 1991), and the Web Promoter Scan Service (Prestridge, 1995).
Southern analysis and cDNA library screening.
Probes were radiolabeled using random oligohexamer priming with [α-32P]dCTP (Sambrook et al., 1989) and purified through a Sephadex G-50 column. cDNA library screens were performed according to our published protocol (Banfi et al., 1994). For PCR screening, primary agar cores were eluted in 1 mL SM and 1 µL was used in each 25-µL reaction. Clones were amplified with cDNA primers RT7A (TTTGTAAGGAGATGGTGGATGG), RT7B (AACGGGAGAGTGTAATCAAAG), RT12B (ATAAGCCGATCTGGGAAGAAG), and λgt11 vector primers.
Expression analysis. For Northern analysis, a human multiple-tissue Northern blot and mouse total embryo Northern blot (Clontech, Palo Alto, CA) were hybridized with a radiolabeled 388-bp human fetal brain RT-PCR product from the 3′ UTR of MSL3L1 or with the 2.5-kb insert of EST 1163086 according to the manufacturer's specifications. The filter was washed to 0.1× SSC/0.1% SDS at 50°C and exposed to film (XOMAT AR, Eastman Kodak Company) for 4 days at −80°C. For RT-PCR, 1.0 µg of total RNA from human lymphoblasts was reverse-transcribed using the Superscript II first-strand cDNA synthesis kit (Life Technologies, Gaithersburg, MD) with 150 mM random hexamer primers (Amersham Pharmacia Biotech, Uppsala, Sweden). For the negative control, identical reactions were performed without reverse transcriptase.
X-inactivation assay. Fifty nanograms of cDNA prepared from a human–hamster hybrid cell line retaining a single active or inactive X chromosome was amplified with primers to the MSL3L1 cDNA, RT7A and RT12B (as above), in 25 µL with an initial denaturation step of 95°C for 3 min; 33 cycles of 94°C for 1 min, 58°C for 1 min, 72°C for 1 min; and then a final 5-min extension step at 72°C.
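As a quick sanity check on the three cDNA primers listed above, the following Python sketch computes their length, GC content, and a rough melting temperature. The Wallace rule used here (2°C per A/T, 4°C per G/C) is a generic rule of thumb introduced for this illustration; the article does not state how its annealing temperatures were chosen, and the function names are hypothetical.

```python
# Primer sequences (5'->3') exactly as given in the Methods text.
PRIMERS = {
    "RT7A": "TTTGTAAGGAGATGGTGGATGG",
    "RT7B": "AACGGGAGAGTGTAATCAAAG",
    "RT12B": "ATAAGCCGATCTGGGAAGAAG",
}


def gc_count(seq: str) -> int:
    """Number of G and C bases in a primer."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in deg C: 2 per A/T plus 4 per G/C.

    Assumption: the sequence contains only A, C, G, and T.
    """
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def summarize(primers: dict) -> dict:
    """Length, GC percentage, and Wallace Tm for each named primer."""
    return {
        name: {
            "length": len(seq),
            "gc_percent": round(100 * gc_count(seq) / len(seq), 1),
            "wallace_tm_c": wallace_tm(seq),
        }
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(PRIMERS).items():
        print(name, stats)
```

For RT7A and RT12B, the pair used in the X-inactivation assay, this estimate gives 64 and 62°C, a few degrees above the 58°C annealing step reported above.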
Mutation analysis. To detect large rearrangements within MSL3L1, 5 µg of the patient genomic DNA was digested with EcoRI, HindIII, TaqI, and MspI and evaluated by Southern analysis with the radiolabeled MSL3L1 cDNA. DNAs from unaffected males and females and from a human–hamster hybrid cell line retaining a single human X chromosome were used as controls. To screen for point mutations or small deletions in the MSL3L1 coding region, we used a combination of heteroduplex and single-strand conformation polymorphism (SSCP) analysis and analyzed the results by conformation-sensitive polyacrylamide gel electrophoresis (CSGE). The 13 coding exons of MSL3L1 were amplified from genomic DNA with flanking oligonucleotide primers (Table 2). Approximately 50 ng of patient lymphoblast DNA was amplified in a Peltier Thermal Cycler PTC-225 DNA engine tetrad (MJ Research, Watertown, MA) in 25-µL reactions containing 250 µM dNTPs and 1.25 mM MgCl2 with an initial denaturation step of 95°C for 5 min; 32 cycles of 95°C for 1 min, primer-dependent annealing temperatures for 1 min, 72°C for 1 min; and a final 5-min extension step at 72°C. For heteroduplex analysis, the denatured PCR products were allowed to reanneal gradually at room temperature. Heteroduplexes or SSCPs were resolved by electrophoresis at 500 V for 15–20 h on CSGE according to the manufacturer's specifications (Bio-Rad Laboratories, Hercules, CA). PCR products representing the complete MSL3L1 coding region were also directly sequenced in a subset of patients.
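The two thermocycling programs described in the Methods can be written down as plain data, which makes their structure and programmed hold time easy to compare. The sketch below is only an illustration: the class and variable names are hypothetical, and the totals count hold times only, ignoring ramping between temperatures.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Step:
    temp_c: Optional[float]  # None marks a primer-dependent annealing temperature
    minutes: float


@dataclass(frozen=True)
class Program:
    initial: Step
    cycles: int
    cycle_steps: Tuple[Step, ...]
    final: Step

    def total_minutes(self) -> float:
        """Sum of programmed hold times (ramp times are not modeled)."""
        per_cycle = sum(step.minutes for step in self.cycle_steps)
        return self.initial.minutes + self.cycles * per_cycle + self.final.minutes


# X-inactivation assay: 95 C for 3 min; 33 x (94/58/72 C, 1 min each); 72 C for 5 min.
X_INACTIVATION = Program(
    initial=Step(95, 3),
    cycles=33,
    cycle_steps=(Step(94, 1), Step(58, 1), Step(72, 1)),
    final=Step(72, 5),
)

# Exon screening: 95 C for 5 min; 32 x (95 C, annealing, 72 C, 1 min each); 72 C for 5 min.
EXON_SCREEN = Program(
    initial=Step(95, 5),
    cycles=32,
    cycle_steps=(Step(95, 1), Step(None, 1), Step(72, 1)),
    final=Step(72, 5),
)

if __name__ == "__main__":
    print("X-inactivation assay:", X_INACTIVATION.total_minutes(), "min")
    print("Exon screening:", EXON_SCREEN.total_minutes(), "min")
```

Leaving the annealing temperature of the exon-screening program as `None` reflects that it varies by primer pair (Table 2) rather than being a single value.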
RESULTS
Identification and cDNA sequence of a novel gene in Xp22.3. BLAST searches with the DNA sequence of BAC GS-590J6 (GenBank AC0004554) identified 38 human ESTs, 3 mouse ESTs, and 3 rat ESTs in a 20-kb region of genomic DNA. Because many of the ESTs were nearly identical, we searched the UniGene database at NCBI and the Sequence Tag Alignment and Consensus Knowledgebase (STACK) server at the South African National Bioinformatics Institute for prealigned consensus sequences (Miller et al., 1997; Skupski et al., 1999). Using three representative ESTs (IMAGE clones 46405 and 296557 and GeneExpress cDNA c-16f05) as the query, we obtained four UniGene contigs (Hs. 58521, 88764, 105514, 122080) and four STACK clusters (brain3669, hemat8203, repro5830, hemat1481). Comparison of the consensus sequences revealed that all four UniGene contigs and two of the four STACK clusters overlap and belong to the same cDNA sequence. The other STACK clusters are derived from intronic repetitive elements in contiguous genomic DNA and were not analyzed further. From an alignment of the six consensus sequences, we obtained a 2139-bp cDNA contig with an open reading frame at the 5′ end. With a pair of primers positioned at the 3′ end of the cDNA, we successfully amplified a 388-bp RT-PCR product from lymphoblast total RNA (data not shown). This PCR-generated probe was used to screen a human fetal kidney cDNA library (Clontech). Fifty-seven positive clones were identified and screened for full-length inserts by PCR. Two clones with the longest 5′ sequence (27 and 43) were subcloned into pBluescript KS(+) (Stratagene, La Jolla, CA) and sequenced.

FIG. 1. Genomic structure, alternative splice variants, and cDNA sequence of MSL3L1. (a) Schematic depiction of the genomic structure and alternative splicing of MSL3L1. Numbered rectangles indicate the 13 coding exons; white areas demarcate the two chromo domains. Black lines represent transcripts isolated from a human fetal kidney cDNA library, with thick lines indicating coding sequence and thinner lines the untranslated regions. Isoform 1 (top) contains the entire coding sequence of MSL3L1 translated from ATG1, including exon 2. In isoform 2 (bottom), exon 2 is spliced out and the open reading frame begins at ATG2. (b) cDNA sequence, predicted peptide sequence, and intron–exon boundaries of MSL3L1. Amino acid residues in the two putative chromo domains are boxed. The polyadenylation signal at bp 2237 is double-underlined and exon boundaries are numbered. Arrows indicate the splice junctions of the alternatively spliced exon 2 between exons 1 and 3. The three boldface, italic amino acids (MPS) represent an alternative N terminus of the protein, which is in frame with exon 3 when exon 2 is spliced out. (c) Conceptual translations of MSL3L1 isoform 2 from ATG1 and ATG2. Translation from ATG1 creates a 36-amino-acid peptide truncated by a stop codon in exon 3. Translation from an alternative Kozak site (ATG2) in exon 1 preserves the open reading frame into exon 3 and creates a protein that is identical to the C-terminal 462 amino acids of MSL3L1 but lacks the first chromo domain of the full-length protein.

The cDNA encodes a nuclear protein with homology to Drosophila msl-3. The cDNA sequence contains a 1669-bp open reading frame and encodes a 521-amino-acid protein with a calculated molecular mass of 58.5 kDa (Fig. 1). BLAST searches detected significant homology (P = 2.4 × 10⁻³²) between three segments of the predicted peptide sequence (amino acids 11–96, 205–264, and 444–500) and the Drosophila male-specific lethal-3 protein. In an amino acid alignment with MSL-3 (Fig. 2), the homologous segments are 52, 41, and 38% identical to the Drosophila protein. Two of the segments overlap with putative chromo domain motifs in MSL-3 (amino acids 36–72 and 452–488). Because the two proteins are highly conserved with 30% identity over the entire length of the alignment, the novel gene was named MSL3L1 (msl-3 homolog 1). We also identified a mouse EST from the IMAGE Consortium (1163086) that is 96% identical to the first 100 bp of the human cDNA and contains additional 5′ sequence. The mouse homolog of msl-3 (Msl3l1), represented by EST
1163086, encodes a protein with 86% identity to MSL3L1 and contains the complete open reading frame of the mouse gene. Several other proteins gave significant alignment scores with the same segments of MSL3L1: predicted peptides of unknown function in Schizosaccharomyces pombe (GenBank Z98977) and Saccharomyces cerevisiae (GenBank Z71255), as well as a Cu2+-transporting ATPase from Arabidopsis thaliana (GenBank Z99707). An uncharacterized human cDNA (KIAA0026) that is homologous to the C-terminal chromo domain may represent another human homolog of msl-3. Searches of the PROSITE database identified a basic nuclear localization signal (RKKKR), two potential tyrosine kinase phosphorylation sites, and a putative leucine zipper motif within the second chromo domain. Leucine zippers facilitate dimerization or cooperative DNA binding of transcription factors and may play a similar role in chromatin-associated proteins such as MSL-3.

FIG. 2. Amino acid alignment of MSL3L1, Msl3l1, and Drosophila MSL-3. Black residues are identical and gray residues are similar. The mouse cDNA lacks exon 2 and is translated from exon 3. The two putative chromo domains are boxed. Human MSL3L1 encodes a 521-amino-acid protein; mouse Msl3l1 and Drosophila MSL-3 are 460 and 512 amino acids, respectively.

MSL3L1 is alternatively spliced at the 5′ end. Sequence analysis of PCR products from five human fetal kidney cDNA clones indicated that at least two transcripts can be generated from the MSL3L1 gene by alternative splicing of 5′ coding sequence (Fig. 1a): full-length transcripts including exons 1, 2, and 3 (isoform 1) and transcripts lacking exon 2 (isoform 2). Isoform 1 is homologous to the complete open reading frame of the Drosophila msl-3 gene and encodes a protein that initiates at a conserved methionine (ATG1) at bp 106. In isoform 2, which lacks exon 2, translation from ATG1 leads to a frame shift and premature termination of the protein in exon 3, after 36 amino acids (Fig. 1c). However, translation from a downstream start site in exon 1 at bp 200 (ATG2) can preserve the open reading frame and creates a protein with three novel amino acids (MPS) fused to exons 3–13 of MSL3L1. The protein encoded by isoform 2 is identical to MSL3L1 from amino acid 62 to the C terminus but does not contain the first 26 amino acids in the N-terminal chromo domain. Mouse EST 1163086 resembles isoform 2, in which exon 1 is spliced directly to exon 3. Intriguingly, the Drosophila msl-3 gene undergoes a similar pattern of alternative splicing at the 5′ end, whereby the first 34 amino acids of the protein are spliced out and replaced by three novel amino acids in frame with the 3′ coding sequence (Gorman et al., 1995). The function of the truncated Drosophila isoform is currently unknown.

Using a 388-bp RT-PCR probe from human lymphoblast RNA, Northern analysis of MSL3L1 (Fig. 3, left) confirmed that the transcript is alternatively spliced. The probe detects a major band of 2.4 kb in poly(A)+ RNA from all tissues examined (pancreas, kidney, skeletal muscle, liver, lung, placenta, brain, and heart), with highest levels of expression in skeletal muscle and heart. This size corresponds well with the 2342-bp cDNA sequence of MSL3L1. In addition, a 2.6-kb band is unique to skeletal muscle and a faint 4.2-kb band is visible in most lanes. The 2.6-kb transcript may be derived by alternative splicing (as above). On the right, a mouse total embryo Northern blot was probed with the insert of EST clone 1163086. The probe detects a 2.4-kb transcript of equal abundance in poly(A)+ RNA from e7, e11, e15, and e17 embryos, as well as a faint 5.0-kb band. The upper bands on human and mouse Northern blots may arise from additional alternative splicing or from cross-reacting sequences.
FIG. 3. Northern analysis of MSL3L1 and Msl3l1. Each lane contains 2 µg of poly(A)+ mRNA from various tissues. Sizes in kilobases are indicated along the right side of the blot. (Top left) A human multiple-tissue Northern blot (Clontech) was hybridized with a 388-bp RT-PCR product amplified from the 3′ UTR of MSL3L1. (Top right) A mouse total embryo Northern blot (Clontech) was hybridized with the 2.5-kb insert of mouse EST 1163086, which contains the entire open reading frame of Msl3l1. (Bottom) Control hybridization to a human β-actin cDNA probe.
We also evaluated whether MSL3L1 is subject to X inactivation. First-strand cDNA was prepared by reverse transcription of poly(A)+ mRNA from hybrid cell lines which retain a single active or inactive human X chromosome. PCR was performed with primers in exons 8 and 13 of the MSL3L1 cDNA. An amplification product was detected for the active X hybrid but not for the inactive X hybrid, demonstrating that this gene undergoes X inactivation (data not shown). Our controls included a gene that is known to escape inactivation (MIC2) and a gene that is inactivated (ARHGAP6) (Schaefer et al., 1997).

Genomic structure and mutation analysis of MSL3L1. The intron–exon boundaries of MSL3L1 were rapidly determined by alignment of the consensus cDNA sequence with genomic sequence of BAC 590J6, which is located 200–300 kb centromeric to the MLS deletion region (Fig. 4). MSL3L1 consists of 13 exons transcribed from telomere to centromere in 17 kb of genomic DNA. The sizes of the exons and introns, as well as the boundary sequences, are provided in Table 1. Consensus splice sites for each exon were identified. The start site is located in exon 1, 20 bp downstream of a putative CpG island (bp 1–90). All cDNA clones share an identical 670-bp 3′ UTR with a polyadenylation signal at bp 2236. The chromo domains are encoded by exons 1, 2, 12, and 13. The MSL3L1 gene was screened for mutations in a panel of 29 Aicardi and 4 Goltz patients who have one or more features described in MLS syndrome but do not have cytogenetic abnormalities. Several of these patients were previously analyzed for mutations in two other genes located in the MLS critical region (Schaefer et al., 1997). Our mutation analysis also included a single patient with MLS syndrome who has no detectable chromosomal rearrangements (Cox et al., 1998), as well as 2 patients with oral–facial–digital syndrome I and 3 patients with spondyloepiphyseal dysplasia tarda, two developmental disorders previously mapped to Xp22.3 (Feather et al., 1997; Heuertz et al., 1993, 1995).
FIG. 4. Physical map of human chromosome Xp22.3 with the position of MSL3L1. Black squares indicate STS markers; BA325 is a chromosomal breakpoint at the centromeric boundary (left) of the MLS region. ARHGAP6 is a rho-type GTPase-activating protein gene which we previously cloned from the MLS region (Schaefer et al., 1997) and AMELX is the X-linked amelogenin gene. Arrows indicate the transcriptional orientation of each gene and each exon is numbered and represented by a thin black bar. The large intron between exons 1 and 2 of ARHGAP6 includes the oppositely oriented AMELX gene (unpublished data). To preserve the scale of the figure, only 2 of the 14 exons in ARHGAP6 are shown.
TABLE 1
Intron–Exon Splice Sequences of the MSL3L1 Gene

Exon  5′ intron (5′–3′)          Exon (bp)  3′ intron (5′–3′)          Intron (kb)
1     5′ UTR                     105        gtgccgccgcggagggacagggagg  1.5
2     tgtgggaaatatttcttcgtttcag  82         gtgagtgaacttgttaaaccagact  0.5
3     gaattttccttttttgttcaattag  95         gtaagaatacaaaaatcaagatata  0.3
4     tcagttgtttgctactctttttcag  101        gtgagtagcttcacttcatttcttt  0.6
5     taccggatgcttttgtttcacctag  83         gtataaagtttttattgtaaaaact  0.6
6     taaaaatggaattttgatattgcag  123        gtatgtagggagaatgtataaaaca  0.6
7     cagatgtctatgtttttgttaacag  161        gtgagtactagggagaatgtataaa  0.8
8     aatctcttctttttaatcctcccag  158        gtaagttatatagcctgcactttca  1.6
9     gtgtgacatcctccatcttccctag  272        gtaggttcattctcgggtgccccag  2.9
10    ataatttcatctttttctttggaag  110        gtaaagaatgtttgatgtttgttcc  3.5
11    agcattcaattcgtgctttttccag  99         gtaagaatcctggttcctgccttct  0.4
12    ggttttgcctttatttttaatgcag  84         gtatttttattattttgtcaaaact  2.3
13    cattggggctattttttttttccag  97         3′ UTR
      (C/T)AG                               GT(A/G)

Note. For each exon, 1–12, the last 25 bp of the 5′ intron, the size of each exon, the first 25 bp of the 3′ intron, and the size of each intron are shown.
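The Table 1 values lend themselves to two quick consistency checks, sketched below in Python. Function and variable names are illustrative, and the exon sizes are taken at face value as listed: the first check asks which internal exons would disturb the reading frame if skipped (any exon whose length is not a multiple of three), and the second asks whether exons plus introns add up to roughly the 17 kb reported for the locus.

```python
# Exon sizes (bp) for exons 1-13 and intron sizes (kb) for introns 1-12, from Table 1.
EXON_BP = [105, 82, 95, 101, 83, 123, 161, 158, 272, 110, 99, 84, 97]
INTRON_KB = [1.5, 0.5, 0.3, 0.6, 0.6, 0.6, 0.8, 1.6, 2.9, 3.5, 0.4, 2.3]


def frame_shifting_internal_exons(exon_bp):
    """Internal exons (1-based numbers) whose length is not divisible by 3.

    Skipping such an exon shifts the downstream reading frame.
    """
    last = len(exon_bp)
    return [
        number
        for number, size in enumerate(exon_bp, start=1)
        if 1 < number < last and size % 3 != 0
    ]


def locus_span_kb(exon_bp, intron_kb):
    """Approximate genomic span: summed exon sizes plus summed intron sizes."""
    return sum(exon_bp) / 1000 + sum(intron_kb)


if __name__ == "__main__":
    print("Frame-shifting internal exons:", frame_shifting_internal_exons(EXON_BP))
    print("Approximate span (kb):", round(locus_span_kb(EXON_BP, INTRON_KB), 2))
```

Exon 2 (82 bp) is not a multiple of three, which agrees with the frame shift described for isoform 2 when translation starts at ATG1. The summed sizes come to about 17.2 kb, in line with the 17 kb stated in the text.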
To evaluate the MSL3L1 genomic locus for deletions or structural alterations, we performed Southern analysis on genomic DNA from each patient. Using human fetal kidney cDNA 27 as the probe, no differences between patients and unaffected individuals were evident. To search for point mutations or small deletions in the coding region, exons 1–13 were amplified individually from genomic DNA using the primer sets in Table 2 and subjected to heteroduplex or SSCP analysis on nondenaturing polyacrylamide gels. The PCR products were also sequenced in a subset of Aicardi, Goltz, OFD1, and SED tarda patients using the same primers. In comparison with unaffected controls, no mutations or polymorphisms were detected.
DISCUSSION
Through computer sequence analysis we identified a novel chromo domain gene in Xp22.3, named MSL3L1, with significant homology to the Drosophila male-specific lethal-3 gene. MSL3L1 spans a genomic region of 17 kb and includes 13 exons encoding a 521-amino-acid protein. Three contiguous segments of the protein are highly conserved in MSL-3, including two putative chromo domains at the N terminus and C terminus. The Drosophila protein plays a critical role in a dosage-compensation pathway, which equalizes X-linked gene expression in males and females (Gorman et al., 1995). The conservation of the human homolog suggests that MSL3L1 may perform a similar function in chromatin remodeling and transcriptional regulation. As a putative transcription factor, MSL3L1 is an excellent candidate gene for several developmental disorders that have been mapped or otherwise linked to Xp22.3, including OFD1, SED tarda, Aicardi syndrome, and Goltz syndrome (Donnenfeld et al., 1990; Feather et al., 1997; Heuertz et al., 1995; Lindsay et al., 1994). To evaluate its importance in these disorders, we searched for MSL3L1 mutations in a large panel of
TABLE 2
Primers Used to Amplify MSL3L1 Exons from Genomic DNA

Exon  Anneal (°C)  Forward primer (5′–3′)         Reverse primer (5′–3′)       Product size (bp)
1     67           TCG CCC TCC GCC ACG ATG AGC A  CAG CCC GCG CCT CCT CCC TGT  162
2     56           AGG GTC TGT GAA TTA GTT TTC C  TTT CTA TGC AGC AAA GAC AAT  220
3     53           TTT GAG GTA CTT TTA GTC GTT G  ACA CTA AGA GAA AAC ACA TTG  220
4     54           TTT TGA TCT CAA GAC TCA CGA G  AAT TTC AGT GAA TCC TTC TCT  240
5     51           TCT GTA GAC ATA AGG GTT GGT G  TCA CAA CTG GAG TAT TTC ACT  230
6     52           CCT TGG ATA TTG ACT CTT CTA C  CAA GAT TCG CTA TTG CTT AC   275
7     54           GTG GTT TGT TAC ATT AGG ATG C  AGT GTC ACC TGT TCT AAA TGG  291
8     56           AAA AAT CGG CAA CTT CAT TGT G  CCC ACA CCA TCA GAT TGA ACT  329
9     58           TCT TCA GAA GCA CTC CAT GTT C  ACA AAG TAC ACG CTG GCT CTC  383
10    53           AGT CTT TAA TTC AGT TGC TGT G  CTT TAA GAA CCT TGA CTT CCA  245
11    54           GTT AGG TGG TTG GAA GTG TCT C  GTG TAC AGA AAG CTA ACA AGA  231
12    54           CTT CCC TGA GTG AAT CCT ATG C  AGA GCT GAC TTC CCC ATG TG   247
13    54           TTT GCT TGA TGG TAT TTA GTC C  CAC CTT GTT AGT TAT TCA CCT  237

Note. PCR amplification from genomic DNA was performed under the conditions listed under Materials and Methods using the specified annealing temperatures. Reverse primers are shown through their seventh triplet; nine single trailing reverse-primer bases in the source layout (C C G C G A C G G) cannot be assigned to specific rows.
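Because the exon primers sit in flanking intronic sequence, each PCR product should be longer than the exon it covers. The short Python sketch below checks this using the exon sizes from Table 1 and the product sizes from Table 2. The names are illustrative, and the difference is only a rough measure of flanking sequence (it includes the primers themselves, and for exons 1 and 13 it may include untranslated sequence).

```python
# Exon sizes (bp) from Table 1 and PCR product sizes (bp) from Table 2, exons 1-13.
EXON_BP = [105, 82, 95, 101, 83, 123, 161, 158, 272, 110, 99, 84, 97]
PRODUCT_BP = [162, 220, 220, 240, 230, 275, 291, 329, 383, 245, 231, 247, 237]


def flanking_bp(exon_bp, product_bp):
    """Per-exon difference between amplicon size and exon size."""
    if len(exon_bp) != len(product_bp):
        raise ValueError("expected one product size per exon")
    return [product - exon for exon, product in zip(exon_bp, product_bp)]


def all_exons_covered(exon_bp, product_bp):
    """True if every amplicon is longer than the exon it is meant to span."""
    return all(extra > 0 for extra in flanking_bp(exon_bp, product_bp))


if __name__ == "__main__":
    for number, extra in enumerate(flanking_bp(EXON_BP, PRODUCT_BP), start=1):
        print(f"exon {number}: {extra} bp beyond the exon")
    print("All amplicons longer than their exons:", all_exons_covered(EXON_BP, PRODUCT_BP))
```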
patients by PCR amplification of exons, heteroduplex analysis, SSCP analysis, and direct nucleotide sequencing. No mutations were found, suggesting that MSL3L1 is not involved in the pathogenesis of these disorders.

Chromatin organization modifier or chromo proteins comprise a superfamily of nuclear factors conserved across the animal and plant kingdoms (Cavalli and Paro, 1998). The chromo domain is a loosely conserved 40-amino-acid motif that forms a characteristic tertiary structure (Koonin et al., 1995) and interacts with components of chromatin to nucleate the assembly of transcriptional regulatory complexes (Cowell and Austin, 1997; Franke et al., 1995; Messmer et al., 1992). Drosophila Polycomb, the best-characterized chromo protein, aggregates in beadlike complexes at approximately 100 chromosomal positions and silences the transcription of homeodomain genes (Messmer et al., 1992). The yeast SWI6 chromo domain protein binds to heterochromatin and is required for position-dependent gene silencing (Aasland and Stewart, 1995). Chromo-related motifs in human heterochromatin-associated proteins can interact with lamin B-type receptors on the nuclear envelope, forming a potential scaffold for silenced domains in the nucleus (Ye et al., 1997). Together these observations suggest that chromo domains play a critical role in the epigenetic regulation and maintenance of gene expression.

With the isolation of MSL3L1, highly conserved human homologs have been identified for three components of the Drosophila dosage compensation pathway. Homology between human RNA helicase A and MLE, and between the human HIV-1 Tat-interacting protein Tip60 and MOF, was recently described (Hilfiker et al., 1997; Lee and Hurwitz, 1993). The implications are unclear, because dosage compensation in mammals, in contrast to the transactivation of the male X chromosome in Drosophila, culminates with the inactivation of one female X chromosome.
However, both pathways utilize many of the same components, including untranslated RNAs of uncertain function, such as XIST in humans and roX1 in Drosophila, and acetylated isoforms of histones, which are unique to the dosage-compensated X chromosome in both species (Costanzi and Pehrson, 1998; Lee et al., 1997; Turner, 1998). The function of vertebrate MSL proteins may be adapted for X inactivation or, in the case of the mouse Clc4 gene, restricted to fewer vertebrate genes. For Clc4, which is autosomal in one species and X-linked in another, transcription from the single X-linked copy in males is equal to the combined output of both autosomal alleles (Adler et al., 1997). Clc4 and a small number of other vertebrate genes are regulated with a strategy that resembles dosage compensation in Drosophila (Graves et al., 1998; Jegalian and Page, 1998). Whether the vertebrate MSL proteins have conserved the biochemical characteristics of their Drosophila counterparts, in complex formation and chromatin remodeling, is currently unknown.
ACKNOWLEDGMENTS The authors thank Chinh Tran and Nicole Chao for their assistance with the mutation analysis of MSL3L1 and screening of the human fetal kidney cDNA library. The inactive X chromosome hybrid cell line was obtained from Dr. A. C. Chinault (Baylor College of Medicine) and the active X chromosome hybrid was obtained from Dr. David Ledbetter (University of Chicago). This work was also supported by the Howard Hughes Medical Institute (H.Y.Z.), Italian Telethon Foundation Grants BMH4-CT96-1134 and BMH4-CT960889 (B.F.), National Institutes of Health Grant K08-HD01171 and the Evelyne and Lucile Hansen Fund (I.B.V), the Medical Scientist Training Program (S.P.), and the Aicardi Syndrome Foundation and by resources from the Mental Retardation Research Center and the Child Health Research Center at Baylor College of Medicine.
REFERENCES Aasland, R., and Stewart, A. F. (1995). The chromo shadow domain, a second chromo domain in heterochromatin-binding protein 1, HP1. Nucleic Acids Res. 23: 3163–3173. Adler, D. A., Rugarli, E. I., Lingenfelter, P. A., Tsuchiya, K., Poslinski, D., Liggitt, H. D., Chapman, V. M., Elliott, R. W., Ballabio, A., and Disteche, C. M. (1997). Evidence of evolutionary up-regulation of the single active X chromosome in mammals based on Clc4 expression levels in Mus spretus and Mus musculus. Proc. Natl. Acad. Sci. USA 94: 9244 –9248. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). A basic local alignment search tool. J. Mol. Biol. 215: 403– 410. Banfi, S., Servadio, A., Chung, M.-Y., Kwiatkowski, T. J., Jr., McCall, A. E., Duvick, L. A., Shen, Y., Roth, E. J., Orr, H. T., and Zoghbi, H. Y. (1994). Identification and characterization of the gene causing type 1 spinocerebellar ataxia. Nat. Genet. 7: 513–519. Cavalli, G., and Paro, R. (1998). Chromo-domain proteins: Linking chromatin structure to epigenetic regulation. Curr. Opin. Cell Biol. 10: 354 –360. Copps, K., Richman, R., Lyman, L. M., Chang, K. A., RampersadAmmons, J., and Kuroda, M. I. (1998). Complex formation by the Drosophila MSL proteins: Role of the MSL2 RING finger in protein complex assembly. EMBO J. 17: 5409 –5417. Costanzi, C., and Pehrson, J. R. (1998). Histone macroH2A1 is concentrated in the inactive X chromosome of female mammals. Nature 393: 599 – 601. Cowell, I. G., and Austin, C. A. (1997). Self-association of chromo domain peptides. Biochim. Biophys. Acta 1337: 198 –206. Cox, T. C., Cox, L. L., and Ballabio, A. (1998). A very high density microsatellite map (1 STR/41 kb) of 1.7 Mb on Xp22 spanning the microphthalmia with linear skin defects (MLS) syndrome critical region. Eur. J. Hum. Genet. 6: 406 – 412. Donnenfeld, A. E., Graham, J. M., Jr., Packer, R. J., Aquino, R., Berg, S. Z., and Emanuel, B. S. (1990). 
Microphthalmia and chorioretinal lesions in a girl with an Xp22.2–pter deletion and partial 3p trisomy: Clinical observations relevant to Aicardi syndrome gene localization. Am. J. Med. Genet. 37: 182–186. Feather, S. A., Woolf, A. S., Donnai, D., Malcolm, S., and Winter, R. M. (1997). The oral–facial– digital syndrome type 1 (OFD1), a cause of polycystic kidney disease and associated malformations, maps to Xp22.2–Xp22.3. Hum. Mol. Genet. 6: 1163–1167. Franke, A., Messmer, S., and Paro, R. (1995). Mapping functional domains of the polycomb protein of Drosophila melanogaster. Chromosome Res. 3: 351–360. Gorman, M., Franke, A., and Baker, B. S. (1995). Molecular characterization of the male-specific lethal-3 gene and investigations of the regulation of dosage compensation in Drosophila. Development 121: 463– 475.
Graves, J. A., Disteche, C. M., and Toder, R. (1998). Gene dosage in the evolution and function of mammalian sex chromosomes. Cytogenet. Cell Genet. 80: 94–103.
Heuertz, S., Nelen, M., Wilkie, A. O., Le Merrer, M., Delrieu, O., Larget-Piet, L., Tranebjaerg, L., Bick, D., Hamel, B., and Van Oost, B. A. (1993). The gene for spondyloepiphyseal dysplasia (SEDL) maps to Xp22 between DXS16 and DXS92. Genomics 18: 100–104.
Heuertz, S., Smahi, A., Wilkie, A. O., Le Merrer, M., Maroteaux, P., and Hors-Cayla, M. C. (1995). Genetic mapping of Xp22.12–p22.31, with a refined localization for spondyloepiphyseal dysplasia (SEDL). Hum. Genet. 96: 407–410.
Hilfiker, A., Hilfiker-Kleiner, D., Pannuti, A., and Lucchesi, J. C. (1997). mof, a putative acetyl transferase gene related to the Tip60 and MOZ human genes and to the SAS genes of yeast, is required for dosage compensation in Drosophila. EMBO J. 16: 2054–2060.
Jegalian, K., and Page, D. C. (1998). A proposed path by which genes common to mammalian X and Y chromosomes evolve to become X inactivated. Nature 394: 776–780.
Kelley, R. L., Wang, J., Bell, L., and Kuroda, M. I. (1997). Sex lethal controls dosage compensation in Drosophila by a non-splicing mechanism. Nature 387: 195–199.
Koonin, E. V., Zhou, S., and Lucchesi, J. C. (1995). The chromo superfamily: New members, duplication of the chromo domain and possible role in delivering transcription regulators to chromatin. Nucleic Acids Res. 23: 4229–4233.
Lee, C. G., Chang, K. A., Kuroda, M. I., and Hurwitz, J. (1997). The NTPase/helicase activities of Drosophila maleless, an essential factor in dosage compensation. EMBO J. 16: 2671–2681.
Lee, C. G., and Hurwitz, J. (1993). Human RNA helicase A is homologous to the maleless protein of Drosophila. J. Biol. Chem. 268: 16822–16830.
Lindsay, E. A., Grillo, A., Ferrero, G. B., Roth, E. J., Magenis, E., Grompe, M., Hultén, M., Gould, C., Baldini, A., and Zoghbi, H. Y. (1994). Microphthalmia with linear skin defects (MLS) syndrome: Clinical, cytogenetic, and molecular characterization. Am. J. Med. Genet. 49: 229–234.
Lucchesi, J. C. (1998). Dosage compensation in flies and worms: The ups and downs of X-chromosome regulation. Curr. Opin. Genet. Dev. 8: 179–184.
Meller, V. H., Wu, K. H., Roman, G., Kuroda, M. I., and Davis, R. L. (1997). roX1 RNA paints the X chromosome of male Drosophila and is regulated by the dosage compensation system. Cell 88: 445–457.
Messmer, S., Franke, A., and Paro, R. (1992). Analysis of the functional role of the Polycomb chromo domain in Drosophila melanogaster. Genes Dev. 6: 1241–1254.
Miller, G., Fuchs, R., and Lai, E. (1997). IMAGE cDNA clones, UniGene clustering, and ACeDB: An integrated resource for expressed sequence information. Genome Res. 7: 1027–1032.
Prestridge, D. S. (1995). Predicting Pol II promoter sequences using transcription factor binding sites. J. Mol. Biol. 249: 923–932.
Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). “Molecular Cloning: A Laboratory Manual,” 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Schaefer, L., Prakash, S., and Zoghbi, H. Y. (1997). Cloning and characterization of a novel rho-type GTPase-activating protein gene (ARHGAP6) from the critical region for microphthalmia with linear skin defects. Genomics 46: 268–277.
Skupski, M. P., Booker, M., Farmer, A., Harpold, M., Huang, W., Inman, J., Kiphart, D., Kodira, C., Root, S., and Schilkey, F. (1999). The Genome Sequence DataBase: Towards an integrated functional genomics resource. Nucleic Acids Res. 27: 35–38.
Turner, B. M. (1998). Histone acetylation as an epigenetic determinant of long-term transcriptional competence. Cell Mol. Life Sci. 54: 21–31.
Uberbacher, E. C., and Mural, R. J. (1991). Locating protein-coding regions in human DNA sequences by a multiple-sensor network approach. Proc. Natl. Acad. Sci. USA 88: 11261–11265.
Wapenaar, M. C., Bassi, M. T., Schaefer, L., Grillo, A., Ferrero, G. B., Chinault, A. C., Ballabio, A., and Zoghbi, H. Y. (1993). The genes for X-linked ocular albinism (OA1) and microphthalmia with linear skin defects (MLS): Cloning and characterization of the critical regions. Hum. Mol. Genet. 2: 947–952.
Ye, Q., Callebaut, I., Pezhman, A., Courvalin, J. C., and Worman, H. J. (1997). Domain-specific interactions of human HP1-type chromodomain proteins and inner nuclear membrane protein LBR. J. Biol. Chem. 272: 14983–14989.