GENOMICS
31, 223–229 (1996)
ARTICLE NO. 0035
Complete Genomic Organization of the Human Erythroid p55 Gene (MPP1), a Membrane-Associated Guanylate Kinase Homologue

ANTHONY C. KIM, AIDA B. METZENBERG,¹ KENNETH E. SAHR, SHIRIN M. MARFATIA, AND ATHAR H. CHISHTI²

Laboratory of Tumor Cell Biology, Department of Biomedical Research and Division of Hematology/Oncology, St. Elizabeth’s Medical Center, Tufts University School of Medicine, Boston, Massachusetts 02135

Received August 25, 1995; accepted November 2, 1995
Human p55 is an abundantly palmitoylated phosphoprotein of the erythroid membrane. It is the prototype of a newly discovered family of membrane-associated proteins termed MAGUKs (membrane-associated guanylate kinase homologues). The MAGUKs interact with the cytoskeleton and regulate cell proliferation, signaling pathways, and intercellular junctions. Here, we report the complete intron–exon map of the human erythroid p55 gene (HGMW-approved symbol MPP1). The structure of the p55 gene was determined from cosmid clones isolated from a cosmid library specific for the human X chromosome. There is a single copy of the p55 gene, composed of 12 exons and spanning approximately 28 kb in the q28 region of the human X chromosome. The exon sizes range from 69 bp (exon 5) to 203 bp (exon 10), whereas the intron sizes vary from 280 bp (intron 2) to ~14 kb (intron 1). The intron–exon boundaries conform to the donor/acceptor consensus sequence, GT-AG, for splice junctions. Several of the exon boundaries correspond to the boundaries of functional domains in the p55 protein. These domains include an SH3 motif and a region that binds to cytoskeletal protein 4.1. In addition, a comparison of the genomic and the primary structures of p55 reveals a highly conserved phosphotyrosine domain located between the protein 4.1 binding domain and the guanylate kinase domain. Finally, promoter activity measurements of the region immediately upstream of the p55 gene, which contains several cis-elements commonly found in housekeeping genes, suggest that a CpG island may be associated with p55 gene expression in vivo. © 1996 Academic Press, Inc.

The nucleotide sequence data reported in this paper have been deposited with the EMBL/GenBank Data Libraries under Accession No. U39611.
¹ Current address: Howard Hughes Medical Institute and Department of Medicine, University of California, San Francisco, CA 94143-0724.
² To whom correspondence should be addressed at Laboratory of Tumor Cell Biology, St. Elizabeth’s Medical Center, ACH4, 736 Cambridge Street, Boston, MA 02135. Telephone: (617) 789-3118. Fax: (617) 789-3111.
³ The HGMW-approved symbol for the gene described in this paper is MPP1.
0888-7543/96 $12.00 Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

INTRODUCTION

Human erythroid p55 is a member of a newly discovered family of signaling proteins termed MAGUKs (membrane-associated guanylate kinase homologues). MAGUKs include the Drosophila discs-large tumor suppressor protein, Dlg (Woods and Bryant, 1991); a rat synaptic junction-associated protein, PSD-95 (Cho et al., 1992)/SAP90 (Kistner et al., 1993); the tight junction-associated proteins, ZO-1 (Willott et al., 1993; Itoh et al., 1993) and ZO-2 (Jesaitis and Goodenough, 1994); a human B-lymphocyte homologue of the Drosophila Dlg tumor suppressor protein, hDlg (Lue et al., 1994); a rat presynaptic protein, SAP97 (Muller et al., 1995); and a recently discovered dlg2 protein of unknown function (Mazoyer et al., 1995). The cDNA for the dlg2 protein maps distal to the BRCA1 tumor suppressor on the long arm of chromosome 17 (Mazoyer et al., 1995).

The heightened interest in understanding the biology of MAGUKs is largely due to the genetic analysis of the Drosophila dlg tumor suppressor gene (Woods and Bryant, 1991). The Drosophila Dlg protein is localized at the septate junctions, and recessive lethal mutations in the dlg gene result in loss of epithelial cell polarity followed by neoplastic overgrowth of the imaginal discs (Woods and Bryant, 1991). All MAGUKs share a number of well-defined protein domains, which are believed to mediate their signaling function at the membrane–cytoskeleton interface (Marfatia et al., 1994; Woods and Bryant, 1993). These domains include one or three copies of the PDZ domain (PSD-95/Discs large/ZO-1), a single copy of the SH3 motif, and a modified guanylate kinase domain (Woods and Bryant, 1993). The structure of the human p55 gene³ represents the first detailed characterization of the genomic locus of a member of the mammalian MAGUK family and may form the basis for future comparisons of the evolutionary origin of protein domains among MAGUKs.

Human erythroid p55 is a palmitoylated peripheral membrane protein of the erythroid plasma membrane (Ruff et al., 1991). A clue to its membrane interactions came from the analysis of abnormal red blood cells of subjects with hemolytic anemia, termed hereditary elliptocytosis (HE) (Alloisio et al., 1993). The absence of p55 in the red blood cells of subjects with either homozygous protein 4.1(−) HE or glycophorin C(−) HE strongly suggested that p55 may bind to protein 4.1 and glycophorin C (Alloisio et al., 1993). Using in vitro biochemical assays, we have recently shown a direct association of p55 with cytoskeletal protein 4.1 and glycophorin C (Marfatia et al., 1994). The protein 4.1 binding site on p55 was localized to a region between the SH3 motif and the guanylate kinase domain (Marfatia et al., 1995), whereas the binding site of p55 was localized to the N-terminal 30-kDa domain of protein 4.1 (Marfatia et al., 1994). These studies suggest that the transmembrane glycophorin C provides the attachment site for protein 4.1 and p55 at the erythroid plasma membrane (Marfatia et al., 1995). More recent in vitro binding studies utilizing erythroid membrane vesicles provide further evidence for the existence of this ternary complex among p55, protein 4.1, and glycophorin C at the plasma membrane (Hemming et al., 1995).

The protein 4.1–glycophorin C linkage plays a vital role in red blood cells by regulating the stability and the mechanical properties of the red blood cell membrane (Takakuwa et al., 1986). Unlike other glycophorins, which are erythroid-lineage-specific, glycophorin C is found in many nonerythroid cells (Kim et al., 1989). Villeval et al. (1989) compared glycophorin C expression in normal and leukemic cells and found an abnormal processing of glycophorin C in the human leukemic cells. Similarly, protein 4.1-related cytoskeletal proteins such as ezrin and talin are the major targets of oncogene-encoded tyrosine kinases (Bretscher, 1995). Recently, a domain related to the 30-kDa domain of protein 4.1 was found in the brain tumor suppressor NF2 gene product (Trofatter et al., 1993; Rouleau et al., 1993; Arpin et al., 1994).
The binding of p55 and hDlg to the 30-kDa domain of protein 4.1 suggests that other MAGUKs may also function by exerting their effects at a site that is important in the regulation of cell shape and cell–cell contacts. Recently, a protein 4.1 homologue has been localized to the septate junctions of epithelial cells in Drosophila (Fehon et al., 1994). This protein 4.1 homologue is encoded by the coracle gene, whose mutant phenotype exhibits a failure of dorsal closure (Fehon et al., 1994). These observations suggest that a protein 4.1-like molecule may serve as an attachment site for the Drosophila Dlg tumor suppressor protein in the insect septate junctions (Woods and Bryant, 1993). Thus, the interactions of p55 with protein 4.1 may be prototypical of similar associations in many nonerythroid cells. The availability of the intron–exon organization of the p55 gene may be helpful for the structural and functional analysis of various protein domains, as well as for amplifying nucleotide sequences corresponding to these domains. The genomic structure will also be required to test the hypothesis that the p55 gene is aberrant in certain human diseases. Here, we present the complete genomic organization of the human p55
gene. There is a single copy of the p55 gene composed of 12 exons that spans approximately 28 kb in the q28 region of the human X chromosome. An analysis of the upstream region indicates that a CpG island may be associated with the transcription of the human erythroid p55 gene, a feature akin to other housekeeping genes.

MATERIALS AND METHODS

Isolation of cosmid clones. Cosmid clones were isolated from cosmid library 104, which was obtained from the Imperial Cancer Research Fund (London, UK). Library 104 is an X-chromosome-specific cosmid library (Nizetic et al., 1991). It was screened by hybridization using a p55 cDNA probe (HUMPEMP cDNA clone 5) as described before (Metzenberg and Gitschier, 1992). The cDNA probe was prepared by PCR amplification of the plasmid insert and was oligolabeled using [32P]dATP (Amersham). Hybridization was carried out in 6× SSC/5× Denhardt’s/1% SDS at 68°C and blots were washed with 0.5× SSC/0.1% SDS at 60°C.

Intron size determination. The sizes of the introns were determined by PCR. Introns 2, 3, 4, 6, 7, 8, and 10 were amplified using the following parameters: denaturing at 95°C for 30 s, annealing at 55°C for 1.0 min, and extension at 72°C for 3.0 min for 30 cycles. Pfu enzyme was added during the amplification of introns 1, 5, 9, and 11 (Barnes, 1994). Because of its large size, intron 1 was extended at 68°C for 20 min, while introns 5, 9, and 11 were extended at 68°C for 5 min.

Hybridization of a Southern blot containing genomic DNA from various animal species was performed using a full-length cDNA probe of human p55 (Ruff et al., 1991). The cDNA probe was radiolabeled with [32P]dCTP using a random primer kit (Amersham). The hybridization conditions were similar to those described in the Clontech protocol (Palo Alto, CA).

5′ RACE. The amplification of the 5′ end of the p55 transcript was carried out using the Marathon cDNA amplification method (Clontech). Human reticulocyte RNA was isolated as described before (Ruff et al., 1991).
First-strand synthesis of 2.0 µg of reticulocyte RNA was performed using a modified lock-docking oligo(dT) primer that contains two degenerate nucleotide positions at the 3′ end (Borson et al., 1992). After second-strand synthesis, the cDNA pool was blunt-end ligated to the cDNA adaptors. The first PCR was performed with a 27mer sense primer (AP1) specific for the adaptor and a p55-specific antisense primer (TK47, 5′-ATGGGCTCTTCTGTGACCTTCTCAAAC). The TK47 primer is derived from exon 2 of the human p55 cDNA (Ruff et al., 1991). The second round of PCR was carried out with a nested 23mer sense primer (AP2) and an exon 1-specific antisense primer (T26, 5′-TCTGGCCGACTACGCTTCTGC). The conditions for both PCRs were 95°C for 30 s and 68°C for 5 min. The nucleotide sequences of the colinear primers, AP1 and AP2, are described in the RACE protocol (Clontech). A 250-bp cDNA fragment was detected after the second PCR. This fragment was subcloned into the pCRII vector and sequenced.

Luciferase reporter constructs and transient gene expression. Both sense and antisense constructs of the p55 promoter region were made by subcloning an upstream 315-bp SmaI fragment (Fig. 3) into the pGL2-Basic vector (Promega). Plasmid DNAs were purified by banding twice on CsCl gradients using standard methods (Sambrook et al., 1989), and the luciferase reporter enzyme was measured after transient expression in Rauscher erythroleukemia cells and HeLa cells. Prior to transfection of the Rauscher cells, 50 µg of the appropriate construct was precipitated with ethanol, pelleted, dried, and resuspended in 180 µl of sterile distilled H2O. Rauscher erythroleukemia cells (de Both et al., 1978) were grown to late log density (0.8–1.0 × 10⁶/ml) in RPMI 1640 medium (Life Technologies, Inc.) containing 10% fetal calf serum (HyClone Laboratories, Inc.). Cells were harvested by centrifugation at 4°C, washed once with ice-cold RPMI 1640, and resuspended in ice-cold RPMI 1640 at 1.75 × 10⁷ cells/ml.
This cell suspension, 0.8 ml, was combined with 20 µl of 10× HBS (0.2 M Hepes, pH 7.05, 1.37 M NaCl, 50 mM KCl, 7.0 mM Na2HPO4), and 180 µl of plasmid DNA (above) in an electroporation chamber with a 0.4-cm gap (Life Technologies, Inc.) on ice. After 10 min on ice, the cells were shocked while on ice using a BRL CellPorator electroporation system I (Life Technologies, Inc.) at 1180 µF and 750 V. After an additional 5 min on ice, the cells were transferred to a culture flask containing 9 ml of prewarmed medium. Cells were cultured for 36–40 h, then harvested by centrifugation, and lysed in 250 µl of reporter lysis buffer (Life Technologies, Inc.). Luciferase enzyme activity of a constant volume in each cell lysate (20 µl) was measured using the luciferase assay system (Promega Corp., Madison, WI) in a Turner TD-20e luminometer (Turner Designs, Sunnyvale, CA).

HeLa cells were maintained in DMEM containing 10% fetal calf serum (HyClone Laboratories, Inc.). Cells, grown to 75% confluency in 10-cm² dishes, were transfected with 15 µg of the appropriate luciferase reporter construct using a standard calcium phosphate DNA-precipitation procedure (Sambrook et al., 1989). Cell lysates were prepared 30–36 h following transfection, and luciferase enzyme activity was measured as described above.

RESULTS

Isolation of the Human p55 Gene

Two cosmid clones, ICRFc104D0864 and ICRFc104E0139, were isolated from a human X-chromosome-specific cosmid library. Both cosmids, designated here D0864 and E0139, hybridized with the full-length cDNA probe of human p55. The D0864 cosmid hybridized with the cDNA probes derived from both the 5′- and the 3′-untranslated ends of the p55 cDNA. In contrast, the cosmid E0139 hybridized only with the upstream cDNA probe (data not shown). These results indicate that the entire p55 gene is contained in the D0864 cosmid (Fig. 1A). Therefore, cosmid D0864 was selected to construct the p55 intron–exon map. A restriction map of the cosmid D0864 DNA indicated that KpnI digestion produces six DNA fragments ranging from 3.0 to 12.5 kb, which proved most suitable for subsequent analysis (Fig. 1A). These genomic fragments were then subcloned into the plasmid vector, and Southern blotting revealed that four of the six KpnI fragments hybridized to the various p55 cDNA probes. The cloned cosmid genomic fragments were then used for the mapping of p55 intron–exon boundaries.

FIG. 1. Organization of the human p55 gene. (A) The intron–exon map of the human p55 gene derived from cosmid clone D0864. The boundaries of the KpnI fragments K16 and K30 extend further upstream and downstream of the p55 gene. (B) The correspondence of exons with known protein domains in human erythroid p55. The sizes of the exons are shown below. Exon 8 defines a newly identified domain that contains a highly conserved site for tyrosine phosphorylation. We have designated the PDZ domain of p55 as PDZ* to highlight the fact that it is significantly shorter than the PDZ domains found in other MAGUKs. It should also be noted that based on published sequence alignments, the start of the amino terminus of the PDZ* domain is shown at residue 71. However, our alignment analysis indicates that a significant sequence identity starts from residue 83. The protein domains are PDZ* (PSD-95/discs large/ZO-1), SH3 (Src homology 3), 4.1B (protein 4.1 binding), TP (tyrosine phosphorylation), and GUK (guanylate kinase).

FIG. 2. The intron–exon boundaries of the p55 gene. The nucleotide sequence of the intron–exon junctions. The details of cosmid clone D0864 are shown in Fig. 1. Intron sizes are shown to the right side of the 5′ donor sequence.

Elucidation of the Intron–Exon Organization of the p55 Gene

The precise positions of the intron–exon boundaries were determined by direct sequencing of the KpnI fragments derived from the D0864 cosmid DNA (Fig. 2). The nucleotide sequence of the KpnI fragments was determined using primers that covered both strands of the p55 cDNA (Ruff et al., 1991). The sizes of the respective introns were measured by direct amplification of the DNA fragments using exon-specific flanking primers. The PCR-based detection of introns 1, 5, 9, and 11 was carried out in the presence of Pfu DNA polymerase, which allows the amplification of longer segments of the genomic DNA (Barnes, 1994). The human p55 gene consists of 12 exons interrupted by 11 introns and spans approximately 28 kb (Figs. 1 and 2). The sizes of the exons are relatively small and range from 69 bp (exon 5) to 203 bp (exon 10), whereas intron
sizes range from 280 bp (intron 2) to ~14 kb (intron 1). All exon–intron splice junctions conform to the eukaryotic 5′ donor and 3′ acceptor consensus splice junction sequence GT-AG (Stephens and Schneider, 1992). Of the 11 splice junctions, 55% (6 of 11) occurred between codons (Type 0). Three of the introns interrupted codons between positions 1 and 2 (Type 1), and the remaining two introns interrupted codons between positions 2 and 3 of the reading frame (Type 2) (Sharp, 1981).

Mapping of the Transcription Start Site

We have previously shown that a 2.0-kb transcript of p55 is present in human reticulocytes as well as in many nonerythroid tissues (Ruff et al., 1991; Metzenberg and Gitschier, 1992). To map the transcription start site, an anchored PCR-based primer extension method was used. This newly developed 5′-RACE (Marathon cDNA amplification) method produces relatively large RACE products by employing the “long-distance PCR” technology (see Methods). The p55 transcript present in human reticulocytes was amplified using an adaptor-specific sense primer and an antisense primer specific for exon 2 of human p55. Subsequent amplification of the PCR products with nested primers produced a 250-bp PCR product (not shown). The nucleotide sequence of the 250-bp PCR fragment suggested that the transcription start site begins at position −115 from the initiation codon of p55 cDNA (Fig. 3). The −115 position of the transcription start site is consistent with the length of published p55 cDNA sequence and the 2.0-kb transcripts of p55 detected in erythroid and nonerythroid cells (Ruff et al., 1991; Metzenberg and Gitschier, 1992). The results from the S1 nuclease protection experiments were not conclusive due to experimental problems associated with the preparation and digestion of the radiolabeled probes (not shown). Such problems have been previously encountered with the mapping of the transcription start sites of genes containing GC-rich regions (Noguiez et al., 1992).
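The intron types reported above for the 11 splice junctions (Types 0, 1, and 2; Sharp, 1981) depend only on how many coding bases precede each intron, and the GT-AG rule is a simple check on the ends of each intron. The Python sketch below illustrates both and recovers the Type 0 count by subtraction from the stated totals. The function names, the example coding lengths, and the example intron sequence are illustrative assumptions and are not taken from Fig. 2.

```python
def intron_type(coding_bases_before_intron: int) -> int:
    """Classify an intron by its position relative to the reading frame.

    Type 0: the intron lies between two codons.
    Type 1: it interrupts a codon between positions 1 and 2.
    Type 2: it interrupts a codon between positions 2 and 3.
    """
    if coding_bases_before_intron < 0:
        raise ValueError("coding length must be non-negative")
    return coding_bases_before_intron % 3


def follows_gt_ag_rule(intron_seq: str) -> bool:
    """Return True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron_seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Totals stated in the text: 11 introns, three of Type 1 and two of Type 2.
TOTAL_INTRONS = 11
TYPE1, TYPE2 = 3, 2
TYPE0 = TOTAL_INTRONS - TYPE1 - TYPE2
TYPE0_PERCENT = round(100 * TYPE0 / TOTAL_INTRONS)

if __name__ == "__main__":
    print(f"Type 0 introns: {TYPE0} of {TOTAL_INTRONS} ({TYPE0_PERCENT}%)")
    # Hypothetical cumulative coding lengths (bases) preceding three introns.
    for length in (120, 121, 122):
        print(length, "-> Type", intron_type(length))
    # Hypothetical intron sequence used only to exercise the check.
    print(follows_gt_ag_rule("gtaagtctctttcag"))
```

Applied to real data, `intron_type` would take the cumulative coding length up to each exon boundary, counted from the first base of the initiation codon.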
The location of the transcription start site of p55 is also consistent with the analysis of the p55 promoter region as described in the next section.

Characterization of the Upstream Promoter Region

The nucleotide sequence of the upstream region of the p55 gene is shown in Fig. 3. A notable feature of the upstream sequence is its unusually high content of G + C, as shown graphically in Fig. 3. The G + C content of the region immediately preceding the first exon of the p55 gene is ~80%. Further analysis of the nucleotide sequence revealed four GC boxes at positions −119, −168, −189, and −374 (Fig. 3). The GC boxes are known to bind the Sp1 transcription factor (Kadonaga et al., 1987). No TATA box was found within the −400-bp GC-rich region of the upstream sequence. To confirm that the region immediately preceding exon 1 of the p55 gene contains putative promoter elements, a 315-bp SmaI fragment (Fig. 3) was subcloned into a luciferase reporter vector. In a transient expression assay, the sense promoter construct was found to be active in Rauscher erythroleukemia cells as well as in the nonerythroid HeLa cells (Table 1). An antisense construct also exhibited a significant level of promoter activity, a result that is consistent with the known properties of Sp1-binding promoters (Kadonaga et al., 1987). The activity of the p55 promoter may be higher than the values reported here, since the upstream SmaI subcloning site overlaps with one of the Sp1 binding sites located at position −374 of the first exon (Fig. 3). These properties are characteristic of promoters found in the housekeeping genes (Tazi and Bird, 1990).

FIG. 3. Characterization of the upstream region of the human p55 gene. The upstream sequence was analyzed by the MacVector computer program (Kodak). Four predicted Sp1 binding sites are boxed. The SmaI sites were used to produce a 315-bp fragment for promoter analysis (Table 1). The codons for the first two amino acids of the p55 protein are shown. The solid circle at position −103 shows the location of the beginning of the published transcript of human erythroid p55 cDNA (Ruff et al., 1991). The arrow at position −115 shows the location of the putative transcription start site as determined by 5′ RACE (see Methods). The mapping of the transcription start site is consistent with the known size of p55 transcripts present in tissues examined so far (Ruff et al., 1991; Metzenberg and Gitschier, 1992). In the plot showing the G + C content of the upstream region, the arrowheads indicate the predicted restriction enzyme sites for MspI/HpaII. The nucleotide sequence of the upstream region (−1600) has been submitted to the EMBL/GenBank Data Libraries under Accession No. U39611.

Location and Conservation of the p55 Gene

The p55 gene has been previously localized downstream of the factor VIII gene on the human X chromosome (Metzenberg and Gitschier, 1992). Since the location of the p55 gene has not been established by direct fluorescence in situ hybridization, the presence of additional copies of the p55 gene, or conserved pseudogenes, on other chromosomes remains a possibility. To resolve this issue, we hybridized Southern blots containing DNA from human and hamster somatic cell hybrids. The hybridization signals segregated with the Xq24–
qter region (data reviewed but not shown), indicating that the p55 gene is located only within this region of the human X chromosome. A notable feature of the p55 gene is its high conservation. Homologous genes were detected in many animal species (Fig. 4). The size of the EcoRI-digested fragments of human genomic DNA (lane 1), which hybridized with the full-length cDNA probe of p55, was calculated to be ~20 kb. Interestingly, the size of the human p55 gene is comparable to the calculated size (~20 kb) of the Drosophila discs-large tumor suppressor gene (Woods and Bryant, 1991).

TABLE 1
Measurement of the Promoter Activities of p55–Luciferase Constructs

Construct            HeLa experiment 1    HeLa experiment 2    Rauscher
SV40/Luc             22257 ± 6398         7963 ± 376           3731 ± 596
p55/Luc sense        5255 ± 2081          1944 ± 185           1221 ± 74
p55/Luc antisense    2406 ± 115           833 ± 23             990 ± 68
Promoterless Luc     n.d.ᵃ                90 ± 17              ≤0.1 ± 0.1

Note. Results are expressed as total Turner light units per lysate. Each value is the average of duplicate samples ± range. The upstream 315-bp SmaI fragment of the p55 gene was subcloned into the pGL2-Basic vector (Promega). After transient expression, luciferase activities were measured as described under Materials and Methods.
ᵃ Not determined.

DISCUSSION
Our rationale for deciphering the intron–exon map of the p55 gene was our interest in determining the relationship of putative protein domains with their respective exons. The boundaries of various protein domains in MAGUKs have been previously established only by an alignment of the primary structures, with the exception of our previous identification of the protein 4.1 binding domain in p55 and hDlg (Lue et al., 1994; Marfatia et al., 1995). The hDlg protein also contains a second binding site for protein 4.1 (Lue et al., 1994). As reported here, exons 1 and 2 of the p55 gene correspond to an undefined N-terminal domain whose function is not yet known. A single copy of the PDZ* (previously termed the DHR or GLGF repeat) domain present in p55 is contained in exons 3 through 5. In fact, exon 5 contains only a portion of the predicted PDZ* domain (Fig. 1B). The exon organization of a single PDZ* domain in p55 may be useful in expressing a stably folded PDZ* domain produced by in vitro recombinant methods.

The entire SH3 domain of p55 corresponds to exon 6, underscoring the evolutionary conservation and widespread distribution of this motif in diverse proteins (Yu et al., 1994). Exon 7 defines the boundaries of the protein 4.1 binding domain. Interestingly, the exon 7-encoded boundaries of the protein 4.1 binding domain fit precisely with the boundaries of the experimentally determined protein 4.1 binding domain (Marfatia et al., 1995). We have previously shown that a similar protein 4.1 binding domain also exists in hDlg and is highly conserved in other MAGUKs (Marfatia et al., 1995). The protein 4.1 binding domain may therefore represent a distinct protein module in MAGUKs.
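The exon-to-domain correspondence described here and in the legend to Fig. 1B can be held in a small lookup table, which may be convenient when choosing exon boundaries for expression constructs or for domain-specific PCR. The Python sketch below is illustrative only: the dictionary layout and the helper names `exons_encoding` and `domains_in_order` are assumptions, while the assignments themselves follow the text and Fig. 1B.

```python
from typing import Dict, List

# Exon-to-domain assignments as given in the text and the Fig. 1B legend.
# Exon 12 also carries the C-terminal end and the 3' untranslated region.
EXON_TO_DOMAIN: Dict[int, str] = {
    1: "N-terminal", 2: "N-terminal",       # function not yet known
    3: "PDZ*", 4: "PDZ*", 5: "PDZ*",        # shortened PDZ domain
    6: "SH3",                               # Src homology 3
    7: "4.1B",                              # protein 4.1 binding
    8: "TP",                                # tyrosine phosphorylation
    9: "GUK", 10: "GUK", 11: "GUK", 12: "GUK",  # guanylate kinase-like
}


def exons_encoding(domain: str) -> List[int]:
    """Return the exon numbers assigned to the named domain, in ascending order."""
    return sorted(exon for exon, name in EXON_TO_DOMAIN.items() if name == domain)


def domains_in_order() -> List[str]:
    """Return the domain names in 5'-to-3' exon order, without repeats."""
    ordered: List[str] = []
    for exon in sorted(EXON_TO_DOMAIN):
        name = EXON_TO_DOMAIN[exon]
        if name not in ordered:
            ordered.append(name)
    return ordered


if __name__ == "__main__":
    for domain in domains_in_order():
        print(domain, exons_encoding(domain))
```

A table of this kind makes it easy to ask, for example, which exons would need to be amplified to recover a single domain such as PDZ*.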
Whether the intron–exon boundaries of the protein 4.1 binding domain in p55 are conserved in other MAGUKs awaits elucidation of their genomic structures. The region between the protein 4.1 binding domain and the guanylate kinase domain of p55 is contained in exon 8 (Fig. 1B). A comparison of the 27-amino-acid sequence of the p55 exon 8 with other MAGUKs revealed a highly conserved domain (Fig. 5). A novel feature of this domain is the presence of a conserved tyrosine residue that is a consensus site for tyrosine phosphorylation (Pawson, 1995). The relative position of tyrosine 271 in p55 is conserved in all MAGUKs. An alignment of the p55 exon 8 produced a consensus sequence YEXV in all MAGUKs (Fig. 5). For convenience, we have designated this segment the TP (tyrosine phosphorylation) domain (Fig. 1B). If tyrosine 271 in p55 turns out to be a site for tyrosine phosphorylation, then the boundaries of exon 8 may define a new protein domain in p55 or perhaps in MAGUKs in general.

FIG. 5. Sequence comparison of the tyrosine phosphorylation domain of p55 with other MAGUKs. The 27 amino acids of the p55 exon 8 were aligned with the primary structures of MAGUKs using the MACAW computer program. The p55 exon 8 extends from phenylalanine 263 to glycine 289 (Ruff et al., 1991). The tyrosine residue in the signature sequence YEXV is the predicted site for phosphorylation by tyrosine kinases. The hDlg protein contains a second tyrosine phosphorylation site located four residues upstream of the signature sequence. This tyrosine residue is not conserved in other MAGUKs. Note that both p55 and hDlg are tyrosine phosphorylated in vitro (our unpublished data). We are currently making cDNA constructs to determine the tyrosine phosphorylation site in MAGUKs.

The guanylate kinase-like domain of p55 is contained in exons 9 through 12 (Fig. 1B). The MAGUKs can be subdivided into two categories, depending upon the presence or absence of a three-amino-acid deficiency in the putative ATP binding site of the guanylate kinase domain (Woods and Bryant, 1993). The p55 protein does not contain the three-amino-acid deficiency in its guanylate kinase domain (Woods and Bryant, 1993). Interestingly, exon 9 of the p55 gene begins exactly at the ATP binding motif (ASGVGR), thus raising the possibility that the three-amino-acid deficiency in other MAGUKs may arise simply by the truncation of a similar exon. Experiments are currently underway to test this hypothesis in hDlg. The last exon, exon 12, includes the remaining C-terminal end of the p55 protein. The functional significance of this domain is not yet known. The last exon also contains the untranslated region of the p55 cDNA (Ruff et al., 1991).

It is now well established that the boundaries of the protein-folding domains encoding structural and functional segments of proteins are often defined by the corresponding exons (Go, 1981, 1983; Branden et al., 1984). The intron–exon junctions precisely define the structural and functional domains of alcohol dehydrogenase (Branden et al., 1984). In fact, five structurally repeating segments within the dinucleotide binding domain of alcohol dehydrogenase are encoded by five respective exons (Branden et al., 1984). These observations suggest that the amino acid boundaries marked by the exons of the p55 gene (Fig. 1) may identify protein modules with a distinct structural and functional identity. Further support for this hypothesis awaits the availability of intron–exon boundaries of other mammalian MAGUKs.

The p55 gene appears to be highly conserved through evolution (Fig. 4) (Metzenberg and Gitschier, 1992). After the completion of these studies, Elgar (1995) reported on the genomic structure of the p55 gene of the puffer fish Fugu rubripes. Although the size of the p55 gene in Fugu rubripes is only 5.5 kb, the gene is organized into 12 exons. The predicted amino acid sequence of the Fugu p55 gene is similar to that of the human p55 gene. The amino acid sequence of the murine p55 is virtually identical to that of the human p55 (our unpublished data), and we have not detected any alternatively spliced transcripts of the p55 gene. This unusually high conservation of the human p55 gene may expedite evaluation of the functions of the p55 gene in suitable animal model systems.

FIG. 4. Conservation of the human p55 gene. Southern hybridization analysis of genomic DNA from nine eukaryotic species. Each lane contains 8.0 µg of EcoRI-digested genomic DNA and was hybridized with a radiolabeled probe of human p55 cDNA. Rat genomic DNA (lane 3) shows very faint hybridizing signals, and no signal was detected in yeast (lane 9). It should be noted that the hybridizing pattern in human DNA (lane 1) shows two closely spaced bands of approximately 6.0 kb in the original autoradiogram.

In addition, the availability of the nucleotide sequence of intron–exon junctions may be useful in searching for genetic abnormalities of the p55 gene in human diseases. Indeed, Ruff et al. (1994) recently reported a 23-amino-acid deletion of the p55 transcript in a patient with blastic chronic myeloid leukemia. The p55 deletion was identified in the reverse-transcribed RNA corresponding to codons 138–160, which precisely define the boundaries of exon 5 (Figs. 1B and 2). Similarly, the loci of at least 14 diseases have been mapped to the q28 region of the human X chromosome (Metzenberg and Gitschier, 1992; Metzenberg et al., 1994). Since the genes for these abnormalities have not yet been identified, the p55 gene becomes a candidate for any of these diseases. Now that the complete structure of the human p55 gene has been determined, it will be feasible to subject such enquiries to experimental testing.

ACKNOWLEDGMENTS

We thank Dr. Peter Bryant of the University of California for his helpful advice and for introducing us to the MACAW alignment program. This work was supported by the National Institutes of Health Grants HL51445 and HL37462 to A.H.C. A.H.C. is an established investigator of the American Heart Association.
REFERENCES Alloisio, N., Venezia, N. D., Rana, A., Andrabi, P., Texier, P., Gilsanz, F., Cartron, J.-P., Delaunay, J., and Chishti, A. H. (1993). Evidence that red blood cell protein p55 may participate in the skeleton– membrane linkage that involves protein 4.1 and glycophorin C. Blood 82: 1323–1327. Arpin, M., Algrain, M., and Louvard, D. (1994). Membrane–actin microfilament connections: An increasing diversity of players related to band 4.1. Curr. Opin. Cell Biol. 6: 136–141. Barnes, W. M. (1994). PCR amplification of up to 35-kb DNA with high fidelity and high yield from lambda bacteriophage templates. Proc. Natl. Acad. Sci. USA 91: 2216–2220. Borson, N. D., Salo, W. L., and Drewes, L. R. (1992). A lock-docking oligo(dT) primer for 5* and 3* RACE PCR. PCR Methods Appl. 2: 144–148. Branden, C. I., Eklund, H., Cambillau, C., and Pryor, A. J. (1984). Correlation of exons with structural domains in alcohol dehydrogenase. EMBO J. 3: 1307–1310. Bretscher, A. (1995). Rapid phosphorylation and reorganization of ezrin and spectrin accompany morphological changes induced in A-431 cells by epidermal growth factor. J. Cell Biol. 108: 921–930. Cho, K.-O., Hunt, C. A., and Kennedy, M. B. (1992). The rat brain postsynaptic density fraction contains a homolog of the Drosophila discs-large tumor suppressor protein. Neuron 9: 929–942. de Both, N. J., Vermey, M., Hull, E. V., Klootwijk, E., van Griensven, L. J. L. D., Mol, J. N. M., and Stoof, T. J. (1978). A new erythroid cell line induced by Rauscher murine leukaemia virus. Nature 272: 626–628.
Elgar, G. (1995). Genomic structure and nucleotide sequence of the p55 gene of the puffer fish Fugu rubripes. Genomics 27: 442–446.
Fehon, R. G., Dawson, I. A., and Artavanis-Tsakonas, S. (1994). A Drosophila homologue of membrane–skeleton protein 4.1 is associated with septate junctions and is encoded by the coracle gene. Development 120: 545–557.
Go, M. (1981). Correlation of DNA exonic regions with protein structural units in haemoglobin. Nature 291: 90–92.
Go, M. (1983). Modular structural units, exons, and function in chicken lysozyme. Proc. Natl. Acad. Sci. USA 80: 1964–1968.
Hemming, N. J., Anstee, D. J., Staricoff, M. A., Tanner, M. J. A., and Mohandas, N. (1995). Identification of the membrane attachment sites for protein 4.1 in the human erythrocyte. J. Biol. Chem. 270: 5360–5366.
Itoh, M., Nagafuchi, A., Yonemura, S., Kitani-Yasuda, T., and Tsukita, S. (1993). The 220-kD protein colocalizing with cadherins in non-epithelial cells is identical to ZO-1, a tight junction-associated protein in epithelial cells: cDNA cloning and immunoelectron microscopy. J. Cell Biol. 121: 491–502.
Jesaitis, L. A., and Goodenough, D. A. (1994). Molecular characterization and tissue distribution of ZO-2, a tight junction protein homologous to ZO-1 and the Drosophila discs-large tumor suppressor protein. J. Cell Biol. 124: 949–961.
Kadonaga, J. T., Carner, K. R., Masiarz, F. R., and Tjian, R. (1987). Isolation of cDNA encoding transcription factor Sp1 and functional analysis of the DNA binding domain. Cell 51: 1079–1090.
Kim, C. L. V., Colin, Y., Mitjavila, M.-T., Clerget, M., Dubart, A., Nakazawa, M., Vainchenker, W., and Cartron, J.-P. (1989). Structure of the promoter region and tissue specificity of the human glycophorin C gene. J. Biol. Chem. 264: 20407–20414.
Kistner, U., Wenzel, B. M., Veh, R. W., Cases-Langhoff, C., Garner, A. M., Appeltauer, U., Voss, B., Gundelfinger, E. D., and Garner, C. C. (1993). SAP90, a rat presynaptic protein related to the product of the Drosophila tumor suppressor gene dlg-A. J. Biol. Chem. 268: 4580–4583.
Lue, R. A., Marfatia, S. M., Branton, D., and Chishti, A. H. (1994). Cloning and characterization of hdlg: The human homologue of the Drosophila discs large tumor suppressor binds to protein 4.1. Proc. Natl. Acad. Sci. USA 91: 9818–9822.
Marfatia, S. M., Lue, R. A., Branton, D., and Chishti, A. H. (1994). In vitro binding studies suggest a membrane-associated complex between erythroid p55, protein 4.1, and glycophorin C. J. Biol. Chem. 269: 8631–8634.
Marfatia, S. M., Lue, R. A., Branton, D., and Chishti, A. H. (1995). Identification of the protein 4.1 binding interface on glycophorin C and p55, a homologue of the Drosophila discs-large tumor suppressor protein. J. Biol. Chem. 270: 715–719.
Mazoyer, S., Gayther, S. A., Nagai, M. A., Smith, S. A., Dunning, A., Van Rensburg, E. J., Albertsen, H., White, R., and Ponder, B. A. J. (1995). A gene (DLG2) located at 17q12–q21 encodes a new homologue of the Drosophila tumor suppressor dlg-A. Genomics 28: 25–31.
Metzenberg, A. B., and Gitschier, J. (1992). The gene encoding the palmitoylated erythrocyte membrane protein, p55, originates at the CpG island 3′ to the factor VIII gene. Hum. Mol. Genet. 1: 97–101.
Metzenberg, A. B., Pan, Y., Das, S., Pai, G. S., and Gitschier, J. (1994). Molecular evidence that the p55 gene is not responsible for either of two Xq28-linked disorders: Emery–Dreifuss muscular dystrophy and dyskeratosis congenita. Am. J. Hum. Genet. 54: 920–922.
Muller, B. M., Kistner, U., Veh, R. W., Cases-Langhoff, C., Becker, B., Gundelfinger, E. D., and Garner, C. C. (1995). Molecular characterization and spatial distribution of SAP97, a novel presynaptic protein homologous to SAP90 and the Drosophila discs-large tumor suppressor protein. J. Neurosci. 15: 2354–2366.
Nizetic, D., Zehetner, G., Monaco, A. P., Gellen, L., Young, B. D., and Lehrach, H. (1991). Construction, arraying, and high-density screening of large insert libraries of human chromosomes X and 21: Their potential use as reference libraries. Proc. Natl. Acad. Sci. USA 88: 3233–3237.
Noguiez, P., Barnes, D. E., Mohrenweiser, H. W., and Lindahl, T. (1992). Structure of the human DNA ligase I gene. Nucleic Acids Res. 20: 3845–3850.
Pawson, T. (1995). Protein modules and signalling networks. Nature 373: 573–580.
Rouleau, G. A., Merel, P., Lutchman, M., Sanson, M., Zucman, J., Marineau, C., Hoang-Xuan, K., Demczuk, S., Desmaze, C., Plougastel, B., Pulst, S. M., Lenoir, G., Bijlsma, E., Fashold, R., Dumanski, J., de Jong, P., Parry, D., Eldridge, R., Aurias, A., Delattre, O., and Thomas, G. (1993). Alteration in a new gene encoding a putative membrane-organizing protein causes neurofibromatosis type 2. Nature 363: 515–521.
Ruff, P., Speicher, D. W., and Chishti, A. H. (1991). Molecular identification of a major palmitoylated erythrocyte membrane protein containing the src homology 3 motif. Proc. Natl. Acad. Sci. USA 88: 6595–6599.
Ruff, P., Bischoff, D., and Chishti, A. H. (1994). Human erythroid p55: A palmitoylated membrane protein with homology to the Drosophila disc-large tumor suppressor protein is ubiquitously expressed and altered in blastic chronic myeloid leukemia (CML). Blood 84: 2438A.
Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). “Molecular Cloning: A Laboratory Manual,” 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
Sharp, P. A. (1981). Speculations on RNA splicing. Cell 23: 643–646.
Stephens, R. M., and Schneider, T. D. (1992). Features of spliceosome evolution and function inferred from an analysis of the information at human splice sites. J. Mol. Biol. 228: 1124–1136.
Takakuwa, Y., Tchernia, G., Rossi, M., Benabadji, M., and Mohandas, N. (1986). Restoration of normal membrane stability to unstable protein 4.1-deficient erythrocyte membranes by incorporation of purified protein 4.1. J. Clin. Invest. 78: 80–85.
Tazi, J., and Bird, A. (1990). Alternative chromatin structure at CpG islands. Cell 60: 909–920.
Trofatter, J. A., MacCollin, M. M., Rutter, J. L., Murrell, J. R., Duyao, M. P., Parry, D. M., Eldridge, R., Kley, N., Menon, A. G., Pulaski, K., Haase, V. H., Ambrose, C. M., Munroe, D., Bove, C., Haines, J. L., Martuza, R. L., MacDonald, M. E., Seizinger, B. R., Short, M. P., Buckler, A. J., and Gusella, J. F. (1993). A novel moesin-, ezrin-, radixin-like gene is a candidate for the neurofibromatosis 2 tumor suppressor. Cell 72: 791–800.
Villeval, J.-L., Kim, C. L. V., Bettaieb, A., Debili, N., Colin, Y., Maliki, B. E., Blanchard, D., Vainchenker, W., and Cartron, J.-P. (1989). Early expression of glycophorin C during normal and leukemic human erythroid differentiation. Cancer Res. 49: 2626–2632.
Willott, E., Balda, M. S., Fanning, A. S., Jameson, B., Itallie, C. V., and Anderson, J. M. (1993). The tight junction protein ZO-1 is homologous to the Drosophila discs-large tumor suppressor protein of septate junctions. Proc. Natl. Acad. Sci. USA 90: 7834–7838.
Woods, D. F., and Bryant, P. J. (1991). The discs-large tumor suppressor gene of Drosophila encodes a guanylate kinase homolog localized at septate junctions. Cell 66: 451–464.
Woods, D. F., and Bryant, P. J. (1993). ZO-1, DlgA and PSD-95/SAP90: Homologous proteins in tight, septate and synaptic cell junctions. Mech. Dev. 44: 85–89.
Yu, H., Chen, J. K., Feng, S., Dalgarno, D. C., Brauer, A. W., and Schreiber, S. L. (1994). Structural basis for the binding of proline-rich peptides to SH3 domains. Cell 76: 933–945.