Molecular cloning and characterization of the neonatal rat and mouse submandibular gland protein SMGC


Gene 334 (2004) 23 – 33 www.elsevier.com/locate/gene

Karen M. Zinzen a, Arthur R. Hand b, Maya Yankova b, William D. Ball c, Lily Mirels a,*

a Department of Molecular and Cell Biology, University of California, 401 Barker Hall #3204, Berkeley, CA 94720-3204, USA
b Department of Pediatric Dentistry, University of Connecticut School of Dental Medicine, Farmington, CT, USA
c Department of Anatomy, Howard University College of Medicine, Washington, DC, USA

Received 17 November 2003; accepted 5 March 2004
Available online 30 April 2004
Received by J.A. Engler

Abstract

We report the molecular cloning and characterization of SMGC, a major secretory product and a marker of the type I (terminal tubule) cells of the neonatal rat and mouse submandibular gland. SMGC is expressed in the submandibular gland at high levels through postnatal day 20, but in the adult is present only in some intercalated duct cells. Rat and mouse SMGC have deduced molecular weights of 67.8 and 74.4 kDa, respectively, are 37% Ser + Gly + Thr, and contain tandem repeats of between 8 and 60 amino acids. Secreted SMGC visualized by SDS-PAGE and silver staining is 89 kDa in rat and 105 kDa in mouse, although Western blot analyses with anti-SMGC antisera demonstrate multiple additional lower molecular weight forms. Contributions to the heterogeneity of SMGC include alternate splicing, proteolysis and N-glycosylation. Smgc is localized on rat chromosome 7q34-35 and on mouse chromosome 15E3, both immediately upstream of the high molecular weight salivary mucin, Muc19. Amino acid sequence identity between the signal peptides of SMGC, human MUC19 and pig submaxillary mucin suggests that rat and mouse Smgc and Muc19 arose from a single ancestral mucin gene.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Development; Type I cell; Terminal tubule cell; Intercalated duct; Tandem repeat; Muc19

Abbreviations: PBS, phosphate-buffered saline; SDS-PAGE, sodium dodecyl sulfate polyacrylamide gel electrophoresis; RT-PCR, reverse transcriptase polymerase chain reaction; GST, glutathione-S-transferase; CSP1, rat common salivary protein 1; DCPP, mouse demilune cell and parotid protein; SMGB, neonatal rat submandibular gland protein B; GRP, glutamine-glutamic acid-rich salivary protein.
* Corresponding author. Tel.: +1-510-642-5007; fax: +1-510-643-5785. E-mail address: [email protected] (L. Mirels).
0378-1119/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2004.03.014

1. Introduction

The neonatal rat and mouse submandibular gland contains two transient secretory cell types, proacinar (type III) and terminal tubule (type I) cells (Cutler and Chaudhry, 1974; Yamashina and Barka, 1972; Gresik and MacRae, 1975). Morphological and immunocytochemical studies have demonstrated that the proacinar cells are progenitors of the seromucous acinar cells of the mature gland (Yamashina and Barka, 1972; Denny et al., 1988; Moreira et al., 1991). The developmental role of the type I cells is less clear, although they are thought to contribute to the adult submandibular gland intercalated ducts (reviewed in Denny et al., 1997). Type I cells are a major cell type in the developing submandibular gland during the first three postnatal weeks, but are lost from the gland between days 20 and 30. In rats, apoptosis has been shown to contribute substantially to this developmentally programmed cell population change (Hecht et al., 2000; Hayashi et al., 2000).

Submandibular gland protein C (SMGC) is the only known cell-specific product of the type I cells, and has been described only in rat. SMGC is present in all type I cells throughout development, and in a subset of granular intercalated duct cells in the adult gland (Ball et al., 1988; Moreira et al., 1990). A major component of rat submandibular gland secretion between neonatal days 0 and 20, SMGC appears as a broad ~89 kDa band when visualized by SDS-PAGE (Ball et al., 1988, 1993). The function of SMGC is not known.

Rats traditionally have been used as a model organism for studies of salivary gland physiology and development.


Technological advances such as targeted mutagenesis and the mouse genome database have made mice an increasingly attractive model for these studies. However, rat and mouse salivary protein content differs considerably. Some salivary proteins, including rat and mouse parotid secretory protein and prolactin-inducible protein, retain a high degree of similarity in their sequences and expression patterns (Shaw et al., 1986; Poulsen et al., 1986; Myal et al., 1994; Mirels et al., 1998a,b). Others, such as the submandibular mucin MUC10, have conserved tissue specificity but diverge considerably in their sequences (Denny et al., 1996; Albone et al., 1996). The orthologous salivary proteins rat CSP1 and mouse DCPP have diverged in both sequence and tissue specificity: CSP1 is expressed in neonatal submandibular gland proacinar cells, whereas DCPP is not (Girard et al., 1993; Bekhor et al., 1994). Examples of the most extreme divergence between rat and mouse salivary secretion are SMGB and GRP, highly abundant products of neonatal and adult rat submandibular gland, respectively, which have no mouse homologue (Mirels et al., 1987, 1998a; Rosinski-Chupin and Rougeon, 1990). The purpose of this study is to characterize rat and mouse Smgc, and to determine whether the cell specificity of expression of these proteins is conserved in the developing rat and mouse submandibular gland.

2. Materials and methods

2.1. Animals and harvesting of samples

Sprague Dawley rats were obtained from Harlan (Indianapolis, IN) and C57BL/6J mice were obtained from The Jackson Laboratory (Bar Harbor, ME). Procedures involving animals were approved by the U.C. Berkeley Office of Laboratory Animal Care. For harvesting of organs, animals were anesthetized with isoflurane, then euthanized by cervical dislocation. Glands to be homogenized or used for preparation of RNA were frozen in liquid nitrogen and stored at −70 °C. Secretion was collected as described (Mirels et al., 1998b).

2.2. RNA isolation, northern blot analysis, and hybridization selection

RNA was isolated using Trizol (Invitrogen, Carlsbad, CA), and poly(A)+ RNA was isolated on oligo d(T) cellulose using a Poly(A) Pure kit (Ambion, Austin, TX) as directed by the manufacturers. For Northern blot analysis, RNA was denatured, size fractionated in 1.5% agarose-formaldehyde gels and transferred to nitrocellulose. Probes were labeled with 32P by random priming (Ready-to-Go kit, Amersham, Piscataway, NJ), and hybridized to blots as in Mirels et al. (1998b). Hybridization selection was performed as described (Mirels et al., 1987).

2.3. Identification of rat Smgc cDNA clones

A cDNA library was prepared from 5 day old Sprague-Dawley rat submandibular gland poly(A)+ RNA in Lambda Zap II according to the manufacturer's directions (Stratagene, La Jolla, CA; Girard et al., 1993). First strand cDNA probe was prepared using 1 µg poly(A)+ RNA from 1 day old rat submandibular gland. The RNA was reverse transcribed in 1× M-MuLV buffer containing 800 µM dATP, dGTP, dTTP, 4.8 µM dCTP, 100 µM α-[32P]dCTP (6000 Ci/mmol), 20 u RNAsin, 1.2 µg oligo dT, and 100 u M-MuLV reverse transcriptase at 37 °C for 1 h. Template RNA was hydrolyzed by adding NaOH to 0.3 N, and incubating at 68 °C for 30 min. The library was plated, and filters were screened by incubation in 6× SSC, 5× Denhardt's, 0.2% SDS, 250 µg/ml salmon sperm DNA containing 2 × 10^6 cpm/ml labeled first strand cDNA at 65 °C overnight. The library was re-screened for longer clones using a Pst I fragment corresponding to rat Smgc bp 1070–1448 as above. Clones were excised by phagemid rescue as described by the manufacturer (Stratagene), and analyzed by restriction mapping and DNA sequence analysis.

2.4. Primer extension and 5′ RACE

Primers were synthesized by Qiagen/Operon (Alameda, CA). For primer extension, a 30-mer primer 5′-GAGCAGAGCACCACGGCCAGGTACAGAAGTATCAG (reverse and complement of bp 47–76 of rat Smgc, accession no. AY459347), was end-labeled with 32P and annealed with 1.4 µg 1 day old rat submandibular gland total RNA. The primer extension reaction was completed, and reaction products visualized as described (Girard et al., 1993). 5′ RACE was completed using a Marathon 5′ RACE kit (Clontech, Palo Alto, CA) as directed by the manufacturer. The cDNA synthesis reaction contained 1 µg poly(A)+ RNA from 5 day old rat submandibular gland.
The PCR reaction was performed using the 5′ AP1 adaptor primer provided by Clontech, and the 3′ primer: 5′-TCACCCCGGGATCCTCCAGCAGACACGCTGTCGTTCACCCCGGGATCCTCCAGCAGACACGCTGTCGTC (complementary to rat Smgc bp 203–226 and incorporating restriction enzyme sites) according to the following program: 94 °C for 1 min followed by 25 cycles of 94 °C for 30 s and 68 °C for 3 min.

2.5. Isolation and characterization of mouse Smgc cDNAs

Mouse Smgc cDNAs were generated by RT-PCR, using 0.5 µg poly(A)+ RNA from 1 to 2 day old C57BL/6J mouse submandibular glands. RNA and 350 ng random hexamers (Promega, Madison, WI) were denatured at 70 °C for 5 min, then cooled on ice. The RNA and primers were brought to 1× M-MuLV reverse transcriptase buffer, 170 µM each dNTP, 30 u RNAsin (Promega) and 200 u M-MuLV reverse transcriptase (Promega) and incubated at 37 °C for 1 h. PCR reactions included the 5′ primer 5′-ACAGTCTCTACACTTAGGTCCCA and one of the following 3′ primers: 3′.2, TTCTCATTCCCTCTTCATAAGGCT; 3′.3, CAGAAGGAGATCCAGATTGGGTCT; or 3′.4, GGATGACCAGTCACAAACACTATC; in 1× Epicentre FailSafe "Buffer E" with 1.25 u FailSafe PCR enzyme mix (Epicentre, Madison, WI). The PCR reaction conditions were: 98 °C for 2 min, followed by 25 cycles of 95 °C for 45 s, 60 °C for 1 min, 72 °C for 3 min, and a final extension at 72 °C for 7 min. PCR products were excised from 1.5% agarose gels and cloned into the vector pGEM-T Easy (Promega).

2.6. DNA and protein sequence analysis

DNA sequence analysis was performed at the U.C. Berkeley DNA Sequencing Facility on an Applied Biosystems 3100 Genetic Analyzer, using ABI Big Dye version 3.1 chemistry. For N-terminal sequencing, 5-day-old rat submandibular gland secretion was separated by SDS-PAGE and transferred to PVDF. The band containing SMGC was excised and analyzed by Matthew Williamson at the University of California at San Diego Biology Department protein sequencing facility as described (Mirels et al., 1998b).

2.7. Preparation of rabbit anti-rat SMGC and anti-mouse SMGC antisera

A GST-rat SMGC fusion protein was prepared using the vector pGEX-KG as described (Girard et al., 1993). The fusion protein contained the entire rat SMGC protein coding sequence, beginning at residue 4 of the secreted protein. SMGC cDNA was isolated by digestion with Kpn I (Smgc bp 127) and Xho I (in 3′ vector sequence). The complementary oligonucleotides, 5′-GATCCAGAGGTAGAGCTGGCCTCGGTAC and 5′-CGAGGCCAGCTCTACCTCTG, which when annealed encode a 5′ BamH I site followed by eight codons of SMGC sequence ending at the Kpn I site, were used to ligate the SMGC Kpn I/Xho I fragment into the BamH I and Xho I sites of pGEX-KG.
The amino acid sequence of the recombinant fusion protein across the GST/SMGC junction was: VPRGS↓RGRAGLGT, with VPRGS derived from the pGEX-KG polylinker, and the remainder corresponding to amino acids 22–29 of rat SMGC. A His6-mouse SMGC fusion protein was prepared using the vector pRSETB (Invitrogen). Two constructs were prepared: (1) the Xho I/Hind III fragment from clone 3.3E (mouse Smgc bp 237–2058) into the Xho I and Hind III sites of pRSETB, and (2) the Xho I/EcoR I fragment of clone 3.4G (mouse Smgc bp 237–2605, with 1452–1550 deleted) into the Xho I/EcoR I sites of the same vector. The N-terminal sequence of both fusion proteins across the Xho I site is: His6.(16 amino acids).DDDDKDPS↓SRKN, with the arrow denoting the pRSETB/SMGC junction, and SMGC sequences beginning at SRKN. Fusion protein plasmid (2) includes the SMGC translation termination codon, and approximately 360 bp of 3′ untranslated sequence, whereas plasmid (1) contains only protein coding sequences from SMGC and uses a termination codon present in pRSETB. Both constructs were expressed equally well, and a mixture of the two fusion proteins was used for immunization. Fusion proteins were expressed in E. coli strain BL21(DE3) and purified on Ni-NTA agarose according to the manufacturer's instructions (Qiagen, Valencia, CA). Protein was renatured by dialysis against a stepwise gradient of 6, 5, 4, 3, 2, 1, 0.5 and 0 M urea in 20 mM HEPES, 300 mM KCl, 5 mM MgCl2, 0.1% NP-40, 10% glycerol, 1 mM dithiothreitol, 0.5 mM PMSF, and finally against PBS at 4 °C. Rabbit antisera were raised against the GST-rat SMGC and His6-mouse SMGC fusion proteins at Pocono Rabbit Farm and Laboratory (Canadensis, PA), according to their standard fusion protein protocol.

2.8. Western blot analysis and PNGase F digestion

Mouse secretion was digested with PNGase F (New England Biolabs, Beverly, MA), and rat secretion was digested with N-glycanase (Prozyme, San Leandro, CA) according to the manufacturers' directions, at a ratio of 0.1 u enzyme to 3 µl secretion. Digested and undigested secretion products were fractionated by SDS-PAGE, transferred to nitrocellulose, and incubated with antisera as described (Mirels et al., 1998b). Primary antibody was used at a 1:2000 dilution for 1 h at room temperature. Bound antibody was detected by chemiluminescence (Western Lightning, Perkin Elmer, Boston, MA).

2.9. SMGC immunocytochemistry

Submandibular glands were fixed in 4% paraformaldehyde in PBS or 0.1 M sodium cacodylate buffer, pH 7.4, for 4–16 h. Glands from newborn and adult mice, and 1 day postnatal rats were fixed by immersion in cold fixative solution.
Glands from adult rats were fixed by vascular perfusion of anesthetized animals with room temperature fixative solution, followed by immersion in cold fixative. Following fixation, the glands were stored at 4 °C in 1% buffered paraformaldehyde. The tissues were embedded and 1 µm sections were prepared, incubated with antibodies, silver enhanced, and counterstained with methylene blue-Azure II as described (Mirels et al., 1998a). Primary antisera were used at a 1:30,000–1:60,000 dilution.

3. Results

3.1. Isolation of rat Smgc cDNA clones

A five day old rat submandibular gland cDNA library was initially screened for clones encoding highly transcribed products by hybridization to first strand cDNA synthesized from 1 day old rat submandibular gland poly(A)+ RNA. Twenty strongly reactive clones were isolated and grouped by cross-hybridization. Approximately half of the clones isolated were homologous to one another, and were considered likely candidates to encode Smgc based on their abundance and on the results of Northern blot analysis (Fig. 1A). The putative Smgc cDNA clones hybridized to ~3 kb transcripts present in neonatal rat submandibular gland RNA from days 0 to 20, but greatly reduced in amount at day 30. No Smgc transcripts were detected in RNA from adult male rat submandibular gland, sublingual gland, parotid gland, or liver. These clones were confirmed to encode Smgc by hybridization selection. Transcripts that hybridized to a representative clone were eluted and translated in vitro. Two translation products, a major high molecular weight product and a smaller, less abundant product, were immunoprecipitated by a rabbit antibody previously raised against SMGC ("anti-SMGC") isolated from rat submandibular gland secretion (Fig. 1B; Ball et al., 1988). The Smgc cDNA clones originally identified ranged from 1 to 2 kb in length. The two longest clones encoded approximately 1.3 kb of open reading frame and a 718 bp 3′ untranslated region. The library was re-screened, and thirty additional clones between 2 and 2.7 kb in length were

isolated and analyzed. The protein coding regions of the 10 longest clones were sequenced, adding approximately 500 bp to the sequence already determined. Smgc sequence analysis was completed using primer extension and 5′ RACE. 5′ RACE resulted in a single pool of reaction products, ~200–225 bp in length, which encoded an open reading frame contiguous with the previously determined Smgc cDNA and included a single Met residue followed by a candidate signal peptide. Primer extension was performed to confirm that the 5′ RACE products corresponded to the true 5′ end of Smgc transcripts (Fig. 1C). The 60 bp primer extension product indicated that the Smgc transcription start site is 14 bp 5′ to the Smgc sequence available from 5′ RACE. Rat Smgc genomic sequence (NW_044044) includes a TATAA box 23 bp upstream of the experimentally predicted Smgc nucleotide +1, consistent with the primer extension result. The full-length (2780 bp) rat Smgc sequence is available in GenBank under accession no. AY459347. Nucleotides 1–14 were taken from NW_044044, and the remainder determined as part of this work.

Fig. 1. Identification of cDNA clones encoding rat SMGC. (A) Northern blot analysis of total RNA from submandibular glands of neonatal rats ages 0–45 days, and from adult male rat liver (L), submandibular gland (SM), sublingual gland (SL), and parotid gland (P). SMGC cDNA clones hybridized to 3 kb transcripts present at highest levels on days 0–20. On longer exposure, low levels of SMGC transcripts were also detected in RNA from 30 days submandibular gland, but not in the adult samples. (B) In vitro translated and immunoprecipitated SMGC. Products of in vitro translation reactions visualized by SDS-PAGE on an 8.5% gel directly (lanes 1, 3, 5) or after immunoprecipitation with anti-SMGC (lanes 2, 4, 6). Lanes 1 and 2, no added RNA; lanes 3 and 4, translation of poly(A)+ RNA from 1 day submandibular gland; lanes 5 and 6, translation of poly(A)+ RNA from 1 day submandibular gland selected by hybridization to a SMGC cDNA clone. (C) Primer extension analysis of Smgc. Products of reverse transcriptase reactions containing labeled primer alone (primer), or with RNA from day 0 rat submandibular gland (+SMG RNA), visualized on a 6% acrylamide, 8 M urea sequencing gel. A single band, 60 bp beyond the length of the primer, was detected.

3.2. Rat SMGC protein

The full length Smgc sequence encodes a 674 amino acid protein with a predicted molecular mass of 67.8 kDa. SMGC contains five repeated motifs, between 8 and 60 amino acids in length, repeated two or three times. The repeat sequences are shown in Fig. 2 by underlining homologous sequences in the same color. Several cDNA clones contained small deletions of one or more of the repeat motifs, resulting in a decrease of the encoded protein by 8–60 amino acids. The sequence shown in Fig. 2 is a compilation of sequences present in any of the SMGC clones analyzed. The amino acid composition of SMGC includes 37% Ser + Gly + Thr (18% Ser, 12% Gly, 7% Thr), and contains many potential sites for post-translational modification (Sections 3.8 and 4). N-terminal sequencing was performed on SMGC isolated from neonatal rat submandibular gland secretion product. The sequence, TSVRGRAGLGTYSRFGLGVLL, was identical to amino acids 19–39 of Fig. 2, indicating that the SMGC signal peptide cleavage site is C-terminal to amino acid 18.

3.3. Genomic organization of rat Smgc

BLAST analysis of Smgc against the rat genome database resulted in significant alignments only with sequences derived from chromosome 7. The entire Smgc cDNA sequence was present in working draft sequence AC108595. Exon boundaries were assigned based on homology between the genomic and cDNA sequences, and the presence of flanking splice donor and acceptor consensus motifs. Rat Smgc is encoded by 17 exons (Table 1). All spliced exon boundaries create the codon GXX, with G from the preceding and XX from the succeeding exon; in Table 1 and Fig. 2, amino acids derived from these codons are assigned to the second exon. Exon boundaries exactly match the location of deletions observed in the multiple characterized cDNA clones, including individual clones lacking exons 6, 7, 11 and 12, or 13 but encoding a protein otherwise identical to that shown in Fig. 2.
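The reading-frame bookkeeping behind these exon deletions can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis: it copies the rat exon coordinates listed in Table 1, and the helper names (`exon_length`, `frame_preserving`, `codons_removed`) are hypothetical. It tests whether each internal exon has a length divisible by three, which is what allows an exon to be skipped without shifting the frame, and counts the codons lost for the deletions reported above.

```python
# Rat Smgc exon coordinates (cDNA nucleotide positions), copied from Table 1.
RAT_EXONS = {
    1: (1, 92), 2: (93, 224), 3: (225, 398), 4: (399, 569), 5: (570, 662),
    6: (663, 686), 7: (687, 860), 8: (861, 923), 9: (924, 1094),
    10: (1095, 1187), 11: (1188, 1277), 12: (1278, 1367), 13: (1368, 1541),
    14: (1542, 1604), 15: (1605, 1727), 16: (1728, 1907), 17: (1908, 2780),
}


def exon_length(span):
    """Length in nucleotides of an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def internal_exons(exons):
    """Exon numbers excluding the first and last exon."""
    first, last = min(exons), max(exons)
    return [n for n in sorted(exons) if first < n < last]


def frame_preserving(exons):
    """Internal exons whose length is a multiple of 3 (skipping keeps the frame)."""
    return [n for n in internal_exons(exons) if exon_length(exons[n]) % 3 == 0]


def codons_removed(exons, skipped):
    """Number of codons lost when the given exons are skipped together."""
    return sum(exon_length(exons[n]) for n in skipped) // 3


if __name__ == "__main__":
    # True when every internal exon can be skipped without a frameshift.
    print(frame_preserving(RAT_EXONS) == internal_exons(RAT_EXONS))
    # Deletions reported for individual rat cDNA clones.
    for skipped in [(6,), (7,), (11, 12), (13,)]:
        print(skipped, codons_removed(RAT_EXONS, skipped))
```

With these coordinates every internal exon is a multiple of three nucleotides long; skipping exon 6 removes 8 codons and skipping exons 11 and 12 together removes 60, which matches the 8–60 amino acid range of deletions described in Section 3.2.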

Fig. 2. Deduced amino acid sequence of rat SMGC. Homologous repeats are underlined in the same color, and vertical lines denote exon boundaries in the Smgc gene. The vertical arrow C-terminal to amino acid 18 indicates the signal peptide cleavage site. Potentially N-glycosylated NXS motifs are highlighted in red.


In the most recent rat chromosome 7 build (NW_044044), rat Smgc is localized to 7q34-7q35. The predicted gene at this locus (XM_235592) also comprises 17 exons, but contains several differences from the Smgc sequence as deduced from cDNA clones. Thirteen exons are shared by the predicted gene and the rat Smgc cDNA clones. Four predicted exons were not found in any cDNA clones in this study. Conversely, exons 11 and 15 were identified in cDNA clones, but not predicted from the genomic sequence, and Smgc exons 2 and 3 are not present in NW_044044.

Table 1
Summary of the rat and mouse Smgc genes

Rat Smgc | | | | | | | Mouse Smgc | | | |
exon | nucleotide | amino acid | del | rpt | | | rpt | del | amino acid | nucleotide | exon
1 | 1–92 | 1–17 | | | > | < | | | 1–17 | 1–94 | 1
2 | 93–224 | 18–61 | | | > | < | | x | 18–61 | 95–226 | 2
3 | 225–398 | 62–119 | | (a) | | | | | 62–122 | 227–409 | 3
4 | 399–569 | 120–176 | | (b) | * | | | | 123–143 | 410–472 | 4
5 | 570–662 | 177–207 | | (c) | | * | (f) | | 144–200 | 473–643 | 5
6 | 663–686 | 208–215 | x | (d) | | | (g) | | 201–230 | 644–733 | 6
7 | 687–860 | 216–273 | x | (a) | | | (h) | | 231–263 | 734–832 | 7
8 | 861–923 | 274–294 | | (e) | | * | (f) | | 264–320 | 833–1003 | 8
9 | 924–1094 | 295–351 | | (b) | * | | (g) | | 321–350 | 1004–1093 | 9
10 | 1095–1187 | 352–382 | | (c) | | | (h) | x | 351–383 | 1094–1192 | 10
11 | 1188–1277 | 383–412 | x | | | * | (f) | x | 384–440 | 1193–1363 | 11
12 | 1278–1367 | 413–442 | x | (d) | | | (g) | x | 441–470 | 1364–1453 | 12
13 | 1368–1541 | 443–500 | x | (a) | | | | x | 471–503 | 1454–1552 | 13
14 | 1542–1604 | 501–521 | | (e) | | | | | 504–532 | 1553–1639 | 14
15 | 1605–1727 | 522–562 | | | | | (h) | | 533–565 | 1640–1738 | 15
 | | | | | | * | (f) | | 566–622 | 1739–1909 | 16
16 | 1728–1907 | 563–622 | | | > | < | | | 623–682 | 1910–2089 | 17
17 | 1908–2780 | 623–674 | | | > | < | | | 683–733 | 2090–2978 | 18
 | stop = 2063 | | | | | | | | | stop = 2242 |

Exons were determined by comparison of cDNA sequences generated in this work with genomic sequences AC108595 (rat) and NT_039621 (mouse) from GenBank (see Sections 3.3 and 3.5). Exons marked with x in the "del" column were deleted in at least one cDNA clone analyzed. Exons denoted by ( ) in the "rpt" column contain repeated sequences; those with the same letter enclosed [e.g. (a)] are homologous. The central columns indicate mouse/rat Smgc homology. Exons marked > < are homologous only to one another (e.g. rat exon 17 with mouse exon 18); those marked * contain a repeat motif present in both rat and mouse.

Fig. 3. Deduced amino acid sequence of mouse SMGC. Homologous repeats within mouse SMGC are underlined in the same color; however, colors do not indicate homology with rat SMGC. Exon boundaries are noted with vertical lines. The vertical arrow C-terminal to amino acid 20 indicates the predicted signal peptide cleavage site. Potentially N-glycosylated NXS motifs are highlighted in red.

3.4. Immunocytochemical localization of rat SMGC

An antiserum was raised against a bacterially expressed GST-SMGC fusion protein. Visualized by SDS-PAGE, SMGC is a single broad band of ~89 kDa (Ball et al., 1988; Fig. 4B). On Western blot analysis, both anti-SMGC and anti-GST-SMGC recognize ~89 kDa SMGC plus multiple additional species in neonatal rat submandibular gland secretion (Ball et al., 1993; Fig. 4C). The smaller SMGC species are in similar proportion in secretion and in homogenate from glands frozen immediately after harvesting (data not shown), suggesting that they represent heterogeneity in secreted SMGC rather than an artifact of in vitro collection of secretion. The reactivity of anti-GST-SMGC and anti-SMGC is also similar when used for immunocytochemistry (Ball et al., 1988; Moreira et al., 1990; Fig. 5). Type I (terminal tubule) cells in 1 day old glands are strongly reactive with anti-GST-SMGC (Fig. 5C), whereas type III (proacinar) cells are unreactive. In the adult female gland, clusters of


intensely reactive cells are present in intercalated ducts, typically near the acinar-intercalated duct junction (Fig. 5E). SMGC reactive cells also are present in the intercalated ducts of adult male glands (Fig. 5G), but they are smaller in size and fewer in number than in female glands. Neither anti-SMGC nor anti-GST-SMGC is reactive with neonatal mouse submandibular gland or secretion (Fig. 4C).

3.5. The mouse Smgc gene

To identify a potential mouse homologue, the rat Smgc nucleotide sequence was compared with GenBank using BLAST. Two EST clones related to Smgc were identified: AK017450, from 10 day old mouse head, and BC022763, from adult female mouse salivary gland. Base pairs 1–230 of AK017450 (1.5 kb total) are 79% similar to rat Smgc exons 1 and 2; the clone also contains three repeat sequences related to rat Smgc exons 4 and 9. BC022763 is similar throughout to exons 16 and 17 of rat Smgc. Portions of both ESTs are present in a predicted 2198 bp gene, XM_128070, encoded on 12 exons at mouse chromosome 15E3 (NT_039621).

Fig. 4. Visualization of mouse Smgc RNA, and rat and mouse SMGC protein. (A) Smgc transcript levels in female and male mouse submandibular gland. Four µg total RNA from glands of mixed sex days 0 and 5 neonates, and from female and male mice of the indicated ages, was hybridized to a labeled mouse Smgc cDNA probe. The detected transcripts are ~3 kb in length. (B) Secretion collected in vitro from 5-day-old mouse and rat submandibular gland, visualized by SDS-PAGE and silver staining. Mouse SMGC (~105 kDa) and rat SMGC (~89 kDa) are indicated by the arrowheads. (C) Detection of rat and mouse SMGC by Western blot analysis. Secretion products as in (B) were transferred to nitrocellulose and incubated with rabbit anti-rat GST-SMGC (left panel), or rabbit anti-mouse SMGC (right panel).


Mouse Smgc cDNA clones were generated by RT-PCR using a 5′ primer corresponding to bp 10–32 of AK017450 and one of three different 3′ primers from BC022763 (3′.2, bp 10–33; 3′.3, bp 567–590; and 3′.4, bp 979–1002). The resulting PCR products were ~1660, 2200 and 2630 bp, respectively. Six representative clones, one each from primer 3′.2 and 3′.3, and four from primer 3′.4, were isolated and sequenced. The complete mouse Smgc cDNA sequence is available in GenBank under accession number AY459348. Nucleotides 10–2648 were determined from cDNA clones generated in this study. Mouse Smgc nucleotides 1–9 were taken from genomic sequence NT_039621, and nucleotides 2649–2978, encoding the final 330 bp of the Smgc 3′ untranslated region, were compiled from NT_039621 and multiple EST sequences present in GenBank (e.g. BC022763, BC049243, BC038603). Mouse Smgc is likely to be encoded by 18 exons. Fifteen exons were defined by comparison of the cDNA and genomic sequences. Twelve exons are identified in predicted gene XM_128070, and three additional exons, 2, 11, and 12, are present in NT_039621 but not predicted to be transcribed. Genomic sequences encoding mouse Smgc bp 734–1093 (identical to mouse Smgc bp 1094–1453) were apparently deleted in the assembly of NT_039621. Because repeat domains in rat Smgc and elsewhere in mouse Smgc exactly correspond to exon boundaries, mouse Smgc bp 734–1093 were assigned to three exons identical to those encoding bp 1094–1453 (Table 1).

Fig. 5. Localization of SMGC protein in rat and mouse submandibular glands during development. Light micrographs of immunogold silver stained 1 µm LR Gold sections of mouse (A, B, D, F) and rat (C, E, G) submandibular glands labeled with anti-mouse SMGC or anti-rat GST-SMGC antisera. (A and B) Newborn mouse. Secretory granules of terminal tubule (type I) cells are intensely labeled. Clusters of type III (proacinar) cells (arrows) and ducts (D) are unreactive. Scale bars: A = 20 µm; B = 10 µm. (C) One-day-old rat. The secretory granules and cytoplasm of type I cells show strong reactivity. No labeling is seen in type III cells (arrows) or ducts (D). Scale bar = 20 µm. (D) Four-month-old female mouse. Occasional SMGC reactive cells (arrows) are present near the intercalated duct-acinar junction. Acinar cells and granular duct (GD) cells are unlabeled. Scale bar = 20 µm. (E) Two-month-old female rat. Clusters of intensely reactive cells are present in intercalated ducts, especially near their junction with the acini. Acinar and granular duct (GD) cells are unlabeled. Scale bar = 20 µm. (F) Four-month-old male mouse. No labeled cells are present. Granular ducts (GD). Scale bar = 20 µm. (G) Two-month-old male rat. Occasional SMGC reactive cells (arrows) are present in the intercalated ducts. Acinar, granular duct (GD) and striated duct (SD) cells are unlabeled. Scale bar = 20 µm.

3.6. Similarity between rat and mouse Smgc

The full-length mouse Smgc sequence encodes a 733 amino acid protein with a predicted molecular mass of 74.4 kDa (Fig. 3). The mouse protein is similar to rat SMGC in amino acid composition (19% Ser, 12% Gly, 6% Thr) and overall architecture. Like rat SMGC, approximately 75% of mouse SMGC consists of repeat motifs: three motifs of 57, 30 or 33 amino acids, repeated 3 or 4 times. Alternate splicing also appears to be a mechanism for generating a diverse pool of mouse Smgc transcripts. Deletions of exon 13, exons 2 and 13, and exons 10–13 were observed among the six cDNA clones sequenced. The first and last two exons of rat and mouse Smgc, encoding the 5′ and 3′ untranslated regions, signal peptides, and the secreted proteins outside of the repeat domains, are closely conserved. Within the repeat domains, homology has been maintained in only one repeat motif, present in rat exons 4 and 9, and in mouse exons 5, 8, 11, and 16 (Table 1). The sequences also are divergent at the 5′ end of exon 2, which encodes the signal peptide cleavage site and N-termini of the secreted proteins. The predicted signal peptide cleavage site of mouse SMGC is between amino acids 20 and 21 (SignalP v1.2, http://www.cbs.dtu; Nielsen et al., 1997).
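As a rough illustration of how the exon deletions seen in the mouse clones would shorten the encoded protein, the following sketch sums the amino acid spans assigned to each mouse exon in Table 1. It is arithmetic only, under the assumption that the remaining exons are joined in frame; the function name `encoded_length` and the variant labels are hypothetical, and the code makes no claim about which isoforms are actually secreted (exon 2, for example, encodes the signal peptide cleavage site).

```python
# Amino acid span assigned to each mouse Smgc exon, copied from Table 1.
MOUSE_EXON_AA = {
    1: (1, 17), 2: (18, 61), 3: (62, 122), 4: (123, 143), 5: (144, 200),
    6: (201, 230), 7: (231, 263), 8: (264, 320), 9: (321, 350),
    10: (351, 383), 11: (384, 440), 12: (441, 470), 13: (471, 503),
    14: (504, 532), 15: (533, 565), 16: (566, 622), 17: (623, 682),
    18: (683, 733),
}

# Exon deletions observed among the six sequenced mouse cDNA clones (Section 3.6).
VARIANTS = {
    "full length": (),
    "minus exon 13": (13,),
    "minus exons 2 and 13": (2, 13),
    "minus exons 10-13": (10, 11, 12, 13),
}


def encoded_length(exon_aa, skipped=()):
    """Total residues encoded by all exons except those listed in `skipped`."""
    skipped = set(skipped)
    return sum(
        end - start + 1
        for exon, (start, end) in exon_aa.items()
        if exon not in skipped
    )


if __name__ == "__main__":
    for label, skipped in VARIANTS.items():
        print(label, encoded_length(MOUSE_EXON_AA, skipped))
```

Under these assumptions the full-length open reading frame encodes 733 residues, and the three observed deletion patterns would encode 700, 656 and 580 residues, respectively, before signal peptide removal.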


3.7. Expression pattern of mouse Smgc Smgc transcript levels in male and female mouse submandibular gland were assessed by developmental Northern blot (Fig. 4A). Mouse Smgc transcripts of f 3 kb are present at high levels in neonates at days 0 and 5, and decrease markedly after day 20 in both sexes. In the female gland Smgc transcript levels are similar from day 26 through 6 months, whereas in the male gland, expression decreases further by day 32, and is undetectable by this method at 6 months. A rabbit antiserum was prepared against a His6-mouse SMGC fusion protein. By SDS-PAGE, mouse SMGC appears as a broad band at f 105 kDa (Fig. 4B), although Western blot analysis also reveals a series of immunoreactive bands from f 105 to 35 kDa (Fig. 4C). The antiserum is minimally reactive with rat SMGC on Western blot analysis (Fig. 4C) and immunocytochemistry (not shown). The cell specificity of mouse Smgc expression was determined by immunocytochemistry. As in rat, terminal tubule (type I) cells of the newborn mouse submandibular gland are intensely reactive with anti-SMGC antiserum, while the type III (proacinar) cells are unreactive (Fig. 5A,B). SMGC reactive cells were observed in intercalated ducts of adult female glands, usually at their junction with the acini (Fig. 5D). The number of intercalated duct cells expressing Smgc was substantially less than

Fig. 6. PNGase F treatment of rat and mouse submandibular gland secretion. Secretion from 5-day-old rat and mouse submandibular gland was digested with PNGase F, fractionated by SDS-PAGE on 9% gels and visualized by Western blot. Left panel: Rat secretion untreated (secretion), denatured and incubated in PNGase F buffer without enzyme (− PNGase F), and PNGase F treated, detected with anti-rat GST-SMGC. Right panel: Untreated (secretion), sham digested (− PNGase F) and PNGase F treated (+ PNGase F) mouse secretion, detected with anti-mouse SMGC. Bands showing increased mobility after PNGase F treatment are marked with arrowheads.

in the adult female rat, however. No labeled cells were observed in the glands of adult male mice.

3.8. Rat and mouse SMGC are N-glycosylated

One possible source for the discrepancy between the deduced and observed molecular masses of rat and mouse SMGC is N-glycosylation. Both SMGC sequences contain multiple NXS/T motifs. Analysis of these sequences with the NetNGlyc program (http://www.cbs.dtu.dk) indicates that seven motifs in mouse SMGC are likely N-glycosylation sites (Fig. 3). In rat SMGC, two sites (N 164 and 339) were predicted as very likely, and two as likely, to be N-glycosylated (Fig. 2).

Secretion product from 5-day-old rat and mouse submandibular gland was digested with PNGase F, and SMGC in treated and untreated samples was visualized by Western blot analysis (Fig. 6). Mouse SMGC bands in the ~100, ~80, ~45 and ~35 kDa range all demonstrate increased mobility after PNGase F treatment. A more subtle change is also visible in the digested ~90 and ~80 kDa rat SMGC bands, but not in the lower molecular mass forms. Multiple SMGC bands remained in PNGase F digested samples from both rat and mouse, indicating that N-glycosylation contributes to, but is not a major cause of, the diverse SMGC forms in secretion.
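Locating NXS/T motifs in a protein sequence, the first step before any site prediction, can be sketched with a regular expression. This is only a motif scan and does not reproduce the NetNGlyc predictions, which rank sites by likelihood. The exclusion of proline at the X position follows the common definition of the sequon and is an assumption here, since the text writes only "NXS/T". The function name `find_sequons` and the toy peptide are hypothetical; the peptide is not the SMGC sequence.

```python
import re

# N-X-S/T sequon, with X taken as any residue except proline (assumed
# convention). The zero-width lookahead allows overlapping sequons,
# such as those in "NNTS", to each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide for illustration only, NOT the SMGC sequence.
toy = "MKNGSAANPTQNNTSL"
positions = find_sequons(toy)  # [3, 12, 13]; N8 is skipped because X is Pro
```

A scan of this kind only lists candidate sites. Whether a given Asn is actually glycosylated has to be tested experimentally, as with the PNGase F digestion shown in Fig. 6.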

4. Discussion

SMGC is the only known marker specific to type I (terminal tubule) cells of the neonatal rat submandibular gland. This study demonstrates that SMGC is also a marker for mouse terminal tubule cells. Smgc is highly expressed in the type I cells, but is undetectable in the type III (proacinar) cells during submandibular gland development. After the developmentally programmed loss of type I cells, SMGC is restricted to a subset of intercalated duct cells in adult female rats and mice, and in adult male rats. No SMGC was detected in adult male mice, consistent with the previous observation that their submandibular gland intercalated duct cells lack secretory granules (Gresik and MacRae, 1975).

Western blot analysis demonstrates multiple SMGC species from ~100 to ~30 kDa. The identification of cDNA clones containing small deletions exactly corresponding to exon boundaries indicates that some SMGC heterogeneity results from alternate splicing. Since splicing between any of rat exons 1–17 or mouse exons 1–18 maintains the SMGC open reading frame, however, no splice variants are expected to generate truncated proteins from alternate stop codons. Smgc transcripts smaller than the major band(s) at ~3 kb were not observed by Northern blot. Therefore, the smaller SMGC forms must arise largely by proteolysis, common in salivary proteins.

Another source of SMGC heterogeneity is likely to be post-translational modification. Fig. 6 demonstrates that both

rat and mouse SMGC are N-glycosylated. The large number of Ser and Gly residues suggests that other post-translational modifications of submandibular gland secretory proteins, including O-glycosylation, phosphorylation, or glycosaminoglycan addition also may be present on SMGC. For example, the NetOGlyc prediction program (http://www.cbs.dtu.dk) identifies 16 Ser and Thr residues in rat SMGC, and in mouse SMGC that are potential sites of mucin-type O-glycosylation. The consensus sequence of the Golgi protein kinase responsible for phosphorylating salivary proteins is Ser-Xaa-Glu/SerP (Lasa-Benito et al., 1996). Full-length rat SMGC contains eight, and mouse SMGC contains six Ser-Xaa-Glu sequences. Finally, mouse and rat SMGC contain multiple Ser-Gly pairs, some with acidic amino acids nearby, suggesting the additional possibility of glycosaminoglycan attachment (Esko and Zhang, 1996).

Mucins are polydisperse heavily glycosylated protective molecules secreted by epithelia. Although divergent in primary structure, their shared features include a repeat domain devoid of organized secondary structure, enriched in Ser and Thr, and containing multiple O-glycosylation sites (Dekker et al., 2002). Although SMGC is polydisperse, contains repeat domains enriched in Ser and Thr, and is likely to be O-glycosylated, its relatively small size, predominance of N-linked oligosaccharides and proportion of total mass due to protein indicate that it is not a mucin.

The chromosomal localization of rat and mouse Smgc is immediately upstream of the gene encoding the high molecular weight mucin MUC19. Human, rat and mouse MUC19 are homologous to one another and to pig submaxillary mucin. Muc19 is expressed in mouse sublingual and submandibular glands, and MUC19 is expressed in human sublingual, submandibular and tracheal glands (Chen et al., 2004; Fallon et al., 2003). The 5′ ends of rat and mouse Muc19, encoding their signal peptides and N-termini, have not yet been identified.
Human MUC19 likely comprises what is now annotated as two genes, XM_292125, “similar to submaxillary mucin,” and XM_352899, “MUC19” (Chen et al., 2004). The predicted signal peptide of the combined gene is MKLILWYLVVALWCFFKGGFS. This peptide, when compared against GenBank using BLAST, is most similar (~70%) to the signal peptides of pig submaxillary mucin, and rat and mouse SMGC. Mouse Smgc and Muc19 have divergent expression patterns and are clearly distinct genes. The homology between SMGC and MUC19 signal peptides, and protein features common to SMGC and apomucins suggest that in rats and mice, Smgc and Muc19 evolved from a single ancestral mucin gene.

Smgc is expressed during submandibular gland development at the time when acinar cells are immature and levels of adult acinar cell products such as the low-molecular weight mucin, MUC10 (Denny et al., 1988; Moreira et al., 1991) and presumably MUC19, are low. SMGC may perform mucin-like functions such as lubrication and/or bacterial binding in the neonatal oral cavity.
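The signal peptide comparison discussed above rests on a similarity score between short peptides. As a rough illustration of how such a percentage arises, the sketch below computes position-by-position identity between two ungapped peptides of equal length. This is a simplification and not the BLAST procedure used in the study, which scores local alignments. The function name `percent_identity` is an assumption, and the comparison peptide is a hypothetical sequence made by altering six positions of the MUC19 peptide; it is not the rat or mouse SMGC signal peptide.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two equal-length, ungapped peptides."""
    if not a or len(a) != len(b):
        raise ValueError("peptides must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Predicted signal peptide of the combined human MUC19 gene, as given in the text.
MUC19_SIGNAL = "MKLILWYLVVALWCFFKGGFS"

# Hypothetical comparison peptide (six substitutions introduced by hand);
# NOT the actual SMGC or pig submaxillary mucin signal peptide.
HYPOTHETICAL = "MKLVLWCLLVALWSFFRGAFS"

identity = percent_identity(MUC19_SIGNAL, HYPOTHETICAL)  # 15 of 21 positions match
```

With 15 of 21 positions identical, the hypothetical pair scores about 71%, which shows the scale of difference that a value near 70% implies for a 21-residue peptide.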

Acknowledgements

This work was supported by NIH grant DE09428. Thanks to Howard Lien for technical assistance on the initial part of this project, and to Matthew Williamson for performing the N-terminal amino acid sequencing of rat SMGC. We gratefully acknowledge Hitomi Asahara, Brian McCarthy and Millicent Wilson of the U.C. Berkeley DNA Sequencing Facility for performing the DNA sequencing reported herein.

References

Albone, E.F., Hagen, F.K., Szpirer, C., Tabak, L.A., 1996. Molecular cloning and characterization of the gene encoding rat submandibular gland apomucin. Mucsmg. Glycoconj. J. 13, 709–716.
Ball, W.D., Hand, A.R., Moreira, J.E., Johnson, A.O., 1988. A secretory protein restricted to type I cells in neonatal rat submandibular glands. Dev. Biol. 129, 464–475.
Ball, W.D., Hand, A.R., Moreira, J.E., Iversen, J.M., Robinovitch, M.R., 1993. The B1-immunoreactive proteins of the perinatal submandibular gland: similarity to the major parotid gland protein, RPSP. Crit. Rev. Oral Biol. Med. 4, 517–524.
Bekhor, I., Wen, Y., Shi, S., Hsieh, C.H., Denny, P.A., Denny, P.C., 1994. cDNA cloning, sequencing and in situ localization of a transcript specific to both sublingual demilune cells and parotid intercalated duct cells in mouse salivary glands. Arch. Oral Biol. 39, 1011–1022.
Chen, Y., Zhao, Y.H., Kalaslavadi, T.B., Hamati, E., Nehrke, K., Le, A.D., Ann, D.K., Wu, R., 2004. Genome-wide search and identification of a novel gel-forming mucin MUC19/Muc19 in glandular tissues. Am. J. Respir. Cell Mol. Biol. 30, 155–165.
Cutler, L.S., Chaudhry, A.P., 1974. Cytodifferentiation of the acinar cells of the rat submandibular gland. Dev. Biol. 41, 31–41.
Dekker, J., Rossen, J.W., Buller, H.A., Einerhand, A.W., 2002. The MUC family: an obituary. Trends Biochem. Sci. 27, 126–131.
Denny, P.A., Pimprapaiporn, W., Kim, M.S., Denny, P.C., 1988. Quantitation and localization of acinar cell-specific mucin in submandibular glands of mice during postnatal development. Cell Tissue Res. 251, 381–386.
Denny, P.C., Mirels, L., Denny, P.A., 1996. Mouse submandibular gland salivary apomucin contains repeated N-glycosylation sites. Glycobiology 6, 43–50.
Denny, P.C., Ball, W.D., Redman, R.S., 1997. Salivary glands: a paradigm for diversity of gland development. Crit. Rev. Oral Biol. Med. 8, 51–75.
Esko, J.D., Zhang, L., 1996. Influence of core protein sequence on glycosaminoglycan assembly. Curr. Opin. Struct. Biol. 6, 663–670.
Fallon, M.A., Latchney, L.R., Hand, A.R., Johar, A., Denny, P.A., Georgel, P.T., Denny, P.C., Culp, D.J., 2003. The sld mutation is specific for sublingual salivary mucous cells and disrupts apomucin gene expression. Physiol. Genomics 14, 95–106.
Girard, L.R., Castle, A.M., Hand, A.R., Castle, J.D., Mirels, L., 1993. Characterization of common salivary protein 1, a product of rat submandibular, sublingual, and parotid glands. J. Biol. Chem. 268, 26592–26601.
Gresik, E.W., MacRae, E.K., 1975. The postnatal development of the sexually dimorphic duct system and of amylase activity in the submandibular glands of mice. Cell Tissue Res. 157, 411–422.
Hayashi, H., Ozono, S., Watanabe, K., Nagatsu, I., Onozuka, M., 2000. Morphological aspects of the postnatal development of submandibular glands in male rats: involvement of apoptosis. J. Histochem. Cytochem. 48, 695–698.
Hecht, R., Connelly, M., Marchetti, L., Ball, W.D., Hand, A.R., 2000. Cell death during development of intercalated ducts in the rat submandibular gland. Anat. Rec. 258, 349–358.
Lasa-Benito, M., Marin, O., Meggio, F., Pinna, L.A., 1996. Golgi apparatus mammary gland casein kinase: monitoring by a specific peptide substrate and definition of specificity determinants. FEBS Lett. 382, 149–152.
Mirels, L., Bedi, G.S., Dickinson, D.P., Gross, K.W., Tabak, L.A., 1987. Molecular characterization of glutamic acid/glutamine-rich secretory proteins from rat submandibular glands. J. Biol. Chem. 262, 7289–7297.
Mirels, L., Hand, A.R., Branin, H.J., 1998a. Expression of gross cystic disease fluid protein-15/Prolactin-inducible protein in rat salivary glands. J. Histochem. Cytochem. 46, 1061–1071.
Mirels, L., Miranda, A.J., Ball, W.D., 1998b. Characterization of the rat salivary-gland B1-immunoreactive proteins. Biochem. J. 330, 437–444.
Moreira, J.E., Hand, A.R., Ball, W.D., 1990. Localization of neonatal secretory proteins in different cell types of the rat submandibular gland from embryogenesis to adulthood. Dev. Biol. 139, 370–382.
Moreira, J.E., Ball, W.D., Mirels, L., Hand, A.R., 1991. Accumulation and localization of two adult acinar cell secretory proteins during development of the rat submandibular gland. Am. J. Anat. 191, 167–184.
Myal, Y., Iwasiow, B., Yarmill, A., Harrison, E., Paterson, J.A., Shiu, R.P., 1994. Tissue-specific androgen-inhibited gene expression of a submaxillary gland protein, a rodent homolog of the human prolactin-inducible protein/GCDFP-15 gene. Endocrinology 135, 1605–1610.
Nielsen, H., Engelbrecht, J., Brunak, S., von Heijne, G., 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1–6.
Poulsen, K., Jakobsen, B.K., Mikkelsen, B.M., Harmark, K., Nielsen, J.T., Hjorth, J.P., 1986. Coordination of murine parotid secretory protein and salivary amylase expression. EMBO J. 5, 1891–1896.
Rosinski-Chupin, I., Rougeon, F., 1990. A new member of the glutamine-rich protein gene family is characterized by the absence of internal repeats and the androgen control of its expression in the submandibular gland of rats. J. Biol. Chem. 265, 10709–10713.
Shaw, P., Sordat, B., Schibler, U., 1986. Developmental coordination of alpha-amylase and psp gene expression during mouse parotid gland differentiation is controlled posttranscriptionally. Cell 47, 107–112.
Yamashina, S., Barka, T., 1972. Localization of peroxidase activity in the developing submandibular gland of normal and isoproterenol-treated rats. J. Histochem. Cytochem. 20, 855–872.