hUNC93B1: a novel human gene representing a new gene family and encoding an unc-93-like protein


Gene 283 (2002) 209–217 www.elsevier.com/locate/gene

Vladimir I. Kashuba a,b,c,1,*, Alexei I. Protopopov a,b,d,1, Sergei M. Kvasha a,c, Rinat Z. Gizatullin a, Claes Wahlestedt a, Lev L. Kisselev e, George Klein b, Eugene R. Zabarovsky a,b,e

a Center for Genomics Research, Karolinska Institute, Stockholm 171 77, Sweden
b Microbiology and Tumor Biology Center, Karolinska Institute, Box 280, Stockholm 171 77, Sweden
c Institute of Molecular Biology and Genetics, Ukrainian Academy of Sciences, Kiev 252627, Ukraine
d Institute of Cytology and Genetics, Russian Academy of Sciences, Novosibirsk 630090, Russia
e Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 117984, Russia

Received 20 June 2001; received in revised form 5 November 2001; accepted 22 November 2001 Received by M. D’Urso

Abstract

We have identified a novel human gene, UNC93B1, encoding a protein related to unc-93 of Caenorhabditis elegans. The combined sequence derived from several cDNA clones is 2282 bp, and comparison with the genomic sequence shows that the gene contains 11 exons. The longest open reading frame encodes a deduced sequence of 597 amino acids. Homology analysis shows that the hUNC93B1 gene is highly conserved and related to sequences in Arabidopsis thaliana, C. elegans, Drosophila melanogaster, chicken and mouse. Structural analysis of the deduced amino acid sequence of hUNC93B1 points to the possible existence of multiple membrane-spanning domains. The hUNC93B1 protein also displays some similarities to the bacterial ABC-2 type transporter signature and to ion transporters of Deinococcus radiodurans and Helicobacter pylori. As revealed by Northern analysis, the level of expression varies significantly between tissues, with the highest level detected in the heart. The gene was mapped to chromosomal band 11q13 by fluorescence in situ hybridization. We suggest that this gene is a member of a novel hUNC93B-related gene family. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: NotI-linking clone; Molecular cloning; Gene mapping; Conserved gene probes; Multigene family; Transporter protein

1. Introduction

An approach combining physical and gene mapping strategies to characterize large regions of mammalian chromosomes has been proposed (Allikmets et al., 1994). In this strategy, NotI linking/jumping clones are used as framework markers. Advanced procedures for the construction of jumping and linking libraries have been developed, and a number of human chromosome 3-specific and total NotI linking libraries have been prepared (Zabarovsky et al., 1994, 2000). We have partially sequenced more than 1000 NotI linking clones isolated from human chromosome 3-specific libraries, in a search for a tumor suppressor gene(s) located on chromosome 3p. These isolates constitute 152 unique chromosome 3-specific NotI clones (Kashuba et al., 1999). A search of the EMBL nucleotide sequence database with these sequences revealed 90–100% similarities to more than 100 different genes or expressed sequence tags (ESTs). Many of these homologies were used to map novel genes to chromosome 3. Seventeen of the NotI linking clones (11.2%) are still absent from public databases (May 2001).

One of the DNAs isolated from the chromosome 3-specific NotI linking library (NL1-304) displayed similarity to the 5′ sequence of the chicken max gene (GenBank accession no. L12469) (Sollenberger et al., 1994). However, further analysis of this cloned DNA revealed that this similarity stems from the presence of a new gene upstream from the chicken max gene, and resulted in the identification of a novel human gene (UNC93B1), representing a new human gene family.

Abbreviations: BAC, bacterial artificial chromosome; EST, expressed sequence tag; FISH, fluorescence in situ hybridization; ORF, open reading frame; PCR, polymerase chain reaction; RACE, rapid amplification of cDNA ends
* Corresponding author (address b). Tel.: +46-8-728-6737; fax: +46-8-319470. E-mail address: [email protected] (V.I. Kashuba).
1 These authors contributed equally to this work.

0378-1119/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0378-1119(01)00856-3


V.I. Kashuba et al. / Gene 283 (2002) 209–217

2. Materials and methods

NotI linking clones were isolated from chromosome 3-specific NotI linking libraries described earlier (Zabarovsky et al., 1994). Growth of λ phages and plasmids, DNA isolation and other general molecular biology and microbiology methods followed standard procedures (Sambrook et al., 1989).

The insert from the NL1-304 clone was used to screen and isolate cDNA clones from a heart cDNA library (Stratagene, La Jolla, CA, USA). After screening of 1 × 10⁶ cDNA clones in λ ZAP II, six hybridizing DNAs were isolated and sequenced. One of them (clone no. 3) contained the largest insert and open reading frame (597 amino acids). A further search against the EST and Unigene databases revealed that the 5′ end of this clone coincided with the other available cDNAs. Marathon-Ready cDNA from skeletal muscle (Clontech, Palo Alto, CA, USA) was used to isolate the 3′ end of the gene. The 3′-RACE PCR was done according to the manufacturer's instructions with the PCR primers RACE-1: 5′-GACTGGACTCAGCACACTCCTGGGAATC-3′ and RACE-2: 5′-CGGTTACCGCTACTTGGAGGAGGACAAC-3′. Hybridization with the MTN Northern filter (Clontech) was done according to the manufacturer's protocols. Sequencing was performed using an ABI310 sequencer (Perkin Elmer, Foster City, CA, USA) according to the manufacturer's instructions. Sequence data discussed in this article have been deposited under accession nos. AJ271326 (hUNC93B1) and AJ271977 (cUnc93b) in the GenBank Data Library.

DNA homology searches were performed using the BLASTX and BLASTN programs (Altschul et al., 1990; Gish and States, 1993) at the NCBI server: http://www.ncbi.nlm.nih.gov:80/BLAST. Sequence assembly was done using DNASIS (Hitachi–Pharmacia). The BEAUTY Post-Processor was used with the BLASTP protein database searches provided by the Human Genome Sequencing Center (Houston, TX): http://dot.imgen.bcm.tmc.edu:9331.
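As a quick sanity check on the two 3′-RACE primers listed above, the following minimal sketch reports the length, GC fraction and reverse complement of each oligonucleotide. Only the two primer sequences are taken from the text; the helper names `gc_fraction` and `reverse_complement` are illustrative assumptions and not part of the original protocol.

```python
# Illustrative helpers; the function names are assumptions, not from the paper.
# Only the two primer sequences below are taken from the Methods text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMERS = {
    "RACE-1": "GACTGGACTCAGCACACTCCTGGGAATC",
    "RACE-2": "CGGTTACCGCTACTTGGAGGAGGACAAC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(
            f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}, "
            f"reverse complement = {reverse_complement(seq)}"
        )
```

The same helpers can be applied to any other oligonucleotide written 5′ to 3′, for example the FISH probe primers given later in this section.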
Scanning of the PROSITE and PfamA protein families and domains was performed at the server of the Swiss Institute for Experimental Cancer Research: http://www.isrec.isb-sib.ch/software/PFSCAN_form.html. Multiple sequence alignment was done with the ClustalW program: http://www.clustalw.genome.ad.jp. The prediction of possible transmembrane regions and their orientation (TMpred prediction) was provided by the ISREC server: www.ch.embnet.org. The algorithm of the TMpred program is based on the statistical analysis of TMbase, a database of naturally occurring transmembrane proteins. The prediction was made using a combination of several weight matrices for scoring (Hofmann and Stoffel, 1993).

The standard procedure of FISH analysis with metaphase chromosomes was performed as described previously (Protopopov et al., 1996). About 60 metaphases were analyzed for each probe. The genomic probe, 4.8 kb in size, containing hUNC93B1 exons 3–6 (introns 3–5), was obtained by PCR using the following primer set: U1: 5′-AGACCTACCGCGAGGTGAAGTATG-3′; U2: 5′-TACAGCGTGTGGTTCAGGTC-3′.
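The TMpred step mentioned in this section scores a sequence with weight matrices derived from TMbase. Those matrices are not reproduced here, so the sketch below uses a different and much simpler technique as a stand-in: a Kyte-Doolittle sliding-window hydropathy average with the commonly used 19-residue window and 1.6 cutoff. It is not the TMpred algorithm and will not reproduce its scores. The function names and the toy sequence are hypothetical; the toy sequence is not the hUNC93B1 protein.

```python
# Stand-in for TMpred: Kyte-Doolittle hydropathy averaged over a sliding window.
# This is NOT the TMpred weight-matrix method; it only illustrates the general
# idea of flagging hydrophobic stretches as candidate membrane-spanning segments.

KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full window; raises KeyError on non-standard residues."""
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(total / window)
    return profile


def candidate_segments(seq: str, window: int = 19, threshold: float = 1.6) -> list:
    """Merge overlapping windows whose mean is >= threshold into 1-based inclusive spans."""
    segments = []
    for start, mean in enumerate(hydropathy_profile(seq, window)):
        if mean < threshold:
            continue
        first, last = start + 1, start + window
        if segments and first <= segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], max(segments[-1][1], last))
        else:
            segments.append((first, last))
    return segments


# Hypothetical toy sequence: hydrophilic flanks around a 20-residue hydrophobic core.
TOY = "DEKRDEKRDE" + "LIVALLIVALLIVALLIVAL" + "DEKRDEKRDE"

if __name__ == "__main__":
    print(candidate_segments(TOY))
```

Because windows overlap, the reported spans are wider than the hydrophobic core itself; a real analysis would refine the boundaries and, as TMpred does, also consider helix orientation.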

3. Results and discussion

3.1. Isolation, expression analysis and chromosomal mapping of the human UNC93B1 gene

The NotI linking clone NL1-304 (D3S4632, GenBank accession nos. AJ272058, AJ272059) showed similarity (70% over 184 base pairs) to upstream genomic sequences containing the chicken max gene (GenBank accession no. L12469). Furthermore, we found that this linking clone maps to chromosome 3p13→p12 and displays 97% identity over 40 base pairs (bp) to a human EST clone (GenBank accession no. AA632247). Using a combination of methods (cDNA library screening, RACE-PCR and in silico cloning; see Section 2), we identified 2282 bp of the cDNA sequence. This combined sequence encodes a deduced amino acid sequence of 597 amino acids (aa) (Fig. 1). The calculated molecular mass of this longest predicted polypeptide is 66.6 kDa.

After analysis of seven 5′ EST clones available in public databases, we found that in EST clones AA632247 and AW844512 the structure of the mRNA is changed as a result of alternative or incomplete splicing (Fig. 2D). The intron located between exons 4 and 5 is present in these clones, resulting in the creation of a termination codon (TGA) at amino acid position 186. Additional experiments are necessary to clarify whether this is merely a coincidental result of incomplete splicing or whether this alternative splicing plays a role in the regulation of gene activity.

BLASTX comparison using the 597 aa sequence of this novel putative protein revealed significant similarities to the Caenorhabditis elegans unc-93 protein (21% identity over 487 aa, expected E = 1e–19; GenBank accession nos. Z81449, X64415). Based on this similarity we propose to designate this novel human gene UNC93B1.

We undertook expression analysis of the hUNC93B1 transcript using the cDNA clone AA632247 to probe an MTN Northern filter containing mRNAs from different human tissues. One transcript of approximately 2.4 kb was expressed (Fig. 3) in all tissues tested so far, although the expression level varied considerably. The highest expression was observed in the heart and the lowest in placenta. As the C. elegans unc-93 protein is either a component of an ion transport system involved in excitation–contraction coupling in muscle, or functions in the coordination of muscle contraction between muscle cells (Levin and Horvitz, 1992), this raises the possibility that hUNC93B1 might be important for heart function, and even implicated in heart diseases. On the other hand, expression of hUNC93B1 was also extremely high in brain and kidney. This suggested that hUNC93B1 might possess other functions not connected with muscle contraction.

Using FISH, the EST clone AA632247 was mapped to chromosomal band 11q13 (Fig. 4A). The 11q13 locus is associated with many diseases (Hou et al., 1996; Katsanis et al., 1999; Lebo et al., 1990), and some of them are related to muscle function. An example is spinal muscular atrophy associated with respiratory distress (SMARD1) (Grohmann et al., 1999). It is possible that hUNC93B1 may be involved in one or several of these disorders.
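The figures reported in this section can be cross-checked with simple arithmetic. The sketch below takes the translated region from the Fig. 1 legend (42–1832 bp) and the exon coordinates from the Fig. 2 legend (given relative to AC004923) and compares them with the reported 597 aa, 66.6 kDa and 2282 bp. It assumes that all coordinate ranges are 1-based and inclusive; the text does not say whether the translated interval includes the stop codon, so this is a rough consistency check rather than a re-derivation. The variable and function names are illustrative.

```python
# Rough consistency check on numbers quoted in the text and figure legends.
# Assumption: all coordinate ranges are 1-based and inclusive.

ORF_START, ORF_END = 42, 1832     # translated region of the cDNA (Fig. 1 legend)
REPORTED_AA = 597                 # deduced protein length
REPORTED_MASS_KDA = 66.6          # calculated molecular mass
REPORTED_CDNA_BP = 2282           # combined cDNA length

# Exon borders from the Fig. 2 legend (positions in AC004923).
EXONS = [
    (26116, 26252), (26402, 26543), (27025, 27178), (30520, 30681),
    (30895, 31027), (31748, 31841), (32401, 32524), (33413, 33597),
    (34315, 34588), (36405, 36523), (38342, 39100),
]


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


orf_nt = span_length(ORF_START, ORF_END)
codons, remainder = divmod(orf_nt, 3)
mean_residue_mass_da = REPORTED_MASS_KDA * 1000 / REPORTED_AA
exon_lengths = [span_length(s, e) for s, e in EXONS]
exon_total = sum(exon_lengths)

if __name__ == "__main__":
    print(f"ORF interval: {orf_nt} nt = {codons} codons (remainder {remainder})")
    print(f"Mean residue mass implied by the reported mass: {mean_residue_mass_da:.1f} Da")
    print(f"Exon lengths: {exon_lengths}")
    print(f"Sum of exon lengths: {exon_total} bp (reported cDNA: {REPORTED_CDNA_BP} bp)")
```

Under these conventions the translated interval spans 1791 nt, that is 597 codons, and the eleven exon lengths sum to 2283 bp, within 1 bp of the reported 2282 bp cDNA.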

3.2. Sequences homologous to the UNC93B1 in human genome

A search with the hUNC93B1 nucleotide sequence in the EMBL and EST databases resulted in the identification of three groups of 95–100% identical human sequences:

1. NotI linking clones: (a) NL1-304, isolated from a chromosome 3-specific library (Zabarovsky et al., 1994), showed 95% identity over 376 bp (exon 11, see below); (b) NR5-KE20 (GenBank accession nos. AJ272060, AJ272061), 97.5% identity over 466 bp (exon 11).
2. BAC and PAC clones: (a) RP11-138N3 (GenBank accession no. AC034259), mapped to chromosome 11 (99–100% identity, exons 1 to 7 and exons 10, 11), and four other nearly identical cloned DNAs that possessed the same features (GenBank accession no. AP002807, etc.); (b) RP11-413E6 (GenBank accession no. AC012661), mapped to chromosome 18 (96% identity over 275 bp, 99% over 120 bp and 95% over 763 bp, exons 9, 10, 11, respectively); (c) CTD-2026G6 (GenBank accession no. AC067827), mapped to chromosome 3 (96% identity over 275 bp, 99% over 120 bp and 95% over 764 bp, exons 9, 10, 11, respectively); (d) RP11-747H12 (GenBank accession no. AC073648), mapped to chromosome 7 (93% identity over 275 bp, 100% over 120 bp and 95% over 758 bp, exons 9, 10, 11, respectively), and two other nearly identical clones (GenBank accession nos. AC079804, AC079882); (e) RP5-901A4 (GenBank accession no. AC004923), 99–100% identity to the whole hUNC93B1 sequence, chromosomal location unknown; (f) RP11-324I10 (GenBank accession no. AC011744), mapped to chromosome 4 (93% identity over 275 bp, 97% over 120 bp and 96% over 181 bp, exons 9, 10, 11, respectively), and another nearly identical cloned DNA (GenBank accession no. AC007310).
3. More than 100 unmapped ESTs (94–95% identity).

Fig. 1. Structure of the hUNC93B1 gene. The mRNA and amino acid sequences of hUNC93B1, translation region 42–1832 bp. The stop codon is designated by a star (*). The polyadenylation signal is underlined. Transmembrane helices strongly predicted by TMpred are shown by arrows indicating their orientation (inside–outside or outside–inside). The region displaying similarity to the bacterial ABC-2 type transporter signature is in bold. Exon borders are also indicated.

Fig. 2. Exon–intron organization of the hUNC93B1 gene. (A) Relationship between the hUNC93B1 gene and genomic variants highly similar to the 3′ part of this gene. The exact positions of exon/intron borders shown below are based on AC004923: 1, 26,116–26,252; 2, 26,402–26,543; 3, 27,025–27,178; 4, 30,520–30,681; 5, 30,895–31,027; 6, 31,748–31,841; 7, 32,401–32,524; 8, 33,413–33,597; 9, 34,315–34,588; 10, 36,405–36,523; 11, 38,342–39,100. The star (*) marks unordered sequence. (B) Localization of the putative ABC-2 type transporter signature (black) in the predicted hUNC93B1 protein. (C) General information about hUNC93B1-like genes. (D) Normal and alternative EST AA632247 transcripts of the hUNC93B1 gene. The protein predicted by the aberrant transcript would be truncated at the alternatively spliced site.

Fig. 3. Analysis of expression of hUNC93B1 in different tissues. Control hybridization with the same filter using a β-actin probe is also shown. Hybridizations of the cDNA clone AA632247 and the β-actin gene were performed on the same filter (Human Multiple Tissue Northern blot, Clontech, #7765-1).

Clones NR5-KE20 and hUNC93B1 have nearly identical sequences to PAC RP5-901A4, which suggests that NR5-KE20 and hUNC93B1 are both located in the corresponding PAC clone. It is important to mention that all these PAC/BAC cloned DNAs (except RP5-901A4 and RP11-138N3) exhibit similarities to the 3′ part of hUNC93B1 (exons 9–11) (Fig. 2A–C). Genomic sequences (including introns) of all these PAC and BAC clones are very similar in this region. The most probable explanation is that in all these cases the sequences for the 5′ ends of the respective genes are not yet known. Another, indirect indication of the importance of the 5′ exons is the fact that the 5′ end of hUNC93B1 (exons 1–8) is similar to that of unc93 (Fig. 5A). However, we cannot rule out the possibility that the 3′ part of hUNC93B1 can exist as a separate gene.

Using FISH, we assigned the NotI linking clone NR5-KE20 to four different chromosomal bands: 3p13→p12, 4p16, 7p22 and 11q13 (Fig. 4B). The NL1-304 clone hybridized to the same chromosomal positions (data not shown), in contrast to the EST clone AA632247 (exons 1–10) and the genomic probe containing hUNC93B1 exons 3–6 (introns 3–5), both of which were mapped to 11q13 only (Fig. 4A). Most probably, the EST hybridized to the single site 11q13 because, of the region showing identity between different chromosomes, it contained only exons 9 and 10, which are too short for FISH (Figs. 1 and 2C).

Fig. 4. Chromosome mapping of hUNC93B1 and NR5-KE20. (A) Assignment of hUNC93B1 by FISH to 11q13. (B) Localization of NR5-KE20 by FISH. Arrows indicate the positions of NR5-KE20 on different chromosomes: 3p13→p12, 4p16, 7p22 and 11q13.

Considering the several chromosomal locations of the NR5-KE20 and NL1-304 clones, and the presence in the human genome of very similar but not identical sequences, we proposed that the hUNC93B1 gene represents one member of a family of closely related genes. We also found that hUNC93B1 was located in chromosomal band 11q13 and contained within the PAC clone RP5-901A4 and the BAC clone RP11-138N3. Careful analysis demonstrated that clones RP5-901A4 and RP11-138N3 indeed contain truly identical sequences (data not shown). Based on these findings we were able to identify 11 exons, shown in Figs. 1 and 2.

The RP11-413E6 clone was nearly identical to NL1-304 (1264 of 1274 bp). NL1-304 was mapped to 3p13→p12 and isolated from a chromosome 3-specific library. We were confident that NL1-304 was located on chromosome 3 because the other NotI linking clone (924021) (Kashuba et al., 1999), which was identical to NL1-304, was isolated from an independent chromosome 3-specific library. As the CTD-2026G6 clone (mapped to chromosome 3) also contained sequences nearly identical to NL1-304 (1264 of 1274 bp), we suggested that either the clone RP11-413E6 was erroneously annotated to chromosome 18 (GenBank accession no. AC012661) or that this BAC clone was chimeric and/or contained a duplication of genomic sequences identical to those from chromosome 3. We designated the gene located on chromosome 3 as hUNC93B3, and another gene located on chromosome 4 (RP11-324I10 BAC clone) as hUNC93B4. Clones NR5-KE20 and NL1-304 displayed hybridization to four chromosomes, and the BAC clone RP11-747H12 was certainly different from the other hUNC93B1-related clones. Therefore, we suggested that this clone originated from chromosome 7 and contained the hUNC93B2 gene (Fig. 2A–C). At present, we cannot rule out the possibility that the hUNC93B1-related genes, other than hUNC93B1 itself, are non-functional genes or pseudogenes.

3.3. The hUNC93B1 is a highly conserved gene

A BLASTN search with the hUNC93B1 sequence found 84% identity in a 355 bp overlap to a mouse cDNA (GenBank accession no. U89424) related to the C. elegans unc-93 gene. We designated the putative mouse gene encoded by this cDNA murine Unc93b (mUnc93b). A BLASTX search with the putative protein sequence derived from the hUNC93B1 cDNA found 86% identity over 118 amino acids with mUnc93b (see Table 1). In fact, in many extended regions mUnc93b and our putative protein hUNC93B1 were identical.

The hUNC93B1 and NL1-304 DNAs displayed significant nucleotide similarity (78–86% identity) to DNA regions upstream from the chicken max gene. We supposed that an ortholog of hUNC93B1 was located in this chicken genomic region. Using the anticipated similarity to hUNC93B1 and existing ORFs, we constructed the putative gene called chicken Unc93b (cUnc93, GenBank accession no. AJ271977). This gene displayed significant similarity to hUNC93B1 at both the nucleotide and the amino acid levels (Table 1). However, it is similar to unc93 only at the protein level (27% identity over 157 aa, E = 4e–13). As the chicken cDNA clone DKFZ426_9N12R1 (GenBank accession no. AJ399414) was virtually identical to cUnc93 (353 out of 354 nucleotides), and this identity followed the exon/intron structure predicted for cUnc93, we strongly believe that the putative chicken gene is not an artifact. Therefore, this chicken sequence was incorrectly annotated as 5′ chicken max gene sequence.
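Several comparisons above are quoted as raw match counts (for example 1264 of 1274 bp, or 353 out of 354 nucleotides) while others are quoted as percentages. A minimal sketch for putting them on the same scale follows; the function name `percent_identity` and the dictionary labels are illustrative, and only the counts themselves come from the text.

```python
# Convert match counts quoted in the text into percent identity.
# The helper name and dictionary keys are illustrative assumptions.


def percent_identity(matches: int, aligned_length: int) -> float:
    """Percent identity = 100 * identical positions / aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return 100.0 * matches / aligned_length


# (identical positions, aligned positions) as quoted in Sections 3.2 and 3.3.
COMPARISONS = {
    "RP11-413E6 vs NL1-304": (1264, 1274),
    "DKFZ426_9N12R1 vs cUnc93": (353, 354),
}

if __name__ == "__main__":
    for label, (matches, length) in COMPARISONS.items():
        print(f"{label}: {percent_identity(matches, length):.1f}% over {length} nt")
```

Both comparisons come out above 99%, which is consistent with the description of these sequences as nearly or virtually identical.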
The hUNC93B1 also displayed significant similarities (Table 1) to the translated region of Drosophila melanogaster cDNA BcDNA.GH10120 (GenBank accession no. AF145657). Furthermore, hUNC93B1 also bore strong similarity to two human open reading frames (ORFs) encoding putative genes similar to unc-93 (human PAC clone 366N23, GenBank accession no. AL021331). We proposed to designate this putative gene located in the PAC clone 366N23 (which maps to chromosome 6q27) as hUNC93A. It is noteworthy that hUNC93A displayed even higher identity with unc93 and D. melanogaster cDNA BcDNA.GH10120 than with hUNC93B1 (Table 1). As D. melanogaster cDNA BcDNA.GH10120 had significant similarity to unc93 (E = 8e–80), we named the putative gene encoded by this cDNA Drosophila Unc93 (dUnc93). In all these comparisons, no significant similarity was noticed at the nucleotide level.

It is worthwhile to note that hUNC93B1 displayed similarity with the hypothetical protein of Arabidopsis thaliana (GenBank accession no. AC016661). This putative protein of A. thaliana also had significant similarity with unc93. Therefore, we designated this gene as atUnc93. It is worth mentioning that the complete gene sequences were identified only for C. elegans (Levin and Horvitz, 1992) and, in this work, for Homo sapiens. We can conclude from Fig. 5 and Table 1 that hUNC93B1 is a very conserved protein, and thus probably plays an important biological role.

Fig. 5. Alignment of the predicted amino acid sequences of the family of unc-93 (C. elegans) related genes. The most conserved 5′ and 3′ regions of UNC93B1 are presented (A and B, respectively). Amino acid sequences were predicted using GenBank accession nos. AJ271326 (hUNC93B1), AL021331 (hUNC93A), X64415 (unc93), U89424 (mUnc93b), AJ271977 (cUnc93b). GenBank accession numbers for putative D. melanogaster, A. thaliana and bacterial proteins are also shown.

Table 1
Similarities between hUNC93B1 related genes/proteins (a)
Entries give percent identity and the length of the aligned region. The table is symmetric, so each pair is listed once.

Human homolog B1, hUNC93B1, versus:
  Mouse homolog B, mUnc93b: 86% 118 aa, 84% 355 bp
  Chicken homolog B, cUnc93b: 62% 275 aa, 83% 207 bp, 77% 322 bp
  Human homolog A, hUNC93A: 28% 140 aa, 28% 203 aa
  Caenorhabditis elegans unc93: 21% 487 aa
  Drosophila melanogaster AF145657: 22% 478 aa
  Arabidopsis thaliana AC016661: 31% 63 aa
  Deinococcus radiodurans AAF12127: 27% 224 aa
  Helicobacter pylori O05731: 31% 64 aa

Mouse homolog B, mUnc93b, versus:
  Chicken homolog B, cUnc93b: 60% 116 aa
  Human homolog A, hUNC93A: 28% 106 aa
  Caenorhabditis elegans unc93: 26% 97 aa
  Drosophila melanogaster AF145657: 23% 96 aa
  Arabidopsis thaliana AC016661: NSS
  Deinococcus radiodurans AAF12127: 40% 52 aa
  Helicobacter pylori O05731: 40% 49 aa

Chicken homolog B, cUnc93b, versus:
  Human homolog A, hUNC93A: 26% 198 aa
  Caenorhabditis elegans unc93: 27% 157 aa
  Drosophila melanogaster AF145657: 26% 186 aa
  Arabidopsis thaliana AC016661: NSS
  Deinococcus radiodurans AAF12127: NSS
  Helicobacter pylori O05731: NSS

Human homolog A, hUNC93A, versus:
  Caenorhabditis elegans unc93: 37% 174 aa, 36% 176 aa
  Drosophila melanogaster AF145657: 43% 238 aa, 37% 211 aa
  Arabidopsis thaliana AC016661: 32% 158 aa, 42% 54 aa
  Deinococcus radiodurans AAF12127: NSS
  Helicobacter pylori O05731: NSS

Caenorhabditis elegans unc93 versus:
  Drosophila melanogaster AF145657: 35% 444 aa
  Arabidopsis thaliana AC016661: 34% 135 aa, 37% 61 aa
  Deinococcus radiodurans AAF12127: NSS
  Helicobacter pylori O05731: 30% 36 aa

Drosophila melanogaster AF145657 versus:
  Arabidopsis thaliana AC016661: 25% 314 aa
  Deinococcus radiodurans AAF12127: NSS
  Helicobacter pylori O05731: NSS

Arabidopsis thaliana AC016661 versus:
  Deinococcus radiodurans AAF12127: NSS
  Helicobacter pylori O05731: NSS

Deinococcus radiodurans AAF12127 versus:
  Helicobacter pylori O05731: 41% 285 aa

(a) aa, amino acids; bp, base pairs; NSS, no significant similarity was found.

3.4. The hUNC93B1 is likely to encode a putative transmembrane transporter

It has been shown that the C. elegans unc-93 membrane-associated muscle protein is involved in the regulation or coordination of muscle contraction in C. elegans (Levin and Horvitz, 1992). Mutations in C. elegans unc-93 resulted in sluggish movement and a characteristic 'rubber band' uncoordinated phenotype (Levin and Horvitz, 1992), inspiring the cognomen unc from 'uncoordinated'. The putative unc-93 protein has two distinct regions: the NH2-terminal portion is extremely hydrophilic, whereas the rest of the protein has multiple potential membrane-spanning domains. Thus, the C. elegans unc-93 gene seems to encode a transmembrane protein with probable ion transporter functions (Levin and Horvitz, 1992). The 'rubber band' mutations do not eliminate gene activity, but rather produce abnormal protein products that disrupt muscle function (Greenwald and Horvitz, 1986). Null mutations in unc93 confer no visibly abnormal phenotype, presumably because the unc93 gene is functionally redundant with another gene (or set of genes) that can maintain normal muscle function.

A search with the aa sequence deduced from the hUNC93B1 gene using BLASTP-BEAUTY also revealed potential transmembrane PROSITE domains. Modeling the transmembrane topology of hUNC93B1 using the TMpred software, we predicted 11 convincing transmembrane helices (marked by arrows in Fig. 1), with a total score of 19,701 (scores above 500 are considered significant). A ProfileScan search of the hUNC93B1 sequence against the PfamA pattern library showed a weak match (score 6.709) to the bacterial ABC-2 type transporter signature (PDOC00692). The candidate ABC transporter pattern was identified in the C-terminal section of the hUNC93B1 protein, from aa positions 319 to 523 (Fig. 2B). This region
is inside the conserved part of hUNC93B1 (Figs. 2 and 5B). It is known that transport ATPases of the ATP-binding cassette (ABC) family are conserved throughout evolution and are located either in the plasma membrane (catalyzing substrate efflux) or in internal membranes. The ABC2 genes represent a distinct subfamily of ABC-type transporters (Reizer et al., 1992; Frosch et al., 1991). The hUNC93B1 protein displayed 27% identity (and 43% positives) over 224 aa to an iron ABC transporter from Deinococcus radiodurans (GenBank accession no. AAF12127), and mUnc93b had 40% identity (50% positives) over 49 aa to a putative Helicobacter pylori iron chelatin transporter system permease protein HP0889 (GenBank accession no. O05731). The HP0889 protein has ten transmembrane helices, and all these similarities allow us to suggest that hUNC93B1 is a transmembrane protein with potential transporter activity.

Acknowledgements

This work was supported by research grants from the Swedish Cancer Society, Ingabritt och Arne Lundbergs Forskningsstiftelse, the Royal Swedish Academy of Sciences, Pharmacia & Upjohn for the Center for Genomics Research and the Karolinska Institute. L.K. acknowledges support from the Russian National Human Genome Program.

References

Allikmets, R.L., Kashuba, V.I., Pettersson, B., Gizatullin, R., Lebedeva, T., Kholodnyuk, I.D., Bannikov, V.M., Petrov, N., Zakharyev, V.M., Winberg, G., Modi, W., Dean, M., Uhlen, M., Kisselev, L.L., Klein, G., Zabarovsky, E.R., 1994. NotI linking clones as a tool for joint physical and genetic maps of the human genome. Genomics 19, 303–309.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Frosch, M., Edwards, U., Bousset, K., Krausse, B., Weisberger, C., 1991. Evidence for a common molecular origin of the capsule gene loci in Gram-negative bacteria expressing group II capsular polysaccharides. Mol. Microbiol. 5, 1251–1263.
Gish, W., States, D.J., 1993. Identification of protein coding regions by database similarity search. Nat. Genet. 3, 266–272.
Greenwald, I., Horvitz, H.R., 1986. A visible allele of the muscle gene sup10X of C. elegans. Genetics 113, 63–72.
Grohmann, K., Wienker, T.F., Saar, K., Rudnik-Schoneborn, S., Stoltenburg-Didinger, G., Rossi, R., Novelli, G., Nurnberg, G., Pfeufer, A., Wirth, B., Reis, A., Zerres, K., Hubner, C., 1999. Diaphragmatic spinal muscular atrophy with respiratory distress is heterogeneous, and one form is linked to chromosome 11q13–q21. Am. J. Hum. Genet. 65, 1459–1462.
Hofmann, K., Stoffel, W., 1993. TMbase – A database of membrane spanning protein segments. Biol. Chem. 347, 166.
Hou, Y.C., Richards, J.E., Bingham, E.L., Pawar, H., Scott, K., Segal, M., Lunetta, K.L., Boehnke, M., Sieving, P.A., 1996. Linkage study of Best's vitelliform macular dystrophy (VMD2) in a large North American family. Hum. Hered. 46, 211–220.
Kashuba, V.I., Gizatullin, R.Z., Protopopov, A.I., Li, J., Vorobieva, N.V., Fedorova, L., Zabarovska, V.I., Muravenko, O.V., Kost-Alimova, M., Domninsky, D.A., Kiss, C., Allikmets, R., Zakharyev, V.M., Braga, E.A., Sumegi, J., Lerman, M., Wahlestedt, C., Zelenin, A.V., Sheer, D., Winberg, G., Grafodatsky, A., Kisselev, L.L., Klein, G., Zabarovsky, E.R., 1999. Analysis of NotI linking clones isolated from chromosome 3 specific libraries. Gene 239, 259–271.
Katsanis, N., Lewis, R.A., Stockton, D.W., Mai, P.M., Baird, L., Beales, P.L., Leppert, M., Lupski, J.R., 1999. Delineation of the critical interval of Bardet–Biedl syndrome 1 (BBS1) to a small region of 11q13, through linkage and haplotype analysis of 91 pedigrees. Am. J. Hum. Genet. 65, 1672–1679.
Lebo, R.V., Anderson, L.A., DiMauro, S., Lynch, E., Hwang, P., Fletterick, R., 1990. Rare McArdle disease locus polymorphic site on 11q13 contains CpG sequence. Hum. Genet. 86, 17–24.
Levin, J.Z., Horvitz, H.R., 1992. The Caenorhabditis elegans unc-93 gene encodes a putative transmembrane protein that regulates muscle contraction. J. Cell Biol. 117, 143–155.
Protopopov, A.I., Gizatullin, R.Z., Vorobieva, N.V., Protopopova, M.V., Kiss, C., Kashuba, V.I., Klein, G., Kisselev, L.L., Grafodatsky, A.S., Zabarovsky, E.R., 1996. High resolution FISH mapping of 50 NotI linking clones homologous to genes and cDNAs on human chromosome 3. Chromosome Res. 4, 443–447.
Reizer, J., Reizer, A., Saier Jr, M.H., 1992. A new subfamily of bacterial ABC-type transport systems catalyzing export of drugs and carbohydrates. Protein Sci. 1, 1326–1332.
Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Sollenberger, K.G., Kao, T.L., Taparowsky, E.J., 1994. Structural analysis of the chicken max gene. Oncogene 9, 661–664.
Zabarovsky, E.R., Allikmets, R., Kholodnyuk, I., Zabarovska, V.I., Paulsson, N., Bannikov, V.M., Kashuba, V.I., Dean, M., Kisselev, L.L., Klein, G., 1994. Construction of representative NotI linking libraries specific for the total human genome and for human chromosome 3. Genomics 20, 312–316.
Zabarovsky, E.R., Gizatullin, R., Podowski, R.M., Zabarovska, V.V., Xie, L., Muravenko, O.V., Kozyrev, S., Petrenko, L., Skobeleva, N., Li, J., Protopopov, A., Kashuba, V., Ernberg, I., Winberg, G., Wahlestedt, C., 2000. NotI clones in the analysis of the human genome. Nucleic Acids Res. 28, 1635–1639.