Analysis of expressed sequence tags of porcine skeletal muscle


Gene 233 (1999) 181–188

www.elsevier.com/locate/gene

R. Davoli*, P. Zambonelli, D. Bigi, L. Fontanesi, V. Russo

DIPROVAL, Sezione di Allevamenti Zootecnici, University of Bologna, Via F.lli Rosselli 107, Coviolo, 42100 Reggio Emilia, Italy

Received 5 January 1999; received in revised form 22 March 1999; accepted 8 April 1999; Received by J.A. Engler

Abstract

Porcine skeletal muscle genes play a major role in determining muscle growth and meat quality. Therefore, to progress towards a better understanding of the genetic factors influencing these traits, the first step is to characterize the genes expressed in skeletal muscle tissue in pig. To this aim, we constructed a porcine biceps femoris muscle cDNA library and sequenced 111 randomly isolated clones. By FASTA analysis we identified 72 unique clones: 47 showed homology to previously identified genes in human or other mammals, 20 matched uncharacterized expressed sequence tags (ESTs), two showed no significant matches to sequences already present in DNA databases, and three other clones containing only repetitive elements were excluded from further analysis. Mitochondrial genes (16.2%), myosin heavy chain genes (9%) and the actin α skeletal muscle gene (9%) were the most abundant transcripts. Among the 47 identified genes several muscle-specific or predominant sequences expressed in skeletal muscle were found. The sequences of the clones matching uncharacterized human, mouse or porcine ESTs were tested by GRAIL in order to identify putative coding regions. The results of our analysis allowed the establishment of a first list of genes expressed in porcine skeletal muscle. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: cDNA library; EST; Skeletal muscle; Swine

1. Introduction

The identification of genes expressed in the cells of a tissue is a basic step towards obtaining essential information about gene function and tissue physiology. A convenient and efficient approach is to characterize the transcripts of such genes by isolating and partially sequencing clones from cDNA libraries, thereby obtaining ESTs (Putney et al., 1983; Adams et al., 1991; Okubo et al., 1992). This approach has been successfully used to generate lists of genes expressed in different tissues and in several species (Adams et al., 1995; Le Provost et al., 1996; Marra et al., 1998). The skeletal muscle profile of gene expression has been extensively studied in human and mouse (Lanfranchi et al., 1996; Lennon et al., 1996). In the pig, only a few ESTs from skeletal muscle have been characterized (Tuggle and Schmitz, 1994), whereas other porcine tissues (small intestine and granulosa cells) have been more widely studied using the EST approach (Winterø et al., 1996; Tosser-Klopp et al., 1997).

Porcine skeletal muscle genes are of particular interest because they play a major role in determining muscle growth and meat quality. A better understanding of the genetic variation in pig meat production therefore depends strongly on the identification and study of the genes expressed in skeletal muscle tissue. Identification of ESTs from a porcine skeletal muscle cDNA library will provide candidate genes for the genetic improvement of meat production traits. With this aim, we characterized 111 clones randomly isolated from a porcine skeletal muscle cDNA library in order to obtain a first list of pig transcripts expressed in this tissue type.

Abbreviations: 3′-UTR, 3′-untranslated region; BAP, bovine alkaline phosphatase; bp, base pairs; cDNA, DNA complementary to RNA; CDS, coding sequence; dNTP, deoxynucleoside triphosphate; EMBL, European Molecular Biology Laboratory; EST, expressed sequence tag; kb, kilo base pair; MOPS, 3-[N-morpholino]propanesulfonic acid; mRNA, messenger RNA; nt, nucleotide; PCR, polymerase chain reaction; TBE, Tris borate ethylenedinitrilo tetraacetic acid.

* Corresponding author. Tel.: +39-0522-290-512; fax: +39-0522-290-523. E-mail address: [email protected] (R. Davoli)

0378-1119/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0378-1119(99)00141-9

2. Materials and methods

2.1. cDNA synthesis and cloning

A sample of biceps femoris muscle from a Duroc×(Landrace×Large White) adult pig was used to extract total RNA according to Chattopadhyay et al. (1994). The integrity of the RNA was checked by running 10 µg of formaldehyde-denatured RNA through a 1% agarose, 6.5% formaldehyde gel in 1× MOPS buffer, pH 7.0. mRNA was isolated by oligo-(dT) cellulose chromatography (Sambrook et al., 1989). The library was constructed using a Time Saver cDNA synthesis kit (Amersham Pharmacia Biotech, Uppsala, Sweden) with oligo-(dT)12–18 as primer for the reverse transcription of 3 µg of mRNA. The synthesized cDNAs were ligated into the EcoRI site of the phagemid vector pT7T3 18U EcoRI/BAP (Amersham Pharmacia Biotech, Uppsala, Sweden) and introduced into Escherichia coli strain NM522 (Stratagene, La Jolla, CA).

To test for the presence of inserts, the colonies were analyzed by PCR using T3 and T7 sequencing primers (Amersham Pharmacia Biotech) according to the protocol of Trower and Elgar (1994), with the following modifications. Bacterial colonies were directly amplified in a PCR mixture containing 2 nmol each of the four dNTPs, 4 pmol of each primer, 0.05 µl of RNase, DNase-free (Boehringer Mannheim, Mannheim, Germany) and 0.25 unit of Taq DNA polymerase (Perkin Elmer, Norwalk, CT). PCR amplifications were performed on a Perkin Elmer GeneAmp PCR system 9600. The cycling conditions were 95°C for 5 min, followed by 35 cycles of 95°C for 30 s, 50°C for 30 s and 72°C for 2 min, and then a final extension of 5 min at 72°C. The size of the amplified inserts was checked by agarose gel electrophoresis.

2.2. Sequencing

Single-pass sequencing was performed with [α-35S]dATP using an AmpliCycle sequencing kit (Perkin Elmer) and T3 and T7 primers (Amersham Pharmacia Biotech). The reactions were carried out on a Perkin Elmer DNA Thermal Cycler 480, and sequencing products were analyzed on a Genomyx LR apparatus (Beckman Coulter Inc., Fullerton, CA) using 4.5% polyacrylamide, 8 M urea gels in 1× TBE buffer.

2.3. Sequence data analysis

The clones were identified by homology search in EMBL database release 55 (June 1998) using FASTA 3.1 (August 1998) at the European Bioinformatics Institute (Cambridge, UK; http://www2.ebi.ac.uk/fasta3/; Pearson and Lipman, 1988). The criteria for scoring a sequence as having a significant match were similarity ≥70% or E-value ≤1e−10 in an overlapping region of at least 70 bp. The clones showing homology with uncharacterized ESTs in databases and the sequences that had no match were checked for the presence of putative open reading frames using GRAIL release 1.3 (Uberbacher and Mural, 1991; http://compbio.ornl.gov/Grail-1.3/). The regions containing repetitive elements were identified with RepeatMasker (Smit AFA and Green P, unpublished; http://ftp.genome.washington.edu/).
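The match criteria of Section 2.3 can be sketched as a small filter over FASTA hits. This is only an illustration, not the procedure actually run by the authors: the `Hit` record and the `is_significant` function are hypothetical names, and the code assumes one particular reading of the criteria, namely that the 70 bp minimum overlap applies to both alternatives (identity ≥70% or E-value ≤1e−10).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One database match; the field names are illustrative assumptions."""
    identity_pct: float  # percent identity in the overlapping region
    e_value: float       # E-value reported for the match
    overlap_bp: int      # length of the overlapping region in bp


def is_significant(hit, min_identity=70.0, max_e_value=1e-10, min_overlap=70):
    """Assumed reading of the criteria: the overlap must be at least 70 bp,
    and either the identity or the E-value threshold must be met."""
    long_enough = hit.overlap_bp >= min_overlap
    similar_enough = hit.identity_pct >= min_identity
    unlikely_by_chance = hit.e_value <= max_e_value
    return long_enough and (similar_enough or unlikely_by_chance)


# Two rows taken from Table 2: the first passes on identity only,
# the second on E-value only.
print(is_significant(Hit(identity_pct=83.6, e_value=4.8e-08, overlap_bp=73)))
print(is_significant(Hit(identity_pct=63.2, e_value=5.7e-14, overlap_bp=261)))
```

Under this reading, a short but highly similar match (for example 95% identity over 50 bp) would be rejected because the overlap is too short.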

3. Results and discussion

3.1. cDNA library

In order to establish a first list of genes expressed in pig skeletal muscle, a cDNA library containing approximately 1e5 independent primary transformants (1e6 independent clones per microgram of starting DNA vector) was constructed. To analyze the skeletal muscle cDNA library, we chose clones at random in order to isolate genes highly expressed in this tissue. The detection of unique transcripts and the identification of single components of gene families can be obtained by analyzing the 3′-UTRs of genes (Okubo et al., 1992; Lanfranchi et al., 1996). The analysis of the 3′-end portion of mRNAs is of special importance for the transcripts expressed in skeletal muscle because, in different stages of development and in different fiber types, several isoforms of the same gene family are expressed (De Nardi et al., 1993; Smerdu et al., 1994; Schiaffino and Reggiani, 1996). To obtain the 3′-end of the clones, oligo-(dT) was used as primer for the reverse transcription. This approach allowed the identification of different myosin heavy chain isoforms in pig by a comparative analysis of their 3′-UTRs from different species (Davoli et al., 1998).

3.2. Sequencing

One hundred and sixty-five independent clones with an insert size greater than 0.2 kb and an average size of about 0.6 kb were sequenced, and at least one readable end-sequence was obtained for 111 clones. Fifty-four of the 165 sequenced cDNA clones were not studied any further, mainly because the poly(A) tail may have interfered with the cycle-sequencing reaction, producing a non-readable pattern (Lanfranchi et al., 1996; Tosser-Klopp et al., 1997). Eighty-one of the 111 clones considered yielded both T3 and T7 sequences. The full sequence of 29 clones was obtained when overlapping regions from both ends were available. The length of the sequenced regions ranged from a minimum of 117 nt to a maximum of 460 nt, with an average length of 270 nt.

3.3. Clone identification

FASTA analysis allowed the identification of 72 unique clones out of 111, and a total of 116 end-sequences were obtained. A summary of the non-redundant clones is reported in Table 1. Forty-seven clones out of 72 showed homology with previously identified genes in human or other mammalian species, 20 were similar to uncharacterized ESTs, and two showed no significant matches to sequences already included in DNA databases (Tables 1 and 2). Three clones considered as genomic sequences were excluded from further analysis.

Table 1
Summary of the 72 non-redundant clones obtained from the porcine skeletal muscle cDNA library

Category | Number of clones (%) | Number of end-sequences (%)
Homology to known transcripts | 47 (65.3) | 81 (69.8)
  Human | 22 (30.6) | 39 (33.6)
  Pig | 8 (11.1) | 15 (12.9)
  Other species | 17 (23.6) | 27 (23.3)
Homology to uncharacterized ESTs | 20 (27.8) | 28 (24.1)
  Human | 18 (25.0) | 25 (21.5)
  Pig | 1 (1.4) | 2 (1.7)
  Other species | 1 (1.4) | 1 (0.9)
No database match [poly(A)] | 2 (2.8) | 3 (2.6)
Genomic DNA | 3 (4.2) | 4 (3.5)
Total | 72 | 116

The results of the homology search for the identification of the 69 remaining clones are reported in Table 2. The matches ranged between 73 and 1041 bp, the latter for a cDNA of tropomyosin α. Nine clones homologous to porcine sequences in the database showed percentage matches from 97.1 to 100%. The sequences matching human or other mammalian sequences showed an identity ranging from 63.1 to 98.6%.

The most abundant transcripts observed in our library were the mitochondrial genes (16.2%), the actin α skeletal muscle gene (9%) and the complex of the myosin heavy chain genes (9%). These data are in agreement with the results of Lanfranchi et al. (1996), who reported that the most frequent cDNAs expressed in human skeletal muscle were the mitochondrial genes (24.8%) and the actin α skeletal muscle gene (8.5%). Moreover, Peuker and Pette (1993) showed that myosin heavy chain mRNAs correspond to approximately 8% of the total mRNA in rabbit soleus muscle. A comparison of our results with the data reported in human and rabbit indicates that these genes may present a similar level of transcription in skeletal muscle tissues from different mammals.

The 43 sequences listed in Table 2 identified as nuclear genes were assigned to different categories according to Adams et al. (1995), as shown in Table 3. Nine unique clones were represented by genes coding for myofibrillar proteins: actin α skeletal muscle, myosin heavy chains, myosin light chains fast skeletal muscle, tropomyosin α fast skeletal muscle and troponin C fast skeletal muscle.

Three different myosin heavy chain isoforms were isolated from the cDNA library: the myosin heavy chain β slow and two adult fast skeletal isoforms. The latter two were characterized as 2B and 2X by comparing their 3′-UTRs to the corresponding regions of the previously characterized myosin heavy chain isoforms from other mammals (Davoli et al., 1998; Zijlstra et al., 1998). Two myosin light chain fast skeletal clones coding for the isoforms 1F and 3F were isolated. These transcripts are produced by alternative splicing of the same gene in human, mouse and rat (Daubas et al., 1985). The isolation of these two isoforms in pig confirms our previous finding of two transcripts of this gene detected by Northern analysis of skeletal muscle total RNA hybridized with a clone of porcine myosin light chain 3F (Davoli et al., 1997).

Two different transcripts of tropomyosin α were found. The two transcripts showed a perfect overlap of 576 bp of the coding region and of 193 bp of the 3′-UTR. The difference between the two transcripts lies in the length of the 3′-UTR. The clone indicated in Table 2 as transcript 1 (Z98773) showed a longer 3′-UTR than transcript 2 (Z98799). Moreover, the shorter 3′-UTR showed only one polyadenylation signal, starting at nucleotide 174 of the 3′-UTR, whereas the longer 3′-UTR showed an additional polyadenylation signal 26 bp upstream from its poly(A) tail. The different length of the two transcripts is probably due to the use of the two polyadenylation signals detected. The EMBL DNA database contains several human and mouse α-tropomyosin cDNAs or ESTs showing the same difference in 3′-UTR length, probably due to the use of two polyadenylation signals. Both pig transcripts can be classified as tropomyosin α fast-twitch skeletal muscle, as indicated by the FASTA results showing the highest similarity with the vertebrate sequences coding for the tropomyosin α isoform specific to fast-twitch striated muscles (MacLeod and Gooding, 1988; Lees-Miller and Helfman, 1991).
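The comparison of polyadenylation signals in the two tropomyosin α 3′-UTRs can be illustrated with a short scan for signal hexamers. This is a minimal sketch under stated assumptions: the text does not say which motif was searched for, so the canonical AATAAA and its common variant ATTAAA are assumed here; `find_polya_signals` is a hypothetical helper, and the toy sequence is invented for the example and is not the porcine sequence.

```python
def find_polya_signals(utr, signals=("AATAAA", "ATTAAA")):
    """Return (1-based start, motif) for every signal hexamer in a 3'-UTR.

    The motif list is an assumption made for this sketch.
    RNA input (with U) is accepted and converted to DNA letters first.
    """
    sequence = utr.upper().replace("U", "T")
    hits = []
    for motif in signals:
        start = sequence.find(motif)
        while start != -1:
            hits.append((start + 1, motif))
            # Step by one base so that overlapping motifs are also reported.
            start = sequence.find(motif, start + 1)
    return sorted(hits)


# Invented toy 3'-UTR with two signals, standing in for a long transcript.
toy_utr = "GC" * 5 + "AATAAA" + "C" * 20 + "ATTAAA" + "G" * 10
for position, motif in find_polya_signals(toy_utr):
    print(position, motif)
```

Applied to two transcripts of the same gene, a scan of this kind would show whether the longer 3′-UTR carries a second signal near its poly(A) tail, which is the pattern described above for transcripts 1 and 2.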
It is important to note that all the cDNAs coding for myofibrillar proteins isolated from our porcine cDNA library, except the myosin heavy chain β slow, are the isoforms specific to fast skeletal muscles. This is in agreement with the prevalence of fast fibers in porcine biceps femoris muscle (Karlsson et al., 1993), which was used for the extraction of the mRNA to construct the cDNA library.

Among the genes reported in Table 2, apart from those encoding myofibrillar proteins, eight exhibited a skeletal muscle-restricted or skeletal muscle-predominant pattern of expression: ATPase Na+/K+ transporting subunit α2 (Orlowski and Lingrel, 1988), actinin α2 associated LIM protein (Xia et al., 1997), creatine kinase sarcomeric mitochondrial (Piétu et al., 1996), desmin (Piétu et al., 1996), glycogen phosphorylase muscle (Piétu et al., 1996), LIM protein skeletal muscle (Arber et al., 1994; Fung et al., 1995), pyruvate dehydrogenase kinase isoenzyme 4 (Rowles et al., 1996) and sarcolipin (Odermatt et al., 1997).

In our investigation, we also identified 20 clones matching only uncharacterized ESTs. Eighteen out of 20 revealed the highest homology to human ESTs, one matched a mouse EST and the remaining one a porcine EST (Tables 1 and 2). Clone 1a03 was included in Table 2 among the 20 uncharacterized EST matches because, even though its first match was with a human genomic sequence (Z74617), it showed other significant matches with ESTs in the FASTA output, but with a lower probability. Moreover, the human genomic sequence representing the first match carried several annotations referring to human ESTs. These findings could indicate that our clone may be considered a transcribed sequence and not a genomic contamination.

In addition to the most abundant ESTs, the sequencing of our cDNA library also allowed us to isolate some rare transcripts of muscle tissue. In particular, four clones (3b02, 3c07, 3c10, c2-5) showed in their FASTA output significant matches to only a few ESTs, isolated mainly from skeletal muscle, heart or rhabdomyosarcoma. Since the number of significant matches detected for a sequence can be considered as representative of the relative abundance of the corresponding mRNA, these four clones may correspond to low-level transcripts specific to muscle tissue.

Table 2
Results of the homology search performed with FASTA of the 69 unique clones that have been identified

Clone accession No. | Clone name | 5′/3′ end (a) | Frequency | Matching species (b) | Matching accession No. | E value | Identity (%) | bp overlap/total bp
Nuclear genes
Z98813 | 14-3-3 protein γ subtype | 3′ | 1 | RN | S55305 | 1.6e−71 | 87.5 | 530/515
Z98779 | Actinin α2 associated LIM protein | 5′ | 1 | HS | AF039018 | 2.0e−22 | 86.1 | 151/152
Z98823 | Actin α, skeletal muscle | 3′ | 10 | SS | U16368 | 4.7e−87 | 100 | 398/912
Z98782 | ATPase Na+/K+ transporting, α2 subunit | 5′ | 1 | HS | J05096 | 1.1e−18 | 79.3 | 135/300
Z98783 | ATPase Na+/K+ transporting, α2 subunit | 3′ | 1 | HS | J05096 | 8.1e−44 | 78.9 | 308/300
Z98803 | ATP synthase d-subunit | 5′ | 1 | BT | X06089 | 3.2e−66 | 88.7 | 335/331
Z98805 | Calmodulin 1 | 3′ | 1 | HS | U16850 | 3.9e−91 | 83.3 | 562/554
Z98825 | Creatine kinase sarcomeric mitochondrial | 3′ | 1 | HS | J05401 | 1.3e−29 | 84.0 | 212/222
Z98816 | Crystallin αB | 3′ | 1 | HS | M28638 | 1.2e−46 | 92.8 | 222/224
X94252 | Desmin | 3′ | 1 | HS | M63391 | 1.4e−35 | 84.6 | 246/283
Z98807 | DNA binding protein A | 5′ | 1 | HS | X95325 | 3.8e−25 | 85.9 | 184/359
Z98837 | DNA polymerase zeta catalytic subunit | 3′ | 1 | HS | AF078695 | 7.4e−56 | 87.0 | 353/370
Z98774 | Doc 1 | 5′ | 1 | HS | AF006484 | 4.8e−11 | 77.2 | 114/203
Z98775 | Doc 1 | 3′ | 1 | HS | AF006484 | 5.1e−28 | 80.3 | 229/245
Z98834 | Glyceraldehyde 3 phosphate dehydrogenase | 3′ | 4 | OC | L23961 | 1.9e−88 | 85.6 | 561/578
Z98787 | Glycogen phosphorylase, muscle | 3′ | 1 | OA | AF001899 | 1.0e−40 | 81.1 | 271/338
Z98790 | H5 | 5′ | 1 | HS | AF035811 | 4.3e−93 | 91.5 | 434/433
Z98791 | H5 | 3′ | 1 | HS | AF035811 | 1.1e−65 | 91.6 | 324/319
X94253 | Heterogeneous nuclear ribonucleoprotein E2 | 3′ | 1 | MM | X97982 | 3.2e−131 | 94.9 | 570/600
Z98824 | Isocitrate dehydrogenase NAD(H) specific, α-subunit | 3′ | 1 | HS | U07681 | 5.7e−14 | 63.2 | 261/289
Y18405 | Isocitrate dehydrogenase NAD(H) specific, α-subunit | 5′ | 1 | HS | U07681 | 2.3e−10 | 72.7 | 99/117
Z98826 | LIM protein, skeletal muscle | 5′ | 1 | HS | U60115 | 9.9e−37 | 77.7 | 319/321
Z98780 | Malate dehydrogenase, cytosolic | 5′ | 1 | SS | U44846 | 4.8e−55 | 100 | 213/213
Z98770 | mCBP | 5′ | 1 | MM | X75947 | 2.8e−76 | 98.6 | 277/278
X91846 | Myosin heavy chain β slow | 3′ | 1 | SS | U75316 | 5.1e−174 | 98.5 | 927/937
X91845 | Myosin heavy chain 2B, fast skeletal muscle | 3′ | 8 | OC | S68376 | 1.8e−97 | 86.7 | 798/798
Z98835 | Myosin heavy chain 2X, fast skeletal muscle | 3′ | 1 | OC | U32574 | 7.1e−145 | 92.4 | 965/983
Z98788 | Myosin light chain alkali 1F | 5′ | 1 | BT | U45430 | 6.9e−60 | 96.9 | 254/265
Y18404 | Myosin light chain alkali 1F | 3′ | 1 | BT | U45430 | 1.1e−74 | 89.0 | 372/372
X94689 | Myosin light chain alkali 3F | 3′ | 2 | HS | M20643 | 6.3e−119 | 88.7 | 837/859
X94254 | Phosphofructokinase, muscle type | 3′ | 1 | CF | U25183 | 9.5e−39 | 87.8 | 311/324
Z98802 | Phosphoglycerate mutase, muscle subunit | 3′ | 2 | HS | M18172 | 1.8e−65 | 89.8 | 329/359
X91847 | Proteasome subunit C9 | 3′ | 1 | HS | D00763 | 9.1e−129 | 87.9 | 934/949
Z98796 | Proteasome subunit HN3 | 3′ | 1 | HS | S71381 | 4.8e−91 | 91.8 | 428/432
Z98822 | Pyruvate dehydrogenase kinase isoform 4 | 5′ | 1 | HS | AC002451 | 2.7e−17 | 70.6 | 228/229
Z98821 | Pyruvate dehydrogenase kinase isoform 4 | 3′ | 1 | HS | AC002451 | 3.5e−37 | 82.7 | 278/286
Z98818 | Replication factor C large subunit | 5′ | 1 | HS | L23320 | 5.7e−15 | 70.1 | 204/269
Z98819 | Replication factor C large subunit | 3′ | 1 | HS | L23320 | 4.8e−08 | 83.6 | 73/276
Z98798 | Ribosomal protein L4 | 5′ | 1 | CF | X99909 | 6.8e−91 | 92.2 | 396/393
Z98784 | Ribosomal protein L14 | 5′ | 1 | HS | U16738 | 3.4e−38 | 85.1 | 261/257
Z98812 | Ribosomal protein S3a | 5′ | 1 | HS | M77234 | 3.8e−74 | 95.8 | 307/308
Z98811 | Ribosomal protein S3a | 3′ | 1 | HS | M77234 | 1.0e−79 | 93.6 | 358/361
Z98831 | Ribosomal protein S24 | 3′ | 1 | SS | Z84114 | 7.8e−94 | 97.1 | 414/465
Z98786 | Ribosomal protein S28 | 3′ | 1 | SS | Z72399 | 7.7e−81 | 100 | 295/295
Z98820 | Sarcolipin | 5′ | 1 | OC | U96091 | 1.0e−49 | 78.7 | 334/324
Z98841 | Ser/Arg-related nuclear matrix protein | 3′ | 1 | HS | AF048977 | 9.6e−100 | 91.7 | 648/656
X91850 | Spectrin β, non-erythroid | 5′ | 1 | CF | L02897 | 1.5e−74 | 91.8 | 340/341
X91849 | Spectrin β, non-erythroid | 3′ | 1 | CF | L02897 | 3.2e−58 | 92.6 | 364/375
Z98769 | Translationally controlled tumor protein | 5′ | 1 | HS | X16064 | 1.2e−73 | 94.0 | 315/315
Z98768 | Translationally controlled tumor protein | 3′ | 1 | HS | X16064 | 2.6e−55 | 86.8 | 318/331
Z98773 | Tropomyosin α fast, transcript 1 | 3′ | 1 | SS | X66274 | 2.3e−165 | 99.6 | 770/858
Z98799 | Tropomyosin α fast, transcript 2 | 3′ | 3 | SS | X66274 | 4.4e−234 | 99.7 | 1054/1054
Z98777 | Troponin C, fast skeletal muscle | 5′ | 1 | OC | Y00760 | 7.5e−40 | 96.3 | 188/188
Mitochondrial genes
Z98789 | ATP synthase a-subunit | 5′ | 3 | EC | X79547 | 3.2e−24 | 74.3 | 241/277
Z98828 | Cytochrome B | 3′ (fl) (c) | 3 | SS | AB015076 | 4.2e−152 | 99.7 | 614/622
Z98830 | Cytochrome C oxidase polypeptide III | 3′ (fl) | 11 | BP | X61145 | 1.0e−120 | 83.0 | 766/781
Z98832 | NADH ubiquinone oxidoreductase subunit 1 | 3′ (fl) | 1 | BM | X72204 | 2.4e−37 | 79.2 | 255/347
Uncharacterized ESTs
Z98765 | 1A02 | see a | 1 | HS | AA463276 | 1.3e−15 | 75.2 | 169/194
Z98764 | 1A02 | see a | 1 | HS | R45171 | 4.6e−14 | 79.0 | 209/210
Z98766 | 1A03 | see a | 1 | HS | Z74617 | 3.0e−32 | 73.5 | 317/384
Z98767 | 1B11 | see a | 1 | HS | W67463 | 3.5e−13 | 70.0 | 160/160
Z98776 | 1F12 | see a | 1 | HS | AA989233 | 8.0e−20 | 76.2 | 206/194
Z98778 | 2C05 | see a | 2 | HS | AA770292 | 2.3e−31 | 76.8 | 276/271
Z98781 | 2E01 | see a | 1 | HS | AA195156 | 7.2e−66 | 90.6 | 329/331
Z98785 | 2E08 | see a | 1 | HS | AA374262 | 1.5e−12 | 73.5 | 147/311
Z98793 | 2H05 | see a | 2 | HS | N24376 | 3.7e−64 | 87.9 | 464/642
Z98797 | 3A01 | see a | 1 | HS | AA371719 | 8.4e−27 | 86.5 | 178/180
Z98800 | 3B02 | see a | 1 | HS | AA180130 | 4.3e−19 | 76.5 | 170/434
Z98801 | 3B02 | see a | 1 | HS | Z25197 | 4.9e−23 | 78.3 | 157/161
Z98804 | 3C05 | see a | 1 | HS | R54575 | 3.3e−16 | 85.6 | 146/348
Z98806 | 3C07 | see a | 1 | HS | AA442571 | 1.4e−15 | 69.4 | 242/353
Z98809 | 3C10 | see a | 1 | HS | F00196 | 2.4e−13 | 79.9 | 149/315
Z98810 | 3D02 | see a | 1 | HS | AA017603 | 1.1e−29 | 79.2 | 245/323
Z98827 | 3G07 | see a | 1 | MM | AA216840 | 1.8e−53 | 86.9 | 297/322
Z98829 | 3H02 | see a | 1 | SS | Z84020 | 3.1e−107 | 99.5 | 424/438
Z98836 | 06-01 | see a | 1 | HS | AA948097 | 1.4e−52 | 87.1 | 522/590
Z98838 | 12-3 | see a | 1 | HS | AA115631 | 2.2e−41 | 90.6 | 278/276
Z98839 | 13-1 | see a | 1 | HS | W68245 | 4.0e−45 | 93.4 | 305/303
Z98840 | 13-1 | see a | 1 | HS | AA515188 | 2.5e−64 | 93.2 | 338/339
Z98845 | C2-5 | see a | 1 | HS | C04902 | 7.6e−27 | 71.4 | 301/311
Z98844 | C2-5 | see a | 1 | HS | F16970 | 2.5e−30 | 92.2 | 205/245
No match with ESTs/cDNAs
Z98771 | 1E05 | see a | 1 | NM (d) | | | | 0/313
Z98842 | C1-2 | see a | 1 | NM | | | | 0/690

a For four EST clones, this column is not filled in because not enough information to find the correct 5′/3′ orientation was available. For the uncharacterized EST and no-match rows, the entries of this column could not be aligned with individual rows; in their listed order they are: 5′, 3′, 3′ (fl); 5′, 5′, 3′ (fl), 3′ (fl), 3′, 5′, 3′, 5′, 3′; 3′, 3′ (fl); 5′, 5′, 5′, 3′ (fl), 5′; 5′, 5′.
b BM, Balaenoptera musculus; BP, Balaenoptera physalus; BT, Bos taurus; CF, Canis familiaris; EC, Equus caballus; HS, Homo sapiens; MM, Mus musculus; OA, Ovis aries; OC, Oryctolagus cuniculus; RN, Rattus norvegicus; SS, Sus scrofa.
c fl, full length. Twenty-one entries in the nuclear-gene block also carry the (fl) mark; these marks could not be aligned with individual rows.
d NM, no database match.
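The transcript abundances quoted in Section 3.3 (mitochondrial genes 16.2%, actin α 9%, myosin heavy chains 9%) follow from the Frequency column of Table 2 when divided by the 111 clones analyzed. The sketch below reproduces that arithmetic; the grouping of rows and the names `frequency`, `groups` and `percent_of_library` are illustrative choices made for this example, not something described in the text.

```python
LIBRARY_SIZE = 111  # clones with at least one readable end-sequence

# Clone counts copied from the Frequency column of Table 2.
frequency = {
    "ATP synthase a-subunit": 3,
    "Cytochrome B": 3,
    "Cytochrome C oxidase polypeptide III": 11,
    "NADH ubiquinone oxidoreductase subunit 1": 1,
    "Actin alpha, skeletal muscle": 10,
    "Myosin heavy chain beta slow": 1,
    "Myosin heavy chain 2B": 8,
    "Myosin heavy chain 2X": 1,
}

# Rows pooled into the three groups discussed in the text.
groups = {
    "mitochondrial genes": [
        "ATP synthase a-subunit",
        "Cytochrome B",
        "Cytochrome C oxidase polypeptide III",
        "NADH ubiquinone oxidoreductase subunit 1",
    ],
    "actin alpha skeletal muscle": ["Actin alpha, skeletal muscle"],
    "myosin heavy chains": [
        "Myosin heavy chain beta slow",
        "Myosin heavy chain 2B",
        "Myosin heavy chain 2X",
    ],
}


def percent_of_library(names, counts=frequency, total=LIBRARY_SIZE):
    """Share of the analyzed clones represented by the named rows, in percent."""
    return round(100 * sum(counts[name] for name in names) / total, 1)


abundance = {label: percent_of_library(names) for label, names in groups.items()}
for label, pct in abundance.items():
    print(f"{label}: {pct}%")
```

The same helper can be pointed at any other subset of rows, for example the ribosomal protein clones, to estimate their share of the library.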

Table 3
List of the identified genes classified according to Adams et al. (1995)

Adams' classification | Number | Genes
Cell division | 2 | DNA polymerase zeta catalytic subunit; Replication factor C large subunit
Cell signaling and cell communication | 3 | 14-3-3 protein γ subtype; Calmodulin 1; Sarcolipin
Cell structure and motility | 13 | Actinin α2 associated LIM protein; Actin α, skeletal muscle; Crystallin αB; Desmin; Myosin heavy chain β slow; Myosin heavy chain 2B, fast skeletal muscle; Myosin heavy chain 2X, fast skeletal muscle; Myosin light chain alkali 1F; Myosin light chain alkali 3F; Spectrin β, non-erythroid; Tropomyosin α fast, transcript 1; Tropomyosin α fast, transcript 2; Troponin C, fast skeletal muscle
Cell/organism defense | 1 | Creatine kinase sarcomeric mitochondrial
Gene and protein expression | 12 | DNA binding protein A; Heterogeneous nuclear ribonucleoprotein E2; LIM protein, skeletal muscle; mCBP; Proteasome subunit C9; Proteasome subunit HN3; Ribosomal protein L4; Ribosomal protein L14; Ribosomal protein S3a; Ribosomal protein S24; Ribosomal protein S28; Ser/Arg-related nuclear matrix protein
Metabolism | 9 | ATPase Na+/K+ transporting, α2 subunit; ATP synthase d-subunit; Glyceraldehyde 3 phosphate dehydrogenase; Glycogen phosphorylase, muscle; Isocitrate dehydrogenase NAD(H) specific, α-subunit; Malate dehydrogenase, cytosolic; Phosphofructokinase, muscle type; Phosphoglycerate mutase, muscle subunit; Pyruvate dehydrogenase kinase isoform 4
Unclassified | 3 | Doc 1; H5; Translationally controlled tumor protein

Clones 1e05 and c1-2 presented no significant homologies with the sequences in the EMBL DNA databases but, as they contained the poly(A) tail and the polyadenylation signal, it could be presumed that these sequences may be new genes.

A further analysis of the sequences of the 20 clones matching an uncharacterized EST and of the two clones without any match was performed using the GRAIL program. In three of the sequences analyzed (2e01t7, 2h05, 3h02), we identified putative open reading frames ranging from 46 to 140 amino acids with a score indicating excellent protein coding potential (Fig. 1). The short average length of the obtained sequences and the high proportion of 3′-UTRs are the most likely reasons for the small number of potential coding regions detected.

Fig. 1. Partial DNA and deduced amino acid sequence of ESTs Z98781, Z98792 and Z98829, corresponding to the longest open reading frame identified by GRAIL.

Three clones (3d08, 3e05, 4c08), indicated at the end of Table 1, were considered a genomic contamination of the cDNA library because most of their sequence consisted of repetitive elements identified with RepeatMasker, and no other elements supporting the evidence of transcribed sequences were found.

3.4. Conclusions

The isolation of genes expressed in skeletal muscle tissue is a necessary step to progress toward a better understanding of the genetic basis of meat production. In the present work, we identified 43 genes, most of them important for the structure and metabolism of muscle cells. Moreover, for the genes coding for some of the main myofibrillar proteins, such as myosin heavy chain, myosin light chain and tropomyosin α, several isoforms or splice variants were found. The sequences of the 69 unique transcripts isolated can be useful for gene mapping by PCR analysis of porcine–rodent somatic cell hybrid panels (Yerle et al., 1996; Zijlstra et al., 1996) and can represent a source of PCR-based polymorphic markers (Takahashi and Ko, 1993). Therefore, our sequences can help improve the porcine transcription map by adding genes expressed in skeletal muscle that could be candidates for meat production and meat quality traits.

Acknowledgement

This work was funded by the EC Inco–Copernicus project IC15-CT96-0902 (DG 12-CDPE) and the Italian MURST 60%.

References

Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., 1991. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252, 1651–1656.
Adams, M.D., Kerlavage, A.R., Fleischmann, R.D., Fuldner, R.A., Bult, C.J., Lee, N.H., Kirkness, E.F., Weinstock, K.G., Gocayne, J.D., White, O., et al., 1995. Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature (Lond.) 377, 6547 Suppl., 3–174.
Arber, S., Halder, G., Caroni, P., 1994. Muscle LIM protein, a novel essential regulator of myogenesis, promotes myogenic differentiation. Cell 79, 221–231.
Chattopadhyay, N., Kher, R., Godbole, M., 1994. Inexpensive SDS/phenol method for RNA extraction from tissues. BioTechniques 15, 24–26.
Daubas, P., Robert, B., Garner, I., Buckingham, M., 1985. A comparison between mammalian and avian fast skeletal muscle alkali myosin light chain genes: regulatory implications. Nucleic Acids Res. 13, 4623–4643.
Davoli, R., Fontanesi, L., Costosi, E., Zambonelli, P., Russo, V., 1997. Isolation and sequencing of porcine fast skeletal muscle alkali myosin light chain 3 cDNA. Anim. Biotech. 8, 179–185.
Davoli, R., Zambonelli, P., Bigi, D., Fontanesi, L., Russo, V., 1998. Isolation and mapping of two porcine skeletal muscle myosin heavy chain isoforms. Anim. Genet. 29, 91–97.
De Nardi, C., Ausoni, S., Moretti, P., Gorza, L., Velleca, M., Buckingham, M., Schiaffino, S., 1993. Type 2X-myosin heavy chain is coded by a muscle fiber type-specific and developmentally regulated gene. J. Cell Biol. 123, 823–835.
Fung, Y.W., Wang, R.X., Heng, H.H., Liew, C.C., 1995. Mapping of a human LIM protein (CLP) to human chromosome 11p15.1 by fluorescence in situ hybridization. Genomics 28, 602–603.
Karlsson, A., Enfält, A.-C., Essén-Gustavsson, B., Lundström, K., Rydhmer, L., Stern, S., 1993. Muscle histochemical and biochemical properties in relation to meat quality during selection for increased lean tissue growth rate in pigs. J. Anim. Sci. 71, 930–938.
Lanfranchi, G., Muraro, T., Caldara, F., Pacchioni, B., Pallavicini, A., Pandolfo, D., Toppo, S., Trevisan, S., Scarso, S., Valle, G., 1996. Identification of 4370 expressed sequences tags from a 3′-end-specific cDNA library of human skeletal muscle by DNA sequencing and filter hybridization. Genome Res. 6, 35–42.
Lees-Miller, J.P., Helfman, D.M., 1991. The molecular basis for tropomyosin isoform diversity. BioEssays 13, 429–437.
Lennon, G., Auffray, C., Polymeropoulos, M., Soares, M.B., 1996. The I.M.A.G.E. Consortium: an integrated molecular analysis of genomes and their expression. Genomics 33, 151–152.
Le Provost, F., Lépingle, A., Martin, P., 1996. A survey of the goat genome transcribed in the lactating mammary gland. Mamm. Genome 7, 657–666.
MacLeod, A.R., Gooding, C., 1988. Human hTMα gene: expression in muscle and nonmuscle tissue. Mol. Cell. Biol. 8, 433–440.
Marra, M.A., Hillier, L., Waterston, R.H., 1998. Expressed sequence tags — ESTablishing bridges between genomes. Trends Genet. 14, 4–7.
Odermatt, A., Taschner, P.E.M., Scherer, S.W., Beatty, B., Khanna, V.K., Cornblath, D.R., Chaudhry, V., Yee, W.-C., Schrank, B., Karpati, G., Breuning, M.H., Knoers, N., MacLennan, D.H., 1997. Characterization of the gene encoding human sarcolipin (SLN), a proteolipid associated with SERCA1: absence of structural mutations in five patients with Brody disease. Genomics 45, 541–553.
Okubo, K., Hori, N., Matoba, R., Niiyama, T., Fukushima, A., Kojima, Y., Matsubara, K., 1992. Large scale cDNA sequencing for analysis of quantitative and qualitative aspects of gene expression. Nat. Genet. 2, 173–179.
Orlowski, J., Lingrel, J.B., 1988. Tissue-specific and developmental regulation of rat Na,K-ATPase catalytic α isoform and β subunit mRNAs. J. Biol. Chem. 263, 10436–10442.
Pearson, W.R., Lipman, D.J., 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444–2448.
Peuker, H., Pette, D., 1993. Non-radioactive reverse transcriptase/polymerase chain reaction for quantification of myosin heavy chain mRNA isoforms in various rabbit muscles. FEBS Lett. 318, 253–258.
Piétu, G., Alibert, O., Guichard, V., Lamy, B., Bois, F., Leroy, E., Mariage-Sampson, R., Houlgatte, R., Soularue, P., Auffray, C., 1996. Novel gene transcripts preferentially expressed in human muscles revealed by quantitative hybridization of a high density cDNA array. Genome Res. 6, 492–503.
Putney, S.D., Herlihy, W.C., Schimmel, P., 1983. A new troponin T and cDNA clones for 13 different muscle proteins found by shotgun sequencing. Nature (Lond.) 302, 718–721.
Rowles, J., Scherer, S.W., Xi, T., Majer, M., Nickle, D.C., Rommens, J.M., Popov, K.M., Harris, R.A., Riebow, N.L., Xia, J., Tsui, L.-C., Bogardus, C., Prochazka, M., 1996. Cloning and characterization of PDK4 on 7q21.3 encoding a fourth pyruvate dehydrogenase kinase isoenzyme in human. J. Biol. Chem. 271, 22376–22382.
Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular Cloning, A Laboratory Manual. 2nd edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Schiaffino, S., Reggiani, C., 1996. Molecular diversity of myofibrillar proteins: gene regulation and functional significance. Physiol. Rev. 76, 371–423.
Smerdu, V., Karsch-Mizrachi, I., Campione, M., Leinwand, L., Schiaffino, S., 1994. Type IIx myosin heavy chain transcripts are expressed in type IIb fibers of human skeletal muscle. Am. J. Physiol. 267, C1723–C1728.
Takahashi, N., Ko, M.S.H., 1993. The 3′-end region of complementary DNAs as PCR-based polymorphic markers for an expression map of the mouse genome. Genomics 16, 161–168.
Tosser-Klopp, G., Benne, F., Bonnet, A., Mulsant, P., Gasser, F., Hatey, F., 1997. A first catalog of gene involved in pig ovarian follicular differentiation. Mamm. Genome 8, 250–254.
Trower, M.K., Elgar, G.S., 1994. PCR cloning using T-vectors, Methods in Molecular Biology, Harwood, A.J. (Ed.), Protocols for Gene Analysis Vol. 31. Humana Press, Totowa, NJ, pp. 19–33.
Tuggle, C.K., Schmitz, C.B., 1994. Cloning and characterization of pig muscle cDNAs by an expressed sequence tag approach. Anim. Biotech. 5, 1–13.
Uberbacher, E.C., Mural, R.J., 1991. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc. Natl. Acad. Sci. USA 88, 11261–11265.
Winterø, A.K., Fredholm, M., Davies, W., 1996. Evaluation and characterization of a porcine small intestine cDNA library: analysis of 839 clones. Mamm. Genome 7, 509–517.
Xia, H., Winokur, S.T., Kuo, W.L., Altherr, M.R., Bredt, D.S., 1997. Actinin-associated LIM protein: identification of a domain interaction between PDZ and spectrin-like repeats motifs. J. Cell Biol. 139, 507–515.
Yerle, M., Echard, G., Robic, A., Mairal, A., Dubut-Fontana, C., Riquet, J., Pinton, P., Milan, D., Lahbib-Mansais, Y., Gellin, J., 1996. A somatic cell hybrid panel for pig regional gene mapping characterized by molecular cytogenetics. Cytogenet. Cell Genet. 73, 194–202.
Zijlstra, C., Bosma, A.A., de Haan, N.A., Mellink, C., 1996. Construction of a cytogenetically characterized porcine somatic cell hybrid panel and its use as mapping tool. Mamm. Genome 7, 280–284.
Zijlstra, C., Davoli, R., Fontanesi, L., Zambonelli, P., Bosma, A.A., Russo, V., 1998. Isolation and localization of the skeletal myosin heavy chain 2X gene on pig Chromosome 12q1.4–q1.5. Mamm. Genome 9, 412–413.