Gene 261 (2000) 373–382
www.elsevier.com/locate/gene
Transcriptome analysis of channel catfish (Ictalurus punctatus): genes and expression profile from the brain

Zhenlin Ju, Attila Karsi, Arif Kocabas, Andrea Patterson, Ping Li, Dongfeng Cao, Rex Dunham, Zhanjiang Liu*

The Fish Molecular Genetics and Biotechnology Laboratory, 203 Swingle Hall, Department of Fisheries and Allied Aquacultures and Program of Cell and Molecular Biosciences, Auburn University, Auburn, AL 36849, USA

Received 24 July 2000; accepted 16 October 2000
Received by W.-H. Le
Abstract

Expressed sequence tag (EST) analysis was conducted using a complementary DNA (cDNA) library made from the brain mRNA of channel catfish (Ictalurus punctatus). As part of our transcriptome analysis in catfish to develop molecular reagents for comparative functional genomics, here we report analysis of 1201 brain cDNA clones. Of the 1201 clones, 595 clones (49.5%) were identified as known genes by BLAST searches and 606 clones (50.5%) as unknown genes. The 595 clones of known gene products represent transcripts of 251 genes. These known genes were categorized into 15 groups according to their biological functions. The largest group of known genes was the genes involved in translational machinery (21.4%) followed by mitochondrial genes (6.2%), structural genes (3.1%), genes homologous to sequences of unknown functions (2.3%), enzymes (2.7%), hormone and regulatory proteins (2.5%), genes involved in immune systems (2.1%), genes involved in sorting, transport, and metal metabolism (1.8%), transcriptional factors and DNA repair proteins (1.6%), proto-oncogenes (1.2%), lipid binding proteins (1.2%), stress-induced genes (0.7%), genes homologous to human genes involved in mental diseases (0.6%), and development or differentiation-related genes (0.3%). The number of genes represented by the 606 clones of unknown genes is not known at present, but the high percentage of clones showing no homology to any known genes in the GenBank databases may indicate that a great number of novel genes exist in teleost brain. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Expressed sequence tags; Functional genomics; Oncogene
Abbreviations: EST, expressed sequence tag; cDNA, complementary DNA; PCR, polymerase chain reaction; NCBI, National Center for Biotechnology Information; TIGR, The Institute for Genomic Research

* Corresponding author. Tel.: +1-334-844-4054; fax: +1-334-844-9208. E-mail address: [email protected] (Z. Liu).

0378-1119/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0378-1119(00)00491-1

1. Introduction

The identification of genes expressed in the cells of a tissue is a basic step toward understanding gene function and tissue physiology. An efficient approach to characterizing gene transcripts is to partially sequence cDNA clones from cDNA libraries, obtaining expressed sequence tags or ESTs (Adams et al., 1991). EST analysis not only identifies genes transcribed in specific tissues, but also reveals expression profiles of the tissue from which the cDNA library was made. With the advancement of sequencing technology, it is now possible to produce large numbers of ESTs representing a large proportion of the transcriptome, the overall transcriptional activity, of an organism. Characterization of large numbers of ESTs from various organisms makes it possible to assemble EST sequences into tentative consensus sequences or gene indexing databases such as UniGene (Boguski and Schuler, 1995), STACK (Burke et al., 1998), and the TIGR Gene Indices (Quackenbush et al., 2000). Such tentative consensus sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, and to provide links between orthologous and paralogous genes (Quackenbush et al., 2000).

The second major importance of ESTs lies in their application as molecular reagents for comparative functional genomics using cDNA microarray technology (Johnston, 1998). Additionally, polymorphic markers can be developed from ESTs (Liu et al., 1999). Upon sequencing analysis, ESTs can be catalogued according to tissue specificity (Hishiki et al., 2000), biochemical pathways (Mekhedov et al., 2000), or as a high-fidelity set of non-redundant transcripts (Boguski and Schuler, 1995). These can be used for more extensive functional annotation, and integrated with linkage and physical mapping information. Such EST categories can be arrayed to filters or chips for expression studies addressing gene regulation and expression in specific tissues, in specific metabolic pathways, or under specific environments. For instance, using cDNA microarrays with 5,766 clones, Wang et al. (1999) successfully identified 15 ovarian carcinoma-specific genes. Similarly, new heat shock genes and phorbol ester-regulated genes were identified in human T cells by microarray-based expression monitoring of 1,000 genes (Schena et al., 1996).

ESTs have provided a first glimpse of transcription profiles in a variety of organisms. Careful analyses of the sequence data have provided significant additional functional, structural, and evolutionary information (Quackenbush et al., 2000). Large numbers of ESTs have been produced from a number of species (Adams et al., 1991; Waterston et al., 1992; Franco et al., 1995; Azam et al., 1996). ESTs represent 71% of all GenBank entries and 40% of the individual nucleotides (Quackenbush et al., 2000). ESTs from teleosts account for only about 1% of the almost five million ESTs in the dbEST division of GenBank. The greatest effort so far has been made in zebrafish (Gong, 1999), flounder (Douglas et al., 1999), Japanese flounder (Inoue et al., 1997), medaka (Hirono and Aoki, 1997), and channel catfish. In channel catfish, we have initially analyzed several hundred cDNA clones from pituitary and muscle libraries (Karsi et al., 1998; Kim et al., 2000). Large-scale EST analysis is essential for adopting cDNA microarray technology for comparative functional genomics, particularly to address the complex nature of gene expression involved in the determination of performance traits such as feed conversion and behavioral traits. As part of the transcriptome analysis of catfish, we report analysis of 1201 clones from the channel catfish brain.
2. Materials and methods

2.1. Tissue preparation and RNA isolation

All experimental channel catfish were raised in troughs inside the hatchery of the Auburn University Fish Genetics Facility, under the same conditions, for 4 weeks before the tissues were harvested. Brain tissues were collected from both young (9–18 months old) and mature fish (4–5 years old) in January, April, July, and October in order to include all transcripts, particularly those that may be developmentally or seasonally regulated. Brain tissues were kept frozen in liquid nitrogen until RNA extraction. Tissues were first ground with a mortar and pestle and then homogenized in RNA extraction buffer with a hand-held Tissue Tearor (Model 985-370, Biospec Products, Inc., WI) following the guanidinium thiocyanate method (Chomczynski and Sacchi, 1987). mRNA was purified from total cellular RNA using the Poly(A)+ Pure kit (Ambion, Cat. #1915) according to the manufacturer's instructions, except that two rounds of purification were performed.

2.2. cDNA library construction

Directional cDNA libraries were constructed using the pSPORT-1 SuperScript Plasmid Cloning System (GIBCO/BRL). Three micrograms of poly(A)+ RNA were used. The detailed protocols for construction of the cDNA library followed GIBCO/BRL's instructions, except that the library was electroporated into ElectroMax DH12S cells, which are highly adapted to efficient electroporation and efficient production of single-stranded phagemids (Life Technologies), a feature advantageous to the development of normalized cDNA libraries. The quality of the brain cDNA library was determined by the number of primary recombinants and the average size of inserts. The primary cDNA library was amplified once before colonies were picked for sequencing analysis.

2.3. Plasmid preparation and sequencing analysis

The plasmid cDNA library was plated at an appropriate density for picking individual colonies. Random clones were picked and grown overnight in 2-ml liquid cultures in LB medium in 12 × 75-mm culture tubes for plasmid preparation. Plasmid DNA was prepared by the alkaline lysis method (Sambrook et al., 1989) using the Qiagen Spin Column Mini-plasmid kits. Mini-preparation plasmid DNA (3 µl, about 600–1000 ng) was used for all sequencing reactions. Sequencing was conducted using the chain termination method. Sequencing reactions were performed in a thermocycler using cycleSeq-farOUT polymerase (Display Systems Biotech, Vista, CA). The cycling profile was 95°C for 30 s, 55°C for 40 s, and 72°C for 45 s, for 30 cycles. An initial 2-min denaturation at 96°C and a final 5-min extension at 72°C were always used. All sequences were analyzed on an automatic LI-COR DNA Sequencer Long ReadIR 4200 or LI-COR DNA Analyzer Gene ReadIR 4200. Vector sequences were removed before searching for homologies using BLAST through the Internet (NCBI, Bethesda, MD).
Matches were considered to be significant only when the probability (P) was less than 0.001 and scores were greater than 80 for BLASTN with all parameters at default. Some sequences matched with high confidence (low P values) but had low scores because of the gap penalty at the default parameters. These sequences were included only when more than one stretch of the clone was homologous to the orthologous sequence in GenBank.

3. Results and discussion

A total of 1201 random cDNA clones were sequenced from a channel catfish brain cDNA library. The library had 2.5 × 10^6 primary recombinant clones and was constructed using mRNAs isolated from brains of various developmental stages harvested in all four seasons. Therefore, this library should be a resource for further EST development using normalized libraries.

3.1. Known genes and novel genes from the catfish brain expressed sequence tags

Sequences were analyzed by BLAST searches through the Internet using NCBI databases. Of the 1201 clones sequenced, 595 clones (49.5%) were known genes and 606 clones (50.5%) were unknown genes. Considering that large numbers of genes have been sequenced from the human brain, this suggests that many of the novel genes may be quite specific to teleost fish brains. The 595 clones of known genes represent transcripts of 251 genes. Multiple clones were sequenced for 94 genes, ranging from 2 to 54 clones per gene (Table 1). These 94 genes accounted for 439 of the 1201 sequenced clones. The number of unknown genes represented by the 606 clones is not known at present.

3.2. Gene expression profile of the channel catfish brain

The sequenced ESTs were grouped into 15 categories (Table 1): (1) genes involved in the protein translational machinery such as ribosomal proteins, translational factors, or tRNA genes; (2) cellular structural genes such as actins, tubulins, keratins, and histones; (3) enzymes; (4) transcriptional factors, DNA binding proteins, and DNA repair proteins; (5) genes involved in the immune system; (6) metal binding proteins, ionic channels, and genes involved in protein sorting and transportation; (7) proto-oncogenes, tumor repressors, and tumor-related proteins; (8) hormones, receptors, and regulatory proteins; (9) developmental genes such as clock genes and genes involved in tissue or organ differentiation; (10) stress-induced genes such as heat shock proteins and cold acclimation proteins; (11) genes in lipid metabolism; (12) genes homologous to human mental disease-related genes; (13) genes homologous to known sequences of unknown functions; (14) mitochondrial genes; and (15) other genes. The grouping of the ESTs was arbitrary: many genes may have multiple functions, and each was assigned according to its major function. For instance, the epsilon-sarcoglycan was grouped under structural genes because it mediates membrane-matrix interactions, but sarcoglycans are also transmembrane components of the dystrophin-glycoprotein complex and, therefore, are involved in human limb-girdle muscular dystrophy (Ettinger et al., 1997).
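The match criteria of Section 2.3 and the known/unknown split of Section 3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BlastHit` record, its field names, and the example hits are hypothetical and do not come from the study; only the thresholds (P < 0.001, score > 80), the multi-stretch exception, and the clone counts are taken from the text.

```python
from dataclasses import dataclass

P_MAX = 0.001        # probability threshold stated in Section 2.3
MIN_SCORE = 80       # BLASTN score threshold stated in Section 2.3
TOTAL_CLONES = 1201  # clones sequenced (Section 3)


@dataclass
class BlastHit:
    """Best BLASTN hit for one clone (hypothetical record layout)."""
    clone_id: str
    p_value: float
    score: float
    n_segments: int = 1  # separate homologous stretches within the clone


def is_significant(hit: BlastHit) -> bool:
    """Apply the stated criteria, including the low-score exception."""
    if hit.p_value >= P_MAX:
        return False
    if hit.score > MIN_SCORE:
        return True
    # Low score despite a low P value (gap penalties): accept only when
    # more than one homologous stretch was found in the clone.
    return hit.n_segments > 1


def percent(count: int, total: int = TOTAL_CLONES) -> float:
    """Share of the sequenced pool, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Made-up hits, for illustration only.
hits = [
    BlastHit("clone_A", 1e-72, 250.0),
    BlastHit("clone_B", 1e-6, 60.0, n_segments=2),
    BlastHit("clone_C", 1e-6, 60.0, n_segments=1),
    BlastHit("clone_D", 0.01, 120.0),
]
known = [h.clone_id for h in hits if is_significant(h)]
print(known)                       # ['clone_A', 'clone_B']
print(percent(595), percent(606))  # 49.5 50.5
```

With the reported counts, `percent` reproduces the 49.5% known and 50.5% unknown figures given in Section 3.1.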
Table 1
Brain genes and expression profiles as revealed by EST analysis

| Clone No.^a | Acc. No.^b | Identity^c | P^d | F^e |
|---|---|---|---|---|
| 1. Genes involved in the protein translation machinery | | | | |
| IpBrn01236 | BE212689 | Acidic ribosomal phosphoprotein P0 | 0 | 1 |
| IpBrn01743 | BE212698 | Ribosomal protein L5b | 1 × 10^-72 | 6 |
| IpBrn01604 | BE212945 | Ribosomal protein L6 | 9 × 10^-14 | 1 |
| IpBrn01307 | BE212946 | Ribosomal protein L7 | 1 × 10^-09 | 1 |
| IpBrn00367 | BE212445 | Ribosomal protein L7a | 6 × 10^-46 | 4 |
| IpBrn00552 | BE212821 | Ribosomal protein L9 | 3 × 10^-40 | 1 |
| IpBrn00076 | BE212446 | Ribosomal protein L10A | 1 × 10^-109 | 3 |
| IpBrn00847 | BE212768 | Ribosomal protein L11 | 1 × 10^-101 | 4 |
| IpBrn00591 | BE212447 | Ribosomal protein L12 | 3 × 10^-38 | 3 |
| IpBrn01483 | BE212947 | Ribosomal protein L14 | 3 × 10^-15 | 1 |
| IpBrn00067 | BE212448 | Ribosomal protein L15 | 1 × 10^-64 | 3 |
| IpBrn00519 | BE212822 | Ribosomal protein L18 | 6 × 10^-57 | 3 |
| IpBrn02139 | BE213183 | Ribosomal protein L18a | 1 × 10^-112 | 1 |
| IpBrn02040 | BE213166 | Ribosomal protein L19 | 9 × 10^-20 | 2 |
| IpBrn00053 | BE212449 | Ribosomal protein L21 | 6 × 10^-26 | 7 |
| IpBrn00003 | BE212441 | Ribosomal protein L22 | 4 × 10^-79 | 7 |
| IpBrn00891 | BE212777 | Ribosomal protein L23A | 4 × 10^-58 | 2 |
| IpBrn00804 | BE212759 | Ribosomal protein L24 | 2 × 10^-13 | 10 |
| IpBrn02138 | BE213182 | Ribosomal protein L27 | 1 × 10^-82 | 1 |
| IpBrn01321 | BE212948 | Ribosomal protein L28 | 2 × 10^-04 | 1 |
| IpBrn01531 | BE212949 | Ribosomal protein L29 | 2 × 10^-32 | 1 |
| IpBrn00606 | BE212554 | Ribosomal protein L30 | 5 × 10^-23 | 3 |
| IpBrn01445 | BE212950 | Ribosomal protein L31 | 4 × 10^-54 | 3 |
| IpBrn00631 | BE212555 | Ribosomal protein L32 | 1 × 10^-45 | 6 |
| IpBrn00337 | BE212556 | Ribosomal protein L35 | 5 × 10^-26 | 8 |
| IpBrn01597 | BE212951 | Ribosomal protein L35a | 1 × 10^-10 | 5 |
| IpBrn00505 | BE212823 | Ribosomal protein L36 | 5 × 10^-32 | 3 |
| IpBrn01500 | BE212952 | Ribosomal protein L36a | 3 × 10^-58 | 1 |
| IpBrn01533 | BE212953 | Ribosomal protein L37 | 2 × 10^-49 | 2 |
| IpBrn00619 | BE216904 | Ribosomal protein L37a | 6 × 10^-69 | 5 |
Table 1 (continued)

| Clone No.^a | Acc. No.^b | Identity^c | P^d | F^e |
|---|---|---|---|---|
| IpBrn01647 | BE212954 | Ribosomal protein L38 | 1 × 10^-38 | 5 |
| IpBrn00061 | BE212557 | Ribosomal protein L39 | 3 × 10^-14 | 4 |
| IpBrn01029 | BE212658 | Ribosomal protein L41 | 3 × 10^-30 | 18 |
| IpBrn02056 | BE213170 | Ribosomal protein S2 | 1 × 10^-148 | 3 |
| IpBrn01440 | BE212955 | Ribosomal protein S3 | 1 × 10^-102 | 3 |
| IpBrn00620 | BE212558 | Ribosomal protein S4 | 1 × 10^-49 | 2 |
| IpBrn00538 | BE212824 | Ribosomal protein S7 | 7 × 10^-14 | 1 |
| IpBrn02081 | BE213173 | Ribosomal protein S8 | 6 × 10^-67 | 3 |
| IpBrn00041 | BE212559 | Ribosomal protein S9 | 4 × 10^-65 | 4 |
| IpBrn00553 | BE212825 | Ribosomal protein S10 | 3 × 10^-40 | 2 |
| IpBrn00544 | BE212826 | Ribosomal protein S11 | 1 × 10^-105 | 2 |
| IpBrn00087 | BE212560 | Ribosomal protein S12 | 3 × 10^-70 | 2 |
| IpBrn01619 | BE212956 | Ribosomal protein S13 | 0.0 | 2 |
| IpBrn01371 | BE212957 | Ribosomal protein S14 | 2 × 10^-87 | 1 |
| IpBrn00220 | BE212561 | Ribosomal protein S15 | 3 × 10^-92 | 3 |
| IpBrn00086 | BE212562 | Ribosomal protein S15A | 2 × 10^-87 | 1 |
| IpBrn01536 | BE212958 | Ribosomal protein S16 | 2 × 10^-47 | 3 |
| IpBrn01608 | BE212959 | Ribosomal protein S17 | 1 × 10^-106 | 1 |
| IpBrn00122 | BE212563 | Ribosomal protein S18/Ke3 | 6 × 10^-60 | 3 |
| IpBrn01136 | BE212668 | Ribosomal protein S19 | 1 × 10^-27 | 2 |
| IpBrn00115 | BE212564 | Ribosomal protein S20 | 4 × 10^-36 | 3 |
| IpBrn00856 | BE212770 | Ribosomal protein S21 | 1 × 10^-29 | 3 |
| IpBrn00886 | BE212776 | Ribosomal protein S22 | 2 × 10^-44 | 3 |
| IpBrn00027 | BE212565 | Ribosomal protein S23 | 6 × 10^-45 | 2 |
| IpBrn00300 | BE212566 | Ribosomal protein S24 | 1 × 10^-36 | 4 |
| IpBrn01594 | BE212960 | Ribosomal protein S27 | 3 × 10^-28 | 9 |
| IpBrn02110 | BE213177 | Ribosomal protein S28 | 1 × 10^-19 | 1 |
| IpBrn01361 | BE212961 | Ribosomal protein S29 | 1 × 10^-26 | 1 |
| IpBrn01154 | BE212674 | Translation initiation factor 1A | 1 × 10^-30 | 1 |
| IpBrn01599 | BE212962 | Translation initiation factor 5 | 1 × 10^-49 | 1 |
| IpBrn01493 | BE212963 | Translation elongation factor 1 alpha | 2 × 10^-30 | 1 |
| IpBrn01545 | BE212964 | Translation elongation factor 2 | 1 × 10^-35 | 1 |
| IpBrn01626 | BE212965 | Translation factor sui1 | 8 × 10^-37 | 1 |
| IpBrn01304 | BE212966 | tRNA-Thr | 4 × 10^-31 | 3 |
| IpBrn00066 | BE212567 | tRNA-Val | 0 | 54 |
| 2. Cellular structural genes | | | | |
| IpBrn00218 | BE212568 | Beta actin | 0.0 | 4 |
| IpBrn00308 | BE212569 | Alpha tubulin, brain-specific | 6 × 10^-70 | 1 |
| IpBrn01194 | BE212682 | Beta-4 tubulin | 1 × 10^-174 | 1 |
| IpBrn00853 | BE212769 | Beta-5 tubulin | 0 | 3 |
| IpBrn00605 | BE212570 | Beta-2 microglobulin precursor | 1 × 10^-110 | 3 |
| IpBrn00820 | BE212763 | L-plastin | 4 × 10^-28 | 2 |
| IpBrn01001 | BE212653 | Vimentin | 6 × 10^-05 | 1 |
| IpBrn01382 | BE212967 | K8 simple type II keratin | 1 × 10^-49 | 3 |
| IpBrn00647 | BE212571 | Synapse protein (SNAP-25) | 0.0 | 2 |
| IpBrn02018 | BE213163 | Synaptosomal-associated protein (snapin) | 1 × 10^-06 | 1 |
| IpBrn01615 | BE212968 | Connexin 43 | 8 × 10^-08 | 1 |
| IpBrn00347 | BE212572 | Coronin, actin-binding protein pp66 | 1 × 10^-06 | 1 |
| IpBrn00302 | BE212573 | Vascular adhesion protein 1 | 1 × 10^-09 | 1 |
| IpBrn01627 | BE212969 | Focal adhesion molecule | 1 × 10^-17 | 1 |
| IpBrn02119 | BE213179 | Histone H2A and H3 genes | 4 × 10^-28 | 1 |
| IpBrn00542 | BE212626 | Non-histone chromosome protein 2 | 1 × 10^-18 | 1 |
| IpBrn00038 | BE212574 | Intersectin-EH binding protein | 2 × 10^-08 | 2 |
| IpBrn00011 | BE212575 | Telomerase-associated protein 1 | 3 × 10^-7 | 2 |
| IpBrn00821 | BE212764 | SCG10 protein (Superior Cervical Ganglion) | 2 × 10^-08 | 1 |
| IpBrn01639 | BE212970 | MLL septin-like fusion protein | 7 × 10^-41 | 2 |
| IpBrn00842 | BE212766 | Claudin (tight junction protein) | 2 × 10^-05 | 1 |
| IpBrn00118 | BE212576 | Proteolipid protein DM beta | 2 × 10^-10 | 1 |
| IpBrn01311 | BE212971 | Epsilon-sarcoglycan | 2 × 10^-08 | 1 |
| 3. Enzymes | | | | |
| IpBrn00136 | BE212577 | F1-ATPase gamma subunit | 9 × 10^-59 | 1 |
| IpBrn00014 | BE212578 | Protease | 8 × 10^-63 | 1 |
| IpBrn01038 | BE212661 | Na(+)/K(+) ATPase alpha subunit | 1 × 10^-144 | 1 |
Table 1 (continued)

| Clone No.^a | Acc. No.^b | Identity^c | P^d | F^e |
|---|---|---|---|---|
| IpBrn01621 | BE212972 | RNA helicase | 7 × 10^-16 | 1 |
| IpBrn00059 | BE212579 | Inorganic pyrophosphatase | 2 × 10^-41 | 1 |
| IpBrn00095 | BE212580 | Malate dehydrogenase | 3 × 10^-81 | 1 |
| IpBrn00064 | BE212581 | Nerve regeneration induced 2',3'-cyclic nucleotide 3'-phosphodiesterase | 7 × 10^-26 | 1 |
| IpBrn01405 | BE212973 | Non-selenium glutathione phospholipid hydroperoxide peroxidase (phgpx gene) | 1 × 10^-07 | 1 |
| IpBrn01221 | BE212686 | Peptidylprolyl isomerase D (nuclear gene encoding mitochondrial cyclophilin D), stress responsive | 2 × 10^-39 | 1 |
| IpBrn00148 | BE212582 | Peptidylprolyl isomerase F (cyclophilin F), stress responsive | 5 × 10^-21 | 1 |
| IpBrn00873 | BE212773 | Aspartyl-tRNA synthetase | 6 × 10^-05 | 2 |
| IpBrn01469 | BE212974 | ATP synthase | 2 × 10^-54 | 1 |
| IpBrn00352 | BE212583 | Glutathione S-transferase | 8 × 10^-05 | 1 |
| IpBrn01137 | BE212669 | Mitogen-activated protein kinase kinase (c-MKK) | 1 × 10^-144 | 1 |
| IpBrn00293 | BE212584 | DNA polymerase delta catalytic subunit | 7 × 10^-04 | 1 |
| IpBrn00060 | BE212585 | Glucose phosphate isomerase | 1 × 10^-17 | 1 |
| IpBrn01149 | BE212673 | 6-phosphofructo-2-kinase/fructose 2,6-bisphosphatase | 9 × 10^-14 | 1 |
| IpBrn01320 | BE212975 | Adenine nucleotide translocase (Ant1) | 8 × 10^-16 | 1 |
| IpBrn00508 | BE212627 | Aldolase C | 1 × 10^-98 | 1 |
| IpBrn01495 | BE212976 | Creatine kinase | 4 × 10^-33 | 2 |
| IpBrn01368 | BE212977 | Hypoxanthine-guanine phosphoribosyl transferase | 3 × 10^-04 | 1 |
| IpBrn01401 | BE212978 | Sulfotransferase-related (Sultx3) protein | 0.004 | 1 |
| IpBrn00365 | BE212586 | Calpain (calcium-dependent protease) II regulatory subunit | 3 × 10^-07 | 1 |
| IpBrn01377 | BE212979 | Calpain small subunit | 8 × 10^-13 | 1 |
| IpBrn01245 | BE212690 | Phosphatase 2A catalytic subunit, isotype alpha | 1 × 10^-105 | 1 |
| IpBrn01232 | BE212687 | Protein-tyrosine-phosphatase IF1 | 9 × 10^-14 | 2 |
| IpBrn01272 | BE212694 | Protein tyrosine phosphatase, receptor-type, N | 4 × 10^-81 | 1 |
| IpBrn01719 | BE212697 | L-isoaspartate (D-aspartate) O-methyltransferase (PCMT) | 1 × 10^-83 | 1 |
| IpBrn00146 | BE212587 | Type-1 protein phosphatase catalytic subunit alpha isoenzyme | 1 × 10^-74 | 1 |
| 4. Transcriptional factors, DNA repair and DNA-binding proteins | | | | |
| IpBrn00141 | BE212588 | Arginine/serine-rich splicing factor | 4 × 10^-33 | 1 |
| IpBrn01474 | BE212980 | c-fos transcription factor | 7 × 10^-11 | 1 |
| IpBrn01333 | BE212981 | Splicing factor similar to S. cerevisiae Prp18 | 1 × 10^-08 | 1 |
| IpBrn00194 | BE212589 | Zinc finger protein 231 | 1 × 10^-3 | 1 |
| IpBrn00625 | BE212590 | Transcription elongation factor B | 3 × 10^-41 | 2 |
| IpBrn01080 | BE212666 | HER-4 DNA binding protein | 3 × 10^-28 | 2 |
| IpBrn01143 | BE212672 | Transcription elongation factor B (SIII) | 2 × 10^-41 | 2 |
| IpBrn01156 | BE212675 | Transcription factor IIA | 2 × 10^-30 | 1 |
| IpBrn01638 | BE212982 | Poly(A)-binding protein, cytoplasmic 4 (inducible form) | 8 × 10^-47 | 1 |
| IpBrn01186 | BE212680 | U6 snRNA-associated Sm-like protein (LSM5) | 9 × 10^-04 | 1 |
| IpBrn00636 | BE212591 | Mouse 38 kDa Mov34 homolog | 4 × 10^-65 | 1 |
| IpBrn00554 | BE212628 | Interleukin enhancer binding factor 1 | 3 × 10^-49 | 1 |
| IpBrn00189 | BE212592 | Y-box protein | 2 × 10^-97 | 2 |
| IpBrn01341 | BE212983 | Human dead box protein, X isoform (DBX) | 1 × 10^-21 | 1 |
| IpBrn01330 | BE212984 | RAD23b homolog | 4 × 10^-18 | 1 |
| 5. Genes involved in immune systems | | | | |
| IpBrn00018 | BE212593 | Immunoglobulin gamma heavy-chain | 0.0 | 13 |
| IpBrn01328 | BE212985 | Catfish immunoglobulin heavy chain joining region | 4 × 10^-09 | 1 |
| IpBrn00109 | BE212594 | MHC class I alpha chain | 1 × 10^-132 | 1 |
| IpBrn00629 | BE212595 | MHC class II antigen | 0.0 | 2 |
| IpBrn02043 | BE213168 | Danio rerio invariant chain-like protein 2 | 2 × 10^-05 | 1 |
| IpBrn01057 | BE212663 | 14-3-3 (immunodetecting) protein beta-1 | 2 × 10^-51 | 1 |
| IpBrn02068 | BE213171 | Brain-specific 14-3-3 protein beta-2 | 1 × 10^-18 | 1 |
| IpBrn00600 | BE212596 | Transplantation antigen | 1 × 10^-43 | 3 |
| IpBrn00865 | BE212771 | Nuclear autoantigen GS2NA | 1 × 10^-25 | 1 |
| IpBrn02106 | BE213175 | Zebrafish Dare-DAXX DAXX protein | 2 × 10^-11 | 1 |
| 6. Ionic channels, metal metabolism, sorting proteins, and transporters | | | | |
| IpBrn00610 | BE212597 | Ferritin heavy subunit | 1 × 10^-24 | 6 |
| IpBrn00598 | BE212598 | Ferritin mid subunit | 5 × 10^-16 | 1 |
| IpBrn00845 | BE212767 | Metallothionein | 1 × 10^-129 | 1 |
| IpBrn01163 | BE212676 | ATPase, Na+/K+ transporting, alpha 3 | 4 × 10^-65 | 1 |
Table 1 (continued)

| Clone No.^a | Acc. No.^b | Identity^c | P^d | F^e |
|---|---|---|---|---|
| IpBrn00121 | BE216903 | Calmodulin | 2 × 10^-93 | 3 |
| IpBrn00325 | BE212599 | Voltage-dependent calcium channel, gamma subunit 2 | 8 × 10^-11 | 2 |
| IpBrn00169 | BE212600 | High affinity glutamate transporter | 4 × 10^-15 | 1 |
| IpBrn00507 | BE212629 | Glutamate/aspartate transporter protein | 8 × 10^-7 | 1 |
| IpBrn01139 | BE212670 | Voltage-gated sodium channel alpha subunit | 1 × 10^-33 | 1 |
| IpBrn01504 | BE212986 | Cytosolic sorting protein PACS-1a | 3 × 10^-08 | 1 |
| IpBrn01553 | BE212987 | Iron-sulfur protein subunit | 5 × 10^-07 | 1 |
| IpBrn00319 | BE212601 | 15 kD selenoprotein | 1 × 10^-15 | 1 |
| IpBrn01387 | BE212988 | S100-like calcium binding protein | 2 × 10^-07 | 1 |
| IpBrn01165 | BE212677 | Sec61 (protein transport protein) gamma | 4 × 10^-24 | 1 |
| 7. Proto-oncogenes, tumor-related proteins, tumor suppressors | | | | |
| IpBrn00048 | BE212602 | Proto-oncogene BMI-1 | 2 × 10^-26 | 1 |
| IpBrn02125 | BE213180 | FGFR1 oncogene partner (FOP) | 2 × 10^-05 | 1 |
| IpBrn01636 | BE212989 | Malignancy-related C140 product | 1 × 10^-05 | 1 |
| IpBrn00817 | BE212762 | Malignant cell expression-enhanced gene/tumor progression-enhanced gene | 1 × 10^-10 | 1 |
| IpBrn01450 | BE212990 | Human hepatocellular carcinoma associated ring finger protein | 4 × 10^-48 | 1 |
| IpBrn00589 | BE212603 | Leukemia-associated phosphoprotein p18 | 4 × 10^-23 | 1 |
| IpBrn02069 | BE213172 | Wilm's tumor-related protein (QM) | 3 × 10^-97 | 1 |
| IpBrn02042 | BE213167 | RAB7, member RAS oncogene family | 5 × 10^-31 | 2 |
| IpBrn01069 | BE212665 | Mouse Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed (fox derived) protein | 5 × 10^-33 | 1 |
| IpBrn01461 | BE212991 | Mouse mago-nashi homolog, proliferation-associated protein | 3 × 10^-53 | 1 |
| IpBrn01013 | BE212655 | von Hippel-Lindau binding protein 1 | 0.004 | 1 |
| IpBrn02045 | BE213169 | Deleted in polyposis | 2 × 10^-23 | 1 |
| IpBrn01197 | BE212683 | Laminin receptor 1 (67 kD, ribosomal protein SA) | 5 × 10^-83 | 1 |
| IpBrn01140 | BE212671 | Ras like GTPase | 3 × 10^-16 | 1 |
| 8. Hormones, receptors, and regulatory proteins | | | | |
| IpBrn02036 | BE213165 | Isotocin | 1 × 10^-12 | 1 |
| IpBrn01502 | BE212992 | Estrogen receptor type beta | 5 × 10^-07 | 2 |
| IpBrn01601 | BE212993 | Thymosin beta a | 1 × 10^-05 | 1 |
| IpBrn02116 | BE213178 | Thymosin beta-10 | 6 × 10^-05 | 2 |
| IpBrn02012 | BE213161 | Prostaglandin E receptor EP3 | 2 × 10^-08 | 1 |
| IpBrn01028 | BE212657 | Ubiquitin | 1 × 10^-112 | 6 |
| IpBrn00340 | BE212604 | Leucine-rich repeat-containing F-box protein | 2 × 10^-08 | 1 |
| IpBrn00812 | BE212760 | Cholecystokinin | 2 × 10^-18 | 1 |
| IpBrn01410 | BE212994 | Pancreatic somatostatin-14 | 0.0 | 2 |
| IpBrn00191 | BE212605 | High affinity IgE receptor gamma subunit | 5 × 10^-05 | 2 |
| IpBrn00602 | BE212606 | Receptor for activated protein kinase C | 6 × 10^-45 | 1 |
| IpBrn00043 | BE212607 | Thyroid hormone receptor-associated protein | 4 × 10^-15 | 1 |
| IpBrn01425 | BE212995 | Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, theta polypeptide | 2 × 10^-43 | 3 |
| IpBrn01635 | BE212996 | Protein inhibitor of neuronal nitric oxide synthase | 8 × 10^-47 | 2 |
| IpBrn01207 | BE212684 | RaP2 (G-protein) interacting protein 8 | 1 × 10^-43 | 2 |
| IpBrn01712 | BE212696 | Activin B | 1 × 10^-114 | 1 |
| IpBrn01624 | BE212997 | Ionotropic glutamate receptor subunit 3 alpha precursor | 1 × 10^-111 | 1 |
| 9. Development and differentiation-related proteins | | | | |
| IpBrn01407 | BE212998 | Bithoraxoid-like protein | 1 × 10^-06 | 1 |
| IpBrn00342 | BE212608 | Human deleted in split-hand/split-foot 1 region | 1 × 10^-11 | 1 |
| IpBrn00835 | BE212765 | Clock gene BMAL-1 | 3 × 10^-44 | 1 |
| IpBrn01035 | BE212659 | Uncharacterized hematopoietic stem/progenitor cells protein MDS027 | 8 × 10^-11 | 1 |
| 10. Stress induced proteins | | | | |
| IpBrn02128 | BE213181 | Mouse mSTI1 (stress-inducible protein) | 1 × 10^-3 | 1 |
| IpBrn00166 | BE212609 | Heat shock protein 70 | 1 × 10^-80 | 1 |
| IpBrn02099 | BE213174 | Heat shock protein 90 | 4 × 10^-11 | 1 |
| IpBrn00221 | BE212610 | Ependymin (cold acclimation-related) | 2 × 10^-60 | 6 |
| 11. Genes involved in lipid metabolism | | | | |
| IpBrn01374 | BE212999 | Fatty acid binding protein | 1 × 10^-24 | 7 |
| IpBrn01061 | BE212664 | Fatty acid binding protein 7 | 5 × 10^-05 | 3 |
| IpBrn00349 | BE212611 | Fatty acid binding protein H-FABP | 9 × 10^-16 | 3 |
| IpBrn01171 | BE212678 | Apolipoprotein E | 3 × 10^-13 | 2 |
| 12. Genes homologous to human mental disease related genes | | | | |
Table 1 (continued)

| Clone No.^a | Acc. No.^b | Identity^c | P^d | F^e |
|---|---|---|---|---|
| IpBrn01190 | BE212681 | Huntingtin | 5 × 10^-15 | 1 |
| IpBrn02108 | BE213176 | Human chromosome 1 atrophin-1 related protein | 3 × 10^-19 | 2 |
| IpBrn01392 | BE213000 | Human CGI-108 protein | 4 × 10^-21 | 1 |
| IpBrn01338 | BE213001 | Neuroendocrine specific protein (NSP) | 1 × 10^-08 | 1 |
| IpBrn01644 | BE213002 | CNS myelin P0-like glycoprotein | 4 × 10^-18 | 1 |
| IpBrn01010 | BE212654 | Small EDRK-rich factor 2 | 2 × 10^-11 | 1 |
| 13. Genes homologous to sequences of unknown functions | | | | |
| IpBrn01513 | BE213003 | Human cDNA FLJ10866 fis | 1 × 10^-20 | 1 |
| IpBrn00607 | BE212612 | Human cDNA FLJ20279 fis, clone HEP03229 | 3 × 10^-10 | 1 |
| IpBrn01574 | BE213004 | Human cDNA FLJ10640 fis, clone NT2RP2005723 | 2 × 10^-06 | 1 |
| IpBrn00310 | BE212613 | Chromosome 19 cosmid R33743 | 2 × 10^-04 | 1 |
| IpBrn01018 | BE212656 | Clarias batrachus clone Cba06 | 3 × 10^-07 | 1 |
| IpBrn01045 | BE212662 | Drosophila melanogaster genomic scaffold | 1 × 10^-06 | 2 |
| IpBrn01584 | BE213005 | C. elegans cosmid ZK652 | 2 × 10^-06 | 1 |
| IpBrn01524 | BE213006 | Human 2p16 PAC RPCI5-960D23 | 9 × 10^-14 | 1 |
| IpBrn01564 | BE213007 | Human 30 kDa protein expressed in adrenal gland | 8 × 10^-16 | 1 |
| IpBrn00206 | BE212614 | Human cDNA DKFZp434B055 | 1 × 10^-24 | 1 |
| IpBrn00015 | BE212615 | Human DKFZP566B023 protein | 7 × 10^-28 | 1 |
| IpBrn00391 | BE216905 | Human chromosome 16, BAC clone 26O3 | 9 × 10^-17 | 1 |
| IpBrn00608 | BE212616 | Human clone 25228 | 8 × 10^-14 | 1 |
| IpBrn00183 | BE212617 | Human clone 316G12 | 1 × 10^-6 | 1 |
| IpBrn00815 | BE212761 | Human clone NH0395G17 | 2 × 10^-08 | 1 |
| IpBrn00878 | BE212774 | Human clone RP4-635O5 | 2 × 10^-14 | 1 |
| IpBrn00133 | BE212618 | Human CTG4a | 2 × 10^-14 | 1 |
| IpBrn01181 | BE212679 | Human DNA sequence from PAC 257A7 on chromosome 6p24 | 2 × 10^-08 | 1 |
| IpBrn01350 | BE213008 | Human HepG2 3' region MboI cDNA, clone hmd4h12m3 | 2 × 10^-08 | 1 |
| IpBrn01675 | BE213009 | Human hypothetical protein (BM-002) | 3 × 10^-28 | 1 |
| IpBrn02001 | BE213159 | Human KIAA0806 gene | 6 × 10^-05 | 1 |
| IpBrn00640 | BE212619 | Human KIAA1033 protein | 2 × 10^-20 | 1 |
| IpBrn00884 | BE212775 | Human KIAA1097 protein | 1 × 10^-15 | 1 |
| IpBrn02015 | BE213162 | Human KIAA1488 protein | 1 × 10^-06 | 1 |
| IpBrn01260 | BE212691 | Zebrafish LINE DNA | 6 × 10^-21 | 1 |
| IpBrn01262 | BE212692 | Human homologue of yeast-44.2 protein | 4 × 10^-16 | 1 |
| IpBrn00509 | BE212630 | Human Pac 817k2 chromosome X | 3 × 10^-28 | 1 |
| 14. Mitochondrial genes | | | | |
| IpBrn00633 | BE212620 | ATPase subunit 8 and subunit 6 | 8 × 10^-46 | 4 |
| IpBrn01263 | BE212693 | Aldehyde dehydrogenase 2 (Aldh2) | 8 × 10^-30 | 1 |
| IpBrn01344 | BE213010 | 18S small subunit ribosomal RNA gene | 1 × 10^-158 | 1 |
| IpBrn02020 | BE213164 | 28S ribosomal RNA | 1 × 10^-173 | 2 |
| IpBrn00301 | BE212621 | 12S rRNA gene | 3 × 10^-95 | 7 |
| IpBrn02010 | BE213160 | 5.8S ribosomal RNA gene and internal transcribed spacer | 1 × 10^-179 | 1 |
| IpBrn00173 | BE212622 | Cytochrome P450 aromatase-like | 2 × 10^-11 | 1 |
| IpBrn00174 | BE212623 | Cytochrome b | 8 × 10^-94 | 8 |
| IpBrn00108 | BE212624 | Cytochrome c oxidase I | 1 × 10^-98 | 22 |
| IpBrn01355 | BE213011 | Cytochrome c oxidase II | 7 × 10^-23 | 3 |
| IpBrn01037 | BE212660 | Cytochrome c oxidase III | 2 × 10^-34 | 10 |
| IpBrn01709 | BE212695 | NADH dehydrogenase (ubiquinone) 1 beta subcomplex | 5 × 10^-18 | 1 |
| IpBrn01212 | BE212685 | NADH dehydrogenase subunit 1 | 3 × 10^-44 | 1 |
| IpBrn01234 | BE212688 | NADH dehydrogenase subunit 2 gene | 6 × 10^-39 | 5 |
| IpBrn01411 | BE213012 | NADH dehydrogenase subunit 5 and subunit 6 genes | 3 × 10^-41 | 4 |
| IpBrn01135 | BE212667 | NADH dehydrogenase subunit 3 | 5 × 10^-29 | 3 |
| 15. Other genes | | | | |
| IpBrn00195 | BE212625 | Alpha globin | 1 × 10^-05 | 14 |
| IpBrn00389 | BE212450 | Beta-globin | 1 × 10^-120 | 7 |

a Clone numbers.
b Accession numbers.
c Gene identity as determined by BLAST searches.
d Probability.
e Frequency of the clones in the sequenced pool.
The expression profile as revealed by the EST analysis is summarized in Table 2. Among the known genes, genes involved in protein translation were the largest group, accounting for 21.4% of expression in the brain, followed by mitochondrial genes (6.2%), structural genes (3.1%), enzymes (2.7%), hormones and regulatory proteins (2.5%), and immune-related proteins (2.1%). Nineteen clones were transcriptional factors or genes involved in DNA binding or DNA repair (1.6%). Genes involved in transportation and translocation of small molecules accounted for 1.8% of expression. Many genes of this category were voltage-gated ionic channels, metal binding proteins such as metallothionein and calmodulin, and amino acid transporters. Fourteen genes (1.2%) were proto-oncogenes, tumor suppressors, and tumor or malignancy-related proteins. Genes involved in lipid metabolism were expressed at high levels; three fatty acid binding protein genes accounted for 1.1% of all expression in the brain.

Six genes were found to share high levels of similarity with known human genes involved in a number of human mental diseases, such as the Huntingtin gene in Huntington disease, the atrophin-1 gene in dentatorubral and pallidoluysian atrophy (Khan et al., 1996), the CGI-108 gene in the clinical global disease, and the small EDRK-rich factor 2 in spinal muscular atrophy (Scharf et al., 1998). The evolutionary conservation of these orthologous genes in teleost fish is highly interesting and may make fish a model organism for some behavior studies involving disease-related genes in humans.

Four stress-induced genes were identified: the heat shock proteins hsp70 and hsp90, the ependymin gene, which has been demonstrated to be important for cold acclimation in fish (Tang et al., 1999), and a homologue of the mouse stress-inducible protein mSTI1 (Blatch et al., 1997). The homology to mSTI1 is confined to a short stretch of 53 nucleotides (87% similarity). While this short length of homology leaves the gene identity uncertain, such evolutionarily conserved regions may imply conserved functional domains of the protein families. These stress-related proteins may be useful for environmental genomics studies.

An interesting group of genes showed a significant level of similarity to known sequences of unknown functions from various organisms including human, zebrafish, C. elegans, and Drosophila. Twenty-seven genes belong to this group, accounting for 2.3% of all sequenced clones. Although the functions of these genes are not yet known, their evolutionary conservation points to the existence of many gene families that remain to be characterized. Once functional information is gained from any species, comparative functional genomics will allow assignment of functionality to these orthologous genes.

3.3. The most abundantly expressed genes in the catfish brain

Mitochondrial genes were among the most highly expressed genes. The mitochondrial tRNA-Val gene was the most highly expressed gene in the channel catfish brain. It was encountered 54 times, accounting for 4.5% of the expression. Several other mitochondrial genes were also expressed at high levels, including cytochrome c oxidase I (1.8%), cytochrome c oxidase III (0.8%), and cytochrome b (0.6%). Several nuclear genes were expressed at high levels, such as ribosomal protein genes L41 (1.5%), L24 (0.8%), S27 (0.8%), L35 (0.7%), immunoglobulin heavy chain (1.1%), and fatty acid binding protein (0.6%). The high levels of expression of these genes indicated that either high copy numbers of these genes existed in the catfish genome, or their promoters were highly active.
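The expression levels quoted in this section are the clone frequencies from Table 1 divided by the 1201 clones sequenced. A minimal sketch of that arithmetic follows; the dictionary holds a selection of frequencies copied from Table 1, and the function and variable names are chosen here for illustration only.

```python
TOTAL_CLONES = 1201  # clones sequenced in this study

# Clone frequencies (column F of Table 1) for some abundant transcripts.
clone_counts = {
    "Ribosomal protein L35": 8,
    "Mitochondrial tRNA-Val": 54,
    "Immunoglobulin heavy chain": 13,
    "Cytochrome c oxidase I": 22,
    "Fatty acid binding protein": 7,
    "Ribosomal protein L41": 18,
    "Cytochrome c oxidase III": 10,
}


def expression_percent(n_clones: int, total: int = TOTAL_CLONES) -> float:
    """Frequency of a transcript as a percentage of all sequenced clones."""
    return 100.0 * n_clones / total


# Rank transcripts from most to least frequently sequenced.
ranked = sorted(clone_counts.items(), key=lambda item: item[1], reverse=True)
for gene, n in ranked:
    print(f"{gene:28s} {n:3d} clones  {expression_percent(n):.1f}%")
```

For example, the 54 tRNA-Val clones give 54/1201, or about 4.5%, which is the figure stated above.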
Table 2
Summary of the EST analysis with 1201 channel catfish (Ictalurus punctatus) brain cDNA clones

Category                                                                 Number of clones sequenced  Number of genes  Redundancy factor  Expression (%)
Translational machinery genes                                            257                         66               3.89               21.4
Structural genes                                                         37                          23               1.61               3.1
Enzyme genes                                                             32                          29               1.10               2.7
Transcriptional factors, DNA repair and DNA-binding protein genes        19                          15               1.27               1.6
Genes involved in immune systems                                         25                          10               2.50               2.1
Ionic channel, metal metabolism, sorting protein, and transporter genes  22                          14               1.57               1.8
Proto-oncogene, tumor-related protein, tumor suppressor genes            15                          14               1.07               1.2
Hormones, receptors, and regulatory genes                                30                          17               1.76               2.5
Development and differentiation-related genes                            4                           4                1.00               0.3
Stress induced protein genes                                             9                           4                2.25               0.7
Genes involved in lipid metabolism                                       15                          4                3.75               1.2
Genes homologous to human mental diseases-related genes                  7                           6                1.17               0.6
Genes homologous to sequences of unknown functions                       28                          27               1.04               2.3
Mitochondrial genes                                                      74                          16               4.63               6.2
Other genes                                                              21                          2                10.50              1.7
Subtotal                                                                 595                         251              2.37               49.5
Unknown clones                                                           606                         –                –                  50.5
Total                                                                    1201
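The two derived columns of Table 2 can be recomputed from the clone and gene counts. The sketch below assumes, as the table values suggest, that the redundancy factor is the number of clones divided by the number of genes and that expression (%) is the number of clones divided by the 1201 clones sequenced. The function and variable names are illustrative only and are not part of the original analysis; the shortened category labels are also ours.

```python
# Minimal sketch: recompute the derived columns of Table 2.
# Counts are copied from Table 2; names below are hypothetical helpers.
TOTAL_CLONES = 1201

# category -> (number of clones sequenced, number of genes)
table2 = {
    "Translational machinery": (257, 66),
    "Structural": (37, 23),
    "Enzymes": (32, 29),
    "Transcriptional factors, DNA repair, DNA binding": (19, 15),
    "Immune systems": (25, 10),
    "Ionic channel, metal metabolism, sorting, transport": (22, 14),
    "Proto-oncogene, tumor-related, tumor suppressor": (15, 14),
    "Hormones, receptors, regulatory": (30, 17),
    "Development and differentiation": (4, 4),
    "Stress induced": (9, 4),
    "Lipid metabolism": (15, 4),
    "Homologous to human mental disease genes": (7, 6),
    "Homologous to sequences of unknown function": (28, 27),
    "Mitochondrial": (74, 16),
    "Other": (21, 2),
}


def redundancy_factor(clones, genes):
    """Average number of times each gene in a category was sequenced."""
    return clones / genes


def expression_percent(clones, total=TOTAL_CLONES):
    """Share of all sequenced clones that fall in a category."""
    return 100.0 * clones / total


for category, (clones, genes) in table2.items():
    rf = redundancy_factor(clones, genes)
    pct = expression_percent(clones)
    print(f"{category:<52} {clones:4d} {genes:4d} {rf:6.2f} {pct:5.1f}")

# Subtotal row for the known genes.
known_clones = sum(clones for clones, _ in table2.values())
known_genes = sum(genes for _, genes in table2.values())
print(f"{'Subtotal':<52} {known_clones:4d} {known_genes:4d} "
      f"{redundancy_factor(known_clones, known_genes):6.2f} "
      f"{expression_percent(known_clones):5.1f}")
```

Printed values may differ from the table in the last digit where a ratio falls exactly on a rounding boundary, since the rounding convention used for the table is not stated.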
Highly expressed genes appeared in the EST sequencing as repeated sequencing of the same genes. A redundancy factor, the number of clones sequenced divided by the number of distinct genes they represent, was therefore used to measure the frequency of repeated sequencing for all categories of genes (Table 2). Alpha- and beta-globin genes had a redundancy factor of 10.5, but these globin transcripts presumably came from blood present in the brain. Efforts were made to reduce blood contamination in the brain samples so that the expression profile was minimally affected by genes expressed in the blood. Overall, we believe that the expression profile in the catfish brain was not greatly biased by blood genes, since no other blood-specific genes were encountered. Globin genes are normally expressed at extremely high levels in the blood, so even minimal blood contamination may result in the representation of globin genes.

Mitochondrial genes had a redundancy factor of 4.63, meaning that each of the 16 mitochondrial genes was sequenced 4.63 times on average in this project. Similarly, translational proteins such as ribosomal proteins were highly expressed, with a redundancy factor of 3.89. Other categories with a redundancy factor of more than 2.0 include lipid binding proteins (3.75), genes involved in immune systems (2.50), and stress-induced proteins (2.25). The redundancy factor was lowest for development and differentiation-related genes (1.00), followed by genes homologous to known sequences of unknown functions (1.04), proto-oncogenes (1.07), enzyme genes (1.10), brain genes homologous to human disease-related genes (1.17), and transcriptional factors (1.27). The overall redundancy factor for the 251 known genes was 2.37.

3.4. Catfish ribosomal protein genes were transcribed at highly differential rates

ESTs of 59 ribosomal protein genes have been sequenced, 34 for large and 25 for small ribosome subunits.
Although ribosomal proteins are required in proportional amounts for the assembly of ribosomes, large differences were observed in the abundance of the ESTs for the ribosomal proteins. As each of the ribosomal proteins is required in the formation of ribosomes, correct expression of ribosomal protein genes poses an interesting regulatory problem for the cell. Each ribosome contains some 50 distinct proteins that must be made at exactly the same rate (Nomura et al., 1984). It is known that the primary control of ribosomal protein synthesis is on translation of the mRNA, not on its synthesis (Nomura et al., 1984). Thus the level of translational regulation is quite dramatic. The most abundant ribosomal protein gene product was L41 (18 clones), followed by L24 (ten clones), S27 (nine clones), L35 (eight clones), L21 and L22 (both seven clones), L5b and L32 (both six clones), ribosomal protein large P2, L35a, L37a, and L38 (all five clones), and L7a, L11, L39, S9, and S24 (all four clones). Three clones were sequenced for L10a, L12, L15, L18, L30, L31, L36, S2, S3, S8, S15, S16, S18, S20, S21, and S22. Two clones were sequenced for S4, S10, S11, S12, S13, S19, and S23. Only
one clone was sequenced for each of the remaining 16 ribosomal genes (Table 1). The expression profile of ribosomal proteins indicated an 18-fold range in their RNA abundance. Such large differences indicated that several ribosomal protein genes such as L41, S24, and S27 have strong promoters and that translational control has to account for an adjustment of over 10–20 times in RNA abundance to make the same level of ribosomal proteins.

One of the advantages of EST analysis using non-normalized libraries is its ability to produce expression profiles. The frequency of cDNAs in a cDNA library is a reflection of mRNA abundance in the mRNA pool. The expression profile in the catfish brain is similar to those in humans and mice (Adams et al., 1992; Lee et al., 2000). A higher percentage of genes was identified as known genes (orthologues of known genes from other species) in the catfish brain than in the human brain. This was expected because more genes have been identified from various species in the eight years since the report of human ESTs (Adams et al., 1992).

The expression profile in the catfish brain was much less polarized than those of the catfish pituitary (Karsi et al., 1998) and muscle (Kim et al., 2000). This could be caused by tissue types, since our results with gene expression profiles in the catfish muscle were similar to those found in the porcine skeletal muscle (Davoli et al., 1999). Additionally, the most abundant RNAs could have been further over-represented in the process of 'in vivo excision'. In our previous EST analysis, the catfish pituitary and muscle libraries were made in the lambda phage cloning vector UNIZAP (Stratagene). A procedure called 'in vivo excision' was used to convert the phage libraries into plasmid libraries. For the brain library, the cDNAs were cloned into the plasmid vector pSport-1 (Life Technologies) and, therefore, clones were picked and sequenced directly without the procedure of in vivo excision.
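As a rough check on the 18-fold range in ribosomal protein EST abundance reported in Section 3.4, the clone counts enumerated there can be tallied directly. The sketch below is an illustration only: it groups the genes by the number of clones stated in the text and assumes the total of 1201 sequenced clones as the denominator. The variable and function names are hypothetical.

```python
# Minimal sketch: fold range of ribosomal protein EST counts (Section 3.4).
TOTAL_CLONES = 1201

# (clones per gene, number of genes listed in the text with that count)
ribosomal_tiers = [
    (18, 1),   # L41
    (10, 1),   # L24
    (9, 1),    # S27
    (8, 1),    # L35
    (7, 2),    # L21, L22
    (6, 2),    # L5b, L32
    (5, 4),    # P2, L35a, L37a, L38
    (4, 5),    # L7a, L11, L39, S9, S24
    (3, 16),   # L10a ... S22
    (2, 7),    # S4 ... S23
    (1, 16),   # remaining single-clone genes
]


def share_of_library(clones, total=TOTAL_CLONES):
    """Percentage of all sequenced clones contributed by one gene."""
    return 100.0 * clones / total


counts = [clones for clones, _ in ribosomal_tiers]
# Ratio of the most to the least frequently sequenced gene.
fold_range = max(counts) / min(counts)

print(f"fold range in EST abundance: {fold_range:.0f}")
for clones, n_genes in ribosomal_tiers:
    print(f"{n_genes:2d} gene(s) with {clones:2d} clone(s) each: "
          f"{share_of_library(clones):.2f}% of clones per gene")
```

Because EST counts this small are subject to sampling noise, the ratio is best read as an order-of-magnitude indication rather than a precise measurement.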
For the purpose of EST cataloging for the development of bio-reagents, repeated sequencing of highly expressed genes is not desirable. Normalized cDNA libraries are therefore much needed for characterization of large numbers of unique ESTs. We have constructed several normalized cDNA libraries of catfish, and the transcriptome analysis using the normalized libraries is underway in our laboratory.

4. Conclusions

Transcriptome analysis is an efficient alternative to genomic sequencing analysis. Such analysis of the overall transcripts of tissues and organs not only produces large numbers of ESTs, but also generates expression profiles when non-normalized cDNA libraries are used. EST cataloging and profiling will provide the basis for functional genomics research. In the present work, we identified 251 channel catfish brain genes and produced sequence tags for an additional 606 unknown gene clones. This demonstrated the rapid discovery of large numbers of novel genes. These ESTs will be
useful for comparative genomics by determination of their orthologous counterparts through evolution, for mapping by PCR analysis using radiation hybrid panels, and for identification of polymorphic markers in genes of known functions (type I markers). The ESTs will also be valuable molecular reagents for production of microarrays for functional genomics. Therefore, our sequences can help improve the catfish transcription map by adding genes expressed in the brain, particularly the candidate genes controlling feed conversion efficiency and other behavior traits.

Acknowledgements

This project was supported by a grant from the US Department of Agriculture National Research Initiative Competitive Grants Program (USDA-NRICGP) to Z.L. and R.D. (9835205-6738), and by the Auburn University Competitive BioGrant (Biogrant J. Liu 99). We appreciate the support of the Auburn University Department of Fisheries and Allied Aquacultures, College of Agriculture, and the Vice President for Research for their matched funds to USDA National Research Initiative Equipment Grants to Z.L. (98-352086540, 99-35208-8512).

References

Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., Kerlavage, A.R., McCombie, W.R., Venter, J.C., 1991. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252, 1651–1656.
Adams, M.D., Dubnick, M., Kerlavage, A.R., Moreno, R.F., Kelley, J.M., Utterback, T.R., Nagle, J.W., Fields, C., Venter, J.C., 1992. Sequence identification of 2,375 human brain genes. Nature 355, 632–634.
Azam, A., Paul, J., Sehgal, D., Prasad, J., Bhattacharya, S., Bhattacharya, A., 1996. Identification of novel genes from Entamoeba histolytica by expressed sequence tag analysis. Gene 181, 113–116.
Blatch, G.L., Lassle, M., Zetter, B.R., Kundra, V., 1997. Isolation of a mouse cDNA encoding mSTI1, a stress-induced protein containing the TPR motif. Gene 194, 277–282.
Boguski, M.S., Schuler, G.D., 1995. ESTablishing a human transcript map. Nat. Genet. 10, 369–371.
Burke, J., Wang, H., Hide, W., Davison, D.B., 1998. Alternative gene form discovery and candidate gene selection from gene indexing projects. Genome Res. 8, 276–290.
Chomczynski, P., Sacchi, N., 1987. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Analyt. Biochem. 162, 156–159.
Davoli, R., Zambonelli, P., Bigi, D., Fontanesi, L., Russo, V., 1999. Analysis of expressed sequence tags of porcine skeletal muscle. Gene 233, 181–188.
Douglas, S.E., Gallant, J.W., Bullerwell, C.E., Wolff, C., Munholland, J., Reith, M.E., 1999. Winter flounder expressed sequence tags: establishment of an EST database and identification of novel fish genes. Mar. Biotechnol. 1, 458–464.
Ettinger, A.J., Feng, G., Sanes, J.R., 1997. Epsilon-Sarcoglycan, a broadly expressed homologue of the genes mutated in limb-girdle muscular dystrophy 2D. J. Biol. Chem. 272, 32534–32538.
Franco, G.R., Adams, M.D., Bento, S.M., Simpson, A.J.G., Venter, J.C., Pena, S.D.J., 1995. Identification of new Schistosoma mansoni genes by the EST strategy using a directional cDNA library. Gene 152, 141–147.
Gong, Z., 1999. Zebrafish expressed sequence tags and their applications. Methods Cell. Biol. 60, 213–233.
Hirono, I., Aoki, T., 1997. Expressed sequence tags of medaka (Oryzias latipes) liver mRNA. Mol. Mar. Biol. Biotechnol. 6, 345–350.
Hishiki, T., Kawamoto, S., Morishita, S., Okubo, K., 2000. BodyMap: a human and mouse gene expression database. Nucleic Acids Res. 28, 136–138.
Inoue, S., Nam, B., Hirono, I., Aoki, T., 1997. A survey of expressed sequence tags in Japanese flounder (Paralichthys olivaceus) liver and spleen. Mol. Mar. Biol. Biotechnol. 6, 376–380.
Johnston, M., 1998. Gene chips: array of hope for understanding gene regulation. Curr. Biol. 8, R171–R174.
Karsi, A., Li, P., Dunham, R., Liu, Z.J., 1998. Transcriptional activities in the pituitaries of channel catfish before and after induced ovulation by injection of carp pituitary extract as revealed by expressed sequence tag analysis. J. Mol. Endocrinol. 21, 121–129.
Khan, F.A., Margolis, R.L., Love, S.L., Sharp, A.H., Li, S.H., Ross, C.A., 1996. cDNA cloning and characterization of an atrophin-1 (DRPLA disease gene)-related protein. Neurobiol. Dis. 3, 121–128.
Kim, S., Li, P., Zheng, X., Dunham, R.A., Liu, Z.J., 2000. Gene expression in the muscles of young and mature channel catfish (Ictalurus punctatus) as analyzed by expressed sequence tags and gene filters. Fish Physiol. Biochem. (in press).
Lee, C.-K., Weindruch, R., Prolla, T.A., 2000. Gene-expression profile of the aging brain in mice. Nat. Genet. 25, 294–297.
Liu, Z.J., Karsi, A., Dunham, R., 1999. Development of polymorphic EST markers suitable for genetic linkage mapping of catfish. Mar. Biotechnol. 1, 437–447.
Mekhedov, S., de Ilarduya, O.M., Ohlrogge, J., 2000. Toward a functional catalog of the plant genome. A survey of genes for lipid biosynthesis. Plant Physiol. 122, 389–401.
Nomura, M., Gourse, R., Baughman, G., 1984. Regulation of the synthesis of ribosomes and ribosomal components. Ann. Rev. Biochem. 53, 75–117.
Quackenbush, J., Liang, F., Holt, I., Pertea, G., Upton, J., 2000. The TIGR gene indices: reconstruction and representation of expressed gene sequences. Nucleic Acids Res. 28, 141–145.
Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular Cloning, A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
Scharf, J.M., Endrizzi, M.G., Wetter, A., Huang, S., Thompson, T.G., Zerres, K., Dietrich, W.F., Wirth, B., Kunkel, L.M., 1998. Identification of a candidate modifying gene for spinal muscular atrophy by comparative genomics. Nat. Genet. 20, 83–86.
Schena, M., Shalon, D., Heller, R., Chai, A., Brown, P.O., Davis, R.W., 1996. Parallel human genome analysis: microarray-based expression monitoring of 1000 genes. Proc. Natl. Acad. Sci. USA 93, 10614–10619.
Tang, S.J., Sun, K.H., Sun, G.H., Lin, G., Lin, W.W., Chuang, M.J., 1999. Cold-induced ependymin expression in zebrafish and carp and implications for cold acclimation. FEBS Lett. 459, 95–99.
Wang, K., Gan, L., Jeffery, E., Gayle, M., Gown, A.M., Skelly, M., Nelson, P.S., Ng, W.V., Schummer, M., Hood, L., Mulligan, J., 1999. Monitoring gene expression profile changes in ovarian carcinomas using cDNA microarray. Gene 229, 101–108.
Waterston, R., Martin, C., Craxton, M., Huynh, C., Coulson, A., Hillier, L., Durbin, R., Green, P., Shownkeen, R., Halloran, N., Metzstein, M., Hawkins, T., Wilson, R., Berks, M., Du, Z., Thomas, K., Thierry-Mieg, J., Sulston, J., 1992. A survey of expressed genes in Caenorhabditis elegans. Nat. Genet. 1, 114–123.