JOURNAL OF
GENETICS AND GENOMICS J. Genet. Genomics 36 (2009) 695–702
www.jgenetgenomics.org
Identification, bioinformatic analysis and expression profiling of candidate mRNA-like non-coding RNAs in Sus scrofa
Bang Xiao a, b, 1, Xingju Zhang b, 1, Yong Li b, Zhonglin Tang b, Shulin Yang b, Yulian Mu b, Wentao Cui b, Hong Ao b, Kui Li b, *
a Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education of China, Huazhong Agricultural University, Wuhan 430070, China
b Key Laboratory of Farm Animal Genetic Resources and Utilization of Ministry of Agriculture of China, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
Received for publication 13 July 2009; revised 6 November 2009; accepted 7 November 2009
Abstract
Messenger RNA-like non-coding RNAs (mlncRNAs) are a newly identified group of non-coding RNAs (ncRNAs) that may be involved in a number of critical cellular events. In this study, 93 candidate porcine mlncRNAs were obtained by computational prediction and screening, among which 72 were mapped to the porcine genome. Further analysis of 8 representative candidates revealed that these mlncRNA candidates are not highly conserved among species. Remarkably, one of the candidates, sTF35495, was found to be a precursor of a putative porcine microRNA. By RACE PCR, we determined that the full length of sTF35495 was 3 kb. The protein-coding potential of this RNA was tested in silico with no significant finding. Semi-quantitative RT-PCR analysis of the subgroup of 8 candidates revealed two distinct expression profiles, and two molecules were further validated by real-time PCR. The predicted pre-microRNA sequence in this study provides a potentially interesting insight into the in vivo function of porcine mlncRNAs, and our findings suggest that they play key biological roles in Sus scrofa.
Keywords: putative porcine mlncRNA; chromosome localization; interspecies conservation; pre-microRNA; RACE; expression profiling; qPCR
* Corresponding author. Tel & Fax: +86-10-6281 3822. E-mail address: [email protected]
1 These authors contributed equally to this work.
DOI: 10.1016/S1673-8527(08)60162-9
Introduction
The non-coding RNAs (ncRNAs) are a group of RNA molecules that function directly as structural, catalytic or regulatory RNAs, rather than as templates for encoding proteins. These molecules have gained much attention as they participate in diverse processes that regulate gene expression in vivo. For example, microRNAs can regulate the expression of target genes by triggering translational repression or transcript degradation; piRNAs (Piwi-interacting RNAs) are involved in interactions with piwi-domain proteins in mammalian germlines; and snoRNAs (small nucleolar RNAs) function in processing and post-transcriptional modification of rRNA precursors. One of the most prominent characteristics of the ncRNAs is the great diversity of their lengths. Distinct from the relatively small sizes of the microRNAs and snoRNAs (typically tens to hundreds of bases), messenger RNA-like non-coding RNAs (mlncRNAs) are much larger molecules. They are likely transcribed by RNA polymerase II, and may also be capped, spliced, and polyadenylated in a manner similar to conventional mRNAs (Griffiths-Jones,
2007). The roles of the mlncRNAs in biochemical and physiological events in vivo have also been the subject of an increasing number of reports. For instance, some mlncRNAs have been found to be precursors of microRNAs. H19 RNA, an important ncRNA involved in maternal imprinting, was identified as a precursor of a microRNA (Cai and Cullen, 2007), and 97 mouse full-length mlncRNAs have been shown to be miRNA-encoding candidates on the basis of robust experimental evidence (He et al., 2008). Some mlncRNAs are known to function in epigenetic pathways. Kcnq1ot1 organizes a lineage-specific nuclear domain for epigenetic gene silencing (Redrup et al., 2009), and Air silences transcription by targeting G9a to chromatin in mouse (Nagano et al., 2008). Other mlncRNAs participate in fundamental developmental processes. The knockdown of a 6.7 kb spliced and polyadenylated murine ncRNA (Tug1), which is expressed in the retina and brain, results in malformed or non-existent outer segments of transfected photoreceptors (Young et al., 2005), and Dnm3os is indispensable for normal skeletal development and body growth in mouse (Watanabe et al., 2008). Because the experimental methods for distinguishing mlncRNAs from mRNAs are time-consuming and functional studies of these molecules are relatively recent, the roles of the majority of the mlncRNAs remain unknown despite their emerging biological importance in vivo.
Pig (Sus scrofa) is an important commercial farm animal and has been the subject of many biomedical and animal breeding studies. Huang et al. (2008) identified hundreds of microRNAs related to skeletal muscle development in pig through in silico analysis, and studied their expression using microarrays. McDaneld et al. (2009) reported a complete set of microRNA transcriptome profiles at specific stages of skeletal muscle growth in pig, and revealed a group of microRNAs that may function in muscle development and growth.
However, apart from Xist and H19, only one mlncRNA (porcine trophoblast-derived non-coding RNA, TncRNA) has thus far been systematically investigated in pig (Ren et al., 2009). Further studies of the mlncRNAs, which are a vital component of the transcriptome, are therefore needed. In this study, we predicted 93 candidate porcine mlncRNAs. Bioinformatic analysis described their chromosomal distribution and their low conservation among species. Analysis of sTF35495 provided evidence that the mlncRNA candidates lack protein-coding potential. Distinct expression patterns were detected for 8 representative candidates. Notably, the prediction of a pre-microRNA offered a clue as to how porcine mlncRNAs function in vivo.
Materials and methods
In silico screening of candidate porcine mlncRNAs
We obtained information on 512 pairs of full-length ncRNA sequences that are conserved between human and mouse from the supplemental data of Willingham et al. (2005). For convenience, we referred to each pair of conserved sequences as one “seed”, so every human transcript had a homologous murine counterpart. Each seed was aligned with the NCBI porcine EST database using BLASTN, and a max score of 100 was set as the threshold value for homologous ESTs. For the porcine counterparts of each seed, repeat ESTs were filtered out and all homologs were assembled into contigs using DNASTAR software. Subsequently, each contig was mapped back to both the human and mouse genomes with Map Viewer (http://www.ncbi.nlm.nih.gov/mapview/). The homologous genomic sequence with the highest score was retrieved for comparison with the mapping result of the corresponding seed. We adopted the approach of Inagaki et al. (2005) with some modifications, whereby a contig was collected if it was aligned with a score higher than 100 in either species. Next, we tested whether the retained contigs had any similarity with known protein-coding sequences by a BLASTX search against the NCBI non-redundant protein sequence (nr) database. We used a max score of 100 as the threshold value for rejection. Finally, the seed of each retained contig was aligned against the NCBI human or mouse genomic plus transcript (Human or Mouse G + T) database. If the human or mouse counterpart of any contig was identified as a known mRNA, the contig was excluded. In this screening, a match required the same transcriptional direction as the homologous mRNA together with at least 80% coverage and 80% max identity. The retained contigs were regarded as putative porcine mlncRNAs.
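The three screening steps described above can be sketched as a single decision function. This is only an illustrative sketch: the `ContigEvidence` record and its field names are hypothetical and do not come from the study, and whether each cutoff is inclusive or exclusive is an assumption made here.

```python
from dataclasses import dataclass


@dataclass
class ContigEvidence:
    """Hypothetical summary of the alignment evidence for one porcine contig."""
    genome_scores: dict        # best remapping score per species, e.g. {"human": 250.0}
    blastx_max_score: float    # best BLASTX max score against nr (0.0 if no hit)
    seed_mrna_hit: bool        # seed matched a known mRNA in the G + T database
    same_strand: bool          # match is in the same transcriptional direction
    coverage: float            # percent coverage of the mRNA match
    identity: float            # percent max identity of the mRNA match


def is_putative_mlncrna(ev, score_cutoff=100.0, cov_cutoff=80.0, id_cutoff=80.0):
    """Apply the three filters in order; cutoff inclusivity is an assumption."""
    # Step 1: the contig must remap with a score above the cutoff in either species.
    if not any(score > score_cutoff for score in ev.genome_scores.values()):
        return False
    # Step 2: reject contigs that resemble a known protein-coding sequence.
    if ev.blastx_max_score >= score_cutoff:
        return False
    # Step 3: exclude contigs whose seed is now annotated as a known mRNA.
    seed_is_mrna = (
        ev.seed_mrna_hit
        and ev.same_strand
        and ev.coverage >= cov_cutoff
        and ev.identity >= id_cutoff
    )
    return not seed_is_mrna
```

A contig that remaps well, has no protein hit, and whose seed is not a known mRNA passes all three steps; failing any one step removes it from the candidate set.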
Chromosome localization of putative mlncRNAs
Based on the unfinished pig genome sequencing project (http://www.ncbi.nlm.nih.gov/projects/mapview/map_search.cgi?taxid=9823), the putative mlncRNAs were localized on porcine chromosomes by BLASTN searches of both the NCBI ref-genomics and HTGS (High Throughput Genomic Sequences) databases. Only retrieved genomic sequences with a score > 100, an E value < 10^-10, and coverage of no less than 80% of the full length of the contig were accepted as the chromosomal localizations of the mlncRNAs.
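The acceptance rule for a genomic hit can be written as a small predicate. This is a minimal sketch, assuming the E-value cutoff is read as 10^-10 and that coverage is measured as aligned length divided by contig length; the function and parameter names are illustrative only.

```python
def accept_localization(score, e_value, aligned_length, contig_length,
                        min_score=100.0, max_e_value=1e-10, min_coverage=0.80):
    """Return True if a genomic hit satisfies all three localization criteria."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    # Fraction of the contig covered by the alignment (an assumed definition).
    coverage = aligned_length / contig_length
    return score > min_score and e_value < max_e_value and coverage >= min_coverage
```

For instance, a hit with score 350, E value 1e-50 and 900 of 1,000 contig bases aligned would be accepted, whereas the same hit with only 700 aligned bases would not.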
Bioinformatic analysis
To determine the level of interspecies conservation of the putative porcine mlncRNAs, 8 candidates, which represented both typical coverage and an abundance of homologous ESTs (Table 1), were chosen for alignment against the genomic sequences of different species. An E value of 10^-10 was set as the threshold for accepting homologous genomic sequences. Bos taurus, Monodelphis domestica, Ornithorhynchus anatinus, and Gallus gallus were the species analyzed. The secondary structures of the candidate porcine mlncRNAs were predicted using MFOLD, a program that predicts which RNA secondary structures are most likely to form based on free energy minimization (Zuker, 2003). Stem-loop structures of an appropriate size (about 70 nt) predicted by MFOLD were evaluated as possible microRNA precursors by SSEARCH analysis of the miRBase database (http://microrna.sanger.ac.uk/sequences/search.shtml). High scores (100) and low E values (10^-10) were set as the thresholds for validation as a precursor. Finally, mature microRNA sequences were computationally identified from all putative precursors deposited in miRBase.

5′-RACE PCR
To determine the upstream nucleotide sequence of a novel mlncRNA candidate, sTF35495, 5′-RACE PCR was carried out using the putative microRNA precursor. The specific primer used was complementary to the sTF35495 contig (GSP, 5′-GGATGTGAGTGCCAGCGTCCAGTAGG-3′). 5′-RACE PCR was performed using the SMARTer™ RACE cDNA Amplification Kit (Clontech, Palo Alto, CA, USA). Briefly, total RNA was extracted from the PK-15 (porcine kidney epithelial) cell line and then reverse transcribed into 5′-RACE-Ready cDNA. PCR was then performed for 5 cycles of 94°C for 30 s and 72°C for 2 min; 5 cycles of 94°C for 30 s, 70°C for 30 s, and 72°C for 2 min; and 30 cycles of 94°C for 30 s, 68°C for 30 s, and 72°C for 2 min. The products were analyzed on 1.5% agarose/EtBr gels and subcloned. The sequencing results were aligned with the contig using DNASTAR software and mapped in the NCBI HTGS database. The peptides predicted by NCBI ORF-Finder (http://www.ncbi.nlm.nih.gov/projects/gorf/) for sTF35495 were analyzed by BLASTP against the nr database.

Table 1 Conservation and expression analysis of eight representative porcine mlncRNA candidates
Contig        EST abundance   Coverage (%)
                              Mouse     Human
sTF30376      7               56.6      32.91
sTF30418      7               22.78     69.29
sTF30650      21              32.12     90.06
sTF33561      6               49.82     66.93
sTF35121-2    1               42.73     14.35
sTF35495      56              93.46     90.98
sTF36235      14              52.21     73.83
sTE71410      4               74.67     86.68
“EST abundance” denotes the number of homologous ESTs found by BLASTN for each candidate. The numbers of homologous ESTs for the different contigs varied from 1 to 56 and the coverage ranged from 14.35% to 93.46%. Hence, these 8 candidates were representative of typical EST abundance and coverage variation.
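The ranges quoted in the note to Table 1 can be recomputed directly from the tabulated values. The sketch below transcribes the eight rows into a Python list (the tuple layout and the function name are illustrative choices, not part of the study) and returns the minimum and maximum of each quantity; the result does not depend on which coverage column is mouse and which is human.

```python
# Rows transcribed from Table 1: (contig, EST abundance, coverage %, coverage %)
TABLE_1 = [
    ("sTF30376", 7, 56.6, 32.91),
    ("sTF30418", 7, 22.78, 69.29),
    ("sTF30650", 21, 32.12, 90.06),
    ("sTF33561", 6, 49.82, 66.93),
    ("sTF35121-2", 1, 42.73, 14.35),
    ("sTF35495", 56, 93.46, 90.98),
    ("sTF36235", 14, 52.21, 73.83),
    ("sTE71410", 4, 74.67, 86.68),
]


def summarize(rows):
    """Return the (min, max) of EST abundance and of all coverage values."""
    abundances = [row[1] for row in rows]
    coverages = [value for row in rows for value in row[2:]]
    return {
        "abundance_range": (min(abundances), max(abundances)),
        "coverage_range": (min(coverages), max(coverages)),
    }
```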
Preparation of RNA and cDNA for the expression profiling of candidate mlncRNAs An indigenous Chinese pig, the Wuzhishan pig, was chosen as the animal subject. Total RNA extracts of 17 fresh tissue samples were prepared using Trizol reagent (Invitrogen, Carlsbad, CA, USA) and reverse-transcribed with M-MLV reverse transcriptase (Promega, Madison, WI, USA). In addition to whole embryos at 33 days post coitus (dpc) and longissimus muscle at 90 dpc, the samples included longissimus muscle, heart, liver, spleen, kidney, stomach, small and large intestine, brain, thymus gland, pituitary gland, thyroid, ovary, testicle, and uterus from adult pigs.
Semi-quantitative analysis and validation by real-time PCR
All primers used in the semi-quantitative and real-time PCR experiments were designed using Primer Premier 5.0 and are listed in Supplemental data 1. The internal standard control primers for the porcine β-actin gene were as described by Li et al. (2007). The PCR was performed for 5 min at 94°C, followed by 24 cycles of 30 s at 94°C, 30 s at the appropriate annealing temperature, and 50 s at 72°C, with a final extension step of 5 min at 72°C. All amplified products were cloned and sequenced. Nine cDNA templates in total were analyzed by real-time PCR, each of which represented a principal physiological system in the pig. They included the digestive (liver, stomach, large, and small intestine), circulatory (heart), urinary (kidney), reproductive (ovary), endocrine (thyroid), and immune (spleen) systems. Real-time PCR was performed using Power SYBR Green PCR Master Mix according to the manufacturer’s instructions (ABI, Foster City, CA, USA). The cycling conditions consisted of an initial 5 min at 95°C followed by 40 cycles of two-temperature cycling, with 15 s at 95°C and 1 min at 60°C.
Results and discussion
In silico screening for candidate mlncRNAs in pig
Porcine mlncRNA candidates were obtained through a series of computational analyses. For example, TF10184 and hTF10184 are homologous sequences from mouse and human, respectively, and together constitute one seed. In the homologous porcine EST searches, 7 and 4 ESTs showed a positive alignment with TF10184 and hTF10184, respectively. Four repeat sequences were filtered out and the remaining 7 ESTs were assembled into 2 different sequences of sTF10184 (“s” denotes Sus scrofa). Among them, a similar genomic localization was found between sTF10184-fra1 and hTF10184. Likewise, similar mapping results were found between sTF10184-fra2 and TF10184. Finally, no significant similarity was found between sTF10184 and any known protein by BLASTX analysis, and no known mRNA was identified from alignments with its seed. Hence, both sTF10184-fra1 and sTF10184-fra2 were designated as candidate porcine mlncRNAs.
We obtained a provisional cohort of 119 contigs after assembly and re-mapping. To minimize possible false positive results, we tested whether they had any similarity with known protein-coding sequences. By BLASTX analysis, 9 contigs were eliminated from this pool (Supplemental data 2). Additionally, the corresponding seeds of the retained 110 contigs were re-analyzed, as some mlncRNA candidates reported in 2005 were subsequently shown to be mRNAs. We identified 17 contigs whose seeds had been identified as mRNA in the NCBI human or mouse G + T database (Supplemental data 3). Hence, these porcine mlncRNA candidates were excluded from further analysis and a total of 93 sequences were retained as putative porcine mlncRNAs for additional testing (Supplemental data 4). Among them, 8 candidates were cloned, sequenced, and deposited in GenBank under the accession numbers of GQ231315 to GQ231315 for sTF30650, sTF33561,
sTF35495, sTE71410, sTF30376, sTF30418, sTF36235 and sTF35121-2.
Among these porcine mlncRNA candidates, 2 different contigs were sometimes assembled from one seed due to a limitation in EST information. If 2 contigs were localized in the same porcine clone without overlap, they were denoted with the suffix “fra”. For example, sTF10184-fra1 and -fra2 were localized within a distance of 63 bp in the same clone and in the same transcription orientation. Similarly, sTF28373-alt1 and -alt2 were identified in the same porcine clone with some overlap and in the same orientation, indicating that they are alternatively spliced RNAs. Additionally, the relationship between sTF32747-1 and -2 remains unclear because the chromosome localization could not be determined.
Willingham et al. (2005) first remapped 4,280 mouse ncRNA candidates on the human genome, and then selected the top 512 sequences based on conservation. It must be noted that our derivative porcine counterparts do not represent all authentic mlncRNAs in S. scrofa but are a fraction of the entire mlncRNA population extracted at high stringency. To test the protein-coding potential of the candidate mlncRNAs, Inagaki et al. (2005) previously used online software to predict the lengths of possible open reading frame (ORF) sequences for full-length molecules. However, our putative mlncRNA cohort consisted of contigs assembled from homologous porcine ESTs and it had not been experimentally determined whether they were full-length. We thus used BLASTX rather than ORF analysis in this study. There was a possibility, however, that some of our candidates were 5′ or 3′ UTRs of unknown porcine mRNAs, for which no similarity could be found by BLASTX. Thus, the results of BLASTX need to be validated in the future based on the full-length data of our candidates.
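The candidate counts reported in this section can be checked with a short tally. The sketch below only restates the arithmetic of the two exclusion steps (119 assembled contigs, 9 removed by BLASTX, 17 removed because their seeds are known mRNAs); the function name is illustrative.

```python
def tally_candidates(assembled, blastx_rejected, seed_is_mrna):
    """Return (contigs left after BLASTX, contigs left after the seed mRNA check)."""
    after_blastx = assembled - blastx_rejected
    retained = after_blastx - seed_is_mrna
    return after_blastx, retained


# Counts reported in the text.
after_blastx, retained = tally_candidates(119, 9, 17)
```

With the reported numbers, the intermediate pool is 110 contigs and the final set is 93 putative mlncRNAs, matching the figures given above.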
Chromosome localization of candidate porcine mlncRNAs
Using the available data from the pig genome sequencing project, 72 of the 93 newly identified mlncRNA candidates could be mapped to porcine chromosomes. Among these candidates, 17 came from the NCBI ref-genomics database and others from the HTGS database. Eight candidates were transcribed from porcine chromosome 1, which contained most candidates identified in our screen. Chromosomes 9, 13, and 15 contained 7 putative mlncRNAs each. Only one of the candidate mlncRNAs, sTF30376, was transcribed from chromosome X. Likewise,
only one candidate was mapped to chromosome 11. The chromosome localizations of the 72 candidate mlncRNAs are shown in Supplemental data 5. All adjacent genes (within 5 kb of either the 5′ end or the 3′ end of a putative mlncRNA locus) are listed in Table 2. For 2 candidates, the distance from their 3′ ends to the corresponding downstream genes was less than 1.2 kb. Close adjacent genes were also found at the 5′ ends of our mlncRNA candidates (1,207 to 4,136 bp distances). In addition, 4 candidates that are transcribed within their corresponding host genes showed similar distribution patterns on chromosomes and are localized in introns. An example is sTF36447, which is transcribed from a > 10 kb intron of a porcine gene similar to Gyltl1b.
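The notion of an adjacent gene within 5 kb of the 5′ or 3′ end of a locus can be made concrete with interval arithmetic. The following is a sketch under stated assumptions: coordinates are 1-based and inclusive on a common reference, distance is taken as the coordinate difference between the nearest ends, and the `Interval` type and function name are hypothetical rather than taken from the study.

```python
from typing import NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    start: int          # 1-based, inclusive
    end: int            # 1-based, inclusive
    strand: str = "+"   # orientation of the transcript on the reference


def classify_neighbor(locus: Interval, gene: Interval,
                      window: int = 5000) -> Optional[Tuple[str, int]]:
    """Return (relation, distance_bp) of a gene relative to a locus, or None."""
    # A gene that fully contains the locus is treated as a host gene.
    if gene.start <= locus.start and locus.end <= gene.end:
        return ("host", 0)
    if gene.end < locus.start:
        # Gene lies to the left: upstream (5') for a plus-strand locus.
        distance = locus.start - gene.end
        side = "5'" if locus.strand == "+" else "3'"
    elif gene.start > locus.end:
        # Gene lies to the right: downstream (3') for a plus-strand locus.
        distance = gene.start - locus.end
        side = "3'" if locus.strand == "+" else "5'"
    else:
        return ("overlap", 0)
    # Only genes within the window count as adjacent.
    return (side, distance) if distance <= window else None
```

The host-gene branch corresponds to the intronic candidates, although this sketch does not check exon boundaries, so it cannot by itself distinguish an intronic locus from an exonic one.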
Table 2 Positional information for eight representative porcine mlncRNA candidates
Contig      SSC   Distance to adjacent genes (bp)   Adjacent genes
                  3′         5′
sTF25363    7     521        -                      Similar to Atxn1
sTF24007    1     1,167      2,590                  GTPase Rab14 / similar to Cep110
sTF17281    5     -          4,136                  Cyp27a1
sTF36065    1     -          1,207                  TUBA3D
sTF12535    14    Intron                            Kcnma1
sTF31326    1     Intron                            Similar to ZNF516
sTF36447    15    Intron                            Similar to Gyltl1b
sTF37831    1     Intron                            Similar to Slc12a1
The functions of the TUBA3D and ZNF516 genes were referred to human homologs as no murine data are available.

The candidate porcine mlncRNAs are not highly conserved among different species
It is commonly believed that, because of evolutionary pressure, conservation often implies a functional necessity. Consequently, the interspecies conservation of the nucleotide sequences of the mlncRNA candidates was investigated. Because it had not been determined whether they were full-length, 8 candidates, which were representative of different coverage and abundance of homologous ESTs, were selected for conservation analysis (Fig. 1). Using BLASTN, a clear trend of decreasing similarity with increasing genetic distance between pig and other organisms was observed. Seven of eight candidates had significant homology with the genomic sequences of cattle, which has the smallest genetic distance from pig of the species analyzed. Moreover, five homologous counterparts were found in opossum; and two and three homologous transcripts were identified in a representative primordial mammal, the platypus, and a lower amniote, the chicken, respectively. Remarkably, these 8 porcine putative mlncRNAs were also found by computational screening to be conserved in human and mouse. Hence, the results from this study demonstrated that the putative mlncRNAs are only conserved among the Eutheria. A low level of conservation was evident between lower mammals and Euteleostomi. Only 1 homologous counterpart was identified in zebrafish (data not shown). Our findings provide evidence that mlncRNAs are in fact poorly conserved among different species, which is consistent with the conclusions from the report of Inagaki et al. (2005).
Unlike conventional genes, low levels of conservation among species may be a distinct characteristic of the mlncRNAs. This hypothesis is supported by an increasing number of studies. For example, Xist, which plays key roles in gene dosage effects in specific mammals, has no counterpart in some vertebrates such as the pufferfish (Inagaki et al., 2005). Additionally, a number of studies have reported that in many cases regulatory mlncRNAs function by participating in epigenetic pathways. For instance, Schoenfelder et al. (2007) found that ncRNAs in the H19 imprinting control region mediate gene silencing in transgenic Drosophila. In addition, murine Kcnq1ot1 organized a specific nuclear domain for epigenetic gene silencing (Redrup et al., 2009). It is noteworthy that these findings are consistent with a low level of conservation of mlncRNA sequences and suggest that these RNA molecules may function by regulating conserved higher-order structures rather than by base-pairing with complementary sequences.

Fig. 1. Comparison of the 8 candidate porcine mlncRNAs among four species. The sequences of the 8 putative mlncRNAs were aligned with the genomic sequences of four different species by BLASTN. The histograms represent the max scores from each alignment, whilst the smooth curves show the corresponding E values. The number of homologous counterparts decreased with increasing genetic distance between each species and pig. Furthermore, the corresponding max scores declined and the E values increased in parallel. Also notably, with the exception of sTF30376, six E values for homologous counterparts to the porcine mlncRNAs in cattle were 0, which could not be shown on a logarithmic scale. Additionally, the number of chicken counterparts to the putative porcine mlncRNAs that satisfied the threshold max scores and E values was unexpectedly comparable with that of platypus. This was possibly due to the small number of candidates investigated.
sTF35495, the precursor of putative microRNA ssc-mir-568
To test whether the 8 candidate porcine mlncRNAs encode microRNAs, their secondary structures were predicted using MFOLD software. Consequently, a putative porcine microRNA, ssc-mir-568, and its corresponding precursor, sTF35495, were identified. Although this particular microRNA has never previously been experimentally identified in pig, it was highly conserved with its human and mouse counterparts. The predicted secondary structure of sTF35495 and the stem-loop of pre-microRNA 568 are shown in Fig. 2. Interestingly, sTF35495 was also highly conserved with the corresponding counterpart in G. gallus, a lower amniote, indicating that the putative precursor of mir-568 may be conserved among species other than mammals and may have a key biological function.

Fig. 2. Predicted secondary structure of sTF35495 and stem-loop of the putative pre-microRNA 568. A: predicted secondary structures of sTF35495 using MFOLD. The stem-loop structure contained in the red box is a putative precursor of ssc-mir-568. B: predicted stem-loop structure of the putative precursor of ssc-mir-568. This stem-loop structure was drawn with reference to Homo sapiens miR-568 in miRBase. The bold capitals indicate the sequence of mature microRNA, which is identical to hsa-mir-568. The bold italic capitals indicate the difference between human and pig; “uGuauccaCa” (human) versus “uAuauccaUa” (pig). Only minor differences exist between the two predicted stem-loop structures.

5′-RACE PCR and analysis of the protein-coding potential of sTF35495
A ~0.5 kb fragment was amplified by 5′-RACE PCR and an overlap of 40 bp was found between the amplicon and the contig of sTF35495. Genome mapping showed that the amplicon was located in SSC 13 clone CH242-125K14, covering the region from 81,034 to 81,564. sTF35495 was assigned to the same chromosome (from 78,531 to 81,073). This mlncRNA candidate is 3,000 nt in length and contains a typical polyadenylation signal sequence (AATAAA) located 179 nt upstream of the 3′ poly (A) tail. Thus, the full-length sequence of sTF35495 was identified and its protein-coding potential was predicted using ORF-Finder. The longest putative peptide of 112 amino acids and 15 other predictions were then analyzed by BLASTP, but no significant similarity with any known proteins was found. The predicted ORFs and the results of BLASTP analysis are shown in Supplemental data 6. We aligned the full-length sTF35495 cDNA against the NCBI nr database using BLASTX as discussed above but found only a human hypothetical protein with an E value of 0.37. These data validated the use of BLASTX instead of ORF analysis in the screening of the mlncRNA candidates.
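The kind of ORF scan and polyadenylation-signal search described above can be illustrated with a small standalone script. This is a simplified sketch and not a reimplementation of NCBI ORF-Finder: it assumes ORFs run from ATG to the first in-frame stop codon, scans all six frames, and uses function names invented here for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) of ATG..stop ORFs in one forward frame; end is exclusive."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield (start, i + 3)
            start = None


def longest_orf_aa(seq):
    """Length in amino acids (stop excluded) of the longest ORF over six frames."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for start, end in orfs_in_frame(strand, frame):
                best = max(best, (end - start) // 3 - 1)
    return best


def polya_signal_positions(seq, signal="AATAAA"):
    """Return 0-based start positions of the polyadenylation signal in seq."""
    seq = seq.upper().replace("U", "T")
    return [i for i in range(len(seq) - len(signal) + 1) if seq.startswith(signal, i)]
```

A short longest ORF does not by itself show that a transcript is non-coding, which is why the predicted peptides were additionally compared with known proteins by BLASTP.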
Evaluation of the expression profiles of the candidate porcine mlncRNAs
The expression profiles of the 8 selected candidate porcine mlncRNAs are shown in Fig. 3. Among them, sTF30418 is primarily restricted to heart, stomach, and large and small intestine. This finding indicated a possible role for this RNA in the adult porcine circulatory and digestive systems. Similarly, sTE71410 was identified in six tissues only. In contrast, three candidates were expressed ubiquitously in almost all of the pig tissues tested; sTF33561 and sTF35121 were identified in 12 and 13 tissues, respectively; and sTF30376 was identified in all adult tissues except for brain and stomach. Our results suggest that these three putative mlncRNAs may participate in a variety of biological events. Interestingly, another three mlncRNA candidates were identified in 8 tissues despite their differing specificities.
Two porcine mlncRNA candidates, sTF33561 and sTF30376, were selected to determine their accurate expression levels by real-time PCR (Fig. 4). The results were generally in accordance with, and thus validated, the semi-quantitative PCR results shown in Fig. 3, such as the higher expression of sTF30376 in heart compared with kidney.
Together, our data revealed two distinct expression patterns for the porcine mlncRNA candidates. Some of these molecules are expressed ubiquitously in vivo. In contrast to the prominent tissue specificity of small ncRNAs, mlncRNAs may participate in some common cellular processes and function physiologically in basic events in vivo, somewhat like house-keeping genes. An example is the well-known ncRNA, 7SL RNA (Wild et al., 2004; Halic and Beckmann, 2005), which is a core component of the signal recognition particle and is critical for targeting or transportation of nascent proteins. However, other mlncRNAs may only function in specific pathways in a few tissues. Our current evidence further indicates that mlncRNAs are very diverse in terms of their involvement in certain biological processes, and suggests that they have a diverse functional repertoire and play key biological roles in pig and other animals.

Fig. 3. Semi-quantitative PCR analysis of four representative porcine mlncRNA candidates. Four porcine mlncRNA transcripts that show typical distribution patterns for the contigs were selected. “A” to “I” denote sTF30376, sTF30418, sTF33561, sTF36235, Gapdh, sTF35121-2, sTE71410, sTF35495, and sTF30650, respectively. The upper numbers 1–17 indicate Wuzhishan pig embryo at 33 dpc, longissimus muscle at 90 dpc, and adult longissimus muscle, heart, liver, spleen, kidney, stomach, small and large intestine, brain, thymus gland, pituitary gland, thyroid, ovary, testicle, and uterus, respectively. M indicates the 100 bp DNA ladder, with the lowest bands corresponding to a molecular weight of 200 bp.

Fig. 4. Real-time PCR analysis of two putative porcine mlncRNAs. More accurate expression analysis of the porcine mlncRNAs sTF33561 and sTF30376 was conducted by real-time PCR. The results were consistent with the semi-quantitative PCR data for these transcripts.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 30830080 and 30800779), the Natural Science Foundation of Beijing (No. 5072035), and the Ministry of Science and Technology of China (No. 2006AA10Z135, 2008AA10Z143, 2006CB102105 and 2009CB941604).
Supplemental data
Supplemental data 1–6 associated with this article can be found in the online version at www.jgenetgenomics.org.
References
Cai, X., and Cullen, B.R. (2007). The imprinted H19 noncoding RNA is a primary microRNA precursor. RNA 13: 313–316.
Cummins, J.M., He, Y., Leary, R.J., Pagliarini, R., Diaz, L.A., Jr., Sjoblom, T., Barad, O., Bentwich, Z., Szafranska, A.E., Labourier, E., Raymond, C.K., Roberts, B.S., Juhl, H., Kinzler, K.W., Vogelstein, B., and Velculescu, V.E. (2006). The colorectal microRNAome. Proc. Natl. Acad. Sci. USA 103: 3687–3692.
Griffiths-Jones, S. (2007). Annotating noncoding RNA genes. Annu. Rev. Genomics Hum. Genet. 8: 279–298.
Halic, M., and Beckmann, R. (2005). The signal recognition particle and its interactions during protein targeting. Curr. Opin. Struct. Biol. 15: 116–125.
He, S., Su, H., Liu, C., Skogerbo, G., He, H., He, D., Zhu, X., Liu, T., Zhao, Y., and Chen, R. (2008). MicroRNA-encoding long non-coding RNAs. BMC Genomics 9: 236.
Huang, T.H., Zhu, M.J., Li, X.Y., and Zhao, S.H. (2008). Discovery of porcine microRNAs and profiling from skeletal muscle tissues during development. PLoS One 3: e3225.
Inagaki, S., Numata, K., Kondo, T., Tomita, M., Yasuda, K., Kanai, A., and Kageyama, Y. (2005). Identification and expression analysis of putative mRNA-like non-coding RNA in Drosophila. Genes Cells 10: 1163–1173.
Li, X., Zhu, Z., Mo, D., Wang, H., Yang, S., Zhao, S., and Li, K. (2007). Comparative molecular characterization of ADSS1 and ADSS2 genes in pig (Sus scrofa). Comp. Biochem. Physiol. B Biochem. Mol. Biol. 147: 271–277.
McDaneld, T.G., Smith, T.P., Doumit, M.E., Miles, J.R., Coutinho, L.L., Sonstegard, T.S., Matukumalli, L.K., Nonneman, D.J., and Wiedmann, R.T. (2009). MicroRNA transcriptome profiles during swine skeletal muscle development. BMC Genomics 10: 77.
Nagano, T., Mitchell, J.A., Sanz, L.A., Pauler, F.M., Ferguson-Smith, A.C., Feil, R., and Fraser, P. (2008). The Air noncoding RNA epigenetically silences transcription by targeting G9a to chromatin. Science 322: 1717–1720.
Redrup, L., Branco, M.R., Perdeaux, E.R., Krueger, C., Lewis, A., Santos, F., Nagano, T., Cobb, B.S., Fraser, P., and Reik, W. (2009). The long noncoding RNA Kcnq1ot1 organizes a lineage-specific nuclear domain for epigenetic gene silencing. Development 136: 525–530.
Ren, H., Li, Y., Tang, Z., Yang, S., Mu, Y., Cui, W., Ao, H., Du, L., Wang, L., and Li, K. (2009). Genomic structure, chromosomal localization and expression profile of a porcine long non-coding RNA isolated from long SAGE libraries. Anim. Genet. 40: 499–508.
Schoenfelder, S., Smits, G., Fraser, P., Reik, W., and Paro, R. (2007). Non-coding transcripts in the H19 imprinting control region mediate gene silencing in transgenic Drosophila. EMBO Rep. 8: 1068–1073.
Watanabe, T., Sato, T., Amano, T., Kawamura, Y., Kawamura, N., Kawaguchi, H., Yamashita, N., Kurihara, H., and Nakaoka, T. (2008). Dnm3os, a non-coding RNA, is required for normal growth and skeletal development in mice. Dev. Dyn. 237: 3738–3748.
Wild, K., Halic, M., Sinning, I., and Beckmann, R. (2004). SRP meets the ribosome. Nat. Struct. Mol. Biol. 11: 1049–1053.
Willingham, A.T., Orth, A.P., Batalov, S., Peters, E.C., Wen, B.G., Aza-Blanc, P., Hogenesch, J.B., and Schultz, P.G. (2005). A strategy for probing the function of noncoding RNAs finds a repressor of NFAT. Science 309: 1570–1573.
Young, T.L., Matsuda, T., and Cepko, C.L. (2005). The noncoding RNA taurine upregulated gene 1 is required for differentiation of the murine retina. Curr. Biol. 15: 501–512.
Zuker, M. (2003). Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31: 3406–3415.