BmSE, a SINE family with 3′ ends of (ATTT) repeats in domesticated silkworm (Bombyx mori)


JOURNAL OF GENETICS AND GENOMICS
J. Genet. Genomics 37 (2010) 125−135
www.jgenetgenomics.org

Jinshan Xu a, b, Tie Liu c, Dong Li c, Ze Zhang c, Qinyou Xia c, Zeyang Zhou a, b, c, *

a Laboratory of Animal Biology, Chongqing Normal University, Chongqing 400047, China
b Engineering Research Center of Bioactive Substances, Chongqing Normal University, Chongqing 400047, China
c Institute of Sericulture and System Biology, Southwest University, Chongqing 400716, China

Received for publication 27 July 2009; revised 18 January 2010; accepted 19 January 2010

Abstract

Short interspersed elements (SINEs), mainly represented by Bm1, are abundant in the domesticated silkworm. A novel 294 bp SINE family, designated BmSE, was identified by mining the complete Bombyx mori genome sequence. A representative BmSE element has a poly(A) tail at the 3′ end, is flanked by an 11 bp target site duplication that follows the poly(A) tail, and carries sequence motifs of an internal RNA polymerase III promoter similar to those of Bm1. BmSE copies are widely distributed across all 28 chromosomes of the genome and share (ATTT) repeats at their 3′ ends. GC-content analysis shows that BmSE tends to accumulate in regions of higher AT content than Bm1 does. A high proportion of BmSEs map to introns of coding sequences, and several elements are also present in the UTRs of some transcripts, indicating that BmSEs can be exonized within UTRs. Of the 615 structural variants (SVs) of BmSE identified among 40 domesticated and wild silkworms, 230 were found only in domesticated silkworms, indicating that many recent SV events of BmSE occurred after domestication, probably through mobilization of the element. Our analysis might assist in developing BmSE as a potential marker and in understanding the evolutionary roles of SINEs in the domesticated silkworm.

Keywords: domesticated silkworm; SINE; distribution; structural variant

* Corresponding author. Tel: +86-23-6530 5293; Fax: +86-23-6825 1128. E-mail address: [email protected] DOI: 10.1016/S1673-8527(09)60031-X

Introduction

Retrotransposons are mobile elements that replicate via an RNA intermediate, and they are the most widespread and abundant class of eukaryotic transposable elements. They are usually classified into several groups, including LTR retrotransposons, long interspersed elements (LINEs) and short interspersed elements (SINEs). SINEs are repetitive retrotransposon sequences with a length of 70–500 bp. Most families are derived from tRNA genes, while several, like the primate Alu family, are derived from 7SL RNA (Weiner, 1980; Ullu and Tschudi, 1984; Okada, 1991; Schmid and Maraia, 1992; Ohshima and Okada, 2005). Unlike other transposable elements, tRNA-derived SINEs commonly have both the conserved RNA polymerase III-specific internal promoter (boxes A and B) and a tRNA-like secondary structure. Moreover, these SINEs have a composite structure made up of a tRNA-related region and a tRNA-unrelated region, and are therefore not simple tRNA pseudogenes. Although SINEs appear to be mobilized via retrotransposition, they do not


encode potential proteins and represent non-autonomous transposable elements. SINEs usually share common 3′ regions or poly(A) tail structures with LINEs, indicating that retrotransposition of SINEs might require the enzymatic machinery derived from LINEs (Hasan et al., 1984; Okada et al., 1997; Nakajima et al., 1999; Ogiwara et al., 1999). Subsequently, several lines of evidence have shown that SINEs, after transcription by RNA polymerase III, can recruit the enzymes encoded by LINEs for their mobilization (Kajikawa and Okada, 2002; Dewannieux et al., 2003). Recent studies of structural motifs in most SINE RNAs from mammals, fishes and plants also suggest that common selective constraints are imposed at the level of SINE RNA structure (Sun et al., 2006).

SINEs are known in many higher eukaryotes, including plants, vertebrates, and invertebrates (Kramerov and Vassetzky, 2005). One of the most abundant SINE families, Alu, is widely dispersed in the human genome, where it has no fewer than one million copies and occupies 10% of the genome (Lander et al., 2001), implying that SINEs are important components of animal genomes. Few SINE families of the domesticated silkworm have been characterized and investigated, and the principal member is Bm1. The 439 bp consensus sequence of the Bm1 SINE consists of a 5′ tRNA-related region followed by a non-tRNA region, and the 5′ end of the consensus sequence corresponds to the initiation site for Pol III-directed transcription, which is thought to produce the Bm1 RNA intermediates required for retrotransposition (Adams et al., 1986). Further studies have shown that the level of the Bm1 transcript increases in response to heat shock, inhibition of protein synthesis by cycloheximide, or viral infection, indicating that Bm1 plays a role in the cell stress response (Kimura et al., 1999).
Most SINEs are specific to an order, family, or genus, which suggests that each SINE family arose in a common ancestor of the corresponding lineage during evolution; Alu, for example, is present only in primate genomes (Kido et al., 1991; Takasaki et al., 1994; Shedlock and Okada, 2000). Little evidence of horizontal transfer of SINEs has been found, except for SINEs in salmonid species (Hamada et al., 1997). The recent release of the complete genomic sequence of the domesticated silkworm, with a haploid content of 432 Mb (Xia et al., 2008), offers a good opportunity to study comprehensively the structure and distribution of more SINE families in the genome. In this study, we identified a tRNA-derived SINE family, named BmSE, which might be related to an R1 clade non-LTR retrotransposon from Bombyx mori. We investigated the genomic structural features of BmSEs, such as distribution bias and associations with putative genes, and we also identified structural variants (SVs) of this element among 40 silkmoth varieties (Xia et al., 2009). To date, apart from Bm1, no SINE family with an intact sequence structure has been reported in Bombyx mori, and this is also the first time that SVs of a SINE have been used to analyze genetic variation among silkmoths. Therefore, our studies might assist in understanding the roles of retroelements in silkmoth genome evolution, and BmSE might also have potential value as a molecular marker.

Materials and methods

Sequence analyses

The latest genome and cDNA data for the domesticated silkworm were obtained, along with gene annotations, from the GenBank genome database or the silkDB and KAIKObase databases (http://silkworm.swu.edu.cn/silkdb; http://sgp.dna.affrc.go.jp/KAIKObase/). Several procedures were used to isolate and characterize novel SINE-like elements. First, repetitive elements in the genome were identified with the RePS program (Wang et al., 2002) to build a database of B. mori repetitive elements, and elements with a length of 70–500 bp were retained as potential candidates. Then, the candidates were searched for a tRNA cloverleaf secondary structure, which was constructed with tRNAscan-SE (Lowe and Eddy, 1997) and the DNAsis program (Hitachi Software Engineering Co., Yokohama, Japan). Furthermore, the A and B boxes of the RNA polymerase III promoter were labeled manually. Lastly, only elements that contained both a tRNA-like sequence and the A and B boxes of the promoter were considered SINE-like elements. The RepeatMasker program (Smit et al., 1996−2004) was run under the sensitive setting, with the conserved sequence of the target elements as a library, to identify all full-length and partial copies in the genome; all high-scoring pairs (HSPs) longer than 50 nucleotides and with identities higher than 80% were used to define the copies of a family. The distribution and density of SINE elements were then surveyed based on all the copies identified above. The longest 1,000


copies of each SINE family were aligned using ClustalW (Thompson et al., 1994). Sequence variation was estimated from this alignment using Phylip (Felsenstein, 1989) as the pairwise differences between copies.

The retrotransposition of SINEs depends on autonomous partner LINEs, which encode a reverse transcriptase (RT) and an endonuclease (EN) for their own amplification (Eickbush, 1992; Luan et al., 1993). There is ample evidence that RTs encoded by LINE partners recognize the matching 3′ tails of SINEs and thus initiate SINE replication via retrotransposition in trans (Ohshima et al., 1996; Okada et al., 1997; Kajikawa and Okada, 2002; Kajikawa and Okada, 2005; Piskurek et al., 2006). To explore the possible retrotransposition mechanism of BmSE, we examined the B. mori genome to see whether BmSE shares an identical 3′ tail sequence with certain LINEs. LINEs of B. mori were obtained by two methods: 1) downloading the BmTELib data from http://sgp.dna.affrc.go.jp/KAIKObase/; 2) identifying potential elements with the MGEScan-non-LTR program (Rho and Tang, 2009). BLASTN searches of all identified LINEs were then carried out using the BmSE sequence as the query.
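The estimate of sequence variation as pairwise differences between aligned copies can be illustrated with a small sketch. It assumes an uncorrected p-distance (the proportion of differing sites), which may not be the distance model actually used in Phylip; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap ('-') or an N are skipped.
    Assumption: uncorrected p-distance, not a substitution-model distance.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else float("nan")


def pairwise_distances(aligned: dict) -> dict:
    """Return {(name_i, name_j): distance} for every unordered pair of copies."""
    names = sorted(aligned)
    return {
        (i, j): p_distance(aligned[i], aligned[j])
        for i, j in combinations(names, 2)
    }


# Hypothetical toy alignment of three copies (not real BmSE sequences).
toy = {
    "copy1": "ACGTACGTAC",
    "copy2": "ACGTACGTAC",
    "copy3": "ACGAACG-AC",
}
dist = pairwise_distances(toy)
```

A histogram of the values in `dist`, computed for the 1,000 longest copies of each family, would give a frequency distribution of pairwise distances of the kind the paper reports.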

GC-bias analysis

For the analysis of the GC-content distribution of the elements, consecutive intervals of 2 kb were chosen because SINE elements are relatively short. The GC content of each 2 kb segment that included SINEs was then computed; the number of SINE elements contained in a segment was not taken into account. To survey the association of SINEs with genes in the B. mori genome, the putative gene annotation data sets were collected and all genes were searched with BLAST for homology to the SINE family, using an E-value cutoff of 10⁻¹⁰.
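The threshold-based selection of hits described in the methods (an E-value cutoff for the gene searches, and length and identity cutoffs for defining copies) can be sketched as a simple filter. The sketch assumes BLAST tabular output with the standard 12 tab-separated columns (query, subject, % identity, alignment length, ..., E-value, bit score); the paper does not state which output format was used, and the hit lines below are invented for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity
    length: int      # alignment length
    evalue: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column tab-separated BLAST hits, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]))


def keep(hit: Hit, max_evalue: float = 1e-10,
         min_length: int = 0, min_identity: float = 0.0) -> bool:
    """True if the hit passes the E-value, length and identity thresholds."""
    return (hit.evalue <= max_evalue
            and hit.length > min_length
            and hit.identity > min_identity)


# Hypothetical hit lines (not real search results).
TOY = [
    "gene1\tBmSE\t92.50\t210\t12\t2\t1\t210\t40\t249\t3e-25\t180",
    "gene2\tBmSE\t85.00\t60\t9\t0\t5\t64\t100\t159\t1e-05\t50.1",
    "gene3\tBm1\t78.00\t45\t10\t0\t1\t45\t10\t54\t2e-12\t70.0",
]

by_evalue = [h.query for h in parse_tabular(TOY) if keep(h)]
strict = [h.query for h in parse_tabular(TOY)
          if keep(h, min_length=50, min_identity=80.0)]
```

Here `by_evalue` applies only the E-value cutoff, while `strict` additionally requires more than 50 aligned nucleotides and more than 80% identity, mirroring the criteria used to define copies of a family.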

Detection of structural variants (SVs)

As indicated by Xia et al. (2009), SVs in silkworms comprise four main types (deletions, duplications, insertions and other complex SVs), and the great majority are deletions. To detect the SVs of BmSE, the genomic positions of the 2,604 complete copies in the 432 Mb B. mori genomic sequence were obtained with the RepeatMasker program, and the SVs at each position were then detected in the 40 recently published silkworm population genomes using paired-end (PE) sequencing data and a three-step strategy (Xia et al., 2009).

Results

Characterization of BmSE from the domesticated silkworm genome

The novel SINE-like element (BmSE), with a length of 294 bp, was characterized in the domesticated silkworm. Representative BmSEs possess the A and B boxes of the RNA polymerase III promoter, which are highly similar to those of Bm1. Moreover, BmSEs have poly(A) tails at their 3′ ends and a distinct target site duplication (TSD) (Fig. 1A). The tRNA search program of DNAsis revealed a degenerate cloverleaf tRNA secondary structure in BmSE: of a total of 21 nucleotide pairs within the four stem regions, 9 pairs were matched (Fig. 1B). When the same program was applied to Bm1 of the domesticated silkworm, the number of matching pairs in the stem regions was 12. Although the secondary structure of this tRNA-like region is not well conserved, several conserved and semiconserved nucleotides specific to tRNA are present at the positions corresponding to those in tRNA (gray shading in Fig. 1B). These results suggest that BmSE is a typical tRNA-derived SINE family and that it might be more degenerate than Bm1.

To determine the abundance of BmSE, the RepeatMasker program was run on the domesticated silkworm genome using the intact element as the library. The copy number was estimated to be 5,182, suggesting that BmSE is widely distributed in the haploid genome, even though this is far fewer than the approximately 150,000 copies estimated for Bm1. The comparison of sequence divergence revealed that BmSE copies are more highly conserved than those of Bm1 (Fig. 2). However, these elements usually carry varying numbers of (ATTT) repeats preceding the A stretch at the 3′ end, and one copy of BmSE even has more than 30 (ATTT) repeats in its tail (Fig. 3), which suggests that there may be several differentiated types of BmSE.
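The variable number of (ATTT) repeats preceding the A stretch can be tallied per copy with a short regular-expression sketch. The function name and the example sequences are hypothetical, and the sketch assumes perfect, uninterrupted ATTT units at the very 3′ end of the sequence, which real copies may not satisfy.

```python
import re

# A terminal run of ATTT units followed by an optional poly(A) stretch.
TAIL = re.compile(r"((?:ATTT)+)(A*)$")


def tail_structure(seq: str) -> tuple:
    """Return (number of terminal ATTT repeats, length of the 3' poly(A))."""
    seq = seq.upper()
    m = TAIL.search(seq)
    if m is None:
        # No terminal ATTT run: report only the trailing A stretch.
        return 0, len(seq) - len(seq.rstrip("A"))
    return len(m.group(1)) // 4, len(m.group(2))


# Hypothetical 3' ends (not real BmSE copies).
with_poly_a = tail_structure("GGCATTTATTTATTTAAAAA")
without_poly_a = tail_structure("GGCATTT")
no_repeat = tail_structure("GGCAAAA")
```

Applying such a function to every copy and counting the distinct repeat numbers would give a first indication of how many tail types exist, under the stated assumption of perfect repeats.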
It is notable that most BmSE copies have random target sites, like Bm1 and the LINEs previously reported in B. mori. We found no evidence that BmSE shares a common 3′ tail sequence with any LINE. One interesting result, however, is that the 146 bp tail sequence of BmSE is followed closely by the poly(A) of a predicted LINE (named SART-ZY, Fig. 4). The ORF1 and ORF2 of SART-ZY show about 50% and 80% identity, respectively, to those of B. mori BmSART1 (GenBank accession No. D85594) at the amino acid level, suggesting that the retrotransposition mechanism of BmSE may be related to the SART1 element.

Fig. 1. RNA secondary structure of two SINE-like elements, BmSE and Bm1. A: schematic representation of BmSE and Bm1. The positions of the A and B boxes are indicated by bold gray strips, and the nucleotides matching the consensus sequences of box A and box B among standard tRNAs (Galli et al., 1981) are also shaded. The nucleotides of the TSD are underlined. B: predicted cloverleaf tRNA secondary structure of the tRNA-related region of BmSE. The structures were obtained using DNAsis, and all conserved and semiconserved nucleotides are shaded in gray.

Fig. 2. Frequency distribution of pairwise distances for the families of BmSE and Bm1.

The distribution of BmSE sequences in the domesticated silkworm genome

In an initial effort to gain insight into the chromosomal distribution of the silkworm SINEs, we analyzed the density of the two SINE families and the GC content of their insertion regions. BmSE and Bm1 are widely distributed on each of the 28 B. mori chromosomes, and their densities are listed in Table 1. Notably, chromosomes with a relatively high density of Bm1 usually also have a high density of BmSE. For instance, on chromosome 2 the density of Bm1 is 472.14 copies/Mb and that of BmSE is 18.13 copies/Mb, both higher than on most other chromosomes, indicating that different SINE families might tend to cluster in the same regions. This clustered, non-random genomic distribution of SINEs in B. mori might reflect the host genome's defence against the deleterious effects of SINE insertions in certain regions.
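The observation that Bm1-dense chromosomes also tend to be BmSE-dense can be checked directly from the densities listed in Table 1. The sketch below computes a Pearson correlation coefficient; the choice of Pearson's r is an assumption made here for illustration and is not an analysis reported in the paper.

```python
from math import sqrt

# Densities (copies/Mb) for chromosomes 1-28, transcribed from Table 1.
BMSE = [6.98, 18.13, 10.15, 8.78, 8.49, 8.99, 12.71, 12.11, 10.75, 8.54,
        7.87, 10.93, 11.00, 14.13, 9.07, 14.23, 11.57, 10.16, 10.69, 17.06,
        13.07, 9.21, 7.67, 13.69, 12.16, 13.85, 17.72, 15.11]
BM1 = [215.81, 472.14, 304.60, 270.52, 278.85, 292.47, 358.68, 309.94,
       312.49, 272.81, 300.92, 295.56, 299.74, 358.58, 292.92, 351.80,
       323.92, 328.59, 323.97, 365.89, 324.36, 279.38, 340.31, 433.32,
       336.38, 408.09, 377.42, 457.72]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson(BMSE, BM1)
# Chromosome number (1-based) with the highest density of each family.
densest_bmse = BMSE.index(max(BMSE)) + 1
densest_bm1 = BM1.index(max(BM1)) + 1
```

A clearly positive `r` is consistent with the statement in the text, and both maxima fall on chromosome 2, the example the paper singles out.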


Fig. 3. Multiple alignment of the sequences of BmSE members and their flanking region. Seven members with poly (A) and four members without poly (A) are presented together with consensus. Dots indicate abbreviated sequences, and the rectangle indicates differential (ATTT) repeats.

Fig. 4. Comparison of the 3′ end regions of the BmSEs and SART-ZY LINEs. ORF1 and ORF2 represent the open reading frames that encode the endonuclease and reverse transcriptase enzymes. A multiple alignment of the common sequences at the 3′ ends of three partial LINE copies and BmSE was produced and shaded with Boxshade 3.21 (http://www.ch.embnet.org/software/BOX_form.html). Short lines indicate abbreviated sequences of LINEs; open reading frames and intergenic regions are not drawn to scale.


Table 1 Density of BmSE and Bm1 SINEs in the 28 completely sequenced chromosomes of domesticated silkworm

Chromosome   BmSE (copies/Mb)   Bm1 (copies/Mb)
1            6.98               215.81
2            18.13              472.14
3            10.15              304.60
4            8.78               270.52
5            8.49               278.85
6            8.99               292.47
7            12.71              358.68
8            12.11              309.94
9            10.75              312.49
10           8.54               272.81
11           7.87               300.92
12           10.93              295.56
13           11.00              299.74
14           14.13              358.58
15           9.07               292.92
16           14.23              351.80
17           11.57              323.92
18           10.16              328.59
19           10.69              323.97
20           17.06              365.89
21           13.07              324.36
22           9.21               279.38
23           7.67               340.31
24           13.69              433.32
25           12.16              336.38
26           13.85              408.09
27           17.72              377.42
28           15.11              457.72

To determine the GC-content distribution of the SINEs inserted in the domesticated silkworm genome, we analyzed the relative frequency distributions of GC content in 2 kb segments containing SINEs. GC content was calculated for each non-overlapping 2 kb segment of the genome. The GC content of the segments containing BmSE elements was lower than that of the segments containing Bm1 and that of the whole genome (Fig. 5). This suggests that BmSE has a distinct distribution in the genome and accumulates preferentially in regions of higher AT content.
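The windowed GC-content calculation can be sketched as follows. The function names and the toy sequence are hypothetical, and two details are assumptions made here: a trailing segment shorter than the window is dropped, and a SINE is assigned to the window containing its start coordinate. Each SINE-containing window is counted once, however many SINEs it holds, as stated in the methods.

```python
def gc_content(seq: str) -> float:
    """GC fraction of a sequence, ignoring characters other than A, C, G, T."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / acgt


def window_gc(chrom_seq: str, window: int = 2000) -> list:
    """GC content of consecutive non-overlapping windows.

    Assumption: a trailing partial window is dropped.
    """
    return [gc_content(chrom_seq[s:s + window])
            for s in range(0, len(chrom_seq) - window + 1, window)]


def sine_window_gc(chrom_seq: str, sine_starts, window: int = 2000) -> list:
    """GC content of the windows that contain at least one SINE start.

    Each window is reported once, regardless of how many SINEs it contains.
    """
    gcs = window_gc(chrom_seq, window)
    idx = sorted({s // window for s in sine_starts if s // window < len(gcs)})
    return [gcs[i] for i in idx]


# Toy example with a 4 bp window instead of 2 kb (hypothetical sequence).
toy_seq = "GGCC" + "ATAT" + "GCAT" + "AA"
all_windows = window_gc(toy_seq, window=4)
sine_windows = sine_window_gc(toy_seq, sine_starts=[0, 1, 9], window=4)
```

Binning the values returned by `sine_window_gc` for each family, and those of `window_gc` for the whole genome, would give relative frequency distributions of the kind compared in the text.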

Fig. 5. Relative frequency distribution of GC content in non-overlapping 2 kb SINE-containing segments for the two SINE families, BmSE and Bm1. The X axis represents the GC content, and the Y axis represents the percentage of SINE-containing segments with that GC content among all SINE-containing segments.

The effect of BmSE sequences on the putative genes

We examined the associations of the SINEs with putative genes in the domesticated silkworm genome using the BGI gene annotation set. The SINEs were compared by BLAST with the complete genes (containing no UTR regions) to determine the positions of the elements. The results showed that a total of 19.1% (29,706/155,401) of SINE sequences lie within coding sequence (CDS) introns (Table 2), whereas introns make up only 14.8% of the whole genome, suggesting that SINEs tend to be inserted in introns. No copies of BmSE or Bm1 are located in or across CDS exons, which means that they are hardly associated with CDS exons. It has been reported that the exonization level of Alu, a primate-specific SINE, is significantly higher than that of other transposable elements (TEs) within the human genome (Sela et al., 2007); our results indicate that, in contrast, SINEs of B. mori are rarely recruited into CDS exons.

Furthermore, BmSE-related transcripts were analyzed by searching the cDNA database containing 81,635 reads, and 154 hits were detected for BmSE. To determine whether the exonization events of BmSE are located in UTRs, all 14,623 gene sequences of B. mori were extended by 2,000 bp upstream of the ATG initiation codon and downstream of the TAA termination codon to ensure that they contained the 5′ UTR and 3′ UTR regions, and they were subsequently compared by BLAST with the 154 BmSE-related EST sequences. The pairwise alignments showed that the 154 EST sequences generally matched regions adjacent to the ATG initiation codon or the TAA termination codon of the genes (Table 3), supporting the view that BmSE can be exonized within UTRs and might play a role in gene regulation. We further investigated the structures of these BmSE transcripts and found that their lengths range from 50 bp to 290 bp. Among them, 29 ESTs covered more than 90% of the complete BmSE sequence and 127 ESTs covered more than 60%. Some elements have a 5′ or 3′ deletion that resulted in the loss of the initiation site for Pol III or of the (ATTT) repeats, and others frequently display insertion-deletion polymorphisms within the interior sequence (Fig. 6), suggesting remarkable structural variation (SV) in BmSE transcripts.

Table 2 The effect of SINEs on the predicted domesticated silkworm genes

Family   Total copies   SINEs in intronic region   SINEs in CDS exonic region   ESTs matching SINEs
BmSE     5,182          805                        0                            136
Bm1      150,219        28,901                     0                            250
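The intronic proportions follow directly from the counts in Table 2 and can be reproduced with a few lines of arithmetic. The per-family fractions and the fold enrichment over the 14.8% intron share of the genome are derived here for illustration and are not values reported in the paper; the variable names are hypothetical.

```python
# (total copies, copies in CDS introns), transcribed from Table 2.
COUNTS = {"BmSE": (5182, 805), "Bm1": (150219, 28901)}

# Fraction of the genome made up of introns, as stated in the text.
INTRON_GENOME_FRACTION = 0.148


def intron_fraction(total: int, intronic: int) -> float:
    """Fraction of copies that lie within CDS introns."""
    return intronic / total


total_copies = sum(t for t, _ in COUNTS.values())
total_intronic = sum(i for _, i in COUNTS.values())

combined = intron_fraction(total_copies, total_intronic)
per_family = {name: intron_fraction(t, i) for name, (t, i) in COUNTS.items()}

# How much more often SINEs fall in introns than expected from intron length
# alone, assuming insertions were uniform along the genome.
fold_enrichment = combined / INTRON_GENOME_FRACTION
```

The combined counts reproduce the 29,706/155,401 (19.1%) quoted in the text, and the per-family values suggest that the intronic share of BmSE is somewhat lower than that of Bm1.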

Table 3 The location of representative BmSE-related ESTs in the extended regions of the domesticated silkworm genes

                        Upstream of ATG     Downstream of TAA
EST code name           Start     End       Start     End       Gene code name
rswla0_012655.y1.abd    −720      −38       N         N         BGIBMGA006079-TA
rswga0_001115.y1.abd    −874      −372      N         N         BGIBMGA003543-TA
rswla0_021238.y1.abd    −1310     −811      N         N         BGIBMGA001213-TA
rswhb0_001169.y1.abd    −1762     −1302     N         N         BGIBMGA014178-TA
rswhb0_001025.y1.abd    −537      −212      N         N         BGIBMGA003257-TA
rswea0_006104.y1.abd    N         N         +54       +334      BGIBMGA009330-TA
rswgb0_004162.y1.abd    N         N         +649      +1264     BGIBMGA014202-TA
rswab0_001081.y1.abd    N         N         +170      +789      BGIBMGA004776-TA
rswpb0_006064.y1.abd    N         N         +209      +795      BGIBMGA004776-TA
rswpb0_003168.y1.abd    N         N         +545      +1067     BGIBMGA012038-TA
rswea0_008948.y1.abd    N         N         +1072     +1729     BGIBMGA001107-TA
rswfa0_000619.y1.abd    N         N         +931      +1268     BGIBMGA013868-TA
rswpb0_007275.y1.abd    N         N         +968      +1583     BGIBMGA014202-TA
rswea0_001025.y1.abd    N         N         +801      +1123     BGIBMGA002256-TA
rswla0_020150.y1.abd    N         N         +724      +1154     BGIBMGA012646-TA
rswla0_016617.y1.abd    N         N         +724      +1128     BGIBMGA012646-TA
rswjb0_005010.y1.abd    N         N         +1078     +1468     BGIBMGA009102-TA

N: no matches. EST code names and gene code names are derived from the silkDB database (http://silkworm.swu.edu.cn/silkdb).


Fig. 6. Multiple alignment of representative BmSE transcripts. Each sequence name denotes the original code in the ESTs database of the domesticated silkworm in silkDB database (http://silkworm.swu.edu.cn/silkdb).

The SVs of BmSE elements in silkworms

Recent studies based on 40 silkworm genomes have shown that over three-fourths of the SVs overlap with TEs, suggesting that SV events in the silkworm are probably due to TE content and mobility (Xia et al., 2009). It is therefore interesting to see whether SVs of BmSE contribute to genetic variation among silkworms. A total of 615 SVs of BmSE were identified among the 40 silkworm varieties (Fig. 7), all of which belong to the deletion type. Only 35 SVs found in wild silkworms are not shared with domesticated silkworms, whereas 230 SVs exist only in domesticated silkworms. Because archaeological and genetic evidence indicates that domesticated silkworms originated from the Chinese wild silkworm, this suggests that almost all the old SV events of BmSE from wild silkworms were retained and that many recent SV events occurred after domestication. Further examination of the 230 SVs among the different domesticated varieties shows that three SVs and one SV occur only in Japanese and European domesticated strains, respectively. It can be inferred that this may be due to the mobilization of BmSE under different geographical conditions of domestication.

Fig. 7. Clusters of BmSE SVs in three silkworm groups. W (red): 11 Chinese wild varieties; D1 (blue): 14 Chinese, 3 European, 3 tropical and 4 mutant domesticated silkworms; D2 (green): 5 Japanese domesticated silkworms. Arabic numerals give the number of SVs in each cluster. The forty silkworm samples and their detailed traits are described in recently published data (Xia et al., 2009).
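The partition of SVs into wild-only, domesticated-only and shared classes amounts to simple set operations on per-group SV calls. The sketch below uses invented SV identifiers and a hypothetical function name; the group labels W, D1 and D2 follow Fig. 7. The final line derives the number of shared SVs from the reported counts, assuming that every one of the 615 SVs falls into exactly one of the three classes, which the paper does not state explicitly.

```python
def partition_svs(groups: dict, wild_key: str = "W") -> dict:
    """Split SV identifiers into wild-only, domesticated-only and shared sets.

    `groups` maps a group label to the set of SVs detected in that group;
    every group other than `wild_key` is treated as domesticated.
    """
    wild = groups[wild_key]
    domesticated = set().union(
        *(svs for key, svs in groups.items() if key != wild_key)
    )
    return {
        "wild_only": wild - domesticated,
        "domesticated_only": domesticated - wild,
        "shared": wild & domesticated,
    }


# Hypothetical SV calls (identifiers invented for illustration).
toy = {
    "W": {"sv1", "sv2", "sv3"},
    "D1": {"sv2", "sv3", "sv4"},
    "D2": {"sv3", "sv5"},
}
classes = partition_svs(toy)

# Reported counts: 615 SVs in total, 35 wild-only, 230 domesticated-only.
# Under the stated assumption, the remainder is shared by both.
implied_shared = 615 - 35 - 230
```

Under that assumption, more than half of the BmSE SVs would be shared between wild and domesticated silkworms, in line with the statement that most old SV events were retained.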

Discussion

In this study, a novel SINE family named BmSE was identified in the domesticated silkworm, and its copy number was estimated to be 5,182. Interestingly, most BmSE copies are highly conserved but differ at the 3′ end in the number of (ATTT) repeats. The role of the common (ATTT) repeat is not clear, but the OsSN elements from rice were reported to share a TTCTC sequence in the 3′ region preceding the A stretch, and this sequence was assumed to be important for retroposition (Tsuchimoto et al., 2008). A detailed examination of the 28 chromosomes revealed that the SINEs are distributed widely in the genome, with slight variation in copy number among individual chromosomes. The analysis of the GC content of insertion regions indicated that the SINE elements accumulate preferentially in genomic regions of higher AT content, which distinguishes them from the human Alu SINE, which is more abundant in GC-rich DNA (Wichman et al., 1992). Additional evidence is needed to explain the biological mechanism behind the AT bias of the BmSE distribution; however, the highly repetitive (ATTT) repeats at the 3′ end may be a clue.

SINEs are a major component of the genomes of higher organisms, and they have also been recognized as beneficial to the host genome. For several animal SINEs, polyadenylation has been used for the generation of functional proteins, such as the γ-subunit of muscle phosphorylase kinase and the leukemia inhibitory factor receptor (Maichele et al., 1993). In the domesticated silkworm, the Bm1 SINE has been shown to be related to the cell stress response (Kimura et al., 1999). In this study, we found that although BmSE and Bm1 are not associated with CDS exons of putative genes in the domesticated silkworm, some copies are exonized within UTR regions, suggesting that SINE elements may play a significant role in gene regulation.


Previous studies have usually used SNP data to analyze species phylogeny. However, because there are so many copies of BmSE in the B. mori genome and the 40 silkworm genomes were sequenced at only approximately three-fold coverage, it is difficult to identify SNPs accurately in the different varieties. Therefore, we analyzed, for the first time, the SVs of BmSE copies from the 40 varieties to assess genetic variation. The results show that this element is distributed widely in various silkmoths, that most deletion variants of BmSE are shared by different silkworm systems, and that some novel variants arose during the evolutionary process of domestication. Thus, BmSE has potential use as a molecular marker in the silkmoth. The analysis of BmSE SVs might also provide a novel approach to analyzing the genetic variation of the domesticated silkworm.

Acknowledgements This work was supported by the Natural Science Foundation Project of CQ CSTC (No. 2009BB1241), Ministry of Science and Technology of China (No. 2006AA10A117 and 2005CB121003). We thank three anonymous referees for providing valuable comments.

References

Adams, D.S., Eickbush, T.H., Herrera, R.J., and Lizardi, P.M. (1986). A highly reiterated family of transcribed oligo(A)-terminated, interspersed DNA elements in the genome of Bombyx mori. J. Mol. Biol. 187: 465−478.
Dewannieux, M., Esnault, C., and Heidmann, T. (2003). LINE-mediated retrotransposition of marked Alu sequences. Nat. Genet. 35: 41−48.
Eickbush, T.H. (1992). Transposing without ends: the non-LTR retrotransposable elements. New Biol. 4: 430−440.
Felsenstein, J. (1989). PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5: 164−166.
Galli, G., Hofstetter, H., and Birnstiel, M.L. (1981). Two conserved sequence blocks within eukaryotic tRNA genes are major promoter elements. Nature 294: 626−631.
Hamada, M., Kido, Y., Himberg, M., Reist, J.D., Ying, C., Hasegawa, M., and Okada, N. (1997). A newly isolated family of short interspersed repetitive elements (SINEs) in coregonid fishes (whitefish) with sequences that are almost identical to those of the SmaI family of repeats: possible evidence for the horizontal transfer of SINEs. Genetics 146: 355−367.
Hasan, G., Turner, M.J., and Cordingley, J.S. (1984). Complete nucleotide sequence of an unusual mobile element from Trypanosoma brucei. Cell 37: 333−341.
Kajikawa, M., and Okada, N. (2002). LINEs mobilize SINEs in the eel through a shared 3′ sequence. Cell 111: 433−444.
Kajikawa, M., and Okada, N. (2005). Isolation and characterization of active LINE and SINEs from the eel. Mol. Biol. Evol. 22: 673−682.
Kido, Y., Aono, M., Yamaki, T., Matsumoto, K., Murata, S., Saneyoshi, M., and Okada, N. (1991). Shaping and reshaping of salmonid genomes by amplification of tRNA-derived retroposons during evolution. Proc. Natl. Acad. Sci. USA 88: 2326−2330.
Kimura, R.H., Choudary, P.V., and Schmid, C.W. (1999). Silkworm Bm1 SINE RNA increases following cellular insults. Nucleic Acids Res. 27: 3380−3387.
Kramerov, D.A., and Vassetzky, N.S. (2005). Short retroposon in eukaryotic genome. Int. Rev. Cytol. 247: 165−221.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., and FitzHugh, W. (2001). Initial sequencing and analysis of the human genome. Nature 409: 860−921.
Lowe, T.M., and Eddy, S.R. (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25: 955−964.
Luan, D.D., Korman, M.H., Jakubczak, J.L., and Eickbush, T.H. (1993). Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell 72: 595−605.
Maichele, A.J., Farwell, N.J., and Chamberlain, J.S. (1993). A B2 repeat insertion generates alternate structures of the mouse muscle γ-phosphorylase kinase gene. Genomics 16: 139−149.
Nakajima, Y., Hashido, K., Tsuchida, K., Takada, N., Shiino, T., and Maekawa, H. (1999). A novel tripartite structure comprising a mariner-like element and two additional retrotransposons found in the Bombyx mori genome. J. Mol. Evol. 48: 577−585.
Ogiwara, I., Miya, M., Ohshima, K., and Okada, N. (1999). Retropositional parasitism of SINEs on LINEs: identification of SINEs and LINEs in elasmobranchs. Mol. Biol. Evol. 16: 1238−1250.
Ohshima, K., and Okada, N. (2005). SINEs and LINEs: symbionts of eukaryotic genomes with a common tail. Cytogenet. Genome Res. 110: 475−490.
Ohshima, K., Hamada, M., Terai, Y., and Okada, N. (1996). The 3′ ends of tRNA-derived short interspersed repetitive elements are derived from the 3′ ends of long interspersed repetitive elements. Mol. Cell Biol. 16: 3756−3764.
Okada, N. (1991). SINEs: short interspersed repeated elements of the eukaryotic genome. Trends Ecol. Evol. 6: 358−361.
Okada, N., Hamada, M., Ogiwara, I., and Ohshima, K. (1997). SINEs and LINEs share common 3′ sequences: a review. Gene 205: 229−243.
Piskurek, O., Austin, C.C., and Okada, N. (2006). Sauria SINEs: novel short interspersed retroposable elements that are widespread in reptile genomes. J. Mol. Evol. 62: 630−644.
Rho, M., and Tang, H. (2009). MGEScan-non-LTR: computational identification and classification of autonomous non-LTR retrotransposons in eukaryotic genomes. Nucleic Acids Res. 37: 1−12.
Schmid, C.W., and Maraia, R. (1992). Transcriptional regulation and transpositional selection of active SINE sequences. Curr. Opin. Genet. Dev. 2: 874−882.
Sela, N., Mersch, B., Gal-Mark, N., Lev-Maor, G., Hotz-Wagenblatt, A., and Ast, G. (2007). Comparative analysis of transposed elements' insertion within human and mouse genomes reveals Alu's unique role in shaping the human transcriptome. Genome Biol. 8: R127.1−R127.19.
Shedlock, A.M., and Okada, N. (2000). SINE insertions: powerful tools for molecular systematics. Bioessays 22: 148−160.
Smit, A.F.A., Hubley, R., and Green, P. (1996−2004). RepeatMasker Open-3.0 (http://www.repeatmasker.org).
Sun, F.J., Fleurdepine, S., Cecile, B.A., Gustavo, C.A., and Deragon, J.M. (2006). Common evolutionary trends for SINE RNA structures. Trends Genet. 23: 26−33.
Takasaki, N., Murata, S., Saitoh, M., Kobayashi, T., and Okada, N. (1994). Species-specific amplification of tRNA-derived short interspersed repetitive elements (SINEs) by retroposition: a process of parasitization of entire genomes during the evolution of salmonids. Proc. Natl. Acad. Sci. USA 91: 10153−10157.
Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673−4680.
Tsuchimoto, S., Hirao, Y., Ohtsubo, E., and Ohtsubo, H. (2008). New SINE families from rice, OsSN, with poly(A) at the 3′ ends. Genes Genet. Syst. 83: 227−236.
Ullu, E., and Tschudi, C. (1984). Alu sequences are processed 7SL RNA genes. Nature 312: 171−172.
Wang, J., Wong, G.K., Ni, P.X., Han, Y.J., Huang, X.G., and Zhang, J.G. (2002). RePS: a sequence assembler that masks exact repeats identified from the shotgun data. Genome Res. 12: 824−831.
Weiner, A.M. (1980). An abundant cytoplasmic 7S RNA is complementary to the dominant interspersed middle repetitive DNA sequence family in the human genome. Cell 22: 209−218.
Wichman, H.A., Van den Bussche, R.A., Hamilton, M.J., and Baker, R.J. (1992). Transposable elements and the evolution of genome organization in mammals. Genetica 86: 287−293.
Xia, Q., Wang, J., Zhou, Z., Li, R., Fan, W., Cheng, D., Cheng, T., Qin, J., Wang, J., Xiang, Z., Mita, K., Kasahara, M., Nakatani, Y., Yamamoto, K., Abe, H., Ahsan, B., Daimoni, T., Doi, K., Fujii, T., Shimada, T., and Morishita, S. (2008). The genome of a lepidopteran model insect, the silkworm Bombyx mori. Insect Biochem. Mol. Biol. 38: 1036−1045.
Xia, Q.Y., Guo, Y., Zhang, Z., Li, D., Xuan, Z.L., Li, Z., Dai, F., Li, Y.R., Cheng, D.J., Li, R.Q., Chen, T.C., Jiang, T., Becquet, C., Xu, X., Liu, C., Yang, H.M., Lu, C., Nielsen, R., Zhou, Z.Y., Wang, J., Xiang, Z.H., and Wang, J. (2009). Complete resequencing of 40 genomes reveals domestication events and genes in silkworm (Bombyx). Science 326: 433−436.