Short nucleotide sequences in herpesviral genomes identical to the human DNA



Journal of Theoretical Biology 372 (2015) 12–21

Contents lists available at ScienceDirect

Journal of Theoretical Biology journal homepage: www.elsevier.com/locate/yjtbi

Felix Filatov a,*, Alexander Shargunov b

a Department of Scientific and Clinic Viral Diagnostics, Hematology Research Center, Ministry of Public Health, Moscow, Russian Federation
b Laboratory of Bioinformatics, Mechnikov Research Institute of Vaccines and Sera, Russian Academy of Medical Sciences, Moscow, Russian Federation

HIGHLIGHTS

• The human herpesvirus genomes contain short sequences identical to human DNA.
• The authors located these sequences in the genes of each herpesvirus genome.
• These sequences may indicate viral genes targeted by cellular siRNA.
• One of these putative targets is the RL1 gene of herpes simplex virus type 1.

ARTICLE INFO

Article history: Received 17 November 2014; Received in revised form 8 February 2015; Accepted 17 February 2015; Available online 26 February 2015

ABSTRACT

In 2010, we described many similar DNA sequences in human and viral genomes, including herpesviral ones. The data suggested that these motifs may provide antiviral protection by pairing with a complementary potential target and destroying it catalytically, like small interfering RNA (siRNA). Because we analyzed these viruses as a group, two questions remained open: (1) the number of such motifs in the genomes of the various herpesvirus types, and (2) the distribution of these motifs within an individual viral genome. Here we searched only the herpesviral genomes for short (>20 nt) continuous sequences (hits) that are completely identical to sequences of human DNA. We found that different viral genes, and the genomes of different herpesviruses, contain different numbers of such hits. Assuming, as in the previous paper, that the density of these hits in viral genes is associated with the probability of being targets for cellular siRNA, we consider the genomic distribution of this density as a hypothetical targetome map of the human herpesviruses. We combined all nine types of herpesviruses into three groups according to the hit concentration in their genomes and found that the resulting order corresponds to the type of cellular pathology caused by the virus. We do not assert that this trend also holds for other human viruses or for viruses in general. As GenBank continues to grow, further research of this kind would be highly advisable. We also suggest that the high hit concentration found in the RL1 (ICP34.5) gene of herpes simplex virus type 1 (HSV1) makes this gene a likely target for putative cellular endogenous siRNA. Artificial blockade of RL1 confers oncolytic properties on HSV1, and we do not exclude the possibility that the part of the HSV1 population in humans whose RL1 is blocked in vivo may participate in early anti-cancer protection during reactivation of the virus from the latent state.
© 2015 Elsevier Ltd. All rights reserved.

Keywords: RNA interference; Small interfering RNA; Human herpesviruses; Oncolytic herpesviruses; Targetome

* Correspondence to: Laboratory for Viral Safety of the Blood and Blood Components Transfusion, Department of Scientific and Clinic Viral Diagnostics, Hematology Research Center, Ministry of Public Health, Novy Zykovsky Pr., 4a, Moscow 125167, Russian Federation. Tel.: +7 791 612 854 08. E-mail address: [email protected] (F. Filatov).

http://dx.doi.org/10.1016/j.jtbi.2015.02.019
0022-5193/© 2015 Elsevier Ltd. All rights reserved.

1. Background

Several years ago, we published a paper (Zabolotneva et al., 2010) on the detection of many short (21–30 nt) motifs common to viral and human DNA. That work was prompted by the rapid growth of studies of RNA interference (RNAi), which revealed a new and significant level of complexity in the relationships between biological systems (Hannon and Rossi, 2004; Qi et al., 2006). RNAi is involved in many biological processes, such as developmental timing, differentiation, apoptosis and antiviral defense (Jovanovic and Hengartner, 2006; Cullen, 2006; Skalsky and Cullen, 2010). Variants of RNAi are very diverse (Meister and Tuschl, 2004), and many of its mechanisms are yet to be discovered (Mello and Conte, 2004). A virus brings into the cell its own RNAi machinery and its own microRNAs (Kincaid and Sullivan, 2012; Grundhoff and Sullivan, 2011; Grey et al., 2008), some of which target its own genes (Umbach et al., 2008, 2009). It also brings a set of DNA motifs that may be targets for the host interfering RNA molecules (iRNAs), in other words, a targetome. Although this term belongs primarily to experimental molecular biology (Parnas et al., 2014), we use it here to denote the likely targets of iRNA.

In 2010 we found many similar DNA sequences in human and viral genomes, including herpesviral ones. The data suggested that these motifs may provide antiviral protection by pairing, like small interfering RNA (siRNA), with a complementary target and destroying it catalytically (Grundhoff and Sullivan, 2011; Umbach et al., 2008; Jeang, 2012). Because we analyzed these viruses as a group, two issues remained unclear: (1) the number of such motifs in herpesviruses of various types, and (2) their distribution within an individual viral genome.

The object of our study was the human herpesviruses (HHV). This taxonomic group includes nine types of pathogenic HHV, which cause a variety of acute and chronic infectious pathologies and are capable of persisting in host cells as a latent virus (free episomal DNA) (Minarovits, 2006). We attempted to find intra-genome and intra-group differences in motif distribution, setting aside assumptions about virus-host co-evolution. The linear HHV genome contains from 77 to 168 genes. These DNA viruses are much less sensitive to natural mutagenesis than RNA viruses. The data presented here result from the analysis of a small number of fully sequenced reference strains of wild-type herpesviruses from GenBank. Although this number is probably insufficient for statistical processing and the data are preliminary, they are worth attention and discussion. The format of the sequence data for HHV types 1–5 and 7–8 suited our study; the data for HHV types 6a and 6b were specially formatted (Tables 1 and 2). Translated genome regions, or exons (the exome), of a herpesvirus comprise the overwhelming majority ( 80%) of its DNA.
In the human genome, this value is very small, less than 1.5% (Lander et al., 2001). The rest of the human chromosomal DNA consists of a variety of transcribed (or non-transcribed) genetic elements, some of which give rise to iRNA. Here we analyze only viral exons and ORFs, since exons are the preferred targets for siRNA (Sontheimer and Carthew, 2005). We chose HHV1 (herpes simplex virus type 1, HSV1, the most studied HHV) as the object for analysis of the intra-genome distribution of the identical human/herpesvirus DNA motifs.

Another problem was the choice of approach. In the 2010 study, we analyzed short host/virus genome motifs with partially coinciding nucleotide sequences (BLAST hits). The disadvantage of counting BLAST hits is a large noise component that requires complex statistical processing. Here we preferred to count host/virus DNA segments that are identical to each other; below we call them human/herpesvirus hits, or simply hits. To search for and count the hits, we developed our own simple programs, which were verified for reliability. The computer algorithm is available in an open-access repository on Bitbucket.org. Hits attributed to the viral exome we conditionally consider as potential targets for putative cellular siRNA. Hits attributed to the human genome we consider, also conditionally, as a potential source of putative siRNA.

The number of hits in viral genomes is much smaller than the number of corresponding hits in the cellular (human) genome (Table 2). The same hit can be repeated in different chromosomes or within the same viral gene. Hits may appear as “isolated” DNA segments or as overlapping conglomerates (Table 3). To eliminate the inevitable dependence of the number of hits on the length of the viral gene, a more adequate index is the density of hits, that is, their number per fixed length of gene segment. However, to compensate for the rather vague definition of the term “hit” and the variable size of the search frame (≥20 nt), it is more reasonable to assess not the density of hits but the percent coverage by hit sequences of an individual viral gene (E, Eclipse) or of the whole viral genome (E0). These are the indexes used in this paper. Table 3 also illustrates the algorithm by which hits are formed from combinations of 20 nt DNA sequences that are identical in the compared genomes, as opposed to the partial homology of BLAST hits. The total number of human/herpesvirus hits seems sufficient for an initial evaluation of their distribution in herpesvirus genomes. Assuming that the density of hits in viral genes is associated with the probability of being targets for [putative] cellular siRNA, we consider the genomic distribution of this density as a potential targetome map of the human herpesvirus.

Two types of small RNA molecules, microRNA (miRNA) and small interfering RNA (siRNA), are central to RNA interference and are key gene regulators in eukaryotes. In the context of this study, we are interested only in siRNA. siRNAs consist of 21–28 nt and are fully complementary to the target (Sontheimer and Carthew, 2005).
The complementary unit (seed match) of miRNA is too short (6–7 nt) to be useful in identifying the relevant target (Kiyosawa et al., 2003; Bartel, 2004, 2009; Lewis et al., 2005; Birmingham et al., 2006; Leuschner et al., 2006; Ling et al., 2013). miRNAs and siRNAs have distinct requirements for mRNA target recognition. The base pairing between miRNAs and their targets is degenerate, whereas the pairing between siRNAs and their targets requires high fidelity, typically allowing no more than three mismatches (Leuschner et al., 2006; Watanabe et al., 2008). The localization of miRNAs cannot at present be predicted theoretically. Sontheimer and Carthew (2005) noted, “Despite the paucity of data on endogenous mammalian siRNA in the literature, there has been no reason to doubt that mammalian cells could autonomously generate endogenous siRNAs for their own purposes”. The history of siRNA research is significantly shorter than that of miRNA research (Seitz, 2009). Some results suggest that antisense transcripts contribute to endogenous siRNA in fully differentiated human cells (Werner et al., 2014; Jing Xia et al., 2012).

Endogenous siRNAs can arise from long dsRNA transcripts derived from repetitive or transposable elements (Okamura and Lai, 2008; Tam et al., 2008; Watanabe et al., 2008), which require Dicer, but not Drosha/Dgcr8, for processing (Babiarz et al., 2008, 2009). Endo-siRNAs have been described in murine embryonic stem cells (mESCs), in oocytes (Babiarz et al., 2008; Tam et al., 2008; Watanabe et al., 2008) and in murine embryonic skin cells (Yi et al., 2009). Both miRNAs and siRNAs exert their regulatory functions through association with the RNA-induced silencing complex (RISC), which contains an Argonaute (AGO) protein. Most miRNAs lower the expression levels of target genes by mRNA destabilization or translational repression (Cernilogar et al., 2011), whereas endo-siRNAs elicit direct cleavage of target mRNAs (Watanabe et al., 2008). Much, however, remains unclear. As Seitz (2009) wrote, “After the miRNA era (where so many functions have been ascribed to miRNAs, in so many physiological processes), we may be entering the ‘siRNA era’. How many biological pathways will involve siRNAs? As siRNAs can act catalytically, minute amounts of these novel regulators could have tremendous effects; undetected small RNAs may lie behind unexplained phenomena. Clearly, the exploration of small regulatory RNAs is not over”. We describe here the data used as a basis for the hypothesis presented below.

Table 1. Human herpesvirus (HHV) genome sequences analyzed in this study.

HHV   Sequence analyzed
1     ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_1_uid15217/
2     ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_2_uid15218/
3     ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_3_uid15198/
4     ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_4_uid14413/
4.2   ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_4_type_2_uid20959/
5     ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_5_uid14559/
6a    Formatted especially for this study
6b    Formatted especially for this study
7     ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_7_uid14625/
8     ftp://ftp.ncbi.nih.gov/genomes/viruses/human_herpesvirus_8_uid14158/

Table 2. Number of human/herpesvirus hits in the human and herpesvirus genomes.

HHV   Human herpesvirus                    Strain   NCBI accession number   Hits in virus   Hits in cell
1     Herpes simplex virus type 1, HSV1    17       GenBank: NC_001806      233             1151
2     Herpes simplex virus type 2, HSV2    HG52     GenBank: NC_001798      239             2171
3     Varicella zoster virus, VZV          Dumas    GenBank: NC_001348      270             401
4     Epstein–Barr virus, EBV              AG876    GenBank: NC_009334      417             858
4.2   Epstein–Barr virus type 2            B95-8    GenBank: NC_007605      352             618
5     Cytomegalovirus, CMV                 Merlin   GenBank: NC_006273      296             1680
6a    HHV6a                                U1102    GenBank: NC_001664      479             2009
6b    HHV6b                                229      GenBank: NC_000898      498             4438
7     HHV7                                 RK       GenBank: NC_001716      910             2233
8     Kaposi sarcoma herpesvirus, KSHV     GK18-p   GenBank: NC_009333      283             586

Table 3. Human/HSV1 hits in the RL1 (ICP34.5) gene.

5′-nucleotide positions in the RL1 gene and in human chromosomes are indicated in the columns to the right.
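The hit definition given in this section (identical segments assembled from 20 nt windows) and the percent coverage E can be sketched in Python as follows. This is only a minimal illustration under our own assumptions, not the authors' Bitbucket program: the names `seed_index`, `find_hits` and `percent_coverage` are hypothetical, only one strand is scanned, and the host sequence is assumed to fit in memory as a plain string.

```python
from typing import List, Set, Tuple

SEED = 20  # minimal identical segment length in nt, as stated in the text


def seed_index(host: str, k: int = SEED) -> Set[str]:
    """Collect every k-mer that occurs in the host sequence."""
    return {host[i:i + k] for i in range(len(host) - k + 1)}


def find_hits(virus: str, host_kmers: Set[str], k: int = SEED) -> List[Tuple[int, int]]:
    """Return hits as half-open (start, end) intervals on the viral sequence.

    Every viral k-mer found in the host is a seed; overlapping seeds are
    merged into one longer hit (an "overlapped conglomerate").
    """
    hits: List[Tuple[int, int]] = []
    for i in range(len(virus) - k + 1):
        if virus[i:i + k] in host_kmers:
            if hits and i < hits[-1][1]:
                # Seed overlaps the previous hit: extend that hit.
                hits[-1] = (hits[-1][0], i + k)
            else:
                hits.append((i, i + k))
    return hits


def percent_coverage(hits: List[Tuple[int, int]], length: int) -> float:
    """Percent of a gene (E) or genome covered by hit sequences."""
    covered = sum(end - start for start, end in hits)
    return 100.0 * covered / length


if __name__ == "__main__":
    # Toy sequences (hypothetical): a 24 nt segment shared by host and virus.
    shared = "ATGCGTACGTTAGCCTAGGATCCA"
    host = "TTTT" + shared + "GGGG"
    virus = "CCCCC" + shared + "AAAAA"
    hits = find_hits(virus, seed_index(host))
    print(hits, percent_coverage(hits, len(virus)))
```

To scan the complementary chain as well, the same functions could be applied to the reverse complement of the viral sequence. Merging seeds into intervals keeps the coverage index independent of how many overlapping 20 nt windows make up one hit.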

2. Results

In accordance with the purpose of our study, we present two sets of data. The first set relates to HSV1 and helps in understanding the more general results for the whole HHV taxonomic group (the second set).

2.1. Allocation of the human/herpesvirus hits in the genome of the herpes simplex virus type 1

Fig. 1 shows the number of human/HSV1 hits (marked in yellow) in each viral gene (exon) alongside the size of the exon. This comparison reflects a certain (though not strict) dependence of the number of hits in a gene on the size of the gene. If the dependence were strong, and the number of hits marked the most likely siRNA targets, the longer genes (such as UL36) would be the most susceptible to siRNA, which is biologically unlikely.


Fig. 1. Comparison of the human/HSV1 hits distribution in the virus genes and the size of these genes. The absolute number of hits (left yellow column for each gene) and the gene size in nucleotides (right black column for each gene) are presented on a comparable scale. For comparison: the number of hits in the gene RL1 is 7, its length is 248 nt; the number of hits in the gene UL36 is 22, its length is 3139 nt. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 2. Percent coverage E of the HSV1 genes by the human/HSV1 hits. (A) Human/HSV1 hits in the sense and (B) in the anti-sense chains of the human DNA. Unique genes of the long (L) and the short (S) viral genome segments are marked with black. Genes RL1 and RL2 of the direct and inverted sequences of the L segment are marked with green and blue, respectively. RS1 genes of the direct and inverted sequences of the S segment are marked with red. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 2A shows the distribution of the percent coverage E in the HHV1 genome. The genes RL1 and RL2, and the gene RS1, are each present twice in the HSV1 genome, since they are located in mutually inverted sequences at the ends of the long (L) and short (S) segments of the genome, i.e. on opposite DNA strands. The value of E for the RL1 gene markedly exceeds that for the other HSV1 genes. Since the complementary chain of human DNA can also be a source of antiviral siRNA, we obtained a map of hits for this chain as well (Fig. 2B). It turned out that the same chain of human DNA contains short segments identical to both viral DNA strands, and that the levels of E for the exons (plus chain) and for the complementary sequences (minus chain) of human DNA are roughly equal and mostly coincide. The hits of the sense viral DNA chain may coincide with the anti-sense DNA hits. Interestingly, in the human genome these hits fall on different chromosomes, and in the gene RL1 they fall asymmetrically, on different numbers of chromosomes. One of the RL1 human/HSV1 hits (798–832 nt of the viral genome) falls on six human chromosomes. The same region in the anti-sense viral DNA chain (corresponding to 125,536–125,575 nt of the viral genome) falls on only one chromosome (Fig. 3). This asymmetry probably marks the sense viral DNA chain as the siRNA target. Short transcripts found in human tissues (EST database) coincide with the positions of some of the hits of the gene RL1 (Fig. 3), which to some extent supports this idea.

Fig. 3. Human/HSV1 hits in the sense and complementary chains of the HSV1 RL1 gene. Human/HSV1 hits are marked with green. Chromosome numbers corresponding to sense chain hits are shown above the upper sequence (RL1 gene); chromosome numbers corresponding to the anti-sense chain hits are shown under the lower sequence. Positions of hits corresponding to transcripts found in cells of certain human tissues (EST database) are marked with dark blue. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The total number of hits in this gene, in the sense that we define them here, is seven (Table 3). They vary in size and structure, and the percent coverage (E) of this gene is 25.57% (total length of hits is 190 nt; the gene size is 747 nt). The total percent coverage (E0) for HSV1 (strain 17) is 4.33% (total length of hits is 5,335 nt; the exome size is 123,090 nt).

Fig. 4 shows the differences between HHV1 strains in the hits of the RL1 gene for eight published sequences (http://www.ncbi.nlm.nih.gov/; Watson et al., 2012). These differences include a few insertions and deletions of different sizes and a few single-nucleotide substitutions at preferred loci. Most of these strain-dependent substitutions fall in the space between the hits. A deletion within a human/herpesvirus hit destroys the hit but at the same time shortens the gene. An insertion within a human/herpesvirus hit acts in the opposite direction: it also destroys the hit but lengthens the gene. As a result, the value of E does not change too drastically in either case. Thus, these deletions and insertions mitigate changes in E, whose value for the HSV1 RL1 gene ranges from 15 to 30%; RL1 retains its predominance over its “neighbors” even at the minimal E. The count of identical human/herpesvirus hits is, by definition, extremely sensitive even to single-nucleotide substitutions. In reality, however, a small deviation from 100% identity (1–2 nucleotides) is quite possible for a siRNA target, and allowing for it restores the hit. Introns and intergenic regions of the human genome make the largest contribution to the formation of the human/HSV1 hits (not shown). The human mitochondrial DNA does not contain human/HSV1 hits at all.

Fig. 4. Partial sequence of the HSV1 RL1 gene containing all seven of its hits (green) in eight HSV1 strains. Deletions (dashes), insertions and point mutations (red) are shown. Nucleotide numbering corresponds to HSV1 reference strain 17. 1, strain 17 (E = 25.57%); 2, isolate RE (E = 22.39%); 3, strain KOS (E = 15.69%); 4, strain H129 (E = 21.72%); 5, strain F (E = 25.57%); 6, strain SLP (E = 24.07%); 7, strain 777 (E = 25.82%); 8, strain 775 (E = 30.19%). Strains SLP, 775 and 777 were cultivated in vitro for a long time for selection of certain features (e.g. plaque size; Watson et al., 2012). H129 is a low-passage clinical isolate (Szpara et al., 2010). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

2.2. Allocation of the human/herpesvirus hits and the percent hit coverage E0 for the genomes of other human herpesviruses

In this section, we compare the genomic distribution of human/HHV hits in other human herpesviruses. It is sufficient to show here five of the remaining eight types. Fig. 5 shows the genome distribution of hits in six herpesviruses: HSV1 and HSV2, EBV and EBV2, and HHV6A and HHV6B. The HSV1 and HSV2 maps show a certain similarity (Fig. 5A). This is not surprising, because both viruses belong to the same genus, simplexvirus, of the same subfamily, alpha-herpesviruses, and share from 60 to 80% of their genome sequences, depending on the genome region (Dolan et al., 1998; Aguilar et al., 2006). The maximum value of E for HSV2 (as for HSV1) falls on the genes RL1 and RL2, although in absolute figures it is somewhat higher for RL2 and somewhat lower for RL1 than for the corresponding genes of HSV1. None of the human/HSV1 RL1 hit sequences coincides with the sequence of a human/HSV2 RL1 hit. The percent coverage E0 for HSV2, strain HG52, is 4.51%. Incidentally, the RL1 sequences contain the most diverged regions of the two genomes (McGeoch et al., 1991).

The terminal and inverted repeats of the L segment of HHV3 (varicella zoster virus, VZV, strain Dumas, not shown here) are too short to contain any genes or open reading frames. E0 for HHV3 is 5.30%.

The genome distributions of human/EBV hits for its two subtypes (EBV and EBV type 2) are very similar and highlight the high values of E for the genes encoding the proteins EBNA1, BFRF3, BZLF2 and BBLF1 (Fig. 5B). The E0 values for EBV (strain B95-8) and EBV type 2 (strain AG876) are 7.18% and 6.69%, respectively. The same index for another representative of the gamma-herpesvirus subfamily, HHV8 (Kaposi sarcoma herpesvirus, KSHV, strain GK18, type P, not shown here), is 5.52%.

The genome distributions of human/herpesvirus hits for the two roseoloviruses, HHV6A (strain U1102) and HHV6B (strain 229), differ markedly (Fig. 5C), although their E0 values are very close, 8.41% and 8.32%, respectively. Recall that HHV6A and HHV6B have been classified as two different types of human herpesvirus since 2012, even though they have up to 95% genomic similarity (Harberts et al., 2011). Two peaks of E in the hypothetical targetome map of HHV6B attract attention. They fall on a very short ORF (putative gene L3 of 159 nt) in two short direct repeats in the genome of this virus, for which no expression product has been found so far.

The E0 value for HHV5 (cytomegalovirus, CMV, strain Merlin, not shown here) is the lowest among HHV, 3.42%. It should be noted that HHV5 DNA is very long (>235,000 bp) and its number of genes is also large (168). The terminal and inverted repeats of the DNA L segment of this virus contain more than 13 genes. The E0 value for HHV7 (strain RK, not shown here) is the highest among HHV, 17.58%. At the same time, HHV5 and HHV7 belong to the same subfamily, beta-herpesviruses, and HHV7 and both HHV6 types (which differ radically from it in E0) even belong to the same genus, roseoloviruses.
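As a worked check, the coverage values quoted in Section 2.1 for HSV1 strain 17 can be recomputed from the reported lengths. The symbols L_hits (total length of hits) and L_seq (length of the gene or exome) are our own notation, introduced only for this illustration.

```latex
E = 100 \cdot \frac{L_{\mathrm{hits}}}{L_{\mathrm{seq}}}, \qquad
E_{\mathrm{RL1}} = 100 \cdot \frac{190}{747} \approx 25.4\%, \qquad
E0_{\mathrm{HSV1}} = 100 \cdot \frac{5335}{123090} \approx 4.33\%
```

The genome-wide value reproduces the reported 4.33%. For RL1, the quoted lengths give about 25.4% rather than the reported 25.57%; a total hit length of 191 nt would give 25.57%, so the quoted 190 nt may be a rounded figure.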

3. Discussion

Human herpesviruses are a family of pathogens divided into three subfamilies: alphaherpesviruses (genera simplexvirus, S, and varicellovirus, V), betaherpesviruses (genera cytomegalovirus, C, and roseolovirus, R) and gammaherpesviruses (genera lymphocryptovirus, L, and rhadinovirus, R′) (Table 4). Each type of HHV is capable of causing latent, chronic and acute infections. HHV differ in the cells in which they preferentially persist (neurons for alpha-HV, nucleated blood cells for beta- and gamma-HV). They also differ in the cells preferred for active reproduction (alpha-HV prefer cells of ectodermal origin; beta- and gamma-HV prefer cells of mesodermal origin) and in the type of pathology (mainly destructive for alpha-HV and CMV; mainly proliferative, transforming, oncogenic for gamma-HV; and mainly subclinical for beta-HV). The cellular and general pathologies caused by the different types of herpesviruses are not strict features of each virus; Table 4 shows only the most striking features of each type, described elsewhere (Whitley et al., 2007).

Fig. 5. Comparison of the genomic distribution of hits in the three pairs of related herpesviruses: A, HSV1 vs HSV2; B, EBV vs EBV2; C, HHV6A vs HHV6B. Ordinate: percent coverage E. Abscissa: virus genes. (A) HSV1 vs HSV2: human/HSV1 hits, black; human/HSV2 hits, yellow. (B) EBV vs EBV type 2: human/HHV4 hits, black; human/HHV4 type 2 hits, yellow. (C) HHV6A vs HHV6B: human/HHV6A hits, black; human/HHV6B hits, yellow (see text). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 4. Comparison of basic properties of the human herpesviruses.

Alpha-HV differ from beta-HV and gamma-HV in their short reproduction period. Alpha-HV, CMV and KSHV are capable of multiplying in vitro in monolayer cell cultures (e.g., fibroblasts), whereas beta- and gamma-HV reproduce in suspension cultures of nucleated blood cells stimulated with PHA. In addition, HHV1, HHV2 and HHV5 belong to genome arrangement group E, HHV3 to group D, HHV6 (A and B) and HHV7 to group A, and HHV4 and HHV8 to group C (Table 4).

A very preliminary analysis of the data above shows that the increase in E0 seems to parallel changes in the type and extent of cell damage caused by the various types of herpesviruses, from destruction to proliferation and transformation, and on to less severe pathology (Fig. 6A). These changes, in turn, reflect the severity and type of the viral pathology in vivo. Despite the closeness of the E0 values in this series, they can be divided into three groups (Fig. 6B) with average values of E0. Thus, we have HHV that cause severe destructive damage to the brain, ganglia, skin and mucosa (red), HHV that can cause some types of lymphomas and sarcomas (beige), and HHV that generally cause non-severe symptoms, from skin roseola to chronic fatigue syndrome (light blue-gray). The number of herpesvirus genomes we analyzed does not currently allow a reliable estimate of the statistical validity of the data; the corresponding work will be continued. Fig. 6A also shows that the HHV series can be divided into two groups (separated by an interval) depending on the capacity of the virus to be cultivated in fibroblasts in vitro. The series also follows the genome structure types: E-E-E-D-C-C-C-A-A-A (Roizman et al., 2013).

Fig. 6. Human herpesvirus types ordered by increasing total percent coverage E0. Viruses that cause predominantly cellular (tissue) destruction are marked with red. Viruses that cause predominantly cellular proliferation (transformation, oncology) are marked with beige. Viruses that predominantly do not cause serious pathology (abortive, subclinical) are marked with light grey. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


We tried to organize herpesviruses into three groups with an average of E0 marked with red (group a), beige (group b) and grey (group c) in Fig. 6 in accordance with the type of pathology (Table 5). Another comment on our data above is that they describe the genomes of herpesviruses of unequal passage history (from clinical isolates to laboratory strains). Fig. 4 shows that this may have some value (but hardly decisive) in the context of our work. We plan to consider this in further work, as enrichment GenBank. As we supposed in our previous study (Zabolotneva et al., 2010) these hits “may serve for antiviral protection. We also presume that the evolutional success of some groups of genomic repeats may be due to their ability to provide antiviral RNA motifs to the host organism. Intense genomic repeat propagation into the genome would inevitably cause bidirectional transcription of these sequences, and the resulting double-stranded RNAs may be recognized and processed by the RNA interference enzymatic machinery. Provided that these processed target motifs may be complementary to viral transcripts, fixation of the repeats into the host genome may be of a considerable benefit to the host”. We believe that the more putative targets for the host siRNA are contained in viral genome, the more it is susceptible to the RNAi pressure, restraining the destruction of affected cells. It reflects the gradual complication of the relationship between virus and host, which does not necessarily lead to the death of one of them. We also believe that in the analysis of possible antivirus functions of hits their concentration in viral genes (genomes) is of the greater importance, than their total number. There is little evidence for functional endogenous siRNA pathways in mammalian cells (Cullen, 2014). We realize that it is difficult to understand the mechanisms of formation of precursors of these molecules in the nucleus, leaving it for the future. 
Nevertheless, the hits shown here are, we think, the most likely place for an experimental search for targets of antiviral RNAi. Metagenomics could be one of the approaches in experimental work of this kind. We believe that in mammals RNAi is not so much a key factor as a very important one in antiviral defense at the cellular level. This factor may be especially important in the reactivation of viral infection, which the intracellular molecular mechanisms are the first to resist. Interactions of siRNAs with their targets are “weak interactions”. They lead not to the suppression of virus replication but to a temporary blockage of individual viral genes, perhaps in a small proportion of the viral population. This makes the virus population more diverse and modifies the cellular targets of viral aggression. It is remarkable that the number of hits in the sense strand of the RL1 gene is considerably greater than in the complementary strand (15 to 5). For an individual hit, this ratio can be even higher: 6 to 1 (hit 4). Perhaps this marks the gene as a potential target for siRNA. At least, in the EST database we found four short transcripts (discovered in brain, eye and placenta tissues) whose positions coincide with the positions of some of the hits in the HHV1 RL1 gene. The HSV1 RL1 gene is a so-called dispensable gene (Chou et al., 1990), and its knockdown results not in suppression of virus replication but in modification of its properties (Chou et al., 1990). The product of this gene is the multifunctional late (gamma-1) protein ICP34.5, a neurovirulence factor that, inter alia, is responsible for the neuronal deposition of latent HSV1 (He et al., 1997; Jin et al., 2009). A virus lacking the ICP34.5 protein (e.g., through genetic engineering) acquires the property of preferential reproduction in rapidly proliferating undifferentiated cells and becomes oncolytic (Liu et al., 2003; Israyelyan et al., 2008; Campadelli-Fiume et al., 2012; Dambach et al., 2006). Such a herpesvirus (like other artificially obtained oncolytic viruses) has significant therapeutic prospects (Maldonado et al., 2010).
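The two quantities discussed above, the concentration of hits in a gene and the split of hits between the sense and complementary strands, can be illustrated with a minimal sketch. It is not the program used in this study: the function names, the toy sequences and the window length `k` are hypothetical, and a "hit" is simplified here to a window of `k` nucleotides that occurs identically somewhere in the host sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers(seq: str, k: int) -> set:
    """Collect every window of length k found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def strand_hits(gene: str, host: str, k: int):
    """Return start positions of k-windows identical to the host sequence.

    The first list scans the gene as given (sense strand); the second scans
    its reverse complement (complementary strand). This reading of "strand"
    is an assumption of the sketch.
    """
    host_index = kmers(host, k)

    def scan(s: str):
        return [i for i in range(len(s) - k + 1) if s[i:i + k] in host_index]

    return scan(gene), scan(reverse_complement(gene))


def hits_per_kb(n_hits: int, length: int) -> float:
    """Hit concentration: number of hits per 1000 nucleotides."""
    return 1000.0 * n_hits / length


if __name__ == "__main__":
    # Toy data only; real hits in the text are much longer than k = 6.
    host = "GATTACAGGCCTTAGC"
    gene = "TTTGATTACATTT"
    sense, complementary = strand_hits(gene, host, k=6)
    print("sense hits:", sense)
    print("complementary hits:", complementary)
    print("sense hits per kb:", hits_per_kb(len(sense), len(gene)))
```

Normalizing by gene length is what lets a short gene such as RL1 be compared with much longer genes; a raw count alone would favor long genes.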


F. Filatov, A. Shargunov / Journal of Theoretical Biology 372 (2015) 12–21

Table 5 Comparison of basic pathology properties of three groups of the human herpesviruses.

The numbers on the vertical axis are the values of E0 (two-tailed Welch’s t-test with multiple-comparison Sidak correction; GLM procedure in SAS 9.1). *p < 0.014. p < 0.069. †p < 0.365.
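For readers unfamiliar with the statistics named in this legend, the following minimal sketch shows the Welch t statistic with its Welch-Satterthwaite degrees of freedom, and the Sidak adjustment of a p-value for m comparisons. It does not reproduce the SAS analysis: the function names and the sample values are hypothetical, and taking m = 3 (three pairwise comparisons among groups a, b and c) is an assumption.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two samples are not assumed to share
    a common variance.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def sidak_adjust(p: float, m: int) -> float:
    """Sidak-adjusted p-value for one of m independent comparisons."""
    return 1.0 - (1.0 - p) ** m


if __name__ == "__main__":
    # Hypothetical E0 values for two groups of viruses (not data from the paper).
    group_a = [1.0, 2.0, 3.0]
    group_b = [2.0, 4.0, 6.0]
    t, df = welch_t(group_a, group_b)
    print("t =", t, "df =", df)
    # Adjusting a hypothetical raw p-value for m = 3 pairwise comparisons.
    print("adjusted p =", sidak_adjust(0.05, 3))
```

Converting t and df into a two-tailed p-value requires the Student t distribution, which a statistics package provides; it is omitted to keep the sketch free of external dependencies.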

If a high E value for the RL1 gene indicates a potential target for putative endogenous siRNA, then it can be assumed (our hypothesis) that under certain conditions in vivo (e.g., virus reactivation) siRNA blocks this gene, and HSV1 can become an oncolytic virus participating in early anti-cancer defense. Some authors have noted the natural oncolytic action of herpesviruses that were not modified artificially (Nawa et al., 2008). The complementary strand of the RL1 gene, localized in the inverted repeat of the HSV1 DNA, cannot be called a nonsense strand in the full sense of the word, because it is part of the template for synthesis of the LAT RNA of the latent virus (Kent et al., 2003). A miRNA (pre-miR-I) able to specifically and efficiently reduce the expression of ICP34.5 through an siRNA mechanism was found in the LAT RNA complementary to the 5′ UTR of the HSV1 RL1 gene (125,942–125,922 nt) (Tang et al., 2008). According to Werner et al. (2014), natural antisense transcription in human cells can also contribute to the pool of cellular endogenous siRNA. Regarding HSV1, in the context of the data discussed here it does not matter whether the peak of E in the RL1 gene is formed accidentally or not. We are willing to accept that portions of the viral DNA containing the RL1 gene can randomly match sites of the host DNA because of their high GC content (i.e., low nucleotide complexity). However, this dependence is not linear at all (as follows from a comparison of GC content and concentration of HV/H hits in other herpesviruses, not shown here). According to our assumption, evolution could use the low complexity of DNA to facilitate and accelerate the formation of a mechanism for inhibiting RL1, which proved beneficial to the host since it led to more opportunities to resist carcinogenesis. 
Overall, these data show that “a genomic fossil record of herpesviruses exists and expands the known diversity of Herpesviridae, which will aid the characterization of pathogenesis” (Aswad and Katzourakis, 2014).

Author contribution Felix Filatov formulated the hypothesis, conceived of the study, participated in its design and coordination, and drafted the manuscript. Alexander Shargunov developed the program for searching for identical human/virus DNA sequences and performed the major part of the sequence alignment and statistical analyses.

Conflict of interests This study is based on personal initiative and is not supported by any grants. It was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare that they have no competing interests.

Acknowledgements The authors would like to thank Mr. Peter Filatov for developing the first programs for searching for short DNA homologues between the human and some viral genomes, which helped to obtain the initial results of this study.

References
Aguilar, J.S., Devi-Rao, G.V., Rice, M.K., Sunabe, J., Ghazal, P., Wagner, E.K., 2006. Quantitative comparison of the HSV-1 and HSV-2 transcriptomes using DNA microarray analysis. Virology 348, 233–241. http://dx.doi.org/10.1016/j.virol.2005.12.036.
Aswad, A., Katzourakis, A., 2014. The first endogenous herpesvirus, identified in the tarsier genome, and novel sequences from primate rhadinoviruses and lymphocryptoviruses. PLoS Genet. 10, e1004332.
Babiarz, J.E., Ruby, J.G., Wang, Y., Bartel, D.P., Blelloch, R., 2008. Mouse ES cells express endogenous shRNAs, siRNAs, and other microprocessor-independent, dicer-dependent small RNAs. Genes Dev. 22, 2773–2785. http://dx.doi.org/10.1101/gad.1705308.
Babiarz, J.E., Blelloch, R., Eli, T., Broad, E., 2009. Small RNAs: their biogenesis, regulation and function in embryonic stem cells. StemBook, 1–16. http://dx.doi.org/10.3824/stembook.1.47.1.
Bartel, D.P., 2004. MicroRNAs: genomics, biogenesis, mechanism and function. Cell 116, 281–297. http://dx.doi.org/10.1016/S0092-8674(04)00045-5.
Bartel, D.P., 2009. MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233. http://dx.doi.org/10.1016/j.cell.2009.01.002.
Birmingham, A., Anderson, E., Reynolds, A., Ilsley-Tyree, D., Leake, D., Fedorov, Y., et al., 2006. 3′ UTR seed matches, but not overall identity, are associated with RNAi off-targets. Nat. Methods 3, 199–204.
Campadelli-Fiume, G., Menotti, L., Zhou, G., de Giovanni, C., Nanni, P., Lollini, P.L., 2012. Herpesviruses as oncolytic agents. In: Blaho, J.A., Baines, J.D. (Eds.), From the Hallowed Halls of Herpesvirology (A tribute to Bernard Roizman). World Scientific Co. Pte. Ltd., pp. 223–250.
Cernilogar, F.M., Onorati, M.C., Kothe, G.O., Burroughs, A.M., Parsi, K.M., Breiling, A., et al., 2011. Chromatin-associated RNA interference components contribute to transcriptional regulation in Drosophila. Nature 480, 391–395. http://dx.doi.org/10.1038/nature10492.


Chou, J., Kern, E.R., Whitley, R.J., Roizman, B., 1990. Mapping of herpes simplex virus-1 neurovirulence to gamma-1 34.5, a gene nonessential for growth in culture. Science 250, 1262–1266.
Cullen, B., 2006. Viruses and microRNAs. Nat. Genet. Suppl. 38, 525–530. http://dx.doi.org/10.1038/ng1793.
Cullen, B.R., 2014. Viruses and RNA interference: issues and controversies. J. Virol. 88, 12934–12936.
Dambach, M.J., Trecki, J., Martin, N., Markovitz, N.S., 2006. Oncolytic viruses derived from the 34.5-deleted herpes simplex virus recombinant R3616 encode a truncated UL3 protein. Mol. Ther. 13, 891–898. http://dx.doi.org/10.1016/j.ymthe.2006.02.006.
Dolan, A., Jamieson, F.E., Cunningham, C., Barnett, B.C., McGeoch, D.J., 1998. The genome sequence of herpes simplex virus type 2. J. Virol. 72, 2010–2021.
Grey, F., Hook, L., Nelson, J., 2008. Med. Microbiol. Immunol. 197, 261–267. http://dx.doi.org/10.1007/s00430-007-0070-1.
Grundhoff, A., Sullivan, C.S., 2011. Virus-encoded microRNAs. Virology 411, 325–343. http://dx.doi.org/10.1016/j.virol.2011.01.002.
Hannon, G.J., Rossi, J.J., 2004. Unlocking the potential of the human genome with RNA interference. Nature 431, 371–378.
Harberts, E., Yao, K., Wohler, J.E., Maric, D., Ohayon, J., Henkin, R., 2011. Human herpesvirus-6 entry into the central nervous system through the olfactory pathway. Proc. Natl. Acad. Sci. 108, 13734–13739. http://dx.doi.org/10.1073/pnas.1105143108.
He, B., Chou, J., Brandimarti, R., Mohr, I., Gluzman, Y., Roizman, B., 1997. Suppression of the phenotype of gamma-1 34.5− herpes simplex virus 1: failure of activated RNA-dependent protein kinase to shut off protein synthesis is associated with a deletion in the domain of the α47 gene. J. Virol. 71, 6049–6054.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., et al., 2001. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921.
Israyelyan, A., Chouljenko, V.N., Baghian, A., David, A.T., Kearney, M.T., Kousoulas, K.G., 2008. Herpes simplex virus type-1 (HSV-1) oncolytic and highly fusogenic mutants carrying the NV1020 genomic deletion effectively inhibit primary and metastatic tumors in mice. Virol. J. 5, 68–77. http://dx.doi.org/10.1186/1743-422X-5-68.
Jeang, K.-T., 2012. RNAi in the regulation of mammalian viral infections. BMC Biol. 10, 58–63. http://dx.doi.org/10.1186/1741-7007-10-58.
Jin, H., Ma, Y., Prabhakar, B.S., Feng, Z., Valyi-Nagy, T., Yan, Z., et al., 2009. The gamma-1 34.5 protein of herpes simplex virus 1 is required to interfere with dendritic cell maturation during productive infection. J. Virol. 83, 4984–4994. http://dx.doi.org/10.1128/JVI.02535-08.
Jing Xia, Cailin E.J., Bowcock, A.M., Zhang, W., 2012. Noncanonical microRNAs and endogenous siRNAs in normal and psoriatic human skin. Hum. Mol. Genet., 1–12. http://dx.doi.org/10.1093/hmg/dds481.
Jovanovic, M., Hengartner, M.O., 2006. miRNAs and apoptosis: RNAs to die for. Oncogene 25, 6176–6187. http://dx.doi.org/10.1038/sj.onc.1209912.
Kent, J.R., Kang, W., Miller, C.G., Fraser, N.W., 2003. Herpes simplex virus latency-associated transcript gene function. J. Neurovirol. 9, 285–290. http://dx.doi.org/10.1080/13550280390200994.
Kincaid, R.P., Sullivan, C.S., 2012. Virus-encoded microRNAs: an overview and a look to the future. PLoS Pathog. 8, 1–11. http://dx.doi.org/10.1371/journal.ppat.1003018.
Kiyosawa, H., Yamanaka, I., Osato, N., Kondo, S., Hayashizaki, Y., 2003. Antisense transcripts with FANTOM2 clone set and their implications for gene regulation. Genome Res. 13 (6B), 1324–1334. http://dx.doi.org/10.1101/gr.982903.
Leuschner, P.J., Ameres, S.L., Kueng, S., Martinez, J., 2006. Cleavage of the siRNA passenger strand during RISC assembly in human cells. EMBO Rep. 7, 314–320. http://dx.doi.org/10.1038/sj.embor.7400637.
Lewis, B.P., Burge, C.B., Bartel, D.P., 2005. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20. http://dx.doi.org/10.1016/j.cell.2004.12.035.
Ling, M.H., Ban, Y., Wen, H., Wang, S.M., Ge, S.X., 2013. Conserved expression of natural antisense transcripts in mammals. BMC Genomics 14, 243. http://dx.doi.org/10.1186/1471-2164-14-243.
Liu, B.L., Robinson, M., Han, Z.-Q., Branston, R., English, C., Reay, P., et al., 2003. ICP34.5 deleted herpes simplex virus with enhanced oncolytic, immune stimulating, and anti-tumour properties. Gene Ther. 10, 292–303.
Maldonado, A.R., Klanke, C., Jegga, A.G., Aronow, B.J., Mahller, Y.Y., Cripe, T.P., et al., 2010. Molecular engineering and validation of an oncolytic herpes simplex virus type 1 transcriptionally targeted to midkine-positive tumors. J. Gene Med. 12, 613–623. http://dx.doi.org/10.1002/jgm.1479.


McGeoch, D.J., Cunningham, C., McIntyre, G., Dolan, A., 1991. Comparative sequence analysis of the long repeat regions and adjoining parts of the long unique regions in the genomes of herpes simplex viruses types 1 and 2. J. Gen. Virol. 72, 3057–3075.
Meister, G., Tuschl, T., 2004. Mechanisms of gene silencing by double-stranded RNA. Nature 431, 343–349.
Mello, C.C., Conte Jr., D., 2004. Revealing the world of RNA interference. Nature 431, 338–342.
Minarovits, J., 2006. Epigenotypes of latent herpesvirus genomes “DNA methylation: development, genetic disease and cancer”. Curr. Top. Microbiol. Immunol. 310, 61–80. http://dx.doi.org/10.1007/3-540-31181-5_5.
Nawa, A., Luo, C., Zhang, L., Ushijima, Y., Ishida, D., Kamakura, M., et al., 2008. Nonengineered, naturally oncolytic herpes simplex virus HSV1 HF-10: applications for cancer gene therapy. Curr. Gene Ther. 8, 208–221.
Okamura, K., Lai, E.C., 2008. Endogenous small interfering RNAs in animals. Nat. Rev. Mol. Cell Biol. 9, 673–678. http://dx.doi.org/10.1038/nrm2479.
Parnas, O., Corcoran, D.L., Cullen, B.R., 2014. Analysis of the mRNA targetome of microRNAs expressed by Marek's disease virus. mBio 5. http://dx.doi.org/10.1128/mBio.01060-13.
Qi, P., Han, J., Lu, Y., Wang, C., Bu, F., 2006. Virus-encoded microRNAs: future therapeutic targets? Cell. Mol. Immunol. 3, 411–419.
Roizman, B., Knipe, D.M., Whitley, R.J., 2013. Herpes simplex viruses. In: Knipe, D.M., Howley, P.M. (Eds.), Fields Virology, sixth ed. Wolters Kluwer/Lippincott Williams and Wilkins, pp. 1823–1897.
Seitz, H., 2009. siRNAs: the hidden face of the small RNA world. Curr. Biol. 20, R108–R110. http://dx.doi.org/10.1016/j.cub.2009.12.027.
Skalsky, R.L., Cullen, B.R., 2010. Viruses, microRNAs, and host interactions. Annu. Rev. Microbiol. 64, 123–141.
Sontheimer, E.J., Carthew, R.W., 2005. Endogenous siRNAs and miRNAs. Cell 122, 9–12. http://dx.doi.org/10.1016/j.cell.2005.06.030.
Szpara, M.L., Parsons, L., Enquist, L.W., 2010. Herpes simplex virus 1 reveals new mutations. J. Virol. 84, 5303–5313. http://dx.doi.org/10.1128/JVI.00312-10.
Tam, O.H., Aravin, A.A., Stein, P., Girard, A., Murchison, E.P., Cheloufi, S., et al., 2008. Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature 453, 534–538. http://dx.doi.org/10.1038/nature06904.
Tang, S., Bertke, A.S., Patel, A., Wang, K., Cohen, J.I., Krause, P.R., 2008. An acutely and latently expressed herpes simplex virus 2 viral microRNA inhibits expression of ICP34.5, a viral neurovirulence factor. Proc. Natl. Acad. Sci. U.S.A. 105, 10931–10936. http://dx.doi.org/10.1073/pnas.0801845105.
Umbach, J.L., Kramer, M.F., Jurak, I., Karnowski, H.W., Coen, D.M., Cullen, B.R., 2008. MicroRNAs expressed by herpes simplex virus 1 during latent infection regulate viral mRNAs. Nature 454 (7205), 780–783. http://dx.doi.org/10.1038/nature07103.
Umbach, J.L., Nagel, M.A., Cohrs, R.J., Gilden, D.H., Cullen, B.R., 2009. Analysis of human alphaherpesvirus microRNA expression in latently infected human trigeminal ganglia. J. Virol. 83, 10677–10683. http://dx.doi.org/10.1128/JVI.01185-09.
Watanabe, T., Totoki, Y., Toyoda, A., Kaneda, M., Kuramochi-Miyagawa, S., Obata, Y., et al., 2008. Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. Nature 453, 539–543. http://dx.doi.org/10.1038/nature06908.
Watson, G., Xud, W., Reed, A., Babra, D., Putman, T., Wick, E., et al., 2012. Sequence and comparative analysis of the genome of HSV-1 strain McKrae. Virology 433, 528–537. http://dx.doi.org/10.1016/j.virol.2012.08.043.
Werner, A., Cockell, S., Falconer, J., Carlile, M., Alnumeir, S., Robinson, J., 2014. Contribution of natural antisense transcription to an endogenous siRNA signature in human cells. BMC Genomics 15, 19–30. http://dx.doi.org/10.1186/1471-2164-15-19.
Whitley, R.J., Kimberlin, D.W., Prober, C.G., 2007. Pathogenesis and disease. In: Arvin, A., Campadelli-Fiume, G., Mocarski, E., Moore, P.S., Roizman, B., Whitley, R., Yamanishi, K. (Eds.), Human Herpesviruses: Biology, Therapy and Immunoprophylaxis. Cambridge University Press, Chapter 32.
Yi, R., Pasolli, H.A., Landthaler, M., Hafner, M., Ojo, T., Sheridan, R., et al., 2009. DGCR8-dependent microRNA biogenesis is essential for skin development. Proc. Natl. Acad. Sci. U.S.A. 106, 498–502. http://dx.doi.org/10.1073/pnas.0810766105.
Zabolotneva, A., Tkachev, V., Filatov, F., Buzdin, A., 2010. How many antiviral small interfering RNAs may be encoded by the mammalian genomes? Biol. Direct. 5, 62–77. http://dx.doi.org/10.1186/1745-6150-5-62.