Microbial Pathogenesis 107 (2017) 368–371
Comparative characterization analysis of synonymous codon usage bias in classical swine fever virus

Xin Xu a,b,c, Dongliang Fei a, Huansheng Han a, Honggui Liu a, Jiayong Zhang a, Yulong Zhou d,e, Chuang Xu d,e, Hongbin Wang a,*, Hongwei Cao c,d,e,**, Hua Zhang c,d,e,***

a College of Animal Medicine, Northeast Agricultural University, Harbin 150030, China
b Heilongjiang Institute of Veterinary Science, Qiqihar 161005, China
c College of Life Science and Technology, HeiLongJiang BaYi Agricultural University, Daqing 163319, China
d College of Animal Science and Veterinary Medicine, HeiLongJiang BaYi Agricultural University, Daqing 163319, China
e Biotechnology Center, HeiLongJiang BaYi Agricultural University, Daqing 163319, China
Article info

Article history:
Received 19 March 2017
Received in revised form 1 April 2017
Accepted 3 April 2017
Available online 14 April 2017

Keywords:
CSFV
Nucleotide composition
Synonymous codon usage
Mutational pressure
Correspondence analysis

Abstract

Classical swine fever virus (CSFV) causes a highly contagious viral disease of swine and great economic losses in the swine-raising industry. Given the significance of CSFV, a systematic analysis was performed to study its codon usage patterns. Using the complete genome sequences of 76 CSFV strains representing three genotypes, we first analyzed the relative nucleotide composition, effective number of codons (ENC) and synonymous codon usage in CSFV genomes. The results showed that CSFV has a GC-moderate genome and that the third-ended codons are not preferentially used. All ENC values in CSFV genomes are >50, indicating that the codon usage bias is comparatively slight. Subsequently, we performed correspondence analysis (COA) to investigate synonymous codon usage variation among all of the CSFV genomes. We found that codon usage bias in these CSFV genomes is greatly influenced by G + C mutation, which suggests that mutational pressure may be the main factor determining the codon usage biases. Moreover, most of the codon usage bias among different CSFV ORFs is directly related to the nucleotide composition. Other factors, such as hydrophobicity and aromaticity, also influence the codon usage variation among CSFV genomes. Our study represents the most comprehensive analysis of codon usage patterns in the CSFV genome and provides a basic understanding of the mechanisms underlying its codon usage bias.

© 2017 Elsevier Ltd. All rights reserved.
* Corresponding author. College of Animal Medicine, Northeast Agricultural University, Harbin 150030, China.
** Corresponding author. College of Life Science and Technology, HeiLongJiang BaYi Agricultural University, Daqing 163319, China.
*** Corresponding author. College of Life Science and Technology, HeiLongJiang BaYi Agricultural University, Daqing 163319, China.
E-mail addresses: [email protected] (H. Wang), [email protected] (H. Cao), [email protected] (H. Zhang).
http://dx.doi.org/10.1016/j.micpath.2017.04.019
0882-4010/© 2017 Elsevier Ltd. All rights reserved.

1. Short communication

Classical swine fever (CSF) was first recognized in Tennessee, USA, in 1810 and was described in France in 1822 [1]. As one of the diseases notifiable to the Office International des Epizooties (OIE), CSF has caused significant economic losses in the swine-raising industry worldwide [2]. CSF is an extremely contagious swine disease with high morbidity and mortality, characterized by hemorrhagic fever and immunosuppression, and is caused by classical swine fever virus (CSFV) [3]. CSFV is a member of the genus Pestivirus within the family Flaviviridae; it is an enveloped virus harboring a single-stranded, positive-sense RNA genome approximately 12,300 nucleotides in length [4]. The CSFV genome comprises a single long open reading frame (ORF) that encodes a polyprotein of 3898 amino acids (aa), flanked by two non-coding regions, the 3′ untranslated region (3′-UTR) and the 5′ untranslated region (5′-UTR) [5]. The polyprotein is subsequently processed into twelve mature proteins by viral and cellular proteases, including four structural proteins (C, Erns, E1 and E2) and eight nonstructural proteins (Npro, P7, NS2, NS3, NS4A, NS4B, NS5A and NS5B) [6]. CSFV strains are classified as highly virulent, moderately virulent, low-virulent, or avirulent. Phylogenetic analysis is extensively used for tracing CSFV and analyzing its epidemiological situation [7]. Based on sequence datasets of the envelope glycoprotein gene (E2), the polymerase gene (NS5B) and the 5′ untranslated region (5′-UTR), phylogenetic analysis divides CSFVs into three genotypes, 1, 2, and 3, each of which is further divided into three or four subgenotypes [8]. Recently, several studies reported that vaccination might affect CSFV diversity and immune escape through recombination and point mutation. At the same time, vaccination may influence the population dynamics, evolutionary rate and adaptive evolution of CSFV [1].

It is well known that synonymous codons are not used randomly. Codon usage has been found to be related to codon-anticodon interaction, dinucleotide bias, tRNA abundance, gene length, gene function, protein secondary structure, replicational and translational selection, and tissue or organ specificity [9,10]. Mutational pressure and translational selection are thought to be the main factors that account for codon usage variation among genes in some RNA viruses [11,12]. Studying the extent and causes of codon usage bias is therefore essential to understanding viral evolution, particularly the interplay between viruses and the host immune response [13]. Previous studies of CSFV have mainly been limited to phylogenetic analysis, and few synonymous codon usage analyses have been performed. In order to better understand the characteristics of the
CSFV genome and to reveal more information about the viral genome, a systematic analysis was performed to study its codon usage patterns. In addition, Spearman's rank correlation analysis was used to determine the role of different factors in shaping the codon usage biases in the various CSFV genomes. All statistical analyses were carried out using SPSS (version 17.0).

In the present study, we first sought to address codon usage in the CSFV genome. A total of 76 publicly available complete CSFV genomes, representing three genotypes and isolated from all over the world, were obtained from GenBank (http://ncbi.nlm.nih.gov). Sequences with >99% sequence identity were excluded. The GenBank accession numbers and other details of each CSFV genome are listed in Table 1.

Relative synonymous codon usage (RSCU) values are largely independent of amino acid composition and are particularly useful for comparing codon usage between genes, or sets of genes, that differ in size and amino acid composition [14]. To examine synonymous codon usage without the confounding influence of the amino acid composition of different CSFV genomes, the RSCU value of each codon in each ORF was used to measure synonymous codon usage. The preferentially used codons are A-ended (4), U-ended (1), C-ended (8) and G-ended (6) codons (Table 2). The average GC content of all CSFV genomes is 46.38% (range 45.42% to 47.23%, with a standard deviation (S.D.) of 0.44%), and the average G + C content at the third position of synonymous
Table 1
List of CSFV strains used for analysis of synonymous codon usage in this study. Columns C, T, A and G give the mononucleotide frequencies.

GenBank accession  GC3s    ENC      C       T       A       G
AY259122           0.5023  52.0916  0.1987  0.2212  0.3060  0.2632
KT119352           0.5182  51.1479  0.2060  0.2135  0.3079  0.2627
KP233071           0.5153  51.6138  0.2063  0.2140  0.3086  0.2608
KF977610           0.4955  51.9976  0.1981  0.2215  0.3098  0.2600
KF977609           0.4955  51.9896  0.1981  0.2215  0.3099  0.2599
KF977608           0.4956  51.9890  0.1983  0.2213  0.3100  0.2598
KF977607           0.4959  51.9850  0.1984  0.2213  0.3097  0.2599
HQ380231           0.4933  51.7254  0.1972  0.2219  0.3106  0.2594
AY775178           0.4938  51.7114  0.1971  0.2219  0.3102  0.2599
KU504339           0.5286  51.2851  0.2072  0.2118  0.3067  0.2644
KU556758           0.5235  51.3696  0.2067  0.2120  0.3070  0.2633
KT716271           0.5151  52.0706  0.2019  0.2165  0.3062  0.2660
KF669877           0.5200  51.6150  0.2021  0.2158  0.3068  0.2659
KP233070           0.5087  51.2914  0.2025  0.2160  0.3082  0.2633
KM362426           0.5326  51.3687  0.2094  0.2108  0.3071  0.2629
NC_002657          0.4975  51.8972  0.1988  0.2207  0.3102  0.2593
KM262189           0.4883  51.5693  0.1958  0.2240  0.3110  0.2584
KJ619377           0.5218  52.1422  0.2056  0.2141  0.3068  0.2629
KC149991           0.5189  51.1642  0.2059  0.2141  0.3077  0.2620
KC149990           0.5187  51.3783  0.2047  0.2147  0.3074  0.2631
JX262391           0.5055  51.2231  0.2027  0.2159  0.3105  0.2606
JX218094           0.5081  51.2548  0.2032  0.2153  0.3102  0.2610
GU592790           0.5175  51.5211  0.2062  0.2124  0.3078  0.2630
AY382481           0.5083  52.1012  0.1992  0.2210  0.3044  0.2646
AF326963           0.4975  51.8972  0.1988  0.2207  0.3102  0.2593
AY805221           0.5063  52.0463  0.1988  0.2211  0.3048  0.2645
GQ923951           0.5161  51.9421  0.2030  0.2158  0.3081  0.2620
EU789580           0.4955  52.1820  0.1977  0.2214  0.3092  0.2607
FJ529205           0.5173  51.2519  0.2057  0.2134  0.3062  0.2642
EU857642           0.5052  52.2717  0.1985  0.2216  0.3037  0.2656
EU490425           0.4947  51.9719  0.1977  0.2219  0.3094  0.2601
KP343640           0.5131  51.1766  0.2046  0.2140  0.3091  0.2623
KC503764           0.4880  51.6418  0.1965  0.2238  0.3111  0.2577
KC851953           0.5061  52.5419  0.2032  0.2164  0.3098  0.2588
EU915211           0.4953  52.1666  0.1977  0.2215  0.3093  0.2605
GU324242           0.5093  52.0929  0.2038  0.2160  0.3083  0.2615
GU233734           0.5151  52.5616  0.2032  0.2170  0.3065  0.2627
GU233733           0.5160  52.5102  0.2035  0.2168  0.3062  0.2630
GU233732           0.5145  52.4512  0.2039  0.2165  0.3068  0.2624
GU233731           0.5149  52.3199  0.2031  0.2172  0.3069  0.2623
AY367767           0.5156  51.3559  0.2045  0.2154  0.3079  0.2620
AY646427           0.5114  52.3832  0.2012  0.2190  0.3079  0.2622
DQ127910           0.4947  51.7132  0.1972  0.2214  0.3105  0.2601
HQ148063           0.5143  51.4004  0.2044  0.2150  0.3086  0.2615
HQ148062           0.5151  52.3274  0.2057  0.2150  0.3074  0.2612
HQ148061           0.5143  52.2577  0.2053  0.2151  0.3086  0.2611
HM175885           0.5072  52.0137  0.1989  0.2211  0.3046  0.2646
HM237795           0.4966  51.9944  0.1984  0.2213  0.3095  0.2601
X87939             0.4973  52.0328  0.1979  0.2216  0.3092  0.2603
AY578688           0.4995  52.2760  0.2009  0.2183  0.3107  0.2594
AY578687           0.4950  51.6731  0.2008  0.2189  0.3115  0.2578
AY663656           0.5069  52.0515  0.1985  0.2216  0.3043  0.2646
GQ902941           0.5210  51.7554  0.2060  0.2143  0.3071  0.2625
GQ122383           0.5206  50.9538  0.2079  0.2120  0.3081  0.2622
AY554397           0.5239  51.4640  0.2061  0.2131  0.3074  0.2631
AY568569           0.5209  51.2686  0.2065  0.2133  0.3078  0.2619
J04358             0.5180  52.1828  0.2060  0.2139  0.3078  0.2619
FJ265020           0.5109  52.3005  0.2043  0.2165  0.3075  0.2615
EU497410           0.4952  51.6385  0.1973  0.2215  0.3107  0.2596
LT158502           0.5107  52.0471  0.2049  0.2154  0.3082  0.2612
LT158410           0.5107  52.0471  0.2049  0.2154  0.3082  0.2612
LT158409           0.5073  52.0830  0.2046  0.2156  0.3090  0.2603
LT158408           0.5064  52.0768  0.2046  0.2157  0.3090  0.2602
LT158407           0.5089  52.1037  0.2047  0.2156  0.3088  0.2605
LT158406           0.5069  52.0596  0.2044  0.2161  0.3088  0.2603
LT158405           0.5079  52.0196  0.2046  0.2158  0.3089  0.2604
LT158404           0.5086  52.0139  0.2048  0.2156  0.3089  0.2603
LT158403           0.5100  52.1054  0.2047  0.2158  0.3081  0.2611
LT158402           0.5112  52.1138  0.2049  0.2158  0.3077  0.2613
LT158401           0.5101  52.0425  0.2049  0.2156  0.3086  0.2607
KJ873238           0.4975  51.9234  0.1993  0.2194  0.3104  0.2601
KM522833           0.4983  52.1057  0.2003  0.2186  0.3111  0.2593
JQ268754           0.5178  51.2389  0.2051  0.2143  0.3078  0.2624
AF531433           0.5069  52.0395  0.1988  0.2212  0.3046  0.2646
AF407339           0.5058  51.3742  0.2041  0.2161  0.3089  0.2603
AF333000           0.4941  51.6170  0.1973  0.2217  0.3103  0.2599
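The composition columns of Table 1 can be illustrated with a small Python sketch. It is not the authors' pipeline, which the paper does not describe in code. It assumes that GC3s counts G + C at the third position of codons other than Met (ATG), Trp (TGG) and the stop codons, and that GC12 pools the first and second positions of all codons. The function names and the toy ORF are hypothetical.

```python
from collections import Counter

# Assumption: codons without synonymous alternatives (Met, Trp) and the three
# stop codons are excluded when computing GC3s.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def normalise(orf):
    """Upper-case the sequence and express RNA (U) as DNA (T)."""
    return orf.upper().replace("U", "T")


def split_codons(orf):
    """Split an in-frame coding sequence into codons, dropping a partial tail."""
    seq = normalise(orf)
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def base_frequencies(orf):
    """Mononucleotide frequencies of A, C, G and T over the whole sequence."""
    counts = Counter(b for b in normalise(orf) if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def gc3s(orf):
    """G + C fraction at the third position of synonymous codons."""
    third = [c[2] for c in split_codons(orf) if c not in NON_SYNONYMOUS]
    return sum(b in "GC" for b in third) / len(third)


def gc12(orf):
    """G + C fraction pooled over first and second codon positions."""
    first_two = "".join(c[:2] for c in split_codons(orf))
    return sum(b in "GC" for b in first_two) / len(first_two)


# Hypothetical toy ORF: ATG GCC AAA TTT GGC TAA
toy_orf = "ATGGCCAAATTTGGCTAA"
print(base_frequencies(toy_orf))
print(gc3s(toy_orf), gc12(toy_orf))
```

For a real genome, the ORF would first be extracted from the GenBank record, which this sketch leaves out.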
Fig. 1. Effective number of codons (ENC) used in each ORF plotted against GC3s. The continuous curve plots the relationship between GC3s and ENC in the absence of selection. All points lie below the expected curve.

Table 2
Synonymous codon usage in CSFV viruses.

AA   Codon  N     RSCU   AA   Codon  N     RSCU
Phe  UUU    3640  0.93   Ser  UCU    1627  0.80
Phe  UUC    4170  1.07   Ser  UCC    1430  0.70
Leu  UUA    3088  0.72   Ser  UCA    3232  1.59
Leu  UUG    5569  1.30   Ser  UCG     485  0.24
Tyr  UAU    4313  0.81   Cys  UGU    2279  0.88
Tyr  UAC    6386  1.19   Cys  UGC    2907  1.12
ter  UAA       0  0.00   ter  UGA       0  0.00
ter  UAG       0  0.00   Trp  UGG    4037  1.00
Leu  CUU    2046  0.48   Pro  CCU    2754  0.98
Leu  CUC    3176  0.74   Pro  CCC    2471  0.88
Leu  CUA    4571  1.07   Pro  CCA    4218  1.50
Leu  CUG    7220  1.69   Pro  CCG    1808  0.64
His  CAU    2246  0.83   Arg  CGU     244  0.12
His  CAC    3148  1.17   Arg  CGC     261  0.12
Gln  CAA    4215  1.10   Arg  CGA     295  0.14
Gln  CAG    3452  0.90   Arg  CGG     494  0.23
Ile  AUU    2175  0.48   Thr  ACU    4733  0.95
Ile  AUC    3879  0.86   Thr  ACC    6562  1.32
Ile  AUA    7412  1.65   Thr  ACA    6381  1.28
Met  AUG    5629  1.00   Thr  ACG    2259  0.45
Asn  AAU    4202  0.86   Ser  AGU    2829  1.39
Asn  AAC    5568  1.14   Ser  AGC    2594  1.28
Lys  AAA    9166  1.00   Arg  AGA    5534  2.63
Lys  AAG    9175  1.00   Arg  AGG    5806  2.76
Val  GUU    3762  0.74   Ala  GCU    3502  0.85
Val  GUC    4449  0.88   Ala  GCC    5646  1.38
Val  GUA    4818  0.95   Ala  GCA    5602  1.37
Val  GUG    7281  1.43   Ala  GCG    1637  0.40
Asp  GAU    4731  0.83   Gly  GGU    4072  0.85
Asp  GAC    6675  1.17   Gly  GGC    4231  0.89
Glu  GAA    8014  0.97   Gly  GGA    4125  0.87
Glu  GAG    8441  1.03   Gly  GGG    6633  1.39

Preferentially used codons have RSCU > 1. AA: amino acid; N: number of codons; RSCU: cumulative relative synonymous codon usage.
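The RSCU values in Table 2 follow the standard definition [14]: the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. A minimal Python sketch, using the Phe and Ile counts from Table 2 and a hypothetical helper name `rscu`, reproduces the tabulated values:

```python
def rscu(family_counts):
    """RSCU for one synonymous family.

    family_counts maps each synonymous codon of a single amino acid to its
    observed count. RSCU = count / (mean count across the family).
    """
    total = sum(family_counts.values())
    degeneracy = len(family_counts)
    if total == 0:
        # The amino acid never occurs; RSCU is undefined, reported here as 0.
        return {codon: 0.0 for codon in family_counts}
    return {codon: degeneracy * n / total for codon, n in family_counts.items()}


# Counts (N) copied from Table 2.
phe = rscu({"UUU": 3640, "UUC": 4170})
ile = rscu({"AUU": 2175, "AUC": 3879, "AUA": 7412})

print({codon: round(value, 2) for codon, value in phe.items()})
# {'UUU': 0.93, 'UUC': 1.07}
print({codon: round(value, 2) for codon, value in ile.items()})
# {'AUU': 0.48, 'AUC': 0.86, 'AUA': 1.65}
```

By construction the RSCU values within a family sum to the number of synonymous codons, so a value above 1 marks a codon used more often than expected under uniform usage.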
codons (GC3s) is 50.82% (range 48.80% to 53.26%, with an S.D. of 0.10%). These results are consistent with our previous observations that CSFV has a GC-moderate genome [15], and show that the third-ended codons are not preferentially used in the CSFV genome.

The effective number of codons (ENC) is generally used to quantify the codon usage bias of a gene and is essentially independent of gene length. ENC values range from 20 to 61: in an extremely biased gene where only one codon is used for each amino acid the value is 20, whereas in an unbiased gene it is 61. Thus, the stronger the codon preference in a gene, the smaller its ENC value [16]. To investigate whether these 76 coding sequences of the CSFV genome show similar compositional features, the ENC values were calculated using RSCU software and are listed in Table 1. The ENC values of the different CSFV genomes vary from 50.95 to 52.56, with a mean of 51.85 and an S.D. of 0.39. All ENC values are therefore high, with every value > 50. Based on these findings, together with published data on codon usage bias among some RNA viruses [10,17], we suggest that the codon usage bias in the CSFV genome is comparatively slight.

Translational selection and mutational pressure are thought to be the main factors accounting for codon usage variation among genes in some RNA viruses [10,18]. This prompted us to compare the G + C content at the first and second codon positions (GC12s) with that at the third codon position (GC3s) to find out which factor influences codon usage bias in the CSFV genome. We found that GC3s and GC12s are significantly correlated (r = 0.986, P < 0.05), which implies that the codon usage pattern is most likely shaped by mutational pressure, because translational
Fig. 2. Plot of the values of the first and second axes of each ORF in COA. The first axis accounts for 41.42% of all variation among CSFV ORFs, and the second axis accounts for 17.67% of the total variation.
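The continuous curve in Fig. 1 is conventionally taken from Wright's expected relationship [19], ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC3s. The paper does not print the formula, so treating it as the curve used here is an assumption. The Python sketch below evaluates it for three rows of Table 1 to show how far the observed ENC values sit below the expectation; the function name is hypothetical.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage is shaped only by G + C composition.

    Wright's relationship: ENC = 2 + s + 29 / (s^2 + (1 - s)^2), s = GC3s.
    """
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# (GC3s, observed ENC) pairs copied from Table 1.
observed = {
    "AY259122": (0.5023, 52.0916),
    "KM362426": (0.5326, 51.3687),
    "KC503764": (0.4880, 51.6418),
}

for accession, (s, enc) in observed.items():
    curve = expected_enc(s)
    print(accession, round(curve, 2), enc, "below curve" if enc < curve else "on/above curve")
```

At GC3s = 0.5 the expression gives 60.5, so ENC values near 52 fall clearly under the curve for the GC3s range seen in these genomes.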
selection would be expected to act differently on different codon positions. In addition, a plot of ENC against GC3s is another effective way to investigate codon usage variation among genes [19]. Genes whose codon choice is constrained only by a G + C mutational bias will lie on or just below the curve of the predicted values. To further confirm whether codon usage variation among CSFV strains is determined by mutational bias, the ENC value of each viral ORF was plotted against its corresponding GC3s. All the points lie below the expected curve (Fig. 1). At the same time, significant correlations between ENC and GC3s (r = 0.253, P < 0.05) and between ENC and GC12s (r = 0.247, P < 0.05) were observed. These results suggest that G + C mutation might greatly influence codon usage bias in these 76 CSFV genomes.

Subsequently, we performed correspondence analysis (COA) to investigate synonymous codon usage variation among all of the CSFV ORFs selected in this study. The position of each ORF on the plane defined by the first and second principal axes generated by COA on the RSCU values of the ORFs is shown in Fig. 2. The first principal axis accounts for 41.42% of the total variation, and the next three axes account for 17.67%, 6.52%, and 5.01% of the total variation, respectively. This observation indicates that although the first major axis explains a substantial amount of the variation in codon usage trends, the second major axis
Table 3
Summary of correlation analysis among the first two axes in COA, ENC and GC12s, GC3s, GRAVY, or aromaticity in all of the selected CSFV ORFs.

             GRAVY    Aromaticity  GC3s     GC12s
Axis 1   r   0.259*   0.230*       0.859**  0.807**
         P   0.024    0.045        <0.01    <0.01
Axis 2   r   0.137    0.345**      0.209    0.255*
         P   0.237    0.002        0.07     0.026
ENC      r   0.159    0.272*       0.253*   0.247*
         P   0.17     0.018        0.028    0.032

*P value < 0.05; **P value < 0.01.
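The r values in Table 3 are presumably Spearman rank correlation coefficients, since the study states that Spearman's analysis was run in SPSS. As an illustration of the statistic only (not of the SPSS workflow, and without P values), it can be sketched in plain Python as the Pearson correlation of tie-averaged ranks. The function names and the toy vectors are hypothetical and are not data from the paper.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy vectors (one tie in y).
axis_values = [1.0, 2.0, 3.0, 4.0]
composition = [1.0, 1.0, 2.0, 3.0]
print(round(spearman_rho(axis_values, composition), 4))
```

A significance test would additionally need the sample size (76 ORFs here) and a reference distribution, which a statistics package supplies.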
also has an appreciable impact on total variation in synonymous codon usage. Thus, the values of the first two axes of this COA were used in the subsequent correlation analysis. We found that the first axis value in the COA of each selected genome, which contains most of the variation in synonymous codon usage, is significantly correlated with GC12s and GC3s (P < 0.01). The second axis in the COA of each gene is not correlated with GC3s (P > 0.05) but is correlated with GC12s (P < 0.05) (Table 3). These results suggest that most of the codon usage bias among different ORFs is directly related to the nucleotide composition.

Furthermore, the general average hydrophobicity score (GRAVY) and the frequency of aromatic amino acids (Aromaticity) in the putative gene product were calculated using the program CodonW (version 1.4) [20]. Correlation analysis was applied to evaluate whether the GRAVY and Aromaticity values are related to the first two axes of the COA. The results showed that Aromaticity is correlated with axis 1, axis 2 and ENC, whereas GRAVY is correlated only with axis 1 (P < 0.05), suggesting that the degree of hydrophobicity and the frequency of aromatic amino acids are also associated with the codon usage variation in CSFV genomes.

Taken together, we have analyzed the synonymous codon usage biases in 76 CSFV genomes and demonstrated that the CSFV genome has low codon usage bias. Mutational pressure might be the major factor determining the codon usage biases. Moreover, most of the codon usage bias among different genomes is directly related to the nucleotide composition. Aromaticity and hydrophobicity could partially account for the codon usage variation.

Conflict of interest

There is no conflict of interest among the contributors of this paper.

Acknowledgements

This work was supported by grants from the National Natural Science Foundation of China (NSFC, Grant No.
31570159), Program for Young Scholars with Creative Talents in HeiLongJiang BaYi Agricultural University (CXRC2016-12), Doctor's Research Foundation, HeiLongJiang BaYi Agricultural University (XDB2015-16 and XDB2015-18), Postgraduate Innovation Science Research Project, HeiLongJiang BaYi Agricultural University (YJSCX2016-Y43 and YJSCX2016-Y50), China Postdoctoral Science Foundation funded project (2016M590297), Postdoctoral Foundation of HeiLongJiang Provincial Government (LBH-Z15188), Technology Research Foundation of Education Department of HeiLongJiang Province, China (12541578), Open Foundation of Key Laboratory of Veterinary Medicine, HeiLongJiang BaYi Agricultural University (AMKL201309), and Natural Science Foundation of Heilongjiang Province of China (C201322). We are grateful to Prof. Paul Chu (Guest Professor of Institute of Microbiology, Chinese Academy of Sciences) and Dr. Zhenhua Yang (University of Alabama at Birmingham) for critical reading of the manuscript.

References

[1] W. Ji, D.D. Niu, H.L. Si, N.Z. Ding, C.Q. He, Vaccination influences the evolution of classical swine fever virus, Infect. Genet. Evol. 25 (2014) 69–77.
[2] J. Tarradas, M.E. de la Torre, R. Rosell, L.J. Perez, J. Pujols, M. Munoz, I. Munoz, S. Munoz, X. Abad, M. Domingo, L. Fraile, L. Ganges, The impact of CSFV on the immune response to control infection, Virus Res. 185 (2014) 82–91.
[3] V. Kaden, P. Hubert, G. Strebelow, E. Lange, H. Steyer, P. Steinhagen, Comparison of different diagnostic methods for the detection of the classical swine fever virus (CSFV) in the early infection period, Berl. Munch. Tierarztl 112 (1999) 52–57.
[4] A. Kosmidou, R. Buttner, G. Meyers, Isolation and characterization of cytopathogenic classical swine fever virus (CSFV), Arch. Virol. 143 (1998) 1295–1309.
[5] H.Y. Shen, J.Y. Wang, X.Y. Dong, M.Q. Zhao, Y.M. Kang, Y.G. Li, J.J. Pei, M. Liao, C.M. Ju, L. Yi, Y.M. Hu, J.D. Chen, Genome and molecular characterization of a CSFV strain isolated from a CSF outbreak in south China, Intervirology 56 (2013) 122–133.
[6] H. Zhang, H.W. Cao, Z.J. Wu, Y.D. Cui, A review of molecular characterization of classical swine fever virus (CSFV), Isr. J. Vet. Med. 66 (2011) 89–95.
[7] D.K. Sarma, N. Mishra, S. Vilcek, K. Rajukumar, S.P. Behera, R.K. Nema, P. Dubey, S.C. Dubey, Phylogenetic analysis of recent classical swine fever virus (CSFV) isolates from Assam, India, Comp. Immunol. Microb. 34 (2011) 11–15.
[8] D.J. Paton, A. McGoldrick, I. Greiser-Wilke, S. Parchariyanon, J.Y. Song, P.P. Liou, T. Stadejek, J.P. Lowings, H. Bjorklund, S. Belak, Genetic typing of classical swine fever virus, Vet. Microbiol. 73 (2000) 137–157.
[9] J.M. Ma, T. Zhou, W.J. Gu, X. Sun, Z.H. Lu, Cluster analysis of the codon use frequency of MHC genes from different species, Biosystems 65 (2002) 199–207.
[10] T. Zhou, W.J. Gu, J.M. Ma, X. Sun, Z.H. Lu, Analysis of synonymous codon usage in H5N1 virus and other influenza A viruses, Biosystems 81 (2005) 77–86.
[11] A.T. Lloyd, P.M. Sharp, Evolution of codon usage patterns - the extent and nature of divergence between Candida albicans and Saccharomyces cerevisiae, Nucleic Acids Res. 20 (1992) 5289–5295.
[12] D.R. Smith, K.R. Arrigo, A.C. Alderkamp, A.E. Allen, Massive difference in synonymous substitution rates among mitochondrial, plastid, and nuclear genes of Phaeocystis algae, Mol. Phylogenet. Evol. 71 (2014) 36–40.
[13] P. Tao, L. Dai, M.C. Luo, F.Q. Tang, P. Tien, Z.S. Pan, Analysis of synonymous codon usage in classical swine fever virus, Virus Genes 38 (2009) 104–112.
[14] P.M. Sharp, W.H. Li, Codon usage in regulatory genes in Escherichia coli does not reflect selection for rare codons, Nucleic Acids Res. 14 (1986) 7737–7749.
[15] H. Zhang, H.W. Cao, Z.J. Wu, Y.D. Cui, Evolutionary rate of E2 genes of classical swine fever virus in China, Isr. J. Vet. Med. 66 (2011) 161–163.
[16] J.M. Comeron, M. Aguade, An evaluation of measures of synonymous codon usage bias, J. Mol. Evol. 47 (1998) 268–274.
[17] H. Zhang, H.W. Cao, F.Q. Li, Z.Y. Pan, Z.J. Wu, Y.H. Wang, Y.D. Cui, Analysis of synonymous codon usage in enterovirus 71, VirusDisease 25 (2014) 243–248.
[18] H.W. Cao, H. Zhang, D.S. Li, Analysis of synonymous codon usage in Newcastle disease virus hemagglutinin-neuraminidase (HN) gene and fusion protein (F) gene, VirusDisease 25 (2014) 132–136.
[19] F. Wright, The effective number of codons used in a gene, Gene 87 (1990) 23–29.
[20] R.J. Grocock, P.M. Sharp, Synonymous codon usage in Cryptosporidium parvum: identification of two distinct trends among genes, Int. J. Parasitol. 31 (2001) 402–412.