Journal of Genetics and Genomics (Formerly Acta Genetica Sinica) June 2007, 34(6): 555-561
Research Article
Analysis of Codon Usage Between Different Poplar Species
Meng Zhou, Chunfa Tong, Jisen Shi①
The Key Laboratory of Forestry Genetics and Engineering of State Forestry Administration & Jiangsu Province, Nanjing Forestry University, Nanjing 210037, China
Abstract: Codon usage is the selective and nonrandom use of synonymous codons to encode amino acids in protein-coding genes. Analysis of codon usage can improve the understanding of codon preferences in different species and allows the codons of exogenous genes to be redesigned to increase their expression efficiency. Here, the coding DNA sequences (CDS) of four poplar species, Populus tremuloides Michx., P. tomentosa Carr., P. deltoides Marsh., and P. trichocarpa Torr. & Gray., are used to analyze the relative frequency of synonymous codons (RFSC). High-frequency codons are selected by high-frequency (HF) codon analysis. The results indicate that codon usage is largely shared by all four poplar species and that their codon preferences are quite similar. However, CCT coding for Pro and ACT coding for Thr are the preferred codons in P. tremuloides and P. tomentosa, whereas CCA coding for Pro and ACA coding for Thr are preferred in P. deltoides and P. trichocarpa. TGC coding for Cys, TTC coding for Phe, and AAG coding for Lys are preferred in all of the poplar species except P. trichocarpa. GAG coding for Glu is preferred only in P. deltoides, while the other three poplar species prefer GAA. This commonness of preferred codons allows an exogenous gene designed with the preferred codons of one poplar species to be used in other poplar species.
Keywords: poplar; codon usage; high-frequency codon; codon preference
The universal genetic code comprises 64 codons, 61 of which encode the 20 amino acids. Owing to the degeneracy of the genetic code, most amino acids are coded by two or more codons (synonymous codons). However, coding sequences in DNA do not use synonymous codons with equal frequencies within and between organisms [1, 2]. Previous studies have shown that codon usage is a modulator of gene expression because of the high correlation between codon usage, tRNA abundance, and the level of gene expression [3-6]. Highly expressed genes have shifted their codon usage toward a more restricted set of preferred synonymous codons than less highly expressed genes. Analysis of codon usage in the species that express exogenous genes therefore provides a guide for increasing the expression efficiency of those genes.

Populus has tremendous economic and ecological value and offers several advantages, such as rapid growth, prolific sexual reproduction, ease of cloning, a small genome, and a strong correlation between physiological traits and biomass productivity. During the last 20 years, there have been several reports of genetic transformation in different poplar species for traits including resistance to herbicides, insects, disease, and salinity [7]. To increase the expression efficiency of exogenous genes in transgenic poplar plants, it is important to better understand the mechanism of codon distribution and variation in different poplar species.

In this study, the codon usage data of P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa were analyzed, and high-frequency (HF) codon analysis was used to select preferred codons. The codon preferences were also compared among the four poplar species. Knowledge of how codon preferences compare in different poplar species will assist in rebuilding codons and in increasing the expression efficiency of exogenous genes in transgenic poplar plants.

Received: 2006-07-28; Accepted: 2006-09-14
This work was supported by the National Major Basic Research and Development Program (No. TG1999016004) and Jiangsu Provincial Natural Science Foundation (No. BK2003213).
① Corresponding author. E-mail: [email protected]
www.jgenetgenomics.org
1 Materials and Methods

1.1 Gene sequence
Protein and nucleotide sequence data of P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa were obtained from GenBank (http://www.ncbi.nlm.nih.gov/). Coding DNA sequences (CDS) were collected from these sequences according to the rules of Sharp and Cowe [8]: 1) only genes with a complete coding sequence were considered; 2) the coding sequence had to be longer than 300 bp; 3) only genes translated in the cytoplasm were considered, excluding those translated in organelles such as the chloroplast; 4) genes located on plasmids, transposons, bacteria, fungi, and viruses were excluded; and 5) members of gene families were included even when their coding sequences were very similar. The gene samples used for analyzing codon usage in the different poplar species are listed in Table 1.

1.2 Analysis of the relative frequency of synonymous codon (RFSC) for CDS between the different poplar species
The CDS of the four poplar species were used to analyze RFSC, the ratio of the number of times one codon is observed in the samples to the number of all codons coding for the same amino acid in the samples. RFSC reflects the usage frequency of each synonymous codon independently of gene length and amino acid abundance [9]. RFSC was analyzed using the CUSP program of EMBOSS (The European Molecular Biology Open Software Suite, Cambridge, UK; http://bioinfo.pbi.nrc.ca:8090/EMBOSS/) and the CodonW software (http://www.molbiol.ox.ac.uk).

1.3 Selection of HF (high-frequency) codons
High-frequency codon analysis was used to select the preferred codons in the four poplar species. A codon is considered a high-frequency codon if its relative frequency among the synonymous codons exceeds 60%, or is 0.5-fold higher than the average frequency of the synonymous codons of the amino acid it encodes (that is, more than 1.5 times the average) [10].
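The RFSC calculation and the high-frequency criterion described in the Materials and Methods can be sketched in a few lines of Python. This is a minimal illustration, not the CUSP or CodonW implementation used in the study. The helper names (`count_codons`, `rfsc`, `high_frequency`) are hypothetical, and the "average frequency" of a synonymous family is assumed to be 100% divided by the number of synonymous codons in the standard genetic code. The example counts are the Arg and Lys values for P. tremuloides from Table 2.

```python
from collections import Counter, defaultdict

# Standard genetic code in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_codons(cds_list):
    """Count in-frame codons over a list of CDS strings (hypothetical helper)."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skip codons containing ambiguous bases
                counts[codon] += 1
    return counts


def rfsc(counts):
    """RFSC (%): count of a codon / count of all codons for the same amino acid."""
    family_total = defaultdict(int)
    for codon, n in counts.items():
        family_total[CODON_TABLE[codon]] += n
    return {
        codon: 100.0 * n / family_total[CODON_TABLE[codon]]
        for codon, n in counts.items()
    }


def high_frequency(rfsc_pct):
    """Codons whose RFSC exceeds 60% or 1.5 times the family average."""
    family_size = Counter(CODON_TABLE.values())
    selected = []
    for codon, pct in rfsc_pct.items():
        n_syn = family_size[CODON_TABLE[codon]]
        if n_syn == 1:  # Met and Trp have no synonymous alternatives
            continue
        average = 100.0 / n_syn  # assumed meaning of "average frequency"
        if pct > 60.0 or pct > 1.5 * average:
            selected.append(codon)
    return sorted(selected)


# Arg and Lys codon counts for P. tremuloides, taken from Table 2.
counts = {"CGT": 141, "CGC": 102, "CGA": 93, "CGG": 57,
          "AGA": 307, "AGG": 283, "AAA": 503, "AAG": 768}
freq = rfsc(counts)
hf_codons = high_frequency(freq)
print(hf_codons)
```

Under these assumptions the six Arg codons have an average of 16.67%, so the 1.5-fold threshold is 25%, while for a two-fold degenerate amino acid such as Lys the 60% rule is the one that applies, because 1.5 times the 50% average would be 75%.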
Table 1  The gene samples for analysis of codon usage in different poplar species

Species          Genes  Accession numbers in GenBank
P. tremuloides   44     DQ223003, AY196961, AY162180, AF377868, AY147903, AF368291, AY341431, AY341026,
                        AY369261, AY535003, AY055724, AY162181, AF527387, AY235222, AF209658, AY229881,
                        AY229880, AY229879, AY229875, AY229874, AF034094, AF034093, AF185574, AY180376,
                        AF480620, AF480619, AY095297, AF349443, AF349442, AF349441, AF273256, AF072131,
                        AF016893, AF217957, AF041050, U08097, AF041049, U50522, U47293, U13171,
                        U27116, AF217958, AY229877, AF016892
P. tomentosa     39     AY922315, AY922314, AY922313, DQ445094, DQ354395, DQ354394, AY800120, DQ118107,
                        AY675563, AY675562, AY660752, AY660751, AY660750, AY660749, AY660748, AY660747,
                        DQ076679, DQ020096, DQ004570, AY864733, AY789051, AY596171, AY596170, AY574053,
                        AY501392, AY479975, AY479974, AY479973, AY479972, AY479971, AY479968, AY466400,
                        AY359606, AY302060, AY302066, AY210488, AF314180, AY043495, AY043494
P. deltoides     16     DQ131179, DQ131178, AJ874264, AJ874263, M77504, X70064, Z19568, CQ990398,
                        AJ416708, AY515153, AY515152, AY515151, AY515150, AF309094, X70064, M77504
P. trichocarpa   85     DQ513257, DQ513255, DQ513254, DQ513253, DQ513252, DQ513251, DQ513250, DQ513249,
                        DQ513248, DQ513246, DQ513245, DQ513244, DQ513242, DQ513239, DQ513238, DQ513237,
                        DQ513236, DQ513235, DQ513234, DQ513233, DQ513232, DQ513231, DQ513230, DQ513229,
                        DQ513228, DQ513227, DQ513226, DQ513225, DQ513224, DQ513223, DQ513220, DQ513219,
                        DQ513218, DQ513216, DQ513215, DQ513214, DQ513212, DQ513211, DQ513210, DQ513208,
                        DQ513207, DQ513206, DQ513205, DQ513204, DQ513203, DQ513201, DQ513200, DQ513198,
                        DQ513197, DQ513196, DQ513195, DQ513194, DQ481233, DQ452616, AY919619, AY919618,
                        DQ343566, AY919621, AY919620, DQ270671, DQ270670, AY919617, AY919616, DQ310725,
                        DQ270666, DQ270665, DQ270669, DQ270668, DQ270667, AY652862, AY615966, DQ270664,
                        DQ270663, AY652864, AF052571, AF052570, AY615965, AY615964, AY383600, AF309093,
                        AF309092, U93196, AF309807, AF309806, AF057708

2 Results

Using the RFSC method, we analyzed 20,620 codons from 44 genes in P. tremuloides, 13,801 codons from 39 genes in P. tomentosa, 6,898 codons from 16 genes in P. deltoides, and 78,454 codons from 85 genes in P. trichocarpa. Based on the definition of HF, nine high-frequency codons were detected in P. tremuloides, 8 in P. tomentosa, 11 in P. deltoides, and 12 in P. trichocarpa (Table 2).

The amino acids Arg, Leu, and Ser have six-fold coding degeneracy. For Arg, all four poplar species use the codons beginning with A (AGA and AGG) more frequently than those beginning with C. The frequencies of the codons beginning with A are 0.50, 0.58, 0.49, and 0.33 times higher than those of the codons beginning with C for P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa, respectively. For P. tremuloides, P. tomentosa, and P. deltoides, CGG has the lowest frequency, whereas CGC has the lowest frequency for P. trichocarpa. For Leu, all four poplar species prefer CTT. All four poplar species use the codons ending with A (CTA and TTA) least frequently: P. trichocarpa uses CTA at the
lowest frequency and the other three species use TTA at the lowest frequency. For Ser, TCT is the most commonly used codon [...] for P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa, respectively.

Table 2  Codon usage in different poplar species

                  P. tremuloides         P. tomentosa           P. deltoides           P. trichocarpa
Amino acid Codon      N RFSC(%) 1/1,000      N RFSC(%) 1/1,000      N RFSC(%) 1/1,000      N RFSC(%) 1/1,000
R (Arg)    CGT      141   14.34    6.84     82   12.87    5.94     43   11.44    6.23    637   13.81    8.12
           CGC      102   10.38    4.95     65   10.20    4.71     35    9.31    5.07    372    8.07    4.74
           CGA       93    9.46    4.51     59    9.26    4.28     43   11.44    6.23    471   10.21    6.00
           CGG       57    5.80    2.76     41    6.44    2.97     30    7.98    4.35    502   10.88    6.40
           AGA      307   31.23   14.89    189   29.67   13.69    126   33.51   18.27   1620   35.13   20.65
           AGG      283   28.79   13.72    201   31.55   14.56     99   26.33   14.35   1010   21.90   12.87
L (Leu)    TTA      162    8.35    7.86    104    8.27    7.54     56   10.16    8.12   1323   14.20   16.86
           TTG      413   21.29   20.03    295   23.45   21.38    141   25.59   20.44   2054   22.05   26.18
           CTT      545   28.09   26.43    334   26.55   24.20    140   25.41   20.30   2419   25.97   30.83
           CTC      325   16.75   15.76    199   15.82   14.42     76   13.79   11.02   1171   12.57   14.93
           CTA      195   10.05    9.46    133   10.57    9.64     72   13.07   10.44   1089   11.69   13.88
           CTG      300   15.47   14.55    193   15.34   13.98     66   11.98    9.57   1259   13.52   16.05
S (Ser)    TCT      388   28.03   18.82    257   25.75   18.62    155   26.41   22.47   1741   26.02   22.19
           TCC      202   14.60    9.80    143   14.33   10.36     73   12.44   10.58    694   10.37    8.85
           TCA      340   24.57   16.49    243   24.35   17.61    122   20.78   17.69   1657   24.76   21.12
           TCG       68    4.91    3.30     64    6.41    4.64     42    7.16    6.09    404    6.04    5.15
           AGT      202   14.60    9.80    139   13.93   10.07    103   17.55   14.93   1257   18.78   16.02
           AGC      184   13.29    8.92    152   15.23   11.01     92   15.67   13.34    939   14.03   11.97
A (Ala)    GCT      593   42.39   28.76    397   40.39   28.77    180   39.56   26.09   1543   39.17   19.67
           GCC      299   21.37   14.50    188   19.13   13.62     87   19.12   12.61    636   16.15    8.11
           GCA      403   28.81   19.54    318   32.35   23.04    142   31.21   20.59   1527   38.77   19.46
           GCG      104    7.43    5.04     80    8.14    5.80     46   10.11    6.67    233    5.92    2.97
G (Gly)    GGT      485   30.85   23.52    285   28.41   20.65    125   28.87   18.12   1316   28.58   16.77
           GGC      309   19.66   14.99    188   18.74   13.62     72   16.63   10.44    760   16.50    9.69
           GGA      522   33.21   25.32    300   29.91   21.74    141   32.56   20.44   1749   37.98   22.29
           GGG      256   16.28   12.42    230   22.93   16.67     95   21.94   13.77    780   16.94    9.94
P (Pro)    CCT      432   39.89   20.95    300   38.76   21.74    140   35.18   20.30   1147   37.10   14.62
           CCC      185   17.08    8.97    106   13.70    7.68     58   14.57    8.41    375   12.13    4.78
           CCA      390   36.01   18.91    281   36.30   20.36    161   40.45   23.34   1285   41.56   16.38
           CCG       76    7.02    3.69     87   11.24    6.30     39    9.80    5.65    285    9.22    3.63
T (Thr)    ACT      407   39.51   19.74    220   32.59   15.94    113   30.87   16.38   1027   34.73   13.09
           ACC      246   23.88   11.93    185   27.41   13.40     62   16.94    8.99    526   17.79    6.70
           ACA      305   29.61   14.79    212   31.41   15.36    147   40.16   21.31   1071   36.22   13.65
           ACG       72    6.99    3.49     58    8.59    4.20     44   12.02    6.38    333   11.26    4.24
V (Val)    GTT      608   40.72   29.49    375   39.60   27.17    183   41.03   26.53   1803   38.07   22.98
           GTC      297   19.89   14.40    183   19.32   13.26     76   17.04   11.02    851   17.97   10.85
           GTA      182   12.19    8.83    106   11.19    7.68     71   15.92   10.29    852   17.99   10.86
           GTG      406   27.19   19.69    283   29.88   20.51    116   26.01   16.82   1230   25.97   15.68
I (Ile)    ATT      548   47.04   26.58    329   42.67   23.84    147   42.98   21.31   2407   50.16   30.68
           ATC      415   35.62   20.13    267   34.63   19.35    124   36.26   17.98   1107   23.07   14.11
           ATA      202   17.34    9.80    175   22.70   12.68     71   20.76   10.29   1285   26.78   16.38
C (Cys)    TGT      196   46.89    9.51    104   40.62    7.54     47   45.19    6.81    985   55.71   12.56
           TGC      222   53.11   10.77    152   59.38   11.01     57   54.81    8.26    783   44.29    9.98
D (Asp)    GAT      772   67.90   37.44    491   66.80   35.58    253   65.04   36.68   3267   74.50   41.64
           GAC      365   32.10   17.70    244   33.20   17.68    136   34.96   19.72   1118   25.50   14.25
E (Glu)    GAA      722   52.51   35.01    443   50.69   32.10    212   45.99   30.73   3366   57.23   42.90
           GAG      653   47.49   31.67    431   49.31   31.23    249   54.01   36.10   2516   42.77   32.07
F (Phe)    TTT      421   48.95   20.42    281   47.07   20.36    153   46.08   22.18   1901   59.48   24.23
           TTC      439   51.05   21.39    316   52.93   22.90    179   53.92   25.95   1295   40.52   16.51
H (His)    CAT      283   58.11   13.72    211   59.27   15.29    121   61.73   17.54   1312   68.69   16.72
           CAC      204   41.89    9.89    145   40.73   10.51     75   38.27   10.87    598   31.31    7.62
K (Lys)    AAA      503   39.58   24.39    346   41.00   25.07    198   49.01   28.70   2820   54.30   35.94
           AAG      768   60.42   37.25    498   59.00   36.08    206   50.99   29.86   2373   45.70   30.25
N (Asn)    AAT      476   55.03   23.08    340   59.23   24.64    175   53.03   25.37   2162   65.77   27.56
           AAC      389   44.97   18.87    234   40.77   16.96    155   46.97   22.47   1125   34.23   14.34
Q (Gln)    CAA      331   51.40   16.05    265   52.89   19.20    126   52.72   18.27   1703   53.37   21.71
           CAG      313   48.60   15.18    236   47.11   17.10    113   47.28   16.38   1488   46.63   18.97
Y (Tyr)    TAT      386   57.19   18.72    209   50.36   15.14    123   53.48   17.83   1223   66.07   15.59
           TAC      289   42.81   14.02    206   49.64   14.93    107   46.52   15.51    628   33.93    8.00
M (Met)    ATG      488  100.00   23.67    394  100.00   28.55    141  100.00   20.44   1796  100.00   22.89
W (Trp)    TGG      307  100.00   14.89    170  100.00   12.32    102  100.00   14.79   1163  100.00   14.82
STOP       TAG        8   18.18    0.39      5   12.82    0.36      2   12.50    0.29     18   21.18    0.23
           TAA       16   36.36    0.78     14   35.90    1.01      0    0.00    0.00     30   35.29    0.38
           TGA       20   45.46    0.97     20   51.28    1.45     14   87.50    2.03     37   43.53    0.47

N is the number of times the codon was observed. RFSC refers to the proportion of each codon among all synonymous codons encoding the same amino acid. 1/1,000 is the frequency of each codon per thousand codons in the coding sequences. Rimmed codons are preferred codons. Triplets in bold face indicate a high frequency in encoding the amino acid. Shaded codons are used at low frequency in coding the amino acid.

The amino acids Ala, Gly, Pro, Thr, and Val have four-fold coding degeneracy (XYA, XYC, XYG, and XYT). For Ala and Val, XYT is used most frequently in the four poplar species. For Gly, all four poplar species use XYA most often. However, for Thr and Pro, P. tremuloides and P. tomentosa use XYT the most, whereas P. deltoides and P. trichocarpa use XYA the most. For Gly, P. tremuloides uses XYG the least, while the other species use XYC the least. For Val, all four poplar species use XYA the least. For the other three amino acids (Ala, Pro, and Thr), the four species use XYG the least. In general, the four poplar species use the codons ending with A and T more often than the codons ending with G and C. The usage of the codons ending with A and T is 1.77 ± 0.38 times higher than that of the codons ending with G and C for Ala, 0.69 ± 0.20 times higher for Gly, 2.24 ± 0.22 times higher for Pro, 1.23 ± 0.22 times higher for Thr, and 0.19 ± 0.11 times higher for Val.

Ile is the only amino acid that has three-fold codon degeneracy (XYA, XYC, and XYT). All poplar species tested use XYT most frequently for Ile. P. trichocarpa uses XYC the least, while the other three species use XYA the least. Overall, for Ile, the codons ending with A and T (XYA and XYT) are used more often than the codon ending with C (XYC): 0.81 times more for P. tremuloides, 0.89 times more for P. tomentosa, 0.76 times more for P. deltoides, and 1.87 times more for P. trichocarpa.

The amino acids Cys, Asp, Glu, Phe, His, Lys, Asn, Gln, and Tyr have two-fold coding degeneracy. For Asp, His, Asn, Gln, and Tyr, the poplar species examined show the same bias, but they differ for the other two-fold degenerate amino acids. For Cys, P. trichocarpa uses TGT the most and TGC the least, while the other species use TGC the most and TGT the least. For Glu, P. deltoides uses GAG the most and GAA the least, while the other species use GAA the most and GAG the least. For Phe, P. trichocarpa uses TTT the most and TTC the least, while the other species use TTC the most and TTT the least. For Lys, P. trichocarpa uses AAA the most and AAG the least, while the other species use AAG the most and AAA the least. On the whole, codon usage for the two-fold degenerate amino acids differs only slightly among the four poplar species.

The preferred codons in the different poplar species were analyzed and are listed in Table 3. The four poplar species clearly share most of their preferred codons, such as AGA and AGG for Arg, CTT for Leu, TCT for Ser, GCT for Ala, GGA for Gly, GTT for Val, ATT for Ile, GAT for Asp, CAT for His, AAT for Asn, CAA for Gln, TAT for Tyr, and TGA for stop. However, there are slight differences for Pro, Thr, Cys, Glu, Phe, and Lys. CCT coding for Pro and ACT coding for Thr are the preferred codons in P. tremuloides and P. tomentosa, whereas CCA coding for Pro and ACA coding for Thr are preferred in P. deltoides and P. trichocarpa. TGC coding for Cys, TTC coding for Phe, and AAG coding for Lys are preferred in the three poplar species other than P. trichocarpa. GAG coding for Glu is preferred only in P. deltoides, while the other three poplar species prefer GAA.
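The comparison of A/T-ending and G/C-ending codons reported for the four-fold degenerate amino acids can be reproduced from the RFSC values in Table 2. The sketch below does this for Ala and Gly. The function names are hypothetical. The source does not state what the ± term denotes, so the mean absolute deviation across the four species is computed here as one possibility, which is an assumption.

```python
# RFSC values (%) for Ala and Gly in the four species, copied from Table 2.
RFSC = {
    "Ala": {
        "P. tremuloides": {"GCT": 42.39, "GCC": 21.37, "GCA": 28.81, "GCG": 7.43},
        "P. tomentosa":   {"GCT": 40.39, "GCC": 19.13, "GCA": 32.35, "GCG": 8.14},
        "P. deltoides":   {"GCT": 39.56, "GCC": 19.12, "GCA": 31.21, "GCG": 10.11},
        "P. trichocarpa": {"GCT": 39.17, "GCC": 16.15, "GCA": 38.77, "GCG": 5.92},
    },
    "Gly": {
        "P. tremuloides": {"GGT": 30.85, "GGC": 19.66, "GGA": 33.21, "GGG": 16.28},
        "P. tomentosa":   {"GGT": 28.41, "GGC": 18.74, "GGA": 29.91, "GGG": 22.93},
        "P. deltoides":   {"GGT": 28.87, "GGC": 16.63, "GGA": 32.56, "GGG": 21.94},
        "P. trichocarpa": {"GGT": 28.58, "GGC": 16.50, "GGA": 37.98, "GGG": 16.94},
    },
}


def at_excess(rfsc_by_codon):
    """How many times higher A/T-ending usage is than G/C-ending usage."""
    at = sum(v for codon, v in rfsc_by_codon.items() if codon[-1] in "AT")
    gc = sum(v for codon, v in rfsc_by_codon.items() if codon[-1] in "GC")
    return at / gc - 1.0


def summarize(amino_acid):
    """Mean excess over the four species and its mean absolute deviation."""
    values = [at_excess(v) for v in RFSC[amino_acid].values()]
    mean = sum(values) / len(values)
    mad = sum(abs(v - mean) for v in values) / len(values)  # assumed +/- term
    return mean, mad


for aa in RFSC:
    mean, mad = summarize(aa)
    print(f"{aa}: {mean:.2f} +/- {mad:.2f}")
```

The same ratio, applied to codons grouped by their first base instead of their last, gives the comparison of A-initial and C-initial Arg codons described in the text.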
3 Discussion

The method of high-expression codons (HE) has traditionally been used to analyze codon usage [11]. It requires first computing the effective number of codons (ENC) [11,12] and the relative synonymous codon usage (RSCU) [13], then defining the high-expression and low-expression sample groups by ENC, computing the RSCU of each codon in the two groups, and finally determining the optimal codons of highly expressed genes statistically with a t-test. This method is complicated and requires that the genetic background of the samples be known, which makes sample selection more difficult. Compared with HE, the HF method is simple and fast and has been used successfully to analyze codon usage in tobacco (Nicotiana tabacum L.) [10], in the chloroplast genome of rice (Oryza sativa L. ssp. japonica) [14], and in Citrus spp. [9].
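For readers unfamiliar with RSCU, the sketch below uses the commonly cited definition: the observed count of a codon divided by the count expected if all synonymous codons of that amino acid were used equally. The paper does not give the formula, so this definition, the `rscu` function, and the small `FAMILIES` table are assumptions made for illustration. The ENC calculation and the t-test step of the HE method are not shown.

```python
# Synonymous codon families for two amino acids (a subset of the standard code).
FAMILIES = {
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Lys": ["AAA", "AAG"],
}


def rscu(counts, families):
    """RSCU = observed count / mean count over the synonymous family."""
    values = {}
    for codons in families.values():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:  # amino acid absent from the sample
            continue
        expected = total / len(codons)  # count under equal usage
        for c in codons:
            values[c] = counts.get(c, 0) / expected
    return values


# Arg and Lys codon counts for P. tremuloides, taken from Table 2.
counts = {"CGT": 141, "CGC": 102, "CGA": 93, "CGG": 57,
          "AGA": 307, "AGG": 283, "AAA": 503, "AAG": 768}
values = rscu(counts, FAMILIES)
print({codon: round(v, 2) for codon, v in values.items()})
```

Under this definition RSCU equals the RFSC fraction multiplied by the number of synonymous codons, so the HF threshold of 1.5 times the family average corresponds to an RSCU above 1.5.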
Table 3  The preference codon in different poplar species

Amino acid  P. tremuloides  P. tomentosa    P. deltoides    P. trichocarpa
R (Arg)     AGA h, AGG h    AGA h, AGG h    AGA h, AGG h    AGA h, AGG
L (Leu)     CTT h           CTT h           CTT h, TTG h    CTT h
S (Ser)     TCT h           TCT h           TCT h           TCT h
A (Ala)     GCT h           GCT h           GCT h           GCT h, GCA h
G (Gly)     GGA, GGT        GGA, GGT        GGA, GGT        GGA h
P (Pro)     CCT h           CCT h           CCA h           CCA h
T (Thr)     ACT h           ACT             ACA h           ACA
V (Val)     GTT h           GTT h           GTT h           GTT h
I (Ile)     ATT             ATT             ATT             ATT h
C (Cys)     TGC             TGC             TGC             TGT
D (Asp)     GAT h           GAT h           GAT h           GAT h
E (Glu)     GAA             GAA             GAG             GAA
F (Phe)     TTC             TTC             TTC             TTT
H (His)     CAT             CAT             CAT h           CAT h
K (Lys)     AAG h           AAG             AAG             AAA
N (Asn)     AAT             AAT             AAT             AAT h
Q (Gln)     CAA             CAA             CAA             CAA
Y (Tyr)     TAT             TAT             TAT             TAT h
M (Met)     ATG             ATG             ATG             ATG
W (Trp)     TGG             TGG             TGG             TGG
STOP        TGA             TGA h           TGA h           TGA

h refers to the preferred codons; other codons are high frequency in coding the amino acid.
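The cross-species comparison of preferred codons can be checked with a short script. The sketch below takes the first codon listed for each amino acid in Table 3 as that species' preferred codon, which is an assumption. Met, Trp, and the stop codons are left out, and the names `TOP_CODON`, `divergent_amino_acids`, and `shared_count` are hypothetical.

```python
# Amino acids with synonymous codons, in the order used in Table 3.
AMINO_ACIDS = (
    "Arg Leu Ser Ala Gly Pro Thr Val Ile Cys Asp Glu Phe His Lys Asn Gln Tyr"
).split()

# First-listed codon per amino acid in Table 3 (assumed to be the top codon).
TOP_CODON = {
    "P. tremuloides": "AGA CTT TCT GCT GGA CCT ACT GTT ATT TGC GAT GAA TTC CAT AAG AAT CAA TAT".split(),
    "P. tomentosa":   "AGA CTT TCT GCT GGA CCT ACT GTT ATT TGC GAT GAA TTC CAT AAG AAT CAA TAT".split(),
    "P. deltoides":   "AGA CTT TCT GCT GGA CCA ACA GTT ATT TGC GAT GAG TTC CAT AAG AAT CAA TAT".split(),
    "P. trichocarpa": "AGA CTT TCT GCT GGA CCA ACA GTT ATT TGT GAT GAA TTT CAT AAA AAT CAA TAT".split(),
}


def divergent_amino_acids(top):
    """Amino acids whose top codon is not the same in every species."""
    rows = zip(AMINO_ACIDS, *top.values())
    return [aa for aa, *codons in rows if len(set(codons)) > 1]


def shared_count(top, species_a, species_b):
    """Number of amino acids for which two species share the top codon."""
    return sum(a == b for a, b in zip(top[species_a], top[species_b]))


print(divergent_amino_acids(TOP_CODON))
print(shared_count(TOP_CODON, "P. tremuloides", "P. tomentosa"))
```

With this encoding the amino acids that differ are the six named in the Results (Pro, Thr, Cys, Glu, Phe, and Lys), and P. tremuloides and P. tomentosa share the top codon for all 18 amino acids considered.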
In the present study, a detailed analysis of codon usage in the four poplar species revealed that, although the preferred codons differ slightly among species, the four poplar species share most of their preferred codons. Furthermore, the codon preferences of P. tremuloides and P. tomentosa are the most similar, which agrees with the conclusion reported by Campbell WH and Gowri G [15] that codon usage is associated with the evolution of plants: P. tremuloides and P. tomentosa belong to section Leuce Duby, whereas P. deltoides belongs to Aigeiros and P. trichocarpa belongs to Tacamahaca. Our study indicates that an exogenous gene designed with the preferred codons of one poplar species can be used in other poplar species. This commonness of preferred codons reduces repetition in experiments and saves limited resources.

It has been reported that the expression efficiency of genes can be successfully increased in plants by rebuilding codons [16]. However, there is no such example in Populus. Therefore, this study can serve as a guide for increasing the expression efficiency of exogenous genes in different poplar species. Because the protein and nucleotide sequence data for poplar in GenBank are limited, the reliability of this study may be affected.

References

1 Grantham R, Gautier C, Gouy M. Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Res, 1980, 8(9): 1893−1912.
2 Lu H, Zhao WM, Zheng Y, Wang H, Qi M, Yu XP. Analysis of synonymous codon usage bias in Chlamydia. Acta Biochimica et Biophysica Sinica, 2005, 37(1): 1−10.
3 Holm L. Codon usage and gene expression. Nucleic Acids Res, 1986, 14(7): 3075−3087.
4 Coghlan A, Wolfe KH. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast, 2000, 16(12): 1131−1145.
5 Akashi H. Translational selection and yeast proteome evolution. Genetics, 2003, 164(4): 1291−1303.
6 Liu QP, Tan J, Xue QZ. Synonymous codon usage bias in the rice cultivar 93-11 (Oryza sativa L. ssp. indica). Acta Genetica Sinica, 2003, 30(4): 335−340 (in Chinese with an English abstract).
7 Zhang Y, Zhang SG, Qi LW, Chen XQ, Chen RY, Song WQ. Poplar as a model for forest tree in genome research. Chinese Bulletin of Botany, 2006, 23(3): 286−293 (in Chinese with an English abstract).
8 Sharp PM, Cowe E. Synonymous codon usage in Saccharomyces cerevisiae. Yeast, 1991, 7(7): 657−678.
9 Hu GB, Zhang SL, Xu CJ, Lin SQ. Analysis of codon usage between different Citrus species. Journal of South China Agricultural University, 2006, 27(1): 13−16 (in Chinese with an English abstract).
10 Lin T, Ni ZH, Shen MS, Chen L. High-frequency codon analysis and its application in codon analysis of tobacco. Journal of Xiamen University, 2002, 41(5): 551−554 (in Chinese with an English abstract).
11 Wright F. The 'effective number of codons' used in a gene. Gene, 1990, 87(1): 23−29.
12 Zhao X, Li Z, Lu SF, Huo KK, Li YY. Synonymous codon usage in Yarrowia lipolytica. Journal of Fudan University (Natural Science), 1999, 38(5): 510−516 (in Chinese with an English abstract).
13 Duret L, Mouchiroud D. Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. Proc Natl Acad Sci USA, 1999, 96(8): 4482−4487.
14 Liu QP, Xue QZ. Codon usage in the chloroplast genome of rice (Oryza sativa L. ssp. japonica). Acta Agronomica Sinica, 2004, 30(12): 1220−1224 (in Chinese with an English abstract).
15 Campbell WH, Gowri G. Codon usage in higher plants, green algae, and cyanobacteria. Plant Physiol, 1990, 92(1): 1−11.
16 Panahi M, Alli Z, Cheng X, Belbaraka L, Belqoudi J, Sardana R, Phipps J, Altosaar I. Recombinant protein expression plasmids optimized for industrial E. coli fermentation and plant systems produced biologically active human insulin-like growth factor-1 in transgenic rice and tobacco plants. Transgenic Res, 2004, 13(3): 245−259.
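Analysis of Codon Usage Frequency in Poplar Species from Different Sections
Meng Zhou, Chunfa Tong, Jisen Shi
The Key Laboratory of Forestry Genetics and Engineering of State Forestry Administration & Jiangsu Province, Nanjing Forestry University, Nanjing 210037, China
Abstract: The degeneracy of the genetic code gives rise to codon preferences that differ among species. Understanding the codon usage characteristics of different species provides a basis for modifying genes when exogenous genes are introduced, so that those genes can be expressed efficiently. Poplar is one of the important afforestation tree species widely cultivated throughout the world and has become a model plant for research on forest tree genetic engineering. In this study, high-frequency codon analysis was applied to the protein-coding sequences (CDS) of four poplar species, P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa. The relative frequency of synonymous codons (RFSC) was calculated and the high-frequency codons of the four species were determined. Although codon usage differs somewhat among poplar species, the differences in preferred codons are very small, and the great majority of preferred codons are shared. Only a few amino acids, such as Pro, Thr, and Cys, have different preferred codons. This commonness suggests that an exogenous gene designed with the preferred codons of any one poplar species can also be used in other poplar species.
Keywords: poplar; codon usage frequency; high-frequency codon; codon preference
About the author: Meng Zhou (1981− ), male, from Liaoning, master's student; research interest: bioinformatics. E-mail: [email protected]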