Analysis of Codon Usage Between Different Poplar Species


Journal of Genetics and Genomics (Formerly Acta Genetica Sinica) June 2007, 34(6): 555-561

Research Article

Meng Zhou, Chunfa Tong, Jisen Shi①

The Key Laboratory of Forestry Genetics and Engineering of State Forestry Administration & Jiangsu Province, Nanjing Forestry University, Nanjing 210037, China

Abstract: Codon usage is the selective and nonrandom use of synonymous codons to encode amino acids in protein-coding genes. Analyzing codon usage improves the understanding of codon preferences in different species and allows the codons of exogenous genes to be redesigned to increase their expression efficiency. Here, coding DNA sequences (CDS) of four poplar species, Populus tremuloides Michx., P. tomentosa Carr., P. deltoides Marsh., and P. trichocarpa Torr. & Gray, were used to analyze the relative frequency of synonymous codons (RFSC), and high-frequency codons were selected by high-frequency (HF) codon analysis. The results indicate that codon usage and codon preference are very similar among the four poplar species. However, CCT coding for Pro and ACT coding for Thr are the preferred codons in P. tremuloides and P. tomentosa, whereas CCA coding for Pro and ACA coding for Thr are preferred in P. deltoides and P. trichocarpa. TGC coding for Cys, TTC coding for Phe, and AAG coding for Lys are preferred in all the poplar species except P. trichocarpa. GAG coding for Glu is preferred only in P. deltoides, while the other three poplar species prefer GAA. Because the preferred codons are largely shared, an exogenous gene designed with the preferred codons of one poplar species can be used in other poplar species.

Keywords: poplar; codon usage; high-frequency codon; codon preference

Received: 2006-07-28; Accepted: 2006-09-14
This work was supported by the National Major Basic Research and Development Program (No. TG1999016004) and Jiangsu Provincial Natural Science Foundation (No. BK2003213).
① Corresponding author. E-mail: [email protected]
www.jgenetgenomics.org

The universal genetic code contains 64 codons, 61 of which encode the 20 amino acids. Owing to the degeneracy of the genetic code, most amino acids are encoded by two or more codons (synonymous codons). However, coding sequences do not use synonymous codons with equal frequencies, either within or between organisms [1, 2]. Previous studies have shown that codon usage modulates gene expression, because codon usage, tRNA abundance, and the level of gene expression are highly correlated [3-6]. Highly expressed genes have shifted their codon usage toward a more restricted set of preferred synonymous codons than less highly expressed genes. The analysis of codon usage in a species that is to express exogenous genes therefore provides a guide for increasing the expression efficiency of those genes.

Populus has tremendous economic and ecological value and has several advantages, such as rapid growth, prolific sexual reproduction, ease of cloning, a small genome, and a strong correlation between physiological traits and biomass productivity. During the last 20 years, there have been several reports of genetic transformation of different poplar species for resistance to herbicides, insects, disease, and salinity [7]. To increase the expression efficiency of exogenous genes in transgenic poplar plants, it is important to understand how codon usage is distributed and how it varies among poplar species.

In this study, the codon usage data of P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa were analyzed, and high-frequency (HF) codon analysis was used to select preferred codons. The codon preferences of the four poplar species were also compared. Knowledge of how codon preferences compare among poplar species will assist in rebuilding codons and in increasing the expression efficiency of exogenous genes in transgenic poplar plants.

1 Materials and Methods

1.1 Gene sequence

Protein and nucleotide sequence data of P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa were obtained from GenBank (http://www.ncbi.nlm.nih.gov/). Coding DNA sequences (CDS) were collected from these sequences according to the rules of Sharp and Cowe [8]: 1) only genes with a complete coding sequence were considered; 2) the coding sequence had to be longer than 300 bp; 3) only genes translated in the cytoplasm were considered, excluding genes translated in organelles such as the chloroplast; 4) genes located on plasmids or transposons, or originating from bacteria, fungi, or viruses, were eliminated; and 5) members of gene families were included even when their coding sequences were very similar. The gene samples used for analyzing codon usage in the different poplar species are listed in Table 1.

1.2 Analysis of the relative frequency of synonymous codons (RFSC) for CDS of the different poplar species

The CDS of the four poplar species were used to analyze RFSC, the ratio of the number of times a codon is observed in the samples to the number of all codons encoding the same amino acid in the samples. RFSC reflects the usage frequency of each synonymous codon independently of gene length and amino acid abundance [9]. RFSC was analyzed using the CUSP program of EMBOSS (The European Molecular Biology Open Software Suite, Cambridge, UK; http://bioinfo.pbi.nrc.ca:8090/EMBOSS/) and the CodonW software (http://www.molbiol.ox.ac.uk).

1.3 Selection of high-frequency (HF) codons

High-frequency codon analysis was used to select the preferred codons of the four poplar species. A codon is considered a high-frequency codon if its relative frequency of synonymous codon usage exceeds 60%, or if it is 0.5-fold higher than the average frequency of the synonymous codons for the amino acid that it encodes [10].
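The RFSC calculation and the high-frequency rule described in Materials and Methods can be sketched in a few lines of Python. This is a minimal illustration, not the CUSP or CodonW implementation. The function names (`count_codons`, `rfsc`, `high_frequency`) are hypothetical, and reading "0.5-fold higher than the average" as RFSC ≥ 1.5 × (100/n) for an amino acid with n synonymous codons is an assumption. The example counts are the Arg and Asp counts for P. tremuloides from Table 2.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
# Number of synonymous codons per amino acid (e.g. 6 for Arg, 2 for Asp).
FAMILY_SIZE = Counter(CODON_TABLE.values())


def count_codons(cds_list):
    """Count in-frame codons over a list of coding sequences (DNA strings)."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper()
        for pos in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[pos:pos + 3]
            if codon in CODON_TABLE:  # skip codons containing ambiguous bases
                counts[codon] += 1
    return counts


def rfsc(counts):
    """RFSC (%): codon count divided by the count of all its synonymous codons."""
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {
        codon: 100.0 * n / totals[CODON_TABLE[codon]]
        for codon, n in counts.items()
        if totals[CODON_TABLE[codon]] > 0
    }


def high_frequency(rfsc_values):
    """Codons above 60% or at least 1.5x the family average (assumed reading)."""
    selected = []
    for codon, value in rfsc_values.items():
        n = FAMILY_SIZE[CODON_TABLE[codon]]
        # Met and Trp (n == 1) have no synonymous alternatives and are skipped.
        if n > 1 and (value > 60.0 or value >= 1.5 * 100.0 / n):
            selected.append(codon)
    return sorted(selected)


# Arg and Asp codon counts for P. tremuloides, copied from Table 2.
counts = Counter({"CGT": 141, "CGC": 102, "CGA": 93, "CGG": 57,
                  "AGA": 307, "AGG": 283, "GAT": 772, "GAC": 365})
values = rfsc(counts)
print(round(values["AGA"], 2), round(values["GAT"], 2))  # 31.23 67.9
print(high_frequency(values))  # ['AGA', 'AGG', 'GAT']
```

Under this reading, the threshold is 25% for six-fold, 37.5% for four-fold, and 50% for three-fold degenerate amino acids, so the 60% rule only matters for two-fold degenerate amino acids.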

Table 1 The gene samples for analysis of codon usage in different poplar species

Species           Genes
P. tremuloides    44
P. tomentosa      39
P. deltoides      16
P. trichocarpa    85

Accession numbers in GenBank:
DQ223003, AY196961, AY162180, AF377868, AY147903, AF368291, AY341431, AY341026,
AY369261, AY535003, AY055724, AY162181, AF527387, AY235222, AF209658, AY229881,
AY229880, AY229879, AY229875, AY229874, AF034094, AF034093, AF185574, AY180376,
AF480620, AF480619, AY095297, AF349443, AF349442, AF349441, AF273256, AF072131,
AF016893, AF217957, AF041050, U08097, AF041049, U50522, U47293, U13171,
U27116, AF217958, AY229877, AF016892, AY922315, AY922314, AY922313, DQ445094,
DQ354395, DQ354394, AY800120, DQ118107, AY675563, AY675562, AY660752, AY660751,
AY660750, AY660749, AY660748, AY660747, DQ076679, DQ020096, DQ004570, AY864733,
AY789051, AY596171, AY596170, AY574053, AY501392, AY479975, AY479974, AY479973,
AY479972, AY479971, AY479968, AY466400, AY359606, AY302060, AY302066, AY210488,
AF314180, AY043495, AY043494, DQ131179, DQ131178, AJ874264, AJ874263, M77504,
X70064, Z19568, CQ990398, AJ416708, AY515153, AY515152, AY515151, AY515150,
AF309094, X70064, DQ513257, DQ513255, DQ513254, DQ513253, DQ513252, DQ513251,
DQ513250, DQ513249, DQ513248, DQ513246, DQ513245, DQ513244, DQ513242, DQ513239,
DQ513238, DQ513237, DQ513236, DQ513235, DQ513234, DQ513233, DQ513232, DQ513231,
DQ513230, DQ513229, DQ513228, DQ513227, DQ513226, DQ513225, DQ513224, DQ513223,
DQ513220, DQ513219, DQ513218, DQ513216, DQ513215, DQ513214, DQ513212, DQ513211,
DQ513210, DQ513208, DQ513207, DQ513206, DQ513205, DQ513204, DQ513203, DQ513201,
DQ513200, DQ513198, DQ513197, DQ513196, DQ513195, DQ513194, DQ481233, DQ452616,
AY919619, AY919618, DQ343566, AY919621, AY919620, DQ270671, DQ270670, AY919617,
AY919616, DQ310725, DQ270666, M77504, DQ270665, DQ270669, DQ270668, DQ270667,
AY652862, AY615966, DQ270664, DQ270663, AY652864, AF052571, AF052570, AY615965,
AY615964, AY383600, AF309093, AF309092, U93196, AF309807, AF309806, AF057708

2 Results

According to the computing method of RFSC, the RFSC of 20,620 codons of 44 genes in P. tremuloides, 13,801 codons of 39 genes in P. tomentosa, 6,898 codons of 16 genes in P. deltoides, and 78,454 codons of 85 genes in P. trichocarpa were analyzed. Nine high-frequency codons in P. tremuloides, 8 in P. tomentosa, 11 in P. deltoides, and 12 in P. trichocarpa were detected based on the definition of HF (Table 2). The amino acids Arg, Leu, and Ser have six-fold coding degeneracy. For Arg, all four poplar species use the codons beginning with A (AGA and AGG) more frequently than those beginning with C. The frequencies of the codons beginning with A are 0.50, 0.58, 0.49, and 0.33 times higher than those of the codons beginning with C for P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa, respectively. For P. tremuloides, P. tomentosa, and P. deltoides, CGG has the lowest frequency, whereas CGC has the lowest frequency for P. trichocarpa. For Leu, all four poplar species prefer to use CTT. All four poplar species use the codons ending in A (CTA and TTA) at the lowest frequency: P. trichocarpa uses CTA at the lowest frequency, and the other three species use TTA at the lowest frequency. For Ser, TCT is the most commonly […] tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa, respectively.

Table 2 Codon usage in different poplar species

                         P. tremuloides           P. tomentosa           P. deltoides         P. trichocarpa
Amino acid Codon      N    RFSC 1/1,000      N    RFSC 1/1,000      N    RFSC 1/1,000      N    RFSC 1/1,000
R (Arg)    CGT      141   14.34    6.84     82   12.87    5.94     43   11.44    6.23    637   13.81    8.12
           CGC      102   10.38    4.95     65   10.20    4.71     35    9.31    2.07    372    8.07    4.74
           CGA       93    9.46    4.51     59    9.26    4.28     43   11.44    6.23    471   10.21    6.00
           CGG       57    5.80    2.76     41    6.44    2.97     30    7.98    4.35    502   10.88    6.40
           AGA      307   31.23   14.89    189   29.67   13.69    126   33.51   18.27   1620   35.13   20.65
           AGG      283   28.79   13.72    201   31.55   14.56     99   26.33   14.35   1010   21.90   12.87
L (Leu)    TTA      162    8.35    7.86    104    8.27    7.54     56   10.16    8.12   1323   14.20   16.86
           TTG      413   21.29   20.03    295   23.45   21.38    141   25.59   20.44   2054   22.05   26.18
           CTT      545   28.09   26.43    334   26.55   24.20    140   25.41   20.30   2419   25.97    3.83
           CTC      325   16.75   15.76    199   15.82   14.42     76   13.79   11.02   1171   12.57   14.93
           CTA      195   10.05    9.46    133   10.57    9.64     72   13.07   10.44   1089   11.69   13.88
           CTG      300   15.47   14.55    193   15.34   13.98     66   11.98    9.57   1259   13.52   16.05
S (Ser)    TCT      388   28.03   18.82    257   25.75   18.62    155   26.41   22.47   1741   26.02   22.19
           TCC      202   14.60    9.80    143   14.33   10.36     73   12.44   10.58    694   10.37    8.85
           TCA      340   24.57   16.49    243   24.35   17.61    122   20.78   17.69   1657   24.76   21.12
           TCG       68    4.91    3.30     64    6.41    4.64     42    7.16    6.09    404    6.04    5.15
           AGT      202   14.60    9.80    139   13.93   10.07    103   17.55   14.93   1257   18.78   16.02
           AGC      184   13.29    8.92    152   15.23   11.01     92   15.67   13.34    939   14.03   11.97
A (Ala)    GCT      593   42.39   28.76    397   40.39   28.77    180   39.56   26.09   1543   39.17   19.67
           GCC      299   21.37   14.50    188   19.13   13.62     87   19.12   12.61    636   16.15    8.11
           GCA      403   28.81   19.54    318   32.35   23.04    142   31.21   20.59   1527   38.77   19.46
           GCG      104    7.43    5.04     80    8.14    5.80     46   10.11    6.67    233    5.92    2.97
G (Gly)    GGT      485   30.85   23.52    285   28.41   20.65    125   28.87   18.12   1316   28.58   16.77
           GGC      309   19.66   14.99    188   18.74   13.62     72   16.63   10.44    760   16.50    9.69
           GGA      522   33.21   25.32    300   29.91   21.74    141   32.56   20.44   1749   37.98   22.29
           GGG      256   16.28   12.42    230   22.93   16.67     95   21.94   13.77    780   16.94    9.94
P (Pro)    CCT      432   39.89   20.95    300   38.76   21.74    140   35.18   20.30   1147   37.10   14.62
           CCC      185   17.08    8.97    106   13.70    7.68     58   14.57    8.41    375   12.13    4.78
           CCA      390   36.01   18.91    281   36.30   20.36    161   40.45   23.34   1285   41.56   16.38
           CCG       76    7.02    3.69     87   11.24    6.30     39    9.80    5.65    285    9.22    3.63
T (Thr)    ACT      407   39.51   19.74    220   32.59   15.94    113   30.87   16.38   1027   34.73   13.09
           ACC      246   23.88   11.93    185   27.41   13.40     62   16.94    8.99    526   17.79    6.70
           ACA      305   29.61   14.79    212   31.41   15.36    147   40.16   21.31   1071   36.22   13.65
           ACG       72    6.99    3.49     58    8.59    4.20     44   12.02    6.38    333   11.26    4.24
V (Val)    GTT      608   40.72   29.49    375   39.60   27.17    183   41.03   26.53   1803   38.07   22.98
           GTC      297   19.89   14.40    183   19.32   13.26     76   17.04   11.02    851   17.97   10.85
           GTA      182   12.19    8.83    106   11.19    7.68     71   15.92   10.29    852   17.99   10.86
           GTG      406   27.19   19.69    283   29.88   20.51    116   26.01   16.82   1230   25.97   15.68
I (Ile)    ATT      548   47.04   26.58    329   42.67   23.84    147   42.98   21.31   2407   50.16   30.68
           ATC      415   35.62   20.13    267   34.63   19.35    124   36.26   17.98   1107   23.07   14.11
           ATA      202   17.34    9.80    175   22.70   12.68     71   20.76   10.29   1285   26.78   16.38
C (Cys)    TGT      196   46.89    9.51    104   40.62    7.54     47   45.19    6.81    985   55.71   12.56
           TGC      222   53.11   10.77    152   59.38   11.01     57   54.81    8.26    783   44.29    9.98
D (Asp)    GAT      772   67.90   37.44    491   66.80   35.58    253   65.04   36.68   3267   74.50   41.64
           GAC      365   32.10   17.70    244   33.20   17.68    136   34.96   19.72   1118   25.50   14.25
E (Glu)    GAA      722   52.51   35.01    443   50.69   32.10    212   45.99   30.73   3366   57.23   42.90
           GAG      653   47.49   31.67    431   49.31   31.23    249   54.01   36.10   2516   42.77   32.07
F (Phe)    TTT      421   48.95   20.42    281   47.07   20.36    153   46.08   22.18   1901   59.48   24.23
           TTC      439   51.05   21.39    316   52.93   22.90    179   53.92   25.95   1295   40.52   16.51
H (His)    CAT      283   58.11   13.72    211   59.27   15.29    121   61.73   17.54   1312   68.69   16.72
           CAC      204   41.89    9.89    145   40.73   10.51     75   38.27   10.87    598   31.31    7.62
K (Lys)    AAA      503   39.58   24.39    346   41.00   25.07    198   49.01   28.70   2820   54.30   35.94
           AAG      768   60.42   37.25    498   59.00   36.08    206   50.99   29.86   2373   45.70   30.25
N (Asn)    AAT      476   55.03   23.08    340   59.23   24.64    175   53.03   25.37   2162   65.77   27.56
           AAC      389   44.97   18.87    234   40.77   16.96    155   46.97   22.47   1125   34.23   14.34
Q (Gln)    CAA      331   51.40   16.05    265   52.89   19.20    126   52.72   18.27   1703   53.37   21.71
           CAG      313   48.60   15.18    236   47.11   17.10    113   47.28   16.38   1488   46.63   18.97
Y (Tyr)    TAT      386   57.19   18.72    209   50.36   15.14    123   53.48   17.83   1223   66.07   15.59
           TAC      289   42.81   14.02    206   49.64   14.93    107   46.52   15.51    628   33.93    8.00
M (Met)    ATG      488  100.00   23.67    394  100.00   28.55    141  100.00   20.44   1796  100.00   22.89
W (Trp)    TGG      307  100.00   14.89    170  100.00   12.32    102  100.00   14.79   1163  100.00   14.82
STOP       TAG        8   18.18    0.39      5   12.82    0.36      2   12.50    0.29     18   21.18    0.23
           TAA       16   36.36    0.78     14   35.90    1.01      0    0.00    0.00     30   35.29    0.38
           TGA       20   45.46    0.97     20   51.28    1.45     14   87.50    2.03     37   43.53    0.47

N is the number of times the codon was observed. RFSC refers to the proportion (%) of the codon among all synonymous codons encoding the same amino acid. 1/1,000 is the frequency of the codon per 1,000 codons of the coding sequences. Rimmed codons are preferred codons. Triplets in boldface indicate a high frequency in encoding the amino acid. Shaded codons appear at low frequency in encoding the amino acid.

The amino acids Ala, Gly, Pro, Thr, and Val have four-fold coding degeneracy (XYA, XYC, XYG, and XYT). For Ala and Val, XYT is used most frequently in the four poplar species. For Gly, all four poplar species use XYA most often. However, for Thr and Pro, P. tremuloides and P. tomentosa use XYT the most, whereas P. deltoides and P. trichocarpa use XYA the most. For Gly, P. tremuloides uses XYG the least, while the other species use XYC the least. For Val, all four poplar species use XYA the least, whereas for the other three amino acids (Ala, Pro, and Thr) they use XYG the least. In general, the four poplar species use codons ending in A or T more often than codons ending in G or C. The usage of codons ending in A or T is 1.77 ± 0.38 times higher than that of codons ending in G or C for Ala, 0.69 ± 0.20 times higher for Gly, 2.24 ± 0.22 times higher for Pro, 1.23 ± 0.22 times higher for Thr, and 0.19 ± 0.11 times higher for Val.

Ile is the only amino acid that has three-fold codon degeneracy (XYA, XYC, and XYT). All poplar species tested use XYT most frequently for Ile. P. trichocarpa uses XYC the least, while the other three species use XYA the least. Nevertheless, for Ile, codons ending in A or T (XYA and XYT) are used more often than the codon ending in C (XYC): 0.81 times more for P. tremuloides, 0.89 times more for P. tomentosa, 0.76 times more for P. deltoides, and 1.87 times more for P. trichocarpa.

The amino acids Cys, Asp, Glu, Phe, His, Lys, Asn, Gln, and Tyr have two-fold coding degeneracy. For Asp, His, Asn, Gln, and Tyr, the poplar species examined have the same bias, but they show different biases for the other two-fold degenerate amino acids. For Cys, P. trichocarpa uses TGT the most and TGC the least, while the other species use TGC the most and TGT the least; for Glu, P. deltoides uses GAG the most and GAA the least, while the other species use GAA the most and GAG the least; for Phe, P. trichocarpa uses TTT the most and TTC the least, while the other species use TTC the most and TTT the least; for Lys, P. trichocarpa uses AAA the most and AAG the least, while the other species use AAG the most and AAA the least. On the whole, codon usage differs only slightly between the four poplar species for the two-fold degenerate amino acids.

The preferred codons in the different poplar species were analyzed and are listed in Table 3. The four poplar species clearly share most of their preferred codons, such as AGA and AGG for Arg, CTT for Leu, TCT for Ser, GCT for Ala, GGA for Gly, GTT for Val, ATT for Ile, GAT for Asp, CAT for His, AAT for Asn, CAA for Gln, TAT for Tyr, and TGA for stop. However, there are slight differences for Pro, Thr, Cys, Glu, Phe, and Lys. CCT coding for Pro and ACT coding for Thr are the preferred codons in P. tremuloides and P. tomentosa, whereas CCA coding for Pro and ACA coding for Thr are preferred in P. deltoides and P. trichocarpa. TGC coding for Cys, TTC coding for Phe, and AAG coding for Lys are preferred in the three poplar species other than P. trichocarpa. GAG coding for Glu is preferred only in P. deltoides, while the other three poplar species prefer GAA.
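The "times higher" figures reported for the four-fold degenerate amino acids can be checked against Table 2. The sketch below does this for Ala, taking "times higher" to mean (usage of codons ending in A or T) / (usage of codons ending in G or C) − 1 and averaging over the four species. The helper name `excess_usage` is hypothetical. The paper does not say which measure of spread follows the "±" sign; a mean absolute deviation is used here as an assumption because it reproduces the reported value.

```python
from statistics import mean

# RFSC (%) of the four Ala codons, copied from Table 2.
ALA_RFSC = {
    "P. tremuloides": {"GCT": 42.39, "GCC": 21.37, "GCA": 28.81, "GCG": 7.43},
    "P. tomentosa": {"GCT": 40.39, "GCC": 19.13, "GCA": 32.35, "GCG": 8.14},
    "P. deltoides": {"GCT": 39.56, "GCC": 19.12, "GCA": 31.21, "GCG": 10.11},
    "P. trichocarpa": {"GCT": 39.17, "GCC": 16.15, "GCA": 38.77, "GCG": 5.92},
}


def excess_usage(rfsc_by_codon, endings="AT"):
    """Return (usage of codons ending in `endings`) / (usage of the rest) - 1."""
    favoured = sum(v for c, v in rfsc_by_codon.items() if c[-1] in endings)
    others = sum(v for c, v in rfsc_by_codon.items() if c[-1] not in endings)
    return favoured / others - 1.0


excess = {species: excess_usage(v) for species, v in ALA_RFSC.items()}
average = mean(excess.values())
# Mean absolute deviation from the average (an assumed reading of the "±" term).
spread = mean(abs(x - average) for x in excess.values())

for species, value in excess.items():
    print(f"{species}: {value:.2f}")
print(f"Ala: {average:.2f} ± {spread:.2f}")  # Ala: 1.77 ± 0.38
```

The same function applied with the first base instead of the last would check the Arg comparison between codons beginning with A and codons beginning with C.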

3 Discussion

The method of high-expression (HE) codons has traditionally been used to analyze codon usage [11]. It requires first computing the effective number of codons (ENC) [11,12] and the relative synonymous codon usage (RSCU) [13], then dividing the samples into high-expression and low-expression groups by ENC and computing the RSCU of each codon in the two groups, and finally determining the optimal codons of highly expressed genes statistically by a t-test. This method is complicated, and the genetic background of the samples must be known, which makes the selection of samples more difficult. Compared with HE, the HF method is simple and fast and has been used successfully to analyze codon usage in tobacco (Nicotiana tabacum L.) [10], in the chloroplast genome of rice (Oryza sativa L. ssp. japonica) [14], and in Citrus spp. [9].
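For comparison with the RFSC used in this study, the RSCU value that the HE method relies on can be sketched as follows. RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so a value of 1 means no bias. The helper name `rscu` is hypothetical, the example uses the P. tremuloides Asp counts from Table 2, and the ENC and t-test steps of the HE method are not shown.

```python
def rscu(family_counts):
    """RSCU for one amino acid: observed count / (family total / family size)."""
    total = sum(family_counts.values())
    size = len(family_counts)
    if total == 0:
        raise ValueError("no codons observed for this amino acid")
    return {codon: n * size / total for codon, n in family_counts.items()}


# Asp codon counts for P. tremuloides, copied from Table 2.
asp = rscu({"GAT": 772, "GAC": 365})
print({codon: round(value, 2) for codon, value in asp.items()})
# {'GAT': 1.36, 'GAC': 0.64}
```

RSCU and RFSC carry the same information for a given amino acid: RSCU equals the RFSC fraction multiplied by the number of synonymous codons (here 0.679 × 2 ≈ 1.36).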

Table 3 The preferred codons in different poplar species

Amino acid   P. tremuloides   P. tomentosa     P. deltoides     P. trichocarpa
R (Arg)      AGA h, AGG h     AGA h, AGG h     AGA h, AGG h     AGA h, AGG
L (Leu)      CTT h            CTT h            CTT h, TTG h     CTT h
S (Ser)      TCT h            TCT h            TCT h            TCT h
A (Ala)      GCT h            GCT h            GCT h            GCT h, GCA h
G (Gly)      GGA, GGT         GGA, GGT         GGA, GGT         GGA h
P (Pro)      CCT h            CCT h            CCA h            CCA h
T (Thr)      ACT h            ACT              ACA h            ACA
V (Val)      GTT h            GTT h            GTT h            GTT h
I (Ile)      ATT              ATT              ATT              ATT h
C (Cys)      TGC              TGC              TGC              TGT
D (Asp)      GAT h            GAT h            GAT h            GAT h
E (Glu)      GAA              GAA              GAG              GAA
F (Phe)      TTC              TTC              TTC              TTT
H (His)      CAT              CAT              CAT h            CAT h
K (Lys)      AAG h            AAG              AAG              AAA
N (Asn)      AAT              AAT              AAT              AAT h
Q (Gln)      CAA              CAA              CAA              CAA
Y (Tyr)      TAT              TAT              TAT              TAT h
M (Met)      ATG              ATG              ATG              ATG
W (Trp)      TGG              TGG              TGG              TGG
STOP         TGA              TGA h            TGA h            TGA

h refers to the preferred codons; other codons are high frequency in coding the amino acid.

In the present study, a detailed analysis of codon usage in the four poplar species revealed that, although the usage of preferred codons differs slightly among the species, the four poplar species share most of their preferred codons. Furthermore, the codon preferences of P. tremuloides and P. tomentosa are the most similar, which agrees with the conclusion that codon usage is associated with the evolution of plants, as reported by Campbell and Gowri [15], because P. tremuloides and P. tomentosa belong to section Leuce Duby, whereas P. deltoides belongs to Aigeiros and P. trichocarpa belongs to Tacamahaca. Our study indicates that an exogenous gene designed with the preferred codons of one poplar species can be used in other poplar species. The fact that the preferred codons are largely shared reduces repetition in experiments and saves limited resources. It has been reported that the expression efficiency of genes can be successfully increased in plants by rebuilding codons [16]. However, there is no such example in Populus. Therefore, this study can serve as a guide for increasing the expression efficiency of exogenous genes in different poplar species. Because the protein and nucleotide sequence data of poplar in GenBank are limited, the reliability of this study may be affected.

References

1 Grantham R, Gautier C, Gouy M. Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Res, 1980, 8(9): 1893−1912.
2 Lu H, Zhao WM, Zheng Y, Wang H, Qi M, Yu XP. Analysis of synonymous codon usage bias in Chlamydia. Acta Biochimica et Biophysica Sinica, 2005, 37(1): 1−10.
3 Holm L. Codon usage and gene expression. Nucleic Acids Res, 1986, 14(7): 3075−3087.
4 Coghlan A, Wolfe KH. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast, 2000, 16(12): 1131−1145.
5 Akashi H. Translational selection and yeast proteome evolution. Genetics, 2003, 164(4): 1291−1303.
6 Liu QP, Tan J, Xue QZ. Synonymous codon usage bias in the rice cultivar 93-11 (Oryza sativa L. ssp. indica). Acta Genetica Sinica, 2003, 30(4): 335−340 (in Chinese with an English abstract).
7 Zhang Y, Zhang SG, Qi LW, Chen XQ, Chen RY, Song WQ. Poplar as a model for forest tree in genome research. Chinese Bulletin of Botany, 2006, 23(3): 286−293 (in Chinese with an English abstract).
8 Sharp PM, Cowe E. Synonymous codon usage in Saccharomyces cerevisiae. Yeast, 1991, 7(7): 657−678.
9 Hu GB, Zhang SL, Xu CJ, Lin SQ. Analysis of codon usage between different Citrus species. Journal of South China Agricultural University, 2006, 27(1): 13−16 (in Chinese with an English abstract).
10 Lin T, Ni ZH, Shen MS, Chen L. High-frequency codon analysis and its application in codon analysis of tobacco. Journal of Xiamen University, 2002, 41(5): 551−554 (in Chinese with an English abstract).
11 Wright F. The 'effective number of codons' used in a gene. Gene, 1990, 87(1): 23−29.
12 Zhao X, Li Z, Lu SF, Huo KK, Li YY. Synonymous codon usage in Yarrowia lipolytica. Journal of Fudan University (Natural Science), 1999, 38(5): 510−516 (in Chinese with an English abstract).
13 Duret L, Mouchiroud D. Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. Proc Natl Acad Sci USA, 1999, 96(8): 4482−4487.
14 Liu QP, Xue QZ. Codon usage in the chloroplast genome of rice (Oryza sativa L. ssp. japonica). Acta Agronomica Sinica, 2004, 30(12): 1220−1224 (in Chinese with an English abstract).
15 Campbell WH, Gowri G. Codon usage in higher plants, green algae, and cyanobacteria. Plant Physiol, 1990, 92(1): 1−11.
16 Panahi M, Alli Z, Cheng X, Belbaraka L, Belqoudi J, Sardana R, Phipps J, Altosaar I. Recombinant protein expression plasmids optimized for industrial E. coli fermentation and plant systems produced biologically active human insulin-like growth factor-1 in transgenic rice and tobacco plants. Transgenic Res, 2004, 13(3): 245−259.

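Analysis of Genetic Codon Usage Frequency in Species from Different Poplar Sections

Meng Zhou, Chunfa Tong, Jisen Shi

The Key Laboratory of Forestry Genetics and Engineering of State Forestry Administration & Jiangsu Province, Nanjing Forestry University, Nanjing 210037

Abstract: The degeneracy of the genetic code gives rise to bias in the codons used by different species. Understanding the codon usage characteristics of different species provides a basis for modifying genes when exogenous genes are introduced, so that the exogenous genes can be expressed efficiently. Poplar is one of the most important afforestation tree species widely cultivated in the world and has become a model plant for research on forest tree genetic engineering. In this study, high-frequency codon analysis was applied to the protein-coding gene sequences (CDS) of four poplar species, P. tremuloides, P. tomentosa, P. deltoides, and P. trichocarpa. The relative frequency of synonymous codons (RFSC) was calculated and the high-frequency codons of the four species were determined. Although codon usage differs somewhat among the poplar species, the differences in preferred codons are very small and shared codons account for the great majority. The preferred codons differ for only a few amino acids, such as Pro, Thr, and Cys. This commonality suggests that an exogenous gene designed with the preferred codons of any one poplar species can also be used in other poplar species.

Keywords: poplar; codon usage frequency; high-frequency codon; codon preference

About the author: Meng Zhou (1981− ), male, from Liaoning, master's student; research interest: bioinformatics. E-mail: [email protected]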