Journal of Theoretical Biology 284 (2011) 92–98
Correlation between the flexibility and periodic dinucleotide patterns in yeast nucleosomal DNA sequences
Qinqin Wu a,b,*, Weiqiang Zhou a, Jiajun Wang b, Hong Yan a,c
a Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong
b School of Electronics and Information Engineering, Soochow University, Suzhou, China
c School of Electrical and Information Engineering, University of Sydney, NSW 2006, Australia
Article info
Article history: Received 9 June 2010; received in revised form 20 June 2011; accepted 21 June 2011; available online 25 June 2011
Abstract
Nucleosome formation and positioning, which play important roles in a number of biological processes, are thought to be related to the distinctive periodic dinucleotide patterns observed in the DNA sequence wrapped around the protein octamer. Previous research shows that flexibility is a key structural property of a nucleosomal DNA sequence. However, the relationship between the flexibility and the periodic dinucleotide patterns has received little attention. In this study, we use three different models to measure the flexibility of yeast DNA sequences. Although the three models involve different parameters, they deliver consistent results showing that yeast nucleosomal DNA sequences are more flexible than non-nucleosomal ones. In contrast to the random flexibility values along non-nucleosomal DNA sequences, the flexibility of nucleosomal DNA sequences shows a clear periodicity of 10.14 base pairs, which is consistent with the periodicity of dinucleotide distributions. We also demonstrate that there is a strong relationship between the peak positions of the flexibility and the dinucleotide frequencies. The correlations between the flexibility and the dinucleotide patterns of CA/TG, CG, GC, GG/CC, AG/CT, AC/GT and GA/TC are positive, with an average value of 0.5946. The highest correlation is shown by CA/TG with a value of 0.7438 and the lowest correlation is shown by AA/TT with a value of −0.7424. The source code and data sets are available for download at http://www.hy8.com/bioinformatics.htm. © 2011 Published by Elsevier Ltd.
Keywords: Nucleosomes; DNA sequence flexibility; Periodic dinucleotides; Nucleosome positioning and formation
* Corresponding author at: Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong. E-mail address: [email protected] (Q. Wu).
0022-5193/$ - see front matter © 2011 Published by Elsevier Ltd. doi:10.1016/j.jtbi.2011.06.026
1. Introduction
Nucleosomes are the fundamental repeating units of eukaryotic chromatin and play important roles in a number of biological processes (Grunstein, 1997; Kornberg and Lorch, 1999; Ashburner et al., 2000). A nucleosome is formed by 147 base pairs (bp) of DNA wrapped around a disk-like histone octamer. Fig. 1 shows the three-dimensional (3D) structure of a nucleosome viewed from two different angles. The diagrams were created using UCSF Chimera (Pettersen et al., 2004). Previous studies suggest that the rotational positioning of DNA in nucleosomes appears to be dominated by certain sequence-dependent modulations in the structure (Drew and Travers, 1985). In addition, earlier research proposed the attractive hypothesis that the distinctive periodic variations in the DNA sequence may facilitate the sharp bending of the DNA around the nucleosome (Satchwell et al., 1986). However, this hypothesis rests only on a comparison between the periodicity of the sequence modulation and that of the rotation of the double helix along the surface of the nucleosome (Drew and Travers, 1985; Satchwell et al., 1986). Although large-scale studies have been conducted on nucleosomes (Ababneh, 2009; Albert et al., 2007; Bina, 1994; Chela-Flores, 1994; Chen et al., 2008; Drew and Travers, 1985; Miele et al., 2008; Mobius et al., 2006; Peckham et al., 2007; Richmond and Davey, 2003; Satchwell et al., 1986; Sevinc et al., 2004; Widom, 2001; Wu et al., 2009; Yassour et al., 2008; Zhao and Yan, 2009), no comprehensive analysis of the structural properties of a large number of experimentally determined nucleosomal DNA sequences has been carried out.
The structural properties of DNA play important roles in gene expression regulation (Alberts et al., 2002; Brukner et al., 1995; Fukue et al., 2005; Gowers and Halford, 2003; Matthews, 1992; Packer et al., 2000b; Pedersen et al., 1998; Starr et al., 1995; Travers and Klug, 1990). A key structural property, sequence-dependent flexibility, has been found to guide DNA-binding proteins efficiently to their target sites (Gowers and Halford, 2003). Earlier research on this structural property also suggests that the flexibility may influence DNA looping (Matthews, 1992), promoter activities (Fukue et al., 2005), nucleosome positioning (Pedersen et al., 1998) and transcription factor binding (Fukue et al., 2005; Starr et al., 1995).
The objective of this work is to study the structural properties of the experimentally determined nucleosomal DNA sequences as well as the correlation between the dinucleotide periodicity and flexibility. Based on different experimental or statistical flexibility models, we analyze the average flexibility of 34,876 nucleosomal DNA sequences in yeast. Consistent with the general concept that flexible DNA sequence segments wrap around a histone core more easily than rigid ones, which usually results in transcription repression (Mobius et al., 2006), we report that nucleosomal DNA sequences are highly flexible compared with other genomic regions. Furthermore, a comparative study demonstrates that there is a strong correlation between the flexibility and periodic dinucleotide patterns in nucleosomal DNA sequences.
2. Materials and methods
2.1. The genome sequences
All 34,876 experimentally determined non-overlapping nucleosomal DNA sequences (Albert et al., 2007) of the 16 yeast chromosomes are extracted from GenBank (http://www.ncbi.nlm.nih.gov/genbank/). Earlier research suggests that nucleosomes are mainly 147 bp long (Widom, 2001), but some may be longer, for example 160 bp (Bryant et al., 2008) or more. Thus, to include as many nucleosomes as possible, the length of these sequences is set to 180 bp, that is, the sequences range from 90 bp upstream to 90 bp downstream of the experimentally determined nucleosome centers (Albert et al., 2007).
To check whether the structural properties observed in the flexibility are unique to nucleosomal DNA sequences, it is necessary to calculate the flexibility of non-nucleosomal DNA sequences for comparison. Thus, we retrieve from GenBank the DNA sequences of the 16 yeast chromosomes that were determined to be non-nucleosomal in the experiments (Albert et al., 2007). These sequences are also 180 bp long.
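The window extraction described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0-based indexing of the center and the half-open interval [center − 90, center + 90) are assumptions made here so that each window contains exactly 180 bases.

```python
from typing import List, Optional

HALF_WIDTH = 90  # bp on each side of the center, giving the 180 bp length used in the text


def extract_window(chromosome: str, center: int,
                   half_width: int = HALF_WIDTH) -> Optional[str]:
    """Return the bases from center - half_width up to (not including) center + half_width.

    `center` is treated as a 0-based index (an assumption of this sketch).
    A window that would run past either end of the chromosome yields None.
    """
    start = center - half_width
    end = center + half_width
    if start < 0 or end > len(chromosome):
        return None
    return chromosome[start:end]


def extract_windows(chromosome: str, centers: List[int]) -> List[str]:
    """Collect every complete window for a list of center positions."""
    windows = (extract_window(chromosome, c) for c in centers)
    return [w for w in windows if w is not None]
```

The same helper could be applied to nucleosomal and non-nucleosomal centers alike, so that both groups are cut to the same length before any flexibility value is computed.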
Fig. 1. The three-dimensional structure of a nucleosome viewed from different angles: (a) vertical view of the nucleosome structure and (b) horizontal view of the nucleosome structure.
Another group of sequences is also extracted from GenBank to analyze the structural properties of nucleosomal DNA sequences together with their surrounding genomic regions, based on the flexibility of 'non-nucleosome'–'nucleosome'–'non-nucleosome' sequences. However, a large number of nucleosomes are closely positioned along the DNA sequence, so it is difficult to find a nucleosome with a sufficiently long non-nucleosomal sequence on each side. Because of this constraint, this group includes only 235 sequences, ranging from 400 bp upstream of the nucleosome center position to 400 bp downstream of the nucleosome end site, i.e. [−400, nucleosomal region (147 bp long), +400]. After calculating the flexibility for each sequence according to the dinucleotide, trinucleotide and tetranucleotide models, we average the flexibility values of the nucleosome region for each model and obtain a single flexibility value of the nucleosome region against the center position. For a fair comparison, we also average the flexibility values of the non-nucleosome region over every overlapping 147 bp window. Using this method, we obtain the final flexibility profile over [−400, +400], with position 0 corresponding to the nucleosome center.
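The averaging over overlapping 147 bp windows can be sketched as a running mean. This is only an illustration under stated assumptions: the function name is hypothetical, and each average is reported at the index of the window's first value, whereas the text does not say which position inside the window the average is assigned to.

```python
from typing import List, Sequence

NUCLEOSOME_LENGTH = 147  # window length in bp, as stated in the text


def sliding_mean(values: Sequence[float],
                 window: int = NUCLEOSOME_LENGTH) -> List[float]:
    """Mean of every overlapping run of `window` consecutive values.

    The i-th output is the mean of values[i : i + window]. Aligning the
    output to the window start is an assumption of this sketch.
    """
    if window <= 0 or window > len(values):
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide by one position: add the entering value, drop the leaving one.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means
```

For example, `sliding_mean([1, 2, 3, 4, 5], window=3)` gives `[2.0, 3.0, 4.0]`. Using one window length for both the nucleosome and the flanking regions keeps the amount of smoothing identical, which is the point of the fair comparison mentioned above.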
2.2. DNA flexibility models
Recently, various DNA flexibility models have been proposed based on either experimental or statistical methods. These models make use of various structural properties, including the angular parameters (twist, roll and tilt) and the translational parameters (shift, slide and rise). Using the flexibility parameters listed in the models, one can easily calculate the flexibility of DNA sequences. There are mainly three kinds of flexibility models: the dinucleotide model (Packer et al., 2000a), the trinucleotide model (Brukner et al., 1995) and the tetranucleotide model (Packer et al., 2000b). These three models are widely used to analyze the structural properties of human promoters (Cao et al., 2008), mammalian and plant genomes (Florquin et al., 2005), as well as some other genomic regions. Here we use these three flexibility models to find a common structural property of the sequences to be analyzed. Table 1 summarizes the characteristics of the flexibility models used in this paper. If all three models show consistent flexibility patterns, it can be concluded with confidence that consensus physical signals do exist in the DNA sequences.
Table 1
Characteristics of DNA flexibility models.
| Model | The smallest value | Corresponding base pair steps | The largest value | Corresponding base pair steps | Characteristics |
| Dinucleotide model (Packer et al., 2000a) | 1.35 | CA/TG | 13.72 | AA/TT | The larger, the more rigid |
| Trinucleotide model (Brukner et al., 1995) | −0.28 | AAT/ATT | 0.194 | TCA/TGA | The larger, the more flexible |
| Tetranucleotide model (Packer et al., 2000b) | 1.9 | TACA/TGTA | 27.2 | AAAC/GTTT | The larger, the more rigid |
The three flexibility models used in this paper, as mentioned above, measure flexibility in different ways. The parameters of the dinucleotide model (Packer et al., 2000a) range from the smallest, CA/TG = 1.35, to the largest, AA/TT = 13.72, where larger values correspond to more rigid dinucleotides. For the trinucleotide model (Brukner et al., 1995), larger values represent more flexible trinucleotides; its parameters range from the smallest, AAT/ATT = −0.28, to the largest, TCA/TGA = 0.194. The tetranucleotide model (Packer et al., 2000b) is similar to the dinucleotide model in that higher values correspond to more rigid tetranucleotides. Its smallest parameter is TACA/TGTA = 1.9, while the largest is AAAC/GTTT = 27.2. Although different models exhibit different scales
of flexibility parameters, they reflect the relative flexibility levels of different base pair steps along the DNA sequence. To evaluate the flexibility of a long DNA sequence, we first calculate the flexibility value based on the 6-mer (Cao et al., 2008) at each position of the DNA sequence. Using the trinucleotide model as an example, the flexibility value for the 6-mer is calculated by averaging the flexibility values of its 6 − 2 = 4 overlapping trinucleotides,
$$F = \frac{1}{4}\sum_{i=1}^{4} f_i, \qquad (1)$$
where $f_i$ denotes the flexibility parameter of the ith trinucleotide in the 6-mer. The flexibility value of the 6-mer is assigned to its first nucleotide. For instance, the sequence TATAGCT has the flexibility value at the first position T
$$F_{T} = \frac{f_{TAT} + f_{ATA} + f_{TAG} + f_{AGC}}{4}, \qquad (2)$$
and the flexibility value at the second position A is
$$F_{A} = \frac{f_{ATA} + f_{TAG} + f_{AGC} + f_{GCT}}{4}. \qquad (3)$$
As a result, we can convert a DNA sequence of L nucleotides to L − 5 flexibility values along the sequence. The flexibility evaluated in this way is a function of position along the DNA sequence. That is, the position is a discrete variable in the range of 0 to L − 5, and the flexibility takes numerical values. We can then take the Fourier transform or the derivative of the flexibility function.
3. Results and discussions
3.1. Nucleosomal DNA sequences are highly flexible
To analyze the DNA flexibility of nucleosomal DNA sequences in wider genomic regions, as mentioned in the Materials and methods section, we extend the region around each single nucleosome to span −400 to +400. The 400 bp upstream and downstream regions are validated to be non-nucleosomal by experiments, while position 0 corresponds to the center of a nucleosome. The average flexibilities of these sequences, aligned at the nucleosome center position, are shown in Fig. 2. Compared with its surrounding regions, the nucleosome region is the most flexible part according to all three flexibility models. Since the three models were developed independently by different methods, this consistent observation strongly confirms that nucleosomal DNA sequences are more flexible than their surrounding regions.
Fig. 2. The average flexibility around the nucleosomes in a wide genomic region. The horizontal axis represents nucleotide position with position 0 corresponding to the nucleosome center position.
Fig. 3. The average flexibility calculated from the dinucleotide model (the upper one), the trinucleotide model (the middle one) and the tetranucleotide model (the lower one). The horizontal axis represents nucleotide position, which ranges from −90 to +90 relative to the nucleosome center (the left column) and the non-nucleosomal center (the right column), respectively: (a) the average flexibility around the nucleosomal centers and (b) the average flexibility around the non-nucleosomal centers.
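The average profiles in Figs. 2 and 3 rest on the per-position 6-mer score of Eqs. (1)–(3) in Section 2.2. A minimal Python sketch of that score, followed by the averaging over center-aligned sequences, is given below. The function names are hypothetical, and the parameter values in the usage note are invented placeholders rather than values from Brukner et al. (1995). A real parameter table is assumed to contain an entry for every trinucleotide on both strands.

```python
from typing import Dict, List, Sequence


def sixmer_flexibility(seq: str, tri_params: Dict[str, float]) -> List[float]:
    """Eq. (1) at every position: mean of the 4 overlapping trinucleotide
    parameters of the 6-mer that starts at that position.

    A sequence of L nucleotides yields L - 5 values; a trinucleotide missing
    from `tri_params` raises KeyError.
    """
    seq = seq.upper()
    scores = []
    for start in range(len(seq) - 5):
        sixmer = seq[start:start + 6]
        trinucleotides = [sixmer[i:i + 3] for i in range(4)]
        scores.append(sum(tri_params[t] for t in trinucleotides) / 4.0)
    return scores


def mean_profile(profiles: Sequence[Sequence[float]]) -> List[float]:
    """Position-wise mean of equally long profiles aligned at the same center."""
    count = len(profiles)
    return [sum(column) / count for column in zip(*profiles)]
```

With the placeholder table `{"TAT": 1.0, "ATA": 2.0, "TAG": 3.0, "AGC": 4.0, "GCT": 5.0}`, the example sequence TATAGCT from Section 2.2 gives `[2.5, 3.5]`, which are the averages written out in Eqs. (2) and (3). The dinucleotide and tetranucleotide models would need their own step length and parameter tables; only the trinucleotide case is sketched here because it is the one the text works through.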
3.2. Consensus flexibility patterns in the DNA sequences
We can see the difference between the average flexibility of the nucleosomal DNA sequences and that of the non-nucleosomal DNA sequences by comparing Fig. 3(a) and (b). Both are calculated from the three models introduced above. For both the nucleosomal and the non-nucleosomal DNA sequences, consistent flexibility patterns are supported by all three flexibility models. Although slight differences exist between the models in each column, the rigidity characteristics calculated from the dinucleotide model agree well with those exhibited by the tetranucleotide rigidity model. In the trinucleotide flexibility curve (the middle one), the peaks, which indicate flexible regions, appear at the positions occupied by the dips in the rigidity models. Therefore, all three models demonstrate that a consensus physical signal of the flexibility does exist in the sequences.
In addition, significant differences can be observed between the flexibility curves in the two columns by comparing Fig. 3(a) with (b). There is an obvious 10 bp periodicity in the flexibility of the nucleosomal DNA sequences, with peaks or valleys located at around −73, −65, −53, −42, −31, −22, −3, 8, 16, 27, 39, 49, 61 and 69, while the flexibility of non-nucleosomal DNA appears to be random (Fig. 3(b)). Since we focus on the positions of the peaks and valleys of the flexibility curve, we take the derivative of the trinucleotide flexibility curve and apply the Fourier transform (Fig. 4). The result demonstrates that the flexibility of nucleosomes shows a periodicity of 10.14 bp. From the comparison and the result of the Fourier analysis, we can conclude that the flexibility has a periodicity similar to that observed in the dinucleotide patterns of the nucleosomal DNA sequences, while for the non-nucleosomal sequences, no obvious periodicity is found from any of the three models.
A previous study demonstrated that in nucleosome structures, a DNA sequence bends at a period of 10.18 ± 0.05 bp per turn (Hayes et al., 1990). This result shows that there is a strong relationship between the flexibility and the periodicity of nucleosomal DNA sequences.
Fig. 4. (a) The derivative of the average flexibility function and (b) the Fourier spectrum of the derivative function.
3.3. The comparison between the flexibility and the dinucleotide frequency maps
In the preceding section, we conclude that the flexibility function has a periodicity of about 10 bp, which is the same as that of the periodic dinucleotide patterns. In this section, we compare the flexibility from the trinucleotide model with the frequency maps of the 10 unique dinucleotides (Packer et al., 2000a): CA/TG, CG, GC, TA, GG/CC, AG/CT, AC/GT, GA/TC, AT and AA/TT. The result is shown in Fig. 5. The 10 dinucleotide frequency maps are arranged in order from the most flexible, CA/TG, to the most rigid, AA/TT. Note that there are 16 dinucleotides but only 10 of them are unique. For example, CA on the forward strand of a DNA sequence corresponds to TG on the reverse strand, so CA and TG are counted as one unique dinucleotide. In good agreement with the previous conclusion, both the flexibility curve and the frequency maps show an obvious 10 bp periodicity. Furthermore, we can observe from Fig. 5 that the peaks in the frequency maps of the dinucleotides CA/TG, CG, GC, GG/CC, AG/CT, AC/GT and GA/TC are in phase with the local maxima of the flexibility curve. Thus, we speculate that it is the periodic motifs of these seven dinucleotides that lead to the flexible points in the sequences. This observation strongly supports the hypothesis that the periodic motifs facilitate the sharp bending of the DNA sequence around the protein octamer in a nucleosome. However, the other three dinucleotides, TA, AT and AA/TT, exhibit the opposite behavior. Their peaks mainly correspond to the rigid points in the flexibility curve, suggesting that the periodic motifs of these three dinucleotides are more rigid in the nucleosomes.
Further analysis of the relationship between the periodicity and the flexibility is carried out by calculating the correlation coefficient between the flexibility and the frequencies of the 10 dinucleotides. We have found that the flexibility function at the first three and last three points of the nucleosomal DNA sequences is very noisy (see Fig. 5), so we exclude these points from the computation. The correlation coefficients are presented in Table 2. The results are consistent with those of Packer et al. (2000a). From the second and fifth rows of Table 2, we can see that the most flexible step, CA/TG, shows the highest correlation and the most rigid step, AA/TT, shows the lowest correlation with the flexibility function. It has been reported that CA/TG facilitates sharp bending in protein–DNA interactions (Dickerson, 1998). This agrees with our analysis shown in Fig. 5. To further demonstrate the spatial correlation between the flexibility and the dinucleotide frequencies, we take the derivative of the flexibility function and recalculate the correlation. The derivative
of the flexibility function F(x) is defined as
$$F'(x) = \frac{1}{3}\left[F(x+3) + F(x+2) + F(x+1) - F(x) - F(x-1) - F(x-3)\right], \qquad (4)$$
where x represents the nucleotide position and F is defined in Eq. (1). Because the derivative operation enhances noise, we average the flexibility function over a 3-point window before taking the difference to reduce the noise. This technique is commonly used in numerical analysis and signal processing. The results are shown in the third and sixth rows of Table 2. The correlation after taking the derivative gives a clearer picture of the relationship between the flexibility and the dinucleotide frequencies because the derivative operation removes the global background variations in the flexibility function.
Fig. 5. Comparison between the flexibility and the dinucleotide frequency maps of yeast nucleosomal DNA sequences. The horizontal axis represents the nucleotide position, which ranges from −73 to +73 relative to the nucleosomal center (position 0). The curves are shifted and scaled vertically so that a clear comparison between flexibility and dinucleotide frequency can be observed.
Table 2
Correlation between the flexibility and the dinucleotide frequencies.
| Dinucleotide | CA/TG | CG | GC | TA | GG/CC |
| Flexibility | 0.7438 | 0.1290 | 0.4039 | −0.0566 | 0.3315 |
| The derivative of flexibility | 0.7150 | 0.5940 | 0.7179 | −0.5796 | 0.6588 |
| Dinucleotide | AG/CT | AC/GT | GA/TC | AT | AA/TT |
| Flexibility | 0.6192 | 0.6684 | 0.1315 | −0.2375 | −0.7424 |
| The derivative of flexibility | 0.3138 | 0.6363 | 0.5262 | −0.5834 | −0.7970 |
The flexible dinucleotide steps, CA/TG, CG, GC, TA, GG/CC, AG/CT, AC/GT and GA/TC, all show good correlation with the flexibility function except TA. After taking the derivative, the correlations for CA/TG, CG, GC, GG/CC, AG/CT, AC/GT and GA/TC are positive with an average value of 0.5946. The highest correlation with the flexibility function itself is shown by CA/TG with a value of 0.7438. Although TA has been shown to be a flexible step by Packer et al. (2000a, 2000b), we notice that its flexibility is much less than that of CA/TG. TA shows a low correlation due to the effect of sequence context, which means that TA has context-dependent flexibility. The rigid dinucleotide steps, AT and AA/TT, show low correlation. AT and AA/TT are found to keep essentially the same conformation in different protein–DNA structures (El Hassan and Calladine, 1996). We can conclude from the above results that there is a strong relationship between the flexibility and the positions of most periodic dinucleotides.
Periodic dinucleotides have been found in nucleosomes in previous research (Bina, 1994; Ioshikhes et al., 1996; Segal et al., 2006). However, few studies have shown how such periodicity arises physically. Recently, Zhou and Yan (2010) used a 3D model to demonstrate that the atoms in periodic dinucleotides form an interface surface between proteins and DNA in a nucleosome.
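The quantities compared in Sections 3.2 and 3.3 (the smoothed derivative of Eq. (4), the dominant period of a profile, a dinucleotide frequency map and a correlation coefficient) can be sketched in plain Python as follows. This is an illustration, not the authors' implementation. All function names are hypothetical, the use of the Pearson coefficient is an assumption because the text only says "correlation coefficient", and the trailing offsets (0, 1, 3) are copied from Eq. (4) as printed even though a contiguous window (0, 1, 2) may have been intended.

```python
import cmath
import math
from typing import List, Sequence, Tuple

# Offsets exactly as printed in Eq. (4); pass behind=(0, 1, 2) for a
# contiguous trailing window if that is what was meant (an assumption).
AHEAD: Tuple[int, ...] = (1, 2, 3)
BEHIND: Tuple[int, ...] = (0, 1, 3)


def smoothed_derivative(F: Sequence[float],
                        ahead: Tuple[int, ...] = AHEAD,
                        behind: Tuple[int, ...] = BEHIND) -> Tuple[List[int], List[float]]:
    """Eq. (4) at every index x for which all offsets stay inside F."""
    xs = list(range(max(behind), len(F) - max(ahead)))
    values = [(sum(F[x + a] for a in ahead) - sum(F[x - b] for b in behind)) / 3.0
              for x in xs]
    return xs, values


def dominant_period(signal: Sequence[float]) -> float:
    """Period, in samples, of the strongest non-zero frequency of a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


def dinucleotide_frequency(seqs: Sequence[str], step: str) -> List[float]:
    """Fraction of aligned sequences showing either member of a step such as
    'CA/TG' at each position (CA and TG count as one unique dinucleotide)."""
    members = set(step.split("/"))
    length = min(len(s) for s in seqs)
    return [sum(s[i:i + 2] in members for s in seqs) / len(seqs)
            for i in range(length - 1)]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Note that a plain DFT of n samples can only report periods of the form n/k, so a profile of about 140 points cannot by itself resolve a value such as 10.14 bp. The text does not state how that resolution was reached, and the sketch makes no attempt to reproduce it. Before correlating a flexibility profile with a frequency map, the two series would also need to be trimmed to the same positions, including the exclusion of the first and last three points mentioned in Section 3.3.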
While the result in this paper is obtained based on the static structure of nucleosomes, the correlation between the flexibility and the periodic dinucleotide patterns in nucleosomal DNA sequences can also be investigated through molecular dynamics. Recently, we studied the collective motions of the atoms in a nucleosome using normal mode analysis (NMA). Our work shows that the atoms of periodic dinucleotides move with larger amplitudes than other atoms, which again supports the correlation between the flexibility and the dinucleotide patterns (Wang and Yan, in press).
4. Conclusion
In this paper, we investigate flexibility as a structural property of yeast nucleosomal DNA sequences using three different flexibility models. The results support two findings. First, nucleosomal DNA sequences tend to be more flexible than their surrounding regions. Second, the flexibility of nucleosomal DNA sequences not only shows a 10–11 bp periodicity but also correlates well with the periodic dinucleotide frequencies. For example, CA/TG shows the highest correlation of 0.7438. This result demonstrates that flexibility as a structural property has a strong relationship with the periodic dinucleotide patterns. Consistent with previous studies (Drew and Travers, 1985; Mobius et al., 2006; Zhou and Yan, 2010), we find that the highly flexible characteristics of nucleosomal DNA sequences allow the sequences to wrap around the histone octamer and thus form the nucleosomes. These results indicate that periodic dinucleotides are useful for the prediction of nucleosome positions (Wu et al., 2009). Since the flexibility of a nucleosomal DNA sequence differs significantly from that of the surrounding region and is well correlated with the periodic dinucleotide patterns, we can expect that flexibility can also serve as a useful feature for nucleosome position prediction.
Our work here only addresses the flexibility of individual nucleosomes. This work can be extended in several ways. Firstly, we can analyze a number of other structural properties of a DNA sequence in addition to flexibility, such as the propeller twist, stacking energy and duplex stability, which have been used successfully for gene recognition (Song and Yan, 2010). Secondly, we can study the dynamic properties of nucleosomes, such as the relations between structural features and collective motions of proteins and DNA (Wang and Yan, in press). Thirdly, we can also apply our method to the study of high-order chromatin structures such as solenoids, zigzag patterns and scaffolds. For example, Adolph et al. (1986) have observed radial chromatin loops in transmission and scanning electron microscopic images. König et al. (2007) have found aligned arrays of packed nucleosomal clusters and an interconnected 3D chromatin network through electron microscopy tomography. Woodcock and Ghosh (2010) have recently provided a review of high-order structures and dynamics of chromatin and their roles in genome organization. The methods discussed in this paper can be useful for validating these high-order structures and even discovering other ones. For example, the zigzag and radial loop structures involve periodic patterns, so it is possible that flexibility or other structural features along a genome sequence exhibit periodicities reflecting these patterns.
That is, we may be able to find regular patterns in the structural features at a larger scale or across many nucleosomes along the sequence.
Acknowledgments
This work is supported by the Hong Kong Research Grant Council (Project CityU 123408), the National Natural Science Foundation of China (Project 60871086) and the Natural Science Foundation of Jiangsu Province, China (Project BK2008159).
References
Ababneh, A.M., 2009. The role of polarization interactions in the wrapping/unwrapping of nucleosomal DNA around the histone octamer: implications to gene regulation. J. Theor. Biol. 258, 229–239.
Adolph, K.W., Kreisman, L.R., Kuehn, R.L., 1986. Assembly of chromatin fibers into metaphase chromosomes analyzed by transmission electron microscopy and scanning electron microscopy. Biophys. J. 49, 221–231.
Albert, I., Mavrich, T.N., Tomsho, L.P., Qi, J., Zanton, S.J., Schuster, S.C., Pugh, B.F., 2007. Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome. Nature 446, 572–576.
Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., Walter, P., 2002. Molecular Biology of the Cell, fourth ed. Garland Science, New York, pp. 379–395.
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., 2000. Gene ontology: tool for the unification of biology. Nat. Genet. 25, 25–29.
Bina, M., 1994. Periodicity of dinucleotides in nucleosomes derived from Simian Virus 40 chromatin. J. Mol. Biol. 235, 198–208.
Brukner, I., Sanchez, R., Suck, D., Pongor, S., 1995. Sequence-dependent bending propensity of DNA as revealed by DNase I: parameters for trinucleotides. EMBO J. 14, 1812–1818.
Bryant, G.O., Prabhu, V., Floer, M., Wang, X., Spagna, D., Schreiber, D., Ptashne, M., 2008. Activator control of nucleosome occupancy in activation and repression of transcription. PLoS Biol. 6, e317. doi:10.1371/journal.pbio.0060317.
Cao, X.Q., Zeng, J., Yan, H., 2008. Structural property of regulatory elements in human promoters. Phys. Rev. E 77, 041908 (1–7).
Chela-Flores, J., 1994. Towards the theoretical bases of the folding of the 100-Å nucleosome filament. J. Theor. Biol. 168, 65–73.
Chen, K., Meng, Q., Ma, L., Liu, Q., Tang, P., Chiu, C., Hu, S., Yu, J., 2008. A novel DNA sequence periodicity decodes nucleosome positioning. Nucleic Acids Res. 36, 6228–6236.
Dickerson, R.E., 1998. DNA bending: the prevalence of kinkiness and the virtues of normality. Nucleic Acids Res. 26, 1906–1926.
Drew, H.R., Travers, A.A., 1985. DNA bending and its relation to nucleosome positioning. J. Mol. Biol. 186, 773–790.
El Hassan, M.A., Calladine, C.R., 1996. Propeller-twisting of base-pairs and the conformational mobility of dinucleotide steps in DNA. J. Mol. Biol. 259, 95–103.
Florquin, K., Degroeve, S., Saeys, Y., Van de Peer, Y., 2005. Large-scale structural analysis of the core promoter in Mammalian and plant genomes. Nucleic Acids Res. 33, 4255–4264.
Fukue, Y., Sumida, N., Tanase, J.I., Ohyama, T., 2005. Core promoter elements of eukaryotic genes have a highly distinctive mechanical property. Nucleic Acids Res. 33, 3821–3827.
Gowers, D.M., Halford, S.E., 2003. Protein motion from non-specific to specific DNA by three-dimensional routes aided by supercoiling. EMBO J. 22, 1410–1418.
Grunstein, M., 1997. Histone acetylation in chromatin structure and transcription. Nature 389, 349–352.
Hayes, J.J., Tullius, T.D., Wolffe, A.P., 1990. The structure of DNA in a nucleosome. Proc. Natl. Acad. Sci. USA 87 (19), 7405–7409.
Ioshikhes, I., Bolshoy, A., Derenshteyn, K., Borodovsky, M., Trifonov, E.N., 1996. Nucleosome DNA sequence pattern revealed by multiple alignment of experimentally mapped sequences. J. Mol. Biol. 262, 129–139.
König, P., Braunfeld, M.B., Sedat, J.W., Agard, D.A., 2007. The three-dimensional structure of in vitro reconstituted Xenopus laevis chromosomes by EM tomography. Chromosoma 116, 349–372.
Kornberg, R.D., Lorch, Y., 1999. Twenty-five years of the nucleosome, review fundamental particle of the eukaryote chromosome. Cell 98, 285–294.
Matthews, K.S., 1992. DNA looping. Microbiol. Mol. Biol. Rev. 56, 123–136.
Miele, V., Vaillant, C., d'Aubenton-Carafa, Y., Thermes, C., Grange, T., 2008. DNA physical properties determine nucleosome occupancy from yeast to fly. Nucleic Acids Res. 36, 3746–3756.
Mobius, W., Neher, R.A., Gerland, U., 2006. Kinetic accessibility of buried DNA sites in nucleosomes. Phys. Rev. Lett. 97, 208102.
Packer, M.J., Dauncey, M.P., Hunter, C.A., 2000a. Sequence-dependent DNA structure: dinucleotide conformational maps. J. Mol. Biol. 295, 71–83.
Packer, M.J., Dauncey, M.P., Hunter, C.A., 2000b. Sequence-dependent DNA structure: tetranucleotide conformational maps. J. Mol. Biol. 295, 85–103.
Peckham, H.E., Thurman, R.E., Fu, Y., Stamatoyannopoulos, J.A., Noble, W.S., Struhl, K., Weng, Z., 2007. Nucleosome positioning signals in genomic DNA. Genome Res. 17, 1170–1177.
Pedersen, A.G., Baldi, P., Chauvin, Y., Brunak, S., 1998. DNA structure in human RNA polymerase II promoters. J. Mol. Biol. 281, 663–673.
Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C., Ferrin, T.E., 2004. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612.
Richmond, T.J., Davey, C.A., 2003. The structure of DNA in the nucleosome core. Nature 423, 145–150.
Satchwell, S.C., Drew, H.R., Travers, A.A., 1986. Sequence periodicities in chicken nucleosome core DNA. J. Mol. Biol. 191, 659–675.
Segal, E., et al., 2006. A genomic code for nucleosome positioning. Nature 442, 772–778.
Sevinc, E., Michael, J.C., Jerry, L.W., 2004. Global nucleosome distribution and the regulation of transcription in yeast. Genome Biol. 5, 243. doi:10.1186/gb-2004-5-10-243.
Song, N.Y., Yan, H., 2010. Short exon detection in DNA sequences based on multifeature spectral analysis. EURASIP J. Adv. Signal Process. 2011 (Article ID 780794-1), 8.
Starr, D., Hoopes, B., Hawley, D., 1995. DNA bending is an important component of site-specific recognition by the TATA binding protein. J. Mol. Biol. 250, 434–446.
Travers, A.A., Klug, A., 1990. Bending of DNA in nucleoprotein complexes. In: Cozzarelli, N.R., Wang, J.C. (Eds.), DNA Topology and its Biological Effects. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 57–106.
Wang, D., Yan, H. Analysis of nucleosome structures based on molecular dynamics. Proc. IEEE Int. Conf. Systems Man Cybernetics, in press.
Widom, J., 2001. Role of DNA sequence in nucleosome stability and dynamics. Q. Rev. Biophys. 34, 269–324.
Woodcock, C.L., Ghosh, R.P., 2010. Chromatin higher-order structure and dynamics. Cold Spring Harb. Perspect. Biol. 2, a000596.
Wu, Q., Wang, J., Yan, H., 2009. Prediction of nucleosome positions in the yeast genome based on matched mirror position filtering. Bioinformation 3, 454–459.
Yassour, M., Kaplan, T., Jaimovich, A., Friedman, N., 2008. Nucleosome positioning from tiling microarray data. Bioinformatics 24, 139–146.
Zhao, H., Yan, H., 2009. Computational analysis of nucleosome positioning signals in the Simian Virus 40 chromatin. In: Proceedings of the International MultiConference of Engineers and Computer Scientists, pp. 245–249.
Zhou, W., Yan, H., 2010. Relationship between periodic dinucleotides and the nucleosome structure revealed by alpha shape modeling. Chem. Phys. Lett. 489, 225–228.