Journal of Molecular Structure 799 (2006) 28–33 www.elsevier.com/locate/molstruc
Effect of the window size in moving-window two-dimensional correlation analysis
Hideyuki Shinzawa a, Shigeaki Morita a, Isao Noda b, Yukihiro Ozaki a,*
a Department of Chemistry and Research Center for Near Infrared Spectroscopy, School of Science and Technology, Kwansei-Gakuin University, Sanda 669-1337, Japan
b The Procter & Gamble Company, 8611 Beckett Road, West Chester, OH 45069, USA
Received 5 October 2005; received in revised form 10 January 2006; accepted 6 March 2006
Available online 19 April 2006
Abstract
The effect of window size in moving-window two-dimensional (MW2D) correlation analysis is studied. For the specific case of a sigmoidal signal variation, MW2D correlation results obtained with various window sizes indicate that too large a window is not effective for obtaining information along the perturbation direction, because the window always covers a wider perturbation region than needed; it merely yields information broadly distributed along the perturbation direction. On the other hand, too small a window is not effective either, because an excessively small window is susceptible to the high noise level in the data. We have therefore investigated an index for determining the optimal window size in MW2D correlation analysis. Autocorrelation spectra of simulated spectral data were obtained with all possible window sizes. The height of each autocorrelation peak was plotted as a function of the window size, and it was revealed that this parameter can be used as a convenient index for determining the optimal window size in the specific case studied. This index was applied to temperature-dependent IR spectra of a poly(methyl methacrylate) (PMMA) film to optimize the window size in the MW2D analysis. The result showed that the index can effectively optimize the window size in the MW2D analysis of the IR spectra of PMMA.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Generalized two-dimensional (2D) correlation spectroscopy; Moving-window two-dimensional (MW2D) correlation analysis; Window size; Autocorrelation; Poly(methyl methacrylate) (PMMA)
1. Introduction
Generalized two-dimensional (2D) correlation spectroscopy was proposed by Noda in 1993 [1–4]. Its interpretation, however, is sometimes far from straightforward, because too many underlying processes influence the spectral changes. To overcome this problem, Thomas and Richardson proposed a variant of 2D correlation analysis based on a systematic subdivision of a spectral data set, called moving-window two-dimensional (MW2D) correlation analysis [5]. They used a moving window with an arbitrarily fixed size in the autocorrelation analysis. It is carried
* Corresponding author. Tel.: +81795658349; fax: +81795659077. E-mail address: [email protected] (Y. Ozaki).
0022-2860/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.molstruc.2006.03.019
out by the following steps: (1) subdividing the whole spectral data set with a moving window of an arbitrarily fixed size; (2) calculating an autocorrelation for each subset of the data; and (3) plotting the autocorrelation thus obtained versus the local average value of the perturbation variable within the moving window. This method has a definite advantage in extracting information about parameters related to the perturbation dimension. For example, Thomas and Richardson applied the MW2D analysis to temperature-dependent IR spectra of a thermotropic liquid crystal sample and succeeded in showing the existence and specific value of the phase transition temperature. As described above, MW2D correlation analysis is based on the sequential change of autocorrelation intensities in the specific region of the perturbation variable that a moving window
H. Shinzawa et al. / Journal of Molecular Structure 799 (2006) 28–33
includes. Thus, the proper selection of the window size is an important factor in MW2D correlation analysis. For example, an excessively large window very likely covers a wider perturbation region than required or desired, and so yields information broadly distributed in the perturbation direction. On the other hand, an excessively small window tends to suffer from the high noise level in the data. The window size should therefore be selected with care. Although several studies with the moving-window technique have been reported [5–8], no systematic study of the selection of the optimal window size has been reported.
The purpose of this study is to investigate the effect of window size in MW2D correlation analysis, focusing on the specific case of a sigmoidal signal variation. MW2D correlation analysis was carried out on simulated data to evaluate the effects of the window size. The results show that, in this specific case, an index for the optimal window size is obtained from the shapes of the autocorrelation peaks at various window sizes. This index was applied to the temperature-dependent IR spectra of a poly(methyl methacrylate) (PMMA) film. The result revealed that the index can be used to optimize the window size for practical data.
2. Methods
2.1. Simulated data
In the present study, the specific intensity change described below is discussed:
$$y(p) = \int_{-\infty}^{p} f(p)\,\mathrm{d}p, \tag{1}$$
where the function f(p) has a Gaussian profile as follows:
$$f(p) = \exp\left(-4 \log 2 \; p^{2}/w^{2}\right). \tag{2}$$
In Eq. (2), w is the parameter that determines the shape of the Gaussian profile, namely the bandwidth (full width at half height, FWHH). The parameter w was arbitrarily set to 101 in this simulation study. The shape of the Gaussian profile and the change of the spectral intensity are shown in Figs. 1(A) and (B).
2.2. Calculation of MW2D correlation intensities
In the MW2D correlation analysis, the spectral data are divided by a moving window of size N = 2m + 1 around the jth spectrum as follows:
$$y_j(p_J) = \begin{pmatrix} y(p_{j-m}) \\ y(p_{j-m+1}) \\ \vdots \\ y(p_j) \\ \vdots \\ y(p_{j+m}) \end{pmatrix}. \tag{3}$$
Fig. 1. (A) A Gaussian profile of the simulated data and (B) a simulated intensity change with the Gaussian profile.
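The simulated data of Section 2.1 (Eqs. (1) and (2), Fig. 1) can be reproduced with a short sketch such as the one below. It is not the authors' MATLAB program. The perturbation grid (integer points from -400 to 400), the trapezoidal rule used for the integral, and the names `gaussian_profile`, `simulate_intensity`, `p_axis`, `f_sim` and `y_sim` are assumptions made for illustration; only the Gaussian form and w = 101 come from the text.

```python
import math


def gaussian_profile(p, w=101.0):
    """Eq. (2): Gaussian whose full width at half height is w."""
    return math.exp(-4.0 * math.log(2.0) * p * p / (w * w))


def simulate_intensity(p_values, w=101.0):
    """Eq. (1): running integral of f(p), approximated by the trapezoidal rule.

    The lower limit of the integral is taken as the first grid point, which
    is an assumption; it is adequate when the grid starts far out in the
    Gaussian tail.
    """
    f = [gaussian_profile(p, w) for p in p_values]
    y = [0.0]
    for k in range(1, len(p_values)):
        step = p_values[k] - p_values[k - 1]
        y.append(y[-1] + 0.5 * (f[k] + f[k - 1]) * step)
    return f, y


# Assumed perturbation axis: 801 equally spaced points centred on p = 0.
p_axis = [float(p) for p in range(-400, 401)]
f_sim, y_sim = simulate_intensity(p_axis)
```

Here `f_sim` corresponds to the Gaussian profile of Fig. 1(A) and `y_sim` to the sigmoidal intensity change of Fig. 1(B), up to the assumed grid and scaling.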
Note that j and J correspond, respectively, to the index of the selected window position and the index of the individual spectrum within the window. A dynamic spectrum of the sub-data in the window is described as follows:
$$\tilde{y}_j(p_J) = y_j(p_J) - \bar{y}, \tag{4}$$
where the local average spectrum of the window, taken over its N = 2m + 1 spectra, is given by
$$\bar{y} = \frac{1}{2m+1} \sum_{J=j-m}^{j+m} y_j(p_J). \tag{5}$$
An autocorrelation spectrum is calculated as follows:
$$\Omega_j(p) = \frac{1}{2m} \sum_{J=j-m}^{j+m} \tilde{y}_j(p_J)^{2}. \tag{6}$$
It is worth pointing out that the autocorrelation spectrum, derived from data observed over the period between j − m and j + m, results in an expression similar to that of the statistical variance of the spectral intensity with 2m degrees of freedom. In this study, all the MW2D correlation spectra were calculated with an in-house program written in MATLAB (Version 7.01; The MathWorks, USA).
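The calculation described in this section can be sketched as follows for a single intensity sequence (one spectral channel). This is an illustrative Python sketch, not the in-house MATLAB program. The function name `mw2d_autocorrelation` and the logistic test signal `sigmoid` are assumptions. The window mean is taken over all 2m + 1 points and the sum of squares is divided by 2m, in line with the variance interpretation given above.

```python
import math


def mw2d_autocorrelation(y, m):
    """Moving-window autocorrelation of a 1-D intensity sequence.

    For each window centre j (window size N = 2m + 1), the local mean is
    subtracted to form the dynamic spectrum, and the sum of squared
    deviations is divided by 2m, i.e. the sample variance of the window.
    Returns the list of window-centre indices and the autocorrelation values.
    """
    n_window = 2 * m + 1
    if m < 1 or n_window > len(y):
        raise ValueError("window size 2m + 1 must lie between 3 and len(y)")
    centers, omega = [], []
    for j in range(m, len(y) - m):
        window = y[j - m:j + m + 1]
        mean = sum(window) / n_window
        dynamic = [value - mean for value in window]
        omega.append(sum(d * d for d in dynamic) / (2 * m))
        centers.append(j)
    return centers, omega


# Hypothetical test signal: a logistic step centred on index 50.
sigmoid = [1.0 / (1.0 + math.exp(-(k - 50) / 5.0)) for k in range(101)]
centers, omega = mw2d_autocorrelation(sigmoid, m=3)
peak_position = centers[omega.index(max(omega))]
```

For a flat sequence the autocorrelation is zero everywhere, and for a sigmoidal sequence it peaks where the window sits on the steepest part of the slope, which is the behaviour the window-size study relies on.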
3. Results and discussion
3.1. Effect of window size in the MW2D correlation analysis
MW2D autocorrelation analysis was applied to the simulated data while changing the window size from 3 to 801 in increments of 2. The MW2D autocorrelations for window sizes from 11 to 651 in steps of 40 are shown in Fig. 2. When the window includes only a perturbation region in which the intensity does not change, the autocorrelation becomes 0. On the other hand, the autocorrelation increases when the window includes the slope region in which the intensity increases. Thus, the information about the slope region in the simulated data is obtained as an autocorrelation peak, as shown in Fig. 2. The result shows that the larger the window becomes, the less specific the information obtained along the perturbation direction, e.g., the selective inclusion of only the slope region in the simulated data shown in Fig. 1(B). This means that an excessively large window is ineffective in extracting local characteristic changes in the perturbation direction; it merely yields information broadly distributed in the perturbation direction. On the other hand, when the window is reasonably smaller than the slope region, it works well to extract the information characteristic of the specific local region. A small window is indeed effective for obtaining detailed information along the perturbation direction, but it becomes far from robust when the data contain a high level of noise, as Morita et al. reported [8]. It is worth describing this undesirable effect of an excessively small window from the statistical point of view: a variance with very few degrees of freedom is much less robust for data analysis, especially for data including a high level of noise. One of the main purposes of MW2D correlation analysis is to extract the information about the localized autocorrelation intensities and, at the same time, to obtain as
Fig. 2. A plot of autocorrelation versus window position for window sizes from 11 to 201 in steps of 20.
much information as possible in the direction of the perturbation. In other words, it is to locate the objective region, namely the perturbation region in which the spectral intensity changes most dramatically. For this simulation example, the search corresponds to the identification of the bandwidth of the data. From these points of view, an extreme window size, either too wide or too narrow, is not effective. To overcome this problem, an index for the optimal window size is needed.
3.2. Index for optimal window size
The value of the autocorrelation intensity at each peak is plotted as a function of the window size. In Fig. 3(A), the horizontal axis indicates the window size and the vertical axis the autocorrelation value of the peak. Fig. 3(A) shows that the autocorrelation peak value increases significantly while the window is smaller than needed to include the whole objective region, such as for window sizes of less than 101. On the other hand, it increases much less when the window is so large that it contains an excessively large region, such as for window sizes of more than 401. The autocorrelation peak value increases with the window size. Once the window becomes so large that it includes more than the objective region, the rate of change is reduced, because the window then contains a region with no slope. Therefore, it is possible to locate the objective region by finding the window size that gives the maximum rate of change of the autocorrelation. The first derivative of the autocorrelation peak value was calculated to emphasize the rate of change. The result, shown in Fig. 3(B), indicates that the autocorrelation value increases most noticeably at the window size of 101. This is in good agreement with the bandwidth parameter, i.e., w = 101, of the simulated data shown in Fig. 1(A). The result shows that the region included in each window is strongly associated with the form of the autocorrelation peak. The height of the autocorrelation peak can thus be used as a convenient index for determining the optimal window size. By choosing a proper index, it becomes possible to extract not only the information about the autocorrelation but also the information about the change of the autocorrelation in the perturbation direction. This approach is useful in focusing the 2D correlation analysis exclusively on selected regions for unambiguous analysis.
Fig. 3. (A) A plot of autocorrelation peak value versus window size and (B) first derivative of the autocorrelation peak value.
3.3. A practical example
IR spectra of a PMMA film were measured every 1 °C over a temperature range of 40–200 °C. Detailed experimental conditions were described elsewhere [8]. Fig. 4(A) shows the original IR spectra of PMMA in the 1300–1210 cm⁻¹ region. In this study, we focused on optimizing the window size in the MW2D analysis of the intensity sequence of the C–O stretching band at 1241 cm⁻¹, shown in Fig. 4(B). The autocorrelation was calculated while changing the window size from 3 to 169 in increments of 2. The maximum peak value of each autocorrelation is plotted against the corresponding window size in Fig. 5(A). Fig. 5(B) shows the first derivative of the maximum peak value shown in Fig. 5(A). The optimal window size corresponds to the coordinate on the horizontal axis that gives the maximum first derivative value in Fig. 5(B). Thus, the optimal window size is found to be 135. Fig. 6 shows the autocorrelation calculated with window sizes of (A) 135, (B) 3 and (C) 161. The autocorrelation in Fig. 6(B) is apparently influenced by the noise of the spectral data, which shows clearly that too small a window is strongly affected by noise in such practical data. On the other hand, although a large window is less sensitive to noise, too large a window offers little information in the perturbation direction, as shown in Fig. 6(C): the moving window covers too large a perturbation region to reveal a sequential spectral change in the perturbation direction, and it merely yields information broadly distributed in that direction.
Fig. 4. (A) Temperature-dependent IR spectra of a PMMA film in the 1300–1210 cm⁻¹ region. (B) A plot of spectral intensity at 1241 cm⁻¹ versus temperature from 40 to 200 °C.
Fig. 5. (A) A plot of autocorrelation peak value versus window size for the sequence of spectral intensity at 1241 cm⁻¹. (B) First derivative of the autocorrelation peak value.
In Fig. 6(A), in the case of optimal window
size, the perturbation region where the autocorrelation changes most is calculated from the optimal window size N = 135 and the coordinate on the horizontal axis that gives the maximum autocorrelation, j = 133, as follows:
$$\left[\, j - \frac{N-1}{2},\; j + \frac{N-1}{2} \,\right]. \tag{7}$$
Thus, the perturbation region where the autocorrelation changes most dramatically extends from 66 to 200 °C. A characteristic change of the spectral intensity is also revealed by the MW2D analysis with this optimized window size in Fig. 6(A). The glass transition temperature (Tg) of PMMA is easily found at 110 °C, as Morita et al. reported [8].
Fig. 6. Plots of autocorrelation for window sizes of (A) 135, (B) 3 and (C) 161.
3.4. Limits of the proposed method
The simulation study and practical example indicate that one can obtain an index for the optimal window size from the first derivative of the autocorrelation peak value, as long as the intensity change is described by Eqs. (1) and (2). Because this study is based on this specific case, other situations remain to be discussed, such as other forms of intensity change, e.g., Gaussian, Lorentzian, or sinusoidal shapes.
4. Conclusion
The effect of window size in MW2D correlation analysis was discussed in this study. The MW2D correlation analysis of the simulated data showed that an excessively large window reduces the local information in the objective perturbation region. On the other hand, an excessively small window may suffer from the high noise level in the data. Thus, the selection of the optimal window size is necessary to extract the information along the perturbation direction in MW2D correlation analysis. An index for the optimal window size was also discussed. The autocorrelation peak value for each window size was plotted as a function of the window size, and the first derivative was calculated to emphasize the change in the autocorrelation intensities. The first derivative showed that its peak corresponds to the slope region in the simulated data. These results show that the height of the autocorrelation peak can be used as a convenient index for the optimal window size in a specific case. This index was also applied to the temperature-dependent IR spectra of a PMMA film, and the results revealed that the index can optimize the window size for practical data. It is noted that the result shown in this study is applicable to the specific case of sigmoidally varying signals. Although such a form represents a large class of experimental data encountered in real life, further consideration of the general case is required. As a further study, we are interested in the effects of window size in practical data based on actual experimental measurements, which will be reported elsewhere in the near future.
Acknowledgments
This work was supported by the 'Open Research Center' project (Research Center for Near Infrared Spectroscopy) for private universities: matching fund subsidy from MEXT (Ministry of Education, Culture, Sports, Science and Technology), 2001–2005.
References
[1] I. Noda, Appl. Spectrosc. 47 (1993) 1329.
[2] I. Noda, A.E. Dowrey, C. Marcott, G.M. Story, Y. Ozaki, Appl. Spectrosc. 54 (2000) 236A.
[3] I. Noda, Appl. Spectrosc. 54 (2000) 994.
[4] I. Noda, Y. Ozaki, Two-dimensional Correlation Spectroscopy, John Wiley & Sons, Chichester, West Sussex, 2004.
[5] M. Thomas, H. Richardson, Vib. Spectrosc. 24 (2000) 137.
[6] S. Šašić, Y. Ozaki, Appl. Spectrosc. 57 (2003) 996.
[7] S. Šašić, Y. Katsumoto, H. Sato, Y. Ozaki, Anal. Chem. 75 (2003) 4010.
[8] S. Morita, H. Shinzawa, R. Tsenkova, I. Noda, Y. Ozaki, J. Mol. Struct. in press.