Mechanical Systems and Signal Processing 110 (2018) 139–151
Mechanical Systems and Signal Processing journal homepage: www.elsevier.com/locate/ymssp
Improvement to the sources selection to identify the low frequency noise induced by flood discharge
Jijian Lian, Xiaoqun Wang ⇑, Bin Ma, Dongming Liu
State Key Laboratory of Hydraulic Engineering Simulation and Safety, Tianjin University, Tianjin 300072, China
Article info
Article history: Received 11 October 2017; received in revised form 9 January 2018; accepted 15 March 2018; available online 29 March 2018.
Keywords: Low frequency noise; Flood discharge; Single channel; Blind source separation; Source number estimation; Cross-correlation procedure

Abstract
Recent studies indicate that low frequency noise (LFN) is generated by the flood discharge of high dams. In prototype observation of the LFN, the observed data usually contain multiple sources induced by different mechanisms. To separate and identify the LFN generated by flood discharge from these multiple sources, an improved version of single-channel blind source separation (SCBSS) is proposed. In this study, the source number estimation is improved by a singular entropy (SE) method based on the eigenvalues calculated by principal component analysis (PCA), which yields an SCBSS algorithm that runs without interruption. Both the traditional method and the PCA-SE method may produce extra sources that do not actually exist and are introduced by SCBSS through misjudgment by the operator or underestimation of the threshold. Therefore, a cross-correlation procedure is proposed to identify and eliminate these extra sources, as well as other sources that are not of interest. The proposed method is first applied to a pre-determined signal to validate its effectiveness. Then the LFN data observed during the flood discharge of the Jin’anqiao hydropower station are analyzed and separated using the improved method. Two components, with dominant frequencies of about 0.7 Hz and 0.95 Hz respectively, are successfully recognized as the actual acoustic sources induced by the flood discharge.
© 2018 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. E-mail address: [email protected] (X. Wang). https://doi.org/10.1016/j.ymssp.2018.03.030. 0888-3270/© 2018 Elsevier Ltd. All rights reserved.

1. Introduction
Low frequency noise (LFN), with frequencies ranging from 0.1 Hz to 20 Hz, can travel very far and still be detected thousands of kilometers away from the acoustic source, because the atmosphere absorbs little of its energy [1,2]. Many researchers have reported that houses and workshops near an LFN source vibrate violently and that people living around the source suffer a variety of clinical symptoms [3,4]. However, past studies mainly concentrate on structure-borne LFN [5], such as infrasound generated by wind turbines [6,7], low-speed fans [8], multi-span bridges [9], and wind farms [10]. Very few studies have addressed the LFN induced by flood discharge in hydraulic engineering. Recently, LFN has been detected during flood discharge from many high dams in China, such as the Xiang’jiaba [11], Jin’ping and Jin’anqiao hydropower stations. Therefore, it is important to reveal the mechanism of the LFN induced by flood discharge.
When prototype observations and model experiments are conducted to study the characteristics of the LFN sources, the recorded signal usually contains white noise components, colored noise components and the LFN induced by flood discharge. These components should be separated, and the LFN induced by flood discharge should be identified. However, in practical measurement, only one or two sensors are usually used
to gather the original acoustic signals because the use of multiple channels is usually uneconomical. Therefore, a technique called single channel blind source separation (SCBSS), which can separate multiple sources from a single channel signal, becomes increasingly important.
Blind source separation (BSS) has been studied intensively over the past decade and has been applied in many fields, such as machine fault diagnosis [12,13], speech processing [14] and damping estimation [15]. One effective BSS method is independent component analysis (ICA), which is based on the assumption that all the sources are non-Gaussian and statistically independent. ICA can only solve problems in which the number of recording channels is no smaller than the source number [16,17]. For SCBSS, the recorded single channel signal must therefore be preprocessed by other techniques, such as Fourier transform, wavelet transform, phase space reconstruction [18] and empirical mode decomposition (EMD).
The separability of a single channel signal with multiple sources was first studied by Hopgood and Rayner [19]. One idea of SCBSS is to exploit the inherent time structure of the source signals by learning a priori sets of time-domain basis functions [20]. However, it is difficult to find proper basis functions because little is known about the sources in advance in practical applications. Another idea of SCBSS is to map the single-channel signal into multiple channels by the preprocessing techniques [21]; the sources contained in the multi-channel signals can then be separated by ICA. Since the first combination of multi-channel mapping and ICA, many similar SCBSS methods have emerged and been studied. The single channel ICA (SCICA) [21], which combines Fourier transform and ICA, can only deal with stationary signals.
The wavelet-ICA (WICA) [22], which decomposes the single channel signal by wavelet transform and reconstructs a multi-channel signal as the input of ICA, can deal with non-stationary signals, but it requires selecting a proper wavelet basis function and the number of wavelet decomposition layers. The most appropriate basis function is difficult to find, and the choice greatly influences the performance of WICA. Another version of SCBSS, EMD-ICA, can deal with nonlinear, nonstationary and even chaotic signals [23]. The empirical mode decomposition (EMD) maps the single channel signal into a number of intrinsic mode functions (IMFs), and several independent components are then calculated by applying the ICA algorithm to the IMFs. The sources of interest have to be selected in this method. However, the characteristics of the important sources are uncertain, which makes the selection difficult. Furthermore, the number of IMFs is usually larger than that of the actual sources, which may deteriorate the separation result.
Another algorithm, EEMD-PCA-ICA (ensemble empirical mode decomposition-principal component analysis-independent component analysis) [24,25], employs a more robust version of EMD called ensemble empirical mode decomposition (EEMD) and uses principal component analysis (PCA) to estimate the number of sources. However, EEMD-based methods usually suffer from the so-called edge effect, so an edge effect elimination method called extreme point symmetry extension (EPSE) was introduced by Guo et al. [26]. Furthermore, one has to interrupt the algorithm to determine the number of sources to be separated, which is inconvenient, especially when many single channel signals have to be processed. In addition, the source number determined is not always appropriate.
If the selected source number is larger than the actual number, the extra sources are illusory: they do not actually exist but are produced by the algorithm. Also, in LFN observation there may be some colored sources that occur casually and are not the sources we are concerned about. The existing algorithm provides no specific guidance on how to identify and eliminate the extra sources, and researchers in the past usually selected the sources of interest directly. However, we have no a priori knowledge about the characteristics of the LFN sources induced by flood discharge, which makes the selection more difficult.
To overcome the problems of EEMD-PCA-ICA, we introduce a cross-correlation analysis procedure to eliminate the extra sources, called ghost sources (GS), and the casually-occurred sources (CS). In the improved separation algorithm, the number of sources is determined directly by a threshold selected by pre-experiment. The separation then runs with no additional intervention, and its output is ready for further analysis to eliminate the GS and CS.
This study is organized as follows. In Section 2, the EEMD-PCA-ICA method is briefly reviewed. To overcome the shortcomings of previous approaches, the source number estimation procedure in EEMD-PCA-ICA is simplified by the singular entropy (SE) method along with a threshold. In addition, a cross-correlation procedure to eliminate unimportant sources is introduced into the separation algorithm. In Section 3, a pre-determined experiment is conducted to validate the proposed method. In Section 4, the validated method is applied to the LFN data observed during flood discharge. Two source components are identified and recognized as the actual sources. It is also demonstrated that the proposed approach can be used to determine the location of the actual sources of the LFN induced by flood discharge. Conclusions are drawn in Section 5.
2. Methodology
2.1. Description of the problem and review of EEMD-PCA-ICA
Multiple acoustic sources are usually mixed in the single channel signal during the observation of the LFN induced by flood discharge. In this paper, it is assumed that the observed signal is a linear combination of all the acoustic sources:

$$x(n) = a_1 s_1(n) + a_2 s_2(n) + \ldots + a_i s_i(n) + \ldots + a_N s_N(n) + \sigma n(n) \qquad (1)$$

where $x(n)$ is the observed signal from a single channel; $n(n)$ is the white Gaussian noise component, which is inevitable in real-life observation; $\sigma$ is the standard deviation of the noise; $a_i s_i(n)$ are the individual sources. For example, besides the LFN sources induced by flood discharge, some casually-occurred sources (CS) that are not induced by flood discharge (e.g., human talking, other working noise on the observation site, passing cars and other artificial sources) are recorded by the LFN sensors during prototype observation at the Jin’anqiao hydropower station. The multiple sources in the observed signal should be separated, and the CS should be eliminated because they are not the sources we are concerned about.
In order to separate the observed sources, EEMD-PCA-ICA has been adopted to analyze the signal. To understand the EEMD-PCA-ICA method, the relationship between independent component analysis (ICA) and ensemble empirical mode decomposition (EEMD) should be reviewed first. ICA is employed to estimate N unknown sources from M measurement signals resulting from the mixture of the sources through unknown channels. If A is defined as the mixing matrix, the relationship between measured signals and sources reads:
$$\mathbf{X} = \mathbf{A}\mathbf{S} \qquad (2)$$

where $\mathbf{X} = [x_1(n), x_2(n), \ldots, x_i(n), \ldots, x_M(n)]^T$, with $x_i(n)$ the measured signals; $\mathbf{S} = [s_1(n), s_2(n), \ldots, s_j(n), \ldots, s_N(n)]^T$, with $s_j(n)$ the unknown sources. ICA has been applied in many different fields [12–15]. The basic assumptions of ICA are that the sources $s_i(n)$ must be statistically independent and that each of them should be a non-Gaussian signal. It can solve problems with $M \geq N$ [16,17].
The EEMD algorithm is a more robust version of EMD. It is an effective method for non-stationary signals with random vibration [27], which decomposes a single channel signal $x(n)$ into a small number of intrinsic mode functions (IMFs):
$$\mathrm{EEMD}(x(n)) = \sum_{i=1}^{M} imf_i(n) + r \qquad (3)$$

where $r$ denotes the residual and $M$ is usually larger than the source number $N$; $imf_i(n)$ is the i-th IMF. A pseudo M-channel signal can be constructed from the $imf_i(n)$ as an input of the ICA algorithm. Therefore, ICA is a statistical supplement of EEMD, and EEMD helps ICA to solve the single channel problem. In addition, the dimension of the pseudo M-channel signal should be reduced by principal component analysis (PCA) before it is input into the ICA algorithm, because the number of IMFs is usually larger than the source number.
PCA is an effective method of dimension reduction, which was first introduced by Pearson [28] and is most widely used in data compression and image processing [29,30]. It converts an M-dimensional vector $\mathbf{X} = [x_1, \ldots, x_M]^T$ into an N-dimensional vector $\mathbf{Y} = [y_1, \ldots, y_N]^T$ through an orthogonal transformation [31]:
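$$\mathbf{Y}_{N \times L} = \mathbf{W}^T \mathbf{X}_{M \times L} \qquad (4)$$

where $y_i = \sum_{k=1}^{M} w_{ki} x_k = \mathbf{w}_i^T \mathbf{X}$ (i = 1, 2, ..., N); the $y_i$ are uncorrelated with each other; $x_j$ (j = 1, 2, ..., M) are all zero-mean signals; $\mathbf{W}^T = [\mathbf{w}_1, \ldots, \mathbf{w}_N]^T$ is the orthogonal matrix, in which N is between 1 and M. The objective function to calculate $\mathbf{w}_i^T$ is the maximization of the variance of $y_i$:

$$J_i^{PCA}(\mathbf{w}_i) = E\{y_i^2\} = E\{(\mathbf{w}_i^T \mathbf{X})^2\} = \mathbf{w}_i^T E\{\mathbf{X}\mathbf{X}^T\} \mathbf{w}_i = \mathbf{w}_i^T \mathbf{C}_X \mathbf{w}_i \qquad (5)$$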
where $\|\mathbf{w}_i\| = 1$; $\mathbf{C}_X$ denotes the $M \times M$ covariance matrix of $\mathbf{X}_{M \times L}$; a solution of this optimization problem is $\mathbf{w}_i = \mathbf{e}_i$, where the eigenvectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N$ of $\mathbf{C}_X$ are ordered so that the corresponding eigenvalues sort from the largest to the smallest ($d_1 \geq d_2 \geq \ldots \geq d_N$). The magnitude of $d_i$ represents the magnitude of the variance of $y_i$. If $d_i$ is small, then $y_i$ can be neglected. The dimension of $\mathbf{Y}$ can be determined in this way. PCA provides the best lower dimensional approximation of the data vectors and filters out part of the noise components [32], although EMD can also be used as a noise filter [33].
The flow chart of the EEMD-PCA-ICA method [24] can be seen in Fig. 1. Firstly, an observed single-channel signal $x(n)$ is decomposed into IMFs by EEMD. The pseudo multi-channel signal consists of all the $imf_i(n)$ except $imf_1(n)$. Secondly, the dimension of the multi-channel signal is determined by PCA. In this stage, the PCA algorithm calculates a set of eigenvalues, and the number of sources to be estimated is determined from these eigenvalues by judgment. A new dimension-reduced signal is then ready. Finally, the signal from PCA is set as the input of the ICA algorithm to obtain the estimated sources $s_i(n)$.
In the EEMD-PCA-ICA method, one has to interrupt the algorithm to determine the number of sources to be separated. This is inconvenient, especially when there are many single channel signals to deal with. In addition, researchers usually have to select the sources of interest. However, they have no a priori knowledge about the characteristics of the LFN sources induced by flood discharge, which makes the selection difficult.

Fig. 1. The flow chart of the EEMD-PCA-ICA method (black part) and the differences of source number estimation proposed in this study (red part). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

2.2. Improvement and simplification of source number estimation
The source number estimation in EEMD-PCA-ICA is based on PCA. However, PCA only calculates a set of eigenvalues, and some criterion must be introduced to help determine the source number. Based on PCA, Karhunen et al. [32] introduced Akaike’s information criterion (AIC) and the minimum description length (MDL) criterion [34] to estimate the number. Cheng et al. [35] introduced a method for source number estimation based on the statistical characteristics of ICA results and clustering evaluation analysis. Nevertheless, the existing thresholds are only applicable when the number of sensors M is larger than the number of sources N. Here, the singular entropy (SE) method of order determination, which is widely used in the analysis of hydraulic structures [36], is employed as the criterion, and the estimation procedure is simplified by a threshold determined by pre-experiment. Based on the eigenvalues calculated by PCA, the SE values are calculated and a threshold can be used to estimate the source number:
$$E_M = \sum_{i=1}^{M} \Delta E_i \qquad (6)$$

$$\Delta E_i = -\left( d_i \Big/ \sum_{k=1}^{M} d_k \right) \log\left( d_i \Big/ \sum_{k=1}^{M} d_k \right) \qquad (7)$$

where $d_i$ are the eigenvalues calculated by PCA; $E_M$ denotes the singular entropy; $\Delta E_i$ is the increment of singular entropy contributed by the i-th principal eigenvalue. When $\Delta E_i \leq \varepsilon$, the source number can be determined as $N = i - 1$, where the threshold $\varepsilon$ should be determined in pre-experiments. Once the threshold is selected, the source number can be estimated with no additional intervention in the source separation process. The improvement to source number estimation by PCA-SE is shown in the red dotted frame of Fig. 1. It is more convenient in practical application because one does not have to interrupt the algorithm to choose the source number.

2.3. Improvement in sources selection
When the source number is estimated, both the PCA and threshold methods may result in extra components due to misjudgment by the operators. These extra components are called ghost sources (GS) because they do not actually exist. There are also CS recorded by the sensors during the observation of LFN. Both GS and CS should be recognized and eliminated, as they are not the sources we are really concerned about. Researchers in the past usually selected the sources of interest directly. However, we have no a priori knowledge about the characteristics of the LFN sources induced by flood discharge. To eliminate the GS under this condition, we introduce a cross-correlation procedure, which needs another sensor to observe another single channel signal simultaneously. The additional signal contains the same sources but with a time-lag:
$$x(n - \tau f_s) = a_1 s_1(n - \tau f_s) + \ldots + a_N s_N(n - \tau f_s) + a_0 n(t) \qquad (8)$$

where the time-lag $\tau = \Delta L / c$, $\Delta L$ being the difference between the distances from the source location to the two sensors and $c$ the sound speed, and $f_s$ is the sampling frequency of the sensor. Two groups of sources can be estimated by the improved SCBSS:

$$\mathbf{S}_1(t) = [v_{11}, v_{12}, \ldots, v_{1l}]^T \leftrightarrow [a_{11}s_1, \ldots, a_{1N}s_N, gs_{11}, \ldots, gs_{1(l-N)}]^T \qquad (9)$$

$$\mathbf{S}_2(t) = [v_{21}, v_{22}, \ldots, v_{2k}]^T \leftrightarrow [a_{21}s_1, \ldots, a_{2N}s_N, gs_{21}, \ldots, gs_{2(k-N)}]^T \qquad (10)$$

where $\mathbf{S}_1(t)$ and $\mathbf{S}_2(t)$ are estimated from $x(n)$ and $x(n - \tau f_s)$ respectively; $gs_{ij}$ (j = 1, 2, ..., l − N when i = 1; j = 1, 2, ..., k − N when i = 2) denotes the ghost sources; $l$ and $k$ are the estimated numbers of sources and are not necessarily equal; the order of $s_i$ (i = 1, 2, ..., N) and $gs_{ij}$ is undetermined, so we can only get $v_{ij}$ (j = 1, 2, ..., l when i = 1; j = 1, 2, ..., k when i = 2). The cross-correlation function is used to recognize whether $v_{1i}$ and $v_{2j}$ are from the same source:

$$R(v_{1i}, v_{2j}, \tau) = \begin{cases} \sum_{n=0}^{M - \tau f_s - 1} v_{1i}(n + \tau f_s)\, v_{2j}(n), & \tau \geq 0 \\ R(v_{2j}, v_{1i}, -\tau), & \tau < 0 \end{cases} \qquad (11)$$
Fig. 2. The process of the sources selection, the superscript, t1, t2 and tn, indicate that the corresponding signals are observed in different time periods.
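The lagged cross-correlation of Eq. (11) can be illustrated with a minimal NumPy sketch. Two things in it are assumptions made here rather than details given in the text: the lag is passed directly in samples (that is, the product of the time-lag and the sampling frequency), and the sum is normalised by the energies of the overlapping segments so that the result lies in [−1, 1], because the text later expects coefficients "close to 1" for matching sources. The function name `cross_correlation` is hypothetical.

```python
import numpy as np

def cross_correlation(v1, v2, lag):
    """Normalised lagged cross-correlation in the spirit of Eq. (11).

    lag is given in samples (tau * fs). The normalisation by segment
    energies is an assumption; Eq. (11) itself is an unnormalised sum.
    """
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    if lag < 0:
        # Second branch of Eq. (11): swap the arguments and flip the lag sign.
        return cross_correlation(v2, v1, -lag)
    m = min(len(v1) - lag, len(v2))  # number of overlapping samples
    if m <= 1:
        return 0.0
    # Overlapping segments v1(n + lag) and v2(n), each centred on its mean.
    a = v1[lag:lag + m] - v1[lag:lag + m].mean()
    b = v2[:m] - v2[:m].mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0
```

With this convention, a component `v1` that satisfies `v1[n + lag] == v2[n]` returns a value of 1 at that lag, and smaller values at other lags.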
Two important assumptions should be made to distinguish $s_i$ and $gs_{ij}$:
(1) The threshold is small enough to estimate all the sources, which means $l, k \geq N$;
(2) The magnitudes of the cross-correlation function follow the order:

$$R(a_{1i}s_i, a_{2i}s_i, \tau) > R(a_{1i}s_i, a_{2j}s_j, \tau) \approx R(v_{1i}, gs_{2j}, \tau) \approx 0, \quad \text{where } i \neq j \qquad (12)$$

Then a calculating process between two matrices is defined:

$$\mathbf{S}_1 \otimes \mathbf{S}_2 = \boldsymbol{\rho} = [\rho_1, \rho_2, \ldots, \rho_l] \qquad (13)$$

Step 1: Calculate the matrix $\tilde{\rho}_{l \times k}$, where $\tilde{\rho}_{ij} = R(v_{1i}, v_{2j}, \tau)$;
Step 2: If $\max(\tilde{\rho}_{ij}) = \tilde{\rho}_{hg}$, then $\rho_h = \tilde{\rho}_{hg}$;
Step 3: Set the h-th row and g-th column of $\tilde{\rho}_{l \times k}$ to zero. Go back to Step 2 and repeat the loop until all the $\rho_i$ have been assigned a value.

A cross-correlation coefficient vector, $\boldsymbol{\rho} = [\rho_1, \rho_2, \ldots, \rho_l]$, is obtained after the process above. Based on the assumptions made above, the sources (or CS) are found: if $v_{1i}$ is an actual source (or CS), $\rho_i$ should be close to 1 or significantly larger than the others.
It is still possible that a CS is identified as an actual source by mistake. Fortunately, a CS usually lasts for a short time, while the actual sources should persist throughout the flood discharge process. If the LFN signal induced by flood discharge is a stationary process, the CS can be eliminated by the cross-correlation procedure discussed above. To eliminate CS, the observed signal from another sensor should be replaced with a signal from the same sensor with a time lag. The calculation of $\tilde{\rho}_{ij}$ should be modified into:
$$\tilde{\rho}_{ij} = \max\{R(v_{1i}, v_{2j}, \tau)\}, \quad \tau \in [0, 1/f_c] \qquad (14)$$

where $f_c$ is the smallest frequency that we are concerned about, which means that $\tilde{\rho}_{ij}$ is the maximum of the correlation function in one period. To prevent the error of a single calculation, the process above can be expanded to an ensemble average:

$$\bar{\boldsymbol{\rho}} = \lim_{n \to \infty} \left( \frac{1}{n} \sum_{i=1}^{n} \mathbf{S}_1^{t_i} \otimes \mathbf{S}_2^{t_i} \right) = [\bar{\rho}_1, \bar{\rho}_2, \ldots, \bar{\rho}_l] \qquad (15)$$

where $\bar{\boldsymbol{\rho}}$ is the ensemble average vector of the cross-correlation coefficient. The variation of $n$ denotes that the observation is conducted in different time periods $t_i$. Theoretically, when $n \to \infty$, if $v_{1i}$ is an actual source, the corresponding $\bar{\rho}_i$ should be much larger than the other $\bar{\rho}_i$. The improvement in sources selection is shown in Fig. 2.

3. Case study 1: pre-determined signal
3.1. Experimental setup
The EEMD-PCA-ICA algorithm has been shown to estimate the sources with high correlation coefficient and low relative root mean square error (RRMSE) [24–26]. In this section, a pre-determined experiment is set up only to validate the improved source number estimation and source identification discussed in Section 2. To this end, a series of sinusoidal signals with three sources and white Gaussian noise is constructed as follows:
$$x(t) = a_1 s_1(t) + a_2 s_2(t) + a_3 s_3(t) + a_0 n(t) \qquad (16)$$

where $a_1 s_1 = 6\sin(10\pi t)$, $a_2 s_2 = 8\sin(30\pi t)$ and $a_3 s_3 = 7\sin(80\pi t)$ are components that simulate the acoustic sources mixed into the single observed signal $x(t)$; $n$ denotes the white Gaussian noise and $a_0$ is the coefficient that controls the noise level. The sampling frequency is 400 Hz and the sampling time is 15 s, so the number of samples is 6001. The measure of the noise level of the signal is the noise to signal ratio (NSR) [24]:

$$NSR(a_0) = RMS(a_0 n) / RMS(x - a_0 n) \qquad (17)$$

where RMS denotes the root mean square. The performance of the algorithms can be evaluated by the correlation coefficient. For convenience of programming, it can be expressed as:

$$RE_i = \max_{j=1,2,3} \frac{Cov(s_i, \hat{s}_j)}{\sqrt{Var(s_i)\,Var(\hat{s}_j)}} \qquad (18)$$

where $RE_i$ denotes the correlation coefficient between $s_i$ and the estimated source $\hat{s}_j$; $Cov(s_i, \hat{s}_j)$ is the covariance of $s_i$ and $\hat{s}_j$; $Var(s_i)$ is the variance of $s_i$.
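The test signal of Eq. (16) and the measures of Eqs. (17) and (18) are simple enough to sketch directly. The following NumPy snippet is only an illustration under stated assumptions: the random seed, the function names (`make_mixture`, `rms`, `recovery_coefficient`) and the use of `numpy.corrcoef` for the Pearson coefficient are choices made here, not details taken from the paper.

```python
import numpy as np

FS = 400          # sampling frequency in Hz (from the text)
DURATION = 15     # sampling time in s (from the text)
t = np.arange(FS * DURATION + 1) / FS   # 6001 samples, end point included

# The three sinusoidal sources of Eq. (16): 5 Hz, 15 Hz and 40 Hz.
sources = np.vstack([
    6 * np.sin(10 * np.pi * t),
    8 * np.sin(30 * np.pi * t),
    7 * np.sin(80 * np.pi * t),
])

def rms(x):
    """Root mean square of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))

def make_mixture(a0, seed=0):
    """Return the mixed signal of Eq. (16) and its NSR from Eq. (17).

    The seed is an arbitrary choice for reproducibility (assumption).
    """
    rng = np.random.default_rng(seed)
    noise = a0 * rng.standard_normal(t.size)
    clean = sources.sum(axis=0)
    return clean + noise, rms(noise) / rms(clean)

def recovery_coefficient(s_true, estimates):
    """Eq. (18): largest Pearson correlation between s_true and any estimate."""
    return max(float(np.corrcoef(s_true, est)[0, 1]) for est in estimates)
```

Calling `make_mixture(1.0)` gives an NSR near 0.116, in line with the lower end of the NSR range reported in Section 3.2. Because ICA recovers sources only up to sign and scale, a practical implementation might compare absolute correlation values; Eq. (18) as printed does not.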
3.2. Results and discussions
Both PCA and PCA-SE with a threshold are used to estimate the sources of the single mixed signal $x(t)$ under different NSR. $NSR(a_0)$ ranges from 0.1138 to 1.7423 as $a_0$ ranges from 1 to 15. For PCA, Fig. 3(a) shows the eigenvalue distribution under different $NSR(a_0)$. The IMF components calculated from EEMD are concentrated on the larger eigenvalues, so the number of sources is determined accordingly by judgment. For PCA-SE with a threshold, Fig. 3(b) shows the $\Delta E_i$ calculated by the PCA and SE methods, and the source number is determined by the threshold $\varepsilon$. If the threshold is selected reasonably, it yields the correct result. Comparing the eigenvalues with $\Delta E_i$ in Fig. 3, the $\Delta E_i$ for a given $i$ varies less across NSR than the eigenvalues do, which makes the SE method more suitable for estimating the source number with a threshold. Table 1 shows the source number estimation of both methods. The number estimated by PCA-SE is a little larger than that of PCA, because $\varepsilon$ is set to 0.1, which is very small, to guarantee the first assumption made in Section 2.3. Nevertheless, Fig. 4 shows that the $RE_i$ of the two methods under different NSR differ little, which indicates that even if $\varepsilon$ is chosen a little smaller and the source number is estimated a little larger, the result will not seriously deteriorate.
The estimated number increases as $NSR(a_0)$ becomes larger, and neither method reliably recovers the actual number, which is 3. The excess number results in extra sources in the final estimation. As can be seen in Fig. 5, two extra sources are identified by the threshold method. They are GS, as they do not exist at all. To eliminate the GS, another signal, $x(t - 1)$, which contains the same sources, is simulated with a time-lag of 1 s. This signal is also separated by the proposed SCBSS method.
Executing the process in the green dashed frame in Fig. 2, the cross-correlation coefficient vector of the two SCBSS results is calculated. The result is $\boldsymbol{\rho} = [0.9571, 0, 0.8919, 0.8514, 0]$, which means that the sources in Fig. 5(a1), (a3) and (a4) are the actual sources, while the sources in Fig. 5(a2) and (a5) are the GS.
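The matching loop of Section 2.3 (Steps 1–3), which produces a vector of this form, can be sketched as a greedy assignment over the matrix of pairwise correlation coefficients. This is a minimal illustration, not the authors' implementation: the function name `match_sources` is hypothetical, and stopping once no positive entry remains is an assumption that leaves unmatched components at zero, as in the vector reported above.

```python
import numpy as np

def match_sources(rho_tilde):
    """Greedy matching following Steps 1-3 of Section 2.3.

    rho_tilde is the l x k matrix of correlation coefficients between the
    l components of the first separation and the k components of the second.
    Returns the coefficient vector rho (length l) and the matched index pairs.
    """
    work = np.array(rho_tilde, dtype=float)   # copy, the input is not modified
    n_rows, n_cols = work.shape
    rho = np.zeros(n_rows)
    pairs = []
    for _ in range(min(n_rows, n_cols)):
        # Step 2: locate the largest remaining coefficient.
        h, g = np.unravel_index(np.argmax(work), work.shape)
        if work[h, g] <= 0.0:
            break                              # nothing left to match (assumption)
        rho[h] = work[h, g]
        pairs.append((int(h), int(g)))
        # Step 3: zero the matched row and column, then repeat.
        work[h, :] = 0.0
        work[:, g] = 0.0
    return rho, pairs
```

For a hypothetical 5 × 3 matrix whose largest entries are 0.9571, 0.8919 and 0.8514 in rows 1, 3 and 4, the function returns a vector with those three values in place and zeros in rows 2 and 5, which are then read as ghost sources.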
Fig. 3. Sources number estimation of both methods: (a) the eigenvalues bar shown by PCA in different NSR; (b) the $\Delta E_i$ calculated by PCA-SE method. [Plot: horizontal axis i = 1 to 10; vertical axes Eigenvalue in (a) and $\Delta E_i$ in (b); series for NSR = 0.3387, 0.4652, 0.5907, 0.6915, 0.8089, 0.9324, 1.0448, 1.1553 and 1.2507; threshold line ε = 0.1 in (b).]
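The PCA-SE rule of Eqs. (6) and (7) can be sketched in a few lines of NumPy. The snippet is an illustration under assumptions: the natural logarithm is used (the text does not state the base), the covariance is normalised by the number of samples, and the function names are hypothetical.

```python
import numpy as np

def pca_eigenvalues(channels):
    """Eigenvalues of the covariance of an M x L pseudo multi-channel signal,
    sorted from largest to smallest."""
    x = np.asarray(channels, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)      # zero-mean rows, as PCA assumes
    cov = x @ x.T / x.shape[1]
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

def singular_entropy_increments(eigenvalues):
    """Eq. (7): -(p_i) * log(p_i) with p_i = d_i / sum(d); natural log assumed."""
    d = np.clip(np.asarray(eigenvalues, dtype=float), 0.0, None)
    p = d / d.sum()
    increments = np.zeros_like(p)
    positive = p > 0
    increments[positive] = -p[positive] * np.log(p[positive])
    return increments

def estimate_source_number(eigenvalues, eps):
    """Return N = i - 1 for the first (1-based) i with increment <= eps."""
    increments = singular_entropy_increments(eigenvalues)
    below = np.nonzero(increments <= eps)[0]
    # The 0-based position of the first small increment equals i - 1.
    return int(below[0]) if below.size else len(increments)
```

For example, hypothetical eigenvalues `[32, 24.5, 18, 1.0, 0.3, 0.2]` with `eps = 0.1` give an estimate of 3. Note that the increment of Eq. (7) is not monotonic in the eigenvalue share, so the rule presumes that no single eigenvalue dominates the sum.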
Table 1
The estimation of sources number of both methods in different NSR (actual sources number: 3).

NSR                0.12   0.46   0.92   1.15   1.38   1.74
PCA                3      4      4      4      5      5
PCA-SE (ε = 0.1)   4      5      5      5      5      6
Fig. 4. The separating effect of two methods basing on different source number estimation methods; method 1: source number estimated by PCA, method 2: source number estimated by PCA-SE.

Fig. 5. The independent sources estimated by the SCBSS based on the PCA-SE to estimate the source number: ε = 0.1, NSR = 0.92. [Plot: panels (a) show amplitude versus time (s) and panels (b) the corresponding PSD versus frequency (Hz); annotated peaks f(S1) = 5 Hz, f(S2) = 15 Hz and f(S3) = 40 Hz; two panels are labelled "Ghost Source".]
4. Case study 2: LFN induced by flood discharge
The improved approach has been validated in Section 3. In this section, the approach is applied to a practical engineering case.

4.1. Arrangement of prototype observation
Jin’anqiao Hydropower Station, located in Lijiang, Yunnan province of China, has a concrete gravity dam with a maximum dam height of 160 m and a dam axis of 640 m. Every year during flood season, it discharges flood through five overflows. Since the first discharge, it has been found that doors, windows and roofs of workshops and houses near the dam vibrate violently and make continuous squeaky noise during flood discharge. This phenomenon presumably results from a kind of secondary noise caused by the LFN induced by flood discharge. A prototype measurement, which lasted 19 days, recorded the LFN signals near the dam under different flood discharging conditions.
The measuring points should cover a large enough area because LFN can travel very far, and the points should not be too close to each other because the wavelength of LFN is relatively long. Also, at least two synchronous observation points should be set so as to locate the physical sources of LFN. Fig. 6 shows the arrangement of the 33 measuring points, including the upstream arrangement (UR1–UR2), the axis of the dam (M2–M5), and the downstream arrangement (left bank: DL1–DL12 and P1–P4; right bank: DR1–DR11). We set P1-P2 and P3-P4 as synchronous observation points, and two sensors were placed at these points for continuous observation (saving a file every minute). The sensors (CASI-DBF) used in this prototype measurement are from the Institute of Acoustics, Chinese Academy of Sciences. The sensitivity of the sensor is 150 mV Pa⁻¹ and the range of frequency response is 0.1–300 Hz.
The time-history curve and power spectrum density (PSD) of DL4 when the discharge Q = 0 (no flood discharge) and Q = 3222 m³ s⁻¹ are shown in Fig. 7. The PSD of LFN is estimated using the autoregressive (AR) model with the Burg algorithm, which is one of the most frequently used parametric methods for processing various signals [37,38]. As can be seen in Fig. 7, the amplitude of the LFN during flood discharge is remarkably larger than under the condition of no discharge. In addition, there is an obvious peak frequency within 20 Hz in the PSD curve during flood discharge, while the PSD curve shows no obvious peak frequency when Q = 0. Evidently, LFN was generated during the flood discharge. Fig. 8 shows the relationship between LFN intensity and the flow rate Q. It can be seen that the LFN intensity and the flow rate are positively related.

Fig. 6. LFN observation points arrangement: UR1–UR2 are on the right bank of upstream, M2–M5 are on the axis of dam, DL1–DL3 are between pressure piping and powerhouse, DL4–DL12 are on the left bank of downstream, DR1–DR11 are on the right bank of downstream; P1, P2 and P3, P4 are synchronous observation points.

Fig. 7. Comparisons of the time history and PSD curves of the LFN signal of point DL4. [Plot: amplitude (Pa) versus time (s) and PSD versus frequency (Hz), each for Q = 0 m³/s and Q = 3222 m³/s.]

Fig. 8. Correlation between the LFN intensity and Q.

4.2. SCBSS of LFN
The LFN data of the synchronous observation points (P1-P2 and P3-P4) are separated by the improved method proposed in Section 2. LFN data from P1 and P2 are set as references. Their time-history curves are shown in Fig. 9(a) and (h). The threshold for source number estimation is set to 0.15 through pre-experiments. The separation results can also be seen in Fig. 9. The LFN data of P1 and P2 observed in other time periods (172 files in total, observed in 172 min) are used for the cross-correlation procedure. Four correlation vectors are calculated:
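$$\boldsymbol{\rho}_{P1} = [0.2195, 0.0720, 0.0831, 0.5912, 0.4569, 0.0893] \qquad (19)$$

$$\bar{\boldsymbol{\rho}}_{P1} = [0.0864, 0.0642, 0.1510, 0.1903, 0.1415, 0.0769] \qquad (20)$$

$$\boldsymbol{\rho}_{P2} = [0.0893, 0.0831, 0.0720, 0.2195, 0.4569, 0.5912] \qquad (21)$$

$$\bar{\boldsymbol{\rho}}_{P2} = [0.0787, 0.1158, 0.0452, 0.0884, 0.1819, 0.2225] \qquad (22)$$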
Firstly, the GS can be eliminated by qP1 and qP2 . It is obvious that the P1-ICA1, P1-ICA4 and P1-ICA5 (correspondingly, P2ICA4, P2-ICA6 and P2-ICA5) are LFN by discharge or CS because the corresponding correlation coefficients in qP1 or qP2 are P2 . The P1-ICA1 and P1 and q significantly larger than others. Other sources are all GS. Secondly, the CS can be eliminated by q P2-ICA4 are recognized as CS. Finally, two sources are recognized and their PSD curves are shown in Fig. 10. P1-ICA4 and P2ICA6 may be from the same sources and their peak frequency is about 0.7 Hz, while P1-ICA5 and P2-ICA5 might be from the same sources and their peak frequency is about 0.95 Hz. 4.3. Results and discussions The location of the acoustic sources of the LFN induced by flood discharge is usually thought to be in the stilling basin [11]. The sources identified by the SCBSS method used in the Section 4.2 can be used further to locate the acoustic sources. Therefore, to validate whether the actual sources identified and recognized are the LFN induced by flood discharge, the data of P1 P4 is used to find the source locations. Take P1-ICA4 and P2-ICA6 in Section 4.2 for an example: the curve of their cross correlation coefficient function is shown in Fig. 11. The cross correlation reaches maximum value at a time delay Dt ¼ 0:29 s , which should be the time-lag for the sound propagating from acoustic sources to P1 and P2. A coordinate system can be established in Fig. 12 and the position equations of S(x, y) can be expressed as follows:
$$\sqrt{x^2 + y^2} - \sqrt{(x - d)^2 + y^2} = (r + \Delta d) - r = \Delta d \tag{23}$$

where $\Delta d = c_0 (t + \Delta t) - c_0 t = c_0 \Delta t$, and $c_0$ denotes the speed of sound in air.
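The chain from a cross-correlation peak to Eq. (23) can be illustrated with a short Python sketch. It is not the authors' code, and several quantities are assumptions made only for the example: the speed of sound `C0 = 340` m/s, the sampling rate `fs`, the sensor spacing `d = 200` m, the synthetic noise signals, and the function names. The sensor at the origin is taken to be the one the sound reaches later. Because Eq. (23) fixes the difference of the two distances, the candidate locations lie on one branch of a hyperbola whose foci are the two sensors; the sketch builds one point on that branch and checks that it satisfies the equation.

```python
import numpy as np

C0 = 340.0  # assumed speed of sound in air (m/s); the text gives no value


def estimate_delay(sig_a, sig_b, fs):
    """Delay (s) of sig_a relative to sig_b at the cross-correlation peak."""
    a = sig_a - np.mean(sig_a)
    b = sig_b - np.mean(sig_b)
    xcorr = np.correlate(a, b, mode="full")
    lags = np.arange(-(len(b) - 1), len(a))  # lag in samples for each xcorr entry
    return lags[np.argmax(xcorr)] / fs


def eq23_residual(x, y, d, delta_d):
    """Left side minus right side of Eq. (23) for a candidate source S(x, y)."""
    return np.hypot(x, y) - np.hypot(x - d, y) - delta_d


# Synthetic check of the delay estimate (hypothetical sampling rate and signals).
fs = 100.0
shift = 29                      # samples, i.e. 0.29 s at fs = 100 Hz
rng = np.random.default_rng(0)
base = rng.standard_normal(2000 + shift)
sig_b = base[shift:]            # sensor reached first
sig_a = base[:2000]             # same waveform, 29 samples later
delta_t = estimate_delay(sig_a, sig_b, fs)

# Range difference, Delta d = c0 * Delta t.
delta_d = C0 * delta_t

# One point on the matching hyperbola branch for a hypothetical spacing d.
d = 200.0
a_h = delta_d / 2.0                        # semi-transverse axis
b_h = np.sqrt((d / 2.0) ** 2 - a_h ** 2)   # semi-conjugate axis
u = 0.8                                    # arbitrary branch parameter
x_s = d / 2.0 + a_h * np.cosh(u)
y_s = b_h * np.sinh(u)
residual = eq23_residual(x_s, y_s, d, delta_d)

print(delta_t, delta_d, residual)
```

A second sensor pair gives a second curve of the same kind, and the candidate source positions are the points where both residuals vanish.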
[Fig. 9 panels: Original signal of P1; P1-ICA1 to P1-ICA6; Original signal of P2; P2-ICA1 to P2-ICA6. Vertical axis: Amplitude.]
Fig. 9. The time history curves of: (a)–(g) the original signal of P1 and its six independent sources calculated by SCBSS; (h)–(n) the original signal of P2 and its six independent sources.
A hyperbola lining out the probable locations of the source of P1-ICA4 (or P2-ICA6) is obtained by substituting the parameters of P1 and P2 into Eq. (23), which fixes the difference between the distances from the source to the two observation points. As can be seen in Fig. 12, another hyperbola can be drawn by substituting the parameters of P3 and P4 into Eq. (23). These two hyperbolas intersect at two points, one of which is in the stilling basin. Thus, the improved SCBSS method proposed in Section 2 is effective: the acoustic sources of the LFN induced by flood discharge are usually thought to be located in the stilling basin, and the actual source in Fig. 10(a) identified by the method is found to be located in the stilling basin. It should be mentioned that, in the actual calculation, the coordinates of the stilling basin were substituted into Eq. (23) and the equation was found to be satisfied, which confirms that both hyperbolas pass through the stilling basin.

[Fig. 10 panels: (a) PSD of P1-ICA4 and PSD of P2-ICA6, with annotated peaks at 0.7167 Hz and 0.6500 Hz; (b) PSD of P1-ICA5 and PSD of P2-ICA5, with annotated peaks at 0.9667 Hz and 0.9167 Hz. Axes: Frequency (Hz), PSD.]
Fig. 10. The PSD curves of the recognized actual sources: (a) P1-ICA4 and P2-ICA6, their correlation coefficient is 0.5912; (b) P1-ICA5 and P2-ICA5, their correlation coefficient is 0.4569.

[Fig. 11 axes: Time (s), xcorr; annotated time-lag = 0.2917.]
Fig. 11. The cross-correlation function of P1-ICA4 and P2-ICA6.

Fig. 12. The coordinate system to find the location of the source.

5. Conclusions

In this study, the source number is estimated by the SE method based on the eigenvalues calculated by PCA, with the threshold determined by pre-experiment. This lets the SCBSS algorithm run without additional intervention, which makes it convenient for processing a large number of data files. The GS induced by the deviation of the source number estimation and the CS recorded by the sensors are eliminated by the improved source selection procedure. The proposed method is validated by the pre-determined experiment and by the LFN data observed during flood discharge of the Jin’anqiao hydropower station in China. Two sources, with dominant frequencies of about 0.7 Hz and 0.95 Hz respectively, are recognized as the actual sources induced by flood discharge. Based on the time delays between the LFN signals recorded at different synchronous observation points, which are obtained from the maximum of the cross-correlation function, two hyperbolas that line out the probable locations of the physical source of the LFN can be calculated; the source location is then identified at an intersection of these hyperbolas. Accordingly, we conclude that one of the sources is located in the stilling basin of the dam. However, we have not found a reasonable explanation for the other source.
One possible explanation is the flood discharge of another hydropower station on the river, since LFN can travel very far and still be detected. In short, the SCBSS method used in this paper provides a foundation for studying the mechanisms of the LFN induced during the flood discharge of high dams. Further investigations should focus on reducing the influence of noise jamming and on separating the sources observed in scaled model tests, since the proposed SCBSS method may lose effectiveness when the noise jamming is serious.

Acknowledgements

This work was supported by the National Key R&D Program of China (grant number 2016YFC0401705), the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (grant number 51621092), the National Natural Science Foundation of China (51379140), the Program of Introducing Talents of Discipline to Universities (B14012), and the Tianjin Innovation Team Foundation of Key Research Areas (2014TDA001).

References

[1] H. Møller, Physiological and psychological effects of infrasound on humans, J. Low Freq. Noise V. A. 3 (1) (1984) 1–17.
[2] C. Pilger, M. Bittner, Infrasound from tropospheric sources: impact on mesopause temperature?, J. Atmos. Solar-Terrestrial Phys. 71 (2009) 816–822.
[3] B. Berglund, P. Hassmén, R. Job, Sources and effects of low-frequency noise, J. Acoust. Soc. Am. 99 (1996) 2985–3002.
[4] G. Leventhall, Review: Low frequency noise. What we know, what we do not know, and what we would like to know, Noise Notes 8 (2010) 3–28.
[5] G.Q. Di, Z.G. Li, B.J. Zhang, Y. Shi, Adjustment on subjective annoyance of low frequency noise by adding additional sound, J. Sound Vib. 330 (2011) 5707–5715.
[6] A.N. Salt, T.E. Hullar, Responses of the ear to low frequency sounds, infrasound and wind turbines, Hear. Res. 268 (2010) 12–21.
[7] R.G. Berger, P. Ashtiani, C.A. Ollson, M. Whitfield Aslund, L.C. McCallum, G. Leventhall, L.D. Knopper, Health-based audible noise guidelines account for infrasound and low-frequency noise produced by wind turbines, Front. Public Heal. 3 (2015) 31.
[8] E. Canepa, A. Cattanei, F.M. Zecchin, Scaling properties of the aerodynamic noise generated by low-speed fans, J. Sound Vib. 408 (2017) 291–313.
[9] X.D. Song, D.J. Wu, Q. Li, D. Botteldooren, Structure-borne low-frequency noise from multi-span bridges: a prediction method and spatial distribution, J. Sound Vib. 367 (2016) 114–128.
[10] B. Zajamšek, K.L. Hansen, C.J. Doolan, C.H. Hansen, Characterisation of wind farm infrasound and low-frequency noise, J. Sound Vib. 370 (2016) 176–190.
[11] J. Lian, W. Zhang, Q. Guo, F. Liu, Generation mechanism and prediction model for low frequency noise induced by energy dissipating submerged jets during flood discharge from a high dam, Int. J. Environ. Res. Pub. Health 13 (6) (2016) 594.
[12] Z. Wang, J. Chen, G. Dong, Constrained independent component analysis and its application to machine fault diagnosis, Mech. Syst. Signal. Process. 25 (7) (2011) 2501–2512.
[13] Y. Yang, S. Nagarajaiah, Blind identification of damage in time-varying systems using independent component analysis with wavelet transform, Mech. Syst. Signal. Process. 47 (1–2) (2014) 3–20.
[14] A. Kardec Barros, J. Carlos Principe, D. Erdogmus, Independent component analysis and blind source separation, Signal Process. 87 (8) (2007) 1817–1818.
[15] P.T. Brewick, A.W. Smyth, On the application of blind source separation for damping estimation of bridges under traffic loading, J. Sound Vib. 333 (2014) 7333–7351.
[16] A. Hyvärinen, E. Oja, A fast fixed-point algorithm for independent component analysis, Neural Comput. 9 (7) (1997) 1483–1492.
[17] A. Hyvärinen, Independent component analysis: algorithms and applications, Neural Networks 13 (4–5) (2000) 411–430.
[18] F. Takens, Detecting strange attractors in turbulence, Lecture Notes Math. 898 (1981) 361–381.
[19] J. Hopgood, P. Rayner, Single channel separation using linear time varying filters: separability of non-stationary stochastic signals, IEEE Int. Conf. 3 (1999) 1449–1452.
[20] G. Jang, T. Lee, Y. Oh, Single-channel signal separation using time-domain basis functions, IEEE Signal Process. Lett. 10 (6) (2003) 168–171.
[21] C.J. James, D. Lowe, Single channel analysis of electromagnetic brain signals through ICA in a dynamical systems framework, in: 2001 Conf. Proc. 23rd Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., vol. 2, 2001, pp. 1974–1977.
[22] J. Taelman, S. Van Huffel, Wavelet-independent component analysis to remove electrocardiography contamination in surface electromyography, IEEE Eng. Med. Biol. 1 (2007) 682–685.
[23] B. Mijovic, I. Gligorijević, J. Taelman, Source separation from single-channel recordings by combining empirical-mode decomposition and independent component analysis, IEEE Transact. Bio-Med. Eng. 57 (9) (2010) 2188–2196.
[24] Y. Guo, S. Huang, Y. Li, Single-mixture source separation using dimensionality reduction of ensemble empirical mode decomposition and independent component analysis, Circ. Syst. Signal Process. 31 (6) (2012) 2047–2060.
[25] Y. Guo, Q. Wang, S. Huang, A. Abraham, Hand gesture recognition system using single-mixture source separation and flexible neural trees, J. Vib. Control. 20 (9) (2014) 1333–1342.
[26] Y. Guo, S. Huang, Y. Li, G.R. Naik, Edge effect elimination in single-mixture blind source separation, Circ. Syst. Signal Process. 32 (5) (2013) 2317–2334.
[27] N.E. Huang, Z. Shen, S.R. Long, The empirical mode decomposition method and the Hilbert spectrum for non-stationary time series analysis, Proc. Roy. Soc. A Lond. 454 (1998) 903–995.
[28] K. Pearson, On lines and planes of closest fit to systems of points in space, Philos. Mag. 2 (11) (1901) 559–572.
[29] J. Luo, O. Gwun, A comparison of SIFT, PCA-SIFT and SURF, Int. J. Image Process. 3 (4) (2013) 143–152.
[30] J. Yang, D. Zhang, Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Transact. Pattern Anal. 26 (1) (2004) 131.
[31] J. Karhunen, Principal component neural networks-theory and applications, Pattern Anal. Appl. 1 (1) (1998) 74–75.
[32] J. Karhunen, A. Cichocki, W. Kasprzak, P. Pajunen, On neural blind separation with noise suppression and redundancy reduction, Int. J. Neural Syst. 8 (2) (1997) 219–237.
[33] Y. Zhang, J. Lian, F. Liu, An improved filtering method based on EEMD and wavelet-threshold for modal parameter identification of hydraulic structure, Mech. Syst. Signal. Process. 68–69 (2016) 316–329.
[34] M. Wax, T. Kailath, Detection of signals by information theoretic criteria, IEEE Trans. Acoust., Speech, Signal Process. 33 (1985) 387–392.
[35] W. Cheng, S. Lee, Z. Zhang, Z. He, Independent component analysis based source number estimation and its comparison for mechanical systems, J. Sound Vib. 331 (2012) 5153–5167.
[36] J.J. Lian, H.K. Li, J.W. Zhang, ERA modal identification method for hydraulic structures based on order determination and noise reduction of singular entropy, Sci. China Ser. E-Technol. Sci. 52 (2) (2008) 400–412.
[37] X. Liu, C. Zhang, Z. Ji, Multiple characteristics analysis of Alzheimer’s electroencephalogram by power spectral density and Lempel-Ziv complexity, Cogn. Neurodynamics 10 (2) (2016) 121–133.
[38] Z.-Q. Wang, J.-H. Zhou, M.-C. Zhong, Calibration of optical tweezers based on an autoregressive model, Opt. Express 22 (14) (2014) 16956–16964.