A novel underdetermined blind source separation method with noise and unknown source number

A novel underdetermined blind source separation method with noise and unknown source number

Journal of Sound and Vibration 457 (2019) 67e91 Contents lists available at ScienceDirect Journal of Sound and Vibration journal homepage: www.elsev...

5MB Sizes 0 Downloads 77 Views

Journal of Sound and Vibration 457 (2019) 67e91

Contents lists available at ScienceDirect

Journal of Sound and Vibration journal homepage: www.elsevier.com/locate/jsvi

A novel underdetermined blind source separation method with noise and unknown source number Jiantao Lu, Wei Cheng*, Dong He, Yanyang Zi State Key Laboratory for Manufacturing Systems Engineering, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, PR China

a r t i c l e i n f o

a b s t r a c t

Article history: Received 10 August 2018 Received in revised form 12 May 2019 Accepted 16 May 2019 Available online 22 May 2019 Handling Editor: K. Shin

It has been challenging to correctly separate sources from their mixtures with large noise and unknown source number in underdetermined cases. To address this problem, a novel underdetermined blind source separation (UBSS) method is proposed using synchrosqueezing transform (SST) and improved density peaks clustering (DPC). First, SST is employed to reassign time-frequency (TF) coefficients of short time Fourier transform (STFT) of mixed signals from their original position to the center of gravity along frequency direction, contributing to sparser TF representation than STFT. Second, a single source point (SSP) identification method is proposed by directly searching the identical normalized TF vectors, which considers linear representation relations among TF vectors and therefore can identify SSPs more accurately in noisy cases. Third, DPC is improved to automatically identify source number, with which the mixing matrix can be estimated. After that, sources are finally recovered using l1-norm decomposition method. The effectiveness of the proposed method is validated with some numerical studies and experimental studies. The results show that the proposed method can estimate source number correctly, and recover sources accurately even in noisy cases. © 2019 Elsevier Ltd. All rights reserved.

Keywords: Underdetermined blind source separation Single source point Synchrosqueezing transform Density peaks clustering l1-norm decomposition

1. Introduction Independently acquiring information from each source of the mechanical system can help to quickly judge its running state. However, in practice, the information measured by sensors is the superposition of some sources because each part of the mechanical system will interfere with each other, which makes it difficult to directly measure the source information [1]. Therefore, some supplementary signal processing methods are needed to further process the collected information to obtain the expected source signals [2]. Among postprocessing approaches, blind source separation (BSS) has demonstrated its usefulness in separating sources from mixed signals due to its simplicity and effectiveness. More importantly, BSS can be utilized without the structure model and the transmission paths that are difficult to be obtained, and therefore BSS has been widely used in practice [3e6]. Antoni has studied the principles and demonstrations for blind separation of vibration components [7]. Based on the study of signal propagation mechanism, an effective BSS method is proposed to extract periodic, random stationary, and random nonstationary sources from only one vibration signal. Some widely used BSS and blind equalization algorithms have been compared to evaluate their performance in the separation of mechanical vibrations [8]. To find the major vibration and noise sources of vehicles, Zhang et al. proposed a quantitative estimation method for source

* Corresponding author. E-mail address: [email protected] (W. Cheng). https://doi.org/10.1016/j.jsv.2019.05.037 0022-460X/© 2019 Elsevier Ltd. All rights reserved.

68

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Abbreviations (BSS) Blind Source Separation (UBSS) Underdetermined Blind Source Separation (UBSS-SC) UBSS method based on Spare Coding (DPC) Density Peaks Clustering (IDPC) Improved Density Peak Cluster (TF) Time-Frequency (STFT) Short Time Fourier Transform (SST) SynchroSqueezing Transform (SSP) Single Source Point (SCA) Sparse Component Analysis (WVD) Wigner-Ville Distribution (DUET) Degenerate Unmixing Estimation Technique (TIFROM) TIme-Frequency Ratio Of Mixtures (soft-LOST) Soft Line Orientation Separation Technique (SNR) Signal-to-Noise Ratio (SIR) Signal-to-Interference Ratio

contribution using the kurtosis-based constrained independent component analysis [2]. He et al. proposed an UBSS method based on the modified k-means clustering algorithm and the Laplace potential function, which is used to estimate the mixing matrix for bearing vibration signals [9]. A review paper for identifying the modal parameters using BSS technology is given in Ref. [10]. However, most of these methods are mainly designed for (over)determined BSS where the number of sensors is no smaller than that of sources, and thus will fail in underdetermined cases. In this paper, we mainly address the problem of UBSS where the number of mixed signals is smaller than that of sources. Some methods have been proposed to solve UBSS problems, which can be roughly divided into two categories. One takes advantage of the statistic characteristics or time structure of the data, such as uncorrelated AR-model signals [11], nonnegative tensor factorization [12] and minimum mean-squared error-based beamforming approach [13], etc. The other applies the sparsity [14] of sources in time domain or other transform domains and the typical method is sparse component analysis (SCA) [15,16] that mainly contains two steps: mixing matrix estimation and source recovery. SCA-based methods require no condition on independence or irrelevance of sources, and sparsity is the only requirement that can be satisfied in the transform domain [16]. The main disadvantage of SCA-based method lies in its worse estimation performances in noisy cases. Besides, SCA-based method is difficult to be applied with unknown source number. Our research will propose a novel UBSS method that can obtain good estimation performance in noisy cases and give the solution to automatically obtain the source number. Sparsity of sources is very important for UBSS, especially in noisy cases. When source signals are sparse in time domain, the scatter plot of mixed signals will show a clear directivity in mixture space, which can be applied to mixing matrix estimation [17]. However, if sources are not sparse in time domain, they need to be transformed into other domains to increase their sparsity. Generally, better estimation of source signals can be obtained with sparser representation in transform domains. STFT [18e21] and Wigner-Ville distribution (WVD) [22e25] are the most widely used methods to increase the sparsity of sources by transforming them from time domain into TF domain. However, both STFT and WVD have some unexpected shortcomings. The window function used in STFT unavoidably blurs the TF representation, and TF concentration of STFT is limited by Heisenberg uncertainty principle, thereby restraining further improvement of the sparsity of signals [26]. WVD has an inherent disadvantage of cross-term disturbance, hard to be reconstructed, and costs so much in computation, especially for signals with long sample length. Therefore, to improve the estimation performance, a suitable TF analysis method is required that has higher TF concentration and robustness than STFT, and is still free from cross-term disturbance. Previous UBSS methods rely highly on the sparsity of source signals, for example, the degenerate unmixing estimation technique (DUET) [27] needs W-disjoint orthogonality sources that each TF point is dominated by one source and the timefrequency ratio of mixtures (TIFROM) method [28] requires adjacent TF regions where only one source is active. The main disadvantage of these methods lies in low estimation accuracy for the insufficiently sparse sources in noisy cases. To improve the estimation performance for insufficiently sparse sources, some methods are proposed based on SSPs that are TF points where only one source is dominant. These methods only require some TF points that are SSPs for each source, thus relaxing the requirement for the sparsity of sources. Reju et al. proposed a method that can identify SSPs by comparing the absolute directions of the real and imaginary parts of the Fourier transform coefficient vectors of mixed signals, and then the mixing matrix was estimated by clustering the identified SSPs [29]. Li et al. also proposed an effective method to identify SSPs by utilizing the TF coefficients of mixed signals and the complex conjugates of the coefficients [30]. Sun et al. identified SSPs by checking whether the phase angle of each component of TF vectors of mixed signals is the same. If the differences between all phase angles of each component are smaller than a threshold, this TF point can be regarded as a SSP [31]. SSP-based methods

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

69

can greatly improve the estimation accuracy of the mixing matrix. Obviously, in SSP-based methods, accurate extraction of SSPs is the premise of the accurate estimation of the mixing matrix. However, the above-mentioned SSPs identification methods are mainly based on the property of single SSP, which will lead to low estimation accuracy in noisy cases. As indicated in Ref. [29], SSPs present a good clustering property which corresponds to the columns of the mixing matrix. Therefore, after obtaining enough SSPs, the mixing matrix can be estimated by clustering these SSPs. Based on this view, O'Grady and Pearlmutter adopted a soft line orientation separation technique (soft-LOST) that partially assigns data points to lines by weighting each line's distance [32]. Then, the mixing matrix is gradually updated by iteratively calculating the principal eigenvector of the covariance matrix of the weighted data. This method directly clusters all data points rather than the identified SSPs. Some researchers attempt to find the peak of SSPs using probability density method [30] that can obtain the source number and the mixing matrix. However, the appropriate bandwidth is difficult to be selected, which may lead to the misestimating of source number. Other excellent clustering methods are also used for estimating the mixing matrix, such as K-means clustering methods [18,33] and hierarchical clustering algorithms [20,29]. The main advantages of K-means clustering method lie in its high accuracy and fast computation. However, the results of K-means clustering method are sensitive to initial cluster centers, though this can be improved by K-meansþþ method [34], and the source number should be known before clustering. As reported in Ref. [29], after removing the outliers in each cluster, the estimation accuracy of the mixing matrix can be further improved by reusing hierarchical clustering algorithm. However, the source number is also difficult to be determined. Recently, an advanced cluster method, named DPC, is proposed by Rodriguez and Laio [35]. The cluster centers in DPC are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities. Based on this idea, the cluster number, i.e., the source number, and cluster centers can be intuitively obtained in a two-dimensional decision graph that is the scatter plot of the distance and density of all data. In this paper, a novel UBSS method is proposed based on STFT-based SST and improved density peak cluster (IDPC). The advantages of SST are that it makes TF representation sparser than STFT and is robust to bounded perturbations of Gaussian white noise [36]. In the proposed method, first, SST that contains the post-processing of STFT will be employed to reassign TF coefficients of STFT from their original position to the center of gravity along the frequency direction. Therefore, the dispersion energy caused by the window in STFT will be concentrated, thus increasing TF concentration of mixed signals. Second, we find that the normalized TF vectors of mixed signals will be equal to each other at SSPs corresponding to the same active sources. Therefore, an effective SSPs identification method is proposed by directly searching the identical normalized TF vectors of mixed signals in some optimal frequency bins. Since our method considers the linear representation relations among all TF vectors in the selected frequency bins, it can estimate the mixing matrix more accurately and is more robust to noise than the existing UBSS methods. To automatically obtain the source number, IDPC is proposed to process the obtained SSPs. By removing SSPs with very low local density and removing the trend item of remaining SSPs, cluster centers can be automatically recognized via a threshold and the source number is the number of cluster centers. After that, each SSP will be assigned to the cluster core or cluster halo, and the mixing matrix is obtained with the cluster centers that are recalculated by averaging the cluster cores. Finally, TF representations of source signals are recovered by l1-norm decomposition in TF domain and are transformed into time domain using inverse STFT. Numerical studies and experimental studies on a cylindrical shell structure are conducted to validate the effectiveness of the proposed method. The proposed mixing matrix estimation method has the advantages of utilizing a sparser TF representation, and employing the linear representation relations among TF vectors to identify SSPS. Besides, SSPs are identified only in some optimal frequency bins such that all SSPs in these bins can be identified at one time. Due to these outstanding virtues, the proposed method can estimate the source signals more accurately than the existing methods in noisy cases and remains relatively high efficiency. Furthermore, the proposed method can automatically identify the source number by IDPC, which is very useful in practice because the source number may be unknown in many cases. The remainder of this study is organized as follows. Section 2 presents the proposed SSPs identification method based on SST. Section 3 presents the proposed mixing matrix estimation method by clustering the identified SSPs using IDPC. Section 4 briefly introduces the source recovery method based on l1-norm decomposition. The performances of the proposed method are illustrated in Section 5 through some numerical studies and experimental studies on a cylindrical shell structure. The conclusion is drawn in Section 6. 2. Proposed SSPs identification method The linear instantaneous mixed model of UBSS can be expressed as

xðtÞ ¼ AsðtÞ

(1)

where xðtÞ ¼ ½x1 ðtÞ; x2 ðtÞ; :::; xN ðtÞT and sðtÞ ¼ ½s1 ðtÞ; s2 ðtÞ; :::; sM ðtÞT are the mixed vector and the source vector in time domain, respectively, and $T represents the transpose operation; N and M (N < M) are the number of mixed signals and source signals, respectively; A ¼ ½a1 ; a2 ; :::; aM  is the mixing matrix with ai as its column. The aim of UBSS is to estimate source signals without any prior information of sðtÞ or A, except that N < M. 2.1. SSPs identification To increase sparsity of signals, mixed signals are transformed into TF domain by STFT, that is,

70

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Xðt; f Þ ¼ ASðt; f Þ

Xðt; f Þ ¼

M X

(2)

ai Si ðt; f Þ

(3)

i¼1

where Xðt; f Þ and Sðt; f Þ are the STFT coefficients of xðtÞ and sðtÞ, respectively. Actually, it is nearly impossible to obtain effective estimation of source signals if we know nothing about sðtÞ or A. Therefore, some assumptions are given first.  Assumption 1. For each source signal si ðtÞ, there are some TF points ðt; f Þ where only Si ðt; f Þ is dominant, i.e., jSi ðt;f Þj[ Sj ðt;  f Þ, c jsi. Assumption 2.

Any N  N submatrix of the mixing matrix A is of full rank.

Now, at any TF point, say ðu;vÞ, if only one source is active, say Si ðu;vÞ, i.e., ðu; vÞ is a SSP corresponding to si ðtÞ, then Eq. (3) can be rewritten as

Xðu; vÞ ¼ Si ðu; vÞai

(4)

Equation (4) shows that Xðu; vÞ will be collinear with ai at ðu;vÞ. Assume that at another TF point, say ðj; uÞ, only Si ðj; uÞ is active, we can obtain

Xðj; uÞ ¼ Si ðj; uÞai

(5)

Obviously, Xðj; uÞ is also collinear with ai at ðj; uÞ. Therefore, all TF vectors of mixed signals at SSPs corresponding to si ðtÞ will be collinear with ai . From Eqs. (4) and (5), we can draw the conclusion that all TF vectors of the mixed signals at SSPs corresponding to the same active sources can be linearly represented by each other as

Xðu; vÞ ¼ rXðj; uÞ

(6)

where r is a real coefficient. By normalizing both sides of Eq. (6), the following equation is easily obtained, that is,

~ vÞ ¼ Xð ~ j; uÞ Xðu;

(7)

where ~$ represents the normalized vector of $. Eq. (6) and Eq. (7) are equivalent. Therefore, SSPs can be identified by checking whether normalized TF vectors are identical or not. As all vectors have been normalized, they will be identical if the directions of vectors are the same. Therefore, the proposed SSPs identification method can be rewritten as

   ~ ~ j; uÞ〉 < ε vÞ; Xð 1  〈Xðu; 1

(8)

~ ~ j; uÞ〉 represents the scalar product of Xðu; ~ ~ j; uÞ, and ε is a threshold where j$j is the absolute value of $, 〈Xðu; vÞ; Xð vÞ and Xð 1 close to zero. 2.2. Synchrosqueezing transform In order to overcome the shortages of STFT and WVD, Daubechies et al. proposed a wavelet-based TF post-processing method called synchrosqueezing wavelet transform [26]. This method can obtain higher TF concentration representation without interference terms. Then, synchrosqueezing wavelet transform was extended to SST [37]. Actually, SST has been studied and applied in a wide range of field, such as vibration analysis [38e40], modal identification [41], machining chatter detection [42] and geography [43] and so on. However, SST has not been employed to estimate the mixing matrix in UBSS. In this study, SST is adopted to improve the sparsity of source signals. SST is performed via three successive steps as follows. Step 1: Obtain TF representation Xðt; f Þ of mixed signals by STFT. b i ðt; f Þ for each Xi ðt; f Þ by Step 2: Calculate the instantaneous frequency estimators u

b i ðt; f Þ ¼ u

 jIm½vt Xi ðt; f Þ = Xi ðt; f Þj; ∞;

jXi ðt; f Þj > y ; jXi ðt; f Þj  y

i ¼ 1; :::; N

(9)

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

71

where Im½ $ is the imaginary part of $; vt Xi ðt; f Þ denotes the partial derivative of Xi ðt; f Þ with respect to t; y denotes the threshold used to overcome the instability. Step 3: For each TF coefficient in the TF supporting region F sition to the center of gravity in the frequency direction by

Z

X i ðt; uÞ ¼ F

xi ;

b i ðt; f ÞÞdf Xi ðt; f Þdðu  u

xi ; Xi ðtÞ,

reallocate coefficients from the computational po-

(10)

Xi ðtÞ

b i ðt; f ÞÞ plane. And the discrete form where X i ðt; uÞ is SST of xi ðtÞ. This step transforms the information from ðt; f Þ plane to ðt; u of Eq. (10) is

X i ðt; fl Þ ¼

X

Xi ðt; f ÞðDf Þ

(11)

k:j b u i ðt;fk Þul jDf =2

As a result, the synchrosqueezing operation in (11) improves the TF concentration of STFT. The schematic diagram of SST is shown in Fig. 1. 2.3. Some skills to improve efficiency As shown in Ref. [29], only a few frequency bins are enough for identifying SSPs. Therefore, in this study, SSPs are identified only in the selected frequency bins to reduce the computation cost. The variances of all frequency bins of x1 ðtÞ are sorted in descending order and only the first Nvariance frequency bins are selected. Then, the energies of all frequency bins of x1 ðtÞ are also sorted in descending order and only the first Nenergy frequency bins are selected. The final selected frequency bins are

  Nbins ¼ u Nvariance ; Nenergy

(12)

where uðNvariance ; Nenergy Þ represents the union of Nvariance frequency bins selected by variance and Nenergy frequency bins selected by energy. In addition, before identifying SSPs, the TF point with negligible energy should be removed, that is, these TF vectors will be removed if

kXðu; vÞk2 < ε2 kXk2

(13)

where kXk2 represents the average of 2-norm of all TF vectors in the optimal frequency bins and ε2 is a threshold coefficient. By selecting frequency bins and removing TF points with negligible energy, SSPs identification efficiency of the proposed method can be greatly improved. It should be noted that the proposed SSPs identification method can be implemented with matrix operation. Assume G is the combinatorial matrix of remaining TF vectors, Eq. (8) can be rewritten as

1GT G  GT G < ðε1 ÞGT G

(14)

Fig. 1. Schematic diagram of SST. (a) Signal in time domain; (b) Theoretical instantaneous frequency; (c) TF representation with STFT; (d) TF representation with SST.

72

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

where 1GT G and ðε1 ÞGT G represent the matrixes that all elements are equal to 1 and ε1 , respectively, and their dimensions are equal to GT G. Unlike the sparse coding method used in Ref. [20], the proposed SSPs identification method only needs to check whether the normalized TF vectors of mixed signals are identical and all SSPs in the selected frequency bins can be identified at one time. As a result, our method has better accuracy and higher efficiency than the method based on sparse coding method [20]. 2.4. Mixing matrix estimation based on IDPC After obtaining enough SSPs, the mixing matrix can be estimated by cluster technology. Source number must be known beforehand when using traditional clustering methods. To solve this problem, our research proposes a mixing matrix estimation method via IDPC. Actually, DPC has been used in UBSS for sparse frequency hopping signals [44]. However, in Ref. [44], DPC is directly used to cluster all TF vectors of mixed signals rather than the identified SSPs. In this study, SSPs obtained in Section 2 will be clustered using DPC to estimate the mixing matrix. Furthermore, DPC is improved to automatically identify the cluster number and centers. 2.5. Density peaks clustering In DPC, cluster centers are assumed to be surrounded by neighbors with lower local density and they are at a relatively large distance from any points with higher local density [35]. Denote U as the set of identified SSP vectors, that is,

U ¼ fy1 ; y2 ; :::; yK g

(15)

where yi , i ¼ 1; 2; :::; K, represent the extracted SSPs. The major procedures of DPC can be summarized as follows [35]. Step 1: Compute the distances dij between yi and yj . In this study, the Euclidean distance is used, that is,

    dij ¼ yi  yj  ; 2

1  i; j  K

and isj

(16)

Step 2: Calculate the local density ri for each yi by

ri ¼

X 

c dij  dc



(17)

j

where cðzÞ ¼ 1 if z < 0 and cðzÞ ¼ 0 otherwise, and dc is the cutoff distance. Actually, ri is equal to the number of points that are closer than dc to yi . Step 3: Permute all local density values in descending order with the indices q1 ; q2 ; :::; qK , that is, rq1  rq2  :::  rqK . Step 4: Calculate the minimum distance di for each yi by

8   > < min dqi qj ; i > 1 di ¼ qj : j < i  > i¼1 : max dj ;

(18)

j2

Step 5: Draw the decision graph with ri and di . For each SSP yi, two quantities are obtained: local density ri and minimum distance di from yi to the point with higher density. According to the two quantities, the decision graph can be obtained by the plot of di as a function of ri for each point. Then, cluster centers can be intuitively obtained as the points with relative large di and ri at the same time. Step 6: Assign each remaining point to cluster core or cluster halo.

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

73

After obtaining all cluster centers, each remaining point is assigned to the same cluster as its nearest neighbor of higher density. The border regions of each cluster are then searched. They are the set of points assigned to a cluster but being within a distance dc from data points belonging to other clusters. Then, for each cluster, we find the point with highest density within its border region and denote its density by rb. The points of the cluster whose density is higher than rb are considered as part of the cluster core and the others are considered as part of the cluster halo. As studied in Ref. [35], DPC is robust to dc and one can choose dc so that the average number of neighbors is around 1%e2% of the total number of points in the data set. In this study, dc is set as 2%. Moreover, clusters are recognized regardless of their shape and the dimensionality of the space where they are embedded. Besides, the distance in Eq. (16) needs to be calculated only once. Therefore, DPC has the advantages of fewer parameters, higher efficiency and easiness to be implemented. 2.6. Improved density peaks clustering The cluster centers can be intuitively obtained from the decision graph of DPC. However, they cannot be automatically identified. In Ref. [35], a final indicator gi ¼ ri di is introduced. If the true cluster number is M, then there will be M values of gi far larger than other values. However, the variation of g from the cluster centers to other points is not always obvious. In this section, DPC is improved to automatically extract cluster centers. An example may better present the proposed cluster center recognition method, as shown in Fig. 2. Fig. 2(a) shows the scatter plot of original data. For each data, we can obtain its two quantities: local density ri and minimum distance di , as mentioned in Section 3.1. Then, the decision graph is given in Fig. 2(b), from which we can see cluster centers are point 2 and point 18 because they have higher ri and larger di. Some points in Fig. 2(b), say point 7, have larger di but lower ri. These points are outliers. Some points, say point 31, have higher ri but smaller di. These points correspond to the points around cluster centers. Therefore, two thresholds are required to directly identify cluster centers from the decision graph. By observing the distribution of the data in Fig. 2(b), we found that most points are located near the X-axis and very few outliers are located near the Y-axis. Therefore, we can eliminate the interference of outliers firstly by a threshold sr . As shown in Fig. 2(b), point 7 can be removed by set sr ¼ 0:25. This operation can reduce the influence of these outliers on subsequent fitting steps. However, it is worth noting that sr does not have to be a precise threshold to remove all outliers, for example, point 9 and point 23 still exist, as shown in Fig. 2(b). In this study, sr ¼ maxðrÞ=100 is used, where maxðrÞ represents the maximum value of the local density of all data. From Fig. 2(b), the distribution of the remaining data in the decision graph is similar to the probability density function of Weibull distribution. The function of Weibull distribution is given by

Fig. 2. Example diagram of the proposed method for recognizing cluster centers. (a) Scatter plot of original data; (b) Decision graph for the data in (a); (c) Weibull distribution with different parameters; (d) Decision graph with fitting curves; (e) New decision graph for recognizing cluster centers; (f) Original decision graph with the thresholds in the proposed method.

74

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

gðxÞ ¼

k xk1

l l

x k

eðlÞ

(19)

where k represents shape factor and l is the scale factor. To better describe the data, we add a bias in gðxÞ, that is

f ðxÞ ¼

k xk1

l l

x k

eðlÞ þ c

(20)

where c is a bias. f ðxÞ with different parameters are shown in Fig. 2(c). We can see that the shape of f ðxÞ is similar to the distribution of the rest data after removing outliers in Fig. 2(b). The shape of f ðxÞ can be adjusted by the parameters in Eq. (20). Then, the rest data will be fit by f ðxÞ, as shown in Fig. 2(d). For each SSP yi, a new distance ~ di can be obtained by

~d ¼ d  f ðr Þ i i i

(21)

From Eq. (21), the trend item is removed. Therefore, each SSP yi has a new quantities pair: ~ di and ri . Fig. 2(e) shows the new decision graph by the plot of ~ di as a function of ri , from which the cluster centers can be identified by the threshold sd , that is,

~d > s i d

(22)

We can map the threshold line sd to Fig. 2(b) by adding f ðxÞ to the threshold. Obviously, the threshold line will become a curve line, as shown in Fig. 2(f). Therefore, the second threshold sd in the proposed method is actually equivalent to two thresholds and this is the reason why sr does not have to be a precise threshold to remove all outliers. After cluster centers are obtained, each SSPs will be assigned to the cluster core or cluster halo. Assume that the cluster core of i-th cluster has D SSPs and their indices are q1 ; q2 ; :::; qD , then the i-th cluster center Ci is calculated by

Ci ¼

D 1X y qk ; D

i ¼ 1; 2; ::: ; M

(23)

k¼1

Finally, the mixing matrix is obtained by

b ¼ ½C A 1

:::

C2

CM 

(24)

From the above analysis, two thresholds are required to identify cluster centers in DPC, however, only one threshold sd is needed in IDPC, which makes DPC easier to be used in practice. 2.7. Source recovery based on l1-norm decomposition After obtaining the matrix mixing, the next step is to recover source signals. Some source recovery methods have been proposed in time domain [13] and TF domain [17,20]. As reported in Refs. [13,20], though the mixing matrix is available, source recovery is still a difficult task because the mixing matrix is irreversible in UBSS problems. The usual approach to sparse BSS consists of finding the solution that minimizes the l1-norm [17,45], as

min sðtÞ

ksðtÞk1

s:t: AsðtÞ ¼ xðtÞ t ¼ 1; 2; :::; T

(25)

Eq. (25) can be applied for each time t, leading to T tractable small problems. Actually, minimizing ksðtÞk1 means finding the shortest path to xðtÞ from all feasible solutions. The above l1-norm decomposition method can be easily extended from two-dimensional signals [17] to multi-dimensional signals [46]. In multi-dimensional cases, assume the mixing matrix A2ℝNM and define A as the set composed of all N N submatrices of the mixing matrix A, that is

A ¼ fA i jA i ¼ ½aa1 ; aa2 ; :::; aaN g

(26)

N basis vectors at time t can be firstly obtained via

A

*

¼ arg

Assume that A

min

A i 2A

*

  A



 1 i xðtÞ1

(27)

¼ ½aq1 ; aq2 ; :::; aqN  is obtained from Eq. (27). Then, the components of the sources can be constructed as

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

bs j ðtÞ ¼



ei ; 0;

if j ¼ qi otherwise

75

(28)

1 where e ¼ ½e1 ; e2 ; :::; eN T ¼ A 1 * xðtÞ and A * is the inverse of A * . In this study, l1-norm-based source recovery method is applied in TF domain, that is,

min

sðt; f Þ

ksðt; f Þk1

s:t: ASðt; f Þ ¼ Xðt; f Þ

For each TF point ðt0 ; k0 Þ, we can find A

A

*

¼ arg

A

min

i 2A

  A

*

(29)

¼ ½a41 ; a42 ; :::; a4N  by

  1 i Xðt0 ; k0 Þ

1

(30)

Then, TF vector of source signals at ðt0 ; k0 Þ is constructed as

b S j ðt0 ; k0 Þ ¼



ei ; 0;

if j ¼ 4i otherwise

(31)

where e ¼ ½e1 ; e2 ; :::; eN T ¼ A 1 * Xðt0 ; k0 Þ. After restoring TF vectors of source signals at each TF point, source signals in time domain can be obtained by inverse STFT. The overall flow chart of the proposed UBSS method is shown in Fig. 3. 3. Numerical studies and experimental studies 3.1. Comparison between STFT and SST An example is given to show that TF representation of signal using SST is sparser than using STFT. The signal in time domain is

i n h io h  xðtÞ ¼ cos 2p 0:1t 2 þ 3 sinð2tÞ þ 10t þ cos 2p 40 þ t 1:3 t

(32)

Sampling frequency and sampling length are 200 Hz and 10 s, respectively. In both STFT and SST, Hanning window is used, the window length is 64 and time step equals 10. Gaussian white noise is added into the signal with signal-to-noise ratio (SNR) 10 dB. Fig. 4 shows the comparison of STFT and SST. Fig. 4(a) shows the waveform of the signal and Fig. 4(b) illustrates the theoretical instantaneous frequency of the signal. Fig. 4(c) and (d) show the TF representation of the signal using STFT and SST, respectively. Compare Fig. 4(c) with Fig. 4(b), the energy of the signal is dispersed due to window effect in STFT. As revealed by Fig. 4(d), by reassigning TF coefficients of STFT from its original position to its gravity center along the frequency direction, the TF concentration of SST is obviously better than that of STFT. Thus, TF coefficients of SST are sparser than those of STFT. 3.2. Example of the proposed method In this simulation, the proposed method is used to separate five source signals from three mixtures. Source signals are: s1 ðtÞ is a multi-component signal; s2 ðtÞ is a low frequency sinusoidal wave; s3 ðtÞ is a high frequency sinusoidal wave; s4 ðtÞ is periodic wave with amplitude modulation and s5 ðtÞ is periodic wave with frequency modulation. The generating functions of source signals and the mixing matrix A are listed as follows:

3 2 3 0:5 sinð2p , 17tÞ þ sinð2p , 171tÞ þ sinð2p , 350tÞ s1 ðtÞ 6 s2 ðtÞ 7 6 sinð2p , 43tÞ 7 7 6 6 7 sðtÞ ¼ 6 s3 ðtÞ 7 ¼ 6 sinð2p , 200tÞ 7 4 s ðtÞ 5 4 5 sinð2p , 7tÞcosð2p , 80tÞ 4 s5 ðtÞ sin½2p , 300t  cosð2p , 10tÞ 2 3 0:5736 0:6157 0:5299 0:6293 0:5878 4 A ¼ 0:6627 0:2435 0:7969 0:2402 0:6545 5 0:4851 0:7494 0:2900 0:7391 0:4755 2

(33)

(34)

Sampling frequency and sampling length are 4 kHz and 4 s, respectively. Gaussian white noise is separately added to each source signal with SNR ¼ 0 dB. After obtaining mixed signals, Gaussian white noise with SNR ¼ 0 dB is also added into each mixed signal. Waveforms and Fourier spectrums of source signals are displayed in Fig. 5, while the major frequencies of source

76

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Fig. 3. Flow chart of the proposed method.

signals can be easily obtained from Fig. 5(b). Then, mixed signals are generated by xðtÞ ¼ AsðtÞ and illustrated in Fig. 6. From Fig. 6 (a), mixed signals are the superposition of source signals, so we cannot directly obtain waveforms of source signals from mixed signals. Form Fig. 6(b), the major frequencies of source signals can be found in each Fourier spectrum of mixed signals. We first extracted enough SSPs by the proposed SSPs identification method. Then, the identified SSPs will be processed by IDPC. Origin decision graph in IDPC is shown in Fig. 7(a). As shown in Fig. 7(a), we first remove the effect of some outliers by setting sr ¼ rmax =100. The fitting line of the remaining data is shown in Fig. 7(b). After removing the trend item, the final decision graph can be obtained and the cluster centers can be identified by setting ~d > 0:5, as illustrated in Fig. 7(c). Fig. 7(d) shows the indicator g ¼ rd. The first fourth value is obviously larger than other values. However, the fifth cluster center is difficult to be selected because g of point ⑤ is nearly the same as g of points ⑥ and ⑦, as illustrated in Fig. 7(d), thus the cluster number will be easily mistaken as four. After obtaining the cluster centers, the mixing matrix and source signals are estimated. The estimated mixing matrix is

2 0:5231 b A ¼ 4 0:6407 0:4961

0:5798 0:2776 0:7549

0:5967 0:6660 0:3311

0:6643 0:4853 0:4639

3 0:6778 0:4922 5 0:3621

(35)

Compare Eq. (35) with Eq. (34), the mixing matrix has been well estimated. Waveforms and Fourier spectrums of the estimated source signals are shown in Fig. 8, where we can see that the major frequencies of source signals have been well recovered. These results can illustrate the effectiveness of the proposed method.

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

77

Fig. 4. Example of the comparison of STFT and SST. (a) Signal in time domain; (b) Theoretical instantaneous frequency of the signal in (a); (c) TF representation of STFT; (d) TF representation of SST.

3.3. Separation of Gaussian sources As is known, the mixture of two Gaussian variables is still a Gaussian variable. Therefore, independence-based BSS methods will fail when source signals contain more than one Gaussian source. In this section, we use two non-Gaussian sources and two Gaussian sources as source signals to test the separation performance of the proposed method for Gaussian sources. The performances of the estimated mixing matrix and the estimated source signals are evaluated using average signal-tointerference ratio (SIR), defined as Eq. (36) and Eq. (37), respectively. M 1 X kai k22 SIRA ¼ 10 log10 2 M i¼1 ak ka  b i

! (36)

i 2

M 1 X ksi k22 SIRs ¼ 10 log10 2 M i¼1 ksi  mbs i k2

! (37)

where m is a scalar reflecting the scalar indeterminacies; the average correlation coefficient R between source signals and estimated source signals is also calculated by

78

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Fig. 5. Source signals. (a) Waveforms; (b) Fourier spectrums.

Fig. 6. Mixed signals. (a) Waveforms; (b) Fourier spectrums.

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

79

Fig. 7. Automatic identification of cluster centers with IDPC. (a) Original decision graph; (b) Decision graph after removing SSPs with very low local density; (c) Final decision graph after removing the trend item; (d) Indicator g ¼ rd.

h iT ) ( M bs i ðtÞ ½si ðtÞ  si ðtÞ bs i ðtÞ  b 1 X   R¼  b M i¼1 ks ðtÞ  s ðtÞk  b b ðtÞ  ðtÞ s s   i i i i 2

(38)

2

bs i ðtÞ are the means of si ðtÞ and bs i ðtÞ, respectively. where si ðtÞ and b Two non-Gaussian signals (Gaussian white noise is added with SNR ¼ 5 dB) are s1 ðtÞ ¼ sinð2p,12tÞcosð2p,112tÞ and s1 ðtÞ ¼ sinð2p,20tÞ. Two Gaussian signals are recommended in benchmarks in Ref. [47]. Sampling frequency and sampling length are 1000 Hz and 2 s, respectively. Waveforms and Fourier spectrums of source signals are displayed in Fig. 9. The mixing matrix is randomly generated and then each column of the mixing matrix has been normalized with 2-norm, and the mixing matrix is

2

0:4937 A ¼ 4 0:4078 0:7681

0:3594 0:8961 0:2602

0:3285 0:5627 0:7586

3 0:2912 0:1370 5 0:9468

(39)

80

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Fig. 8. Estimated source signals. (a) Waveforms; (b) Fourier spectrums.

Then, mixed signals are generated by xðtÞ ¼ AsðtÞand illustrated in Fig. 10. The proposed method is used to estimate source signals from the mixed signals. The final decision graph can be obtained and the cluster centers can be identified by setting ~ d > 0:3, as illustrated in Fig. 11. Then the mixing matrix is obtained and SIRA of the estimated mixing matrix is 20.59 dB. Waveforms and Fourier spectrums of the estimated source signals are shown in Fig. 12, where we can see that waveforms of the estimated source signals are quite similar to those of source signals. The SIRs of the estimated source signals is 6.89 dB and the average correlation coefficient R between source signals and estimated source signals is 0.8840. The proposed method did not rely on the independence, but relied highly on the sparsity of the source signals, and sparsity of Gaussian sources will increase by being transformed from time domain into TF domain using SST, which is the main reason why the proposed method can well estimate two Gaussian sources.

3.4. Comparison with other methods The proposed method is also compared with other existing methods to evaluate its estimation performance. Source signals are four artificial source signals. s1 ðtÞ is a shock attenuation signal; s2 ðtÞ is a periodic wave with amplitude modulation; s3 ðtÞ is a low frequency sinusoidal wave; s4 ðtÞ is a high frequency sinusoidal wave. Sampling frequency and sampling length are 4096 Hz and 5 s, respectively. The generating functions of source signals and the mixing matrix A are listed as follows:

2 3 40 X 3 s1 ðtÞ sin½2p , 300tðt  0:125nÞexp½  50ðt  0:125nÞuðt  0:125nÞ 7 65 7 6 s2 ðtÞ 7 6 n¼0 7 7¼6 sðtÞ ¼ 6 6 7 4 s3 ðtÞ 5 6 sinð2p , 12tÞcosð2p , 112tÞ 7 4 5 sinð2p , 20tÞ s4 ðtÞ sinð2p , 164:5tÞ 2

(40)

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Fig. 9. Source signals with two Gaussian sources. (a) Waveforms; (b) Fourier spectrums.

Fig. 10. Mixed signals with two Gaussian. (a) Waveforms; (b) Fourier spectrums.

81

82

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Fig. 11. Final decision graph with two Gaussian sources.

Fig. 12. Estimated source signals with two Gaussian sources. (a) Waveforms; (b) Fourier spectrums.

2

0:6287 A ¼ 4 0:4261 0:5635

3 0:6752 0:7236 0:6954 0:5657 0:6428 0:5497 5 0:4734 0:3625 0:2763

(41)

We compare the performance of mixing matrix estimation of the proposed method with the methods in Ref. [29] and in Ref. [20]. The method in Ref. [20] is a UBSS method based on spare coding, denoted as “UBSS-SC”, that can effectively estimate

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

83

the mixing matrix even in noisy cases, and can recover source signals in TF domain. The method in Ref. [29] identifies SSPs by comparing the absolute directions of the real parts and imaginary parts of the Fourier transform coefficient vectors of the mixed signals. The method in Ref. [29] is denoted as “Ratio method”, However, Ratio method is only designed to estimate mixing matrix and cannot restore source signals. Therefore, the mixing matrix estimated by Ratio method will be input into UBSS-SC method to obtain source signals. The parameter Dq in Ratio method is set as 0.8 and the parameter l in UBSS-SC is set as 0.002. In the proposed method, SSPs threshold ε1 ¼ 1  104 , the selected frequencies bins is Nvariance ¼ 70 and Nenergy ¼ 70; The threshold sd ¼ 0:5. Gaussian white noise is added into mixed signals with SNR ¼ 10 dB and Gaussian white noise added into source signals varying from SNR ¼ 30 dB to SNR ¼ 0 dB. In all experiments, average performance of 50 independent trials was used. Fig. 13 shows the performance comparison of the proposed method, UBSS-SC and Ratio method in different noisy scenarios. Fig. 13(a) shows the performance comparison of mixing matrix estimation. From Fig. 13 (a), the estimation performance of mixing matrix of Ratio method is similar to that of UBSS-SC in higher SNR cases. However, in Ratio method, SSPs are identified by using the property of single SSP, ignoring the linear representation relations among TF vectors of mixed signals. On the contrary, UBSS-SC can identify SSPs using these linear representation relations, which will have a result that the estimation performance of mixing matrix using UBSS-SC is better than using Ratio method in lower SNR cases. The average SIRA of the proposed method is the largest in all cases, showing that the proposed method can estimate the mixing matrix more accurately than UBSS-SC and Ratio method. The reasons are as follows. First, SST used in the proposed method can provide a sparser TF representation and is more robust to noise than STFT used in Ratio method and UBSS-SC. Second, the proposed method considers the linear representation relations among all TF vectors in the selected frequency bins, which is more robust to noise than the methods using the property of single SSP. Third, the proposed method can identify all SSPs in

Fig. 13. Performance comparison in different SNR. (a) Performance comparison of matrix estimation with average SIRA. (b) Performance comparison of source recovery with average SIRs. (c) Performance comparison of source recovery with average correlation coefficients R.

84

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

the optimal selected frequency bins at one time, however, UBSS-SC only obtains a part of SSPs due to its computational complexity. Therefore, the estimation performance of the mixing matrix of the proposed method outperforms that of Ratio method and UBSS-SC. Fig. 13(b) shows performance comparison of source recovery. From Fig. 13(b), the average SIRs of the proposed method is larger than that of Ratio method and UBSS-SC. Therefore, the proposed method can provide more accurate source signals than Ratio method and UBSS-SC. As revealed by Fig. 13(a) and (b), the estimation accuracy of source signals depends heavily on the estimation accuracy of mixing matrix. Therefore, improving the estimation accuracy of mixing matrix is an important way to obtain more accurate source signals. The average correlation coefficients between source signals and restored source signals are displayed in Fig. 13(c). The similar results can be obtained with Fig. 13(b) that the proposed method is better than the comparison methods. From the above results, we can see that the proposed method could outperform the contrast methods. It should be noted that the source number can be automatically determined with IDPC in the proposed method, however, the source number in Ratio method and UBSS-SC should be known beforehand. The average running time is used to evaluate computational complexity of the methods. The detailed information of the computer in numerical studies is listed in Table 1. The average running time is listed in Table 2, where Ratio method is the fastest and the proposed method is faster than UBSS-SC. Running time of the proposed method and UBSS-SC are mainly consumed in the process of identifying SSPs. As we all know, there are many classic BSS methods, such as AMUSE, SOBI, JADEop, FPICA and so on which have been integrated into ICALAB Ver. 3 toolbox [48] available at [47]. However, they are designed mainly for (over)determined BSS and could not well estimate source signals in underdetermined cases. Due to the limitation of this paper, the comparisons of the proposed method with classic BSS methods is not exhibited.

3.5. Experimental studies on a cylindrical shell structure Some practical mechanical systems or their partial sections have the shape of cylindrical shells, such as the underwater vehicles. Therefore, a test bed with cylindrical shell structure is used to examine the effectiveness of the proposed method. Excessive mechanical noise of underwater vehicles will interfere with the detection accuracy. The primary problem of vibration & acoustics monitoring and control is source identification. However, it is a difficult task as different sources and strong background noises are coupling together. Therefore, signal processing is conducted to recover source information from the measured signals. When the number of sensors is smaller than that of sources, UBSS is an excellent method to estimate sources in these cases. Therefore, it is of great significance to study UBSS technology of the cylindrical shell structure. In the experiments, source signals are produced by an adjustable speed motor with an eccentric mass disc and two loudspeakers. Two different signals are generated by two arbitrary waveform generators, respectively, and are used as the input signals for the two loudspeakers. Besides, four sound pressure sensors are installed to collect mixed sound signals. Mixed signals are recorded by GEN2i high-speed data recorder. The cylindrical shell structure is supported by four air springs to simulate floating environment in water. The schematic diagram of the test bed and the photos of the test site are displayed in Fig. 14 and Fig. 15, respectively. The motor is running at a speed of 2340 r/m, that is, the rotation frequency of the motor is about 39 Hz. Input of the first loudspeakers is a sine wave with 713 Hz and input of the second loudspeakers is amplitude modulated signal with modulation frequency 32 Hz and carrier frequency 720 Hz. Sampling length and sampling frequency are 10 s and 10000 Hz, respectively. Fig. 16 shows the time waveforms and Fourier spectrums of mixed signals (only the second and the fourth signals are used and therefore only these two signals are shown in Fig. 16), as is revealed, each mixed signal contains the major frequencies of three source signals. It is difficult to decide which sources these frequencies come from without additional information. Therefore, mixed signals need further processing. Moreover, we only select one segment of signals from 4 s to 6 s to restore source signals. Ratio method and UBSS-SC are first used to separate mixed signals into four estimated source signals. The reason why we adopted four separated signals will be given later. Source signals estimated by Ratio method and UBSS-SC are illustrated in Fig. 17 and Fig. 18, respectively. However, both Ratio method and UBSS-SC separate the major frequency (523 Hz) of the first loudspeaker and the major frequency (32 Hz and 720 Hz) of the second loudspeaker into the same signal. Only the major

Table 1 Detailed information of the computer in numerical studies. Device

Detailed information

CPU Memory Motherboard Hard Disk Drive Display Card Environment

Inter Core i5-4590 of 3.30 GHz (4 CPUs) Kingston DDR3 1866 MHz 16G Haswell B85-PLUS R2.0 ST1000DM003-1SB1 SCSI Disk Device 1 TB NVIDIA GeForce GTX 750 Matlab R2015b

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

85

Table 2 Average running time of the proposed method and the contrast methods in different SNRs. Methods

Average running time (s) SNR ¼ 0 dB

SNR ¼ 10 dB

SNR ¼ 20 dB

SNR ¼ 30 dB

Ratio method UBSS-SC The proposed method

0.422 20.745 2.874

0.445 7.371 3.536

0.456 7.077 3.796

0.460 7.047 3.667

Fig. 14. Schematic diagram of the test bed.

Fig. 15. Photos of the test site.

86

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Fig. 16. Mixed signals. (a) Waveforms; (b) Fourier spectrums.

frequency (39 Hz) of the motor is separated correctly. Therefore, the frequencies 32 Hz, 720 Hz and 523 Hz will be mistaken for coming from the same source if Ratio method and UBSS-SC are used. The proposed method is also used to estimate source signals from mixed signals. Fig. 19 displays the final decision graph and reveals that the new distance ~ d of four points is abnormally larger than other points. Therefore, the cluster center and cluster number can be obtained by setting sd ¼ 0:45. This is the reason why we separate mixed signals into four signals rather than three signals. Though three sources are used in the experiments, environmental noise can be regarded as the fourth source, that is, the source number is four actually. Fig. 20 shows the estimated source signals of the proposed method. Clearly, the frequencies of the motor, loudspeaker 1 and loudspeaker 2 have been well recovered and correspond to b s 1 , bs 2 and b s 3 in Fig. 16, respectively. The fourth separated signal b s 4 can be regarded as environmental noise. To better show the superiority of the proposed method, source signals are also collected. The signal of the motor is collected by letting it work alone, and the signals of two loudspeakers are directly obtained from the output of two arbitrary waveform generators, respectively. Time waveforms and Fourier spectrums of source signals are displayed in Fig. 21. Compare Fig. 20 with Fig. 21, the major frequencies of source signals have been well estimated by the proposed method.

Fig. 17. Estimated source signals by Ratio method. (a) Waveforms; (b) Fourier spectrums.

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

87

Fig. 18. Estimated source signals by UBSS-SC. (a) Waveforms; (b) Fourier spectrums.

Fig. 19. Final decision graph.

The performance comparison of the source recovery between Ratio method, UBSS-SC and the proposed method are given in Fig. 22. From Fig. 22(a), the SIRs of the estimated source signals of the proposed method is larger than that of Ratio method and UBSS-SC. Fig. 22(b) shows the correlation coefficient between the estimated source signals and real source signals, which shows that the coefficients of the proposed method are also larger than those of Ratio method and UBSS-SC. Time costs of Ratio method, UBSS-SC and the proposed method are 0.268 s, 3.576 s, and 1.918 s, respectively. In Ratio method, SSP is identified based on the ratio between the real and imaginary part of single TF vector of mixed signals. Therefore, time cost of Ratio method is less than both UBSS-SC and the proposed method that identify SSP by considering the linear representation relations among TF vectors of mixed signals. However, there are two main advantages of the proposed method over UBSS-SC. First, SSPs are searched in the optimized frequency bins rather than in the whole TF plane and all SSPs in these frequency bins can be identified at one time, like the pattern of searching SSPs used in Ratio method. Second, SSPs are identified by directly searching the identical normalized vectors, that is, it only needs to calculate the scalar product of the normalized TF vectors, instead of finding the sparsest coefficients. Those are the reasons why the proposed method is faster than UBSS-SC.

88

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Fig. 20. Estimated source signals by the proposed method. (a) Waveforms; (b) Fourier spectrums.

Fig. 21. Source signals. (a) Waveforms; (b) Fourier spectrums.

4. Conclusion This study proposes an effective UBSS method based on SST and IDPC to recover M sources from N mixtures in underdetermined cases where M > N. SST is utilized to increase the sparsity of signals and it is more robust to noise than STFT. Then, a SSPs identification method is proposed by directly finding the identical normalized TF vectors of mixed signals in the optimal frequency bins, which considers the linear representation relations among these TF vectors. To estimate the mixing matrix with unknown source number, DPC is improved to automatically identify the cluster centers and the cluster number. After that, the mixing matrix can be estimated by recalculating cluster centers and source signals are restored via l1-norm decomposition.

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

89

Fig. 22. Performance comparison of source recovery. (a) SIRs of three signals; (b) Correlation coefficients R between source signals and the estimated source signals.

To validate the effectiveness of the proposed method, some numerical studies and experimental studies on a cylindrical shell structure are provided. The results of numerical studies show that the proposed method can correctly and automatically recognize the source number. The average SIRA, average SIRs and the correlation coefficient R of the proposed method are larger than the contrast methods in all SNR cases. From the experiments with a cylindrical shell structure, the proposed method can correctly separate all major frequencies of sources, however, both Ratio method and UBSS-SC can correctly recover the major frequency of only one source. Moreover, the proposed method can estimate source signals without knowing source number beforehand, which is a great merit in practice compared with the contrast methods because the source number is unknown in many cases. These results can illustrate that the proposed method outperforms Ratio method and UBSS-SC. Hence, the proposed method emerges as an effective and promising tool to estimate sources with noise and unknown source number when fewer sensors are available. The proposed method also has some limitations. First, at least two mixed signals are needed to estimate the mixing matrix and the source signals, that is, the proposed method cannot estimate sources from single channel signal. Second, linear instantaneous mixed model is used in our method and it may be not applicable to convolution cases. This paper mainly focuses on effectively recovering sources for mechanical systems in underdetermined cases when source number is unknown, which is crucial and may be considered as the first step in fault diagnosis. After sources are correctly estimated from their mixtures, other advanced signal processing method could be employed to identify fault features from the separated signals. Therefore, our method could be easily extended to actual fault diagnosis by combining it with other advanced signal processing method. Author contributions Conceptualization, Jiantao Lu and Wei Cheng; Formal analysis, Jiantao Lu and Dong He; Methodology, Jiantao Lu and Dong He; Resources, Wei Cheng and Yanyang Zi; Writing e original and revised manuscript, Jiantao Lu; Writing e review & editing, Jiantao Lu, Wei Cheng and Yanyang Zi. Funding This work was supported by the Project of the National Natural Science Foundation of China [No. 51775407], the Key Project supported by National Natural Science Foundation of China [No. 61633001], the General Project of Joint Preresearch Fund for Equipment of Ministry of Education [No. 6141A02022121] and the Fundamental Research Funds for the Central Universities.

Acknowledgments The authors appreciate late Professor Zhengjia He sincerely for his guidance and groundwork for us. The authors would like to thank Yingxian Zhang from China University of Geosciences for her useful suggestions.

90

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

Appendix A. Supplementary data Supplementary data to this article can be found online at https://doi.org/10.1016/j.jsv.2019.05.037. References [1] W. Cheng, Z. Zhang, S. Lee, Z. He, Source contribution evaluation of mechanical vibration signals via enhanced independent component analysis, J. Manuf. Sci. Eng, Transactions of the ASME. 134 (2012), 021014. [2] J. Zhang, Z. Zhang, W. Cheng, X. Li, B. Chen, Z. Yang, Z. He, Kurtosis-based constrained independent component analysis and its application on source contribution quantitative estimation, IEEE Transactions on Instrumentation and Measurement 63 (2014) 1842e1854. [3] G. Wolf, S. Mallat, S. Shamma, Rigid motion model for audio source separation, IEEE Trans. Signal Process. 64 (2016) 1822e1831. [4] W. Naanaa, J.-M. Nuzillard, Extreme direction analysis for blind separation of nonnegative signals, Signal Process. 130 (2017) 254e267. [5] H. Becker, L. Albera, P. Comon, A. Kachenoura, I. Merlet, A penalized semialgebraic deflation ICA algorithm for the efficient extraction of interictal epileptic signals, IEEE Journal of Biomedical and Health Informatics 21 (2017) 94e104. [6] W. Cheng, Z. He, Z. Zhang, A comprehensive study of vibration signals for a thin shell structure using enhanced independent component analysis and experimental validation, J. Vib. Acoust, Transactions of the ASME. 136 (2014), 041011. [7] J. Antoni, Blind separation of vibration components: principles and demonstrations, Mech. Syst. Signal Process. 19 (2005) 1166e1180. [8] P.W. Tse, J. Zhang, X. Wang, Blind source separation and blind equalization algorithms for mechanical signal separation and identification, J. Vib. Control 12 (2006) 395e423. [9] H. Jun, Y. Chen, Q.-H. Zhang, G. Sun, Q. Hu, Blind source separation method for bearing vibration signals, IEEE Access 6 (2018) 658e664. [10] A. Sadhu, S. Narasimhan, J. Antoni, A review of output-only structural mode identification literature employing blind source separation methods, Mech. Syst. Signal Process. 94 (2017) 415e431. [11] B. Liu, V.G. Reju, A.W. Khong, A linear source recovery method for underdetermined mixtures of uncorrelated AR-model signals without sparseness, IEEE Trans. Signal Process. 62 (2014) 4947e4958. votte, A. Ozerov, Notes on nonnegative tensor factorization of the spectrogram for audio source separation: statistical insights and towards self[12] C. Fe clustering of the spatial cues, in: International Symposium on Computer Music Modeling and Retrieval, Springer, 2010, pp. 102e115. [13] Z. Koldovský, P. Tichavský, A.H. Phan, A. Cichocki, A two-stage MMSE beamformer for underdetermined signal separation, IEEE Signal Process. Lett. 20 (2013) 1227e1230. [14] Y. Chang, Y. Zi, J. Zhao, Z. Yang, W. He, H. Sun, An adaptive sparse deconvolution method for distinguishing the overlapping echoes of ultrasonic guided waves for pipeline crack inspection, Meas. Sci. Technol. 28 (2017), 035002. [15] F.M. Naini, G.H. Mohimani, M. Babaie-Zadeh, C. Jutten, Estimating the mixing matrix in Sparse Component Analysis (SCA) based on partial k -dimensional subspace clustering, Neurocomputing 71 (2008) 2330e2343. [16] F. Amini, Y. Hedayati, Underdetermined blind modal identification of structures by earthquake and ambient vibration measurements via sparse component analysis, J. Sound Vib. 366 (2016) 117e132. [17] P. Bofill, M. Zibulevsky, Underdetermined blind source separation using sparse representations, Signal Process. 81 (2001) 2353e2362. [18] H. Sawada, S. Araki, S. Makino, Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment, IEEE Trans. Audio Speech Lang. Process. 19 (2011) 516e527. [19] N. Tengtrairat, W.L. Woo, Single-channel separation using underdetermined blind autoregressive model and least absolute deviation, Neurocomputing 147 (2015) 412e425. [20] L. Zhen, D. Peng, Z. Yi, Y. Xiang, P. Chen, Underdetermined blind source separation using sparse coding, IEEE Transactions on Neural Networks and Learning Systems (2017) 3102e3108. [21] J.J. Thiagarajan, K.N. Ramamurthy, A. Spanias, Mixing matrix estimation using discriminative clustering for blind source separation, Digit. Signal Process. 23 (2013) 9e18. [22] S. Xie, L. Yang, J.-M. Yang, G. Zhou, Y. Xiang, Time-frequency approach to underdetermined blind source separation, IEEE Transactions on neural networks and learning systems 23 (2012) 306e316. [23] T. Peng, Y. Chen, Z. Liu, A timeefrequency domain blind source separation method for underdetermined instantaneous mixtures, Circuits Syst. Signal Process. 34 (2015) 3883e3895. [24] D. Peng, Y. Xiang, Underdetermined blind source separation based on relaxed sparsity condition of sources, IEEE Trans. Signal Process. 57 (2009) 809e814. [25] L. Yang, J. Lv, Y. Xiang, Underdetermined blind source separation by parallel factor analysis in time-frequency domain, Cognitive computation 5 (2013) 207e214. [26] I. Daubechies, J. Lu, H.-T. Wu, Synchrosqueezed wavelet transforms: an empirical mode decomposition-like tool, Appl. Comput. Harmon. Anal. 30 (2011) 243e261. [27] S. Rickard, The DUET blind source separation algorithm, Blind Speech Separation (2007) 217e237. [28] F. Abrard, Y. Deville, A timeefrequency blind signal separation method applicable to underdetermined mixtures of dependent sources, Signal Process. 85 (2005) 1389e1403. [29] V.G. Reju, S.N. Koh, Y. Soon, An algorithm for mixing matrix estimation in instantaneous blind source separation, Signal Process. 89 (2009) 1762e1773. [30] Y. Li, W. Nie, F. Ye, Y. Lin, A mixing matrix estimation algorithm for underdetermined blind source separation, Circuits Syst. Signal Process. 35 (2016) 3367e3379. [31] J. Sun, Y. Li, J. Wen, S. Yan, Novel mixing matrix estimation approach in underdetermined blind source separation, Neurocomputing 173 (2016) 623e632. [32] P.D. O'grady, B.A. Pearlmutter, Soft-LOST: EM on a mixture of oriented lines, in: ICA, Springer, 2004, pp. 430e436. [33] A. Aissa-El-Bey, K. Abed-Meraim, Y. Grenier, Blind separation of underdetermined convolutive mixtures using their timeefrequency representation, IEEE Trans. Audio Speech Lang. Process. 15 (2007) 1540e1550. [34] D. Arthur, S. Vassilvitskii, k-meansþþ, The advantages of careful seeding, in: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, 2007, pp. 1027e1035. [35] A. Rodriguez, A. Laio, Clustering by fast search and find of density peaks, Science 344 (2014) 1492e1496. [36] G. Thakur, E. Brevdo, N.S. Fu ckar, H.T. Wu, The Synchrosqueezing algorithm for time-varying spectral analysis: robustness properties and new paleoclimate applications, Signal Process. 93 (2013) 1079e1094. [37] G. Thakur, H.-T. Wu, Synchrosqueezing-based recovery of instantaneous frequency from nonuniform samples, SIAM J. Math. Anal. 43 (2011) 2078e2095. [38] H. Cao, S. Xi, X. Chen, S. Wang, Zoom synchrosqueezing transform and iterative demodulation: methods with application, Mech. Syst. Signal Process. 72e73 (2016) 695e711. [39] S. Xi, H. Cao, X. Chen, X. Zhang, X. Jin, A frequency-shift synchrosqueezing method for instantaneous speed estimation of rotating machinery, J. Manuf. Sci. Eng. 137 (2015) 18e22. [40] S Xi, H Cao, X Chen, Zoom synchrosqueezing transform for instantaneous speed estimation of high speed spindle, Mater. Sci. Forum 836e837 (2016) 310e317. [41] M. Mihalec, J. Slavi c, M. Bolte zar, Synchrosqueezed wavelet transform for damping identification, Mech. Syst. Signal Process. 80 (2016) 324e334.

J. Lu et al. / Journal of Sound and Vibration 457 (2019) 67e91

91

[42] H. Cao, Y. Yue, X. Chen, X. Zhang, Chatter detection based on synchrosqueezing transform and statistical indicators in milling process, Int. J. Adv. Manuf. Technol. 95 (2017) 1e12. [43] N. Liu, J. Gao, X. Jiang, Z. Zhang, Q. Wang, Seismic timeefrequency analysis via STFT-based concentration of frequency and time, IEEE Geosci. Remote Sens. Lett. 14 (2017) 127e131. [44] C. Li, L. Zhu, Z. Luo, Underdetermined blind source separation of adjacent satellite interference based on sparseness, China Communications 14 (2017) 140e149. [45] C Hu, Q. Yang, M Huang, W Yan, Sparse component analysis-based under-determined blind source separation for bearing fault feature extraction in wind turbine gearbox, IET Renew. Power Gener. 11 (2016) 330e337. [46] Y. Hao, L. Song, Y. Ke, H. Wang, P. Chen, Diagnosis of compound fault using sparsity promoted-based sparse component analysis, Sensors 17 (2017) 1307. [47] A. Cichocki, S. Amari, K. Siwek, T. Tanaka, A.H.P, et al., ICALAB Toolboxes. http://www.bsp.brain.riken.jp/ICALAB. [48] A. Cichocki, S.-i. Amari, Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications, John Wiley & Sons, 2003.