Modified singular value decomposition by means of independent component analysis

Signal Processing 84 (2004) 645 – 652

www.elsevier.com/locate/sigpro

V.D. Vrabie(a,b,*), J.I. Mars(a), J.-L. Lacoume(a)

a Laboratoire des Images et des Signaux, ENSIEG-INPG, BP 46, 38402 Saint Martin d'Hères Cedex, France
b Laboratorul de Analiza si Prelucrarea Imaginilor, UPB - Bucuresti, Romania

Received 25 June 2001; received in revised form 21 November 2003

Abstract

In multisensor signal processing (underwater acoustics, geophysics, etc.), the initial dataset is usually separated into complementary subspaces, called signal and noise subspaces, in order to enhance the signal-to-noise ratio. The singular value decomposition (SVD) is a useful tool to achieve this separation. It provides two orthogonal matrices that convey information on normalized wavelets and propagation vectors. While the signal and noise subspaces are on the whole well evaluated, the SVD procedure usually cannot extract only the source waves with a high degree of sensor-to-sensor correlation. This is due to the constraint imposed by the orthogonality of the propagation vectors. To relax this condition, exploiting the concept of independent component analysis (ICA), we propose another orthogonal matrix made up of statistically independent normalized wavelets. Using this combined SVD–ICA procedure, we obtain a better separation of these source waves in the signal subspace. The efficiency of this new separation procedure is shown on synthetic and real datasets. © 2003 Elsevier B.V. All rights reserved.

Keywords: Multisensor signal processing; Singular value decomposition; Eigenimages; Subspace separation; Independent component analysis

1. Introduction

The singular value decomposition (SVD) is a powerful decomposition in linear algebra. This procedure is also referred to in signal processing as the Karhunen–Loève transformation [8] or principal component analysis [12]. The basic approach to computing the SVD is given in [6].



Corresponding author. Tel.: +33-4-76-82-64-57; fax: +33-4-76-82-63-84. E-mail addresses: [email protected] (V.D. Vrabie), [email protected] (J.I. Mars), [email protected] (J.-L. Lacoume).

SVD is used in multisensor signal processing to separate the initial dataset into signal and noise subspaces [5,7,12–14]. It provides two orthogonal matrices that convey information on normalized wavelets and propagation vectors. In some applications [4], the SVD is also used to decompose the signal subspace into two subsets:

• The first subset contains what the authors have called sources with a "high sensor to sensor correlation".
• The second subset is made up of sources with a "low sensor to sensor correlation".

For example, in vertical seismic profiling (VSP) in geophysics, this tool is used to define the low-pass, the

0165-1684/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.sigpro.2003.12.007


band-pass and the high-pass subspaces, respectively, in terms of ranges of singular values. This makes it possible to separate the downgoing wavefield, the upgoing wavefield and the noise. We will show that the low-pass subspace usually cannot give a correct description of the source waves with a high degree of sensor-to-sensor correlation and, in general, will not extract these waves only. To overcome this problem, this paper proposes to associate the SVD procedure with independent component analysis (ICA). Hence, a new right matrix is created, made up of fourth-order independent normalized wavelets. This makes it possible to relax the non-physically justified orthogonality condition on the propagation vectors, improving the low-pass subspace estimate.

The content of this paper is as follows: Section 2 introduces the model of signals recorded on a multisensor array. Section 3 is a brief presentation of the SVD and the subspace separation concept. Section 4 presents the ICA notion and its use for the separation of the signal subspace into two parts. Applications of this new separation tool on synthetic and real datasets are shown in Section 5, before the conclusions.

2. Model of multisensor array signals

2.1. General model

In multisensor array processing, the propagation of waves is described by the convolutive model [10]

x_i(t) = Σ_{p=1}^{P} ∫ a_{ip}(t − τ) s_p(τ) dτ + b_i(t).   (1)

The received signal x_i(t) on the ith sensor, with i = 1, ..., N_c and N_c the number of sensors, results from the superposition of P source waves s_p(t) via the transfer functions a_{ip}(t). The b_i(t) are unknown noises, assumed to be Gaussian, spatially white, centered and independent of the source waves. In discrete time, the model can be written as

x_i(n) = Σ_{p=1}^{P} Σ_{l=−∞}^{∞} a_{ip}(n − l) s_p(l) + b_i(n).   (2)

2.2. Simplified model

In a great number of applications (underwater acoustics, seismic prospecting, etc.), the propagation of waves causes only a delay and an attenuation. In these situations each transfer function a_{ip}(n) reduces to an attenuation a_{ip} and a time delay n_{ip}, so that model (2) becomes

x_i(n) = Σ_{p=1}^{P} a_{ip} s_p(n − n_{ip}) + b_i(n).   (3)

2.3. Preprocessing

The aim of the preprocessing is to obtain the smallest possible signal subspace. To achieve this, we apply a time correction (i.e. wave alignment) to each recorded signal that compensates the sensor-to-sensor delay of the dominant component [5,7]. In VSP for example, the downgoing waves are the dominant waves and the sensor-to-sensor delay can be estimated by cross-correlation. Denoting by s_1(n) the aligned wave, model (3) after alignment becomes

y_i(n) = a_{i1} s_1(n) + Σ_{p=2}^{P} a_{ip} s_p(n − n'_{ip}) + b'_i(n),   (4)

where y_i(n) = x_i(n + n_{i1}), n'_{ip} = n_{ip} − n_{i1} and the noise b'_i(n) = b_i(n + n_{i1}). From now on, we will consider the simplified model (4), assuming that the aligned source wave s_1(n) is independent of the others and therefore independent of s_p(n − n'_{ip}) [11]. On the assumption that N_t time samples are available, we rewrite the received signals in a data matrix Y ∈ R^{N_c × N_t}:

Y = {y_{in} = y_i(n) | 1 ≤ i ≤ N_c, 1 ≤ n ≤ N_t}.   (5)
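The alignment preprocessing described above can be sketched as follows. This is a minimal illustration (the function and variable names are our own, not the authors'), assuming integer sensor-to-sensor delays estimated by cross-correlation against a reference trace:

```python
import numpy as np

def align_traces(x, ref=0):
    """Align each sensor trace to a reference trace using the integer delay
    that maximizes the cross-correlation; x is an (Nc, Nt) array of
    recorded signals, one row per sensor."""
    nc, nt = x.shape
    y = np.empty_like(x)
    for i in range(nc):
        # full cross-correlation between sensor i and the reference sensor
        c = np.correlate(x[i], x[ref], mode="full")
        delay = np.argmax(c) - (nt - 1)   # estimated sensor-to-sensor delay n_i1
        y[i] = np.roll(x[i], -delay)      # compensate: y_i(n) = x_i(n + n_i1)
    return y
```

Note that np.roll wraps samples around circularly; a production implementation would pad or truncate the traces instead, and real seismic processing typically refines the delay to subsample precision.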

3. Singular value decomposition

The SVD of the data matrix Y is [6,9]

Y = U Σ V^T = Σ_{k=1}^{N} σ_k u_k v_k^T,   (6)

where U ∈ R^{N_c × N_c} is an orthogonal matrix made up of left singular vectors u_k ∈ R^{N_c} (eigenvectors of YY^T), V ∈ R^{N_t × N_t} is an orthogonal matrix made up


of right singular vectors v_k ∈ R^{N_t} (eigenvectors of Y^T Y), and Σ ∈ R^{N_c × N_t} is a pseudo-diagonal matrix Σ = diag(σ_1, ..., σ_k, ..., σ_N) made up of singular values σ_k, with elements ordered σ_1 ≥ ··· ≥ σ_N ≥ 0 and N = min(N_c, N_t). The product u_k v_k^T is a rank-one N_c × N_t matrix called the kth eigenimage of the data matrix Y [4]. Due to the orthogonality of the singular vectors, the eigenimages form an orthogonal basis for the representation of Y.

Let u_k = [u_{k1}, ..., u_{ki}, ..., u_{kN_c}]^T and v_k = [v_{k1}, ..., v_{kj}, ..., v_{kN_t}]^T be the kth left and right singular vectors. In the kth eigenimage, the data sample y_{ij}^k at time j on sensor i is expressed as y_{ij}^k = u_{ki} v_{kj}. Thus:

• v_k is called the "normalized wavelet", since v_{kj} (1 ≤ j ≤ N_t) gives the time dependence of the component associated with the kth eigenimage.
• u_k is called the "propagation vector", since u_{ki} (1 ≤ i ≤ N_c) gives the amplitude of the normalized wavelet received on the ith sensor.

In the noise-free case, if the N_c recorded signals are linearly independent, the matrix Y is full rank and its perfect reconstruction requires all eigenimages. If the recorded signals are linearly dependent (i.e. they are equal to within a scale factor), the matrix Y is of rank one and its perfect reconstruction requires only the first eigenimage.

3.1. Subspace separation

Let M be the rank of Y in the noise-free case and N in the real case (i.e. when the recorded signals are corrupted by noise). Based on the assumption that the noise is spatially white and decorrelated from the source waves, the SVD is applied directly to the raw data Y to achieve a separation between signal and noise subspaces [4,5,7,12–14]:

Y = Y_sig + Y_nois = Σ_{k=1}^{M} σ_k u_k v_k^T + Σ_{k=M+1}^{N} σ_k u_k v_k^T.   (7)

The first M eigenimages of the data matrix represent the signal subspace Y_sig and the remaining N − M the noise subspace Y_nois. In practice, the choice of M depends on the relative magnitudes of the singular values. In Ref. [4], the authors decompose the signal subspace in two parts. In this partition, the data matrix Y


is projected onto three orthogonal subspaces:

Y = Y_LP + Y_BP + Y_HP.   (8)

• The low-pass subspace Y_LP, associated with the Q largest singular values, contains source waves with a "high degree of sensor-to-sensor correlation".
• The band-pass subspace Y_BP, associated with the remaining M − Q singular values of the signal subspace, is made up of events with a "low degree of sensor-to-sensor correlation".
• The high-pass subspace Y_HP is the noise subspace Y_nois.

For example, in the geophysical interpretation of vertical seismic profiling (VSP), after synchronizing the downgoing waves, Y_LP contains the downgoing wavefield, Y_BP the upgoing wavefield and the high-pass subspace Y_HP the noise components. In crosswell(1) applications, Y_LP describes high-amplitude correlated events (direct waves, refracted waves or tube waves), Y_BP low-amplitude correlated events, such as reflections and converted modes, and Y_HP represents the uncorrelated noise.

The purpose of this kind of decomposition is to isolate only the most highly sensor-to-sensor correlated waves in the low-pass subspace Y_LP, like the aligned wave s_1(n) defined in Eq. (4). This is possible if s_1(n) is decorrelated from the other source waves and if there is a large gap between the first Q and the other M − Q singular values describing the signal subspace. It was shown in [5] that Q = 1 for a perfect wave alignment and Q = 2 for a non-perfect wave alignment (for example a phase rotation of s_1(n) along the array due to dispersive phenomena). In this case, SVD provides a good separation of the aligned waves in the low-pass subspace. If there is no large gap between the first Q singular values and the following ones, Y_LP will not extract the aligned waves entirely, and will sometimes also contain high-energy non-aligned waves [14]. Hence, more than 2 singular values are usually needed for a complete extraction of the aligned waves. Let R be the number of singular values necessary to describe the subspace containing the aligned waves (denoted Y_R). Note that Q ≤ R ≤ M and that the subspace Y_R also contains non-aligned waves.

Measurement between two wells.
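As a minimal numerical sketch (not the authors' code), the three-way projection of Eq. (8) can be written directly from the SVD factors; Q and M are assumed to have already been chosen from the singular-value spectrum:

```python
import numpy as np

def svd_split(Y, Q, M):
    """Project Y onto the three orthogonal subspaces of Eq. (8) by ranges of
    singular values: low-pass (first Q eigenimages), band-pass (next M - Q),
    high-pass (the rest)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # partial reconstruction from eigenimages a..b-1
    eig = lambda a, b: U[:, a:b] @ np.diag(s[a:b]) @ Vt[a:b, :]
    return eig(0, Q), eig(Q, M), eig(M, len(s))
```

Because the eigenimages are built from orthogonal singular vectors, the three returned matrices are mutually orthogonal (zero Frobenius inner product) and sum back to Y exactly.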


Let U_R = [u_1, ..., u_R] ∈ R^{N_c × R} and V_R = [v_1, ..., v_R] ∈ R^{N_t × R} be the matrices made up of the first R left and right singular vectors, respectively, and let Σ_R be the diagonal matrix having on its diagonal the first R singular values. The normalized wavelets v_k in V_R describing the subspace Y_R are orthogonal and therefore second-order statistically independent. The propagation vectors u_k in U_R are also orthogonal by construction. However, there is no physical reason why the propagation vectors should be orthogonal. Imposing the orthogonality criterion on the vectors u_k forces the normalized wavelets to be a mixture of source waves [14]. One way of relaxing this constraint is to find another matrix of normalized wavelets Ṽ_R = [ṽ_1, ..., ṽ_R] ∈ R^{N_t × R} whose wavelets are "as independent as possible". Based on the assumption that s_1(n) is independent of the other source waves, this can be achieved by ICA.

4. SVD and ICA

ICA is a blind decomposition of a multichannel dataset made of an unknown linear mixture of unknown source signals, based on the assumption that the source signals are mutually statistically independent [1–3]. This means that the cross-cumulants of any order must be equal to zero. As usual, the third-order cumulants are discarded because they are generally null or close to zero, and we use fourth-order statistics, which have been found to be sufficient for instantaneous mixtures [1,2].

ICA can be solved by a two-stage algorithm consisting of a prewhitening step and a higher-order step. The first step is carried out by the SVD applied directly to the raw data. The normalized wavelets in the matrix V_R ∈ R^{N_t × R}, which are second-order statistically independent, form an instantaneous linear mixture of source waves if the non-aligned source waves in Y_R are contained in a subspace of dimension smaller than R − 1 [14]. Under this assumption, and supposing the source waves mutually statistically independent, the second step consists in finding a rotation (orthogonal) matrix B ∈ R^{R × R} for which the components of Ṽ_R = V_R B are independent at the fourth order. There are different methods of finding the rotation matrix B: joint approximate diagonalization of

eigenmatrices (JADE) [1], maximal diagonality (MD) [2] or simultaneous third-order tensor diagonalization (STOTD) [3]. In this paper we used the JADE algorithm, although we believe that similar results would be obtained with the other algorithms. Since V_R^T = B Ṽ_R^T, the subspace Y_R becomes

Y_R = Σ_{k=1}^{R} σ_k u_k v_k^T = U_R Σ_R V_R^T = C_R Ṽ_R^T,   (9)

where C_R = [c_1, ..., c_R] ∈ R^{N_c × R} is given by C_R = U_R Σ_R B. From this matrix, we obtain two matrices: (2)

• a rectangular matrix Ũ_R = [ũ_1, ..., ũ_R] ∈ R^{N_c × R} made up of normalized columns (the new propagation vectors) ũ_k = c_k / ‖c_k‖, which are generally not orthogonal;
• a diagonal matrix D̃_R = diag(δ_1, ..., δ_R) ∈ R^{R × R} with diagonal elements δ_k = ‖c_k‖, called "modified singular values".

The elements δ_k are usually not ordered. For this reason, we perform a permutation P between the columns of Ũ_R as well as between the columns of Ṽ_R to order the entries of D̃_R. Using the notations Ũ_P(R) = [ũ_P(1), ..., ũ_P(R)], Ṽ_P(R) = [ṽ_P(1), ..., ṽ_P(R)] and D̃_P(R) = diag(δ_P(1), ..., δ_P(R)), with the diagonal elements ordered, i.e. δ_P(1) ≥ ··· ≥ δ_P(R) ≥ 0, the decomposition (9) becomes

Y_R = Ũ_P(R) D̃_P(R) Ṽ_P(R)^T = Σ_{k=1}^{R} δ_P(k) ũ_P(k) ṽ_P(k)^T.   (10)
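A rough sketch of this combined procedure is given below. The fourth-order rotation is found here by a brute-force Givens-angle scan maximizing the sum of squared kurtoses, a crude stand-in for JADE/MD/STOTD; all function names are ours, and a real implementation would use a proper joint-diagonalization algorithm:

```python
import numpy as np

def kurt(w):
    """Fourth-order marginal cumulant (kurtosis) of a sequence."""
    w = w - w.mean()
    w = w / (w.std() + 1e-12)
    return np.mean(w ** 4) - 3.0

def ica_rotation(V, n_sweeps=5, n_angles=90):
    """Find an orthogonal B such that the columns of V @ B are 'as independent
    as possible' at fourth order: for each pair of columns, scan Givens
    rotation angles and keep the one maximizing the sum of squared kurtoses."""
    R = V.shape[1]
    B, W = np.eye(R), V.copy()
    angles = np.linspace(0.0, np.pi / 2, n_angles, endpoint=False)
    for _ in range(n_sweeps):
        for i in range(R - 1):
            for j in range(i + 1, R):
                scores = [kurt(np.cos(t) * W[:, i] + np.sin(t) * W[:, j]) ** 2
                          + kurt(-np.sin(t) * W[:, i] + np.cos(t) * W[:, j]) ** 2
                          for t in angles]
                t = angles[int(np.argmax(scores))]
                c, s = np.cos(t), np.sin(t)
                G = np.eye(R)
                G[i, i] = G[j, j] = c
                G[j, i], G[i, j] = s, -s      # column rotation in the (i, j) plane
                W, B = W @ G, B @ G
    return B

def svd_ica(UR, sR, VR):
    """Modified SVD (Eqs. (9)-(10)): rotate the wavelets at fourth order,
    renormalize C_R = U_R Sigma_R B and reorder by modified singular value."""
    B = ica_rotation(VR)
    V_t = VR @ B                       # new normalized wavelets
    C = UR @ np.diag(sR) @ B           # C_R = U_R Sigma_R B
    d = np.linalg.norm(C, axis=0)      # modified singular values delta_k
    U_t = C / d                        # new, generally non-orthogonal, propagation vectors
    order = np.argsort(d)[::-1]        # permutation P ordering the deltas
    return U_t[:, order], d[order], V_t[:, order]
```

Whatever rotation is used, the product Ũ D̃ Ṽ^T reproduces Y_R exactly, since B is orthogonal and the normalization and permutation only redistribute factors.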

With Q_1 the number of normalized wavelets necessary to describe the highly sensor-to-sensor correlated waves (depending on the relative magnitudes of the elements in the diagonal matrix D̃_P(R)), the low-pass Ỹ_LP and band-pass Ỹ_BP subspaces given by SVD–ICA are

Ỹ_LP = Σ_{k=1}^{Q_1} δ_P(k) ũ_P(k) ṽ_P(k)^T,   (11)

Ỹ_BP = Σ_{k=Q_1+1}^{R} δ_P(k) ũ_P(k) ṽ_P(k)^T + Σ_{k=R+1}^{M} σ_k u_k v_k^T.   (12)

(2) ‖c_k‖ is the ℓ2-norm of the vector c_k.
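Given the permuted SVD–ICA factors of Y_R and the plain SVD factors of Y, Eqs. (11) and (12) translate directly into code. This is a sketch under the assumption Q_1 ≤ R ≤ M, with our own argument names:

```python
import numpy as np

def lowpass_bandpass(U_t, d, V_t, U, s, Vt, Q1, R, M):
    """Low-pass and band-pass subspaces of Eqs. (11)-(12). (U_t, d, V_t) are
    the permuted SVD-ICA factors of Y_R; (U, s, Vt) is the plain SVD of Y."""
    Y_lp = U_t[:, :Q1] @ np.diag(d[:Q1]) @ V_t[:, :Q1].T
    Y_bp = (U_t[:, Q1:R] @ np.diag(d[Q1:R]) @ V_t[:, Q1:R].T   # rotated eigenimages
            + U[:, R:M] @ np.diag(s[R:M]) @ Vt[R:M, :])        # remaining SVD eigenimages
    return Y_lp, Y_bp
```

With the identity rotation (i.e. feeding the plain SVD factors of Y_R as U_t, d, V_t), the two subspaces sum to the signal subspace of Eq. (7), which is a useful sanity check.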


There is no difference between the subspace Y_R obtained with the classical SVD and with SVD–ICA, but there are differences between the low-pass and band-pass subspaces. Due to the orthogonality of the ṽ_P(k), these subspaces are orthogonal as well. Furthermore, in the decomposition (10) a stronger criterion has been imposed on the new normalized wavelets ṽ_P(k), i.e. to be statistically independent at the fourth order, while at the same time the orthogonality condition on the new propagation vectors ũ_P(k) has been relaxed.


Fig. 1. Synthetic dataset Y.

5. Applications

5.1. Synthetic data

In this section we present an application on a synthetic dataset. The preprocessed recorded signals Y on an 8-sensor array (N_c = 8) are represented in Fig. 1. In this case we have P = 3 source waves: the waves wA and wC with infinite apparent propagation velocity on the array (arriving at the same time on each sensor, representing the aligned waves) and the wave wB, with a negative apparent propagation velocity on the array, which is present between sensors 1 and 4. To each wave we have applied an amplitude reduction and a phase rotation along the array in order to simulate absorption phenomena. The waves with a high degree of sensor-to-sensor correlation are wA and wC. Spatially white Gaussian noise was added to these source waves. The signal-to-noise ratio(3) is SNR = 30 dB and the number of time samples is N_t = 256.

The percent relative magnitudes 100 × σ_k / Σ_{k=1}^{N} σ_k of the singular values given by SVD are presented in Fig. 2. The number of singular values used to describe the signal subspace is M = 6. Among those 6 singular values, the first two are related to the most highly correlated waves (Q = 2). Fig. 3 shows the first 6 normalized wavelets v_k. As we can see, these wavelets are an instantaneous linear mixture of the source waves. The low-pass subspace Y_LP obtained with the first Q = 2 eigenimages is presented in Fig. 4 and the

(3) The SNR definition used here is SNR = 20 log_10(‖Y_sig‖ / ‖Y_nois‖), where ‖·‖ is the matrix Frobenius norm, Y_sig the noise-free dataset and Y_nois the additive noise.
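The SNR definition above, together with the error measures used later in this section (the mean-square error of Eq. (13) and the gain factor of Eq. (14)), is straightforward to compute; a small sketch with our own helper names:

```python
import numpy as np

def snr_db(Y_sig, Y_nois):
    """SNR as defined in the footnote: 20*log10 of the ratio of Frobenius norms."""
    return 20.0 * np.log10(np.linalg.norm(Y_sig) / np.linalg.norm(Y_nois))

def mse(Y_lp_ideal, Y_lp_est):
    """Mean-square error of Eq. (13), normalized by the number of samples Nc*Nt."""
    nc, nt = Y_lp_ideal.shape
    return np.linalg.norm(Y_lp_ideal - Y_lp_est) ** 2 / (nc * nt)

def gain_db(mse_svd, mse_svd_ica):
    """Gain factor of Eq. (14) between the two methods, in dB."""
    return 10.0 * np.log10(mse_svd / mse_svd_ica)
```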

Fig. 2. Relative magnitudes in Σ.

Fig. 3. Normalized wavelets in VR .

Fig. 4. Low-pass subspace YLP .

band-pass subspace Y_BP obtained with the following 4 eigenimages in Fig. 5. It is clear from these figures that the classical SVD introduces some artefacts in the low-pass and band-pass subspaces for a wavefield separation objective.


Fig. 5. Band-pass subspace YBP .

Fig. 8. Low-pass subspace Ỹ_LP.

Fig. 6. Relative magnitudes in D̃_P(R).

Fig. 9. Band-pass subspace Ỹ_BP.

Fig. 7. Normalized wavelets in Ṽ_P(R).

Fig. 6 shows the relative magnitudes of the elements in the diagonal matrix D̃_P(R) obtained using SVD–ICA on the subspace Y_R (in this case, R = M = 6), together with the last singular values given by SVD, related to the noise subspace, i.e. σ_7 and σ_8. The number of components that describe the highly correlated waves is Q_1 = 2. With SVD–ICA the partition of the modified values into 3 clusters is more evident than with classical SVD. Furthermore, the normalized wavelets in Ṽ_P(R) presented in Fig. 7 are clearly closer to the original waves than the normalized wavelets given by SVD (Fig. 3).

The low-pass subspace Ỹ_LP is given in Fig. 8 and the band-pass subspace Ỹ_BP in Fig. 9. The low-pass subspace extracts the first two highly correlated waves without any visible interference from the other waves. The residual artefacts present with classical SVD

(Fig. 4) are eliminated. This improvement is due to the fact that through ICA we have imposed a fourth-order independence condition, stronger than the decorrelation used in classical SVD. With this method of decomposition we have relaxed the non-physically justified orthogonality of the propagation vectors.

For a quantitative evaluation of the errors, we use the mean-square error between the ideal low-pass subspace Y_LP and the reconstructed one:

MSE = (1 / (N_c N_t)) ‖Y_LP − Ŷ_LP‖²,   (13)

where ‖·‖ is the matrix Frobenius norm and Ŷ_LP represents either the estimated low-pass subspace Y_LP obtained using the SVD or the estimated low-pass subspace Ỹ_LP obtained using SVD–ICA. The gain factor (in dB) between the two methods is given by

G_MSE = 10 log_10 (MSE|_SVD / MSE|_SVD–ICA).   (14)

In this example we have MSE|_SVD = 15.7 × 10⁻³ for classical SVD and MSE|_SVD–ICA = 5.2 × 10⁻³ for SVD–ICA, giving a gain factor G_MSE = 4.8 dB. The gain factor for different signal-to-noise ratios SNR (in dB) of the dataset is presented in the


Fig. 12. Normalized wavelets in Ṽ_P(R).

Fig. 10. Real dataset Y.

Fig. 13. Relative magnitudes in Σ_R and D̃_P(R).

Fig. 11. The first 9 normalized wavelets in V.

following table:

SNR (dB):    50    40    35    30    20    10
G_MSE (dB):  8.5   7.9   6.9   4.8   3.0   1.3

5.2. Real data

We now present an application on a real dataset obtained during an ocean bottom seismic (OBS) acquisition. The recorded signals Y on an N_c = 40 sensor array with N_t = 480 time samples, after preprocessing, are presented in Fig. 10. In this case, the signal subspace Y_sig is approximately described by the first M = 35 singular values. Fig. 11 shows the first 9 normalized wavelets v_k in V. As we can see, the aligned waves are projected on the first R = 7 normalized wavelets. These wavelets, used to describe the subspace Y_R, are an instantaneous linear mixture of the source waves. With

Fig. 14. Residual subspace Y − YLP .

SVD–ICA, the normalized wavelets ṽ_P(k) in Ṽ_P(R) presented in Fig. 12 are clearly better separated than with the classical SVD.

Fig. 13 presents the relative magnitudes of the elements in the diagonal matrices Σ_R and D̃_P(R). A better distribution of these elements is obtained using SVD–ICA than using classical SVD.

Fig. 14 shows the residual subspace (i.e. the difference between the initial data Y and the low-pass


subspace Y_LP) obtained using SVD and keeping the first Q = 2 eigenimages. In this case, the aligned waves are not completely eliminated. Fig. 15 presents the residual subspace Y − Ỹ_LP obtained using SVD–ICA and Q_1 = 2. Here, the low-pass subspace extracts the highly correlated waves without any visible interference from the other waves.

Fig. 15. Residual subspace Y − Ỹ_LP.

6. Conclusions

We have presented a new method for the separation of a multidimensional signal space into a low-pass and a band-pass subspace. The classical SVD imposes orthogonality on the propagation vectors and forces the normalized wavelets in the right matrix to be a mixture of source waves. This procedure introduces artefacts in the two estimated subspaces. The results were improved, and the errors in the two subspaces decreased, by applying ICA to the first R normalized wavelets of the right orthogonal matrix. The choice of the numbers of singular values Q and M for the classical SVD, as well as Q_1 and R for SVD–ICA, depends on the data themselves. The presented examples have shown that the clustering of modified singular values is more effective with SVD–ICA.

References

[1] J.-F. Cardoso, A. Souloumiac, Blind beamforming for non-Gaussian signals, IEE Proc.-F 140 (6) (1993) 362–370.
[2] P. Comon, Independent component analysis, a new concept?, Signal Process. (special issue on HOS) 36 (3) (April 1994) 287–314.
[3] L. De Lathauwer, Signal processing based on multilinear algebra, Ph.D. Thesis, Katholieke Universiteit Leuven, September 1997.
[4] S.L.M. Freire, T.J. Ulrych, Application of singular value decomposition to vertical seismic profiling, Geophysics 53 (June 1988) 778–785.
[5] F. Glangeaud, J.L. Mari, F. Coppens, Signal Processing for Geologists and Geophysicists, Technip Ed., Paris, 1999.
[6] G.H. Golub, C.F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, London, 1989.
[7] B. Hardage, Crosswell Seismology and Reverse VSP, Geophysical Press Ltd, London, 1992.
[8] C. Hémon, D. Macé, Essai d'une application de la transformation Karhunen–Loève au traitement sismique, Geophys. Prospect. 26 (4) (1978) 600–626.
[9] V.C. Klema, A.J. Laub, The singular value decomposition: its computation and some applications, IEEE Trans. Automat. Control AC-25 (2) (1980) 164–176.
[10] J.-L. Lacoume, P.O. Amblard, P. Comon, Statistiques d'ordre supérieur pour le traitement du signal, Masson, Paris, 1997.
[11] H.-L. Nguyen Thi, C. Jutten, Blind source separation for convolutive mixtures, Signal Process. 45 (1995) 209–229.
[12] L.L. Scharf, Statistical Signal Processing: Detection, Estimation, and Time Series Analysis, Addison-Wesley, New York, 1991.
[13] B. Ursin, Y. Zheng, Identification of seismic reflections using singular value decomposition, Geophys. Prospect. 33 (1985) 773–799.
[14] V.D. Vrabie, Statistiques d'ordre supérieur: applications en géophysique et électrotechnique, Ph.D. Thesis, I.N.P. Grenoble, France, October 2003.