Harmonic tonal detectors based on the BOGA

Harmonic tonal detectors based on the BOGA

Signal Processing 106 (2015) 215–230 Contents lists available at ScienceDirect Signal Processing journal homepage: www.elsevier.com/locate/sigpro H...

595KB Sizes 1 Downloads 16 Views

Signal Processing 106 (2015) 215–230

Contents lists available at ScienceDirect

Signal Processing journal homepage: www.elsevier.com/locate/sigpro

Harmonic tonal detectors based on the BOGA Lu Wang a, Chunru Wan b, Shenghong Li c, Guoan Bi a,n a

School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore DSO National Laboratory, Singapore c Department of Electronic Engineering, Shanghai Jiao Tong University, China b

a r t i c l e i n f o

abstract

Article history: Received 25 December 2013 Received in revised form 18 July 2014 Accepted 4 August 2014 Available online 11 August 2014

Tonals generated by machineries with rotating elements typically have a harmonic structure with unknown fundamental frequencies, amplitude, harmonic order and phase. Detecting this type of signals is of great importance to numerous engineering applications. In the frequency domain, tonals are represented by a few harmonic frequencies, which appear in blocks, related to one or more fundamental frequencies. This block-sparsity property of the frequency content suggests alternative ways to recover and detect tonals by using sparse signal processing techniques. Motivated by the success of the block orthonormal greedy algorithm (BOGA), new detection architectures, which require no prior information about the number of the fundamental frequencies, are proposed for robust tonal detection in low signal to noise ratio (SNR) environments. The distributions of the test statistics of detection architectures are firstly analyzed theoretically and comprehensively based on the theory of order statistics. Detection performances are also analyzed and compared theoretically and experimentally. Significant improvements on detection performance in low SNR environments are shown over the conventional detectors that do not consider the harmonic structure and the sparsity of the tonals. & 2014 Elsevier B.V. All rights reserved.

Keywords: Block-sparsity Signal detection BOGA Order statistic

1. Introduction The detection of acoustic signals produced by the motorized vessels or vehicles is needed for various Sonar applications. Acoustic signals in complex underwater environments are usually contaminated by ambient sounds due to seismic disturbances, oceanic turbulence, and various anthropogenic sources. For simplicity, the ambient noise is generally assumed to be a white Gaussian stationary process and the underwater sources, produced by rotating machineries such as shafts, blades and propellers, are typically modeled as a periodic envelope multiplying broadband noise. This audible phenomenon has

n

Corresponding author. E-mail addresses: [email protected] (L. Wang), [email protected] (C. Wan), [email protected] (S. Li), [email protected] (G. Bi). http://dx.doi.org/10.1016/j.sigpro.2014.08.008 0165-1684/& 2014 Elsevier B.V. All rights reserved.

been widely exploited to extract useful information, such as the number of blades and rotational rate [1]. Demodulated-noise (DEMON) processing is commonly used to obtain the required information based on the Fourier analysis, in which the received signals are presumed to have a harmonic line spectrum, i.e., a series of discrete lines or tonals that are harmonically related [2]. In [3], the received signal, xðtÞ, is defined as cavitation noise, sðtÞ, plus white Gaussian stationary ambient noise vðtÞ: xðtÞ ¼ sðtÞ þ vðtÞ ¼ mω ðtÞcðtÞ þvðtÞ where mω ðtÞ is the periodic modulating signal with fundamental frequency ω determined by the motion of the rotating machineries, and cðtÞ is assumed to be stationary and change slowly with time. Within a short time period, we can reasonably assume that cðtÞ is a constant. Since mω is presumed to be periodic, the spectrum of the acoustic signal contains several harmonically related narrow-band peaks (or

216

L. Wang et al. / Signal Processing 106 (2015) 215–230

tonals) which can be modeled as sinusoids. Because engines typically have more than one rotating or vibrating elements, these harmonics appear in different blocks related to their fundamental frequencies determined by the period of the rotating elements. Several types of tonal detectors exist in the literature. The most common technique, low frequency analysis and recording (LOFAR) [4], averages the spectra of data segments after noise-mean removal and normalization. Wagstaff's integration silence processor (WISPR) [5] achieves better performance by integrating the spectrum of each data segment nonlinearly based on the fluctuations between signal and noise. Unlike the LOFAR and WISPR, which perform simple integration, a generalized likelihood ratio test based on power spectrum (PGLRT) [6] optimally integrates the power spectra of data segments according to their statistical properties. Although the PGLRT is optimal from the perspective of statistical properties, it loses the phase information of the signal by squaring the amplitude of the discrete Fourier transform (DFT). The generalized likelihood ratio test based on coherent integration (CGLRT) [7] was then proposed to make full use of the phase information to achieve coherent signal integration for improvement on the detection performance. Previously reported detectors only focus on the detection of a single spectrum line without sufficiently considering the harmonic structure. It is generally assumed that observation time is long enough such that incoherent and coherent integration can be applied to increase the SNR. This paper considers more critical application environments with very low SNR and short observation duration, in which cðtÞ is considered to be constant. In such cases, most detectors based on long time integration cannot provide satisfactory performance. Conventionally, one may try to make use of the concept of a generalized matched filter (GMF) to achieve the optimal detection performance under the assumption that the signal space is known [8]. However, the only prior information is that these tonals are harmonically related to their own fundamental frequencies. Since the number of the fundamental frequencies and the harmonic order associated with each fundamental frequency are unknown, it becomes impossible to implement the GMF technique. This paper will show that the harmonic tonals associated with different fundamental frequencies are blocksparse in nature [11]. That is, only a few harmonic blocks of the tonals are nonzero. In [11], audio signal is decomposed successfully by the Matching Pursuit (MP) with a block structure and a detector based on the estimation of the block-structured MP is designed particularly for music signals. Inspired by [11], our basic idea is to take the block-sparsity into consideration to find the support of the signal and then design the detector for harmonic tonals in Sonar application from a statistical signal processing perspective. Block-sparsity has been recently exploited for many applications such as: motion segmentation [9], multiband signal reconstruction [10], and harmonic decomposition of audio signals [11]. Algorithms, such as lp;q norm based convex optimization, group lasso [12], block orthogonal greedy algorithm (BOGA) [13–16], as well as the block-sparse Bayesian

learning (BSBL) algorithm [17], have been reported in the literature to solve the problem with block-sparsity. It is also noted that block-sparsity has already used in [28,29] for flexible multi-pitch estimation without prior knowledge of numbers of pitches and their harmonics in speech and audio processing. Other fundamental frequency and model order estimation methods can be found in [26,27,30] and the references therein. In this paper, the BOGA method is used due to the following two reasons. Firstly, since group lasso algorithm requires convex optimization and BSBL requires an iterative expectation maximization (EM) procedure, both methods are much more expensive in computation complexity compared to the BOGA. Secondly, since the BOGA method finds the signal support in successive steps, it is easy to insert a sequential detection procedure into each step of the BOGA. Two BOGA based architectures are proposed. The first one is a subspace harmonic detector in which the BOGA is firstly applied to find the signal space and then the energy detector is used to detect the existence of the signal. By inserting a sequential detection procedure into each step of BOGA, the second one is a sequential harmonic detector based on order statistics. Both proposed detectors accumulate the energy from the fundamental frequencies and their associated harmonics to achieve robust detection performance in low SNR environments. There are three main contributions of this paper. The first one is to formulate the harmonic tonals by the block-sparse signal model and suggest using the block sparse recovery techniques in harmonic detection and estimation. The second contribution is that two new detection architectures are proposed by modifying the BOGA which theoretically uses the maximal signal energy in the tonal detection with no prior information of the number of harmonic sets. The theoretical analysis of the test statistics and the detection performances using the theory of the order statistics are given comprehensively, which is the third contribution of this paper. This paper is organized as follows. In the next section, the harmonic signal model, dictionary construction and a harmonic signal model with block-sparsity are presented. Based on the constructed dictionary, two detection architectures are proposed in Section 3. By studying the test statistics and thresholds under the null hypothesis, the performances of the proposed architectures are also analyzed in this section. Some issues on the proposed architectures are discussed in Section 4. The results of simulations are given in Section 5 to show the effectiveness of the proposed detection architectures. Finally, Section 6 discusses some limitations of the reported architectures and addresses the possible future work. Throughout the paper, we denote the variables by lowercase and uppercase letters, e.g., k and K, the vectors by bold lowercase letters, e.g., u and its lth element uðlÞ, and the matrices by bold uppercase letters, e.g., D and its lth column by dl . Given a vector u constructed by a concatenation of blocks of the same size, its lth block is denoted by u½l. For a given matrix D, DH and D† denote its conjugate transpose and pseudo-inverse, respectively. Finally we use ‖‖q , where q ¼ 2; 1, to represent l2 and l1 norm.

L. Wang et al. / Signal Processing 106 (2015) 215–230

2. Signal model In this section, the model of the harmonic signals and its relation with the block-sparsity signal model are introduced. 2.1. Harmonic signal model It is assumed that the received signals may contain K simultaneous fundamental frequencies. During a short observation period, the tonal with a harmonic structure [3] can be expressed into K

M

∑ am;k expðjðmωk n þ θm;k ÞÞ þvðnÞ

xðnÞ ¼ ∑

ð1Þ

k¼1m¼1

where n ¼ 0; 1; …; N  1, ωk is the kth fundamental frequency, K is the total number of fundamental frequencies, am;k and θm;k are, respectively, the amplitude and the phase of the mth harmonic of the kth fundamental frequency, M is the maximum order of the harmonics and vðnÞ is the additive complex white Gaussian noise with a variance σ2. Because of the short duration of the signal, the amplitude, am;k , is assumed to be constant. In matrix notation, the model defined in (1) is expressed as x ¼ s þ v ¼ D0 b0 þ v;

ð2Þ

where D0 ¼ ½D½1D½2…D½K; 2 6 6 D½k ¼ 6 6 4

1 ejωk ⋮

ejωk ðN  1Þ

… … ⋱ …

1 ejMωk ⋮

ejMωk ðN  1Þ Þ

3 7 7 7; 7 5

2

3T

6 7 b1 …bM bM þ 1 …b2M … bðK  1ÞM þ 1 …bKM 7 ; b0 ¼ 6 4|fflfflfflffl{zfflfflfflffl} |fflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflffl} |fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl}5 T

b ½1

T

b ½2

bT ½K

bkm ¼ am;k ejθm;k : In the above equations, harmonic structure is included in the observation matrix D0 , which contains K columnblocks, i.e., sub-matrix D½k for k ¼ 1; …; K, corresponding to the kth fundamental frequency. This signal model allows us to develop detection algorithms utilizing the inherent harmonic structure. To reformulate the model into a standard block-sparse signal model, a proper dictionary with a block structure is constructed. Similar to the time-frequency analysis, signals are represented as a product between the corresponding coefficient vector and a dictionary whose atoms are represented by Gabor atoms, wavelets, or Fourier basis. As shown in (2), the atoms used to represent the harmonics are the Fourier atoms. Without any prior information about the fundamental frequencies, an over-complete dictionary is constructed to include all the possible fundamental frequencies, which unavoidably results in a large dictionary and more computation complexity. In this section, we attempt to simplify the dictionary construction

217

for practical applications by coarsely estimating possible fundamental frequencies before other processing steps are performed. 2.2. Dictionary construction The size of the sub-dictionary is determined by the maximum harmonic order which is assumed to be known since it can be determined from the properties of the sound sources in a particular application. The exact harmonic order of each fundamental frequency is not necessarily required because our detection methods are not sensitive to the inaccuracies of the maximum harmonic order. Compared with the Fourier transform, the sparse representation can achieve high resolution by setting narrow frequency bins in the fundamental frequency range given that the signal has sparse representation. However, narrow frequency bins lead to high dictionary H coherence, defined by μ ¼ maxr a l ðdr dl Þ, which results in a strict constraint of low sparsity, as shown in [18–20]. For example, the normalized Fourier basis has atoms pffiffiffiffi dl ðkÞ ¼ expf  j2π lðk  1Þ=ðhNÞg= N; k ¼ 1; …; N;

i ¼ 1; …; hN:

When the width of the frequency bin is 2π =N, i.e., h ¼1, the dictionary, D, becomes the same as that of the conventional Fourier transform, which is ideal for the sparse recovery due to its orthogonality. When h¼2, however, the narrower frequency bin results in an over-complete dictionary with a much higher coherence μ ¼ 1=ðN sin ðNπ ÞÞ [19]. This issue has been dealt with in [20] to achieve super-resolution by using over-sampled DFT frame and model based sparse signal recovery algorithm. Since this paper mainly focuses on the detection of the harmonics rather than recovering exact signal frequencies, the dictionary is chosen to be orthogonal, i.e., h¼1. So far, the possible fundamental frequencies are still unknown for dictionary construction. In some applications, the range of the fundamental frequencies can be approximately determined by the velocity of the sources. Assuming a known fundamental frequency range, the dictionary can be constructed by a set of fundamental frequencies evenly distributed in the range. If such information is unknown, a general solution is to estimate the fundamental frequencies from the data. To coarsely estimate the fundamental frequencies, a maximum likelihood (ML) estimator making use of the harmonic structure is introduced. Let us assume that there is only one fundamental frequency ω in the signal. Then D0 in (2) degenerates to Dω . Based on this inherent harmonic structure, a variable, E ¼ ðDω bÞH ðDω bÞ, is defined to represent the coherent integration of the energy from each harmonic component. Since both ω and b are unknown, it is necessary to estimate them beforehand. Assuming the noise follows a complex Gaussian distribution, we have   p x; b; σ 2 ; ω   1 1 ð3Þ ¼ N 2N exp  2 ðx Dω bÞH ðx  Dω bÞ :

π σ

σ

218

L. Wang et al. / Signal Processing 106 (2015) 215–230

If ω is known, the ML estimate of b can be found by minimizing the following cost function: JðbÞ ¼ ðx  Dω bÞH ðx  Dω bÞ:

ð4Þ

Therefore the ML estimate of b can be represented by 1 H b^ ¼ ðDH ω Dω Þ Dω x:

ð5Þ

Putting (5) into (4), the cost function becomes ^ ¼ xH ðI  Dω ðDH Dω Þ  1 DH Þx: JðbÞ ω ω

ð6Þ

Then, ω is found by minimizing (6), or equivalently maximizing 1 H xH ðDω ðDH ω Dω Þ Dω Þx:

ð7Þ H

With (5) and (7) for E ¼ ðDω bÞ ðDω bÞ, we obtain the total estimated harmonic energy for a fundamental frequency ω: 1 H E^ ¼ maxðxH ðDω ðDH ω Dω Þ Dω ÞxÞ

ω

¼ maxðxH PDω xÞ

ð8Þ

ω

where PDω represents the projection matrix onto the signal space spanned by the columns of Dω , and xH PDω x represents the projection energy of the observations upon the space. From (8), it is concluded that the summation of the energy from all harmonic components of a fundamental frequency can be obtained by searching for the maximum ^ and the corresponding frequency bin is the ML value of E, estimate of the fundamental frequency ω. Eq. (8) has already been proposed and named as an approximate nonlinear least-squares (NLS) estimator in [26,27], where detailed properties and performance analysis can be found. A similar idea of combining the harmonic structure into pitch tracking has been addressed in [21,22]. It is shown in [21] that the variance of the pitch estimation is significantly lower than that obtained by considering the fundamental frequency alone. However, the existence of multiple fundamental frequencies may lead to false peak estimation and sometimes the true peaks are submerged by noise in the low SNR environment. In general, a number of peaks are estimated simultaneously to ensure that the estimates cover all the possible fundamental frequencies.

b½l ¼ ½b1l …bMl T : Then (2) becomes x ¼ Db þ v:

ð11Þ

Comparing (2) with (11), the K nonzero blocks of coefficient vector b correspond to the K fundamental frequencies in the signal. It is suggested that the original timedomain signal s defined in (1) is a block-sparse signal with block-sparse degree of K when it is represented on dictionary D since K oL. Therefore, amplitude vector b0 is the support of the coefficient vector b and the observation matrix D0 is the corresponding part in D. This complex block-sparse signal model is quite a good fit to the harmonic tonal, which is one of the contributions of this paper. By estimating the complex coefficient vector b based on the block sparse signal recovery techniques, the information of harmonic tonal can be totally retrieved. In the remaining parts of this paper, the block-sparse structure is exploited to show its potentials in weak signal detection. 3. Harmonic signal detection Two architectures are developed for detecting the harmonic signals with multiple fundamental frequencies based on the BOGA and their detection performances are analyzed theoretically in this section. The proposed architectures create test statistics gradually in steps of BOGA to accumulate signal energy including that of the harmonics of each fundamental frequency. These architectures require to know the noise variance and are suitable for the scenario that additional data are available to estimate the noise power σ2. In the proposed architectures, complex coefficient vector b is assumed to be unknown but determinative. Generally, it is a binary hypothesis testing problem: H 0 : b ¼ 0;

H 1 : b a 0:

where H0 is the noise-only hypothesis and H1 is the signalpresent hypothesis.

2.3. Block-sparse signal model

3.1. BOGA description

Without loss of generality, let us assume that L peaks are estimated in a set ½ω1 ω2 …ωL . Based on the estimated fundamental frequencies, a dictionary D composed of L blocks is constructed to represent the original signal:

BOGA is an extension of orthogonal matching pursuit (OMP) and has been reported to successfully recover the block-sparse signal in a recursive manner in both noisy and noiseless environments [13]. The BOGA starts with an initial residual r0 ¼ x and finds a low-dimensional block support at each iteration step. At the lth step, a column block is selected based on its correlation with the current residual

D ¼ ½D½1…D½L;

ð9Þ

where Dl ¼ ½d1l …dMl ; dml ¼ expðj2π mωl nÞ;

il ¼ arg max J D½iH rl  1 J 22 :

1 r m r M:

i

The coefficient vector is represented by b, which is a concatenation of sub-blocks b½l, denoting the ith block of size M, that is: b ¼ ½b½1…b½LT where

ð10Þ

ð12Þ

The following decomposition consisting of the blocks selected from the current and previous steps is built by l

^  x^ l ¼ ∑ D½ij b½i j j¼1

ð13Þ

L. Wang et al. / Signal Processing 106 (2015) 215–230

^  that are fitted by least where the block coefficients b½i j squares to minimize ‖x  x^ l ‖2 are given by b^ ¼ D†l x

1 H with D†l ¼ ðDH Dl l Dl Þ

and Dl ¼ ½D½i1 …D½il :

ð14Þ

A new residual is then updated by rl ¼ x  x^ l

ð15Þ

as the input to the next step of the algorithm. The BOGA is terminated once the residual signal energy is equal to or BOGA below a threshold, i.e., δ . For the signal corrupted by bounded noise, it is stated in [14,15] that for signal with K fundamental frequencies, the BOGA selects one correct block from the dictionary at each step and stops exactly at BOGA K steps when δ is equal or larger than the energy of the bounded noise. If the noise is unbounded, such as the white Gaussian noise, it is shown in [16] that the BOGA selects the first K correct blocks with a high probability and stops after the Kth step under the assumption that K is known. When K is unknown, the stopping criterion that BOGA is terminated once the residual signal energy is BOGA equal to or below a threshold, i.e., δ ¼ βNσ 2 is often used in practical applications, where β is a user selected parameter. In this paper, we assume that the BOGA stops at the K 0 th step, where the value of K 0 can be determined in two ways. One way is to select K 0 as a fixed number based on the prior information on the number of fundamental frequencies. When there is no such a prior information, K 0 will be given randomly by the criterion that BOGA stops when residual signal energy is equal to or BOGA below δ , i.e., J rl J 22 r βNσ 2 . To the best knowledge of the authors, there has been no theoretical analysis on the performance of BOGA when this criterion is used in Gaussian noise environment. 3.2. Architecture 1 The details of the architecture are given below. Algorithm 1. Detection architecture 1. Initialize the residual r0 ¼ x, and step counter l ¼ 0. while l r K 0 do il ¼ arg maxi J D½iH rl  1 J 22 H

T l ¼ J D½il  rl  1 J 22 Estimate the coefficient for the selected blocks: b^ ¼ D† x l

Update the residual: rl ¼ x  x^ l end while Create test statistic Tarch 1 T arch1 ¼ T 1 þ T 2 þ …þ T K 0 Make decision 1 T arch1 ⋛H H 0 γ arch1

where

γarch 1 is the threshold of architecture 1.

A similar idea has been used for incoherent detection and estimation algorithm (IDEA) [23] in which matching pursuit (MP) is used to estimate the coefficient vector and the decision is simply made by J b^ J 1 4 γ , where γ is the threshold. Architecture 1 can be thought as an extension of IDEA to block-sparse signal with a different form of test

219

statistic. In each step of the BOGA, the partial signal energy is recorded after selecting one correct block. Since the BOGA algorithm never chooses the same block more than once, summation of signal energy in K 0 steps, which is the test statistic used in architecture 1, i.e., T arch1 ¼ ∑K0 l ¼ 1Tl, yields the estimation of the total signal energy. The theoretical performance of architecture 1 is studied for different selections of K 0 in A.1. When K 0 is fixed and known a priori, the distribution Tarch 1 is studied in Sections 7.1.1 and 7.1.2. The theoretical detection probability (PD) under the assumption that BOGA selects the correct blocks in first K steps and the probability of false alarm (PFA) for fixed K 0 is given in (32), (34) and (37) to approximately evaluate the performance of architecture 1. They are summarized below: Under H0, T arch1 0 ¼ T arch1 =ðN σ 2 =2Þ is the sum of K 0 largest order statistics in a sample of L observations from χ 22M distribution. The PFA is given by Z P FA ¼

1 2γ arch1 =Nσ 2

f T arch1 0 ðxÞ dx

ð16Þ

with f T arch1 0 ðxÞ is given by f T arch1 0 ðxÞ ¼ 1=2f K 0 ð2xÞ

ð17Þ

where f K 0 ð2xÞ can be found in (30). Under H1, assume that BOGA selects the correct signal blocks in the first K steps and that K 0 ZK. When K 0 ¼ K, the test statistic follows noncentral chi-squared distribution with density function f χ 2 ðλÞ, where λ ¼ N‖b0 ‖22 =ðσ 2 =2Þ. 2KM The PD of architecture 1 is given by Z PD ¼

1

f χ2

γ

2KM

ðλÞ ðxÞ

ð18Þ

dx

where γ ¼ 2γ arch1 =N σ 2 . When K 0 4K, the density function f T 0arch1 is the convolution of density functions of two independent distributions, given by Z f T 0arch1 ðyÞ ¼ f χ 2 ðλÞ ðxÞf T 2 ðy xÞ dx ð19Þ 2KM

arch1

where f T 2 ¼ 1=2f K 0  K ð2xÞ with f K 0  K given in (30). The arch1 PD of architecture 1 when K 0 4K is given by Z PD ¼

1

γ

f T 0arch1 ðyÞ dy:

ð20Þ

When K 0 is randomly determined by the stopping criterion, J rl J 22 r βNσ 2 , the theoretical distribution of test statistic, Tarch 1, is hard to find. To determine the threshold, this distribution under H0 is found by Monte Carlo method. It is seen in A.1.3 that the density function of the test statistic under the null hypothesis is a linear combination of order statistics conditioned on K 0 with coefficients given by the probability of mass function of K 0 . Assume the conditional distribution of Tarch 1 on K 0 ¼ l and PrfK 0 ¼ lg are known and represented by f T arch1 jK 0 ¼ l and C l ðβ Þ. The PFA of architecture 1 for random K 0 is given by Z P FA ¼

1 2γ arch1 =Nσ 2

f 0T arch1 ðxÞ dx

ð21Þ

220

L. Wang et al. / Signal Processing 106 (2015) 215–230

T arch2 jK 0 ¼ l and PrfK 0 ¼ lg are known is given in A.2.2:

with f 0T arch1 ðxÞ ¼ ∑f 0T arch1 jK 0 ¼ kðxÞPrðK 0 ¼ kÞ:

ð22Þ

K0

P FA ¼ ∑P FA;k jK 0 ¼ k PrðK 0 ¼ kÞ k

rc1 ðβÞQ f 1 0 þc2 ðβÞðQ f 1 0 þð1  Q f 1 0 ÞQ f 2 0 Þ þ ⋯ þ cl ðβ ÞfQ f 1 0 þ ð1  Q f 1 0 ÞQ f 2 0 þ ⋯ þ ð1  Q f 1 0 Þ⋯ð1 Q f k  1 0 ÞQ f k 0 g þ⋯

3.3. Architecture 2 Detection architecture 1 tries to estimate the support of the signal and uses the estimated signal energy as the test statistic. The detection performance depends on the selection of K 0 . To alleviate this dependency, a detector, known as architecture 2, is proposed to sequentially integrate the energy of the signal in steps of BOGA. Algorithm 2. Detection architecture 2 Initialize the residual r0 ¼ x and step counter l ¼0. while l r K 0 do Selected a block and construct: il ¼ arg maxi J D½iH rl  1 J 22 b^ ¼ D†l x Update the residual: rl ¼ x  x^ l

T l ¼ J D½il H rl  1 J 22

if T arch2 ¼ ∑li ¼ 1 T i 4 γ l then Choose H1 and exit the loop else l ¼ lþ1 end if end while Choose H0 where

γl is the threshold at step l.

As shown in Algorithm 2, a decision is made in each step of BOGA. Once the decision H1 is made, the detection process is terminated directly. Otherwise, if decision H0 is made at l-th step, estimated block energy T l þ 1 is added to Tarch 2 and a new decision is made for step l þ 1. Decision on H0 is made if l r K 0 . Similarly, K 0 can be set as a fixed value based on the prior on the number of fundamental frequencies or be determined randomly by stopping criterion J rl J 22 r βNσ 2 . The thresholds for the steps of architecture 2 can be determined theoretically by finding the distribution of test statistics under null hypothesis if K 0 is fixed. The derivations are given in A.2.1. The theoretical probability bound of the false alarm for this case is given by K0

P FA ¼ ∑ P FA;l l¼1

rQ f 1 þ ð1 Q f 1 ÞQ f 2 þ ⋯ ð23Þ þð1  Q f 1 Þ⋯ð1 Q f K 0  1 ÞQ f K 0 R 1 0 where Q f l ðγ l =Nσ 2 Þ ¼ γ =Nσ 2 f l ðxÞ dx for 1 r l r K is the l right-tail probability of a random variable with the PDF fl in (30). When K 0 ¼ 1, the equality in (23) holds strictly. When K 0 is randomly determined by the stopping criterion as it is used in the architecture 1, the distribution of the test statistic Tarch 2 under H0 can be found by Monte Carlo method to determine the threshold at each step. The final probability bound of false alarm for this case under the assumption that the conditional distribution of

ð24Þ

where Q f 0l , given in (49), is the right tail function of the density function of Tarch 2 at the lth step. The theoretical detection probability of architecture 2 cannot be determined theoretically. We evaluate the detection probabilities under both ways of selecting K 0 by Monte Carlo method in the simulation. 4. Discussion 4.1. The effect of

β on the distribution of test statistic

The test statistic distributions of both architectures under the stopping criterion J rl J 22 r βNσ 2 depend on β. As discussed in (A.1.3), β controls the PMF of K 0 . A small β will result in a high possibility of a relatively large K 0 . It leads to a heavy tail distribution of test statistic under H0 hypothesis, thus resulting in a high threshold γ in order to keep a low probability of false alarm. On the contrary, a large β results in a relatively small K 0 , which gives a small threshold, but a high possibility that the BOGA stops before the Kth step when there are still signal components present in low SNR environments. Therefore, there must be an optimal value of β which yields the best detection performance. Unfortunately, neither the PMF of K 0 nor the distribution of test statistic under H1 hypothesis can be found theoretically. We resort to the Monte Carlo method to select the optimal β. 4.2. Difference between the two proposed architectures The detection architecture 1 tries to give an estimated number of fundamental frequencies and detect the signal based on the estimated signal subspace. In contrast, the detection architecture 2 detects the existence of the signal in each step of the BOGA. Once the decision on H1 is made, the BOGA stops. Therefore architecture 2 is more efficient and saves computational time. However, the optimal thresholds in architecture 2 are difficult to find. We determine the threshold at each step of the BOGA by setting a constant false alarm probability in the current step to provide a theoretical false alarm probability bound of architecture 2 as a reference. 4.3. Model mismatch A model mismatch occurs when the value of M used in the signal model is not equal to the true harmonic order of the received signal. When M is smaller than the true harmonic order, the created test statistic contains only a part of the signal energy. When M is larger, the test statistic contains the partial noise energy. Both situations result in a gracefully degraded detection performance as shown in the next section.

L. Wang et al. / Signal Processing 106 (2015) 215–230

221

Table 1 Estimated PMF of K 0 . b

C0

C^ 0

C^ 1

C^ 2

C^ 3

C^ 4

C^ 5

C^ 6

C^ 7

C^ 8

C^ 9

β ¼ 0.95 β ¼ 1.00 β ¼ 1.05 β ¼ 1.10 β ¼ 1.15 β ¼ 1.20 β ¼ 1.25

0.2926 0.5118 0.7210 0.8697 0.9503 0.9845 0.9960

0.29256 0.51209 0.72076 0.86937 0.95025 0.98440 0.99607

0.31013 0.28769 0.19814 0.10455 0.04312 0.01408 0.00366

0.23035 0.13863 0.06365 0.02206 0.00599 0.00140 0.00025

0.11329 0.04724 0.01459 0.00352 0.00055 0.00013 0.00002

0.04033 0.01169 0.00254 0.00045 0.00008 0.000003 0

0.01040 0.00229 0.00028 0.00005 0.000006 0 0

0.00247 0.00030 0.00003 0 0 0 0

0.00041 0.00005 0.000003 0 0 0 0

0.000045 0.000013 0 0 0 0 0

0.000003 0 0 0 0 0 0

5. Simulation results In this section, two sets of simulations are conducted. The first set is conducted to estimate the PMF of K 0 and verify some theoretical derivations of the distributions of test statistics. The second set aims at evaluating the detection performances of the proposed architectures by Monte Carlo method. 5.1. PMF of K 0 and Verification of Equations in (47) 5.1.1. Estimate PMF of K 0 under H0 As demonstrated in A.1.3, only the probability of K 0 ¼ 0, i.e., C 0 ðβÞ can be found exactly. The Monte Carlo method is used to estimate the PMF of K 0 when K 0 a0. As shown in A.1.3, Cl depends on β and N only. With fixed β and N, any σ2 results in the same PMF of K 0 . Therefore in our simulation, it is assumed the noise follows a complex white normal distribution with a length N ¼256 and the sampling frequency is 256 Hz. The block length is M¼4. We perform 30 experiments, each having 10 000 independent realizations. In each experiment, the relative frequency of K 0 ¼ l is recorded and the mean of the relative frequencies of K 0 ¼ l in 30 experiments are computed as C^ l . The estimated PMF of K^ with different values of β is given in Table 1, where C0 is theoretically computed by (38). As seen from Table 1, the estimated values of C^ 0 s are quite similar to the theoretical ones. Since the theoretical C0 is given, the estimated Cl can be further scaled to satisfy C^ 1 þ C^ 2 þ⋯ ¼ 1  C 0 . These scaled estimated C^ l s are used to give the PDF of test statistics for both architectures. 5.1.2. Verification To verify the equations in (47) by Monte Carlo method, we take T1 as an example. As shown in Fig. 1, the estimated histogram bar of T1 for fixed K 0 , denoted by f^ T 1 and the estimated histograms bar of T1 for random K 0 under the 0 conditions that K 0 ¼ 1, 2, and 3, denoted by f^ T 1 jK 0 ¼ 3 , 0 ^f 0 0 ^ given when T 1 jK ¼ 2 and f T 1 jK 0 ¼ 1 , respectively, are 0 0 β ¼ 1:1. Those estimated histograms f^ T 1 jK 0 ¼ 3 , f^ T 1 jK 0 ¼ 2 0 and f^ T 1 jK 0 ¼ 1 are quite similar to each other, which shows the correctness of our derivations in (47). The curves marked by dash-dotgive the theoretical density function of the largest order statistics without any constraint, which fits the estimated histogram bar of T1 quite well for fixed K 0 . It is seen that the density function of T1 for fixed K 0 does 0 differ from f T 1 jK 0 ¼ 1 due to the dependence of T1 on random integer variable K 0 . As long as density functions 0 f T 1 þ T 2 þ ⋯T k jK 0 ¼ k and Ck are estimated, thresholds for

architecture 1 and architecture 2 can be computed according to (42) and (56) for random K 0 with different values of β. 5.2. Detection performance evaluation In this subsection, we first conduct a serial of simulations with different values of β to find the best one producing the highest detection probability. Then the performances of the proposed architectures, the single tone detector based on the Fourier power spectrum given in (60) and the energy detector given in (63) are compared to show the improvement obtained by exploring the sparse harmonic structure. The performances of the proposed architectures are also compared with that of the GLRT detector [31] given in (57) to measure the performance loss. Thirdly, the comparisons of PFA, given in (32), (53) and (55), with those obtained by Monte Carlo method are made to verify the theoretical derivations in Appendix. Finally, we measure the performance loss due to model mismatch, i.e., incorrect information of harmonic order. In our simulations, the false alarm rate is P FA ¼ 0:03 for architecture 1. To determine the threshold in each detection of architecture 2 for fixed K 0 and random K 0 , respectively, P gFA ¼ 0:03 and P gFA ¼ 0:2 are used. The input signal for all the proposed architectures is sampled at a frequency fs¼ 256 Hz and has a length of N ¼ 256. Simulations are conducted using different signals which contain one ðK ¼ 1Þ and two ðK ¼ 2Þ harmonic sets. Fundamental frequencies are ω1 ¼ 11 Hz and ω2 ¼ 37 Hz, respectively, with harmonic order M ¼4. The SNR is defined as the ratio of signal power and noise power in a harmonic set, i.e., ‖D½ib½i‖22 =ðNσ 2 Þ. It is also assumed that each harmonic set has the same SNR in our simulations. 5.2.1. Selection of β Fig. 2(a) and (b) gives the simulated detection probabilities obtained by architecture 1 with different values of β for K¼1 and 2, respectively. As seen from both figures, the detection performance does depend on β and the best detection performance is achieved when β ¼ 1:1. Therefore, we chose β ¼ 1:1 for all the following simulations. It is also seen in Fig. 2(a) and (b), the detection probability of architecture 1 is improved when the number of harmonic sets K increases. 5.2.2. Detection comparisons Fig. 3(a) and (b) gives the detection probabilities obtained by architecture 1 with different SNR values for

222

L. Wang et al. / Signal Processing 106 (2015) 215–230

0.12

0.12

0.1

0.1

0.08

0.08

0.06

0.06

0.04

0.04

0.02

0.02 10

20

30

0

40

10

20

30

40

10

20

30

40

0.12

0.12

0.1

0.1

0.08

0.08 0.06

0.06

0.04

0.04

0.02

0.02 10

20

30

40

1

1

0.9

0.9

0.8 0.7 0.6 0.5

beta=0.95 beta=1 beta=1.05 beta=1.1 beta=1.15 beta=1.2

0.4 0.3 0.2 0.1 0 −15

−10

−5

0

SNR(dB)

probability of detection

probability of detection

Fig. 1. Density function comparisons. The histogram bars are obtained by the Monte Carlo method and the dash-dot curves are the theoretical density functions. (a) Fixed K 0 . (b) Random K 0 ¼ 1. (c) Random K 0 ¼ 2. (d) Random K 0 ¼ 3.

0.8 0.7 0.6 0.5 0.4

beta=0.95 beta=1 beta=1.05 beta=1.1 beta=1.15 beta=1.2

0.3 0.2 0.1 0 −15

−10

−5

0

SNR(dB)

Fig. 2. Detection performances of architecture 1 for different values of β. (a) K¼ 1. (b) K¼2.

K ¼1 and 2, respectively. The dashed curves give the detection probabilities of architecture 1 for fixed K 0 obtained by the Monte Carlo method with the thresholds

computed according to (32) when K 0 ¼ 1; 2 and 3, respectively. The solid curves marked by the symbol, þ, represents the detection probability of the architecture 1 for

L. Wang et al. / Signal Processing 106 (2015) 215–230

1

probability of detection

probability of detection

1

0.8

0.6

K’=1 0.4

K’=2 K’=3

0.2

beta=1.1 0 −15

223

−10

−5

0.8

0.6

0.4

K’=3 0.2

beta=1.1 0 −15

0

K’=2

−10

SNR/dB

0

1

probability of detection

probability of detection

1

0.8

0.6

K’=1 0.4

K’=2 K’=3

0.2

beta=1.1 0 −15

−5

SNR(dB)

−10

−5

0

SNR(dB)

0.8

0.6

0.4

K’=2 K’=3

0.2

0 −15

beta=1.1

−10

−5

0

SNR(dB)

Fig. 3. Detection performance of proposed architectures. (a) Architecture 1 K¼ 1. (b) Architecture 1 K¼2. (c) Architecture 2 K¼1. (d) Architecture 2 K¼2.

random K 0 when β ¼ 1:1. As seen from both figures, the detection performance is decreased when K 0 is larger than the true number of harmonic sets K. The detection probability of architecture 1 for random K 0 is a slightly lower than that for the fixed K 0 . However, it requires no prior information of the number of the harmonic sets. Fig. 3(c) and (d) gives the detection probabilities of architecture 2 with different SNR values for K ¼1 and 2, respectively. The dashed curves give the detection probabilities of architecture 2 for fixed K 0 obtained by the Monte Carlo method with the thresholds computed according to (54) when K 0 ¼ 1; 2 and 3, respectively. The solid curves marked by the symbol, n, represents the detection probability of architecture 2 for random K 0 when β ¼ 1:1. Similarly, the detection probability of architecture 2 for random K 0 is slightly lower than that for the fixed K 0 without requiring the prior information on the number of the harmonic sets. The detection performance of architecture 2 when K 0 is slightly larger than the true number of harmonic sets is similar to that when K 0 ¼ K with increased probability of false alarm, as seen in Fig. 5 on

which more details on false alarm rates are to be presented later. Fig. 4(a) and (b) presents the comparisons on the detection probabilities obtained by both architectures with various values of SNR for K ¼1 and 2, respectively. In Fig. 4 (a), the solid and dashed curves marked by n and ○ show the detection probabilities of architectures 1 and 2 for fixed K 0 ¼ 1 and random K 0 , respectively. The curves marked by , ♢ and n represent the performances obtained by the GLRT given in (58), energy detector in (64) and single tone detector in (61). The detection performances achieved by architectures 1 and 2 for K 0 ¼ 1 and the GLRT are quite similar. The detection performance of architecture 2 for random K 0 is better than that of architecture 1 and with a lower probability of false alarm, as shown in Fig. 5. The detection performances obtained by both architectures significantly outperform the energy detector and single tone detector. Similar conclusions can be made from Fig. 4(b) for K ¼2 and K 0 ¼ 2, where the dotted curve marked with □ gives the detection probability obtained by the GMF for harmonics

L. Wang et al. / Signal Processing 106 (2015) 215–230

1

1

0.9

0.9

0.8 0.7 0.6

arch2

r

0.5 0.4

GLRT arch1

0.3

arch1r

0.2

arch2f

0.1

single tone ED

f

0 −15

probability of detection

probability of detection

224

0.8 0.7 0.6

arch2r

0.5

arch1

0.4

arch2f

0.3

arch1f

0.2

ED GMF single tone

r

0.1 0

−10

−5

−14

0

−12

−10

−8

−6

−4

SNR(dB)

SNR(dB)

probability of detection

1 0.9 0.8 0.7 0.6 0.5

s−K’=1 s−K’=2 s−K’=3 t−K’=2 t−K’=1 t−K’=3

0.4 0.3 0.2 0.1 −15

−10

−5

SNR(dB)

K’=1

0.09

K’=2 0.08

K’=3

0.07

PFAB3

0.06

PFAB2

0.05

PFAB1

0.04 0.03

−15

−10

−5

SNR(dB)

0.032

probability of false alarm

probability of false alarm

Fig. 4. Performance comparisons with (a) other detectors when K¼1; (b) other detectors when K¼2; (c) the theoretical performances when K¼ 1.

0.03

PFAarch2

0.028

PFA−Barch2 0.026

0.024

0.022 −15

−10

−5

SNR(dB)

Fig. 5. False alarm probability of architecture 2. (a) Architecture 2 fixed K 0 . (b) Architecture 2 random K 0 .

0

L. Wang et al. / Signal Processing 106 (2015) 215–230

1

0.9

probability of detection

probability of detection

1

0.8 0.7

K’=1, M=3 K’=1, M=5 K’=1, M=4 K’r, M=4

0.6 0.5 0.4 0.3

K’ , M=3 r

0.2

K’r, M=5

0.1 0 −15

225

−10

−5

0

0.8

K’=1, M=3 K’=1, M=5 K’r, M=3

0.6

0.4

K’r, M=5 K’ , M=4 r

0.2

K’=1, M=4 0 −15

SNR(dB)

−10

−5

0

SNR(dB)

Fig. 6. Performance loss evaluation due to model mismatch. (a) Architecture 1. (b) Architecture 2.

with known fundamental frequency. The detection curves of single-tone detector and the GMF are meant to give some sense of performance bounds: the single tone detector does not exploit the harmonic structure and therefore one would expect the proposed methods to do better, and the GMF gives the “best” when the harmonic structure is exploited and all the parameters are known. Fig. 4(c) gives the approximated theoretical detection probabilities of architecture 1 in (37) for fixed K 0 when K ¼1. The solid curves marked by ○; n and n are the theoretical approximations to the detection probabilities of architecture 1 for K 0 ¼ 1; 2 and 3, respectively. The dashed curves marked by ○; n and ▵ give the corresponding detection probabilities of architecture 1 obtained by Monte Carlo simulations for K 0 ¼ 1; 2 and 3, respectively. The theoretical approximations of the detection probabilities are similar to those obtained by simulations when K 0 ¼ K. The larger the difference between K 0 and K, the worse the detection probability in (37) approximates the detection performance. It is concluded that although there are gaps between the performances obtained by the proposed architectures and the GMF, the proposed ones significantly outperform those detectors without making use of the harmonic property or sparsity of the signals. 5.2.3. False alarm rates Since there is no theoretical expression for the final false alarm probability of architecture 2, we evaluate the false alarm by Monte Carlo methods and make comparisons with the theoretical bounds of the false alarm rate in Appendix. According to (53), the theoretical false alarm bounds for fixed K 0 with P gFA ¼ 0:03 and K 0 ¼ 1; 2 and 3 are 0.03, 0.0591 and 0.0873, respectively. Similarly, the theoretical false alarm bound for random K 0 with P gFA ¼ 0:2 can be computed, according to (55), to be 0.031. Fig. 5 shows the false alarm probability of architecture 2. In Fig. 5(a), solid and dashed curves marked by n; ○ and ▵ represent the theoretical probability bounds of the false alarm rate and the simulated ones for K 0 ¼ 1; 2 and 3, respectively.

Although the simulated probabilities of false alarm are below the theoretical ones, there are still large gaps. In Fig. 5(b), the theoretical bound and the simulated probability of false alarm are given by the solid and dashed lines, respectively. The false alarm probability of architecture 2 for random K 0 is smaller than 0.03 which is the false alarm rate preset for architecture 1. As shown in Figs. 4 and 5, architecture 2 outperforms architecture 1 for the case of random K 0 in terms of higher detection probability, lower probability of false alarm, and lower computation complexity. 5.2.4. Model mismatch Fig. 6 presents the detection probabilities of the proposed architectures when there is a model mismatch. In this simulation, the true harmonic order is 4, and the assumed harmonic order is M¼3, 4, and 5, respectively. There is only one harmonic set in the signal, i.e., K ¼1, and K 0 ¼ 1 is used when K 0 is fixed for both detection architectures. Dashed and solid curves represent the detection probabilities with fixed and random K 0 , respectively. The curves marked by n; ▵ and ○ give the detection probabilities for M¼ 3, 4 and 5. Comparing to the detection probability for M¼4, as discussed in Section 4.3, there are slight performance decreases for M¼3 and 5 for both architectures. 6. Conclusion Target detection of mechanical harmonics using a block-sparsity model with an orthogonal dictionary was shown. Based on this new model, two detection architectures are proposed for robust detection. Both of them presume the knowledge of the maximum harmonic order and the noise power. Theoretically, those architectures use the maximum signal energy with no prior knowledge of the number of the harmonic sets and achieve much better performance than the conventional single tone detector especially in low SNR environments. In addition, the proposed architectures try to estimate the signal subspace

226

L. Wang et al. / Signal Processing 106 (2015) 215–230

based on the prior that signal is block-sparse. Comparing to the energy detector in which the signal subspace is assumed unknown, the proposed architectures achieve better detection performances. The threshold used in the detection architectures is given theoretically and the corresponding theoretical detection performances are also provided based on the theory of order statistics. Yet, the presented frame work still has some drawbacks which need to be overcome. One is the prior knowledge of the maximum harmonic order, which determines the size of each sub-dictionary in the block-sparse model. Another problem remaining is to theoretically determine the stopping criterion to yield the optimal detection performance. Our future work will focus on the adaptive selection of the stopping criterion to achieve a better detection performance. Appendix A A.1. Analysis on architecture 1 A.1.1. PDF of Tarch 1 under H0 for fixed K 0 For simplicity, we assume that f‖D½iH v‖22 g0 r i r L , are ordered by

the

values,

Since all variables in f‖D½iH v‖22 g0 r i r L are independent, the test statistic Tarch 1 is then the sum of first K^ largest observations in a sample of independent and identical distributed (IID) observations from a scaled χ 22M distribution. It is known that the distribution of the sum of l largest observations in a sample of IID observations from a gamma distribution is a linear function of the gamma distribution functions [25]. Let us define g n ðxÞ ¼ xn  1 e  x =Γ ðnÞ

ð25Þ

Ll1

^ r1 ¼ v  x^ 1 ¼ v  D½1b½1



Γ ðj þk þ nÞðl þ 1 þ rÞ  j  k  n gnrjk ðxÞ

nl  j  1



s¼0

þ αr 1

¼ ðI  D½1D½1† Þv:

αr ¼

i

¼ max‖D½iH ðI D½1D½1† Þv‖22 H

H



¼ max‖D½i v D½i D½1D½1 i

i

 bs g j þ k þ n  s

s¼0



x

αr

l 1þr þl

i

¼ max‖D½i



with

T 2 ¼ max‖D½iH r1 ‖22

A

ð30Þ

as g nl  j  s ðxÞ

jþkþn1

ð26Þ

For step 2,

ð29Þ

where g nrjk ðxÞ ¼

The residual signal at this step is

x 40

to represent the gamma distribution with n degrees of freedom. Suppose that fx1 ; x2 ; …; xL g are L samples generated from g n ðxÞ where xl 0 denotes the lth largest value in the L observations. Let Z l ¼ ∑li ¼ 1 xi 0 denotes the sum of l largest observations. As stated in [25], the density function f l of Z l follows:  L  1 L  l  1 ðn  1Þl ðn  1Þr L ∑ ∑ ð 1Þr cjl ckr f l ðxÞ ¼ ∑ Γ ðnÞ l r¼0 j¼0 k¼0

r

Under H0, x ¼ v, the first step is, i

‖D½iH v‖22  χ 22M : Nσ 2 =2



‖D½1H v‖22 Z ‖D½2H v‖22 Z ⋯ Z ‖D½LH v‖22 :

T 1 ¼ max‖D½iH r0 ‖22 ¼ ‖D½1H v‖22 :

2M degree of freedom (dof), i.e.,

H

v‖22 ;

a s ¼ ð  αr Þ s

v‖22 ð27Þ

where (27.A) holds because of the orthogonality of the Fourier basis, i.e., ( NIM i ¼ j; H D½i D½j ¼ ð28Þ 0 i aj:

 L l

¼

s 

ð1  αr Þ  j  k  n  s

nl  j þs  1 s



1

αr

 nl þ j  s 1

L! ðL lÞ!l!

and cjk can be computed recursively from j

cjk ¼

T 2 ¼ max‖D½iH v‖22 ¼ ‖D½2H v‖22 ;

cjk ¼ 0;

which means that T2 is the second largest value in f‖D½iH v‖22 g0 r i r L . Similarly, it can be found that T K 0 is the K 0 th largest value in f‖D½iH v‖22 g0 r i r L . Since v  CN ð0; σ 2 IN Þ, D½iH v becomes an M dimensional vector following complex white Gaussian distribution with zero-mean and variance σ 2 D½iH D½i ¼ Nσ 2 IM . ‖D½iH v‖22 is then a variable following the scaled χ 22M with

jþ k þ n þ s  1

bs ¼ αr s ð 1Þnl  j

Since the BOGA never selects a previously chosen block, we have 2 ri rL. Then (27.A) becomes i



k ; j!

n1

cjk ¼ ∑

j rn 1 jZ ðn 1Þk 1

cðj  mÞðk  1Þ ;

m ¼ 0 m!

n Zj Z ðn 1Þk

where k 41 and m! computes the factorial of m. Based on the density function of Zl, the upper pth percentile of the distribution of Zl for certain values of L, l and n can be precomputed off-line for further searching.

L. Wang et al. / Signal Processing 106 (2015) 215–230

It is found that T arch1 =ðNσ 2 =2Þ under H0 is the sum of K 0 largest order statistics in a sample of L observations from χ 22M , which is a special case of gamma distribution. Based on (30), the density function of T arch1 0 ¼ T arch1 =ðNσ 2 =2Þ under null hypothesis is given by the density function of 2Z K 0 , i.e., f T arch1 0 ðxÞ ¼ 1=2f K 0 ð2xÞ:

ð31Þ

Therefore the PFA is given by Z 1 P FA ¼ 2γ f T arch1 0 ðxÞ dx:

ð32Þ

arch1 Nσ2

227

Eqs. (32), (34) and (37) describe the theoretical detection performance of the detection architecture 1 with fixed K 0 under the assumption that the BOGA selects the correct signal blocks in the first K steps. A.1.3. PDF of Tarch 1 under H0 for Random K 0 When K 0 is determined randomly by stopping criterion J rl J 22 r βNσ 2 , it is an integer from the set f0; 1; 2; 3; …g and its PMF depends on the parameter β. For easy representation, we denote C k ðβÞ ¼ PrðK 0 ¼ kÞ with k A f0; 1; 2; 3; …g. Based on the stopping criterion of the BOGA, if we have r 0 r βNσ 2 , then K 0 ¼ 0. Therefore, C 0 ðβÞ ¼ PrðK 0 ¼ 0Þ ¼ Prð J r0 J 22 r βNσ 2 Þ:

A.1.2. PDF of Tarch 1 under H1 for fixed K 0 The detection probability is derived under the assumption that BOGA selects the correct signal blocks in the first K steps and that K 0 ZK. Under H1, x ¼ Dbþ v. When K 0 ¼ K, under the assumption that BOGA selects the correct signal blocks in the first K steps, the test statistic T arch1 ¼ ∑Kl¼ 1 T l ¼ ∑Kl¼ 1 ‖Nb½l þ D½lH v‖22 . It is then easy to

show that T arch1 0 ¼ T arch1 =ðN σ 2 =2Þ follows noncentral chisquared distribution with noncentrality parameter λ ¼ N‖b0 ‖22 =ðσ 2 =2Þ and degree 2MK. The density function is given by   

pffiffiffiffiffi  1 x ðKM  1Þ=2 1 exp  x þ λ I KM  1 λx : f χ 2 ðλÞ ðxÞ ¼ 2KM 2 λ 2 ð33Þ Then the PD of architecture 1 is given by Z 1 PD ¼ f χ 2 ðλÞ ðxÞ dx ¼ Q χ 2 ðλÞ ðγ Þ γ

2KM

ð34Þ

2MK

where Q χ 2m represents the right-tail probability for a χ2m variable with a dof m and γ ¼ 2γ arch1 =Nσ 2 . When K 0 4 K, the distribution of Tarch 1 is more complex. Rewrite Tarch 1 into two parts: T arch1

0

K0

T Tl ¼ ∑ ¼ arch1 Nσ 2 =2 l ¼ 1 N σ 2 =2 K

¼ ∑

l¼1

‖Nb½l þ D½l Nσ 2 =2

H

v‖22

¼ T 1arch1 þ T 2arch1 :

þ

K



l ¼ K þ1

H

‖D½l N σ 2 =2

v‖22 ð35Þ

As previously discussed, the density function of is f χ 2 ðλÞ ðxÞ given in (33) and T 2arch1 is the sum of K  K 2KM largest order statistics in a sample of L K observations from χ 22M . Therefore, the density function of T 2arch1 is f T 2 ¼ 1=2f K 0  K ð2xÞ with f K 0  K given by (30). Since T 1arch1 arch1 and T 2arch1 are independent, the density function of Tarch 1 is the convolution of density functions of T 1arch1 and T 2arch1 . Then the density function of T arch1 0 is given by Z f T 0arch1 ðyÞ ¼ f χ 2 ðλÞ ðxÞf T 2 ðy xÞ dx: ð36Þ arch1

Therefore the PD of architecture 1 when K 0 4K is given by Z 1 PD ¼ f T 0arch1 ðyÞ dy ð37Þ γ

with γ ¼ 2γ arch1 =N σ 2 .

To have K 0 ¼ 1, it is required that ‖r0 ‖22 4 β Nσ 2 and ‖r1 ‖22 r β Nσ 2 , C 1 ðβÞ ¼ Prð‖r1 ‖22 r βN σ 2 ; ‖r0 ‖22 4 βN σ 2 Þ:

ð39Þ

Since the events f‖r0 ‖22 4 β Nσ 2 g and f‖r1 ‖22 r βNσ 2 g are dependent. the theoretical expressions of Ck cannot be found when k 4 0. The density function of T arch1 ¼ ∑K0 l ¼ 1T l under the condition that K 0 ¼ k, denoted as f 0T jK 0 ¼ k , is no arch1 longer be the density of the sum of k largest order statistics. It can be clearly shown in the following equation: PrðT arch1 oxjK 0 ¼ kÞ ¼ Pr

k

∑ ‖D½lH v‖22 oxj‖r0 ‖22 4N βσ 2 ;

l¼1

…; ‖rk  1 ‖22 4 Nβσ 2 ; ‖rk ‖22 oN βσ 2 : Since events f‖rl ‖22 4 Nβσ 2 g0 r l r k  1 f∑kl ¼ 1 ‖D½lH v‖22 o xg are dependent,

ð40Þ and

PrðT arch1 jðK 0 ¼ kÞ o xÞ a PrðT arch1 oxÞ: 0

T 1arch1 0

2KM

Under H0 hypothesis, J r0 J 22 ¼ J v J 22 , so J r0 J 22 follows a scaled χ 22N . We have     C 0 β ¼ Pr ‖r0 ‖22 r βN σ 2    ‖r0 ‖22 r2βN ¼ 1  Q χ 2 2βN : ð38Þ ¼ Pr 2N σ 2 =2

Unfortunately, the theoretical expressions of f 0T jK 0 ¼ k arch1 cannot be found either. 0 Suppose that Ck and f T jK 0 ¼ k are known, the distribuarch1 tion of Tarch 1 can be found by marginalizing K 0 : f 0T arch1 ðxÞ ¼ ∑f 0T arch1 jK 0 ¼ k ðxÞPrleftðK 0 ¼ kÞ:

ð41Þ

K0

Therefore the PFA of architecture 1 for random K 0 is given by Z 1 P FA ¼ 2γ f 0T arch1 ðxÞ dx: ð42Þ arch1 Nσ2

To find the threshold for architecture 1 when K 0 is random, we estimate Ck and f 0T jK 0 ¼ k by Monte Carlo method. arch1

A.2. Analysis on architecture 2 A.2.1. K 0 is fixed Let us assume that we know the approximated number of fundamental frequencies which is used as the value of

228

L. Wang et al. / Signal Processing 106 (2015) 215–230

K 0 in architecture 2. Suppose in each step the threshold used is γl for 1 r l rK 0 . Under H0 hypothesis, if BOGA stops in the first step, as shown in (A.1.1), T arch2 ¼ T 1 ¼ ‖D½1H v‖22 is the largest value selected from set f‖D½iH v‖22 g1 r i r L . The PDF of T arch2 =ðNσ 2 =2Þ is given by 1=2f 1 ð2xÞ in (30). So the PFA in this step is given by Z

γ 1 1 1 P FA;1 ¼ f 1 ð2xÞ dx ¼ Q f 1 2 2γ 1 =Nσ 2 Nσ 2 R1 where Q f l ðγ l =N σ 2 Þ ¼ γ =Nσ 2 f l ðxÞ dx, for 1 rl rK 0 , is the l right-tail probability of a random variable with the PDF fl. If T 1 r γ 1 , T arch2 ¼ T 1 þ T 2 is created. Similarly, Tarch 2 is the sum of first two largest values in f‖D½iH v‖22 g1 r i r L . Therefore the PDF of Tarch 2 when BOGA is stopped at the second step is given by 1=2f 2 ð2xÞ in (30). The PFA of Tarch 2 at this step is  2γ 2γ A P FA;2 ¼ Pr T 1 o 12 ; T 2 þT 1 Z 22 Nσ Nσ   B 2γ 1 2γ Pr T 1 þT 2 Z 22 r Pr T 1 o 2 Nσ Nσ ¼ Q f 2 ð1  Q f 1 Þ: ð43Þ Since events fT 1 o 2γ 1 =Nσ 2 g and fT 1 þ T 2 Z2γ 2 =Nσ 2 g in (43.A) are negatively dependent, (43.B) holds. Similarly, the PFA of Tl can be bounded by P FA;l rð1  Q f l  1 Þ⋯ð1  Q f 1 ÞQ f l :

ð44Þ

0

After K steps, the final PFA of architecture 2 is bounded by K0

P FA ¼ ∑ P FA;l l¼1

rQ f 1 þ ð1 Q f 1 ÞQ f 2 þ ⋯ þ ð1  Q f 1 Þ⋯ð1  Q f K 0  1 ÞQ f K 0 :

ð45Þ

0

When K ¼ 1, the equality of (45) holds strictly. A.2.2. K 0 is random K 0 will be a random integer with its PMF given by C k when it is determined by the stopping criterion that J rl J 22 r βN σ 2 . When K 0 ¼ k, Tarch 2 may have k possible expressions depending on the step index at which the BOGA stops 8 T 1 jK 0 ¼ k > > > 0 > > > < T 1 þT 2 jK ¼ k ð46Þ T arch2 jK 0 ¼ k ¼ ⋮ > > k > > 0 > > : ∑ T l jK ¼ k: l¼1

From the last line in (46), it is noted that when the BOGA stops at the kth step, T arch2 jK 0 ¼ k has the same expression as T arch1 jK 0 ¼ k in (A.1.3). Once Ck and the density functions 0 f 0T jK 0 ¼ k ¼ f T 1 þ T 2 þ ⋯ þ T k jK 0 ¼ k are estimated by the Monte arch1 Carlo method in (A.1.3), they are readily used in this part. Apart 0 from f T 1 þ T 2 þ ⋯ þ T k jK 0 ¼ k , we still need to evaluate the other 0 0 k  1 density functions f T 1 jK 0 ¼ k ; f T 1 þ T 2 jK 0 ¼ k ; … to make a decision in each step of architecture 2. Luckily, we can prove that 0

0

f T 1 jK 0 ¼ k ¼ f T 1 jK 0 ¼ 1 0 f T 1 þ T 2 jK 0 ¼ k



0 ¼ f T 1 þ T 2 þ ⋯T k  1 jK 0 ¼ k  1 :

Proof of (47). Since PrðT 1 o xjK 0 ¼ kÞ ¼ Prð‖D½1H v‖22 o xj‖r0 ‖22 4 Nβσ 2 ; ⋯; ‖rk  1 ‖22 4N βσ 2 ; ‖rk ‖22 oN βσ 2 Þ

ð48Þ

H

and D½1 v is orthogonal to rk with k Z1, event f‖D½1H v‖22 o xg is only dependent on event f‖r0 ‖22 o Nβσ 2 g and independent on events f‖rk ‖22 oN βσ 2 gk Z 1 . So we have PrðT 1 oxjK 0 ¼ kÞ ¼ PrðT 1 oxj‖r0 ‖22 4N βσ 2 Þ ¼ PrðT 1 oxj‖r0 ‖22 4N βσ 2 ; ‖r1 ‖22 o Nβσ 2 Þ

¼ PrðT 1 oxjK 0 ¼ 1Þ:

With the density functions and Ck, we derive the PFA of architecture 2 as follows. Suppose that J rl J 22 r βN σ 2 and the BOGA stops at step K 0 ¼ k. According to the detection architecture 2, detection may end at each step l for l rk. When detection ends at l¼0, there is no false alarm. When l¼1, 0 0 T arch2 ¼ T 1 follows f T 1 jK 0 ¼ k ðxÞ ¼ f T 1 jK 0 ¼ 1 ðxÞ. The false alarm in this step is given by  Z 1 2γ 1 0 P FA;1 jK 0 ¼ k ¼ f T 1 jK 0 ¼ 1 ðxÞ dx ¼ Q f 01 ð49Þ 2 Nσ 2γ 1 0 =Nσ 2 where Q f 0l is the right tail function of the density function of Tarch 2 at the l-th step. When l¼ 2, T arch2 ¼ T 1 þ T 2 follows 0 f T 1 þ T 2 jK 0 ¼ 2 ðxÞ, the PFA at this step is bounded by  2γ 0 P FA;2 jK 0 ¼ k r Pr T 1 o 12 K 0 ¼ 2 Nσ  2γ 2 0 0 Pr T 2 þT 1 Z K ¼2 2 Nσ ð50Þ ¼ Q f 02 ð1 Q f 01 Þ: 0

Similarly, at step l ¼ k, the PFA is P FA;k jK 0 ¼ k rfQ f 1 0 þ ð1  Q f 1 0 ÞQ f 2 0 þ ⋯ þð1  Q f 1 0 Þ⋯ð1  Q f k  1 0 ÞQ f l 0 g:

ð51Þ 0

The final PFA bound of architecture 2 under a random K is P FA ¼ ∑P FA;k jK 0 ¼ k PrðK 0 ¼ kÞ k

rc1 ðβÞQ f 1 0 þc2 ðβÞðQ f 1 0 þð1  Q f 1 0 ÞQ f 2 0 Þ þ ⋯ þ cl ðβ ÞfQ f 1 0 þ ð1  Q f 1 0 ÞQ f 2 0 þ ⋯

þ ð1  Q f 1 0 Þ⋯ð1 Q f k  1 0 ÞQ f k 0 g þ⋯:

ð52Þ

Since the exact PFA of architecture 2 cannot be given theoretically, there is no way to find the optimal set of thresholds for all the steps of the BOGA. One practical way is to compute the threshold γl or γ l 0 in the lth step directly based on the distribution of Tl under a given false alarm P gFA . The bounds of final PFA of architecture 2 under different selections of K 0 can be accordingly recomputed by (45) and (52) to be: P FFA r P gFA þ ð1  P gFA ÞP gFA þ ⋯

0

¼ f T 1 þ T 2 jK 0 ¼ 2

0 f T 1 þ T 2 þ ⋯T k  1 jK 0 ¼ k

Therefore, all the density functions used in architecture 2 are 0 readily estimated by f T 1 þ T 2 þ ⋯T k jK 0 ¼ k in (A.1.3). We prove the first equation in (47) as follows, and the proofs of other equations can be made in a similar way.

þ ð1  P gFA Þ⋯ð1  P gFA ÞP gFA ð47Þ

¼ 1 ð1  P gFA ÞK

0

ð53Þ

L. Wang et al. / Signal Processing 106 (2015) 215–230

229

A.4. Detection probability of energy detector (ED)

with

γ l : P gFA ¼ Q f l Nσ 2

ð54Þ

Suppose the only information known is that signal follows the linear model (11). Without using the sparsity prior, the ED is then given by [24]

ð55Þ

T ED ¼

P RFA rc1 ðβÞP gFA þ c2 ðβÞð1  ð1  P gFA Þ2 Þ þ ⋯ þ cl ðβÞð1  ð1  P gFA Þl Þ þ ⋯ with  P gFA ¼ Q f l 0

2γ l 0 Nσ 2



‖DH x‖22 : Nσ 2 =2

ð63Þ

The detection performance can be given by ð56Þ

PD ¼ Q χ2

2ML

where P FFA , P RFA and P gFA represent the probability of false alarm at fixed K 0 , random K 0 determined by the stopping criterion J rl J 22 r βNσ 2 and the given false alarm at each step of the BOGA.

γ

ð2N‖b0 ‖22 =σ 2 Þ ð ED Þ;

P FA ¼ Q χ 2 ðγ ED Þ; 2ML

ð64Þ ð65Þ

where γ ED is the threshold, L is the number of possible fundamental frequencies. The detection probability can be very low if L is large.

A.3. Detection probability of GLRT References The theoretical detection probability of the GLRT detector [31] is given under the prior that one harmonic set is present with unknown fundamental frequency ω. The data follows linear model x ¼ Dω bω þ v. The logarithm of the GLRT is then given by ln maxω fpðx; b^ ω ; ω; H 1 Þ=pðx; H0 Þg. Using the GLRT for linear model [24,31], the test statistic is ‖DH x‖2 T GLRT ¼ max ω2 2 : ω N σ =2

ð57Þ

Suppose there are L possible fundamental frequencies, the size of the searching range of ω is L. The PD and PFA are given by P D ¼ Q χ 2 ðλÞ ðγ GLRT Þ; 2M 2 ‖DH ω x‖2 Z γ GLRT ; H 0 ω N σ 2 =2

ð58Þ !

P FA ¼ Pr max

¼ 1 ð1  Q χ 2 ðγ GLRT ÞÞL : 2M

ð59Þ

with λ ¼ 2N‖bω ‖22 =σ 2 and γ GLRT is the threshold determined by the false alarm probability. For harmonics with one fundamental frequency, this GLRT detector is optimal. When multiple fundamental frequencies are present in the signal, this detector can be used to successively detect harmonic block corresponding to each fundamental frequency with the detection probabilities determined by the SNRs of harmonic blocks using (58). When the harmonic structure is not considered, (57) turns into the conventional single tone detector [24]: H

‖d x‖2 T s ¼ max ω2 2 ¼ maxP ðωÞ ω N σ =2 ω

ð60Þ

where dω is one atom in Fourier basis corresponding to ω and PðωÞ is the periodogram evaluated at frequency ω. The PD and PFA are given by P D ¼ Q χ 2 ðλÞ ðγ s Þ;

ð61Þ

2

P FA ¼ 1  ð1  Q χ 2 ðγ s ÞÞL : 2

ð62Þ

where λ ¼ 2N‖a‖22 =σ 2 , a is the complex amplitude of the tone corresponding to fundamental frequency ω and γs is the threshold.

[1] L.M. Gray, D.S. Greeley, Source level model for propeller blade rate radiation for the world's merchant fleet, J. Acoust. Soc. Am. 67 (2) (1980) 516–522. [2] J.L. Terry, A. Crampton, C.J. Talbot, Passive sonar harmonic detection using feature extraction and clustering analysis, in: Proceedings of MTS/IEEE OCEANS, vol. 3, 2005, pp. 2760–2766. [3] J.G. Lourens, J.A. du Preez, Passive sonar ML estimator for ship propeller speed, IEEE J. Ocean. Eng. 23 (4) (1998) 448–453. [4] R.O. Nielsen, Sonar Signal Processing, Artech House, Boston, 1991. [5] R.A. Wagstaff, The AWSUM filter: a 20 db gain fluctuation based processor, IEEE J. Ocean. Eng. 22 (1997) 110–118. [6] C. Wan, J.T. Goh, H.T. Chee, Optimal tonal detectors based on the power spectrum, IEEE J. Ocean. Eng. 25 (4) (2000) 504–552. [7] Q. Wang, C. Wan, A novel CFAR tonal detector using phase compensation, IEEE J. Ocean. Eng. 30 (4) (2005) 2146–2157. [8] L. Scharf, B. Friedlander, Matched subspace detector, IEEE Trans. Signal Process. 42 (8) (1994) 993–1009. [9] R. Vidal, Y. Ma, A unified algebraic approach to 2-D and 3-D motion segmentation and estimation, J. Math. Imaging Vis. 25 (3) (2006) 403–421. [10] M. Mishali, Y.C. Eldar, Blind multi-band signal reconstruction: compressed sensing for analog signal, IEEE Trans. Signal Process. 57 (3) (2009) 993–1009. [11] R. Gribonval, E. Bacry, Harmonic decomposition of audio signals with matching pursuit, IEEE Trans. Signal Process. 51 (1) (2003) 101–110. [12] X. Lv, G. Bi, C. Wan, The group lasso for stable recovery of blocksparse signal representations, IEEE Trans. Signal Process. 59 (4) (2011) 1371–1382. [13] Y.C. Eldar, P. Kuppinger, H. Bolcskei, Block-sparse signals: uncertainty relations and efficient recovery, IEEE Trans. Signal Process. 58 (6) (2010) 3042–3054. [14] X. Lv, C. Wan, G. Bi, Block orthogonal greedy algorithm for stable recovery of block-sparse signal representations, Signal Process. 86 (12) (2010) 3265–3277. [15] L. Wang, G. Bi, C. Wan, X. Lv, Improved stability conditions of BOGA for noisy block-sparse signals, Signal Process. 91 (11) (2011) 2567–2574. [16] Z. Ben, Y.C. Eldar, P. Kuppinger, H. Bolcskei, Near-oracle performance of greedy block-sparse estimation techniques from noisy measurements, IEEE J. Sel. Top. Signal Process. 5 (5) (2011) 1032–1046. [17] Z. Zhang, B. Rao, Extension of SBL algorithm for recovery of block sparse signals with intra-block correlation, Available online: 〈http:// arxiv.org/abs/1201.0862〉. [18] M.F. Duarte, R.G. Baraniuk, Spectral compressive sensing, IEEE Trans. Signal Process. 91 (11) (2011) 2567–2574. [19] D.L. Donoho, M. Elad, V.N. Temlyakov, Stable recovery of sparse overcomplete representations in the presence of noise, IEEE Trans. Info. Theory 52 (1) (2006) 6–18. [20] M.F. Duarte, R.G. Baraniuk, Compressed sensing and redundant dictionary, IEEE Trans. Info. Theory 91 (11) (2011) 2567–2574. [21] J. Tabrikian, S. Dubnov, Y. Dickalov, Maximum A-posteriori probability pitch tracking in noisy environments using harmonic model, IEEE Trans. Audio Speech Lang. Process. 12 (1) (2004) 76–87.

230

L. Wang et al. / Signal Processing 106 (2015) 215–230

[22] A. Koretz, J. Tabrikian, Maximum a posteriori probability multiplepitch tracking using the harmonic model, IEEE Trans. Audio Speech Lang. Process. 19 (7) (2011) 2210–2221. [23] M.F. Duarte, M.A. Davenport, M.B. Wakin, R.G. Baraniuk, Sparse signal detection from incoherent projections, in: Proceedings of IEEE International Conference on Acoustics Speech, Signal and Processing. (ICASSP’06), vol. 3, May 2006. [24] S.M. Kay, Fundamentals of Statistical Signal Processing: Detection Theory, Prentice-Hall PTR, Upper Saddle River, NJ, 1998 vol. II, Prentice-Hall Signal Processing Series, ch. 7, sec.7.6-7.7. [25] K. Alam, K.T. Wallenius, Distribution of a sum of order statistics, Scand. J. Stat. 6 (3) (1979) 123–126. [26] M.G. Christensen, A. Jakobsson, Multi-pitch estimation, in: Synthesis Lectures on Speech & Audio Process, vol. 5, Morgan & Claypool Publishers, 2009. [27] M.G. Christensen, J.L. Højvang, A. Jakobsson, S.H. Jensen, Joint fundamental frequency and order estimation using optimal filtering, EURASIP J. Adv. Signal Process. 2011 (13) (2011) 1–13.

[28] S.I. Adalbjörnsson, A. Jakobsson, M.G. Christensen, Estimating multiple pitches using block sparsity, in: 38th International Conference on Acoustics, Speech and Signal Processing, Vancouver, Canada, May 2013, pp. 26–31. [29] T. Kronvall, S.I. Adalbjörnsson, A. Jakobsson, Joint DOA and multipitch estimation using block sparsity, in: 39th International Conference on Acoustics, Speech, and Signal Processing, Florence, Italy, May 2014, pp. 4–9. [30] J.K. Nielsen, Default Bayesian estimation of the fundamental frequency, IEEE Trans. Audio Speech Lang. Process. 21 (March (3)) (2013) 1–14. [31] E. Fisher, Generalized likelihood ratio test for voiced–unvoiced decision in noisy speech using the harmonic model, Speech Lang. Process. 14 (March (2)) (2006) 502–510.