Performance evaluation of adaptive dual microphone systems


Available online at www.sciencedirect.com. Speech Communication 51 (2009) 1180–1193. www.elsevier.com/locate/specom



Jianfeng Chen, Koksoon Phua *, Louis Shue, Hanwu Sun
Institute for Infocomm Research, 1 Fusionopolis Way, #21-01 Connexis (South Tower), Singapore 138632, Singapore
Received 19 June 2008; received in revised form 11 May 2009; accepted 1 June 2009

Abstract

In this paper, the performance of the adaptive noise cancellation method is evaluated on several possible dual microphone system (DMS) configurations. Two groups of DMS are taken into consideration, one consisting of two omnidirectional microphones and the other involving directional microphones. The properties of these methods are theoretically analyzed under incoherent, coherent and diffuse noise respectively. To further investigate their achievable noise reduction performance in real situations, a series of experiments in simulated and real office environments are carried out. Some recommendations are given at the end for designing and choosing suitable methods in real applications. © 2009 Elsevier B.V. All rights reserved.

Keywords: Microphone array; Adaptive beamforming; Noise reduction

1. Introduction

In recent years, with the significant progress seen in digital signal processors and miniature electret condenser microphones, dual microphone systems (DMS) are receiving more attention from industry for directional audio capturing and noise reduction. Notable applications include mobile speech communications (Sasaki and Gyotoku, 1995; Elko and Pong, 1995; Compernolle, 1990; Phua et al., 2005) as well as hearing aids (Luo et al., 2002; Maj, 2004; Desloge et al., 1997; Welker et al., 1997), where the improved performance in directional-audio acquisition is delicately balanced against the need for compactness. Generally speaking, a DMS can be defined as a composite directional audio-capturing device consisting of two microphones with either the same or different directional characteristics, e.g., omnidirectional, dipole or cardioid. When used in conjunction with adaptive noise cancellation (Widrow et al., 1975), a DMS can achieve good directivity (and therefore good noise reduction) by adapting itself

* Corresponding author. E-mail addresses: [email protected] (J. Chen), ksphua@i2r.a-star.edu.sg (K. Phua), [email protected] (L. Shue), hwsun@i2r.a-star.edu.sg (H. Sun). 0167-6393/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.specom.2009.06.002

according to the noise characteristics. Compared with a single directional microphone, which has a fixed directivity pattern, an adaptive DMS is more flexible and in many cases leads to better enhancement in terms of signal-to-noise ratio (SNR) and speech intelligibility (Maj et al., 2003; Tompson, 1999; Ricketts and Henry, 2002). Furthermore, compared with larger microphone arrays with more than two sensors, a DMS is compact, which allows it to be incorporated into applications that use a single directional microphone, for example, behind-the-ear (BTE) hearing aids (Luo et al., 2002; Maj, 2004; Maj et al., 2003; Tompson, 1999; Ricketts and Henry, 2002). According to the types of microphones used, we can broadly divide DMS into two groups. Group 1 consists of two microphones which have fixed but different polar patterns, some examples of which have been proposed and studied in (Sasaki and Gyotoku, 1995; Maj, 2004; Maj et al., 2003) (see Fig. 1 for possible combinations). The signals from the two microphones, designated as the primary and reference microphones, are fed into an adaptive noise canceller (Widrow et al., 1975) to produce the desired output, denoted by y(t) in Fig. 2. A key advantage of Group 1 is that, since the spatial information of the incoming signals is implicitly obtained via the inherent

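[Figure 1: polar patterns of the primary and reference microphones for configurations (a) to (d). Primary microphone: (a) $R(\theta) = 1$, (b) $R(\theta) = \tfrac{1}{2}(1+\cos\theta)$, (c) $R(\theta) = \cos\theta$, (d) $R(\theta) = \tfrac{1}{2}(1+\cos\theta)$. Reference microphone: (a) to (c) $R(\theta) = \tfrac{1}{2}(1-\cos\theta)$, (d) $R(\theta) = \sin\theta$.]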

Fig. 1. Various configurations for dual microphone system: (a) omnidirectional–cardioid, (b) back-to-back cardioid, (c) dipole–cardioid and (d) cardioid–dipole. The arrows denote the direction of the desired signal. $R(\theta)$ is the magnitude response.

Fig. 2. Block diagram of an adaptive noise cancellation system. (Note: $\tau_c$ denotes the time delay for causality, w the adaptive filter and y(t) the output.)

directionality of the individual microphones, the two microphones can be placed physically as close as possible, which can lead to an attractive compact profile. However, to steer this group of DMS, it must be pointed to the desired direction. The systems in Group 2 use two omnidirectional microphones that are separated by a small distance d. By using appropriate delays and the two microphones, intermediate first-order differential microphones with various polar patterns, such as those shown in Fig. 1, can be generated (Elko and Pong, 1995; Tompson, 1999). Similarly, an adaptive noise canceller then follows for noise reduction. In this case, the spatial information is obtained through the time difference caused by the inter-microphone separation. Hence, unlike Group 1, the two omnidirectional microphones in Group 2 require some minimal separation. This configuration is flexible because the look-direction can be easily adjusted (Compernolle, 1990). Also, efficient algorithms for real-time DSP implementation are available, see (Elko and Pong, 1995; Compernolle, 1990; Luo et al., 2002; Berghe, 1998). In order to make optimal use of the unique features of a DMS, a critical performance evaluation in various application scenarios is highly desirable. In (Bitzer and Simmer, 1998), Bitzer and Simmer evaluated the theoretical limitations and the noise reduction performance of a dual-omnidirectional-microphone system, implemented using the fixed delay-and-sum beamformer, the Griffiths–Jim method and the adaptive noise canceller. Ricketts and Henry (2002) compared the performance of two DMS from Group 2 under different noise situations, with one using an adaptive structure and the other with fixed directionality. Desloge and Welker studied several microphone arrays developed for hearing aids, having both fixed (Desloge


et al., 1997) and adaptive structures (Welker et al., 1997). A more recent study (Maj et al., 2003) compared two implementations of the adaptive omnidirectional–cardioid configuration, consisting of (1) two omnidirectional microphones and (2) intrinsic omnidirectional and cardioid microphones (see Fig. 1a). Having reviewed the existing techniques, we notice that previous works focus primarily on one or two specific microphone configurations, mostly in simulated environments. Consequently, a comparative study of the various possible DMS configurations in both simulated and real application scenarios is still desirable. In this paper, an evaluation of the existing DMS, particularly the adaptive ones, under three typical noise conditions is presented. The results are further investigated through a series of evaluation experiments partially in compliance with the industrial testing standard ANSI S3.35-2004.

The paper is organized as follows. In Section 2, we present a brief overview of the adaptive noise canceller and review some typical algorithms for DMS. Different DMS configurations are then recast into a unified representation, namely the GSC framework. The theoretical noise reduction for DMS is derived using this framework in Section 3. Detailed simulations and experimental results are given in Sections 4 and 5 respectively. Some practical issues are discussed in Section 6 and the paper is concluded in Section 7.

2. Adaptive dual microphone systems

In this section we briefly discuss the structural components of an adaptive DMS.

2.1. Group 1: DMS containing intrinsically directional microphone

The core of an adaptive DMS is an adaptive noise canceller (Widrow et al., 1975). In this canceller, the primary microphone picks up both the desired speech signal s(t) and the noise n(t), whereas the reference microphone picks up mostly a noise component n'(t), the "noise" reference illustrated in Fig. 2.
Noise cancellation is achieved by adaptively recreating a replica of the interfering noise using the reference input and subtracting the replica from the corrupted signal. Based on this principle, an adaptive DMS system can be constructed by two microphones in several different configurations, with each individual microphone having an inherent directionality and arranged in appropriate orientation, as shown in Fig. 1. As can be seen, these configurations share one common feature: in the look-direction, the primary microphone always has a maximal response while the reference microphone shows a null response. The DMS as shown in Fig. 1 are denoted as G1-a to G1-d, respectively.


After a DMS is set up and the role of each microphone is defined, the signal flow illustrated in Fig. 2 can be expressed as follows:

$$y(t) = y_1(t-\tau_c) - \mathbf{w}(t)^T \mathbf{y}_2(t) \tag{1}$$

$$\mathbf{y}_2(t) \triangleq [\, y_2(t) \;\; y_2(t-1) \;\; \cdots \;\; y_2(t-L+1) \,]^T \tag{2}$$

$$\mathbf{w}(t) \triangleq [\, w_0(t) \;\; w_1(t) \;\; \cdots \;\; w_{L-1}(t) \,]^T \tag{3}$$

where $\tau_c$ is the time delay for causality, $\mathbf{y}_2(t)$ the reference vector at time t and L the length of the adaptive filter w. For comparison purposes, in this paper we invariably apply the normalized least-mean-squares (NLMS) algorithm (Haykin, 1996) to update the adaptive filter coefficients w, which is given by

$$\mathbf{w}(t+1) = \mathbf{w}(t) + \mu\, \frac{y(t)\, \mathbf{y}_2(t)}{\mathbf{y}_2^T(t)\, \mathbf{y}_2(t) + \delta} \tag{4}$$

where $\|\cdot\|$ denotes the Euclidean norm (so that $\mathbf{y}_2^T(t)\mathbf{y}_2(t) = \|\mathbf{y}_2(t)\|^2$), $\mu$ the step size, $\delta$ a positive constant and w(t + 1) the updated filter.

2.2. Group 2: DMS containing two omnidirectional microphones

A DMS can also be constructed with two omnidirectional microphones, which are separated by a small distance d and used in either broadside or endfire orientation. By using the differential array technique (Elko and Pong, 1995), the outputs of the two omnidirectional microphones can be combined to form a differential cardioid or dipole microphone. As a result, the configurations G1-a to G1-c can be equivalently implemented using two omnidirectional microphones. As an example, G1-b can be achieved with the back-to-back differential cardioid microphone scheme (Elko and Pong, 1995; Luo et al., 2002). The Group 2 DMS can be unified into the well-known GSC structure, as shown in Fig. 3. Denote the inputs of the two microphones as x1(t) and x2(t) respectively. The upper branch in Fig. 3 enhances signals coming from the look-direction by using delays and weighting coefficients, while the lower branch blocks the desired signal and provides a noise reference. The lower branch can be regarded as the Griffiths–Jim blocking matrix (Griffiths and Jim, 1982), which attempts to model the noise contribution

in the noisy signal (primary output) through an adaptive filter w during the noise-only period. The noise estimate is then subtracted from y1(t) to obtain the desired signal y(t). The formulation of Fig. 3 is similar to that of Group 1, where y1(t) and y2(t) are the primary and reference signals respectively shown in Fig. 1. In addition to (1)–(4), y1(t) and y2(t) are defined as:

$$y_1(t) = a_1 x_1(t-\tau_{i1}) + a_2 x_2(t-\tau_{i2}) \tag{5}$$

$$y_2(t) = x_1(t-\tau_{o1}) - x_2(t-\tau_{o2}) \tag{6}$$
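The two branch signals can be illustrated with a minimal sketch. It assumes delays that are whole numbers of samples and takes the lower branch as the difference of the two delayed inputs, in line with the frequency-domain form (13). The helper names `delay` and `gsc_branches` and the test signal are illustrative only and do not come from the paper.

```python
def delay(x, n):
    """Delay sequence x by n whole samples, zero-padding the start."""
    if n == 0:
        return list(x)
    return [0.0] * n + list(x[:len(x) - n])


def gsc_branches(x1, x2, a1, a2, ti1, ti2, to1, to2):
    """Primary y1 (upper branch) and reference y2 (lower branch).

    Delays are given in whole samples (an assumption of this sketch).
    """
    x1_i, x2_i = delay(x1, ti1), delay(x2, ti2)
    x1_o, x2_o = delay(x1, to1), delay(x2, to2)
    y1 = [a1 * u + a2 * v for u, v in zip(x1_i, x2_i)]  # weighted sum
    y2 = [u - v for u, v in zip(x1_o, x2_o)]            # difference
    return y1, y2


# Source in the endfire look direction: mic 2 hears mic 1 one sample later.
s = [0.0, 1.0, -0.5, 0.25, 0.75, -1.0, 0.5, 0.0]
x1 = s
x2 = delay(s, 1)

# Endfire Griffiths-Jim style setting, with d/c taken as one sample.
y1, y2 = gsc_branches(x1, x2, 0.5, 0.5, 1, 0, 1, 0)
```

With these settings the reference `y2` is zero for the look-direction source, which is the blocking behaviour the lower branch is meant to provide, while `y1` is a delayed copy of the source.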

Compared with the traditional GSC structure (Griffiths and Jim, 1982), the proposed structure has two additional features. Firstly, there are two explicit and independent time delay units $\tau_{o1}$ and $\tau_{o2}$ in the lower branch, which make the design of the noise reference more flexible. Either a broadside or an endfire sensor configuration can now be formed by simply adjusting the time delays $\tau_{i1}$ and $\tau_{i2}$; without them, a complex exponential expression would have to be included in the blocking matrix (Griffiths and Jim, 1982). Secondly, in the traditional usage (Griffiths and Jim, 1982) the weighting coefficients $a_1$ and $a_2$ in the upper branch are all positive and are used for sidelobe control (when more than two sensors are employed). Here they can be either positive or negative, and hence can conveniently be used to construct any differential microphone.

2.3. Configurations for DMS in Group 2

Table 1 summarizes various techniques that address the choice of coefficients in (5) and (6) and are applicable to DMS consisting of two omnidirectional microphones. With the notation of Fig. 3, the parameters for each configuration are given as well. As seen from Table 1, Group 2 systems differ either in the weighting coefficients $a_1$, $a_2$, the time delays $\tau_{i1}$, $\tau_{i2}$, $\tau_{o1}$, $\tau_{o2}$, or in the adaptation scheme. In addition, for method G2-C, the directivity pattern of the "noise reference microphone" is dipole in nature, while for methods G2-D to G2-G it is cardioid. We will show the effect of this difference in the following sections. We can also match the configurations in Group 1 and Group 2 by comparing the directionalities of their primary and reference microphones. Keeping these differences and equivalences in mind, we will perform a theoretical analysis of Group 2 DMS based on the complex coherence function. In the sequel, we use the abbreviations G2-A, G2-B, etc., for DMS from Group 2.

3. Theoretical performance

Fig. 3. Illustration of GSC structure for a DMS where Mic #1 and Mic #2 are both omnidirectional microphones.

The performance of DMS is highly dependent on the application environment and the noise field (Maj et al., 2003; Tompson, 1999; Ricketts and Henry, 2002; Bitzer and Simmer, 1998). In this section a theoretical evaluation of the noise reduction performance will be carried out for

Table 1. Parameters and features of various DMS configurations.

| No. | $\tau_{i1}$ | $\tau_{i2}$ | $\tau_{o1}$ | $\tau_{o2}$ | $a_1$ | $a_2$ | Taps | Orient. | Ref. | Features and remarks |
|---|---|---|---|---|---|---|---|---|---|---|
| G2-(A) Delay-and-sum (Brandstein and Ward, 2001) | 0 | 0 | – | – | 1/2 | 1/2 | n.a. | B | n.a. | Steering direction can be adjusted through $\tau_{i1}$ and $\tau_{i2}$; optimal solution under spatially uncorrelated white noise; time-invariant process, negligible computation cost |
| G2-(B) Superdirective (Elko, 2000; Cox et al., 1986; Bitzer et al., 1999) | 0 | 0 | 0 | 0 | $w_1$ | $w_2$ | n.a. | E | n.a. | $a_1$ and $a_2$ are frequency-dependent; equivalent to a hypercardioid microphone when optimized for a diffuse field (Elko, 2000); better performance in the low frequency range, reduces to delay-and-sum in the high frequency band (Bitzer et al., 1999) |
| G2-(C) Griffiths–Jim (broadside) (Griffiths and Jim, 1982; Hoshuyama et al., 1999) | 0 | 0 | 0 | 0 | 1/2 | 1/2 | L | B | Dipole | Traditional adaptive beamformer; looking direction can be adjusted through $\tau_{i1}$ and $\tau_{i2}$; best performance when the looking direction is at the broadside; potential target signal cancellation problem (Hoshuyama et al., 1999) |
| G2-(D) Griffiths–Jim (endfire) (Compernolle, 1990; Griffiths and Jim, 1982) | d/c | 0 | d/c | 0 | 1/2 | 1/2 | L | E | Cardioid | Looking direction co-linear with the array; target signal cancellation problem mitigated because the noise reference microphone is cardioid in nature, which has a strong effect on the sensitivity to looking direction error |
| G2-(E) Omnidirectional–cardioid (Sasaki and Gyotoku, 1995; Maj, 2004; Maj et al., 2003) | 0 | 0 | d/c | 0 | 1 | 0 | L | E | Cardioid | Looking direction co-linear with the array (same as above); target signal cancellation problem mitigated due to cardioid-type noise reference microphone (same as above) |
| G2-(F) Dipole–cardioid | 0 | 0 | d/c | 0 | 1/2 | −1/2 | L | E | Cardioid | Primary signal is obtained from a dipole-type differential microphone, which attenuates the interference more effectively in some cases; non-flat frequency response (a low-pass filter following the output y(t) is needed to compensate the differentiator frequency dependence (Elko and Pong, 1995)) |
| G2-(G) Back-to-back cardioid 1 (Elko and Pong, 1995) | 0 | d/c | d/c | 0 | 1/2 | −1/2 | L | E | Cardioid | Non-flat frequency response; a low-pass filter following the output y(t) should be used to compensate the differentiator frequency dependence (Elko and Pong, 1995) (same as above) |
| G2-(H) Back-to-back cardioid 2 (Elko and Pong, 1995; Luo et al., 2002) | 0 | d/c | d/c | 0 | 1/2 | −1/2 | 1 | E | Cardioid | Non-flat frequency response (same as above); the adaptive filter w reduces to a scalar coefficient; low computational complexity; no adaptation when the source is in the frontal hemisphere, to avoid target signal cancellation |

Note: d = inter-microphone distance; c = speed of sound; taps = length of adaptive filter; Orient. denotes sensor arrangement (B: broadside, E: endfire); Ref. indicates the equivalent directionality of the noise reference microphone.

Group 2 DMS as listed in Table 1. The analysis will be carried out for three noise conditions: incoherent noise (e.g., microphone thermal noise or internal noise generated by the microphone or its circuitry), coherent noise (e.g., a noise field generated by a single dominant source from a well-defined direction) and diffuse noise (the same noise level in all directions, e.g., the noise field in a highly reverberant room). The discussion is based on the complex coherence function (Bitzer and Simmer, 1998), and we also assume that the background noise or interference signal is stationary. Denoting the outputs of the two microphones of a DMS in the frequency domain as $X_1(\omega)$ and $X_2(\omega)$, the complex coherence function $\Gamma_{X_1X_2}(\omega)$ is defined as

$$\Gamma_{X_1X_2}(\omega) = \frac{P_{X_1X_2}(\omega)}{\sqrt{P_{X_1X_1}(\omega)\,P_{X_2X_2}(\omega)}} \tag{7}$$

where $P_{X_1X_2}(\omega)$, $P_{X_1X_1}(\omega)$ and $P_{X_2X_2}(\omega)$ are the cross and auto power spectral densities of x1(t) and x2(t) respectively. For incoherent noise, the coherence function $\Gamma_{X_1X_2}(\omega)$ is independent of the inter-microphone distance d and is 0 for all frequencies $\omega$, see (Bitzer and Simmer, 1998), as in Eq. (8),

$$\Gamma_{X_1X_2}(\omega) = 0, \quad \forall\,\omega \tag{8}$$

In the case of coherent noise, the omnidirectional microphone outputs are completely coherent except for a time delay. The complex coherence function $\Gamma_{X_1X_2}(\omega)$ is given by (see also (Brandstein and Ward, 2001))

$$\Gamma_{X_1X_2}(\omega) = e^{j\omega (d/c)\cos\theta} \tag{9}$$

where $\theta$ is the source incident angle and c the speed of sound in air (c ≈ 344 m/s). For partially correlated diffuse noise, the coherence function is real-valued and given by Brandstein and Ward (2001)

$$\Gamma_{X_1X_2}(\omega) = \mathrm{sinc}(\omega d/c) \tag{10}$$

where $\mathrm{sinc}\,\gamma = (\sin\gamma)/\gamma$. The diffuse noise condition is often regarded as a good approximation of a highly reverberant environment (especially in the lower frequency band); hence the analysis under this noise field provides a good indication for many realistic scenarios.

For ease of analysis, let us first assume that, in a given environment, no desired signal is present and only noise exists. Our reason for this assumption is that, for the configurations requiring adaptive cancellation, the adaptation is carried out during the pure noise periods (Compernolle, 1990). With an appropriate adaptation control scheme, i.e., a good Voice Activity Detector (VAD), the theoretical noise reduction limit of a DMS is equivalent to the case when only the noise source is present (Bitzer and Simmer, 1998). We will consider the more complicated situation where the desired speech signal is present later on. For the purposes of our discussion, we further assume that the microphone self-noise is negligible compared with the environmental noise (we will discuss the self-noise issue again in Section 6.4). Considering the small distance d (e.g., 15 mm) between the two microphones, the noise received by both microphones is approximately at the same level, with its spectral density denoted as $P_{NN}$, i.e.,

$$\frac{1}{N}\,|X_1(\omega)|^2 \approx \frac{1}{N}\,|X_2(\omega)|^2 \approx P_{NN}(\omega) \tag{11}$$

Rewriting Eqs. (5) and (6) in the frequency domain,

$$Y_1(\omega) = a_1 X_1(\omega)\,e^{-j\omega\tau_{i1}} + a_2 X_2(\omega)\,e^{-j\omega\tau_{i2}} \tag{12}$$

$$Y_2(\omega) = X_1(\omega)\,e^{-j\omega\tau_{o1}} - X_2(\omega)\,e^{-j\omega\tau_{o2}} \tag{13}$$

Under the assumption (11), the auto- and cross-spectral densities of $Y_1(\omega)$ and $Y_2(\omega)$ are

$$P_{Y_1Y_1}(\omega) = P_{NN}(\omega)\left(a_1^2 + a_2^2 + 2a_1a_2\,\mathrm{Re}\{\Gamma_{X_1X_2}(\omega)\,e^{-j\omega(\tau_{i1}-\tau_{i2})}\}\right) \tag{14}$$

$$P_{Y_2Y_2}(\omega) = 2P_{NN}(\omega)\left(1 - \mathrm{Re}\{\Gamma_{X_1X_2}(\omega)\,e^{-j\omega(\tau_{o1}-\tau_{o2})}\}\right) \tag{15}$$

$$P_{Y_2Y_1}(\omega) = P_{NN}(\omega)\left[a_1 e^{-j\omega(\tau_{o1}-\tau_{i1})} - a_2 e^{-j\omega(\tau_{o2}-\tau_{i2})} - a_1 e^{-j\omega(\tau_{o2}-\tau_{i1})}\,\Gamma^{*}_{X_1X_2}(\omega) + a_2 e^{-j\omega(\tau_{o1}-\tau_{i2})}\,\Gamma_{X_1X_2}(\omega)\right] \tag{16}$$

where * is the conjugate operator and Re{} denotes the real part. Let us define the noise reduction (NR) level of the upper (NRU) and the lower (NRL) branches of Fig. 3 as

$$\mathrm{NRU} = \frac{P_{NN}(\omega)}{P_{Y_1Y_1}(\omega)} \tag{17}$$

$$\mathrm{NRL} = \frac{P_{Y_1Y_1}(\omega)}{P_{Y_1Y_1}(\omega) - |H_{opt}(\omega)|^2\,P_{Y_2Y_2}(\omega)} \tag{18}$$

Minimizing the output power results in (see (Bitzer and Simmer, 1998))

$$H_{opt}(\omega) = \frac{P_{Y_2Y_1}(\omega)}{P_{Y_2Y_2}(\omega)} \tag{19}$$

Noise reduction for the whole system (NRW) is then the product of NRU and NRL, i.e.,

$$\mathrm{NRW} = \mathrm{NRU}\cdot\mathrm{NRL} \tag{20}$$
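As a rough numerical illustration of Eqs. (14)–(20), the sketch below evaluates NRU, NRL and NRW at a single frequency for a given coherence value. It assumes that delays enter with negative exponents, that the lower branch is a difference, and that $P_{NN}$ is normalized to 1 (it cancels in every ratio). The function name `noise_reduction` and the 1 kHz test frequency are illustrative choices, not part of the paper.

```python
import cmath
import math


def sinc(g):
    """Unnormalized sinc, sin(g)/g, as used for the diffuse-field coherence."""
    return 1.0 if g == 0.0 else math.sin(g) / g


def noise_reduction(coh, omega, a1, a2, ti1, ti2, to1, to2):
    """Return (NRU, NRL, NRW) as power ratios at angular frequency omega.

    coh is the complex coherence between the two microphones; P_NN is
    set to 1 because it cancels. Delays are in seconds.
    """
    def e(t):
        return cmath.exp(-1j * omega * t)

    p11 = a1 * a1 + a2 * a2 + 2.0 * a1 * a2 * (coh * e(ti1 - ti2)).real  # Eq. (14)
    p22 = 2.0 * (1.0 - (coh * e(to1 - to2)).real)                        # Eq. (15)
    p21 = (a1 * e(to1 - ti1) - a2 * e(to2 - ti2)
           - a1 * e(to2 - ti1) * coh.conjugate()
           + a2 * e(to1 - ti2) * coh)                                    # Eq. (16)
    nru = 1.0 / p11                                                      # Eq. (17)
    h_opt = p21 / p22                                                    # Eq. (19)
    nrl = p11 / (p11 - abs(h_opt) ** 2 * p22)                            # Eq. (18)
    return nru, nrl, nru * nrl                                           # Eq. (20)


# Diffuse field at 1 kHz with d = 0.0156 m and c = 344 m/s.
d_over_c = 0.0156 / 344.0
omega = 2.0 * math.pi * 1000.0
coh = complex(sinc(omega * d_over_c))

# Omnidirectional-cardioid setting: a1 = 1, a2 = 0, tau_o1 = d/c.
nru_e, nrl_e, nrw_e = noise_reduction(coh, omega, 1.0, 0.0, 0.0, 0.0, d_over_c, 0.0)
print(10.0 * math.log10(nrw_e), "dB")
```

The function should not be called with settings that make the reference power `p22` vanish (for example a coherent source exactly in the look direction), since the optimal filter is then undefined.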

By substituting the parameters from Table 1 into Fig. 3 and using Eqs. (8)–(10) and (14)–(19), the noise reduction performance for the three types of noise can be readily obtained. We will only briefly discuss the results in incoherent and coherent noise conditions, where the analysis is relatively simple, but elaborate in greater detail on the results in partially correlated diffuse noise


environments, where the analysis can be very complicated. The results can be summarized as follows:

(1) In an incoherent noise field, all the studied methods have the same noise reduction (NRW) of 3 dB. This is reasonable, as the adaptation does not work in an incoherent noise field because the primary noise cannot be predicted from the reference noise.

(2) In a coherent noise field, the performances of these methods are listed below:
– The NRW of method G2-A depends on the direction of arrival $\theta$ and is given by

$$\mathrm{NRW} = \frac{2}{1 + \cos(\omega d \cos\theta / c)} \tag{21}$$

– For G2-B, the weighting coefficients $w_1$ and $w_2$ have been purposely optimized for a diffuse noise field, and its performance is discussed below under the diffuse noise field condition.
– For method G2-C, NRW approaches infinity in all directions except 0° and 180°.
– For methods G2-D to G2-G, NRW approaches infinity in all directions except 0°.
– The NRW for method G2-H is infinite only in the back hemisphere due to its coefficient constraint. In real applications, the performance would be affected by practical problems such as the finite filter length and the microphone self-noise.

(3) The NR derived under a diffuse noise field is shown in Table 2. Since method G2-H is a practical implementation of G2-G, only the results of G2-G are presented. In addition, the weighting coefficients $w_1$ and $w_2$ of method G2-B (the superdirective array) are obtained as follows (Brandstein and Ward, 2001):

$$[\, w_1 \;\; w_2 \,] = \frac{\mathbf{d}^H \mathbf{\Gamma}^{-1}}{\mathbf{d}^H \mathbf{\Gamma}^{-1} \mathbf{d}} \tag{22}$$

where $\mathbf{d} = [\,1,\; e^{-j\omega d/c}\,]^T$, $(\cdot)^H$ denotes the conjugate transpose, and

$$\mathbf{\Gamma} = \begin{bmatrix} 1 & \Gamma_{X_1X_2}(\omega) \\ \Gamma_{X_2X_1}(\omega) & 1 \end{bmatrix} \tag{23}$$


From Table 2, it can be seen that in diffuse noise fields, G2-A and G2-C have the same performance. In fact, G2-C produces no noise reduction in the adaptive branch, as NRL = 1. Comparing the NRW of G2-D and G2-E with that of G2-F and G2-G, one finds that they are the same except for an extra factor in G2-F and G2-G. Superficially, this suggests extra noise reduction in G2-F and G2-G. However, the primary signal in these two methods is subject to the differentiator frequency effect, so the target signal is attenuated by the same factor. That is to say, the difference is due to the non-flat response of the directional primary signal. For example, in the case of G2-F, when the angle of incidence of the target signal s(t) is 0°, y1(t) (refer to Fig. 3) in the frequency domain is

$$Y_1(\omega) = S(\omega)\left(\frac{1}{2} - \frac{1}{2}\,e^{-j\omega d/c}\right) \tag{24}$$

The auto-spectral density of y1(t) is

$$P_{Y_1Y_1}(\omega) = \frac{1-\cos\gamma}{2}\,P_{SS}(\omega) \tag{25}$$

where $P_{SS}$ is the spectral density of s(t) and $\gamma = \omega d/c$. The signal attenuation factor due to the differentiator frequency effect is thus given by

$$G_{Y_1S}(\omega) = \frac{P_{SS}(\omega)}{P_{Y_1Y_1}(\omega)} = \frac{2}{1-\cos\gamma} \tag{26}$$

For method G2-G, a similar result can be obtained. Thus, due to the unity signal response in the looking direction for G2-B, G2-D and G2-E, if we consider only NRW, i.e., SNR improvement, these methods are all identical to G2-B, which is the optimal solution under diffuse noise field. This coincides with the proposition (Cox et al., 1986) that the superdirective beamformer and Griffiths–Jim adaptive algorithm (endfire) are both based on the same optimization criterion. Nevertheless, while the theoretical performances are identical, in practice, there are still some differences within G2-D to G2-G, which will be discussed shortly in Section 6.
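The superdirective weights of Eqs. (22) and (23) can be checked numerically for the diffuse field with a short sketch. It assumes the weights form a row vector applied directly to the microphone spectra and that the steering vector carries a pure delay of d/c at the second microphone. The function name `superdirective_weights` and the test value of $\gamma$ are illustrative only.

```python
import cmath
import math


def superdirective_weights(gamma):
    """Weights [w1, w2] for a diffuse field, with gamma = omega * d / c.

    Returns the weights and d^H * inv(Gamma) * d, which equals the noise
    reduction (power ratio) of the resulting fixed beamformer.
    """
    s = math.sin(gamma) / gamma                 # diffuse-field coherence, Eq. (10)
    d = [1.0, cmath.exp(-1j * gamma)]           # steering vector, 0 deg look direction
    det = 1.0 - s * s
    inv = [[1.0 / det, -s / det],
           [-s / det, 1.0 / det]]               # inverse of the matrix in Eq. (23)
    dh = [v.conjugate() for v in d]             # d^H as a row vector
    row = [dh[0] * inv[0][0] + dh[1] * inv[1][0],
           dh[0] * inv[0][1] + dh[1] * inv[1][1]]
    norm = row[0] * d[0] + row[1] * d[1]
    return [row[0] / norm, row[1] / norm], norm.real


w, nr = superdirective_weights(0.3)
print(w, 10.0 * math.log10(nr), "dB")
```

The returned noise reduction can be compared directly with the closed-form G2-B entry of Table 2, and the weights satisfy the unit-gain condition in the look direction.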

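Table 2. Noise reduction of various DMS under diffuse noise fields, with $\gamma = \omega d/c$ and $\mathrm{sinc}\,\gamma = (\sin\gamma)/\gamma$.

| No. | NRU | NRL | NRW = NRU · NRL |
|---|---|---|---|
| G2-(A) Delay-and-sum | $\frac{2}{1+\mathrm{sinc}\,\gamma}$ | 1 (non-adaptive) | $\frac{2}{1+\mathrm{sinc}\,\gamma}$ |
| G2-(B) Superdirective | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{1-\mathrm{sinc}^2\gamma}$ | 1 (non-adaptive) | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{1-\mathrm{sinc}^2\gamma}$ |
| G2-(C) Griffiths–Jim broadside | $\frac{2}{1+\mathrm{sinc}\,\gamma}$ | 1 | $\frac{2}{1+\mathrm{sinc}\,\gamma}$ |
| G2-(D) Griffiths–Jim endfire | $\frac{2}{1+\cos\gamma\,\mathrm{sinc}\,\gamma}$ | $\frac{1-\mathrm{sinc}^2\gamma\cos^2\gamma}{1-\mathrm{sinc}^2\gamma}$ | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{1-\mathrm{sinc}^2\gamma}$ |
| G2-(E) Omnidirectional–cardioid | 1 | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{1-\mathrm{sinc}^2\gamma}$ | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{1-\mathrm{sinc}^2\gamma}$ |
| G2-(F) Dipole–cardioid | $\frac{2}{1-\mathrm{sinc}\,\gamma}$ | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{(1+\mathrm{sinc}\,\gamma)(1-\cos\gamma)}$ | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{1-\mathrm{sinc}^2\gamma}\cdot\frac{2}{1-\cos\gamma}$ |
| G2-(G) Back-to-back cardioid 1 | $\frac{2}{1-\cos\gamma\,\mathrm{sinc}\,\gamma}$ | $\frac{(1-\mathrm{sinc}\,\gamma\cos\gamma)^2}{(1-\mathrm{sinc}^2\gamma)(1-\cos^2\gamma)}$ | $\frac{2(1-\mathrm{sinc}\,\gamma\cos\gamma)}{1-\mathrm{sinc}^2\gamma}\cdot\frac{2}{1-\cos 2\gamma}$ |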


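[Figure 4: 10 lg NRW (dB), equivalently DI (dB), on a 0 to 6 dB vertical axis, plotted against $\gamma$ on a logarithmic axis with the corresponding frequency f (Hz) marked from 35 Hz to 10 kHz. Two curves are shown: one for (B) and (D)–(G), one for (A) and (C).]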

Fig. 4. Theoretical noise reduction in diffuse noise environment for methods G2-A to G2-G as a function of $\gamma$, where $\gamma = \omega d/c$. The same figure can also serve as the directivity index vs. frequency and provide a basis for performance comparison. The corresponding frequency in the figure is calculated for d = 0.0156 m and c = 344 m/s.
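The two curves of Fig. 4 can be regenerated approximately from the closed-form NRW entries of Table 2. The sketch below assumes d = 0.0156 m and c = 344 m/s as quoted in the caption; the function names and the list of frequencies are illustrative choices.

```python
import math

D = 0.0156   # inter-microphone distance in metres (from the caption of Fig. 4)
C = 344.0    # speed of sound in m/s


def sinc(g):
    return math.sin(g) / g


def nrw_delay_and_sum(g):
    """Table 2 NRW entry shared by G2-A and G2-C."""
    return 2.0 / (1.0 + sinc(g))


def nrw_superdirective(g):
    """Table 2 NRW entry of G2-B, the common SNR improvement of G2-D to G2-G."""
    return 2.0 * (1.0 - sinc(g) * math.cos(g)) / (1.0 - sinc(g) ** 2)


def to_db(power_ratio):
    return 10.0 * math.log10(power_ratio)


for f in (35, 350, 1000, 3500, 7000, 10000):
    g = 2.0 * math.pi * f * D / C
    print(f"{f:>6} Hz  gamma={g:6.3f}  "
          f"A/C: {to_db(nrw_delay_and_sum(g)):5.2f} dB  "
          f"B,D-G: {to_db(nrw_superdirective(g)):5.2f} dB")
```

For small $\gamma$ the superdirective expression tends to a power ratio of 4 (about 6 dB) while the delay-and-sum expression tends to 1 (0 dB); for large $\gamma$ both tend to 2 (about 3 dB).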

(4) The noise reduction vs. frequency performance in a diffuse noise field is depicted in Fig. 4, using the NRW results in Table 2. As can be seen, when $\gamma$ is small, which corresponds to low frequencies and small d, the performances of methods G2-A and G2-C are very poor, while the NRW can be up to 6 dB for G2-D to G2-G, which all converge to the superdirective array G2-B. On the other hand, when $\gamma$ increases, the noise signals received by the two microphones gradually become incoherent and the NRW of all methods reduces to 3 dB, as in an incoherent noise field. The weighting coefficients of G2-A to G2-E give these methods a unity and flat frequency response in the look direction, whereas in G2-F to G2-H a low-pass equalizer should be applied to compensate the differentiator effect.

(5) The directivity index (DI) provides a basis for performance comparison between various types of microphone configurations. It is defined as the difference between the microphone sensitivity to sounds arriving from the look-direction and to sounds arriving from all other directions, for a specific frequency in a diffuse noise field. It is normally used to evaluate microphone configurations working in the non-adaptive mode (American National Standard, 2004), as the adaptation process may result in variable DMS directionality in different acoustic conditions. However, in the simulated diffuse noise condition used above, the directionalities of these DMS do not vary with direction. In other words, the adaptation process converges to the same result for all directions. In this special case, the directivity index of these methods becomes measurable.

Comparing the definition of DI with the NRW in Eq. (20), one finds that they are actually equivalent in the simulated diffuse noise field. In this case, the NRW is exactly the difference between the microphone sensitivity to sounds arriving from the look-direction, y(t), and to sounds arriving from all other directions, $P_{NN}$. Thus, we can use Fig. 5 as the DI of each method in this scenario. After assigning the inter-microphone distance d and the sound velocity c, we can readily obtain the DI vs. frequency results, which are similar to Fig. 4. As in the discussion in point (4), the DIs of these methods are significantly different. Methods G2-A and G2-C have a very low DI in the low frequency band, whereas methods G2-B and G2-D to G2-G have a 6 dB DI up to 1 kHz. The DIs of all these methods then fluctuate around 3 dB in the higher frequency band, when the frequency is greater than 10 kHz.

4. Simulation results

In this section, we study the performance of Group 2 DMS in a simulated office environment. In contrast to the metrics used in Section 3, which are explicit functions of frequency, in this section we use the so-called "broadband metrics" introduced in (Desloge et al., 1997) to show the SNR improvement. Broadband metrics are formed using intelligibility weights from articulation theory (Greenberg et al., 1993) and are defined as

$$\mathrm{SNR}_{SII} = \sum_{i=1}^{b} I_i A_i\,\mathrm{SNR}_i \tag{27}$$

where the overall $\mathrm{SNR}_{SII}$ is made up of $\mathrm{SNR}_i$ from b one-third octave bands, $I_i$ is the weight for the importance of the ith one-third octave band for speech intelligibility and


Fig. 5. Simulated results of intelligibility-weighted polar diagram for methods in Group 2 using two omnidirectional microphones in anechoic condition. Concentric circles on the polar plots denote 20, 15, 10, 5 and 0 dB respectively. (B) Superdirective (Cox et al., 1986); (C) Griffiths–Jim (broadside) (Griffiths and Jim, 1982); (D) Griffiths–Jim (endfire); (E) omnidirectional–cardioid (Maj et al., 2003); (F) dipole–cardioid; (G) back-to-back cardioid 1; (H) back-to-back cardioid 2 (Elko and Pong, 1995; Luo et al., 2002); for methods G2-C to G2-G, K = 10 —, 5 - -, 2.5 --.

$A_i$ is calculated in accordance with the Speech Intelligibility Index (SII) procedure (American National Standard, 1997). The performance of a noise reduction algorithm is assessed by measuring the gain in intelligibility-weighted SNR between the input x1(t) or x2(t) and the output y(t):

$$G_{SII} = \mathrm{SNR}_{SII,y} - \mathrm{SNR}_{SII,x} \tag{28}$$
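Eqs. (27) and (28) amount to a weighted sum over bands followed by a difference. The sketch below shows this bookkeeping only; the band importance and audibility numbers are placeholders chosen for illustration and are not the values of the SII standard, and the function names are hypothetical.

```python
def weighted_snr(band_snr_db, importance, audibility):
    """Sum over bands of I_i * A_i * SNR_i, with SNR_i in dB, as in Eq. (27)."""
    if not (len(band_snr_db) == len(importance) == len(audibility)):
        raise ValueError("all three inputs need one entry per band")
    return sum(i * a * s for i, a, s in zip(importance, audibility, band_snr_db))


def weighted_gain(snr_out_db, snr_in_db, importance, audibility):
    """Weighted SNR at the output minus weighted SNR at the input, as in Eq. (28)."""
    return (weighted_snr(snr_out_db, importance, audibility)
            - weighted_snr(snr_in_db, importance, audibility))


# Placeholder values for four bands (NOT taken from the SII standard).
importance = [0.1, 0.3, 0.4, 0.2]
audibility = [1.0, 1.0, 1.0, 1.0]
snr_in = [0.0, 0.0, 0.0, 0.0]
snr_out = [2.0, 5.0, 6.0, 3.0]
gain = weighted_gain(snr_out, snr_in, importance, audibility)
print(round(gain, 3), "dB")
```

In an actual evaluation the per-band SNRs would come from one-third octave band measurements of the input and output signals.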

By measuring the intelligibility-weighted directionality of these methods under different conditions, one can intuitively observe the noise reduction performance of different DMS as a function of the direction of the source. As shown in recent studies (Ricketts and Henry, 2002), the directivity polar diagram can be strongly linked to the improvement in speech intelligibility. Simulations were performed to mimic an anechoic room and reverberant noise fields for different reverberation times Tr. A band-limited (200–6000 Hz) white Gaussian noise was used as the test signal. The inter-microphone distance for the DMS consisting of two omnidirectional microphones was d = 0.0156 m, corresponding to a one-sample delay for a sampling rate of 22050 Hz and a speed of sound c ≈ 344 m/s. As the performance of G2-A is rather clear, it will not be discussed further. The standard superdirective weighting optimized for a diffuse noise field was used for G2-B (Bitzer

et al., 1999). The adaptive null-forming scheme proposed in (Luo et al., 2002) was implemented for G2-H. For G2-C to G2-G, the NLMS algorithm in Eqs. (1)–(4) was used with $\mu$ = 0.08 and L = 64. To avoid target signal cancellation, the norm-constraint adaptive filter (NCAF) used in (Hoshuyama et al., 1999) was adopted to constrain the adaptive filter. The procedure is as follows:

$$\mathbf{w}'(t) = \mathbf{w}(t) + \mu\,\frac{y(t)\,\mathbf{y}_2(t)}{\mathbf{y}_2^T(t)\,\mathbf{y}_2(t) + \delta} \tag{29}$$

$$\mathbf{w}(t+1) = \begin{cases} \sqrt{K/\Omega}\;\mathbf{w}'(t), & \text{for } \Omega > K \\ \mathbf{w}'(t), & \text{otherwise} \end{cases} \qquad \Omega = \|\mathbf{w}'(t)\|^2 \tag{30}$$

where w0 (t) is the temporal filter, X and K are the squarednorm of w0 (t) and a threshold respectively. If X > K, w(t + 1) will be restrained by scaling. During the simulations, the same process was repeated for methods G2-B to G2-G. The polar diagram for each method is obtained by rotating the device 360° at increments step of 9° on a turntable in front of the loudspeaker. The distance between the loudspeaker and the DMS was set to be R = 1.0 m. With each setup, 40 sets of data corresponding to 40 positions were generated. The adaptations were executed subsequently and the steady states after

1188

J. Chen et al. / Speech Communication 51 (2009) 1180–1193

Fig. 6. Simulated results of intelligibility-weighted polar diagram for methods in Group 2 using two omnidirectional microphones under reverberant environment. Concentric circles on the polar plots denote 20, 15, 10, 5 and 0 dB respectively. (B) Superdirective (Cox et al., 1986); (C) Griffiths–Jim (broadside) (Griffiths and Jim, 1982); (D) Griffiths–Jim (endfire); (E) omnidirectional–cardioid (Maj et al., 2003); (F) dipole–cardioid; (G) back-to-back cardioid 1; (H) back-to-back cardioid 2 (Elko and Pong, 1995; Luo et al., 2002). Tr = 0.03 s —, Tr = 0.20 s - -, Tr = 0.30 s --. K = 10.

adaptation convergence were used to draw the polar diagrams with the weights of SII. 4.1. Coherent noise field in anechoic environment In Fig. 5 the intelligibility-weighted polar diagram of G2-B to G2-H are plotted. The resulting polar diagram was normalized to 0 dB at the look direction of 0°. The differentiator effect in G2-F and G2-G is normalized by the same effect in the looking direction 0° and thus all the polar diagrams of G2-D to G2-G are identical. As shown in Fig. 5B, method G2-B has a standard hypercardioid polar diagram as we used the standard superdirective weighting optimized for diffuse noise field (Elko, 2000). Secondly, the polar diagram of method G2-C is very sharp, indicating a good spatial selectivity. However, such a sharp spatial selectivity may lead to target signal cancelling when the look-direction deviates from 0° (Hoshuyama et al., 1999). It also shows an inherent drawback of direction ambiguity – two peaks in both 0° and 180°. Our simulations confirmed the conclusions in Section 3 that indicated the SNR improvement performance of G2-D to G2-G were identical. Aided with the NCAF, their responses are relatively flat within the directions of interest at about 0° ± 10°. As shown in the polar diagram of methods G2-C to G2-G, the norm constraints K in Eqs. (29), (30) can be used as a beam width controller.
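The NLMS update with the norm constraint of Eqs. (29) and (30) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the canceller structure (output equals primary channel minus filtered reference), the regularization value `delta`, the function names and the toy data are all assumptions, and the demo uses a short filter (L = 8) only to keep it fast.

```python
import math
import random

def nlms_ncaf_step(w, y_ref, y_out, mu=0.08, delta=1e-6, K=10.0):
    """One NLMS update (Eq. 29) followed by the norm constraint (Eq. 30)."""
    power = sum(v * v for v in y_ref) + delta
    w_tmp = [wi + mu * y_out * vi / power for wi, vi in zip(w, y_ref)]
    omega = sum(wi * wi for wi in w_tmp)          # squared norm of w'
    if omega > K:                                 # scale back onto the norm bound
        scale = math.sqrt(K / omega)
        w_tmp = [scale * wi for wi in w_tmp]
    return w_tmp

def run_canceller(primary, reference, L=64, **step_args):
    """Assumed structure: y(t) = primary(t) - w^T y_ref(t)."""
    w = [0.0] * L
    out = [0.0] * len(primary)
    for t in range(L - 1, len(primary)):
        y_ref = reference[t - L + 1:t + 1][::-1]  # most recent sample first
        out[t] = primary[t] - sum(wi * vi for wi, vi in zip(w, y_ref))
        w = nlms_ncaf_step(w, y_ref, out[t], **step_args)
    return out, w

# Toy data: the primary channel is 5x the reference, so the unconstrained
# solution is w = [5, 0, ...] with squared norm 25.
random.seed(0)
ref = [random.gauss(0.0, 1.0) for _ in range(2000)]
pri = [5.0 * v for v in ref]
_, w_loose = run_canceller(pri, ref, L=8, K=100.0)  # bound never reached
_, w_tight = run_canceller(pri, ref, L=8, K=10.0)   # bound limits the filter
```

With the loose bound the filter converges to the unconstrained solution and cancels the correlated component fully. With the tight bound the squared norm stays at or below K, so the amount of cancellation is limited.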

Increasing K results in a narrower beam, and vice versa. The significant difference between the polar diagrams of method G2-C and methods G2-D to G2-G can be explained as follows. In method G2-C, the directivity pattern of the "noise reference microphone" is dipole in nature, whereas in methods G2-D to G2-G it is of cardioid type. Comparing the nulls of dipole and cardioid microphones, the gradient of the response around the null direction is much greater for the dipole than for the cardioid. In other words, the dipole type has a sharper blocking range than the cardioid type. Thus the dipole-type noise reference gives good spatial selectivity, while the cardioid-type noise reference tends to be more robust against errors in the target signal direction. The results for method G2-H show a polar pattern with a wide frontal hemisphere (90°–270°) due to the constraint on the adaptive filter W.

4.2. Reverberant environment

The image method in (Allen and Berkley, 1979) was used to simulate the transfer function between the loudspeaker and each microphone in a small office room of size 7.00 × 3.50 × 2.85 m. The distance between the loudspeaker and the DMS was set to R = 1.0 m. The constraint parameter K in the NLMS–NCAF algorithm was fixed to 10. The effects of different reverberation times Tr on the polar diagram are shown in Fig. 6. It can be seen that for most methods the performance degrades significantly as Tr increases. From Fig. 6B we observe that although the mainlobe of method G2-B does not change much, its sidelobe becomes wider. In Fig. 6C, the change in the sidelobe for method G2-C is even more noticeable, with the sidelobe expanding rapidly under reverberant conditions and approaching a circle for Tr = 0.3 s. For methods G2-D to G2-G (Fig. 6D–G), as Tr increases from 0.03 s to 0.20 s and then to 0.30 s, the sidelobe level rises from more than 20 dB below the mainlobe to 12 dB and then 4 dB below it. At the same time, the mainlobes also become wider. On the other hand, method G2-H shows a constant mainlobe, while the increase in its sidelobe is similar to that of methods G2-D to G2-G.

Fig. 7. Configuration of the experiment environment. The room size is 3.5 × 7.0 × 2.85 m. The stepper motor is placed in the centre of the room and the two microphones are installed on the screw of the motor. Labels in the diagram: microphones, loudspeaker (at distance R), stepper motor, computers; room length 7 m, width 3.5 m, height 2.85 m.
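The image-method simulation mentioned in Section 4.2 can be sketched in simplified form as below. This is an illustration under stated assumptions, not the simulation used in the paper: all walls share one reflection coefficient `beta`, delays are rounded to the nearest sample, and the source and microphone coordinates are hypothetical. The mapping from `beta` to a reverberation time Tr is not modelled.

```python
import itertools
import math

def image_source_rir(room, src, mic, beta, fs=22050, c=344.0, order=2, length=2048):
    """Impulse response built from image sources up to `order` room shifts per axis."""
    h = [0.0] * length
    shifts = range(-order, order + 1)
    for n in itertools.product(shifts, repeat=3):       # room-lattice index per axis
        for p in itertools.product((0, 1), repeat=3):   # mirrored (1) or not (0) per axis
            image = [(1 - 2 * p[k]) * src[k] + 2 * n[k] * room[k] for k in range(3)]
            dist = math.dist(image, mic)
            # Number of wall reflections for this image (equal wall coefficients).
            reflections = sum(abs(n[k] - p[k]) + abs(n[k]) for k in range(3))
            tap = round(dist / c * fs)                   # nearest-sample delay
            if tap < length:
                h[tap] += beta ** reflections / (4.0 * math.pi * dist)
    return h

room = (7.0, 3.5, 2.85)        # room size from Section 4.2, metres
mic = (3.5, 1.75, 1.0)         # hypothetical microphone position
src = (4.5, 1.75, 1.0)         # hypothetical loudspeaker position, R = 1.0 m away
h_anechoic = image_source_rir(room, src, mic, beta=0.0)  # direct path only
h_reverb = image_source_rir(room, src, mic, beta=0.8)    # direct path plus reflections
direct_tap = round(1.0 / 344.0 * 22050)                  # delay of the 1.0 m direct path
```

Convolving the test signal with one such response per microphone yields the reverberant microphone signals. A larger `beta` leaves more energy in the late taps, which corresponds to a longer reverberation time.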

5. Experimental results and discussions

In this section, experiments to investigate the intelligibility-weighted polar diagrams for different DMS configurations are presented. To make our experimental results more applicable to industry, we partially follow the test conditions specified in ANSI S3.35-2004 (American National Standard, 2004). The experiment was carried out in a regular office environment (shown in Fig. 7), which is moderately reverberant: Tr ≈ 0.20–0.30 s over the frequency range 200–6000 Hz. The specifications of the omnidirectional, cardioid and dipole microphones used are given in Table 3. Band-limited (200–6000 Hz) white Gaussian noise was used as the test signal. The volume of the test signal was set at 75 dB SPL, using a GENELEC 1029A loudspeaker (nominal size less than 100 mm) placed at the same height as the DMS, 1.00 m above the floor. The loudspeaker was placed at four different distances from the microphones, namely 0.30 m, 0.60 m, 1.00 m and 1.50 m. A stepper motor was used to rotate the DMS from 0° to 360° in increments of 9°. The background noise level of the room was about 40 dB SPL. Finally, the two microphone outputs were preamplified and digitized simultaneously at 22050 Hz and 16 bits, using the built-in sound card of a DELL Precision P4/2.4 GHz PC. The constraint parameter K in the NLMS–NCAF algorithm, Eq. (30), was fixed at K = 10. The remaining parameters (if not explicitly specified) are the same as those used in Section 4.

Table 3. Specifications of the microphones used in our experiments, as obtained from the Panasonic data sheets.

Microphone   Directional pattern   Sensitivity   Dimensions (Φ: diameter)
WM034CY      Omnidirectional       42 ± 3 dB     Φ 9.7 × 6.7 mm
WM56A103     Cardioid              50 ± 3 dB     Φ 9.7 × 5.0 mm
WM55D103     Dipole                54 ± 4 dB     Φ 9.7 × 5.0 mm

Fig. 8. Experimental results of intelligibility-weighted polar diagram for configurations in Group 1 using two microphones with at least one directional microphone involved. Concentric circles on the polar plots denote 20, 15, 10, 5 and 0 dB respectively. (a) Omnidirectional–cardioid, G1-a; (b) back-to-back cardioid, G1-b; (c) dipole–cardioid, G1-c; (d) cardioid–dipole microphones, G1-d. Curves are shown for R = 0.30 m, 0.60 m, 1.00 m and 1.50 m, distinguished by line style.

5.1. Group 1

For the DMS using two omni and/or directional microphones, the configurations shown in Fig. 3c–f were used. The first point to notice from the experimental results in Fig. 8 is that the directional performance depends strongly on the distance between the sound source and the microphones. As the distance R was increased from 0.30 m to 1.50 m, the sidelobes for all Group 1 DMS configurations, namely G1-a to G1-d, became wider. This was mainly due to the loss of coherence between the two microphones as a result of a strongly multipath, nonlinear and time-varying acoustic environment. In comparison, as predicted by the theoretical analysis in Section 3, the results of G1-a to
G1-c as shown in Fig. 8a–c, are very close. Hence, the configurations G1-a to G1-c are equivalent to their respective counterparts G2-E to G2-G, which were shown to have the same theoretical performance in any noise field (see Table 2). Since the cardioid–dipole configuration G1-d cannot be constructed using two omnidirectional microphones, it was only studied experimentally. As shown in Fig. 8d, its mainlobe was much narrower than those of the other configurations. As explained earlier, this was likely due to its dipole-type noise reference. Therefore, a higher noise reduction can be expected from this configuration, benefiting from the four independent ports for sound entry.

5.2. Group 2

For the experiments using a DMS consisting of two omnidirectional microphones, the inter-microphone distance was set to d = 0.0156 m. Except for G2-B and G2-H, the same parameters that were used in Group 1 were also used in Group 2. The experimentally determined polar plots are shown in Fig. 9. For Fig. 9D–G, the noise cancellation performances are exactly the same. The beamwidth of G2-D to G2-G was slightly narrower than that of G2-H, as the constraint method used in G2-D to G2-G was different from that in G2-H. Again, as the distance between the loudspeaker and the microphones increased from 0.30 m to 1.50 m, the sidelobe level went from more than 15 dB below the mainlobe to 6 dB below it.

Fig. 9. Experimental results of intelligibility-weighted polar diagram for methods in Group 2 using two omnidirectional microphones. Concentric circles on the polar plots denote 20, 15, 10, 5 and 0 dB respectively. (B) Superdirective (Cox et al., 1986); (C) Griffiths–Jim (broadside) (Griffiths and Jim, 1982); (D) Griffiths–Jim (endfire); (E) omnidirectional–cardioid (Maj et al., 2003); (F) dipole–cardioid; (G) back-to-back cardioid 1; (H) back-to-back cardioid 2 (Elko and Pong, 1995; Luo et al., 2002). Panels (D) to (G) share one plot. Curves are shown for R = 0.30 m, 0.60 m, 1.00 m and 1.50 m, distinguished by line style.

6. Discussion

In this section, we discuss a few practical issues that are not covered in the previous sections.

6.1. Microphone mismatch

For optimal performance of the DMS in Group 2, the sensitivities of the two microphones have to be well matched across frequencies. A mismatch between the two microphones can severely reduce the directivity of the system (Csermak, 2000). Broadly speaking, there are two different sources of mismatch, each affecting directivity differently.

(1) Microphone mismatch due to aging ("drift") results in a parallel shift of the frequency responses between the two microphones. This type of mismatch can be compensated for by a frequency-independent adjustment of the gain of one microphone. Moreover, if the microphone pair comes from the same production batch, it is likely that their characteristics will drift together and not apart. As pointed out by Csermak (2000), no evidence to date suggests that any drift that might occur is significant enough to have a noticeable effect. Therefore, this type of mismatch is relatively easy to overcome and its effect is insignificant.
(2) Other types of mismatch, for example those due to random manufacturing error or to dirt or moisture in the acoustical pathway, cause a frequency-dependent modification of the frequency response. This second type of mismatch can only be compensated for by a frequency-dependent adjustment of the gain. It is this type of mismatch that leads to significant performance degradation in real applications.

There are a few approaches to this problem. First, manufacturers need to trim out any sensitivity differences between the microphones when assembling each individual dual microphone system. Second, some commercially available software, such as the FRONTWAVE system (Csermak, 2000), has been designed to trim out this type of mismatch automatically. Third, researchers have also proposed using a customized filter to compensate for the mismatch errors (Luo et al., 2002). These methods can effectively control the mismatch, typically to within 0.02 dB.
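To illustrate why sensitivity matching matters, the sketch below assumes an idealized subtractive null: two channels that would cancel exactly when matched leave a residual of |1 - g| when their gain ratio is g. The frequency-independent correction is estimated here from the RMS ratio of the two channels. Both function names and the test tone are hypothetical, and real microphones add phase and frequency-dependent errors that this model ignores.

```python
import math

def null_depth_db(mismatch_db):
    """Residual level (dB re one channel) left in a subtractive null by a gain mismatch."""
    g = 10.0 ** (mismatch_db / 20.0)
    if g == 1.0:
        return float("-inf")     # perfectly matched channels cancel completely
    return 20.0 * math.log10(abs(1.0 - g))

def broadband_gain_correction(x1, x2):
    """Frequency-independent gain that matches the RMS level of x2 to that of x1."""
    p1 = sum(v * v for v in x1) / len(x1)
    p2 = sum(v * v for v in x2) / len(x2)
    return math.sqrt(p1 / p2)

depth_1db = null_depth_db(1.0)      # about -18.3 dB
depth_002db = null_depth_db(0.02)   # about -52.7 dB

# A second channel that is 0.8x the first needs a correction gain of 1.25.
x1 = [math.sin(0.1 * n) for n in range(1000)]
x2 = [0.8 * v for v in x1]
gain = broadband_gain_correction(x1, x2)
```

Under this model, a 1 dB mismatch limits the null to roughly 18 dB, whereas matching to within 0.02 dB allows a null deeper than 50 dB.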

6.2. Microphone deployment

As indicated in Section 3, the extent of noise reduction depends critically on the type of noise in the environment. Consequently, the characteristics of the noise field in the application environment should be investigated carefully before deciding on the appropriate DMS configuration. The following examples illustrate how the decision is normally made.

(1) If the noise is predominantly overhead, as in the case of air conditioning vents, using a dipole-type microphone as the primary signal with its main direction parallel to the floor attenuates the noise from the vent better than the alternatives. This is due to the presence of a null in the vertical axis of the dipole microphone (refer to G1-c and G2-F).
(2) If the environment contains largely diffuse noise, the superdirective array, i.e., G2-B, is more appropriate.
(3) If the main purpose is to reduce the noise coming from the back of the device, the efficient method in (Luo et al., 2002) is recommended.
(4) If a combination of noise conditions is present, a switching mechanism can be adopted, either manually or adaptively (Tompson, 1999).
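The overhead-noise case in example (1) can be checked with a small geometric sketch, assuming an ideal dipole whose response is |cos| of the angle between its main axis and the source direction. The function name, the coordinate convention (z pointing up) and the floor value used to avoid taking the logarithm of zero are assumptions made for illustration.

```python
import math

def dipole_gain_db(axis, source_dir, floor_db=-120.0):
    """Ideal dipole response in dB for a source direction, clipped at floor_db."""
    dot = sum(a * s for a, s in zip(axis, source_dir))
    norm = math.hypot(*axis) * math.hypot(*source_dir)
    cos_angle = abs(dot) / norm
    if cos_angle <= 0.0:
        return floor_db
    return max(20.0 * math.log10(cos_angle), floor_db)

axis = (1.0, 0.0, 0.0)                       # main direction parallel to the floor
talker = (1.0, 0.0, 0.0)                     # target straight ahead
vent_overhead = (0.0, 0.0, 1.0)              # noise directly above
tilt = math.radians(20.0)
vent_tilted = (math.sin(tilt), 0.0, math.cos(tilt))  # vent 20 degrees off vertical

gain_talker = dipole_gain_db(axis, talker)           # 0 dB
gain_overhead = dipole_gain_db(axis, vent_overhead)  # in the null (floor value)
gain_tilted = dipole_gain_db(axis, vent_tilted)      # about -9.3 dB
```

The attenuation is largest when the vent lies on the vertical axis and decreases as the noise direction tilts toward the dipole's main axis.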


6.3. Directional selectivity and robustness

If good directional selectivity, e.g., a sharp mainlobe, is desired, type G2-C is recommended. However, its poor performance in reverberant environments, as shown in Fig. 6C, means that this configuration will be less effective in such situations. In this scenario, type G2-F or G1-d may be a good compromise, as indicated by Fig. 8d. Nevertheless, as mentioned earlier, the good directivity of this configuration comes mainly from its four independent sound ports. If the desired target source is mobile, methods in Group 2 using the endfire orientation (i.e., the reference microphone is of cardioid type) are preferred in order to obtain a robust DMS. Compared to configurations using the broadside orientation, the main disadvantages of endfire arrays are the required calibration and the loss of performance when a wide steering range is required.

6.4. Signal self cancellation issue

Signal self cancellation is a limiting factor in the real application of adaptive noise cancellation systems (Widrow et al., 1975). Taking G2-C as an example, its sharp mainlobe indicates that once the target sound source departs slightly from the desired direction, e.g., 0° in Fig. 5C, the target signal is significantly cancelled, as it is treated as interfering noise. For this reason, robust adaptive beamformers that tolerate some error in the target signal direction have been proposed (Hoshuyama et al., 1999). For the DMS studied in this paper, Figs. 5, 6 and 9 show that the following configurations are insensitive to signal direction error: G2-B, G2-H and G2-D to G2-G, while G2-C is prone to signal self cancellation. In the experimental results of Fig. 9, the allowable signal direction error is clearly rather wide for G2-B and slightly narrower for G2-D to G2-G and G2-H, whereas G2-C is highly sensitive to the signal direction.
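The difference in sensitivity to direction error can be illustrated with ideal first-order patterns, following the explanation in Section 4.1 that a dipole has a much steeper response around its null than a cardioid. In the sketch below the noise reference has its null on the nominal target direction, and the functions return how much of the target leaks into the reference when the target is off by a given angle. The function names are hypothetical, and the ideal patterns are assumptions that ignore frequency dependence and the adaptive filter itself.

```python
import math

def dipole_leak_db(offset_deg):
    """Ideal dipole response at an offset from its null: |sin(offset)|, in dB."""
    return 20.0 * math.log10(abs(math.sin(math.radians(offset_deg))))

def cardioid_leak_db(offset_deg):
    """Ideal cardioid response at an offset from its null: (1 - cos(offset)) / 2, in dB."""
    return 20.0 * math.log10((1.0 - math.cos(math.radians(offset_deg))) / 2.0)

# Target leakage into the noise reference for a 10 degree direction error.
leak_dipole = dipole_leak_db(10.0)      # about -15.2 dB
leak_cardioid = cardioid_leak_db(10.0)  # about -42.4 dB
```

For the same 10° error, the dipole-type reference picks up about 27 dB more of the target than the cardioid-type reference, which is consistent with the dipole-type configurations being more selective but more prone to self cancellation.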
As for Group 1, G1-a, G1-b and G1-c are insensitive to the signal direction error, while G1-d is similar to G2-C, as shown in Fig. 8. Hence, if signal direction error is unavoidable in the real application, G1-d and G2-C are not recommended.

6.5. Other practical issues

The internal noise of the microphone is another important aspect when designing an audio-capturing device. Although the noise reduction abilities of methods G2-D to G2-G are theoretically the same, they can differ slightly when the internal noise is considered. If G2-F and G2-G are equalized by a low-pass equalizer to produce a flat frequency response, the low-frequency internal noise is amplified as well, resulting in a poorer SNR in the low-frequency band compared to G2-D and G2-E. In addition, the distance d between the two omnidirectional microphones in Group 2 may also affect the performance when the internal noise is taken into account. For a detailed discussion of the effect of internal noise on DMS, see (Tompson, 1999).

Another practical problem that may affect the performance of a DMS is wind and vibration. As pointed out in (Csermak, 2000), a pressure-gradient microphone (i.e., any of the various directional microphones) is invariably more sensitive to wind noise and vibration than a pressure microphone (omnidirectional microphone). All of the DMS discussed above work with gradient microphones. Therefore, designers are strongly recommended to adopt at least one of the well-proven industrial approaches to mitigate these effects in real applications. Among the methods (American National Standard, 2004) are using an appropriate low-pass filter, employing windshields, and applying layers of mesh to protect the capsule.

Finally, we should emphasize that, due to imperfect voice activity detection in real applications, the performance of these adaptive DMS methods may degrade. Moreover, because of the limited degrees of freedom (only two microphones are involved), when more than one dominant noise source is present, the adaptive methods may degrade to the level of non-adaptive methods such as method G2-B, the superdirective array.

7. Conclusion

In this paper, we have characterized the performance of a number of configurations for adaptive DMS based on a uniform framework and experimental system. Typical adaptive DMS configurations using two omnidirectional microphones (Group 2) are expressed using a coherent Generalized Sidelobe Canceller (GSC)-like structure. A theoretical noise reduction performance analysis has been presented to reveal the potential performance, limitations and similarities of these DMS in Groups 1 and 2 under three typical noise fields: coherent, incoherent and diffuse.
A performance evaluation of DMS in simulated and real environments has been carried out to investigate their noise reduction in terms of intelligibility-weighted polar diagram and DI, respectively. While the Group 1 DMS, which use two microphones with intrinsically directional properties, can be made compact, the DMS using two omnidirectional microphones provide more flexibility. Within Group 2, methods G2-B to G2-H show good noise reduction under coherent noise fields. In a diffuse noise field, methods G2-B and G2-D to G2-H have better noise reduction than methods G2-A and G2-C (maximally 6 dB) in the lower frequency band. Group 1 configurations provide performance similar to that of their equivalent methods in Group 2. In addition, with a dipole-type reference microphone, the DMS shows good spatial selectivity; with a cardioid-type reference microphone, on the other hand, the system becomes less sensitive to errors in the look direction. In a real environment, the performance of the DMS depends strongly on the noise condition, and thus a careful investigation of the noise condition in the real application is critical before determining the most suitable DMS method.

References

Allen, J.B., Berkley, D.A., 1979. Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Amer. 65 (4), 943–950.
American National Standard, 1997. Methods for calculation of the speech intelligibility index.
American National Standard, 2004. Method of measurement of performance characteristics of hearing aids under simulated real-ear working conditions, ANSI S3.35-2004.
Berghe, J.V., 1998. An adaptive noise canceller for hearing aids using two nearby microphones. J. Acoust. Soc. Amer. 103 (6), 3621–3626.
Bitzer, J., Simmer, K.U., 1998. Multichannel noise reduction: algorithms and theoretical limits. In: Proc. European Signal Processing Conf. (EUSIPCO), pp. 105–108.
Bitzer, J., Kammeyer, K., Simmer, K.U., 1999. An alternative implementation of the superdirective beamformer. In: Proc. IEEE ASSP Workshop on Applications of Signal Processing to Audio, Acoust., Vol. 1, pp. 7–10.
Brandstein, M., Ward, D., 2001. Microphone Arrays: Signal Processing Techniques and Applications. Springer Verlag (Chapter 2).
Compernolle, D.V., 1990. Switching adaptive filters for enhancing noisy and reverberant speech from microphone array recording. In: Proc. IEEE Internat. Conf. on Acoust., Speech, Signal Processing, Vol. 1, pp. 833–836.
Cox, H., Zeskind, R.M., Kooij, T., 1986. Practical supergain. IEEE Trans. Acoust. Speech Signal Process. 34, 393–398.
Csermak, B., 2000. A primer on a dual microphone directional system. Hearing Rev. 7 (1), 56–60.
Desloge, J.G., Rabinowitz, W.M., Zurek, P.M., 1997. Microphone-array hearing aids with binaural output-part I: fixed-processing systems. IEEE Trans. Speech Audio Process. 5, 529–542.
Elko, G.W., 2000. Superdirectional microphone arrays. In: Gay, S.L., Benesty, J. (Eds.), Acoustic Signal Processing for Telecommunication. Kluwer Academic Publishers, pp. 181–235 (Chapter 10).
Elko, G.W., Pong, A.-T.N., 1995. A simple adaptive first-order differential microphone. In: Proc. IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 169–172.
Greenberg, J.E., Peterson, P.M., Zurek, P.M., 1993. Intelligibility weighted measures of speech-to-interference ratios and speech system performance. J. Acoust. Soc. Amer. 94 (11), 3009–3010.
Griffiths, L.J., Jim, C.W., 1982. An alternative approach to linearly constrained adaptive beamforming. IEEE Trans. Antennas Propagat. 30 (1), 27–34.
Haykin, S., 1996. Adaptive Filter Theory. Prentice Hall.
Hoshuyama, O., Sugiyama, A., Hirano, A., 1999. A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters. IEEE Trans. Signal Process. 47 (10), 2677–2684.
Luo, F.-L., Yang, J., Pavlovic, C., Nehorai, A., 2002. Adaptive null-forming scheme in digital hearing aids. IEEE Trans. Signal Process. 50 (7), 1583–1590.
Maj, J.-B., 2004. Adaptive noise reduction algorithms for speech intelligibility improvement in dual microphone hearing aids, Ph.D. Thesis, Katholieke Universiteit Leuven (June).
Maj, J.-B., Royackers, L., Moonen, M., Wouters, J., 2003. Comparison of adaptive noise reduction algorithms in dual microphone hearing aids. In: Proc. Internat. Workshop on Acoust. Echo and Noise Control, Vol. 1, pp. 171–174.
Phua, K., Chen, J., Shue, L., Sun, H., 2005. Development of a compact 2-sensor adaptive directional microphone. Signal Process. 85 (4), 809–820.
Ricketts, T., Henry, P., 2002. Evaluation of an adaptive, directional microphone hearing aid. Internat. J. Audiol. 41, 100–112.
Sasaki, T., Gyotoku, K., 1995. Microphone apparatus.
Tompson, S.C., 1999. Dual microphones or directional-plus-omni: which is the best? Hearing Rev. 3, 31–35.
Welker, D.P., Greenberg, J.E., Desloge, J.G., Zurek, P.M., 1997. Microphone-array hearing aids with binaural output-part II: a two-microphone adaptive system. IEEE Trans. Speech Audio Process. 5, 543–551.
Widrow, B., Glover Jr., J.R., McCool, J.M., Kaunitz, J., Williams, C.S., Hearn, R.H., Zeidler, J.R., Dong Jr., E., Goodlin, R.C., 1975. Adaptive noise canceling: principles and applications. Proc. IEEE 63 (12), 1692–1716.