Applied Acoustics 115 (2017) 109–120
Applied Acoustics journal homepage: www.elsevier.com/locate/apacoust
Spatio-temporal filter bank for visualizing audible sound field by Schlieren method
N. Chitanont ⇑, K. Yatabe, K. Ishikawa, Y. Oikawa
Department of Intermedia Art and Science, Waseda University, Tokyo, Japan
Article info
Article history:
Received 18 April 2016
Received in revised form 19 August 2016
Accepted 22 August 2016
Available online 1 September 2016
Keywords: Sound field visualization; Optical system; Filter bank; Spatio-temporal filtering; Schlieren imaging
Abstract
Visualization of sound fields using optical techniques is a powerful tool for understanding acoustical behaviors. It uses light waves to examine acoustical quantities without disturbing the sound field under investigation. Schlieren imaging is an optical method that uses a camera to visualize the density of transparent media. As it captures the information in a single shot without scanning, it can observe both reproducible and non-reproducible sound fields. Conventionally, the Schlieren system is applied to high-pressure ultrasound and shock waves. However, since the density variation of air caused by an audible sound field is very small, this method has not been applicable to visualizing such fields. In this paper, a spatio-temporal filter bank is proposed to overcome this problem. As sound is a very specific signal, the spatio-temporal spectrum (in two-dimensional space and time) of audible sound is concentrated in a specific region. The spatio-temporal filter bank is designed to extract the sound field information in that region and to remove noise. The results indicate that the visibility of the sound fields is enhanced by the proposed method.
© 2016 Elsevier Ltd. All rights reserved.
⇑ Corresponding author.
E-mail addresses: [email protected] (N. Chitanont), k.yatabe@asagi.waseda.jp (K. Yatabe), [email protected] (K. Ishikawa), yoikawa@waseda.jp (Y. Oikawa).
http://dx.doi.org/10.1016/j.apacoust.2016.08.028
0003-682X/© 2016 Elsevier Ltd. All rights reserved.

1. Introduction
Visualization of a sound field is a technique for representing the physical properties of sound as visual information. It can be used to enhance the understanding of acoustical behaviors, such as reflection, diffraction and interference. As it is convenient to use computers for predicting the behavior of a sound field, numerous simulation methods for sound field visualization have been proposed. For example, Yokota et al. applied the finite difference time domain (FDTD) method to investigate transient sound propagation in typical concert hall shapes [1]. Deines et al. introduced a comparative visualization of acoustic simulation results obtained by a finite element method (FEM)-based approach and the phonon tracing method in order to investigate the behavior of both methods in the overlapping frequency domain [2]. Bilbao et al. proposed a three-dimensional finite volume time domain (FVTD) simulation for accurate fitting of room boundary conditions [3]. Although these methods are effective, the model error, that is, the error between the mathematical model and real-world conditions, cannot be eliminated completely. Therefore, visualizing sound by real experiments is an important alternative for investigating sound field behaviors.

There are several experimental methods for observing acoustical behaviors. Microphones are normally used for this purpose [4,5]. A single microphone can measure sound pressure at a single point. In order to visualize a sound field, we need to acquire the sound pressure at numerous points over the observational region. Therefore, a large number of microphones must be placed at different positions across the field. These instruments can contaminate the acoustical behaviors being investigated; for example, when a sound wave encounters the instruments, it can be reflected by them. Another factor that can contaminate the observation is the microphone diaphragm. When the sound is measured while airflow hits the diaphragm, the information acquired from the microphone contains both acoustic pressure and mechanical noise caused by the airflow. Moreover, the eigenfrequency of the diaphragm can affect the measured information.

Visualizing a sound field by optical techniques is an alternative free of such contamination. The sound field can be characterized by investigating the changes in the refractive index of air caused by the sound. Because the light does not influence acoustical behaviors while traveling through the acoustic field, this non-intrusive approach leaves the sound undisturbed during the investigation. Laser Doppler Vibrometer (LDV) is
widely applied for this purpose [6–10]. It can be used to visualize the sound field by measuring its information over the laser paths. However, we have to scan over the region of interest to obtain a number of measurements for visualization. Therefore, repeated measurements of the same sound are needed. A schematic presentation of sound visualization by scanning LDV is shown in Fig. 1(a). This limits the visualization of an instantaneous field by LDV to reproducible sound fields emitted by an electrical transducer such as a loudspeaker.

Schlieren imaging is a technique that uses an optical system coupled with photography to visualize the density variation in transparent media. It was originally used to visualize fluid flows [11–13]. In acoustics, August Toepler was the first to visualize shock waves from electric sparks [14]. Nowadays, Schlieren systems are widely used in applications of high-pressure ultrasound [15–17]. With this technique, acoustical behaviors can be investigated by using a camera to capture the density gradient of air caused by the sound. A schematic presentation of visualizing a sound field by the Schlieren system is shown in Fig. 1(b). As it can capture the sound field in a single shot without scanning, both non-reproducible and reproducible sound fields can be visualized by this system. At present, a high-speed camera can be used to capture the motion of sound propagation, which helps us to understand the behavior of sound more clearly. However, only a few researchers have proposed visualization methods for the audible frequency range. In 2010, Hargather et al. applied a high-sensitivity Schlieren system to visualize pure tone sounds at high frequency with very large pressure [18]. However, this method cannot be used to visualize an audible sound field in a low frequency range with low pressure.
This is because the density variation of air caused by audible sound field is very small and that makes the audible sound field difficult to visualize from a raw Schlieren video. Although it is possible to amplify such small quantities using a computer, the audible sound field is still difficult to visualize
because video noise is typically much greater than the audible sound field signal.

In this paper, a spatio-temporal filter bank designed for audible sound fields is proposed to remove noise and extract sound information. As a sound field has very special characteristics, its spatio-temporal spectrum (in two-dimensional space and time) is condensed into a specific region [19,20]. In contrast, noise appears everywhere in the spectrum. The filter aims to extract the sound information inside the specific region and remove the unwanted noise present outside it (see Section 2.2 and Fig. 3). The proposed method is an extension of our previous investigation [21], which used only a single bandpass filter by assuming that a sound field consists of one frequency component. In contrast, this paper proposes a spatio-temporal filter bank that can be applied to any time-varying sound field, including those that are not monochromatic. To test its performance, the proposed method was applied both to simulated sound field videos and to measured videos recorded by a two-lens Schlieren system.

2. Theoretical background

2.1. Schlieren method
The theory of visualizing transparent media using a Schlieren system has been established in the context of geometrical optics. In this subsection, it is summarized briefly for the two-lens parallel-beam Schlieren system illustrated in Fig. 2(a). In addition, its relation to sound is explained. Variation in the density of air changes the direction of the light beams. Let $\varepsilon_x$ and $\varepsilon_y$ be the deflection angles of the light traveling in the z-direction, which are assumed to be very small so that $\tan\varepsilon \approx \varepsilon$. According to the standard textbook of the Schlieren method [22], they can be approximately represented as
$$ \varepsilon_x = \frac{GL}{n_0}\frac{\partial \rho}{\partial x}, \qquad \varepsilon_y = \frac{GL}{n_0}\frac{\partial \rho}{\partial y}, \tag{1} $$

where $G$ is the Gladstone–Dale constant, $L$ is the distance between the lenses, $n_0$ is the standard refractive index of air, and $\rho$ is the density of air. The deflected light field is converted to a visualized
Fig. 1. Schematic presentation of sound field visualization by optical methods. (a) Sound field visualization by scanning LDV and (b) sound field visualization by two-lens Schlieren system.
Fig. 2. Schematic presentation of Schlieren system and the displacement of the light beam. (a) Two-lens Schlieren system and (b) vertical displacement of light beam.
image by an object, a so-called knife-edge. As shown in Fig. 2(a), some light rays are blocked by the knife-edge, which creates contrast in the intensity of the light. The contrast $C$ ($= \Delta E/E$) at each pixel of an image is defined as the variation of illuminance $\Delta E$ normalized by the background illuminance $E$. Since $E$ appears as a gray background in a typical image, the ratio $\Delta E/E$ indicates how the black and white pattern arises in the image. In the setting of Fig. 2, it can be written as

$$ C = \frac{\varepsilon_y f_2}{a_k} = \frac{f_2}{a_k}\frac{GL}{n_0}\frac{\partial \rho}{\partial y}, \tag{2} $$

where $f_2$ is the focal length of the second lens and $a_k$ is the displacement of the light beam caused by the deflection as shown in Fig. 2(b). Therefore, the contrast of a Schlieren image is obtained from the directional derivative of the density perpendicular to the knife-edge. When the sound pressure $p$ is much smaller than the static pressure, it can be represented as

$$ p = K\,\frac{\rho - \rho_0}{\rho_0}, \tag{3} $$

where $K$ is the bulk modulus and $\rho_0$ is the standard density of air. Therefore, the contrast can be written as

$$ C = \frac{f_2}{a_k}\frac{\rho_0 GL}{n_0 K}\frac{\partial p}{\partial y}, \tag{4} $$

which shows that the contrast of a Schlieren image of a sound field is proportional to the spatial derivative of the sound pressure. Thus, integrating an obtained image along the direction perpendicular to the knife-edge yields a visualization of the sound pressure field. Note that the actual value of the image depends on the brightness of the light source and the position of the knife-edge. We emphasize here that the topic of this paper is visualization and not quantitative measurement.

2.2. Spatio-temporal spectrum of sound field
As light travels in the z-direction (see Section 2.1 and Fig. 2(a)), the sound field data captured by the high-speed camera is presented in the two-dimensional x-y plane. Although the obtained information is originally three-dimensional and integrated along the z-axis, we assume for simplicity that it approximately obeys the two-dimensional wave equation (the appropriateness of this assumption will be discussed in Section 3.1.2 and Fig. 7). The wave equation is represented as

$$ \nabla^2 p(x, y, t) - \frac{1}{c^2}\frac{\partial^2 p(x, y, t)}{\partial t^2} = 0, \tag{5} $$

where $c$ is the speed of sound. $\nabla^2$ is the Laplacian, which can be represented by the linear differential operator in the Cartesian coordinate system as

$$ \nabla^2 p(x, y, t) = \frac{\partial^2 p(x, y, t)}{\partial x^2} + \frac{\partial^2 p(x, y, t)}{\partial y^2}. \tag{6} $$

Hence, the wave equation can be rewritten as

$$ \frac{\partial^2 p(x, y, t)}{\partial x^2} + \frac{\partial^2 p(x, y, t)}{\partial y^2} - \frac{1}{c^2}\frac{\partial^2 p(x, y, t)}{\partial t^2} = 0. \tag{7} $$

By taking the three-dimensional Fourier transform of Eq. (7), we get

$$ \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x, y, t)\, e^{-ik_x x} e^{-ik_y y} e^{-i\omega t}\, dx\, dy\, dt = 0, \tag{8} $$

where

$$ f(x, y, t) = \frac{\partial^2 p(x, y, t)}{\partial x^2} + \frac{\partial^2 p(x, y, t)}{\partial y^2} - \frac{1}{c^2}\frac{\partial^2 p(x, y, t)}{\partial t^2}, \tag{9} $$

$k_x$ and $k_y$ are the wavenumbers in x and y space, respectively, and $\omega$ is the angular frequency. Firstly, the Fourier transform of the acoustical wave equation on the space variable $x$ is

$$ \int_{-\infty}^{+\infty} \left\{ \frac{\partial^2 p(x, y, t)}{\partial x^2} + \frac{\partial^2 p(x, y, t)}{\partial y^2} - \frac{1}{c^2}\frac{\partial^2 p(x, y, t)}{\partial t^2} \right\} e^{-ik_x x}\, dx = 0. \tag{10} $$

Therefore, we get

$$ (ik_x)^2 \int_{-\infty}^{+\infty} p(x, y, t)\, e^{-ik_x x}\, dx + \frac{\partial^2}{\partial y^2}\int_{-\infty}^{+\infty} p(x, y, t)\, e^{-ik_x x}\, dx - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\int_{-\infty}^{+\infty} p(x, y, t)\, e^{-ik_x x}\, dx = 0. \tag{11} $$

Eq. (11) can be rewritten as

$$ (ik_x)^2\, \hat{p}(k_x, y, t) + \frac{\partial^2}{\partial y^2}\hat{p}(k_x, y, t) - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\hat{p}(k_x, y, t) = 0, \tag{12} $$

where

$$ \hat{p}(k_x, y, t) = \int_{-\infty}^{+\infty} p(x, y, t)\, e^{-ik_x x}\, dx. \tag{13} $$

By applying a similar procedure to the remaining part of Eq. (8), including the Fourier transform on the space variable $y$ and the time variable $t$, we get

$$ k_x^2\, \hat{p}(k_x, k_y, \omega) + k_y^2\, \hat{p}(k_x, k_y, \omega) - \left(\frac{\omega}{c}\right)^2 \hat{p}(k_x, k_y, \omega) = 0, \tag{14} $$

where $\hat{p}(k_x, k_y, \omega)$ is the three-dimensional Fourier transform of $p(x, y, t)$. Then, Eq. (14) can be rearranged as

$$ \left\{ (k_x^2 + k_y^2) - \left(\frac{\omega}{c}\right)^2 \right\} \hat{p}(k_x, k_y, \omega) = 0. \tag{15} $$

Finally, we can obtain the relation between the wavenumbers, $k_x$ and $k_y$, and the angular frequency $\omega$:

$$ (k_x^2 + k_y^2) = \left(\frac{\omega}{c}\right)^2. \tag{16} $$

Eq. (16) indicates that if the angular frequency $\omega$ is constant, the two-dimensional spatial spectrum of a sound field consisting of propagating waves is a circle with radius $k = \sqrt{k_x^2 + k_y^2} = \omega/c$. In contrast, if the angular frequency $\omega$ is not constant, the radius varies according to $\omega$. Fig. 3(a)–(c) shows schematic presentations of the two-dimensional sound field spectra of 10 kHz, 15 kHz, and 20 kHz, respectively, obtained from Eq. (16), where $\omega = 2\pi f$ and $f$ is the frequency of sound. It can be seen that the spectrum is concentrated on a ring and that the radius of the ring is larger if the temporal frequency of sound is higher. Therefore, the spatio-temporal spectrum of the audible sound field is a cone, as shown in Fig. 3(d). Since the spatio-temporal spectrum of the audible sound field is concentrated in this specific region, using a filter to extract the sound information inside the region and remove the noise present outside can effectively improve the visibility of the Schlieren method.

3. Spatio-temporal filter bank for visualizing audible sound field
As the density variation of air caused by an audible sound field is very small, it is difficult to observe the audible sound field in a raw Schlieren video. Examples of audible sound field images observed by the Schlieren system without any processing are shown in Fig. 4. In order to overcome this problem, the spatio-temporal filter bank, which includes the analysis-synthesis filter bank pair [23,24] and the isotropic spatial bandpass filter [25,26], is applied to extract the sound field information from the raw Schlieren video and to remove noise.
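To give a sense of scale for the small density variation mentioned above, the following sketch evaluates Eq. (3) at the sound pressure levels used later in Section 5.1 (100 dB and 112 dB). The physical constants are assumed typical values for air near 20 °C and are not taken from the paper; in particular, the adiabatic bulk modulus $K = \gamma P_0$ and the Gladstone–Dale relation $n - 1 = G\rho$ are standard textbook forms used here as assumptions.

```python
# Assumed typical values for air at about 20 degC (not taken from the paper).
P0 = 101_325.0   # static pressure [Pa]
GAMMA = 1.4      # ratio of specific heats
RHO0 = 1.204     # standard density of air [kg/m^3]
G = 2.3e-4       # approximate Gladstone-Dale constant of air [m^3/kg]
P_REF = 20e-6    # reference sound pressure for SPL [Pa]


def relative_density_change(spl_db: float) -> float:
    """RMS value of (rho - rho0) / rho0 = p / K from Eq. (3)."""
    p_rms = P_REF * 10.0 ** (spl_db / 20.0)
    bulk_modulus = GAMMA * P0  # adiabatic bulk modulus (assumption)
    return p_rms / bulk_modulus


for spl in (100.0, 112.0):
    rel = relative_density_change(spl)
    # Refractive index change via the Gladstone-Dale relation n - 1 = G * rho.
    dn = G * RHO0 * rel
    print(f"{spl:5.0f} dB SPL: d_rho/rho0 = {rel:.2e}, dn = {dn:.2e}")
```

Under these assumptions the relative density change at 100 dB is on the order of $10^{-5}$, which illustrates why the raw contrast is easily buried in camera noise.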
Fig. 3. Schematic presentation of sound field spectra obtained from Eq. (16). (a) Spatial spectrum of 10 kHz sound field, (b) spatial spectrum of 15 kHz sound field, (c) spatial spectrum of 20 kHz sound field, and (d) spatio-temporal spectrum. Axes: wavenumber [rad/m] in (a)–(c); wavenumber [rad/m] and frequency [Hz] in (d).
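The ring radii in Fig. 3 follow directly from Eq. (16). A minimal sketch, assuming a speed of sound of 343 m/s (the paper does not state the value used) and using hypothetical helper names:

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in air [m/s]


def ring_radius(freq_hz):
    """Radius k = omega / c of the spectral ring in Eq. (16), in rad/m."""
    return 2.0 * np.pi * freq_hz / C_SOUND


def on_cone(kx, ky, freq_hz, tol=1.0):
    """True where (kx, ky) lies within `tol` rad/m of the ring for freq_hz."""
    return np.abs(np.hypot(kx, ky) - ring_radius(freq_hz)) <= tol


for f in (10e3, 15e3, 20e3):
    print(f"{f / 1e3:4.0f} kHz -> k = {ring_radius(f):6.1f} rad/m")
```

Sweeping `freq_hz` from 0 to 20 kHz traces the cone of Fig. 3(d): the radius grows linearly with frequency.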
Fig. 4. Schlieren images of 10 kHz, 15 kHz sinusoidal, and chirp sound field.
3.1. Spatio-temporal filter bank
Fig. 5 shows the tree structure of the spatio-temporal filter bank. Its processing is divided into three processes: analysis filter bank, spatial filter, and synthesis filter bank. These processes and the concepts of the proposed method are described in this section.

3.1.1. Analysis-synthesis filter bank
In our proposed method, the analysis-synthesis filter pair is designed to have the perfect reconstruction property [27–29]. It is achieved under the following conditions. Firstly, the distortion transfer function in the z-domain satisfies

$$ \frac{1}{2}\{H_0(z)G_0(z) + H_1(z)G_1(z)\} = z^{-b}, \tag{17} $$

where $H_0(z)$ and $H_1(z)$ are the analysis lowpass and highpass filter pair, $G_0(z)$ and $G_1(z)$ are the synthesis lowpass and highpass filter pair, and $b$ is a positive integer. Secondly, the aliasing component term must be zero:

$$ \frac{1}{2}\{H_0(-z)G_0(z) + H_1(-z)G_1(z)\} = 0. \tag{18} $$

The amplitude distortion and the aliasing problem are overcome under the conditions of Eqs. (17) and (18). An FIR filter is selected for designing a prototype lowpass filter $H_0(z)$ of odd order $N$ [29]. The relationship between the analysis and synthesis filters is represented as
Fig. 5. Tree structure of designed filter bank.
$$ H_1(z) = z^{-N}H_0(-z^{-1}), \qquad G_0(z) = 2z^{-N}H_0(z^{-1}), \qquad G_1(z) = 2z^{-N}H_1(z^{-1}). \tag{19} $$
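The relations in Eq. (19) can be checked numerically on a single two-channel stage. The sketch below is only an illustration under stated assumptions: it uses the 4-tap Daubechies lowpass filter (order $N = 3$), scaled to unit DC gain, as a stand-in prototype rather than the order-1023 design listed in Table 1, and all function names are hypothetical.

```python
import numpy as np

# Stand-in prototype (assumption): 4-tap Daubechies lowpass with unit DC gain.
s3 = np.sqrt(3.0)
h0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 8.0
N = len(h0) - 1                        # odd filter order, here N = 3
n = np.arange(N + 1)

# Eq. (19) written in terms of impulse responses.
h1 = ((-1.0) ** (N - n)) * h0[::-1]    # H1(z) = z^-N H0(-z^-1)
g0 = 2.0 * h0[::-1]                    # G0(z) = 2 z^-N H0(z^-1)
g1 = 2.0 * h1[::-1]                    # G1(z) = 2 z^-N H1(z^-1)


def analysis(x):
    """Split x into low and high bands and decimate each by two."""
    return np.convolve(x, h0)[::2], np.convolve(x, h1)[::2]


def synthesis(low, high):
    """Upsample each band by two, filter, and recombine."""
    up_low = np.zeros(2 * len(low))
    up_high = np.zeros(2 * len(high))
    up_low[::2] = low
    up_high[::2] = high
    return np.convolve(up_low, g0) + np.convolve(up_high, g1)


rng = np.random.default_rng(0)
x = rng.standard_normal(64)
y = synthesis(*analysis(x))
# Perfect reconstruction up to a delay of N samples (b = N in Eq. (17)).
print(np.allclose(y[N:N + len(x)], x))
```

Repeating `analysis` on each output band gives the tree structure of Fig. 5; the spatial filtering would take place between `analysis` and `synthesis`.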
The parameters of the designed filter bank are shown in Table 1. In this process, the Schlieren video signals in the one-dimensional time direction corresponding to each pixel $(i, j)$ are used as the input signals $I_{ij}(t)$ of the filter bank (see Fig. 5). The analysis filter bank decomposes these signals into multiple frequency bands. Fig. 6 shows examples of the magnitude response of the designed analysis filter bank when the number of filter channels is 2, 4, 8, 16, 32, and 64, respectively. The bands illustrated by the gray dotted lines are removed, since they contain frequency components higher than 20 kHz (inaudible sound) (see Section 3.2). These responses indicate that the filters split the temporal frequencies of the video signal of each pixel $(i, j)$ equally before the next process (spatial filtering). As the number of filter channels increases, the frequency components are divided into more and narrower bands. After the bands are filtered by the spatial bandpass filter, the synthesis filter bank is used to recombine them.

3.1.2. Isotropic spatial bandpass filter
As mentioned in Section 2.2, the two-dimensional spatial spectrum of the sound field at each specific temporal frequency is a ring. The radius of the ring is larger if the temporal frequency of the sound is higher. In this section, an array of spatial bandpass filters is designed for removing the spatial frequencies outside the ring. Each spatial filter corresponds to one band of the filter bank. Let $n$ be the band index of the filter bank. The wavenumber response of the spatial filter is given by
$$ H_{c,n}(k_x, k_y) = \exp\left\{ -\frac{k_x^2 + k_y^2}{2D_{h,n}^2} \right\} \left( 1 - \exp\left\{ -\frac{k_x^2 + k_y^2}{2D_{l,n}^2} \right\} \right), \tag{20} $$

where $D_{h,n}$ is the upper cut-off wavenumber and $D_{l,n}$ is the lower cut-off wavenumber. The upper cut-off wavenumber can be represented as

$$ D_{h,n} = \frac{2\pi f_{h,n}}{K_s c}\sqrt{M_x^2 + M_y^2}, \tag{21} $$

where $f_{h,n}$ is the upper cut-off frequency of band number $n$ of the filter bank, $M_x$ and $M_y$ are the numbers of pixels in $k_x$ and $k_y$ space, respectively, $K_s$ is the sampling wavenumber, and $c$ is the speed of sound. The lower cut-off wavenumber is set according to 20 Hz, which is the minimum audible sound frequency:

$$ D_{l,n} = \frac{40\pi}{K_s c}\sqrt{M_x^2 + M_y^2}. \tag{22} $$

Fig. 7(a)–(c) shows the two-dimensional spectra of the designed spatial filters for 10 kHz, 15 kHz, and 20 kHz, respectively. The radius of each spatial filter is larger if the temporal frequency is higher. Note that their responses are thicker than the theoretical spectra in Fig. 3(a)–(c). If the two-dimensional wave assumption in Section 2.2 does not hold, the actual spectrum contains leakage of energy inside the circle. The thicker responses of the spatial filters tolerate violations of the two-dimensional assumption, which might be too strong for some Schlieren data.

Table 1. Parameters of the designed FIR filter bank.
Order of filter | 1023
Maximum stopband ripple of analysis filters | 66 dB
Maximum stopband ripple of synthesis filters | 60 dB

3.1.3. The proposed method
The idea of the proposed method is to design a spatio-temporal filter bank whose spatio-temporal spectrum is a cone. First, the temporal frequency components of the audible sound field are divided equally into multiple bands by the analysis filter bank, as shown in Fig. 6. Then, isotropic spatial filters are generated by setting their radii according to the upper cut-off frequency of each band. Therefore, the spectrum of the proposed filter is a stack of cylinders in which the base radius of each cylinder relates to the upper cut-off frequency of the corresponding band. In this way, we obtain a filter with a spectrum similar to the cone-shaped spectrum of sound. Clearly, it is closer to the cone if the temporal frequency components are divided into more bands. A schematic presentation of the designed filter bank spectrum, colored in the same way as the analysis filter bank in Fig. 6, is shown in Fig. 8. This method is an extension of our previous investigation, which used a single narrow bandpass filter coupled with a single spatial filter for extracting sound field information [21]. Considering the spatio-temporal spectrum in Fig. 8, it is evident that the previous filter is a single narrow cylinder at one specific frequency. Therefore, its spectrum does not cover the whole spatio-temporal spectrum of sound, and thus it can only be applied to a pure tone.

3.2. Procedure of the proposed method
The procedure of the proposed method is described in Fig. 9. It is explained together with the tree structure of the filter bank in Fig. 5 as follows:
Fig. 6. The magnitude response of the designed analysis filter bank. (a) 2-channel filter bank, (b) 4-channel filter bank, (c) 8-channel filter bank whose color of each band corresponds to the schematic presentation of spectrum of the proposed filter in Fig. 8, (d) 16-channel filter bank, (e) 32-channel filter bank, and (f) 64-channel filter bank. The gray dotted lines correspond to removed bands. Axes: frequency [kHz] (0 to 24) and magnitude [dB] (0 to −80).
Fig. 7. Spectra of spatial band pass filter, where black indicates 1 and white indicates 0. (a) For 10 kHz sound field, (b) for 15 kHz sound field, and (c) for 20 kHz sound field. These figures correspond to Fig. 3(a)–(c). Axes: wavenumber [rad/m].
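A response of the form of Eq. (20) can be generated on an FFT grid as sketched below. This is a simplified illustration, not the authors' implementation: the cut-offs are expressed directly in rad/m as $D = 2\pi f / c$, so the pixel-count scaling of Eqs. (21) and (22) is omitted, and the speed of sound, the function names, and the 0.21 mm pixel pitch (taken from the simulation condition in Fig. 11) are assumptions.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound [m/s]


def spatial_bandpass(shape, pixel_pitch, f_upper, f_lower=20.0):
    """Gaussian band-pass response of the form of Eq. (20) on an FFT grid.

    Cut-offs are given in rad/m (D = 2*pi*f/c); the pixel-count scaling of
    Eqs. (21)-(22) is left out of this sketch.
    """
    my, mx = shape
    kx = 2.0 * np.pi * np.fft.fftfreq(mx, d=pixel_pitch)
    ky = 2.0 * np.pi * np.fft.fftfreq(my, d=pixel_pitch)
    k2 = kx[np.newaxis, :] ** 2 + ky[:, np.newaxis] ** 2
    d_h = 2.0 * np.pi * f_upper / C_SOUND
    d_l = 2.0 * np.pi * f_lower / C_SOUND
    lowpass = np.exp(-k2 / (2.0 * d_h ** 2))
    highpass = 1.0 - np.exp(-k2 / (2.0 * d_l ** 2))
    return lowpass * highpass


def apply_spatial_filter(frame, response):
    """Filter one real-valued frame in the wavenumber domain."""
    return np.real(np.fft.ifft2(np.fft.fft2(frame) * response))


H = spatial_bandpass((256, 256), 0.21e-3, f_upper=10e3)
print(H.shape, H[0, 0], H.max())
```

Because the lower cut-off corresponds to only 20 Hz, the response is essentially a filled disc with a very small hole at the origin, which matches the remark that the filters are thicker than the theoretical rings.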
Fig. 8. Schematic presentation of the spectrum of the proposed filter. (a) Spatio-temporal spectrum and (b) two-dimensional spatial spectrum. Axes: wavenumber [rad/m] and frequency [kHz].
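The stack of cylinders in Fig. 8 can be tabulated from the band edges of a uniform filter bank. The sketch below assumes a speed of sound of 343 m/s and a hypothetical rule that a band is kept when it starts below 20 kHz and ends above 20 Hz; the paper does not spell out how a band straddling 20 kHz is treated, so that rule is an assumption.

```python
import math

C_SOUND = 343.0  # assumed speed of sound [m/s]


def cylinder_radii(frame_rate, levels, f_max=20e3, f_min=20.0):
    """Upper band edges and ring radii for a uniform 2**levels-channel bank.

    Returns a list of (f_upper_hz, radius_rad_per_m) for the kept bands.
    """
    channels = 2 ** levels
    width = frame_rate / 2.0 / channels
    kept = []
    for n in range(channels):
        f_low, f_high = n * width, (n + 1) * width
        if f_low >= f_max or f_high <= f_min:
            continue  # band lies entirely outside the audible range
        kept.append((f_high, 2.0 * math.pi * f_high / C_SOUND))
    return kept


for f_high, radius in cylinder_radii(frame_rate=48_000, levels=3):
    print(f"band up to {f_high / 1e3:4.1f} kHz -> radius {radius:6.1f} rad/m")
```

Increasing `levels` produces thinner cylinders whose envelope approaches the cone of Fig. 3(d).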
1. Firstly, since all the intensity values of the raw Schlieren video are positive, there are large zero-frequency components that make the sound field difficult to visualize. The DC components of the Schlieren video are removed by evaluating the mean of each pixel in the time direction and subtracting the mean from each pixel.
2. Secondly, the sound field signals are processed in the spatio-temporal filter bank. According to Fig. 5, the process starts with the two-channel filter bank. In the analysis process, the highpass $H_1(z)$ and lowpass $H_0(z)$ filter pair splits the sound field signal into 2 bands.
3. Then, the signal of each band is decimated by two.
4. Next, the process is repeated $n$ times until the desired number of filter channels, $2^n$, is acquired.
5. After the analysis process, the frequencies above 20 kHz (inaudible sound) are removed.
6. If the upper cut-off frequency of the first band is less than or equal to 20 Hz, the components contained in that band (inaudible sound) are removed.
7. Then, in sub-band processing, the unwanted spatial frequencies of each band are removed by using the spatial bandpass filters.
8. Finally, in the synthesis process, the filtered signals of each band are interpolated and recombined by using the synthesis highpass $G_1(z)$ and lowpass $G_0(z)$ filter pair.

4. Evaluation of the proposed method by simulation
In order to evaluate the performance of the proposed method, the two-dimensional Green's function is used to simulate a sound field video. Since the proposed method is applied to remove noise
Fig. 9. Flow chart of the procedure of the proposed method, where $f_{h,1}$ is the upper cut-off frequency of the first band of the filter bank.
Fig. 10. Procedure of calculation of SNR.
Fig. 11. Schematic presentation of the simulation condition for simulating the sound field video, where $x_i$ is the position of each point in the simulated field (see Eq. (23)). The image size was 256 × 256 pixels. This condition is similar to that of the real experiment in the next section. Figure labels: 256 pixels (5.3 cm) per side, 0.21 mm, 2.7 cm, 1.8 cm, point sound source, origin (0,0).
outside the sound spectrum, the signal-to-noise ratio (SNR) is used to quantify its performance. In this section, a spherical pure tone sound field of frequency 10 kHz and a spherical sound field consisting of frequencies 5 kHz, 10 kHz, and 15 kHz are simulated to test the performance of our method. The procedure of this evaluation is shown in Fig. 10. First, the audible sound field was simulated by using the two-dimensional Green's function. The schematic presentation of the simulation conditions is shown in Fig. 11. The equation for simulating the pure tone sound field can be represented as
$$ \phi(x_i, y) = \frac{i}{4} H_0^{(1)}(k\,|x_i - y|), \tag{23} $$

where $H_0^{(1)}$ is the Hankel function of the first kind, $k$ is the wavenumber, and $|x_i - y|$ is the distance between each point in the field $x_i$ and the point source $y$. In addition, the sound field consisting of frequencies 5 kHz, 10 kHz, and 15 kHz can be simulated by

$$ \phi(x_i, y) = \sum_{j=1}^{3} \frac{i}{4} H_0^{(1)}(k_j\,|x_i - y|), \tag{24} $$
Fig. 12. Improvement of SNR and the images of simulated 10 kHz sinusoidal sound field. (a) Improvement of SNR of simulated sound field after filtering, (b) simulated sound field with noise, (c) sound field after being filtered by 2-channel filter bank, (d) sound field after being filtered by 4-channel filter bank, (e) sound field after being filtered by 8-channel filter bank, (f) sound field after being filtered by 16-channel filter bank, (g) sound field after being filtered by 32-channel filter bank, and (h) sound field after being filtered by 64-channel filter bank. Axes in (a): channel of filter bank (2 to 64) and improvement of SNR [dB].
Fig. 13. Improvement of SNR and the images of simulated sound field consisting of frequencies 5 kHz, 10 kHz, and 15 kHz. (a) Improvement of SNR of simulated sound field after filtering, (b) simulated sound field with noise, (c) sound field after being filtered by 2-channel filter bank, (d) sound field after being filtered by 4-channel filter bank, (e) sound field after being filtered by 8-channel filter bank, (f) sound field after being filtered by 16-channel filter bank, (g) sound field after being filtered by 32-channel filter bank, and (h) sound field after being filtered by 64-channel filter bank. Axes in (a): channel of filter bank (2 to 64) and improvement of SNR [dB].
where $k_1$, $k_2$ and $k_3$ are the wavenumbers of frequencies 5 kHz, 10 kHz, and 15 kHz, respectively. The time direction of the Schlieren video was simulated by multiplying by the time factor $e^{j\omega t}$. After that, Gaussian noise was added to the simulated video to make it resemble a real Schlieren video after removal of the DC component. In this case, an SNR of 10 dB was chosen to set the noise level. Then, the simulated sound field video with noise was processed by the proposed method. The simulated sound field videos before and after filtering were used to calculate the SNR:

$$ \mathrm{SNR} = 10\log_{10}\frac{\sum |S_{\mathrm{data}}|^2}{\sum |S_{\mathrm{data}} - F_{\mathrm{data}}|^2}, \tag{25} $$

where $S_{\mathrm{data}}$ is the simulated sound field video without noise and $F_{\mathrm{data}}$ is the simulated sound field video with noise after being filtered by the proposed method.

4.1. Simulation result
The graphs of the improvement of SNR and the simulated sound field images after filtering are shown in Fig. 12 for the 10 kHz sinusoidal sound field and in Fig. 13 for the sound field consisting of frequencies 5 kHz, 10 kHz, and 15 kHz. In both graphs, the SNR improvement is positive for every number of channels, which indicates that the proposed method removes noise. This is supported by the images of the sound fields before and after filtering: comparing the simulated sound fields with added noise in Figs. 12(b) and 13(b) with panels (c)–(h), it is evident that the visibility of the sound field is enhanced. In addition, if we consider the SNR as a function of the number of filter channels, the graphs in Figs. 12(a) and 13(a) show that the 64-channel filter bank is the most effective for visualization and that the effectiveness tends to increase with the number of filter channels. Correspondingly, the visibility of the filtered sound field images increases with the number of channels. As mentioned in Section 3.1.3, if the temporal frequency components of the audible sound field are divided into more bands, the spectrum of the filter bank is closer to the cone. Consequently, more noise is removed, leading to higher visualization performance. Therefore, the evaluations of both the 10 kHz sinusoidal sound field and the sound field consisting of frequencies 5 kHz, 10 kHz, and 15 kHz support the hypothesis behind our filter bank design.

Table 2. Experimental condition.
Item | 10 kHz | Chirp sound
Sound source frequency | 10 kHz | 4–4.3 kHz
Camera | NAC MEMRECAM HX-3 | NAC MEMRECAM HX-3
Loudspeaker | YAMAHA MSP-7 STUDIO | GENELEC 8020B
Image size | 256 × 256 pixel | 384 × 384 pixel
Frame rate | 96,000 fps | 48,000 fps
Diameter of lenses | 10 cm | 15 cm
$L$, $f_1$, $f_2$ (see Fig. 2(a)) | 100 cm, 100 cm, 102 cm | 150 cm, 150 cm, 150 cm

Fig. 14. Schematic diagram of experiment setup of Schlieren visualization system.
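The simulation and SNR calculation of Section 4 can be sketched as follows, assuming SciPy is available for the Hankel function. The speed of sound, the source placement, the choice to keep the real part as the video, and all function names are assumptions made for illustration; only the 256-pixel grid with 0.21 mm pitch, the 10 dB input SNR, and Eqs. (23)–(25) come from the text.

```python
import numpy as np
from scipy.special import hankel1

C_SOUND = 343.0   # assumed speed of sound [m/s]
PITCH = 0.21e-3   # pixel pitch [m], from the simulation condition in Fig. 11


def simulate_field(freqs_hz, frame_rate, n_frames, size=256,
                   source=(0.0, -0.018)):
    """Real part of Eqs. (23)-(24) multiplied by exp(j*omega*t).

    `source` is a hypothetical point-source position [m] outside the image,
    so the distance to every pixel is strictly positive.
    """
    coords = np.arange(size) * PITCH
    xx, yy = np.meshgrid(coords, coords)
    r = np.hypot(xx - source[0], yy - source[1])
    t = np.arange(n_frames) / frame_rate
    video = np.zeros((n_frames, size, size), dtype=complex)
    for f in freqs_hz:
        k = 2.0 * np.pi * f / C_SOUND
        spatial = 0.25j * hankel1(0, k * r)
        temporal = np.exp(1j * 2.0 * np.pi * f * t)
        video += temporal[:, None, None] * spatial[None, :, :]
    return video.real


def add_noise(clean, snr_db, rng):
    """Add white Gaussian noise so that the input SNR equals snr_db."""
    noise_power = np.mean(clean ** 2) / 10.0 ** (snr_db / 10.0)
    return clean + rng.normal(0.0, np.sqrt(noise_power), clean.shape)


def snr_db(clean, estimate):
    """Eq. (25): ratio of signal energy to residual energy, in dB."""
    residual = clean - estimate
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))


rng = np.random.default_rng(0)
clean = simulate_field([5e3, 10e3, 15e3], frame_rate=48_000, n_frames=64, size=32)
noisy = add_noise(clean, 10.0, rng)
print(f"SNR before filtering: {snr_db(clean, noisy):.2f} dB")
```

The improvement plotted in Figs. 12(a) and 13(a) would then correspond to `snr_db(clean, filtered) - snr_db(clean, noisy)`, where `filtered` is the output of the filter bank.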
5. Experiments

5.1. Experimental setup and recording

In our experiment, a two-lens Schlieren system was used to observe the audible sound field. It consists of two convex lenses, a light source, and a knife edge. Two sound fields were used to test our method: a pure tone of frequency 10 kHz at a sound pressure level (SPL) of around 100 dB, measured at 10 cm from the loudspeaker, and a chirp with frequencies ranging from 4 kHz to 4.3 kHz at an SPL of 112 dB, measured at 16 cm. The spatial resolutions of the pure tone and chirp sound field videos are around 0.2 mm/pixel and 0.4 mm/pixel, respectively. The schematic of the experimental setup is shown in Fig. 14. A GENELEC 8020B loudspeaker and a YAMAHA MSP-7 STUDIO loudspeaker were used to produce the sound. Since the tweeter of each loudspeaker is dome-shaped, the wavefronts of the generated sound fields can be assumed to be spherical. The experimental conditions and a picture of the experimental setup are shown in Table 2 and Fig. 15, respectively.

The sound field emitted by a loudspeaker passed through the test area of the Schlieren system. Its interaction with the air caused refraction of the light. The knife edge, placed at the focal point of the second lens, blocked the refracted light rays. The remaining light was finally detected by the high-speed camera, which yields the visualization.
Fig. 15. Experiment setup of Schlieren visualization system (labels in the photograph: light source, camera, loudspeakers, lens 1, lens 2, microphone).
Fig. 16. Schlieren images of 10 kHz sound field. (a) Data detected by the high-speed camera, (b) data without DC component, (c) sound field after being filtered by 2-channel spatio-temporal filter bank, (d) sound field after being filtered by 4-channel spatio-temporal filter bank, (e) sound field after being filtered by 8-channel spatio-temporal filter bank, (f) sound field after being filtered by 16-channel spatio-temporal filter bank, (g) sound field after being filtered by 32-channel spatio-temporal filter bank, and (h) sound field after being filtered by 64-channel spatio-temporal filter bank.
5.2. Experimental results
5.2.1. Pure tone sound field

In this experiment, the 10 kHz sound field was emitted by a loudspeaker placed at the right side of the image. The sound field cannot be seen in the raw image acquired from the Schlieren system (Fig. 16(a)), which is consistent with the result in [18]. Removing the DC component makes the features of the sound field more visible (Fig. 16(b)). However, noise in the spatial domain still exists. The spatio-temporal filter bank is effective for removing the noise in two-dimensional space and extracting the sound information in the time dimension. As shown in Fig. 16(c)–(h), the sound field is visualized more clearly as the number of filter channels of the spatio-temporal filter bank increases.
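The DC removal step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the DC component is the temporal mean of each pixel and that the video is stored as a NumPy array of shape (frames, height, width). The function name `remove_dc` and the synthetic test data are hypothetical.

```python
import numpy as np


def remove_dc(video: np.ndarray) -> np.ndarray:
    """Subtract the temporal mean of every pixel from a (frames, height, width) video."""
    video = np.asarray(video, dtype=float)
    return video - video.mean(axis=0, keepdims=True)


# Synthetic example: a bright static background plus a weak 10 kHz oscillation.
frame_rate = 48_000  # frames per second (illustrative value)
t = np.arange(480) / frame_rate  # 0.01 s, i.e. 100 full periods of 10 kHz
background = 100.0 * np.ones((4, 4))
weak_signal = 0.01 * np.sin(2 * np.pi * 10_000 * t)[:, None, None]
raw = background + weak_signal
filtered = remove_dc(raw)
```

In this synthetic case the static background dominates `raw`, while `filtered` retains only the weak oscillation, which is why the sound field features become easier to see after the DC component is removed.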
Fig. 17. Spectrogram of chirp sound from 4 kHz up to 4.3 kHz (axes: time [s] and frequency [Hz]; color scale: level [dB]). The gray dotted lines represent the boundary of each band of the filter bank.
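The band boundaries drawn in Fig. 17 can be reproduced from the settings reported for the chirp experiment in Section 5.2.2 (a 64-channel filter bank, data recorded at 48,000 fps, and a band width of 375 Hz). The sketch below assumes that the bands have equal width and cover 0 Hz up to the Nyquist frequency, which is consistent with the stated 375 Hz; the function name `band_edges` is hypothetical.

```python
def band_edges(frame_rate_hz: float, num_channels: int) -> list[float]:
    """Edges of num_channels equal-width bands covering 0 Hz to the Nyquist frequency."""
    nyquist = frame_rate_hz / 2.0
    width = nyquist / num_channels
    return [i * width for i in range(num_channels + 1)]


edges = band_edges(48_000, 64)
band_width = edges[1] - edges[0]

# Band boundaries that fall strictly inside the 4 kHz to 4.3 kHz chirp range.
crossed = [edge for edge in edges if 4_000 < edge < 4_300]
print(band_width, crossed)
```

Under these assumptions the band width is 375 Hz and exactly one boundary, at 4125 Hz, lies inside the chirp range, so the chirp starts in one band and ends in the next.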
5.2.2. Non-stationary sound field

A linear chirp sound field from 4 kHz to 4.3 kHz was used to test the proposed method. A sinusoidal linear chirp is defined by Eq. (26).
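A linear chirp of the kind given in Eq. (26), with phase φ0 + 2π(f0 t + (q/2) t²) and sweep rate q = (f1 − f0)/T, can be generated numerically as in the following minimal sketch. The frequencies of 4 kHz and 4.3 kHz and the sampling rate of 48,000 samples per second are taken from the experiment description; the duration T = 0.035 s and the function names are illustrative assumptions, not values confirmed by the paper.

```python
import math


def linear_chirp(f0, f1, duration, sample_rate, phi0=0.0):
    """Samples of sin(phi0 + 2*pi*(f0*t + (q/2)*t**2)) with q = (f1 - f0) / duration."""
    q = (f1 - f0) / duration  # sweep rate in Hz per second
    num_samples = int(round(duration * sample_rate))
    samples = []
    for n in range(num_samples):
        t = n / sample_rate
        samples.append(math.sin(phi0 + 2.0 * math.pi * (f0 * t + 0.5 * q * t * t)))
    return samples


def instantaneous_frequency(f0, f1, duration, t):
    """Derivative of the chirp phase divided by 2*pi, i.e. f0 + q*t."""
    return f0 + (f1 - f0) / duration * t


# Assumed duration of 0.035 s; 4 kHz to 4.3 kHz sweep sampled at 48,000 samples per second.
chirp = linear_chirp(4_000.0, 4_300.0, duration=0.035, sample_rate=48_000)
```

The instantaneous frequency rises linearly from f0 at t = 0 to f1 at t = T, which is what makes the signal non-stationary.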
Fig. 18. Schlieren images of the chirp sound field. (a) chirp sound field detected before processing, (b) chirp sound field after removing the DC component, and (c) chirp sound field after being filtered by 64-channel spatio-temporal filter bank.
Fig. 19. Spectrogram of the Schlieren chirp sound field before filtering (axes: time [s] and frequency [Hz]; color scale: level [dB]).
Fig. 20. Spectrogram of the Schlieren chirp sound field after filtering (axes: time [s] and frequency [Hz]; color scale: level [dB]; the 2nd, 3rd, and 4th harmonics are annotated).

In this case, we selected the 64-channel spatio-temporal filter bank to analyze the Schlieren data recorded at 48,000 fps. Therefore, the width of each band of the filter bank was 375 Hz. Fig. 17 shows the spectrogram of the chirp sound, with the gray dotted lines representing the boundary of each band of the filter bank. It can be seen that the chirp sound crosses a band boundary. Fig. 18 shows images of the chirp sound field before and after filtering. In this experiment, the loudspeaker was placed at the left side of the image. The brighter area in Fig. 18(a) indicates the area of the screen. As can be seen, the visibility of the chirp sound field is enhanced after filtering (Fig. 18(c)). The spectrograms of the chirp sound field before and after filtering are shown in Figs. 19 and 20, respectively. They indicate that the proposed method is effective for extracting the sound field information and for removing noise. Moreover, the harmonic frequencies generated by distortion due to the large sound pressure can be detected. The spectrogram of the sound captured by a microphone during the experiment (Fig. 21) is used to validate the results: the harmonic frequencies also appear in the spectrogram of the microphone signal. Therefore, the proposed method can also detect the harmonic frequencies of a sound field, and the extracted sound is not restricted to a single band of the filter bank.

6. Conclusion

In this paper, we have addressed the main problems that limit the visibility of the audible sound field observed by a Schlieren system. To overcome these problems, a spatio-temporal filter bank is applied. The idea of this method is to design the spectrum of the filter bank to be similar to the spatio-temporal spectrum of the audible sound field, which is a cone. The filter bank extracts the sound information inside the cone and removes the unwanted noise outside it. To evaluate the performance of the proposed method, a simulation was performed to calculate the SNR. The results show that the proposed method enhances the visibility of the sound fields and that the audible sound field is visualized more clearly as the number of channels increases. The performance of the proposed method was validated for both simulated and real data. Since we have shown that the spatio-temporal spectrum of the audible sound field is concentrated in a specific region, the proposed method is effective not only for Schlieren observation but also for other applications of audible sound field feature extraction and noise removal. In future work, the proposed method should be applied to other applications to investigate its effectiveness.

Appendix A. Supplementary material
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.apacoust.2016.08.028.
Fig. 21. Spectrogram of the chirp sound captured by the microphone during the experiment (axes: time [s] and frequency [Hz]; color scale: level [dB]).

The sinusoidal linear chirp introduced in Section 5.2.2 is defined as

$$S(t) = \sin\left\{ \phi_0 + 2\pi \left( f_0 t + \frac{q}{2} t^2 \right) \right\} \tag{26}$$

where $\phi_0$ is the initial phase, $f_0$ is the initial frequency, $q = (f_1 - f_0)/T$, $f_1$ is the final frequency, and $T$ is the duration.

References
[1] Yokota T, Sakamoto S, Tachibana H. Visualization of sound propagation and scattering in rooms. Acoust Sci Technol 2002;23(1):40–6. http://dx.doi.org/10.1250/ast.23.40.
[2] Deines E, Bertram M, Mohring J, Jegorovs J, Michel F, Hagen H, et al. Comparative visualization for wave-based and geometric acoustics. IEEE Trans Vis Comput Graph 2006;12(5):1173–80. http://dx.doi.org/10.1109/TVCG.2006.125.
[3] Bilbao S, Hamilton B, Botts J, Savioja L. Finite volume time domain room acoustics simulation under general impedance boundary conditions. IEEE/ACM Trans Audio Speech Lang Process 2016;24(1):161–73. http://dx.doi.org/10.1109/TASLP.2015.2500018.
[4] Hiranaka Y, Nishii O, Genma T, Yamasaki H. Real-time visualization of acoustic wave fronts by using a two-dimensional microphone array. J Acoust Soc Am 1988;84(4):1373–7. http://dx.doi.org/10.1121/1.397230.
[5] Yamasaki Y, Itow T. Measurement of spatial information in sound field by closely located four point microphone method. J Acoust Soc Jpn (E) 1989;10:101–10.
[6] Zipser L, Franke H. Laser-scanning vibrometry for ultrasonic transducer development. Sens Actuat A 2003;110(1–3):264–8. http://dx.doi.org/10.1016/j.sna.2003.10.051.
[7] Frank S, Schell J. Sound field simulation and visualization based on laser doppler vibrometer measurement. In: Proc Forum Acust. p. 91–5.
[8] Oikawa Y, Goto M, Ikeda Y, Takizawa T, Yamasaki Y. Sound field measurement based on reconstruction from laser projections. Proc ICASSP, vol. 4. p. IV661–4. http://dx.doi.org/10.1109/ICASSP.2005.1416095.
[9] Oikawa Y, Hasegawa T, Ouchi Y, Yamasaki Y, Ikeda Y. Visualization of sound field and sound source vibration using laser measurement method. Proc 20th Int Congr Acoust, vol. 2. p. 992–6.
[10] Malkin R, Todd T, Robert D. A simple method for quantitative imaging of 2D acoustic fields using refracto-vibrometry. J Sound Vib 2014;333(19):4473–82. http://dx.doi.org/10.1016/j.jsv.2014.04.049.
[11] Rudinger G, Somers LM. A simple schlieren system for two simultaneous views of a gas flow. J SMPTE 1957;66(10):622. http://dx.doi.org/10.5594/J17005.
[12] Jonassen DR, Settles GS, Tronosky MD. Schlieren "PIV" for turbulent flows. Opt Laser Eng 2006;44(3–4):190–207. http://dx.doi.org/10.1016/j.optlaseng.2005.04.004.
[13] Brownlee C, Pegoraro V, Shankar S, McCormick P, Hansen CD. Physically-based interactive flow visualization based on schlieren and interferometry experimental techniques. IEEE Trans Vis Comput Graph 2011;17(11):1574–86. http://dx.doi.org/10.1109/TVCG.2010.255.
[14] Krehl P, Engemann S. August Toepler - the first who visualized shock waves. Shock Waves 1995;5(1):1–18. http://dx.doi.org/10.1007/BF02425031.
[15] Reichel EK, Schneider SC, Zagar BG. Characterization of ultrasonic transducers using the Schlieren-technique. Proc I2MTC, vol. 3. p. 1956–60. http://dx.doi.org/10.1109/IMTC.2005.1604513.
[16] Caliano G, Savoia AD, Iula A. An automatic compact Schlieren imaging system for ultrasound transducer testing. IEEE Trans Ultrason Ferroelectr Freq Control 2012;59(9):2102–10. http://dx.doi.org/10.1109/TUFFC.2012.2431.
[17] Möller D, Degen N, Dual J. Schlieren visualization of ultrasonic standing waves in mm-sized chambers for ultrasonic particle manipulation jet flow. J Nanobiotechnol 2013;11(21):1–5. http://dx.doi.org/10.1186/1477-3155-11-21.
[18] Hargather MJ, Settles GS, Madalis MJ. Schlieren imaging of loud sounds and shock waves in air near the limit of visibility. Shock Waves 2010;20(1):9–17. http://dx.doi.org/10.1007/s00193-009-0226-6.
[19] Ajdler T, Sbaiz L, Vetterli M. The plenacoustic function and its sampling. IEEE Trans Signal Process 2006;54(10):3790–804. http://dx.doi.org/10.1109/TSP.2006.879280.
[20] Mignot R, Chardon G, Daudet L. Low frequency interpolation of room impulse responses using compressed sensing. IEEE/ACM Trans Audio Speech Lang Process 2014;22(1):205–16. http://dx.doi.org/10.1109/TASLP.2013.2286922.
[21] Chitanont N, Yaginuma K, Yatabe K, Oikawa Y. Visualization of sound field by means of Schlieren method with spatio-temporal filtering. In: Proc ICASSP. p. 509–13. http://dx.doi.org/10.1109/ICASSP.2015.7178021.
[22] Settles GS. Schlieren and shadowgraph techniques. Springer; 2001.
[23] Vaidyanathan P. Quadrature mirror filter banks, M-band extensions and perfect-reconstruction techniques. IEEE Signal Process Mag 1987;4(3):4–20. http://dx.doi.org/10.1109/MASSP.1987.1165589.
[24] Vaidyanathan P. Multirate digital filters filter banks polyphase networks and applications a tutorial. Proc IEEE 1990;78(1):56–93. http://dx.doi.org/10.1109/5.52200.
[25] Irene YG, Tardi T. Multiresolution feature detection using a family of isotropic bandpass filters. IEEE Trans Syst Man Cybern B 2002;32(4):443–54. http://dx.doi.org/10.1109/TSMCB.2002.1018764.
[26] Zadehgol A, Cangellaris AC. Isotropic spatial filters for suppression of spurious noise waves in sub-gridded FDTD simulation. IEEE Trans Antennas Propag 2011;59(9):3272–9. http://dx.doi.org/10.1109/TAP.2011.2161551.
[27] Vetterli M. Filter banks allowing perfect reconstruction. Signal Process 1986;10(3):219–44. http://dx.doi.org/10.1016/0165-1684(86)90101-5.
[28] Mitra SK. Digital signal processing: a computer-based approach. 2nd ed. McGraw-Hill; 2001.
[29] Fliege NJ. Multirate digital signal processing. 2nd ed. John Wiley and Sons Ltd.; 1994.