Audio signal reconstruction based on adaptively selected seed points from laser speckle images

Optics Communications 331 (2014) 6–13 Contents lists available at ScienceDirect Optics Communications journal homepage: www.elsevier.com/locate/optc...



Optics Communications journal homepage: www.elsevier.com/locate/optcom

Ziyi Chen a, Cheng Wang a,*, Chaohong Huang a, Hongyan Fu a, Huan Luo a, Hanyun Wang b

a Centre of Excellence for Remote Sensing and Spatial Informatics, School of Information Science and Engineering, Xiamen University, Xiamen, Fujian 361005, China
b School of Electron Science and Engineering, National University of Defense Technology, Changsha, Hunan 410073, China

Article info

Article history: Received 29 March 2014; Received in revised form 13 May 2014; Accepted 15 May 2014; Available online 28 May 2014

Keywords: Laser speckle pattern; Vibration recovery; Audio signal reconstruction; High-speed camera

Abstract

Speckle patterns, present in the laser light reflected from an object, reflect the micro-structure of the illuminated surface. When the object surface vibrates, the speckle patterns move accordingly, and this movement can be captured with a high-speed camera system. Because of the low signal-to-noise ratio (SNR), it is challenging to recover the micro-vibration information and reconstruct the audio signal from the captured speckle image sequences quickly and effectively. In this paper, we propose a novel method based on the gray-value variations of pixels in laser speckle patterns to address this task. The major contribution of the proposed method lies in using the intensity variations of adaptively selected seed points to recover the vibration information and the audio signal, with a novel model that effectively fuses the information of multiple seed points. Experiments show that our method not only recovers the vibration information with high quality but is also robust and fast. The SNR of the experimental results reaches about 20 dB and 10 dB at detection distances of 10 m and 50 m, respectively.

© 2014 Published by Elsevier B.V.

* Corresponding author. E-mail address: [email protected] (C. Wang).
http://dx.doi.org/10.1016/j.optcom.2014.05.038
0030-4018/© 2014 Published by Elsevier B.V.

1. Introduction

Laser speckle patterns have many applications, such as information extraction in the medical field [1–3], tilt and displacement measurement [4,5], and vibration simulation [6]. Since optical sound detection is simple to set up, detecting sound via laser speckle patterns is becoming increasingly attractive. Detecting sound from a remote position is of great significance in security, military applications, and criminal monitoring. In some situations, detecting sound with laser speckle patterns is easier than using an electronic detectaphone, as no detection device needs to be placed near the target.

The micro-vibration of an object's surface can be retrieved through speckle interferometry, but this approach suffers from several disadvantages. First, the detectable distance is limited. Second, there is a lower limit on the detectable displacement of the speckles. Third, interference requires a reference beam, and the required conditions are stringent.

Initially, sound was normally detected by collecting the light reflected from a window and analyzing it with an optical interferometer [7,8]. These sound detection methods suffer from four major disadvantages [9]. First, as all sounds are detected together, the methods must apply digital blind source separation as post-processing. Second, the projection laser and the detection interferometer module must be placed in very specific positions so that the reflected beam is directed towards the detection module. Third, the detection module is complicated and sensitive to errors, as all such configurations are interferometer based. Fourth, the methods require a window to be positioned near the voice source.

To overcome these disadvantages, Zalevsky et al. [9] proposed vibration information extraction through displacements between adjacent speckle patterns (VIETD). In VIETD, a speckle pattern is first generated by projecting a laser beam onto the surface of an object, and the movements of the speckle pattern are then recorded with a defocused high-speed camera system. After that, the audio signal is reconstructed from these movements, which reflect the vibration of the laser-illuminated object. Furthermore, the quality of the reconstructed sound signal can be improved by denoising methods [10]. However, the VIETD method has several limitations in practice. First, the image matching procedure is time-consuming. Second, the accuracy of image matching depends on the quality of the obtained speckle images. Since image quality generally declines as the distance between the image sensor and the object increases, and the need for clear structure information in the speckle images reduces the received reflected power, the detectable distance of the system may be limited. Third, the displacement between sequential speckle images is usually only a few pixels, so the vibration estimate may not be accurate enough.

Based on similar speckle pattern theory, optical devices [11–13] with single photo-detectors were developed to extract the vibration information of an object's surface in real time. These devices are able to detect the oscillations of either whole speckle patterns [11,12] or single speckles [13]. Veber et al. recently proposed a method that obtains sound information directly from the analog signals generated by a masked single photo-detector system [14]. The major limitation of Veber's work is that the mask pattern needs to be carefully designed, fabricated and adjusted according to the shapes and sizes of the speckle patterns, which makes the system harder to use. In fact, it is difficult to adjust the mask shape to the speckle pattern, as the shape of the speckle pattern is influenced by the following factors: the imaging distance, the micro-structure of the illuminated object surface, the wavelength of the laser beam, and the focal length of the observation lens. In terms of energy, the energy of the speckle pattern decreases when the speckles pass through the mask or the spatial filter, which reduces the detectable distance.

In this paper, we propose a novel micro-vibration detection method based on laser speckle images recorded by a high-speed camera, which needs neither a mask nor a strict constraint on the defocus level. There are three steps in the proposed method. First, we capture the speckle image sequences with a high-speed camera. Second, we adaptively select some special pixels in the speckle images as seed points, according to rules that are analyzed in Section 2.2. Third, we compute the gray-value variations of the seed points and fuse them with a novel model for audio signal reconstruction.

Because the proposed method reconstructs the vibration from the gray-value variations of adaptively selected pixels rather than from the displacement between adjacent images, it is faster than VIETD: no computationally heavy displacement estimation is needed. The proposed method is also more accurate and sensitive, as the dynamic range of a pixel's gray-value variation is up to 100 times larger than the displacements in the high-speed image sequences. Besides, the proposed method does not need a mask, so the sensor can obtain speckle pattern signals with higher SNR. In the experiments, we also show that the proposed method places only loose constraints on the lens defocus level, which is a key factor for obtaining good-quality speckle images at long detection distances. The last key contribution of the paper is the novel fusion model, which fuses the information of multiple seed points to guarantee and improve the performance of the proposed method.

Section 2 describes the theoretical model and the computational complexity of the proposed method, Section 3 presents the experimental results and discussions, and Section 4 draws the concluding remarks. Finally, the acknowledgment and the references are given.

2. Theoretical model

2.1. Experimental setup and system framework

The schematic diagram of the proposed system's framework is shown in Fig. 1. It includes a laser projector (MSL-III-53250 mW), a lens (a Nikon lens whose focal length can vary from 30 mm to 300 mm), and a high-speed camera (Pixelink74L148). The laser projector emits a laser beam onto the surface of the object of interest (OOI), generating a speckle field in space. The high-speed camera records the speckle patterns through the lens at a high frame rate, e.g. 2–8K fps, and the vibration information of the OOI surface is recovered from the recorded speckle image sequence (the sequence length equals the frame rate multiplied by the detection time) by our post-processing method. In Fig. 1, L is the distance between the observed speckle pattern plane and the surface of the OOI, U is the distance between the speckle pattern plane and the camera lens, f is the focal length, D is the diameter of the laser spot, and α is the tilt angle of the OOI caused by micro-vibrations.

Fig. 1. An overview of the proposed system framework.

Fig. 2 shows the flow chart of the proposed method. In the first step, we project the laser onto the surface of the OOI. In the second step, the high-speed camera records the speckle images at a high frame rate. After recording the speckle image sequences, we adaptively select some points as seed points (discussed in Section 2.2), and finally the vibration information and the audio signal are reconstructed by fusing the gray-value variations of multiple seed points (discussed in Section 2.3).

Fig. 2. The flow chart of the proposed system.

2.2. Principle analysis and seed-selection criteria

The rigid movements of the OOI's surface include tilt, transverse translation and axial translation. Generally, the three types of movement are difficult to separate. However, according to Ref. [9], for far-field speckle, the transverse and axial translations of the OOI's surface have a barely detectable effect on the speckle image under strong defocusing, so the tilt movement can be separated from the translation movements. As a result, the imaged speckle pattern shows only a displacement that originates from the tilt movement and can be calculated from the tilt angle and the defocusing level. Owing to this separation, the vibration of the OOI's surface can be reconstructed directly from the displacements in the recorded high-speed speckle sequences, as done by Zalevsky et al. [9].

One key problem of Zalevsky's method is that the algorithm for estimating speckle displacements, such as a cross-correlation method, is generally inefficient, especially for sub-pixel calculation. If the vibration is reconstructed only from integer-pixel displacements, the amplitude quantization of the recovered vibration is coarse because the speckle displacements are small. For high-speed speckle sequences, the inefficient displacement estimation makes on-line or real-time processing difficult to realize.

In this paper, we propose a different method to reconstruct the vibration from the high-speed speckle image sequences. For a fixed pixel in the speckle pattern, the gray value varies as the speckle pattern translates. If we record the gray-value variation of a pixel over time, a digitized vibration wave is obtained. Generally, the vibration obtained in this way is seriously distorted in comparison with the original vibration of the OOI's surface. However, suitably selected pixels minimize the distortion. The criteria for pixel selection are as follows:

(1) along the displacement direction, the selected pixel should lie at the middle position between a neighboring minimum and maximum;
(2) the distance between the neighboring minimum and maximum is more than twice the maximum displacement during the movement;
(3) the gray-value variation between the neighboring minimum and maximum is as linear as possible;
(4) the gray difference between the neighboring minimum and maximum is as large as possible, to ensure a large gray variation when translation takes place.

To make the criteria clear, we take an example for illustration. In Fig. 3, we plot the gray values of the pixels on a straight line along the moving direction. P1, P3, P5, and P7 are local minimum points, while P2, P4, and P6 are local maximum points. The pairs P1 and P2, P2 and P3, P3 and P4, P4 and P5, P5 and P6, and P6 and P7 are all neighboring minima and maxima as defined in the criteria. Between P1 and P2, P3 and P4, P4 and P5, and P6 and P7, the gray-value variations are approximately linear, which satisfies criterion (3). However, the variations between P2 and P3 and between P5 and P6 are not sufficiently linear. Moreover, the interval between P4 and P5 is better than the intervals between P1 and P2, P3 and P4, or P6 and P7, as both the gray-value variation and the distance between P4 and P5 are the largest among the four qualifying pairs. From the above analysis, the middle position between P4 and P5 is the seed point we want.

Fig. 3. The pixels on a line along the moving direction in the speckle image.

Suppose the maximum tilt angle is α. Then the maximum lateral speckle shift under the far-field, strongly defocused optical arrangement can be calculated as follows:

$$\Delta d = 2L \tan\alpha \cdot \frac{f}{U} \tag{1}$$

where L, f and U in the equations of this paper are the parameters defined in Fig. 1. For a complete speckle pattern generated by Gaussian-beam illumination, the average speckle size of the imaged speckle pattern can be estimated by

$$d_s = 2\,\frac{\lambda L}{\pi\omega} \cdot \frac{f}{U} \tag{2}$$

where ω is the radius of the Gaussian beam on the OOI's surface and λ is the wavelength of the laser beam. To ensure that pixels satisfying criteria (1) and (2) can always be found, the average speckle size should be larger than twice the maximum displacement, i.e. $d_s > 2\Delta d$. Under this condition, a pixel that satisfies criteria (1) and (2) always exists and can be searched for, because of the statistical characteristics of the speckle size. Combining Eqs. (1) and (2), we get

$$\omega < \frac{\lambda}{2\pi \tan\alpha} \tag{3}$$
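The step from Eqs. (1) and (2) to Eq. (3) can be spelled out as follows. This is a short derivation using only the symbols defined above, and it assumes L, f, U and tan α are all positive:

```latex
\begin{align*}
d_s > 2\,\Delta d
&\iff 2\,\frac{\lambda L}{\pi\omega}\cdot\frac{f}{U} \;>\; 4L\tan\alpha\cdot\frac{f}{U} \\
&\iff \frac{\lambda}{\pi\omega} \;>\; 2\tan\alpha \\
&\iff \omega \;<\; \frac{\lambda}{2\pi\tan\alpha}.
\end{align*}
```

The geometric factors L, f and U cancel, so under this model the bound on the beam radius depends only on the wavelength and the maximum tilt angle.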

This condition imposes a restriction on the beam radius on the OOI's surface. According to our experimental experience, the tilt angle α of the OOI's surface is generally less than 0.05 mrad. Thus a beam diameter (2ω) of less than 3.4 mm is preferable if λ is 532 nm. Criterion (3) is set to guarantee that the intensity of the selected pixels varies linearly with translation. Criterion (4) ensures that the selected pixels have as large an intensity variation as possible when translation takes place. For 8-bit digital speckle images, the maximum intensity difference can reach 255. Note that the priority of the criteria may differ for speckle images captured at different defocus levels. Fig. 4 shows four speckle images at different defocusing levels. When the defocusing level is slight, the speckles overlap within only a few pixels, so the structure information of the speckle pattern disappears, as shown in Fig. 4(d). Under this condition, the whole speckle pattern is treated as one speckle and criterion (4) is considered first.
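As a quick numerical check of the beam-diameter figure quoted above, the three relations can be evaluated directly. This is a minimal sketch: the function names are illustrative, and the geometry values `L_demo`, `f_demo` and `U_demo` are hypothetical numbers chosen only to show that the speckle size equals twice the shift exactly at the limiting radius.

```python
import math

def max_speckle_shift(L, alpha, f, U):
    """Eq. (1): maximum lateral speckle shift for tilt angle alpha (rad)."""
    return 2.0 * L * math.tan(alpha) * f / U

def mean_speckle_size(wavelength, L, omega, f, U):
    """Eq. (2): average imaged speckle size for beam radius omega."""
    return 2.0 * wavelength * L / (math.pi * omega) * f / U

def max_beam_radius(wavelength, alpha):
    """Eq. (3): upper bound on the beam radius so that d_s > 2 * delta_d."""
    return wavelength / (2.0 * math.pi * math.tan(alpha))

wavelength = 532e-9   # m, value given in the text
alpha = 0.05e-3       # rad, typical maximum tilt given in the text

omega_max = max_beam_radius(wavelength, alpha)
beam_diameter_mm = 2.0 * omega_max * 1e3
print(f"maximum beam diameter: {beam_diameter_mm:.2f} mm")  # about 3.39 mm

# Hypothetical geometry (not from the paper), used only for a consistency check.
L_demo, f_demo, U_demo = 3.0, 0.135, 2.0
d_s = mean_speckle_size(wavelength, L_demo, omega_max, f_demo, U_demo)
delta_d = max_speckle_shift(L_demo, alpha, f_demo, U_demo)
```

At `omega_max` the two sides of the condition coincide (`d_s` equals `2 * delta_d`), and any smaller beam radius makes the speckles larger than twice the maximum shift.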


Fig. 4. Laser speckle images under different defocus levels: (a) a large defocus level, (d) a slight defocus level, and (b) and (c) intermediate levels.
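The four selection criteria of this section can be illustrated on a one-dimensional gray-value profile taken along the moving direction, as in the Fig. 3 example. The sketch below is one possible reading of the criteria, not the authors' implementation: the function name `select_seed`, the linearity threshold `min_r2`, and the test profile are all assumptions, and plateaus (equal neighboring values) are ignored for simplicity.

```python
import numpy as np

def select_seed(profile, max_shift, min_r2=0.9):
    """Pick a seed index on a 1-D gray-value profile.

    Returns (seed_index, gray_span) for the best neighboring min/max pair,
    or None if no pair qualifies. min_r2 is a hypothetical linearity threshold.
    """
    p = np.asarray(profile, dtype=float)
    d = np.diff(p)
    # Interior local extrema: the first difference changes sign.
    extrema = [i for i in range(1, len(p) - 1) if d[i - 1] * d[i] < 0]
    best = None
    for lo, hi in zip(extrema[:-1], extrema[1:]):
        # Criterion (2): extrema more than twice the maximum shift apart.
        if hi - lo <= 2 * max_shift:
            continue
        # Criterion (3): gray values between the extrema close to a straight line.
        x = np.arange(lo, hi + 1)
        r = np.corrcoef(x, p[lo:hi + 1])[0, 1]
        if r * r < min_r2:
            continue
        # Criterion (4): prefer the largest gray difference.
        span = abs(p[hi] - p[lo])
        if best is None or span > best[1]:
            # Criterion (1): the seed sits midway between the two extrema.
            best = ((lo + hi) // 2, span)
    return best

# Hypothetical profile: short ripples plus one long, nearly linear ramp (index 3 to 8).
profile = [40, 20, 60, 30, 80, 120, 160, 200, 240, 200, 220, 180]
seed = select_seed(profile, max_shift=2)
print(seed)
```

With `max_shift=2` only the ramp between indices 3 and 8 is long enough, so the seed lands at its midpoint; with a larger `max_shift` no pair qualifies and the function returns `None`, which mirrors the role of the speckle-size condition in Eq. (3).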

2.3. Vibration recovery with multiple seed points

To increase the robustness, we use multiple seed points instead of a single seed point to reconstruct the vibration. Two key points need to be considered. First, the variation phase of each seed point may be different. Second, points with better quality should have higher weights. Thus, we propose the following fusion model:

$$\hat{x}_t = \sum_{i=1}^{N} w_i \cdot \hat{x}_{it} \cdot \mathrm{phase}_i, \quad t = 1, 2, 3, \ldots, \infty \tag{4}$$

where $\hat{x}_t$ is the estimated vibration offset at frame t, $w_i$ is the weight of the ith point, and $\hat{x}_{it}$ is the estimated vibration offset of the ith seed point at frame t. For each seed point, we have the following linear model:

$$\hat{x}_{it} = a_i y_{it} + b_i + n_i, \quad i = 1, 2, 3, \ldots, N; \; t = 1, 2, 3, \ldots, \infty; \; n_i \sim \mathcal{N}(0, \sigma_i) \tag{5}$$

where $y_{it}$ is the gray-value variation of the ith seed point at frame t and N is the number of seed points. $n_i$ is the noise at the ith seed point and follows a zero-mean normal distribution with variance $\sigma_i$. For each frame t, the final recovery result of all seed points should be the same. Thus, we have

$$a_i y_{it} + b_i + n_i = a_j y_{jt} + b_j + n_j, \quad i > j; \; i = 1, 2, 3, \ldots, N; \; j = 2, 3, 4, \ldots, N \tag{6}$$

There are $N(N-1)/2$ functions of Eq. (6) for N seed points. Accumulating M frames for Eq. (6) and eliminating the noise term, we get

$$a_i (y_{i1} + y_{i2} + \ldots + y_{iM}) + M b_i = a_j (y_{j1} + y_{j2} + \ldots + y_{jM}) + M b_j, \quad i \neq j \tag{7}$$

To estimate the parameters $a_i$ and $b_i$, we define K as the number of equations of the form of Eq. (7), and we form these K equations by accumulating K × M frames, i.e. K groups of M frames. Then the least squares method is used to solve for $\{a_i\}$ and $\{b_i\}$:

$$\{\hat{a}_i, \hat{b}_i\} = \arg\min_{\{a_i, b_i\}} \sum_{k=1}^{K} \sum_{i=1}^{N} \sum_{j=2,\, j<i}^{N} (T_{ki} - T_{kj})^2$$

$$T_{ki} = a_i (y_{ki1} + y_{ki2} + \ldots + y_{kiM}) + M b_i, \quad k = 1, 2, \ldots, K; \; i = 1, 2, \ldots, N$$

$$T_{kj} = a_j (y_{kj1} + y_{kj2} + \ldots + y_{kjM}) + M b_j, \quad k = 1, 2, \ldots, K; \; j = 2, 3, \ldots, N \tag{8}$$

From Eq. (8), we obtain the optimized parameters $\hat{a}_i$ and $\hat{b}_i$. Then, we substitute $\hat{a}_i$ and $\hat{b}_i$ into Eq. (6) again to compute $\sigma_i$. Denoting $a_j y_{jt} + b_j - a_i y_{it} - b_i$ as $C_{ijt}$, Eq. (6) can be rewritten as

$$\sigma_i - \sigma_j = \mathrm{Variance}(C_{ijt}) \tag{9}$$

There are $N(N-1)/2$ functions of Eq. (9) in each data group. To solve the functions, N must meet the following condition:

$$\frac{N(N-1)}{2} > N \tag{10}$$

Thus, according to Eq. (10), Eq. (9) has a solution when N is greater than 3. Fortunately, the number of seed points is always greater than three in the proposed method.

To compute the phase of the variations for each seed point, we select one seed point as the standard point and then compute the correlation coefficients of the gray-value variations between the standard point and the other seed points. If the correlation coefficient of a seed point is positive, the seed point is labeled with 1. Otherwise, the seed point is labeled with −1. In the proposed method, T frames of the speckle images are taken as statistical data. The details are expressed as follows:

$$R_{x_s x_i} = \frac{\sum_{t=1}^{T} (x_{st} - \bar{x}_s)(x_{it} - \bar{x}_i)}{\sqrt{\sum_{t=1}^{T} (x_{st} - \bar{x}_s)^2} \cdot \sqrt{\sum_{t=1}^{T} (x_{it} - \bar{x}_i)^2}}, \quad i = 1, 2, 3, \ldots, N$$

$$\mathrm{phase}_i = \begin{cases} 1 & \text{if } R_{x_s x_i} > 0 \\ -1 & \text{if } R_{x_s x_i} < 0 \end{cases} \tag{11}$$

Here $x_s$ is the gray-value variation of the standard seed point s, $x_i$ is the gray-value variation of seed point i, and $R_{x_s x_i}$ is the correlation coefficient between $x_s$ and $x_i$. If $R_{x_s x_i}$ is positive, we consider the gray-value variation phase of point i to be the same as that of the standard point s. Otherwise, the variation phase of point i is opposite to that of point s. Finally, we fuse the estimation results of the points to reconstruct the vibration information as follows:

$$\hat{x}_t = \sum_{i=1}^{N} w_i \cdot \hat{x}_{it} \cdot \mathrm{phase}_i, \quad t = 1, 2, 3, \ldots, \infty$$

$$\hat{x}_{it} = a_i y_{it} + b_i, \quad i = 1, 2, 3, \ldots, N$$

$$w_i = \frac{1/\sigma_i}{\sum_{j=1}^{N} 1/\sigma_j}, \quad i = 1, 2, 3, \ldots, N \tag{12}$$

2.4. Computational complexity analysis

Zalevsky et al. [9] used image matching to compute the displacements between adjacent images for audio signal reconstruction, so image matching is an important part of [9]. To guarantee the matching precision, the correlation coefficient method is one of the best choices. Suppose that an image A has n pixels; then the computational complexity of image matching with the correlation coefficient method is about $O(n^{3/2})$. To speed up the image matching, corners and features can be used. In Ref. [10], Harris corners are used in the image matching step. Leaving aside the remaining computation, the complexity of Harris corner detection is $O(n)$. Moreover, image matching with corners is unstable because speckle images are generally small. In the proposed method, the seed points are computed only once for hundreds of thousands of images, and the only per-frame computation is the gray-value difference between adjacent frames at the fixed seed locations. Generally, the number of seed points ranges from 4 to 10. In other words, the per-frame computational complexity of the proposed method is $O(1)$, since the time spent selecting the seed points and solving for the parameters of the fusion model is negligible. This comparison shows that the proposed method is much faster than the previous work [9]. In fact, the proposed method can work online when the computer is fast enough.
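The per-frame work described above, together with the phase alignment and weighting of Eqs. (11) and (12), can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the linear parameters `a`, `b` and the noise levels `sigma` are passed in as if they had already been estimated (the least-squares step of Eq. (8) and the noise estimation of Eq. (9) are not implemented), and the synthetic data are invented for the demonstration.

```python
import numpy as np

def seed_traces(frames, seeds):
    """Gray-value differences between adjacent frames at fixed seed pixels.

    frames: array of shape (T, H, W); seeds: list of (row, col).
    Returns an array of shape (N, T - 1).
    """
    rows = np.array([r for r, _ in seeds])
    cols = np.array([c for _, c in seeds])
    gray = np.asarray(frames)[:, rows, cols].astype(float)  # shape (T, N)
    return np.diff(gray, axis=0).T

def fuse(y, a, b, sigma, ref=0):
    """Phase-align and weight per-seed estimates, following Eqs. (11)-(12).

    y: (N, T) gray-value variations; a, b, sigma: length-N arrays (assumed known);
    ref: index of the standard seed point.
    """
    y = np.asarray(y, dtype=float)
    a, b, sigma = (np.asarray(v, dtype=float) for v in (a, b, sigma))
    x_hat = a[:, None] * y + b[:, None]                  # per-seed linear model
    r = np.array([np.corrcoef(y[ref], yi)[0, 1] for yi in y])
    phase = np.where(r > 0, 1.0, -1.0)                   # sign of the correlation
    w = (1.0 / sigma) / np.sum(1.0 / sigma)              # inverse-noise weights
    return np.sum(w[:, None] * phase[:, None] * x_hat, axis=0)

# Tiny check of the trace extraction on a hand-made 3-frame sequence.
frames = np.zeros((3, 4, 4))
frames[1, 1, 2], frames[2, 1, 2] = 5, 7
traces = seed_traces(frames, [(1, 2), (0, 0)])

# Synthetic demo: four seeds see the same 600 Hz vibration with mixed phase and noise.
rng = np.random.default_rng(0)
fs, f0 = 3800, 600
t = np.arange(2000) / fs
x = np.sin(2 * np.pi * f0 * t)
signs = np.array([1.0, -1.0, 1.0, -1.0])
sigma = np.array([0.05, 0.1, 0.2, 0.4])
y = signs[:, None] * x + sigma[:, None] * rng.standard_normal((4, t.size))
fused = fuse(y, a=np.ones(4), b=np.zeros(4), sigma=sigma)
quality = np.corrcoef(fused, x)[0, 1]
```

Once the seeds and model parameters are fixed, each new frame costs only N pixel reads and N multiply-adds, independent of the image size, which is the sense in which the text calls the method O(1) per frame.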

3. Experimental results and discussions

In all the following experiments, we capture the speckle images at a frame rate of 3800 fps on a small 32 × 32 window. Processing is performed on a computer with an Intel® Core™ i5-2400 and 4 GB RAM. All experimental procedures follow the flow chart shown in Fig. 2.

3.1. Verification of detectability through the proposed method

To verify the detectability of the proposed method, we use it to reconstruct an audio signal containing several fixed frequencies played by a loudspeaker. Fig. 5(a) shows the laser beam projected onto the OOI's surface, and Fig. 5(b) is the speckle image obtained. The sample signal consists of eight frequency segments: 400 Hz, 600 Hz, 800 Hz, 1000 Hz, 1200 Hz, 1400 Hz, 1600 Hz, and 1800 Hz. Each segment lasts 1 s, as shown in Fig. 5(c). Note that we use MATLAB to generate the sample audio signal here. We also use MATLAB to analyze the spectra of the signals in all the experiments of the paper. The spectrum of the result recovered by the proposed method is shown in Fig. 5(d). Comparing Fig. 5(c) and (d), we see that the sample audio signal is reconstructed correctly, with some noise (harmonics).
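A stepped-tone test signal like the one described above can be generated and checked in a few lines. The paper uses MATLAB for this; the sketch below uses Python with NumPy instead, and the sampling rate `fs` is an assumption (set equal to the 3800 fps camera frame rate so that the 1800 Hz tone stays below the Nyquist limit of 1900 Hz). The helper name `peak_frequency` is illustrative.

```python
import numpy as np

fs = 3800  # samples per second; assumed equal to the camera frame rate
tones = [400, 600, 800, 1000, 1200, 1400, 1600, 1800]  # Hz, from the text

t = np.arange(fs) / fs  # one second per tone
signal = np.concatenate([np.sin(2 * np.pi * f0 * t) for f0 in tones])

def peak_frequency(segment, fs):
    """Frequency (Hz) of the largest-magnitude bin of a real-valued segment."""
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

# Dominant frequency of each one-second segment.
peaks = [peak_frequency(signal[k * fs:(k + 1) * fs], fs) for k in range(len(tones))]
print([round(p) for p in peaks])
```

Applying the same `peak_frequency` check to each one-second segment of a recovered signal gives a simple way to confirm that all eight tones come back at the right frequencies.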

Fig. 5. (a) The laser beam projected on the surface of OOI. (b) The obtained speckle image. (c) The spectrum of the sample sound. (d) The spectrum of the recovery result via the proposed method.

3.2. Random seed point selection versus the proposed seed point selection

In this experiment, we compare the SNR values of the results recovered with the proposed seed point selection method and with the random point selection (RS) method. The sample audio signal, a constant 600 Hz tone, is played by the loudspeaker. To show that the proposed method is suitable for different speckle shapes, we use three groups of speckle images captured with different focal lengths and speckle pattern planes, as listed in Table 1. For data source 2, L is set to infinite because the imaging plane is located at the focal plane of the lens. For each data source, we compare the results reconstructed by the proposed method and by the RS method. In addition, to show the influence of the number of seed points, we compare the results for 1, 2, 4, 8, 16, and 32 seed points. Because the seed points in the RS method are random, its SNR value is computed as the average over 10 runs.

Table 1
Parameters of focal length and focus position in each data source.

Data source label   f (mm)   L (m)      Detection distance (m)
Data source 1       135      3          5
Data source 2       70       Infinite   5
Data source 3       70       3          5

The experimental results are shown in Fig. 6. The proposed method is superior to the RS method in all the experiments, which is consistent with the analysis in Section 2.2. Meanwhile, the proposed method reaches its best performance with about four seed points, and the performance shows no obvious improvement when the number of seed points increases further. This indicates that the proposed method does not need many seed points as long as their number satisfies Eq. (10). Another observation is that the results recovered with 1 and 2 seed points are not as good as the others, because recovery without the proposed fusion model is sensitive to noise. Although the performance of the RS method improves when the number of seed points is increased, instability remains its most important problem. As shown in Fig. 6, there are large differences among the lowest, highest and average SNR values of the RS method. Two reasons account for the improvement of the RS method when the number of seed points increases. First, the fusion model increases the weights of the seed points that have higher quality. Second, the probability of choosing seed points that satisfy the criteria increases with the number of seed points.

Fig. 6. (a), (b), and (c) SNR value comparisons between the proposed seed point selection method and the RS method in data sources 1, 2, and 3, respectively. The blue column represents the proposed method, and the red column represents the average of the RS method. The green bar and the red dashed line bar represent the lowest SNR value and the highest SNR value of the RS method during the experiments, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

3.3. Robustness to slight defocusing level of lens

The proposed method does not need a clear speckle pattern structure, so it is robust to different defocusing levels. In this experiment, we show that the proposed method still works effectively at an extremely slight defocusing level. Here, we detect a heartbeat audio signal played by the loudspeakers. The detection distance is 10 m, and f is 70 mm. Instead of strong defocusing, we focus the camera at about 9.5 m, which generates a small, bright speckle pattern of only a few pixels, as shown in Fig. 7(a). It can be seen that the fine structure of the speckle pattern has disappeared. Fig. 7(b) and (c) are the waveforms of the sample signal and the recovered signal, respectively. Under such an extreme condition, the vibration only changes the gray values of some points in the speckle pattern instead of moving the pattern. Nevertheless, the proposed method still succeeds in reconstructing the sample audio signal (Fig. 7(d)) under this challenging condition, as seen in Fig. 7(e). As the structure information of the speckles is missing, we treat the whole speckle pattern as one speckle in this experiment.

Fig. 7. (a) Speckle image captured under rather slight defocus level with 32 × 32 size. (b) The waveform of the sample heart beat audio signal. (c) The waveform of the recovered signal. (d) The spectrum of the sample audio signal. (e) The spectrum of the reconstruction result via the proposed method.

3.4. Robustness to low SNR speckle image and weak vibration intensity

When the defocusing level is large and the detection distance is long, the energy received by each pixel on the sensor plane is strongly reduced, resulting in a low SNR in the obtained speckle images. Benefiting from the large variation range of a pixel's gray value (from 0 to 255), the proposed method can still recover the vibration under such a challenging condition. In this experiment, we show that the proposed method is robust to low-SNR speckle images. Here, we detect an audio signal of counting through a paper cup placed beside the loudspeaker. The distance between the paper cup and the camera system is about 40 m, and f is 300 mm. Because the detection distance is long and the reflection from the paper cup is weak, the SNR of the speckle images is very low, as seen in Fig. 8(a). Fig. 8(b) is the same speckle image with manually tuned contrast to make the speckles visible. Moreover, the vibration of the paper cup is rather weak, as the paper cup is a secondary signal source. Nevertheless, the proposed method still works under this challenging condition. Fig. 8(c) and (d) are the spectra of the sample audio signal and the reconstruction result, respectively. The figures show that the sample signal is reconstructed by the proposed method. Note that criterion (4) mentioned above must be considered first for seed point selection in this experiment.

Fig. 8. (a) The original obtained speckle image with low SNR in the experiment. (b) The speckle image with manually tuned contrast for demonstration. (c) The spectrum of the sample audio signal. (d) The spectrum of the reconstruction result via the proposed method.

3.5. Performance under different detection distances

In this section, we show the SNR values of the results reconstructed by the proposed method at different detection distances: 5 m, 10 m, and 50 m. During the tests, an audio signal with a constant frequency of 600 Hz is used as the sample signal. As before, the test OOI is the loudspeaker. Fig. 9 shows the SNR values of the reconstruction results. As shown in Fig. 9, when the detection distance is within 10 m, the SNR of the reconstruction results can exceed 20 dB. Moreover, the reconstruction result still reaches an SNR of about 10 dB when the detection distance is 50 m. All the above experiments show that the proposed method is highly efficient in real-world tests.

Fig. 9. SNR values of the reconstruction results under different detection distances via the proposed method.
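The text reports SNR values in dB for a 600 Hz test tone but does not spell out how they are computed. One common spectral estimate, shown here purely as an assumption and not as the authors' procedure, takes the power in a narrow band around the tone as signal and the power in all other bins as noise. The function name `tone_snr_db`, the band half-width, and the synthetic test signal are all hypothetical.

```python
import numpy as np

def tone_snr_db(x, fs, f0, half_width_hz=5.0):
    """Spectral SNR estimate (dB) for a signal dominated by a tone at f0.

    Power within +/- half_width_hz of f0 counts as signal; everything else
    (after removing the mean) counts as noise.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = np.abs(freqs - f0) <= half_width_hz
    return 10.0 * np.log10(power[in_band].sum() / power[~in_band].sum())

# Synthetic check: unit-amplitude 600 Hz tone (power 0.5) plus white noise
# of power 0.005, i.e. a nominal SNR of 20 dB.
rng = np.random.default_rng(1)
fs = 3800
t = np.arange(fs) / fs
noisy = np.sin(2 * np.pi * 600 * t) + np.sqrt(0.005) * rng.standard_normal(fs)
snr = tone_snr_db(noisy, fs, 600.0)
print(f"estimated SNR: {snr:.1f} dB")
```

Under this definition the estimate depends on the chosen band width and record length, so values from different setups are comparable only if those settings are kept fixed.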

4. Conclusion

In this paper, we have proposed and experimentally demonstrated a novel method for micro-vibration recovery and audio signal reconstruction from speckle image sequences captured with a high-speed camera. In the proposed method, we first adaptively select the best seed points, and then use the gray-value variations of the seed points between adjacent frames to recover the micro-vibration information with high sensitivity. Finally, we fuse the gray-value variation information of the different seed points and reconstruct the audio signal with a high SNR. The experiments show that the proposed method is robust to a slight defocusing level. In addition, it is robust to low-SNR speckle images and weak vibration intensity. The SNR analysis quantitatively illustrates that the proposed method recovers the vibration information with high quality. In conclusion, the proposed approach is fast and sensitive, and achieves a high SNR in the experimental results. Moreover, our approach not only accelerates the processing towards real-time operation but can also be used at much longer detection distances by relaxing the constraints on the defocusing level. Experiments show that the proposed method obtains reconstruction results with high SNR values without requiring a large defocusing level when imaging laser speckle patterns at long detection distances.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (Grant no. 61371144).

References

[1] Y. Beiderman, I. Horovitz, M. Teicher, J. Garcia, Z. Zalevsky, V. Mico, N. Burshtein, Remote estimation of blood pulse pressure via temporal tracking of reflected secondary speckles pattern, J. Biomed. Opt. 15 (2010) 061707-1.
[2] P. Li, S. Ni, L. Zhang, S. Zeng, Q. Luo, Opt. Lett. 31 (2006) 1824.
[3] J.D. Briers, S. Webster, J. Biomed. Opt. 1 (1996) 174.
[4] H.J. Tiziani, Opt. Commun. 5 (1972) 271.
[5] J. Leendertz, J. Phys. E: Sci. Instrum. 3 (1970) 214.
[6] R. Nassif, C.A. Nader, F. Pellen, G. Le Brun, M. Abboud, B. Le Jeune, Appl. Opt. 52 (2013) 7564.
[7] P. Yapp, Inf. Secur. Techn. Rep. 5 (2000) 23.
[8] L. Hasan, N. Yu, J.A. Paradiso, Proceedings of the Conference on New Interfaces for Musical Expression, National University of Singapore, 2002, pp. 1–6.
[9] Z. Zalevsky, Y. Beiderman, I. Margalit, S. Gingold, M. Teicher, V. Mico, J. Garcia, Opt. Express 17 (2009) 21566.
[10] Y. Beiderman, Y. Azani, Y. Cohen, C. Nisankoren, M. Teicher, V. Mico, J. Garcia, Z. Zalevsky, Recent Pat. Signal Process. 2 (2010) 6.
[11] C.-C. Wang, S. Trivedi, F. Jin, V. Swaminathan, P. Rodriguez, N.S. Prasad, Appl. Phys. Lett. 94 (2009) 051112.
[12] A.A. Kamshilin, Y. Iida, S. Ashihara, T. Shimura, K. Kuroda, Linear sensing of speckle-pattern displacements using a photorefractive GaP crystal, Appl. Phys. Lett. 74 (1999) 2575.
[13] P. Heinz, E. Garmire, Opt. Lett. 30 (2005) 3027–3029.
[14] A. Veber, A. Lyashedko, E. Sholokhov, A. Trikshev, A. Kurkov, Y. Pyrkov, A. Veber, V. Seregin, V. Tsvetkov, Appl. Phys. B: Lasers Opt. 105 (2011) 613.
