Simultaneous detection of blink and heart rate using multi-channel ICA from smart phone videos


Biomedical Signal Processing and Control 33 (2017) 189–200


Biomedical Signal Processing and Control journal homepage: www.elsevier.com/locate/bspc

Chao Zhang a,b, Xiaopei Wu a,b,∗, Lei Zhang a,b, Xuan He a,b, Zhao Lv a,b

a Key Laboratory of Intelligent Computing & Signal Processing, Hefei, 230601, China
b School of Computer Science and Technology, Anhui University, Hefei, 230601, China

Article info

Article history: Received 23 June 2016; received in revised form 5 November 2016; accepted 29 November 2016

Keywords: Blink; Heart rate; Independent component analysis; PhotoPlethysmoGraphy (PPG); SOBI

Abstract

Physiological indexes such as blink frequency and heart rate reflect the physical and mental state of a person. This paper presents an algorithm for the simultaneous detection of eye blinks and heart rate using multi-channel ICA. The input is video captured with smart phones while subjects sit without large motions. Statistical information from the R, G and B components of the eyes and the surrounding facial region is used to discriminate the different sources mixed in each image. The proposed algorithm extracts the eye blink and cardiac signals as separate sources at the same time using 6-channel SOBI, without any other complex processing. A kurtosis-based method is also proposed to automatically select the blink and cardiac signals from the separated outputs. Different ICA algorithms and channel numbers, as well as a series of head movement modes, are used to test the robustness and accuracy of the algorithm. Experiments on twenty subjects show that multi-channel ICA can precisely separate the eye blink and cardiac signals in a less complex way. © 2016 Published by Elsevier Ltd.

1. Introduction

Various types of signals or characteristics generated by normal body functions can be used to identify the physical or mental state of a human being. However, the traditional ways of acquiring those signals can be very expensive, or even impossible without sophisticated hardware that brings not only complex operation but also discomfort to users. Researchers have therefore been motivated to explore low-cost yet accurate and user-friendly solutions for physiological monitoring. Photoplethysmography (PPG), an optical technique that measures changes in the light reflected through the skin [1,2], has become increasingly popular because it is low-cost and noninvasive and needs no electrodes. Skin images captured by an optical device contain tiny color changes caused by heartbeat, breathing, eye blinking or other physiological activities that generate pulse-wave signals [3]. To capture these signals, which are hardly visible to the naked eye, researchers have developed PPG-based approaches for the automatic computation of important physiological parameters from videos of the skin surface [4,5]. It was first revealed that the heart

∗ Corresponding author at: School of Computer Science and Technology, Anhui University, Hefei, 230601, China. E-mail address: [email protected] (X. Wu). http://dx.doi.org/10.1016/j.bspc.2016.11.022 1746-8094/© 2016 Published by Elsevier Ltd.

rate (HR) signal can be extracted from a sequence of video images, and in [6] Pelegris et al. verified that result by comparing it with measurements acquired using a pulse oximeter. Their work validated that blood volume pulse (BVP) detection, which can be understood as extracting the BVP signal from raw observation data for HR (or other information) estimation, can be done by analyzing visual information. Verkruysse et al. [7] found that although the R, G and B components in RGB color space all contain PPG information, the G channel contains the strongest signal. In [8], both spatial and temporal processing of the video sequence were used to amplify the heartbeat pulse in facial videos and make it visible to the naked eye. Besides HR, respiratory rate (RR) [9] and oxygen saturation [10] have also been reported to be measurable by PPG, but current studies focus mostly on HR detection. In [11], a high-speed industrial camera was used for image acquisition and a Kalman filter was employed to improve the robustness of heart rate estimation. Poh et al. [12] employed Independent Component Analysis (ICA) [13] to extract the BVP signal from PPG. In their method, 3-channel ICA treats the R, G and B components of each image as observed mixtures coming from different sources, and these sources can be well estimated from the observed mixtures. Fig. 1 shows the process of Poh’s 3-channel ICA based HR estimation algorithm. This is a constructive and promising method, although they did not explore whether more sources can be reconstructed. As the latest development of the ICA based


C. Zhang et al. / Biomedical Signal Processing and Control 33 (2017) 189–200

Fig. 1. 3-Channel ICA based algorithm to reconstruct the BVP waveform. (a) Video stream. (b) 3-channel observation matrix built from the R, G and B components of the selected facial region. (c) ICA used to separate the BVP signal from the observation. (d) Three outputs including the separated BVP signal.

methods, Daniel et al. [14] used a 3-channel ICA that was a modified version of Poh’s method to realize remote (at a distance of 3 m) BVP extraction. Some recent studies, for example [15,16], argued that ICA has a high computational complexity and that the independent sources reconstructed using ICA were only similar to, or even not as good as, the raw signal of the G component. In this paper, we verify the effectiveness of ICA and extend the ICA based approach to precisely measure multiple physiological parameters rather than HR only. ICA’s potential for extracting important physiological parameters from optical observations is further explored by investigating whether eye blinks as well as HR can be reconstructed from the changes in an image stream recorded by a smart phone. The rest of this paper is organized as follows. In Section 2, the basic theory of ICA is presented. Details of the proposed algorithm are described in Section 3. In Section 4 we conduct experiments and discuss the results. Finally, we conclude the paper in Section 5.

2. Theory

Independent component analysis, a multi-dimensional statistical analysis method, has been widely used in biomedical signal processing [17,18] and image processing [19,20]. Under the assumption that the sources are statistically independent, ICA tries to recover independent signals from a set of observations that are composed of linear or nonlinear mixtures of the underlying sources [21]. In a standard ICA based HR detection task, the underlying source signal of interest is the cardiovascular blood volume pulse signal that propagates throughout the tissue with pulse-wave information. During a physical process such as the cardiac cycle, the change of blood pressure in the microvasculature ultimately changes the pixel values of the skin surface. When a video of the skin is recorded with an optical sensor, a mixture of different sources representing different physical processes is picked up.
Let $s = [s_1(t), s_2(t), \ldots, s_M(t)]^T$ be a matrix of statistically independent zero-mean sources, where $M$ equals 3 in a standard task. The instantaneous ICA model of the observed signals from the image sensors is

$$x = As \qquad (1)$$

where $x = [x_1(t), x_2(t), x_3(t)]^T$ represents the time series of the recorded signals on the different channels and $A$ is a 3 × 3 scalar mixing matrix. Each element $a_{ij}$ of $A$ denotes the transmission coefficient from the $j$th source to the $i$th sensor. The aim of ICA is to estimate the sources by finding a separation matrix $W$ that approximates $A^{-1}$ and hence de-mixes the mixture signal, i.e.,

$$y = Wx \qquad (2)$$
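As a concrete illustration of Eqs. (1) and (2), the following minimal sketch mixes three synthetic sources and recovers them with the ideal separation matrix. The sources, the mixing matrix values and the variable names are assumptions made for illustration only; in a real task A is unknown and W has to be estimated by an ICA algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three zero-mean sources s(t); hypothetical stand-ins, T samples each.
T = 1000
s = rng.uniform(-1.0, 1.0, size=(3, T))
s -= s.mean(axis=1, keepdims=True)

# A 3 x 3 scalar mixing matrix; a_ij weights source j in channel i.
A = np.array([[1.0, 0.5, 0.3],
              [0.4, 1.0, 0.6],
              [0.7, 0.2, 1.0]])

x = A @ s               # Eq. (1): observed mixtures
W = np.linalg.inv(A)    # ideal separation matrix (only known here because A is)
y = W @ x               # Eq. (2): de-mixed outputs

max_error = np.abs(y - s).max()
```

With the exact inverse the outputs match the sources up to rounding error; an estimated W would recover them only up to order and scale.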

The output y is an estimate of the vector s containing the underlying source signals. To estimate the separation matrix, a cost function is generally needed to measure the non-Gaussianity of y in each iteration until a proper W is found.

3. Description of the study

3.1. Eye blink detection

Unlike conventional PPG, new PPG approaches, named imaging PPG (iPPG) [22] or camera-based PPG (cbPPG) [11], use natural lighting and offer the possibility of capturing spatio-temporal information related to subjects [11]. From the viewpoint of image processing, lighting changes directly lead to fluctuations of the pixel values in an image stream captured by an optical device. Physical background aside, the PPG method is fundamentally a technique for extracting regularities from fluctuations of the pixel values. Thus, although BVP has gained much attention, it should not be the only factor that causes lighting (image) changes. It is reasonable to expect that other physical processes, such as eye blinks, also bring detectable image changes.

Blinking is an ordinary body activity that can be seen as an important body signal because of its association with people’s mental states such as fatigue, lapses of attention and stress [23,24]. Blink frequency, for example, is believed to give insight into not only people’s mental state but also their behavior [25]. There are two main ways to capture blinks as a typical eye motion. One places electrodes around the eyes and measures the bioelectricity when an eye blink occurs [26]; this method is known as ElectroOculoGraphy (EOG). The other measures eye blinks from video streams using computer vision techniques. The eye video may come from the camera of an eye tracker [27] or a common smart phone [28]. For EOG based methods, the electrodes mounted on the user’s head are seen as too obtrusive and uncomfortable, and it is not practical to wear those devices in everyday life.
For visual methods that detect eye blinks, a sophisticated device is not required, but a specific algorithm is needed for blink detection. In this paper, we propose a method that detects both HR and blinks simultaneously with low complexity in both device and algorithm. Eye detection and localization, which are indispensable for traditional eye state analysis, are not necessary in our algorithm. Tiny motions can be tolerated, and we find that a facial area containing only a single eye is enough for blink and heart rate detection. To simplify the apparatus and keep the complexity of the system as low as possible, as in [11] and [16], we use a smart phone (with a front camera) placed in front of the user to capture the video stream. The distance between the camera and the subject is kept at around 15 cm to 30 cm to obtain a relatively accurate area of the facial region. The video recording setup is illustrated in Fig. 2. A smaller area containing the subject’s eyes is selected manually from the video. That smaller region, not the whole face, is the area we are really concerned with. In our work, the

subject is allowed to move slightly during the video recording, and the angle of the subject’s face towards the camera can be chosen freely as long as the subject’s eyes remain in the video.

Fig. 2. Demonstration of the video recording.

3.2. Observation vector generation

The major task of our work is to detect blinks as well as BVP from facial videos, so the region of interest (ROI) must contain the subject’s eyes. If the subject is not strictly motionless, a MeanShift [29] tracker is employed to update the ROI in the following frames after the ROI has first been fixed. In fact, the subjects do not need to keep strictly motionless; they can just sit and act as usual with tiny motions. The MeanShift tracker enables the proposed algorithm to tolerate subjects’ tiny motions and hence gives the subjects a more comfortable and relaxed experience.

A conversion from the 2-D ROI to a multi-channel observation is needed to build the input data of ICA. Each pixel in a color image obtained from the camera consists of a mixture of red, green and blue components. Thus, it seems natural and reasonable to treat BVP recovery as a 3-channel ICA task, as Poh did in [12]. However, if we assume more potential sources, that channel number is not enough. Fig. 3 presents the waveforms of observation signals containing both BVP and blinks. The peaks corresponding to BVP and blinks are easily found as the circled parts in Fig. 3. In fact, the number of components is much larger than the number of channels, which means that other sources, such as the respiratory signal, the oxygen saturation signal or even artifact signals, contribute to the observations. If a small channel number is selected in a multi-source separation task, the separated components are further mixtures of sources rather than separated independent sources [30]. If more channels are available, we are able not only to extract the BVP but also to dig out more information, such as blinks, from the observations. In this paper, we apply 6-channel ICA to the observation data.

In 6-channel ICA, the ROI in each frame is divided into two parts at the middle of its width. Each part yields a set of red, green and blue points by spatially averaging all pixels in the part, so that a 6 × T observation matrix x is obtained:

$$
x = \left\{ \begin{array}{ll}
\frac{1}{2MN} \sum_{x \in R_1} x_{i,j}(t), & i = 1, \ldots, M;\ j = 1, \ldots, N/2 \\
\frac{1}{2MN} \sum_{x \in R_2} x_{i,j}(t), & i = 1, \ldots, M;\ j = N/2, \ldots, N \\
\frac{1}{2MN} \sum_{x \in G_1} x_{i,j}(t), & i = 1, \ldots, M;\ j = 1, \ldots, N/2 \\
\frac{1}{2MN} \sum_{x \in G_2} x_{i,j}(t), & i = 1, \ldots, M;\ j = N/2, \ldots, N \\
\frac{1}{2MN} \sum_{x \in B_1} x_{i,j}(t), & i = 1, \ldots, M;\ j = 1, \ldots, N/2 \\
\frac{1}{2MN} \sum_{x \in B_2} x_{i,j}(t), & i = 1, \ldots, M;\ j = N/2, \ldots, N
\end{array} \right\}
= \begin{bmatrix}
x_{R1}(1) & x_{R1}(2) & \cdots & x_{R1}(T) \\
x_{R2}(1) & x_{R2}(2) & \cdots & x_{R2}(T) \\
x_{G1}(1) & x_{G1}(2) & \cdots & x_{G1}(T) \\
x_{G2}(1) & x_{G2}(2) & \cdots & x_{G2}(T) \\
x_{B1}(1) & x_{B1}(2) & \cdots & x_{B1}(T) \\
x_{B2}(1) & x_{B2}(2) & \cdots & x_{B2}(T)
\end{bmatrix}
= \begin{bmatrix}
x_{R1}(t) \\ x_{R2}(t) \\ x_{G1}(t) \\ x_{G2}(t) \\ x_{B1}(t) \\ x_{B2}(t)
\end{bmatrix}
= \begin{bmatrix}
x_1(t) \\ x_2(t) \\ x_3(t) \\ x_4(t) \\ x_5(t) \\ x_6(t)
\end{bmatrix}
\qquad (3)
$$
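The ROI partition behind Eq. (3) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_observation_matrix`, the `(T, M, N, 3)` array layout and the RGB channel order are assumptions. The sketch uses a plain spatial mean over each half; any constant scale factor is removed anyway by the normalization that follows in Section 3.3.

```python
import numpy as np

def build_observation_matrix(roi_frames):
    """Turn cropped ROI frames into a 6 x T observation matrix.

    roi_frames: array of shape (T, M, N, 3), one ROI per frame,
                channels assumed to be in R, G, B order.
    Rows of the result: R1, R2, G1, G2, B1, B2 (1 = left half, 2 = right half).
    """
    roi = np.asarray(roi_frames, dtype=float)
    half = roi.shape[2] // 2                       # split at the middle of the width
    left = roi[:, :, :half, :].mean(axis=(1, 2))   # shape (T, 3)
    right = roi[:, :, half:, :].mean(axis=(1, 2))  # shape (T, 3)
    rows = []
    for c in range(3):                             # R, G, B
        rows.append(left[:, c])
        rows.append(right[:, c])
    return np.vstack(rows)
```

For a video of T frames the result has one row per color component and ROI half, which is the input expected by a 6-channel separation step.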

In Eq. (3), M and N respectively indicate the height and the width of the frame, while T stands for the number of frames. Applying (3) to the raw videos, the set of red, green and blue points obtained from each frame constitutes a 6 × T observation matrix x. After conversion to the 6-channel observation matrix x, the data were filtered with a high-pass FIR filter (0.8 Hz). Fig. 4 provides an overview of the steps involved in our approach to recover multiple physiological parameters from a smart phone video. The ROI partition method is illustrated in Fig. 4(b) and (c).

3.3. SOBI based ICA

As a commonly used pre-processing step, the raw data are normalized as follows:

$$z_i(t) = \frac{x_i(t) - \mu_i}{\sigma_i} \qquad (4)$$
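The two pre-processing steps, high-pass filtering and the per-channel normalization of Eq. (4) with the channel mean and standard deviation, might be sketched as below. The filter design is an assumption: the text only states a high-pass FIR filter at 0.8 Hz, so the windowed-sinc design, the Hamming window, the 151 taps and the function names are illustrative choices.

```python
import numpy as np

def highpass_fir(cutoff_hz, fs, numtaps=151):
    """Windowed-sinc high-pass FIR taps (numtaps must be odd)."""
    n = np.arange(numtaps) - (numtaps - 1) / 2.0
    lowpass = np.sinc(2.0 * cutoff_hz / fs * n) * np.hamming(numtaps)
    lowpass /= lowpass.sum()                 # unit gain at DC
    highpass = -lowpass                      # spectral inversion:
    highpass[(numtaps - 1) // 2] += 1.0      # delta minus low-pass
    return highpass

def preprocess(x, fs=30.0, cutoff_hz=0.8):
    """High-pass filter each row of x, then apply the z-score of Eq. (4)."""
    h = highpass_fir(cutoff_hz, fs)
    filtered = np.array([np.convolve(ch, h, mode="same") for ch in x])
    mu = filtered.mean(axis=1, keepdims=True)
    sigma = filtered.std(axis=1, keepdims=True)
    return (filtered - mu) / sigma
```

Each output row has zero mean and unit standard deviation, so channels with different brightness levels enter the separation step on an equal footing.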

In Eq. (4), $\mu_i$ and $\sigma_i$ are the mean and standard deviation of $x_i(t)$, respectively, for each $i = 1, 2, \ldots, 6$. Six independent source signals are then estimated using second-order blind identification (SOBI) based ICA [31]. Based on the joint diagonalization of a set of covariance matrices, SOBI separates a set of sources relying only on stationary second-order statistics. It minimizes the relatedness among the separated sources that constitute the observed mixtures by exploiting the detailed temporal structure present in consecutive video frames. ICA is able to extract potential regularities from irregular observation signals. Treated as two independent sources, BVP and eye blinks can be extracted from the pixel value fluctuations that they bring into the image sequence. After separation by SOBI, the output components are independent of each other and are regarded as estimates of the original sources.

3.4. Blink and BVP signal selection by kurtosis

ICA has the fundamental limitation of permutation ambiguity: the order of its outputs is arbitrary. The separated blink and BVP signals may therefore appear in different output channels if the separation process consists of several sub-parts. When you separate


Fig. 3. Potential sources in observations.

a short-term mixture by ICA, you may select the target signal by visual inspection. But if you are going to separate a long-term mixture that has to be divided into several blocks, you must identify the blink and BVP signals in the outputs of every block. Here we propose a signal selection method that automatically selects the blink and BVP signals by measuring the kurtosis of the outputs:

$$\mathrm{Kurt}(z) = \frac{E(z^4)}{\left(E(z^2)\right)^2} - 3 \qquad (5)$$
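A minimal sketch of kurtosis-based selection using Eq. (5) is given below. The selection rule is an assumption made for illustration: the blink channel is taken as the output with the largest time-domain kurtosis, and the BVP channel as the remaining output whose power spectrum has the largest kurtosis. The function names are hypothetical.

```python
import numpy as np

def kurt(z):
    """Excess kurtosis of Eq. (5), computed after removing the mean."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()
    return np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0

def select_blink_and_bvp(y):
    """Return (blink_index, bvp_index) for separated outputs y (channels x T)."""
    time_k = np.array([kurt(ch) for ch in y])
    blink_idx = int(np.argmax(time_k))            # sparse pulses: high time kurtosis
    spec_k = np.array([kurt(np.abs(np.fft.rfft(ch - ch.mean())) ** 2) for ch in y])
    spec_k[blink_idx] = -np.inf                   # do not pick the same channel twice
    bvp_idx = int(np.argmax(spec_k))              # single spectral peak: high spectral kurtosis
    return blink_idx, bvp_idx
```

Running the selection on every block gives a consistent labeling of the outputs even though the output order may change from block to block.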

Fig. 4. Overview of the 6-channel ICA based algorithm to reconstruct the BVP and blink waveforms. (a) Video frames. (b) ROI of concern and the ROI partition used to obtain a 6-row observation matrix. (c) 6-channel ICA is applied to reconstruct the potential sources. In this figure, the BVP is visible in the fourth output channel and the blink is in the third output channel.


Table 1
Parameters of SOBI.

Parameter                                      Value
Number of time delayed correlation matrices    200
Number of sources                              6
Number of sensors                              6
Threshold for convergence error                2.5 × 10^−4
Zero mean the data                             needed
Pre-whiten the data                            needed
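SOBI jointly diagonalizes many time-delayed correlation matrices (200 in Table 1). As a much simpler stand-in that conveys the same second-order idea, the sketch below implements the single-lag AMUSE procedure: whiten the data, then diagonalize one symmetrized lagged covariance matrix. It is not the SOBI routine used in this work; the function name `amuse`, the chosen lag and the synthetic sinusoidal sources are assumptions for illustration.

```python
import numpy as np

def amuse(x, lag=1):
    """Single-lag second-order separation (AMUSE), a simplified relative of SOBI.

    x: observations, shape (channels, T). Returns (y, w) with y = w @ x_centered.
    """
    x = x - x.mean(axis=1, keepdims=True)          # zero-mean the data
    t = x.shape[1]
    cov = x @ x.T / t
    d, e = np.linalg.eigh(cov)
    whitener = (e / np.sqrt(d)).T                  # D^(-1/2) E^T, pre-whitening
    z = whitener @ x
    c_lag = z[:, :-lag] @ z[:, lag:].T / (t - lag) # time-delayed covariance
    c_lag = 0.5 * (c_lag + c_lag.T)                # symmetrize before eigh
    _, u = np.linalg.eigh(c_lag)
    w = u.T @ whitener                             # separation matrix
    return w @ x, w

# Synthetic check: three sinusoids with different frequencies, linearly mixed.
fs = 30.0
n = np.arange(900)
s = np.vstack([np.sin(2 * np.pi * f * n / fs) for f in (0.3, 1.2, 3.0)])
A = np.array([[1.0, 0.5, 0.3],
              [0.4, 1.0, 0.6],
              [0.7, 0.2, 1.0]])
y, w = amuse(A @ s, lag=5)
corr = np.abs(np.corrcoef(np.vstack([s, y]))[:3, 3:])
```

Each row of `corr` contains one entry close to 1, showing that every source is recovered in some output channel up to order and sign. Using many lags at once, as SOBI does, makes the separation less dependent on the choice of a single delay.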

In Eq. (5), z is a zero-mean random variable that contains the time-domain or frequency-domain points from a single output channel. Kurtosis is widely used to measure the Gaussianity of a signal. BVP has a natural peak in its power spectrum indicating the heart rate, and hence has a large spectral kurtosis. A successfully separated blink signal consists of a set of standard pulses, which can be easily discriminated by time-domain kurtosis.

4. Experiments and discussion

4.1. Experimental setup

We took videos at different times from 20 subjects, 12 males and 8 females. At least 3 videos were taken of each subject. The videos were taken indoors, either in the daytime with relatively consistent ambient sunlight entering the lab through windows as the source of illumination, or at night with an artificial illumination source. An iPhone 6 and an LG G3 were used to collect the facial videos. During the experiment, the subjects were seated at a table in front of the smart phone at a distance of approximately 0.15-0.3 m between the phone camera and their face. During video recording, subjects were asked either to blink every 5 s or to blink freely, and they were not required to keep strictly motionless. Spontaneous motions, such as eye blinks and normal tiny body motions, were allowed, but the subject had to guarantee that his/her face area, including both eyes, was captured in the video. All videos were recorded at 30 frames per second (fps) with a pixel resolution of 1280 × 720, saved in .MOV format on the smart phone and then converted to .AVI/MP4 format on a PC. All blinks in the videos were manually checked as the ground truth, and a pulse oximeter (PHILIPS DB12) was clipped on

Fig. 5. HR and blink detection results of 3 subjects. (a1) − (c1) Subject’s image. (a2), (a3) − (c2), (c3) Blink and BVP waveforms detected by ICA. (a4)-(c4) Power spectrums of separated BVPs.

subject’s finger to obtain the true heart rates. We manually wrote down the readings of the pulse oximeter every minute. By analyzing the videos offline, we estimated the HR in every single minute and compared the results to the readings of the pulse oximeter. The parameters of SOBI are given in Table 1. All the videos were analyzed offline on the MATLAB platform.

4.2. Experiment 1. Simultaneous detection of HR and blink

Ten subjects were involved in this experiment. Their ages ranged from 6 to 38. Videos lasting about 1 min were recorded with a fixed blink frequency for 3 different subjects. All further analysis was performed as described in Section 3. The separation results of 6-channel SOBI are shown in Fig. 5. The red bounding boxes in Fig. 5(a1)–(c1) represent the selected ROIs. The second column of

Fig. 6. Calculation of blink duration and blink frequency.


Fig. 7. Comparison of HRs obtained from proposed algorithm, a wristband and a pulse oximeter.

Fig. 5 shows the separation results corresponding to blink and BVP. The last column shows the power spectrums of the separated BVP signals, which are shown in the second row of Fig. 5(a3)–(c3). Distinguishable peaks in the power spectrums demonstrate the successful estimation of heart rates. The average pixel value of the eye region increases greatly when a blink happens, so a series of peaks can be observed in the separated blink signal. These peaks, which depict the blinking process distinctly, make blink detection convenient. Meanwhile, the separated BVP waveforms were

also relatively standard in shape, which made the heart rate calculation more accurate. From separated blink waveforms like those in Fig. 5(a3)–(c3), more information about blinks, such as blink frequency and duration, can be explored. Fig. 6 gives the waveform of a single blink and the corresponding original images. Two features [32] are considered in Fig. 6: the duration at 50% (D50), defined as the duration from the half-rise amplitude to the half-fall amplitude, and the blink frequency, which is the number of blinks during a defined time period (one minute). A threshold, which is either an empirical value or 80% of the averaged amplitude of the M blinks before the current one, is used to convert the blink pulses to a square wave for blink counting.

Smart wristbands are becoming a widely accepted way to monitor people’s health statistics such as heart rate. They offer users a compromise between wearing comfort and acceptable measurement performance. The HR estimation of the proposed algorithm was therefore further validated with the use of a wristband. Videos from 7 subjects were used, each lasting 5 min. Each subject had a pulse oximeter clipped on his/her finger and wore a wristband (lifesense mambo) during the video recording. Readings of the pulse oximeter and the wristband were recorded synchronously with the videos. Results of the comparison are presented in Fig. 7. It is clear that there is almost no difference between the results of the proposed algorithm and the pulse

Fig. 8. Results comparison between 3-channel SOBI and 6-channel SOBI.


Fig. 9. Detection results comparison of 4 different ICA algorithms. (a1) Power spectrum calculated from the detected BVP by SOBI. (a2) Power spectrum calculated from the detected BVP by FastICA. (a3) Power spectrum calculated from the detected BVP by Infomax. (a4) Power spectrum calculated from the detected BVP by JADE. (b1) Blink and BVP waveforms detected by SOBI. (b2) Blink and BVP waveforms detected by FastICA. (b3) Blink and BVP waveforms detected by Infomax. (b4) Blink and BVP waveforms detected by JADE.

oximeter. However, a relatively big difference can be observed in the results of the wristband, especially for Subject 3 and Subject 6. By comparing our results with those produced by a pulse oximeter and a wristband, the capability of the proposed algorithm to estimate HR was verified. Meanwhile, we found that, compared to a pulse oximeter, the accuracy of the wristband still needs to be improved.
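The blink counting and D50 measurement described in this experiment can be sketched as follows. This is an illustrative sketch under stated assumptions: a fixed empirical threshold is used (the adaptive 80% rule is not implemented), the blink waveform is assumed to have a zero baseline so that the half amplitude is half of the peak value, and the function name `detect_blinks` is hypothetical.

```python
import numpy as np

def detect_blinks(blink, fs, threshold):
    """Count blinks and measure D50 from a separated blink waveform.

    Returns (count, d50_seconds, blinks_per_minute).
    """
    blink = np.asarray(blink, dtype=float)
    square = blink > threshold                     # pulses converted to a square wave
    padded = np.concatenate(([False], square, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)            # first sample above threshold
    ends = np.flatnonzero(edges == -1)             # one past the last sample above it
    d50 = []
    for a, b in zip(starts, ends):
        peak = a + int(np.argmax(blink[a:b]))
        half = 0.5 * blink[peak]                   # half amplitude, zero baseline assumed
        left = peak
        while left > 0 and blink[left - 1] >= half:
            left -= 1
        right = peak
        while right < blink.size - 1 and blink[right + 1] >= half:
            right += 1
        d50.append((right - left + 1) / fs)        # samples at or above half amplitude
    minutes = blink.size / fs / 60.0
    return len(starts), np.array(d50), len(starts) / minutes
```

The D50 value is quantized to the frame period, so at 30 fps its resolution is about 33 ms.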

4.3. Experiment 2. Comparison of different channel numbers

HR and blink detection using 3-channel and 6-channel SOBI for source separation were compared. The results of the comparison are shown in Fig. 8. The first column of Fig. 8 shows the results corresponding to 3-channel ICA and the second

Fig. 10. Results comparison of 6-channel ICA and PCA. (a) Extracted blink and BVP signals of ICA and PCA. (b) Power spectrum of the extracted BVP signal of PCA. (c) Power spectrum of the extracted BVP signal of ICA. (d) Other outputs of ICA. (e) Other outputs of PCA.


column of Fig. 8 shows the results corresponding to 6-channel ICA. The first and second rows, shown in blue and red, are the blink and BVP signals respectively. Although both 3-channel SOBI and 6-channel SOBI separate the blink signals well, 6-channel SOBI presents a smoother and more regular BVP waveform, which is critical for HR calculation. The last row of Fig. 8 shows the power spectrums calculated from the output BVPs. The result from 6-channel SOBI has fewer sub-peaks surrounding the main peak corresponding to HR, so 6-channel SOBI obtains a more accurate HR estimation.
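Reading the HR from the main peak of the BVP power spectrum, as done in these comparisons, could be sketched as below. The search band of 0.8 to 3.0 Hz (48 to 180 beats per minute) and the function name `estimate_hr_bpm` are assumptions for illustration; the lower limit simply mirrors the 0.8 Hz high-pass cutoff mentioned in Section 3.2.

```python
import numpy as np

def estimate_hr_bpm(bvp, fs, band=(0.8, 3.0)):
    """Heart rate in beats per minute from the strongest in-band spectral peak."""
    bvp = np.asarray(bvp, dtype=float)
    bvp = bvp - bvp.mean()
    power = np.abs(np.fft.rfft(bvp)) ** 2
    freqs = np.fft.rfftfreq(bvp.size, d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    peak_freq = freqs[mask][np.argmax(power[mask])]
    return 60.0 * peak_freq
```

For a 1-min segment at 30 fps the frequency resolution is 1/60 Hz, so the estimate is quantized to 1 beat per minute. Sub-peaks near the main peak, as seen with fewer channels, make the position of the maximum less reliable.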

4.4. Experiment 3. Comparison of different ICA algorithms

The ICA algorithm adopted in our method determines the accuracy of blink detection and HR estimation. Many ICA algorithms have been developed, and some of them may suit our task better than others. The purpose of this experiment is to verify the performance of SOBI based ICA. Four popular ICA algorithms, FastICA [33], Infomax [34], JADE [35] and SOBI, were used to separate the 6-channel observed mixture, and their performance was compared. The video from the same subject as in Experiment 2 was used. Fig. 9(a1)–(a4) show the power spectrums calculated from the detected BVPs and Fig. 9(b1)–(b4) present the blink and BVP signals detected by the different ICA algorithms. The blink and BVP waveforms recovered by SOBI are of the best quality. Although the blink waveforms separated by the other 3 methods are also acceptable, the BVP signals they obtained differ significantly from that of SOBI. Furthermore, an inaccurate BVP waveform contains artifact components, which in turn results in inaccurate HR calculation. As a result, SOBI outperforms the other 3 algorithms in our task of HR estimation.

4.5. Experiment 4. Comparison of ICA and PCA

As a widely used orthogonal linear transformation, PCA transfers the data to a new space of lower dimensionality by linear projection to reveal latent information and reduce the number of parameters. Since PCA and ICA are similar in that both try to find potential statistical patterns in the data, researchers are interested in comparing the performance of these two methods in the same task [20,36]. In this experiment, our results are compared to those of PCA under exactly the same working conditions. The results can be seen in Fig. 10. Fig. 10(a) shows the blink and BVP signals extracted by ICA and PCA. From the blue waveforms we noticed that the blink and BVP signals are not well separated, which may lead to an incorrect estimation of the heart rate frequency, as can be observed in Fig. 10(b). In contrast to PCA, SOBI, which is employed in our algorithm, captures second-order statistics and projects the mixture onto output components that are as statistically independent as possible. The measurement of statistical independence in ICA guarantees the quality of the reconstructed sources. In this experiment we therefore found that ICA outperforms PCA in the task of multi-source estimation using PPG signals.
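For reference, the PCA baseline in this comparison amounts to an orthogonal projection onto the eigenvectors of the zero-lag covariance matrix. The sketch below is a generic PCA, not the exact routine used in the experiment, and the function name `pca` is an assumption. It makes the contrast visible: the outputs are only decorrelated and ordered by variance, and no temporal structure is used.

```python
import numpy as np

def pca(x):
    """Project x (channels x T) onto its principal axes.

    Returns (components, variances), ordered by decreasing variance.
    """
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]              # largest variance first
    components = eigvecs[:, order].T @ x           # orthogonal projection
    return components, eigvals[order]
```

Because any rotation of decorrelated, equal-variance signals is again decorrelated, zero-lag decorrelation alone cannot in general identify the individual sources, which is consistent with the mixed waveforms reported for PCA.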

4.6. Experiment 5. Detection of dynamic heart rate changing

Dynamic changes in heart rate can also be detected by our approach. Two subjects were asked to run upstairs from the 1st floor to the 5th floor before a 7-min video was recorded for each of them. The videos were then divided into 7 pieces, named t1, t2, ..., t7. The heart rate was calculated in each piece using the proposed algorithm to obtain a dynamic heart rate curve, which is shown in Fig. 11. The subjects’ heart rates decreased from more than 100 to less than 90 in the 7 min after strenuous exercise. This

Fig. 11. Detection results of dynamic HR changing. (a) Demonstration of subjects’ physical exercise before video recording; (b) Two subjects’ HR changing curves.

phenomenon, in which the heart rate gradually slows down after physical exercise, is reasonable for any healthy person.

4.7. Experiment 6. Motion-tolerant performance of the proposed algorithm

In most PPG application conditions, it is difficult for subjects to keep strictly motionless even though they are always required to do so. Meanwhile, because the PPG method is data driven, changes of the data in the ROI may cause drastic performance deterioration. It is therefore necessary to test the effect of changes in the subject’s head pose on the proposed algorithm. The head rotation limits about the yaw and roll axes were examined, and the results obtained from 4 subjects are shown in Fig. 12. Fig. 12(a12) and (a22) give the separation results under an in-plane head rotation angle (about the yaw axis) of 30°. Although a decline in HR calculation accuracy was observed in Fig. 12(a13) and (a23), the separated waveforms, which were successfully reconstructed without great influence from the head movement, indicate that the proposed algorithm still worked under this rotation mode. However, as the angle about the yaw axis increased, the separated waveforms became worse, as shown in (b12), (b22), (c12), (c22), (b13) and (b23). (c13) and (c23) give the power spectrums calculated from BVPs under subjects’ in-plane head rotation angles of 45° and 75° respectively. In the modes shown in Fig. 12(b) and (c), the calculated HRs are incorrect compared with the results in Fig. 12(a13) and (a23). The increase in rotation angle leads to a greater discrepancy between the calculated HR and the true HR. Fig. 12(d12) and (d22) give the separation results under the subjects’ 30° vertical head rotation (about the roll axis), which often appears in daily videos. With


Fig. 12. Detection results under subjects’ different motion modes. (a11)-(a23) Results under subjects’ in-plane head rotation angle of 30°. (b11)-(b23) Results under subjects’ in-plane head rotation angle of 45°. (c11)-(c23) Results under subjects’ in-plane head rotation angle of 75°. (d11)-(d23) Results under subjects’ vertical head rotation angle of 30°.


Fig. 13. Results comparison using 3 different ROIs. (a) 3 different ROIs selected. (b) Power spectrums calculated from the detected BVPs. (c) and (d) Blink and BVP extracted from different ROIs.

the help of a MeanShift tracker, both blink and BVP signals were well reconstructed under this mode. However, in this mode, a stable tracking model is crucial to guarantee that the ROI is properly covered. Within a rotation angle of 30° about the yaw or roll axis, the method works approximately as well as for motionless subjects. The reduction in HR evaluation accuracy is mainly due to the drastic change of the ROI. A head rotation angle within 30° about the yaw or roll axis can be tolerated by our algorithm.

4.8. Experiment 7. Comparison of ROIs with different size

While running the experiments, we noticed that selecting a bigger or smaller ROI in the same video may not make a big difference to the final detection results. Moreover, a smaller ROI takes less time when the observation matrix is being built. We therefore considered whether a smaller ROI containing only a single eye can be used instead of a symmetric two-eye ROI to reduce the time cost of our algorithm. The 3 different ROIs used in this experiment are shown as 3 rectangles with different colors in Fig. 13(a), and the detection results from the different ROIs are

shown in corresponding colors in Fig. 13(b), (c) and (d). As is shown in Fig. 12(b) and (d), when a single-eye ROI was selected, the blink led to a great fluctuation of gray value, which makes the peak in the separated blink waveform more prominent and distinguishable. However, a smaller ROI covers a smaller facial region and hence brings less cardiovascular information into ICA, so the BVP waveform separated from a smaller ROI was not as smooth as that from a bigger ROI (see Fig. 13(c12), (c22) and (c32)). Nevertheless, the HRs calculated from the different ROIs, shown in Fig. 13(b), demonstrate that even a smaller ROI containing only one eye is capable of simultaneously detecting the BVP and blink. The average time consumed per frame was compared among the 3 ROIs mentioned above, and the results from 4 subjects are given in Fig. 14. As expected, a smaller ROI takes less time.

A quantitative evaluation was made as a summary of all the results. We employed absolute distance (AD)

$$\mathrm{AD} = \frac{\left| V_{\mathrm{ref}} - D\_HR \right|}{V_{\mathrm{ref}}} \qquad (6)$$

to measure the accuracy of HR estimation, where $V_{\mathrm{ref}}$ is the reference value given by the pulse oximeter and $D\_HR$ is the estimated HR. Moreover, we employed the Detection Rate (DR), which is calculated as

$$\mathrm{DR} = \frac{TP}{TP + FN} \qquad (7)$$

to measure the positive rate of blink detection. In (7), TP is the number of blinks that are correctly detected and FN is the number of blinks that really occurred but were not detected. Fig. 15(a) shows the quantified results of all the experiments. In total, 750 true blinks were observed in 70 videos of 20 subjects, and in most cases the proposed algorithm achieved a DR of more than 90% in detecting these blinks. Meanwhile, the proposed algorithm showed a similar performance in HR estimation: less distortion was observed in the separated BVP waveforms and more accurate HRs were estimated. The statistics indicate that the proposed algorithm performs well even under some head movement. It should be noted, however, that if the angle of the subject's head rotation increases beyond 30°, blink detection and HR estimation are no longer accurate, which indicates that the BVP and blink signals are affected by motion artifacts.

Fig. 14. Comparison of average time consumed per frame using 3 different ROIs.

As HR is usually of greater concern, a more precise evaluation was given to measure the stability of HR estimation in the above experiments:

$$\mathrm{STD} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left[ D\_HR(n) - R\_HR \right]^{2}} \qquad (8)$$

where $R\_HR$ is the reference HR calculated from the first 30 s of each video stream. The rest of the video (without the first 30 s) is then divided into N pieces, in each of which the HR is estimated as $D\_HR(n)$, n = 1, ..., N. Fig. 15(b) shows the STD calculated from all the HR estimation experiments. It is clear that 6-channel SOBI gave the most stable HR estimation compared with the other algorithms. Meanwhile, the proposed method is less sensitive to the subject's small movements; a moderate-intensity motion such as a 30° head rotation is tolerable.

5. Conclusion

A simultaneous blink and HR detection algorithm based on 6-channel SOBI is presented in this paper. We have demonstrated the feasibility of the proposed method for detecting blinks and a physiological parameter such as HR at the same time. The proposed algorithm is simple but quite effective: the quantified experimental results show a high DR and low AD and STD in most cases. However, some factors should be carefully considered for the successful application of the proposed method. First, the channel number of independent sources in ICA is important, since it fundamentally affects the reconstruction of the output waveforms. It is clear that 6-channel ICA outperforms 3-channel ICA in our experiments. Second, the proposed algorithm tolerates normal changes in head posture, but if the head rotates by more than 30° or even 45° about the vertical or horizontal axis, detection accuracy is reduced.

Fig. 15. Quantitative evaluation of overall results. (a) DR and AD in all the results; (b) STD of overall HR estimation.
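As an illustration of the three measures reported in Fig. 15, the following minimal Python sketch evaluates AD (Eq. (6)), DR (Eq. (7)) and STD (Eq. (8)). The function names and all numeric inputs are hypothetical and are not taken from the paper. STD is assumed here to be the root-mean-square deviation of the per-segment estimates D_HR(n) from the reference R_HR.

```python
import math


def absolute_distance(v_ref, d_hr):
    """Eq. (6): absolute HR error relative to the reference value V_ref."""
    return abs(v_ref - d_hr) / v_ref


def detection_rate(tp, fn):
    """Eq. (7): share of true blinks that were detected, TP / (TP + FN)."""
    return tp / (tp + fn)


def hr_stability(d_hr_segments, r_hr):
    """Eq. (8), assumed to be the RMS deviation of D_HR(n) from R_HR."""
    n = len(d_hr_segments)
    squared = sum((d - r_hr) ** 2 for d in d_hr_segments)
    return math.sqrt(squared / n)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print("AD :", absolute_distance(v_ref=72.0, d_hr=69.0))
    print("DR :", detection_rate(tp=45, fn=5))
    print("STD:", hr_stability([70.0, 74.0, 72.0], r_hr=72.0))
```

A lower AD and STD and a higher DR indicate better performance, which matches how the measures are read in the text.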


Acknowledgments

The research work described in this paper is supported by the National Nature Science Foundation (No. 61271352, No. 61401002), the Anhui Province Natural Science Foundation (No. 1408085QF125), the Anhui provincial natural science research project of colleges and universities (No. KJ2014A011), the Anhui University Academic and Technical Leaders Introduce Engineering Foundation (No. 02303203) and the open project of the Key Lab of Opto-electronic Information Acquisition and Manipulation, Ministry of Education, Anhui University (OEIAM201401).