Biomedical Signal Processing and Control 15 (2015) 49–59
Coarse-to-fine dot array marker detection with accurate edge localization for stereo visual tracking

Junchen Wang*, Etsuko Kobayashi, Ichiro Sakuma
Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
Article history: Received 30 July 2014; received in revised form 5 September 2014; accepted 23 September 2014.

Keywords: Marker detection; Dot feature; Sub-pixel edge localization; Image filtering; Stereo tracking
Abstract

We present a coarse-to-fine dot array marker detection algorithm which can extract dot features with high accuracy and low uncertainty. The contribution of this paper is twofold: one is a configurable dot array marker detection framework which enables real-time multi-marker tracking with a compact marker size (coarse detection); the other is a closed-form sub-pixel edge localization method, including its formulation and implementation (fine localization). For efficiency, the marker pattern together with the dot contours is first detected in a fast but coarse way, using simple thresholding and hierarchical contour analysis. If the marker pattern matches one of the predefined marker descriptors, sub-pixel edge point localization of the dot contours is performed within the detected marker region by searching for zero-crossings in the convolution of the marker image with a Laplacian-of-Gaussian (LoG) kernel. A closed-form solution is proposed to localize the "true" edge point in a 3 × 3 neighborhood of a candidate pixel by solving a quartic equation. The dot centers are finally extracted by ellipse fitting and re-ordered according to an orientation indicator. The algorithm was evaluated against both synthetic and real image data, and also in real applications where stereo visual trackers were implemented using the proposed marker detection algorithm. Experimental results show that (1) the marker detection algorithm yielded a feature detection error of less than 0.1 pixel with real-time performance; (2) the uncertainties in both localizing static 2-D dot features and 3-D pose tracking were markedly reduced by performing the sub-pixel localization; and (3) the feasibility of the marker tracking under stereo laparoscopic views was confirmed in an in vivo animal experiment.
1. Introduction

1.1. Background

Stereo visual tracking employs a calibrated stereo camera to localize a well-defined visual pattern (i.e., a visual marker) in real time with six degrees of freedom (DOF). The tracking algorithms follow a general framework: (1) detect features of the visual marker in the stereo images and match them over the image pair, (2) retrieve the features' 3-D information by triangulation, and (3) fit the marker's geometry to obtain the rotation and translation. The salient features encoded in a visual marker are usually corners and edges; therefore, a visual marker commonly consists of square and/or dot patterns. Compared with a square, which provides four corners, a dot carries less information (only its center). However, the dot center is retrieved by fitting (many) edge
points on dot’s contour (ellipse), and this can reduce the localization uncertainty incurred by image sensor noise. Of the visual markers, planar markers are most frequently used due to the ease of manufacture (printout), wide viewing angle, and minimum space occupation. By virtual of the current printing technology, a printed marker has enough geometric accuracy for most applications. Other than stereo visual tracking, a visual marker can also be used for camera calibration [1,2], (single camera) pose estimation [3] and some high level application such as augmented reality (AR) [4] and visual servoing [5]. Stereo visual tracking can be considered as a special case of pose estimation by introducing more constraints due to a second camera, which should be more accurate than single camera pose estimation under the same configuration. The essence of a visual marker in these computer vision tasks is to provide 3-D/2-D correspondences for solving a linear or nonlinear minimization problem. The first step relating to the visual marker is marker detection, i.e., extracting the encoded features and making them understandable. The accuracy and uncertainty of feature extraction will largely influence the final output according to error propagation. Take the stereo visual tracking as an example, 3-D
triangulation error of a feature in the depth direction is sensitive to the feature disparity, especially when the baseline is small or the marker is far from the camera. The uncertainty of localizing 2-D features is amplified through the triangulation procedure, resulting in large uncertainty in determining the depth. If the stereo tracking results are used to visualize a tracked object in a virtual scene (e.g., in navigation systems), large tracking uncertainty leads to obvious random fluctuation of the displayed object even if it actually remains static. In the context of marker-based surgical instrument tracking under endoscopic view in minimally invasive surgery (MIS), a visual marker is attached to the surgical instrument (e.g., a forceps manipulator [6] or an endoscopic ultrasound probe [7,8]) whose pose with respect to the endoscope inside the body is estimated by either PnP techniques (for a monocular endoscope) or stereo tracking (for a stereo endoscope). Limited by the small operative field inside the body, the visual marker is restricted to a small size, which deteriorates the accuracy and increases the uncertainty at a target point (e.g., the instrument tip) far from the marker features [9,10]. Additionally, the baseline of a stereo endoscope is typically small (several millimeters), which makes the triangulation procedure sensitive to image noise. Eventually, a small uncertainty in localizing feature points may cause relatively large uncertainty in the depth localization, which further results in large angular uncertainties around the marker plane. One improvement of visual marker detection is therefore to increase the accuracy and reduce the uncertainty of feature localization in the presence of noise. For the tracking task, the detection time is also an important issue that has to be considered.

1.2. Related work

There are several open source visual marker systems available. ARToolkit is an open source C/C++ library developed many years ago for building augmented reality (AR) applications by tracking a planar AR marker using pose estimation techniques [11]. The AR marker is a planar pattern enclosed by a black rectangular frame on a white background. The corners of the rectangle are extracted by edge line fitting and used for pose estimation. The estimated pose is further used to overlay a virtual object on the video stream to create an AR scene. Inspired by ARToolkit, the ARTag marker system was proposed in 2005 for AR applications, with an improved marker pattern that reduces the false detection rate and the inter-marker confusion rate [12]. ARToolkit and ARTag are the same in the way of pose estimation using four extracted corners but differ in the way of carrying information for marker recognition. Zhang evaluated the performance of several AR marker systems and reported that the standard deviation of localizing a static feature point varied from 0.26 to 0.57 pixel [13]. Assuming the error follows a normal distribution, the variation of locating the same feature point in a static state is more than 1 pixel due to image sensor noise. An uncertainty of 1 pixel is quite large for both pose estimation and stereo
tracking, especially when the marker is small and/or far from the camera. The open source computer vision library OpenCV [14] utilizes a black/white chessboard pattern for camera calibration. The corners of the chessboard pattern (X corners) are detected at pixel level using quadrilateral detection, and then iteratively optimized toward the saddle point at sub-pixel level [15]. Other chessboard corner detection methods can also be found elsewhere [16–19]. We noticed that more attention has been paid to chessboard pattern detection, probably because of the ease of detection and sub-pixel localization of an X corner. However, the uncertainty of such sub-pixel corner localization is larger than that of dot center extraction using ellipse fitting, because the edge points on the dot's contour can first be localized at sub-pixel level and then fitted with an ellipse whose center is taken as the dot center; the ellipse fitting further reduces the localization uncertainty. OpenCV also supports dot grid feature detection; however, it detects the dot contour at pixel level, which results in a relatively large uncertainty in localizing the dot center. Therefore, a compact marker and a corresponding detection algorithm with high accuracy and low uncertainty of feature localization are needed. This paper presents a coarse-to-fine dot array marker detection algorithm with accurate edge point localization. For efficiency, the marker is detected in a cascaded way. The contour of each dot is detected at pixel level in a fast way, and then sub-pixel edge point localization is performed by searching for zero-crossings in the convolution of the image with a Laplacian-of-Gaussian (LoG) kernel [20]. The dot center is finally extracted by ellipse fitting and re-ordered according to an orientation indicator. The contribution of this paper is twofold: a compact configurable dot array marker detection framework which enables multi-marker tracking (coarse detection); and a closed-form sub-pixel edge localization method including the formulation and the implementation (fine localization). The sub-pixel edge localization method could also be used in related vision tasks [21–23].

2. Marker detection

2.1. Dot array marker

Our dot array marker is inspired by the camera calibration pattern used by the commercial machine vision software MVTec Halcon [24]. The dot array marker is an m × n dot matrix enclosed by a rectangular frame as shown in Fig. 1. The origin of the marker is located at the center of the rectangular frame. A solid triangle at one corner serves as an orientation indicator distinguishing the x and y directions. A dot array marker is hence characterized by (m, n, d, c), where d is the dot spacing and c is a binary code with m × n bits whose values indicate the presence (1) or absence (0) of the dot at the corresponding position. Let N denote the actual number of dots (the number of 1s in the binary code), which is less than or equal to mn. The width of the border is d/4,
Fig. 1. Dot array markers and their descriptors.
Fig. 2. Marker detection: (a) original intensity image, (b) binary image after thresholding, (c) detected contours, and (d) detected markers.
the diameter of each dot is d/2, and the total size of the marker is (n + 1)d × (m + 1)d (width × height). We call d_p = (m, n, d, c) the marker descriptor. Given d_p, the center coordinates of the dots in row-major order are determined, denoted by {x_i}_1^N. In addition, the coordinates of the four inner corners, starting from the orientation indicator in counterclockwise order, can also be determined, denoted by {P_i}_1^4 as shown in Fig. 1. Therefore, a marker descriptor discriminatively represents the geometry of a marker on the marker plane. Note that {P_i}_1^4 are only used to facilitate marker detection, as described later; they do not serve as feature points, so they do not need sub-pixel localization.

2.2. Detection algorithm

Given a marker descriptor d_p and an intensity image I, marker detection is to detect the presence of the marker in I and to extract the corresponding (ordered) feature points {x′_i}_1^N on the image plane. The skeleton of the detection algorithm is as follows. First of all, the image I is converted to a binary image I_b using a simple threshold. All the contours (black/white borders) {C_k}_1^M of I_b together with their topological structures {H_k}_1^M (M is the number of detected contours) are extracted using the algorithm proposed by Suzuki [25]. A contour C_k is a collection of connected border pixels. Its corresponding topological structure H_k is a quadruple where H_k[0], H_k[1], H_k[2], H_k[3] represent the indices of C_k's next and previous neighboring contours (contours at the same hierarchy level), its first child contour (inner contours at the next hierarchy level), and its parent contour (the outer contour at the previous hierarchy level), respectively (−1 indicates nonexistence). With the topological structure, we can easily traverse every contour and access its parent, neighbors and children. Next, a cascade scheme is used to find the candidate marker pattern from {C_k}_1^M. The candidate pattern is a five-sided polygon containing N ellipse contours. We denote the candidate pattern by {P′_a, P′_b, P′_2, P′_3, P′_4, {E_i}_1^N}, where {P′_a, P′_b, P′_2, P′_3, P′_4} are the five counterclockwise vertices (homogeneous coordinates) of the polygon with P′_aP′_b being the shortest side, and {E_i}_1^N are the N ellipse contours (without order) inside the polygon. Then, P′_1 is calculated as the intersection
Fig. 3. 2-D plot and 1-D profile of ∇²G(x, y) with σ = 1.
of P′_2P′_b and P′_4P′_a: P′_1 = (P′_2 × P′_b) × (P′_4 × P′_a). A homography H between the marker plane and the image plane is estimated using {P_i}_1^4 and {P′_i}_1^4. After that, the centers of {E_i}_1^N are transformed to the marker plane using H and permuted into row-major order. Finally, the binary code implied by the ordered {E_i}_1^N is calculated and compared with the descriptor d_p. If they match, sub-pixel localization and ellipse fitting are performed on {E_i}_1^N to obtain {x′_i}_1^N; otherwise, 0 is returned, indicating that no marker was detected. The pseudo code of the algorithm is given in Algorithm 1. Fig. 2 shows the intermediate results of the marker detection.
Algorithm 1. Marker detection algorithm.
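To make the coarse stage concrete, the following is a minimal sketch of its skeleton in C++/OpenCV (our implementation uses OpenCV; see Section 5). The descriptor struct and the function name are illustrative only, and the homography-based code matching, dot ordering and comparison against d_p are omitted:

    // Sketch of the coarse detection stage of Algorithm 1 (C++/OpenCV).
    #include <opencv2/opencv.hpp>
    #include <vector>

    struct MarkerDescriptor {          // d_p = (m, n, d, c)
        int m, n;                      // grid rows and columns
        double d;                      // dot spacing
        std::vector<int> c;            // m*n presence bits
    };

    // Returns the dot contours found inside a five-sided frame candidate.
    std::vector<std::vector<cv::Point>> coarseDetect(const cv::Mat& gray)
    {
        cv::Mat bin;
        cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

        std::vector<std::vector<cv::Point>> contours;
        std::vector<cv::Vec4i> h;      // [next, prev, firstChild, parent], cf. H_k
        cv::findContours(bin, contours, h, cv::RETR_TREE,
                         cv::CHAIN_APPROX_NONE);        // Suzuki's algorithm [25]

        std::vector<std::vector<cv::Point>> dots;
        for (int i = 0; i < (int)contours.size(); ++i) {
            std::vector<cv::Point> poly;
            cv::approxPolyDP(contours[i], poly,
                             0.02 * cv::arcLength(contours[i], true), true);
            if (poly.size() != 5) continue;             // five-sided frame border
            // Children of the frame border are the candidate ellipse contours E_i.
            for (int j = h[i][2]; j >= 0; j = h[j][0])
                if (contours[j].size() >= 5)            // enough points to fit
                    dots.push_back(contours[j]);
        }
        return dots;
    }

After the refinement described in Section 3, cv::fitEllipse applied to each refined contour yields the dot centers used as feature points.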
3. Edge point localization

In the proposed marker detection algorithm, sub-pixel edge point localization is performed once the marker has been recognized. Because the marker recognition is a coarse procedure designed for time saving, we need to finely localize every edge point on the contour E_i for all i. In addition, disturbed by image sensor noise and motion artifacts, the contours extracted from a thresholded image are not true edges; however, they are supposed to be close to the true edges. Therefore, it is necessary to exploit the original intensity image to localize the "true" edges.

3.1. Laplacian-of-Gaussian filter

There are many studies regarding sub-pixel edge feature detection, which fall into four categories: gradient operators [26], second derivative operators [20], moment operators [27–29] and energy minimization [30]. Considering that the localization should be performed in real time, we propose a closed-form sub-pixel localization method by exploiting the image convolved with a Laplacian-of-Gaussian (LoG) filter. The two-dimensional LoG filter has the form

∇²G(x, y, σ) = (1/σ⁴)((x² + y²)/(2σ²) − 1) e^(−(x²+y²)/(2σ²))   (1)

where the parameter σ is the scale of the Gaussian. ∇²G(x, y, σ) is also a radial function:

∇²G(x, y, σ) = ∇²G(r, σ) = (1/σ⁴)(r²/(2σ²) − 1) e^(−r²/(2σ²))   (2)

where r² = x² + y². Fig. 3 shows the 2-D plot and 1-D profile of ∇²G(x, y) with σ = 1, where the width of the central excitatory region (i.e., the distance between the two zero-crossings of ∇²G(r, σ), denoted by w) is 2√2σ, and the support (denoted by s) is 8σ. The zero-crossings of the convolution ∇²G(x, y, σ) ∗ I(x, y) respond to the feature "edge" in the image I(x, y). The scale parameter σ determines the resolution of edge detection; a small σ (i.e., a small w) makes the response sensitive to noise. Regarding the accuracy of the LoG filter, readers are referred to [20,31] for more details.
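For reference, the kernel of Eq. (1) can be sampled directly as below (a sketch; the overall constant factor is immaterial since only the zero-crossings of the response are used, and Sections 3.3 and 3.4 replace this brute-force form with separable and recursive filtering):

    // Direct sampling of the LoG kernel of Eq. (1), support s = 8*sigma.
    #include <opencv2/opencv.hpp>
    #include <cmath>

    cv::Mat logKernel(double sigma)
    {
        int half = (int)std::ceil(4.0 * sigma);
        cv::Mat k(2 * half + 1, 2 * half + 1, CV_64F);
        for (int y = -half; y <= half; ++y)
            for (int x = -half; x <= half; ++x) {
                double r2 = double(x) * x + double(y) * y;
                double s2 = sigma * sigma;
                k.at<double>(y + half, x + half) =
                    (r2 / (2 * s2) - 1.0) * std::exp(-r2 / (2 * s2)) / (s2 * s2);
            }
        return k;   // zero-crossings of filter2D(I, ..., k) mark edges
    }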
3.2. Closed-form sub-pixel localization

In our marker detection case, the 1-D profile of the intensity across the dot contour along the gradient direction can be modeled by a ramp. Fig. 4(a) shows the detected marker region (the region of the polygon P′_1P′_2P′_3P′_4) in the original image. Fig. 4(b) shows one of the dots, where the green points are the coarse edge points detected in the thresholded image, and the arrows indicate the gradient directions. Fig. 4(c) shows the profiles of intensity, gradient magnitude, and Gaussian–Laplacian at one coarse edge point along the gradient direction. Obviously, the coarse edge point is not located on the true edge, where the Gaussian–Laplacian is supposed to be zero. The procedure is then straightforward: first, convolve the original marker image I(x, y) with Gaussian first derivative filters to get the normalized gradient images ∇I_x(x, y) and ∇I_y(x, y) so that cos θ(x, y) = ∇I_x(x, y) and sin θ(x, y) = ∇I_y(x, y), where θ(x, y) is the orientation of the gradient vector at (x, y); then convolve I(x, y) with the LoG filter with an appropriate σ to get the Gaussian–Laplacian image ∇²I(x, y) = ∇²G(x, y, σ) ∗ I(x, y); finally, from every coarse edge point, search along its gradient direction to find the zero-crossing of ∇²I(x, y) as the refined edge point. For the selection of σ, we choose σ so that the support of ∇²G(x, y, σ) approximates the dot diameter in the image. Assume that we start at a coarse edge point (x₀, y₀). Its orientation θ(x₀, y₀) is binned into one of the eight discrete directions shown in Fig. 5(a), and the corresponding neighbor (x₁, y₁) is chosen in the 3 × 3 neighborhood of (x₀, y₀), denoted by Ω(x₀, y₀). We check whether ∇²I(x₀, y₀) · ∇²I(x₁, y₁) ≤ 0; if not, we repeat the above process starting at (x₁, y₁) until we find an (x₁, y₁) such that ∇²I(x₀, y₀) · ∇²I(x₁, y₁) ≤ 0. We can then be sure there are zero-crossings in Ω(x₀, y₀). To further localize the zero-crossing precisely, a bivariate polynomial is used to approximate ∇²I(x, y) in Ω(x₀, y₀):

∇²I_{x₀y₀}(x, y) := ∇²I(x₀ + x, y₀ + y) ≈ c₀x²y² + c₁x²y + c₂xy² + c₃x² + c₄y² + c₅xy + c₆x + c₇y + c₈   (3)
Fig. 4. (a) Detected marker region. (b) Detected coarse edge points in the thresholded image. (c) Profiles of intensity, gradient magnitude and Gaussian–Laplacian along the gradient direction.
where x, y ∈ [−1, 1]. There are nine coefficients {c_i}_0^8, and each grid point in Ω(x₀, y₀) imposes a linear constraint; therefore, {c_i}_0^8 can be easily obtained by solving a 9 × 9 linear system. Fig. 5(b) shows the 3-D plot of the determined polynomial around (x₀, y₀), and Fig. 5(c) shows its contour lines for different values. As can be seen, the zero-crossing curve with function value 0 has the form

c₀x²y² + c₁x²y + c₂xy² + c₃x² + c₄y² + c₅xy + c₆x + c₇y + c₈ = 0   (4)

Eq. (4) represents the "true" sub-pixel edge curve in Ω(x₀, y₀). It is not necessary to obtain all the points on the curve; we just pick the one point on the curve that has the same orientation as θ(x₀, y₀). Therefore, the points with orientation θ₀ = θ(x₀, y₀) in Ω(x₀, y₀) are parameterized by (t cos θ₀, t sin θ₀), where t is the parameter.
Substituting (t cos θ₀, t sin θ₀) for (x, y) in (4), we obtain

c₀cos²θ₀ sin²θ₀ t⁴ + (c₁cos²θ₀ sinθ₀ + c₂cosθ₀ sin²θ₀) t³ + (c₃cos²θ₀ + c₄sin²θ₀ + c₅sinθ₀ cosθ₀) t² + (c₆cosθ₀ + c₇sinθ₀) t + c₈ = 0   (5)

which is a quartic equation in t. Eq. (5) can be solved using an algebraic technique (in closed form), and the real solution t̂ with the smallest absolute value is used to calculate the sub-pixel edge point:

x̂ = x₀ + t̂ cos θ₀   (6)
ŷ = y₀ + t̂ sin θ₀   (7)
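The refinement of one coarse edge point thus condenses into a short routine: build the 9 × 9 system of Eq. (3), form the quartic coefficients of Eq. (5), and keep the real root of smallest magnitude. The sketch below assumes the Gaussian–Laplacian image is available as a CV_64F matrix and uses cv::solvePoly (coefficients in ascending powers); a production version would also guard against a vanishing leading coefficient when θ₀ is axis-aligned:

    // Closed-form refinement of one coarse edge point (Eqs. (3)-(7)).
    // log2I: Gaussian-Laplacian image (CV_64F); p0: coarse edge pixel;
    // theta0: binned gradient orientation.
    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <cmath>
    #include <vector>

    cv::Point2d refineEdge(const cv::Mat& log2I, cv::Point p0, double theta0)
    {
        // Nine linear constraints, one per grid point of the 3x3 neighborhood.
        cv::Mat A(9, 9, CV_64F), rhs(9, 1, CV_64F);
        int row = 0;
        for (int y = -1; y <= 1; ++y)
            for (int x = -1; x <= 1; ++x, ++row) {
                double m[9] = { double(x*x*y*y), double(x*x*y), double(x*y*y),
                                double(x*x), double(y*y), double(x*y),
                                double(x), double(y), 1.0 };
                std::copy(m, m + 9, A.ptr<double>(row));
                rhs.at<double>(row) = log2I.at<double>(p0.y + y, p0.x + x);
            }
        cv::Mat c;                                  // c_0..c_8 of Eq. (3)
        cv::solve(A, rhs, c);

        // Quartic of Eq. (5); cv::solvePoly expects ascending powers.
        double ct = std::cos(theta0), st = std::sin(theta0);
        std::vector<double> q = {
            c.at<double>(8),                                       // t^0
            c.at<double>(6)*ct + c.at<double>(7)*st,               // t^1
            c.at<double>(3)*ct*ct + c.at<double>(4)*st*st
                + c.at<double>(5)*st*ct,                           // t^2
            c.at<double>(1)*ct*ct*st + c.at<double>(2)*ct*st*st,   // t^3
            c.at<double>(0)*ct*ct*st*st };                         // t^4
        cv::Mat roots;
        cv::solvePoly(q, roots);

        // Real root with the smallest absolute value (Eqs. (6)-(7)).
        double tBest = 0.0, best = 1e12;
        for (int i = 0; i < roots.rows; ++i) {
            cv::Vec2d r = roots.at<cv::Vec2d>(i);
            if (std::abs(r[1]) < 1e-8 && std::abs(r[0]) < best) {
                best = std::abs(r[0]);
                tBest = r[0];
            }
        }
        return cv::Point2d(p0.x + tBest * ct, p0.y + tBest * st);
    }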
3.3. Separable Gaussian

Until now, we have solved the problem theoretically. This section deals with implementation issues concerning computational efficiency. 2-D convolution is time consuming: an s × s kernel requires s² multiplications and additions per pixel. Recall that s = 8σ, so typically 64σ² operations are required per pixel, and the total computation is 64MNσ² for filtering an M × N image. For real-time computing, we reduce the computation first by separating the 2-D convolution into two 1-D convolutions, and then by making the computation independent of σ using recursive filtering. The 1-D Gaussian together with its first and second derivatives is as follows:

g(x) = e^(−x²/(2σ²))   (8)
g′(x) = −(x/σ²) e^(−x²/(2σ²))   (9)
g″(x) = (x²/σ⁴ − 1/σ²) e^(−x²/(2σ²))   (10)

The 2-D Gaussian can be written as G(x, y) = g(x)g(y), and its first derivatives are

G_x(x, y) = g′(x)g(y)   (11)
G_y(x, y) = g′(y)g(x)   (12)

Therefore

∇I_x(x, y) = ∂/∂x (G(x, y) ∗ I(x, y)) = G_x(x, y) ∗ I(x, y) = g′(x)g(y) ∗ I(x, y)   (13)
Fig. 5. (a) Ω(x₀, y₀): 3 × 3 neighborhood of (x₀, y₀). (b) ∇²I_{x₀y₀}(x, y): bivariate polynomial approximation. (c) Contour lines of ∇²I_{x₀y₀}(x, y) with different function values.
∇I_y(x, y) = ∂/∂y (G(x, y) ∗ I(x, y)) = G_y(x, y) ∗ I(x, y) = g′(y)g(x) ∗ I(x, y)   (14)
(13) and (14) imply that ∇I_x(x, y) (respectively ∇I_y(x, y)) can be obtained by convolving I(x, y) with a 1-D Gaussian column (row) kernel followed by its first derivative row (column) kernel. According to its definition, the LoG filter can also be written (up to scale) as

∇²G(x, y) = g″(x)g(y) + g″(y)g(x)   (15)

Therefore

∇²I(x, y) = ∇²G(x, y) ∗ I(x, y) = g″(x)g(y) ∗ I(x, y) + g″(y)g(x) ∗ I(x, y)   (16)

(16) implies that ∇²I(x, y) is the sum of two convolutions, each obtained by convolving I(x, y) with a 1-D Gaussian column (row) kernel followed by its second derivative row (column) kernel.
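Before the recursive acceleration of Section 3.4, Eqs. (8)–(16) already map onto standard separable filtering; a minimal sketch using cv::sepFilter2D with sampled FIR kernels:

    // Separable computation of the Gaussian-Laplacian image (Eq. (16)).
    #include <opencv2/opencv.hpp>
    #include <cmath>

    static cv::Mat kernel(double sigma, int deriv)   // g, g', g'' of Eqs. (8)-(10)
    {
        int half = (int)std::ceil(4.0 * sigma);      // support s = 8*sigma
        cv::Mat k(1, 2 * half + 1, CV_64F);
        for (int x = -half; x <= half; ++x) {
            double s2 = sigma * sigma, e = std::exp(-x * x / (2 * s2));
            double v = (deriv == 0) ? e
                     : (deriv == 1) ? -(x / s2) * e
                     :                (x * x / (s2 * s2) - 1.0 / s2) * e;
            k.at<double>(x + half) = v;
        }
        return k;
    }

    cv::Mat gaussianLaplacian(const cv::Mat& I, double sigma)
    {
        cv::Mat gxx, gyy;
        // g''(x)g(y) * I : second-derivative row kernel, Gaussian column kernel
        cv::sepFilter2D(I, gxx, CV_64F, kernel(sigma, 2), kernel(sigma, 0));
        // g''(y)g(x) * I : Gaussian row kernel, second-derivative column kernel
        cv::sepFilter2D(I, gyy, CV_64F, kernel(sigma, 0), kernel(sigma, 2));
        return gxx + gyy;                            // Eq. (16)
    }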
3.4. Recursive filtering

Now we have separated the original 2-D convolution problems into a series of 1-D Gaussian derivative convolutions, and the operations per pixel are reduced from s² to 2s. However, the computational complexity is still directly proportional to the kernel size (s = 8σ). Recursive filtering was proposed to drastically reduce this computational effort [32–34]. The idea of recursive filtering is to approximate an N-order causal finite impulse response (FIR) filter

H(z) = Σ_{k=0}^{N} h(k) z^(−k)   (17)

by an n-order infinite impulse response (IIR) filter:

H_a(z) = (Σ_{k=0}^{m} b(k) z^(−k)) / (1 + Σ_{k=1}^{n} a(k) z^(−k))   (18)

(18) yields an n-order recursive system:

y(i) = Σ_{k=0}^{m} b(k) x(i − k) − Σ_{k=1}^{n} a(k) y(i − k)   (19)

where x(i) is the input sequence and y(i) is the output. The operations per element are therefore reduced from N + 1 to m + n + 1 (usually m, n ≤ 4), which is independent of N. For large N, recursive filtering is very computationally efficient. In our implementation, the causal parts of the Gaussian and its derivatives are approximated by a four-order IIR filter (k ≥ 0):

h_a^+(k) = (a₀ cos(ω₀k/σ) + a₁ sin(ω₀k/σ)) e^(−b₀k/σ) + (c₀ cos(ω₁k/σ) + c₁ sin(ω₁k/σ)) e^(−b₁k/σ)   (20)

whose z transform takes the form

h_a^+(z) = (n_0^+ + n_1^+ z⁻¹ + n_2^+ z⁻² + n_3^+ z⁻³) / (1 + d_1^+ z⁻¹ + d_2^+ z⁻² + d_3^+ z⁻³ + d_4^+ z⁻⁴)   (21)

where {n_i^+}_0^3 and {d_i^+}_1^4 are determined by a₀, a₁, b₀, b₁, c₀, c₁, ω₀, ω₁, σ. The z transform of the anti-causal part h_a^−(z) takes the form

h_a^−(z) = (n_1^− z + n_2^− z² + n_3^− z³ + n_4^− z⁴) / (1 + d_1^− z + d_2^− z² + d_3^− z³ + d_4^− z⁴)   (22)

where {n_i^−}_1^4 and {d_i^−}_1^4 are deduced from {n_i^+}_0^3 and {d_i^+}_1^4 according to the symmetry or antisymmetry of the kernel. To apply recursive Gaussian filtering, (19) is performed from left to right on the sequence x(i) using h_a^+(z) to get y^+(i):

y^+(i) = n_0^+ x(i) + n_1^+ x(i−1) + n_2^+ x(i−2) + n_3^+ x(i−3) − d_1^+ y^+(i−1) − d_2^+ y^+(i−2) − d_3^+ y^+(i−3) − d_4^+ y^+(i−4)   (23)

and then from right to left using h_a^−(z) to get y^−(i):

y^−(i) = n_1^− x(i+1) + n_2^− x(i+2) + n_3^− x(i+3) + n_4^− x(i+4) − d_1^− y^−(i+1) − d_2^− y^−(i+2) − d_3^− y^−(i+3) − d_4^− y^−(i+4)   (24)

The final recursive filtering output is y(i) = y^+(i) + y^−(i). The coefficients in (20) for the Gaussian and its derivatives can be found in [32].
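For a single scan line, the two passes of Eqs. (23) and (24) amount to the routine below (a sketch with zero-padded boundaries; the arrays np, dp, nm, dm stand for {n_i^+}, {d_i^+}, {n_i^−}, {d_i^−}, whose numerical values from [32] are not reproduced here):

    // One forward-backward pass of the four-order recursive filter
    // (Eqs. (23)-(24)) over a single row.
    #include <cstddef>
    #include <vector>

    std::vector<double> recursiveFilter(const std::vector<double>& x,
                                        const double np[4], const double dp[4],
                                        const double nm[4], const double dm[4])
    {
        const std::ptrdiff_t n = (std::ptrdiff_t)x.size();
        std::vector<double> yp(n, 0.0), ym(n, 0.0);

        // Zero-padded accessors for out-of-range indices.
        auto X  = [&](std::ptrdiff_t i) { return (i >= 0 && i < n) ? x[i]  : 0.0; };
        auto YP = [&](std::ptrdiff_t i) { return (i >= 0)          ? yp[i] : 0.0; };
        auto YM = [&](std::ptrdiff_t i) { return (i < n)           ? ym[i] : 0.0; };

        for (std::ptrdiff_t i = 0; i < n; ++i)       // causal pass, Eq. (23)
            yp[i] = np[0]*X(i) + np[1]*X(i-1) + np[2]*X(i-2) + np[3]*X(i-3)
                  - dp[0]*YP(i-1) - dp[1]*YP(i-2) - dp[2]*YP(i-3) - dp[3]*YP(i-4);

        for (std::ptrdiff_t i = n - 1; i >= 0; --i)  // anti-causal pass, Eq. (24)
            ym[i] = nm[0]*X(i+1) + nm[1]*X(i+2) + nm[2]*X(i+3) + nm[3]*X(i+4)
                  - dm[0]*YM(i+1) - dm[1]*YM(i+2) - dm[2]*YM(i+3) - dm[3]*YM(i+4);

        for (std::ptrdiff_t i = 0; i < n; ++i)       // y = y+ + y-
            yp[i] += ym[i];
        return yp;
    }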
Fig. 6. Principle of 3-D triangulation (x–z plane view).
4. Marker pose tracking

If the marker is successfully detected in a stereo image pair (left and right images), 3-D triangulation is performed to reconstruct the 3-D coordinates of each dot center. Assume the stereo camera system has been calibrated and stereo-rectified so that it has a configuration of two parallel-looking cameras with identical intrinsic parameter matrices:

K = [ f 0 c_x ; 0 f c_y ; 0 0 1 ]   (25)

where f is the focal length and (c_x, c_y) is the principal point of the camera. The 3-D coordinates (x, y, z) of a 2-D correspondence (x_l, y_l) and (x_r, y_r) are calculated using the simple geometric relations shown in Fig. 6:

x = (b/2) · (x_l + x_r − 2c_x)/(x_l − x_r),  y = (b/2) · (y_l + y_r − 2c_y)/(x_l − x_r),  z = fb/(x_l − x_r)   (26)

where the subscripts "l" and "r" represent the left and right images, respectively, and b is the baseline of the stereo camera (i.e., the distance between the optical centers of the two cameras). In this way, the 3-D coordinates of all marker dot points in the camera frame are calculated, denoted by x_i, i = 1, 2, . . ., M. They are then registered to the coordinates y_i, i = 1, 2, . . ., M in the marker frame (determined by the marker descriptor) using a point-based registration method [35] to obtain the rotation R and the translation t by minimizing

min_{R,t} Σ_{i=1}^{M} ||y_i − R x_i − t||²   (27)
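The section condenses into the sketch below. Triangulation follows Eq. (26) directly; for the registration step we sketch the SVD-based closed-form solution (Kabsch/Umeyama) in place of the quaternion method of Horn [35] cited above; both minimize Eq. (27):

    // Sketch of Section 4: triangulate dot centers with Eq. (26), then
    // recover (R, t) by a closed-form point registration.
    #include <opencv2/opencv.hpp>
    #include <vector>

    cv::Point3d triangulate(cv::Point2d pl, cv::Point2d pr,
                            double f, cv::Point2d c, double b)
    {
        double disp = pl.x - pr.x;                        // disparity
        return { 0.5 * b * (pl.x + pr.x - 2 * c.x) / disp,
                 0.5 * b * (pl.y + pr.y - 2 * c.y) / disp,
                 f * b / disp };                          // Eq. (26)
    }

    void registerPoints(const std::vector<cv::Point3d>& X,   // camera frame
                        const std::vector<cv::Point3d>& Y,   // marker frame
                        cv::Mat& R, cv::Mat& t)
    {
        const int M = (int)X.size();
        cv::Point3d cx(0,0,0), cy(0,0,0);                 // centroids
        for (int i = 0; i < M; ++i) { cx += X[i]; cy += Y[i]; }
        cx *= 1.0 / M;  cy *= 1.0 / M;

        cv::Mat H = cv::Mat::zeros(3, 3, CV_64F);         // cross-covariance
        for (int i = 0; i < M; ++i) {
            cv::Mat a = (cv::Mat_<double>(3,1) << X[i].x-cx.x, X[i].y-cx.y, X[i].z-cx.z);
            cv::Mat b = (cv::Mat_<double>(3,1) << Y[i].x-cy.x, Y[i].y-cy.y, Y[i].z-cy.z);
            H += a * b.t();
        }
        cv::Mat W, U, Vt;
        cv::SVD::compute(H, W, U, Vt);                    // H = U diag(W) Vt
        cv::Mat D = cv::Mat::eye(3, 3, CV_64F);
        D.at<double>(2,2) = cv::determinant(Vt.t() * U.t());  // reflection guard
        R = Vt.t() * D * U.t();                           // argmin of Eq. (27)
        t = (cv::Mat_<double>(3,1) << cy.x, cy.y, cy.z)
          - R * (cv::Mat_<double>(3,1) << cx.x, cx.y, cx.z);
    }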
Fig. 7. Synthesized marker images and coarse marker detection under different noise levels.
5. Experiments and results

We have implemented the dot array marker detection algorithm and the stereo visual tracker in C++ with the help of OpenCV. Except for the sub-pixel localization, all the operations in Algorithm 1 can be implemented with OpenCV functions. In this section, we evaluate the proposed algorithm on synthetic data and real data, respectively, and then show real applications based on the proposed method.

5.1. Synthetic data evaluation
The advantage of evaluating the algorithm on synthetic data is that the ground truth is known. A 3 × 3 dot array is synthesized on a 640 × 480 image with known coordinates of the dot centers, as shown in Fig. 7. The radius of each dot is 10 pixels and the dot spacing is 40 pixels. White noise with standard deviation σ_n = 0.05k, k = 0, 2, . . ., 16 is added to the synthesized image (the intensity range of the image is [0, 1]), as shown in the first row of Fig. 7. For each σ_n, we repeat adding independent identically distributed (iid)
Fig. 8. (a) Detected marker region and sub-pixel localization. (b) One dot in (a) showing coarse edge points and found fine edge points. (c) Feature detection errors under different noise levels. (d) Motion blurred marker image. (e) One dot in (d) showing coarse edge points and found fine edge points. (f) Feature detection errors under motion blur artifacts with different orientations. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
Fig. 9. Synthesized marker images and the ground truth (green circles) undergoing random perspective distortion. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
noise instances 200 times and run the marker detection algorithm to obtain error statistics. We denote by e_i the error of localizing dot center i, i = 1, . . ., 9, which is the distance root mean square between the detected dot center and the ground truth over the 200 repetitions, and take

e_n = sqrt((1/9) Σ_{i=1}^{9} e_i²)

as the feature detection error (pixel) under noise level σ_n. The algorithm succeeded in detecting the marker in all the cases. The second row of Fig. 7 shows the coarse detection results of the marker under different noise levels. Fig. 8(a) and (b) shows the sub-pixel localization results under noise level σ_n = 0.2, where the red and green points are coarse and fine edge points, respectively. The red ellipses are fitted from the green points, and their centers are extracted as the feature points. Blue arrows show the gradient directions along which the sub-pixel edge points are localized. The contour of the coarse edge points is unsmooth due to the threshold effect on the discrete image. After refining the edge points, the contour becomes smooth and close to an ellipse. Furthermore, the ellipse fitting imposes a shape constraint on the contour to further reduce the localization uncertainty. Fig. 8(c) shows the feature detection error e_n in pixels with respect to the noise level σ_n, where the blue/red curve is with/without sub-pixel localization. We can see that the feature detection error increases as the noise level goes up, and the error with sub-pixel localization is smaller than that without sub-pixel localization. To evaluate the algorithm against motion blur artifacts, the synthesized marker image is convolved with a motion blur filter whose orientation θ_m varies from 0° to 162° with a step of 18°. As a result, we get ten motion-blurred images. For each motion-blurred image, white noise with σ_n = 0.1 is added, and we repeat adding iid noise instances 200 times to obtain the error statistics. Fig. 8(d) shows one of the motion-blurred images with white noise. Fig. 8(e) shows the coarse edge points and the found sub-pixel edge points. Fig. 8(f) plots the feature detection errors in pixels under the motion blur artifacts, where the blue/red curve is with/without sub-pixel
localization. As can be seen, under anisotropic noise or artifacts the error with sub-pixel localization is clearly smaller than that without sub-pixel localization. Because the marker image in the real case suffers from perspective distortion, it is necessary to investigate the accuracy under various perspective distortions. The undistorted synthesized image is warped using a randomly generated 3 × 3 projective matrix. The resulting images are further disturbed by adding white noise with σ_n = 0.1. Fig. 9 shows the synthesized marker images undergoing random perspective distortion together with the ground truth dot centers. Note that the ground truth of the distorted dot centers is still known by applying the projective matrix to the original ground truth. 2000 distorted marker images were generated and the feature points of the 9 dot centers were extracted. The root mean square error of localizing each dot center was calculated from the 2000 samples and is shown in Fig. 10. The feature extraction error with sub-pixel localization is lower than 0.1 pixel.
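The perspective-distortion test can be reproduced along the following lines (a sketch; the corner-jitter range is an illustrative assumption, not the exact parameter used in our evaluation, and the input image is assumed to be CV_32F in [0, 1]):

    // Warp the synthetic marker with a random homography, add white
    // noise, and map the ground truth through the same homography.
    #include <opencv2/opencv.hpp>
    #include <vector>

    cv::Mat distort(const cv::Mat& img, const std::vector<cv::Point2f>& gt,
                    std::vector<cv::Point2f>& gtWarped, cv::RNG& rng)
    {
        // Random homography: jitter the four image corners.
        std::vector<cv::Point2f> src = { {0,0}, {639,0}, {639,479}, {0,479} }, dst;
        for (const auto& p : src)
            dst.emplace_back(p.x + rng.uniform(-60.f, 60.f),
                             p.y + rng.uniform(-60.f, 60.f));
        cv::Mat H = cv::getPerspectiveTransform(src, dst);

        cv::Mat warped, noise(img.size(), CV_32F);
        cv::warpPerspective(img, warped, H, img.size());
        cv::randn(noise, 0.0, 0.1);                 // sigma_n = 0.1
        warped += noise;

        cv::perspectiveTransform(gt, gtWarped, H);  // mapped ground truth
        return warped;
    }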
5.2. Real data evaluation

A USB3.0 CMOS camera (UI-3240CP, IDS Imaging Development Systems GmbH) with a resolution of 1280 × 1024 was used to capture a 3 × 3 dot array marker at a distance of 700 mm. The total size of the marker is 32 mm × 32 mm. 200 images were acquired while keeping the marker static to investigate the uncertainty in localizing the dot centers. Let (x_i, y_i) and (x′_i, y′_i) denote the extracted centers of dot i, i = 1, . . ., 9 with and without sub-pixel localization, respectively. The statistics (standard deviation σ, and range of variation Δ, which is the absolute difference between the maximum and the minimum) of (x_i, y_i) and (x′_i, y′_i), i = 1, . . ., 9 are summarized in Table 1. From the table we can see that the uncertainty (both σ and Δ) of feature extraction without sub-pixel localization is much larger (one order of magnitude) than that with sub-pixel localization.
Fig. 10. Mean square error of dot feature extraction under perspective distortion. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
Table 1
Feature extraction uncertainty (pixel); σ = standard deviation, Δ = range of variation.

Dot index   x_i: σ / Δ       y_i: σ / Δ       x′_i: σ / Δ      y′_i: σ / Δ
1           0.009 / 0.053    0.009 / 0.046    0.065 / 0.317    0.029 / 0.117
2           0.008 / 0.044    0.011 / 0.063    0.032 / 0.168    0.038 / 0.169
3           0.007 / 0.038    0.008 / 0.042    0.028 / 0.166    0.015 / 0.071
4           0.008 / 0.046    0.008 / 0.053    0.033 / 0.187    0.020 / 0.074
5           0.006 / 0.037    0.007 / 0.046    0.052 / 0.205    0.030 / 0.187
6           0.008 / 0.045    0.008 / 0.049    0.050 / 0.304    0.013 / 0.105
7           0.006 / 0.027    0.010 / 0.051    0.017 / 0.141    0.039 / 0.153
8           0.007 / 0.041    0.008 / 0.044    0.023 / 0.112    0.019 / 0.082
9           0.006 / 0.038    0.009 / 0.046    0.039 / 0.188    0.033 / 0.216

Table 2
Uncertainty evaluation with sub-pixel localization.

     x (mm)   y (mm)   z (mm)   α (°)    β (°)    γ (°)
σ    0.001    0.002    0.006    0.047    0.046    0.003
Δ    0.005    0.015    0.040    0.309    0.311    0.020

Table 3
Uncertainty evaluation without sub-pixel localization.

     x (mm)   y (mm)   z (mm)   α (°)    β (°)    γ (°)
σ    0.003    0.002    0.024    0.224    0.184    0.014
Δ    0.018    0.016    0.160    1.856    1.186    0.103
Using the proposed marker detection algorithm, we succeeded in suppressing the feature point variation below 0.1 pixel in localizing static features, which greatly improved the 3-D triangulation precision in our implemented stereo visual tracker.

5.3. Performance in real applications

Based on the proposed marker detection algorithm, a stereo visual tracker consisting of two USB3.0 cameras (UI-3240CP, IDS Imaging Development Systems GmbH) was developed, as shown in Fig. 11(a). The camera has a frame rate of 60 frames per second (fps) at an image resolution of 1280 × 1024. The stereo camera was calibrated and stereo-rectified using a standard technique. The position and orientation of a dot array marker, given its descriptor, can be tracked in real time according to the algorithm described in Section 4. It took approximately 8 ms to detect and extract the marker features with sub-pixel localization from a 1280 × 1024 image on an Intel Core i7-4820K (3.7 GHz) CPU computer. Fig. 11(b) shows the simultaneous tracking of six different markers. To investigate the uncertainty propagation from the feature extraction to the pose estimation, the Euler poses of a static 3 × 3 dot array marker, denoted by (x, y, z, α, β, γ), were measured using the stereo visual tracker at a distance of about 700 mm. 3000 samples were acquired to obtain the standard deviation σ and the variation range Δ of (x, y, z, α, β, γ). For comparison, the statistical results with and without sub-pixel localization are summarized in Tables 2 and 3. As we can see from the tables, the uncertainty of localizing feature points largely influences the 3-D tracking precision in the depth direction (z) and the rotations around the axes in the marker plane (α, β). The uncertainties in determining z, α, β are markedly reduced with the sub-pixel localization. The stereo visual
tracker shown in Fig. 11(a) has already been tested and used in an image-guided surgical navigation system for oral and maxillofacial surgery developed by our group [36,37], yielding a 3-D tracking error of less than 0.3 mm. We also implemented a stereo visual tracker on a stereo laparoscope for endoscopic ultrasound probe tracking. In laparoscopic surgery, the laparoscope and surgical instruments are inserted into the body via two or three small incisions made in the abdomen. Surgeons perform surgery under the guidance of the laparoscopic video. Endoscopic ultrasound is usually used to examine interior tissue which cannot be seen by the laparoscope, for example vessels embedded in the tissue. It is useful to overlay the ultrasound image on the view of the laparoscope to visualize complicated structures, which requires endoscopic ultrasound probe tracking under the laparoscope. Fig. 12(a) and (b) shows the stereo laparoscope (LS101D, Shinko Optical) and the endoscopic ultrasound probe (UST-5550, ALOKA) used in our implementation. A 2 × 11 dot array marker with a dot spacing of 3 mm was adhered to a plastic mount which was further attached to the ultrasound probe. The feasibility of the proposed marker detection algorithm for surgical instrument tracking inside the body was confirmed in an in vivo animal experiment. Fig. 12(c) and (d) shows the marker detection results under stereo laparoscopic views inside the abdomen of a pig. Fig. 12(e) shows the detected coarse and fine edge points (red and green contours). Fig. 12(f) shows a magnified view of one dot indicated by the purple rectangle, from which we can see that the coarse contour (red points) suffers a serious threshold effect, while the fine contour (green points) is clearly improved by the sub-pixel localization. Because we do not know the true feature locations, we compared the residuals of the homography estimation (minimizing geometric error) between the marker plane and the image plane using
Fig. 11. (a) Physical setup of the stereo visual tracker. (b) Simultaneous tracking of six markers.
Fig. 12. (a) Stereo laparoscope. (b) Endoscopic ultrasound probe. (c) and (d) In vivo marker detection on a stereo pair. (e) Coarse (red) and fine (green) edge points. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
the extracted 21 dot centers with and without the sub-pixel localization. The former and latter residuals were 0.39 pixel and 0.42 pixel, respectively. Both can be considered highly accurate, although the residual with sub-pixel localization is slightly smaller than that without.

6. Discussion and conclusion

In this paper, a coarse-to-fine marker detection algorithm with sub-pixel edge localization is presented. The dot array pattern is detected and matched to a predefined descriptor in a fast way using a simple threshold and hierarchical contour analysis. The resulting dot contours are coarse edge points which are supposed to be close to the "true" edges. If the marker has been successfully matched with one of the predefined descriptors, the marker region (a bounding box containing only the marker pattern) is marked as a region of interest (ROI), within which the coarse edge points are refined by first finding a nearby "zero-crossing pixel" (whose 3 × 3 neighborhood contains zero-crossings of the Gaussian–Laplacian of the image) and then solving a quartic equation to obtain the exact sub-pixel edge points. An ellipse is finally fitted to the sub-pixel edge points, and its center is regarded as the dot center. The proposed algorithm was evaluated against both synthetic and real image data, and also in real applications where stereo trackers were implemented based on the proposed marker detection algorithm. The synthetic data evaluation aimed to test the accuracy of the algorithm against different noise levels, motion blurs, and various perspective distortions. The algorithm yielded a feature detection error of less than 0.1 pixel (up to noise level
σ_n ≤ 0.35) with real-time performance. In the evaluation on real image data, the precision (uncertainty) in localizing static dot features was measured. The results show that the uncertainty was markedly reduced by performing the sub-pixel localization. To further demonstrate how the uncertainty in localizing 2-D dot features propagates to the 3-D pose estimation of the marker, a stereo visual tracker composed of two CMOS cameras was implemented and its precision in tracking a static marker was evaluated. The results show that the proposed sub-pixel localization method significantly suppressed the 3-D pose tracking uncertainty in the depth direction (a decrease of 75% in z) and the rotations (decreases of 79%, 75%, and 79% in α, β, γ, respectively). For the application to surgical instrument tracking, we confirmed the feasibility of the marker tracking under stereo laparoscopic views in an in vivo animal experiment. The residual of estimating the homography between the image plane and the marker plane was 0.39 pixel, which suggests highly accurate model fitting. The reason why the residual errors with and without sub-pixel localization did not show much difference can be explained as follows. The residual error reflects the accuracy of the model fitting, rather than the absolute accuracy of feature detection (an offset applied to one feature set will not change the residual error). In addition, the 2-D homography is a plane-to-plane transformation, whose estimation is less sensitive to 2-D feature localization error compared with 3-D triangulation in stereo tracking. Furthermore, the number of points for homography estimation was large (21 points), which further reduced the residual error and made the difference between with and without sub-pixel localization small. Regarding the influence caused by illumination variation or specular reflection, sometimes a simple threshold cannot coarsely
segment out the dot marker. In such cases, the threshold operation in Algorithm 1 can easily be replaced with a Sobel operator to extract a binary edge image at pixel level, and the rest of the pipeline remains the same. The Sobel operator is a 3 × 3 separable kernel which provides a fast (but coarse) way to detect intensity changes. In this way, we look at the relative intensity change instead of the absolute intensity to coarsely segment out the marker pattern for robustness, at the cost of slightly increased computation. Regarding the recursive filtering, the operation count per pixel is 16 because we use four-order IIR filters to approximate the original Gaussian and its derivatives. This computation equals that of a 16-length Gaussian kernel with scale σ = 2, which means the recursive filtering reduces the computation only for σ > 2. For σ ≤ 2, Gaussian kernels are directly convolved with the image.

Conflict of interest

None.

References

[1] Z. Zhang, A flexible new technique for camera calibration, IEEE Trans. Pattern Anal. Mach. Intell. 22 (11) (2000) 1330–1334, http://dx.doi.org/10.1109/34.888718.
[2] J. Helferty, C. Zhang, G. McLennan, W. Higgins, Videoendoscopic distortion correction and its application to virtual guidance of endoscopy, IEEE Trans. Med. Imaging 20 (7) (2001) 605–617, http://dx.doi.org/10.1109/42.932745.
[3] V. Lepetit, F. Moreno-Noguer, P. Fua, EPnP: an accurate O(n) solution to the PnP problem, Int. J. Comput. Vis. 81 (2) (2009) 155–166, http://dx.doi.org/10.1007/s11263-008-0152-6.
[4] F. Zhou, H.B.-L. Duh, M. Billinghurst, Trends in augmented reality tracking, interaction and display: a review of ten years of ISMAR, in: Proceedings of the Seventh IEEE/ACM International Symposium on Mixed and Augmented Reality, ISMAR '08, IEEE Computer Society, Washington, DC, USA, 2008, pp. 193–202, http://dx.doi.org/10.1109/ISMAR.2008.4637362.
[5] B. Espiau, F. Chaumette, P. Rives, A new approach to visual servoing in robotics, IEEE Trans. Robot. Autom. 8 (3) (1992) 313–326, http://dx.doi.org/10.1109/70.143350.
[6] F. Nageotte, P. Zanne, C. Doignon, M. de Mathelin, Visual servoing-based endoscopic path following for robot-assisted laparoscopic surgery, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 2364–2369, http://dx.doi.org/10.1109/IROS.2006.282647.
[7] U. Jayarathne, A. McLeod, T. Peters, E. Chen, Robust intraoperative US probe tracking using a monocular endoscopic camera, in: K. Mori, I. Sakuma, Y. Sato, C. Barillot, N. Navab (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2013, Vol. 8151 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2013, pp. 363–370, http://dx.doi.org/10.1007/978-3-642-40760-4_46.
[8] P. Pratt, A. Marco, C. Payne, A. Darzi, G.-Z. Yang, Intraoperative ultrasound guidance for transanal endoscopic microsurgery, in: N. Ayache, H. Delingette, P. Golland, K. Mori (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2012, Vol. 7510 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2012, pp. 463–470, http://dx.doi.org/10.1007/978-3-642-33415-3_57.
[9] J. West, C.R. Maurer Jr., Designing optically tracked instruments for image-guided surgery, IEEE Trans. Med. Imaging 23 (5) (2004) 533–545, http://dx.doi.org/10.1109/TMI.2004.825614.
[10] J. Sánchez-Margallo, F. Sánchez-Margallo, J. Pagador, I. Oropesa, M. Lucas, E. Gómez, J. Moreno, Technical evaluation of a third generation optical pose tracker for motion analysis and image-guided surgery, in: K. Drechsler, M. Erdt, M. Linguraru, C. Oyarzun Laura, K. Sharma, R. Shekhar, S. Wesarg (Eds.), Clinical Image-Based Procedures. From Planning to Intervention, Vol. 7761 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2013, pp. 75–82, http://dx.doi.org/10.1007/978-3-642-38079-2_10.
[11] H. Kato, M. Billinghurst, I. Poupyrev, K. Imamoto, K. Tachibana, Virtual object manipulation on a table-top AR environment, in: IEEE and ACM International Symposium on Augmented Reality, 2000, pp. 111–119, http://dx.doi.org/10.1109/ISAR.2000.880934.
[12] M. Fiala, ARTag, a fiducial marker system using digital techniques, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 2005, pp. 590–596, http://dx.doi.org/10.1109/CVPR.2005.74.
[13] X. Zhang, S. Fronz, N. Navab, Visual marker detection and decoding in AR systems: a comparative study, in: Proceedings of the International Symposium on Mixed and Augmented Reality, 2002, pp. 97–106, http://dx.doi.org/10.1109/ISMAR.2002.1115078.
[14] OpenCV reference: http://opencv.org/
[15] L. Lucchese, S. Mitra, Using saddle points for subpixel feature detection in camera calibration targets, in: Asia-Pacific Conference on Circuits and Systems, Vol. 2, 2002, pp. 191–195, http://dx.doi.org/10.1109/APCCAS.2002.1115151.
[16] S. Bennett, J. Lasenby, ChESS – quick and robust detection of chess-board features, Comput. Vis. Image Understand. 118 (2014) 197–210, http://dx.doi.org/10.1016/j.cviu.2013.10.008.
[17] A. De la Escalera, J.M. Armingol, Automatic chessboard detection for intrinsic and extrinsic camera parameter calibration, Sensors 10 (3) (2010) 2027–2044, http://dx.doi.org/10.3390/s100302027.
[18] W. Sun, X. Yang, S. Xiao, W. Hu, Robust checkerboard recognition for efficient nonplanar geometry registration in projector-camera systems, in: Proceedings of the Fifth ACM/IEEE International Workshop on Projector Camera Systems, PROCAMS '08, ACM, New York, USA, 2008, pp. 2:1–2:7, http://dx.doi.org/10.1145/1394622.1394625.
[19] J. Xia, J. Long Xiong, X. Xu, H. Qin, A multiscale sub-pixel detector for corners in camera calibration targets, in: International Conference on Intelligent Computation Technology and Automation (ICICTA), Vol. 1, 2010, pp. 196–199, http://dx.doi.org/10.1109/ICICTA.2010.429.
[20] A. Huertas, G. Medioni, Detection of intensity changes with subpixel accuracy using Laplacian–Gaussian masks, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8 (5) (1986) 651–664, http://dx.doi.org/10.1109/TPAMI.1986.4767838.
[21] H. Zhou, A.M. Wallace, P.R. Green, Efficient tracking and ego-motion recovery using gait analysis, Signal Process. 89 (12) (2009) 2367–2384, http://dx.doi.org/10.1016/j.sigpro.2009.04.010 (Special Section: Visual Information Analysis for Security).
[22] H. Zhou, X. Li, A. Sadka, Nonrigid structure-from-motion from 2-D images using Markov chain Monte Carlo, IEEE Trans. Multimedia 14 (1) (2012) 168–177, http://dx.doi.org/10.1109/TMM.2011.2170406.
[23] W. Zhou, M. Fei, H. Zhou, K. Li, A sparse representation based fast detection method for surface defect detection of bottle caps, Neurocomputing 123 (2014) 406–414, http://dx.doi.org/10.1016/j.neucom.2013.07.038.
[24] MVTec Halcon: http://www.halcon.com/
[25] S. Suzuki, K. Abe, Topological structural analysis of digitized binary images by border following, Comput. Vis. Graph. Image Process. 30 (1) (1985) 32–46, http://dx.doi.org/10.1016/0734-189X(85)90016-7.
[26] Y. Nomura, M. Sagara, H. Naruse, A. Ide, Edge location to subpixel precision and analysis, Syst. Comput. Japan 22 (9) (1991) 70–81, http://dx.doi.org/10.1002/scj.4690220908.
[27] E. Lyvers, O. Mitchell, M. Akey, A. Reeves, Subpixel measurements using a moment-based edge operator, IEEE Trans. Pattern Anal. Mach. Intell. 11 (12) (1989) 1293–1309, http://dx.doi.org/10.1109/34.41367.
[28] Y.-D. Qu, C.-S. Cui, S.-B. Chen, J.-Q. Li, A fast subpixel edge detection method using Sobel–Zernike moments operator, Image Vis. Comput. 23 (1) (2005) 11–17, http://dx.doi.org/10.1016/j.imavis.2004.07.003.
[29] F. Da, H. Zhang, Sub-pixel edge detection based on an improved moment, Image Vis. Comput. 28 (12) (2010) 1645–1658, http://dx.doi.org/10.1016/j.imavis.2010.05.003.
[30] F. Bouchara, S. Ramdani, Subpixel edge refinement using deformable models, J. Opt. Soc. Am. A: Opt. Image Sci. 26 (4) (2009) 820–832, http://dx.doi.org/10.1364/JOSAA.26.000820.
[31] V. Berzins, Accuracy of Laplacian edge detectors, Comput. Vis. Graph. Image Process. 27 (2) (1984) 195–210, http://dx.doi.org/10.1016/S0734-189X(84)80043-2.
[32] R. Deriche, Recursively Implementating the Gaussian and its Derivatives, Rapport de recherche RR-1893, INRIA, 1993.
[33] L. van Vliet, I. Young, P. Verbeek, Recursive Gaussian derivative filters, in: Proceedings of the Fourteenth International Conference on Pattern Recognition, Vol. 1, 1998, pp. 509–514, http://dx.doi.org/10.1109/ICPR.1998.711192.
[34] S. Tan, J.L. Dale, A. Johnston, Performance of three recursive algorithms for fast space-variant Gaussian filtering, Real-Time Imaging 9 (3) (2003) 215–228, http://dx.doi.org/10.1016/S1077-2014(03)00040-8.
[35] B.K.P. Horn, Closed-form solution of absolute orientation using unit quaternions, J. Opt. Soc. Am. A: Opt. Image Sci. 4 (4) (1987) 629–642, http://dx.doi.org/10.1364/JOSAA.4.000629.
[36] J. Wang, H. Suenaga, L. Yang, H. Liao, E. Kobayashi, T. Takato, I. Sakuma, Real-time marker-free patient registration and image-based navigation using stereovision for dental surgery, in: H. Liao, C. Linte, K. Masamune, T. Peters, G. Zheng (Eds.), Augmented Reality Environments for Medical Imaging and Computer-Assisted Interventions, Vol. 8090 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2013, pp. 9–18, http://dx.doi.org/10.1007/978-3-642-40843-4_2.
[37] J. Wang, H. Suenaga, K. Hoshi, L. Yang, E. Kobayashi, I. Sakuma, H. Liao, Augmented reality navigation with automatic marker-free image registration using 3-D image overlay for dental surgery, IEEE Trans. Biomed. Eng. 61 (4) (2014) 1295–1304, http://dx.doi.org/10.1109/TBME.2014.2301191.