Biomedical Signal Processing and Control 15 (2015) 49–59
Coarse-to-fine dot array marker detection with accurate edge localization for stereo visual tracking

Junchen Wang*, Etsuko Kobayashi, Ichiro Sakuma
Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
Article history: Received 30 July 2014; received in revised form 5 September 2014; accepted 23 September 2014.

Keywords: Marker detection; Dot feature; Sub-pixel edge localization; Image filtering; Stereo tracking
Abstract

We present a coarse-to-fine dot array marker detection algorithm which can extract dot features with high accuracy and low uncertainty. The contribution of this paper is twofold: one is a configurable dot array marker detection framework which enables real-time multi-marker tracking with a compact marker size (coarse detection); the other is a closed-form sub-pixel edge localization method, including its formulation and implementation (fine localization). For efficiency, the marker pattern together with the dot contours is first detected in a fast but coarse way, using simple thresholding and hierarchical contour analysis. If the marker pattern matches one of the predefined marker descriptors, sub-pixel edge point localization of the dot contours is performed within the detected marker region by searching for zero-crossings in the convolution of the marker image with a Laplacian-of-Gaussian (LoG) kernel. A closed-form solution is proposed to localize the "true" edge point in a 3 × 3 neighborhood of a candidate pixel by solving a quartic equation. The dot centers are finally extracted by ellipse fitting and re-ordered according to an orientation indicator. The algorithm was evaluated against both synthetic and real image data, and also in real applications where stereo visual trackers were implemented using the proposed marker detection algorithm. Experimental results show that (1) the marker detection algorithm yielded a feature detection error of less than 0.1 pixel with real-time performance; (2) the uncertainties in both localizing static 2-D dot features and 3-D pose tracking were markedly reduced by performing the sub-pixel localization; and (3) the feasibility of the marker tracking under stereo laparoscopic views was confirmed in an in vivo animal experiment.
1. Introduction

1.1. Background

Stereo visual tracking employs a calibrated stereo camera to localize a well-defined visual pattern (i.e., a visual marker) in real time with six degrees of freedom (DOF). The tracking algorithms follow a general framework: (1) detect features of the visual marker in the stereo images and match them over the image pair, (2) retrieve the features' 3-D information by triangulation, and (3) fit the marker's geometry to obtain the rotation and translation. The salient features encoded in a visual marker are usually corners and edges; therefore, a visual marker commonly consists of square and/or dot patterns. Compared with a square, which provides four corners, a dot carries less information (only its center). However, the dot center is retrieved by fitting (many) edge
points on dot’s contour (ellipse), and this can reduce the localization uncertainty incurred by image sensor noise. Of the visual markers, planar markers are most frequently used due to the ease of manufacture (printout), wide viewing angle, and minimum space occupation. By virtual of the current printing technology, a printed marker has enough geometric accuracy for most applications. Other than stereo visual tracking, a visual marker can also be used for camera calibration [1,2], (single camera) pose estimation [3] and some high level application such as augmented reality (AR) [4] and visual servoing [5]. Stereo visual tracking can be considered as a special case of pose estimation by introducing more constraints due to a second camera, which should be more accurate than single camera pose estimation under the same configuration. The essence of a visual marker in these computer vision tasks is to provide 3-D/2-D correspondences for solving a linear or nonlinear minimization problem. The first step relating to the visual marker is marker detection, i.e., extracting the encoded features and making them understandable. The accuracy and uncertainty of feature extraction will largely influence the final output according to error propagation. Take the stereo visual tracking as an example, 3-D
triangulation error of a feature in the depth direction is sensitive to the feature disparity, especially when the baseline is small or the marker is far from the camera. The uncertainty of localizing 2-D features is amplified through the triangulation procedure, resulting in large uncertainty in determining the depth. If the stereo tracking results are used to visualize a tracked object in a virtual scene (e.g., in navigation systems), large tracking uncertainty leads to obvious random fluctuation of the displayed object even if it actually remains static. In the context of marker-based surgical instrument tracking under endoscopic view in minimally invasive surgery (MIS), a visual marker is attached to the surgical instrument (e.g., a forceps manipulator [6] or an endoscopic ultrasound probe [7,8]) whose pose with respect to the endoscope inside the body is estimated by either PnP techniques (for a monocular endoscope) or stereo tracking (for a stereo endoscope). Limited by the small operative field inside the body, the visual marker is restricted to a small size, which deteriorates the accuracy and increases the uncertainty at a target point (e.g., the instrument tip) far from the marker features [9,10]. Additionally, the baseline of a stereo endoscope is typically small (several millimeters), which makes the triangulation procedure sensitive to image noise. Eventually, a small uncertainty in localizing feature points may cause relatively large uncertainty in the depth localization, which further results in large angular uncertainties around the marker plane. One improvement of visual marker detection is therefore to increase the accuracy and reduce the uncertainty of feature localization in the presence of noise. For the tracking task, the detection time is also an important issue that has to be considered.

1.2. Related work

There are several open source visual marker systems available. ARToolkit is an open source C/C++ library developed many years ago for building augmented reality (AR) applications by tracking a planar AR marker using pose estimation techniques [11]. The AR marker is a planar pattern enclosed by a black rectangular frame on a white background. The corners of the rectangle are extracted by edge line fitting and used for pose estimation. The estimated pose is further used to overlay a virtual object on the video stream to create an AR scene. Inspired by ARToolkit, the ARTag marker system was proposed in 2005 for AR applications, with an improved marker pattern that reduces the false detection rate and the inter-marker confusion rate [12]. ARToolkit and ARTag are the same in the way of pose estimation using four extracted corners but differ in the way of carrying information for marker recognition. Zhang evaluated the performance of several AR marker systems and reported that the standard deviation of localizing a static feature point varied from 0.26 to 0.57 pixel [13]. Assuming the error follows a normal distribution, the variation of locating the same feature point in a static state is more than 1 pixel due to image sensor noise. An uncertainty of 1 pixel is quite large for both pose estimation and stereo
tracking, especially when the marker is small and/or far from the camera. The open source computer vision library OpenCV [14] utilizes a black/white chessboard pattern for camera calibration. The corners of the chessboard pattern (X corners) are detected at pixel level using quadrilateral detection, and then iteratively optimized toward the saddle point at sub-pixel level [15]. Other chessboard corner detection methods can also be found elsewhere [16–19]. We noticed that more attention has been paid to chessboard pattern detection, probably because of the ease of detection and sub-pixel localization of an X corner. However, the uncertainty of such sub-pixel corner localization is larger than that of dot center extraction using ellipse fitting, because the edge points on the dot's contour can first be localized at sub-pixel level and then fitted with an ellipse whose center is taken as the dot center; the ellipse fitting further reduces the localization uncertainty. OpenCV also supports dot grid feature detection; however, it detects the dot contour at pixel level, which results in a relatively large uncertainty in localizing the dot center. Therefore, a compact marker and a corresponding detection algorithm with high accuracy and low uncertainty of feature localization are needed. This paper presents a coarse-to-fine dot array marker detection algorithm with accurate edge point localization. For efficiency, the marker is detected in a cascaded way. The contour of each dot is detected at pixel level in a fast way, and then sub-pixel edge point localization is performed by searching for zero-crossings in the convolution of the image with a Laplacian-of-Gaussian (LoG) kernel [20]. The dot center is finally extracted by ellipse fitting and re-ordered according to an orientation indicator. The contribution of this paper is twofold: a compact configurable dot array marker detection framework which enables multi-marker tracking (coarse detection); and a closed-form sub-pixel edge localization method including the formulation and the implementation (fine localization). The sub-pixel edge localization method could also be used in related vision tasks [21–23].

2. Marker detection

2.1. Dot array marker

Our dot array marker is inspired by the camera calibration pattern used by the commercial machine vision software MVTec Halcon [24]. The dot array marker is an m × n dot matrix enclosed by a rectangular frame as shown in Fig. 1. The origin of the marker is located at the center of the rectangular frame. A solid triangle at one corner serves as an orientation indicator distinguishing the x and y directions. A dot array marker is hence characterized by (m, n, d, c), where d is the dot spacing and c is a binary code with m × n bits whose values indicate the presence (1) or absence (0) of the dot at the corresponding position. Let N denote the actual number of dots (the number of 1s in the binary code), which is less than or equal to mn. The width of the border is d/4,
Fig. 1. Dot array markers and their descriptors.
Fig. 2. Marker detection: (a) original intensity image, (b) binary image after thresholding, (c) detected contours, and (d) detected markers.
the diameter of each dot is d/2, and the total size of the marker is (n + 1)d × (m + 1)d (width × height). We call d_p = (m, n, d, c) the marker descriptor. Given d_p, the center coordinates of the dots in row-major order are determined, denoted by {x_i}_1^N. In addition, the coordinates of the four inner corners, starting from the orientation indicator in counterclockwise order, can also be determined, denoted by {P_i}_1^4 as shown in Fig. 1. Therefore, a marker descriptor discriminatively represents the geometry of a marker on the marker plane. Note that {P_i}_1^4 are only used to facilitate marker detection, as described later; they do not serve as feature points, so they do not need sub-pixel localization.

2.2. Detection algorithm

Given a marker descriptor d_p and an intensity image I, marker detection is to detect the presence of the marker in I and to extract the corresponding (ordered) feature points {x′_i}_1^N on the image plane. The skeleton of the detection algorithm is as follows. First of all, the image I is converted to a binary image I_b using a simple threshold. All the contours (black/white borders) {C_k}_1^M of I_b together with their topological structures {H_k}_1^M (M is the number of detected contours) are extracted using the algorithm proposed by Suzuki [25]. A contour C_k is a collection of connected border pixels. Its corresponding topological structure H_k is a quadruple where H_k[0], H_k[1], H_k[2], H_k[3] represent the indices of C_k's next and previous neighboring contours (contours at the same hierarchy level), its first child contour (inner contours at the next hierarchy level), and its parent contour (the outer contour at the previous hierarchy level), respectively (−1 indicates nonexistence). With the topological structure, we can easily traverse every contour and access its parent, neighbors and children. Next, a cascade scheme is used to find the candidate marker pattern from {C_k}_1^M. The candidate pattern is a five-sided polygon containing N ellipse contours. We denote the candidate pattern by {P′_a, P′_b, P′_2, P′_3, P′_4, {E_i}_1^N}, where {P′_a, P′_b, P′_2, P′_3, P′_4} are the five counterclockwise vertices (homogeneous coordinates) of the polygon with P′_aP′_b being the shortest side, and {E_i}_1^N are the N ellipse contours (without order) inside the polygon. Then, P′_1 is calculated as the intersection
Fig. 3. 2-D plot and 1-D profile of ∇²G(x, y) with σ = 1.
of P′_2P′_b and P′_4P′_a: P′_1 = (P′_2 × P′_b) × (P′_4 × P′_a). A homography H between the marker plane and the image plane is estimated using {P_i}_1^4 and {P′_i}_1^4. After that, the centers of {E_i}_1^N are transformed to the marker plane using H and permuted into row-major order. Finally, the binary code implied by the ordered {E_i}_1^N is calculated and compared with the descriptor d_p. If they match, sub-pixel localization and ellipse fitting are performed on {E_i}_1^N to obtain {x′_i}_1^N; otherwise, 0 is returned, indicating that no marker was detected. The pseudo code of the algorithm is given in Algorithm 1. Fig. 2 shows the intermediate results of the marker detection.
Algorithm 1. Marker detection algorithm.
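To make the coarse stage concrete, the following is a minimal sketch of its skeleton in C++/OpenCV (our implementation uses OpenCV; see Section 5). The descriptor struct and the function name are illustrative only, and the homography-based code matching, dot ordering and comparison against d_p are omitted:

    // Sketch of the coarse detection stage of Algorithm 1 (C++/OpenCV).
    #include <opencv2/opencv.hpp>
    #include <vector>

    struct MarkerDescriptor {          // d_p = (m, n, d, c)
        int m, n;                      // grid rows and columns
        double d;                      // dot spacing
        std::vector<int> c;            // m*n presence bits
    };

    // Returns the dot contours found inside a five-sided frame candidate.
    std::vector<std::vector<cv::Point>> coarseDetect(const cv::Mat& gray)
    {
        cv::Mat bin;
        cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

        std::vector<std::vector<cv::Point>> contours;
        std::vector<cv::Vec4i> h;      // [next, prev, firstChild, parent], cf. H_k
        cv::findContours(bin, contours, h, cv::RETR_TREE,
                         cv::CHAIN_APPROX_NONE);        // Suzuki's algorithm [25]

        std::vector<std::vector<cv::Point>> dots;
        for (int i = 0; i < (int)contours.size(); ++i) {
            std::vector<cv::Point> poly;
            cv::approxPolyDP(contours[i], poly,
                             0.02 * cv::arcLength(contours[i], true), true);
            if (poly.size() != 5) continue;             // five-sided frame border
            // Children of the frame border are the candidate ellipse contours E_i.
            for (int j = h[i][2]; j >= 0; j = h[j][0])
                if (contours[j].size() >= 5)            // enough points to fit
                    dots.push_back(contours[j]);
        }
        return dots;
    }

After the refinement described in Section 3, cv::fitEllipse applied to each refined contour yields the dot centers used as feature points.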
3. Edge point localization

In the proposed marker detection algorithm, sub-pixel edge point localization is performed once the marker has been recognized. Because the marker recognition is a coarse procedure designed for time saving, we need to finely localize every edge point on the contour E_i for all i. In addition, disturbed by image sensor noise and motion artifacts, the contours extracted from a thresholded image are not true edges; however, they are supposed to be close to the true edges. Therefore, it is necessary to exploit the original intensity image to localize the "true" edges.

3.1. Laplacian-of-Gaussian filter

There are many studies regarding sub-pixel edge feature detection, which fall into four categories: gradient operators [26], second derivative operators [20], moment operators [27–29] and energy minimization [30]. Considering that the localization should be performed in real time, we propose a closed-form sub-pixel localization method by exploiting the image convolved with a Laplacian-of-Gaussian (LoG) filter. The two-dimensional LoG filter has the form

∇²G(x, y, σ) = (1/σ⁴)((x² + y²)/(2σ²) − 1) e^(−(x²+y²)/(2σ²))   (1)

where the parameter σ is the scale of the Gaussian. ∇²G(x, y, σ) is also a radial function:

∇²G(x, y, σ) = ∇²G(r, σ) = (1/σ⁴)(r²/(2σ²) − 1) e^(−r²/(2σ²))   (2)

where r² = x² + y². Fig. 3 shows the 2-D plot and 1-D profile of ∇²G(x, y) with σ = 1, where the width of the central excitatory region (i.e., the distance between the two zero-crossings of ∇²G(r, σ), denoted by w) is 2√2σ, and the support (denoted by s) is 8σ. The zero-crossings of the convolution ∇²G(x, y, σ) ∗ I(x, y) respond to the feature "edge" in the image I(x, y). The scale parameter σ determines the resolution of edge detection; a small σ (i.e., a small w) makes the response sensitive to noise. Regarding the accuracy of the LoG filter, readers are referred to [20,31] for more details.
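For reference, the kernel of Eq. (1) can be sampled directly as below (a sketch; the overall constant factor is immaterial since only the zero-crossings of the response are used, and Sections 3.3 and 3.4 replace this brute-force form with separable and recursive filtering):

    // Direct sampling of the LoG kernel of Eq. (1), support s = 8*sigma.
    #include <opencv2/opencv.hpp>
    #include <cmath>

    cv::Mat logKernel(double sigma)
    {
        int half = (int)std::ceil(4.0 * sigma);
        cv::Mat k(2 * half + 1, 2 * half + 1, CV_64F);
        for (int y = -half; y <= half; ++y)
            for (int x = -half; x <= half; ++x) {
                double r2 = double(x) * x + double(y) * y;
                double s2 = sigma * sigma;
                k.at<double>(y + half, x + half) =
                    (r2 / (2 * s2) - 1.0) * std::exp(-r2 / (2 * s2)) / (s2 * s2);
            }
        return k;   // zero-crossings of filter2D(I, ..., k) mark edges
    }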
3.2. Closed-form sub-pixel localization

In our marker detection case, the 1-D profile of the intensity across the dot contour along the gradient direction can be modeled by a ramp. Fig. 4(a) shows the detected marker region (the region of the polygon P′_1P′_2P′_3P′_4) in the original image. Fig. 4(b) shows one of the dots, where the green points are the coarse edge points detected in the thresholded image, and the arrows indicate the gradient directions. Fig. 4(c) shows the profiles of intensity, gradient magnitude, and Gaussian–Laplacian at one coarse edge point along the gradient direction. Obviously, the coarse edge point is not located on the true edge, where the Gaussian–Laplacian is supposed to be zero. The procedure is then straightforward: first, convolve the original marker image I(x, y) with Gaussian first derivative filters to get the normalized gradient images ∇I_x(x, y) and ∇I_y(x, y) so that cos θ(x, y) = ∇I_x(x, y) and sin θ(x, y) = ∇I_y(x, y), where θ(x, y) is the orientation of the gradient vector at (x, y); then convolve I(x, y) with the LoG filter with an appropriate σ to get the Gaussian–Laplacian image ∇²I(x, y) = ∇²G(x, y, σ) ∗ I(x, y); finally, from every coarse edge point, search along its gradient direction to find the zero-crossing of ∇²I(x, y) as the refined edge point. For the selection of σ, we choose σ so that the support of ∇²G(x, y, σ) approximates the dot diameter in the image. Assume that we start at a coarse edge point (x₀, y₀). Its orientation θ(x₀, y₀) is binned into one of the eight discrete directions shown in Fig. 5(a), and the corresponding neighbor (x₁, y₁) is chosen in the 3 × 3 neighborhood of (x₀, y₀), denoted by Ω(x₀, y₀). We check whether ∇²I(x₀, y₀) · ∇²I(x₁, y₁) ≤ 0; if not, we repeat the above process starting at (x₁, y₁) until we find an (x₁, y₁) such that ∇²I(x₀, y₀) · ∇²I(x₁, y₁) ≤ 0. We can then be sure there are zero-crossings in Ω(x₀, y₀). To further localize the zero-crossing precisely, a bivariate polynomial is used to approximate ∇²I(x, y) in Ω(x₀, y₀):

∇²I_{x₀y₀}(x, y) := ∇²I(x₀ + x, y₀ + y) ≈ c₀x²y² + c₁x²y + c₂xy² + c₃x² + c₄y² + c₅xy + c₆x + c₇y + c₈   (3)
Fig. 4. (a) Detected marker region. (b) Detected coarse edge points in the thresholded image. (c) Profiles of intensity, gradient magnitude and Gaussian–Laplacian along the gradient direction.
where x, y ∈ [−1, 1]. There are nine coefficients {c_i}_0^8, and each grid point in Ω(x₀, y₀) imposes a linear constraint; therefore, {c_i}_0^8 can be easily obtained by solving a 9 × 9 linear system. Fig. 5(b) shows the 3-D plot of the determined polynomial around (x₀, y₀), and Fig. 5(c) shows its contour lines for different values. As can be seen, the zero-crossing curve with function value 0 has the form

c₀x²y² + c₁x²y + c₂xy² + c₃x² + c₄y² + c₅xy + c₆x + c₇y + c₈ = 0   (4)

Eq. (4) represents the "true" sub-pixel edge curve in Ω(x₀, y₀). It is not necessary to obtain all the points on the curve; we just pick the one point on the curve that has the same orientation as θ(x₀, y₀). Therefore, the points with orientation θ₀ = θ(x₀, y₀) in Ω(x₀, y₀) are parameterized by (t cos θ₀, t sin θ₀), where t is the parameter.
Substituting (t cos θ₀, t sin θ₀) for (x, y) in (4), we obtain

c₀cos²θ₀ sin²θ₀ t⁴ + (c₁cos²θ₀ sinθ₀ + c₂cosθ₀ sin²θ₀) t³ + (c₃cos²θ₀ + c₄sin²θ₀ + c₅sinθ₀ cosθ₀) t² + (c₆cosθ₀ + c₇sinθ₀) t + c₈ = 0   (5)

which is a quartic equation in t. Eq. (5) can be solved using an algebraic technique (in closed form), and the real solution t̂ with the smallest absolute value is used to calculate the sub-pixel edge point:

x̂ = x₀ + t̂ cos θ₀   (6)
ŷ = y₀ + t̂ sin θ₀   (7)
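The refinement of one coarse edge point thus condenses into a short routine: build the 9 × 9 system of Eq. (3), form the quartic coefficients of Eq. (5), and keep the real root of smallest magnitude. The sketch below assumes the Gaussian–Laplacian image is available as a CV_64F matrix and uses cv::solvePoly (coefficients in ascending powers); a production version would also guard against a vanishing leading coefficient when θ₀ is axis-aligned:

    // Closed-form refinement of one coarse edge point (Eqs. (3)-(7)).
    // log2I: Gaussian-Laplacian image (CV_64F); p0: coarse edge pixel;
    // theta0: binned gradient orientation.
    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <cmath>
    #include <vector>

    cv::Point2d refineEdge(const cv::Mat& log2I, cv::Point p0, double theta0)
    {
        // Nine linear constraints, one per grid point of the 3x3 neighborhood.
        cv::Mat A(9, 9, CV_64F), rhs(9, 1, CV_64F);
        int row = 0;
        for (int y = -1; y <= 1; ++y)
            for (int x = -1; x <= 1; ++x, ++row) {
                double m[9] = { double(x*x*y*y), double(x*x*y), double(x*y*y),
                                double(x*x), double(y*y), double(x*y),
                                double(x), double(y), 1.0 };
                std::copy(m, m + 9, A.ptr<double>(row));
                rhs.at<double>(row) = log2I.at<double>(p0.y + y, p0.x + x);
            }
        cv::Mat c;                                  // c_0..c_8 of Eq. (3)
        cv::solve(A, rhs, c);

        // Quartic of Eq. (5); cv::solvePoly expects ascending powers.
        double ct = std::cos(theta0), st = std::sin(theta0);
        std::vector<double> q = {
            c.at<double>(8),                                       // t^0
            c.at<double>(6)*ct + c.at<double>(7)*st,               // t^1
            c.at<double>(3)*ct*ct + c.at<double>(4)*st*st
                + c.at<double>(5)*st*ct,                           // t^2
            c.at<double>(1)*ct*ct*st + c.at<double>(2)*ct*st*st,   // t^3
            c.at<double>(0)*ct*ct*st*st };                         // t^4
        cv::Mat roots;
        cv::solvePoly(q, roots);

        // Real root with the smallest absolute value (Eqs. (6)-(7)).
        double tBest = 0.0, best = 1e12;
        for (int i = 0; i < roots.rows; ++i) {
            cv::Vec2d r = roots.at<cv::Vec2d>(i);
            if (std::abs(r[1]) < 1e-8 && std::abs(r[0]) < best) {
                best = std::abs(r[0]);
                tBest = r[0];
            }
        }
        return cv::Point2d(p0.x + tBest * ct, p0.y + tBest * st);
    }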
3.3. Separable Gaussian

Until now, we have solved the problem theoretically. This section deals with implementation issues concerning computational efficiency. 2-D convolution is time consuming: an s × s kernel requires s² multiplications and additions per pixel. Recall that s = 8σ, so typically 64σ² operations are required per pixel, and the total computation is 64MNσ² for filtering an M × N image. For real-time computing, we reduce the computation first by separating the 2-D convolution into two 1-D convolutions, and then by making the computation independent of σ using recursive filtering. The 1-D Gaussian together with its first and second derivatives is as follows:

g(x) = e^(−x²/(2σ²))   (8)
g′(x) = −(x/σ²) e^(−x²/(2σ²))   (9)
g″(x) = (x²/σ⁴ − 1/σ²) e^(−x²/(2σ²))   (10)

The 2-D Gaussian can be written as G(x, y) = g(x)g(y), and its first derivatives are

G_x(x, y) = g′(x)g(y)   (11)
G_y(x, y) = g′(y)g(x)   (12)

Therefore

∇I_x(x, y) = ∂/∂x (G(x, y) ∗ I(x, y)) = G_x(x, y) ∗ I(x, y) = g′(x)g(y) ∗ I(x, y)   (13)
Fig. 5. (a) Ω(x₀, y₀): 3 × 3 neighborhood of (x₀, y₀). (b) ∇²I_{x₀y₀}(x, y): bivariate polynomial approximation. (c) Contour lines of ∇²I_{x₀y₀}(x, y) with different function values.
∇I_y(x, y) = ∂/∂y (G(x, y) ∗ I(x, y)) = G_y(x, y) ∗ I(x, y) = g′(y)g(x) ∗ I(x, y)   (14)
(13) and (14) imply that ∇I_x(x, y) (respectively ∇I_y(x, y)) can be obtained by convolving I(x, y) with a 1-D Gaussian column (row) kernel followed by its first derivative row (column) kernel. According to its definition, the LoG filter can also be written (up to scale) as

∇²G(x, y) = g″(x)g(y) + g″(y)g(x)   (15)

Therefore

∇²I(x, y) = ∇²G(x, y) ∗ I(x, y) = g″(x)g(y) ∗ I(x, y) + g″(y)g(x) ∗ I(x, y)   (16)

(16) implies that ∇²I(x, y) is the sum of two convolutions, each obtained by convolving I(x, y) with a 1-D Gaussian column (row) kernel followed by its second derivative row (column) kernel.
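Before the recursive acceleration of Section 3.4, Eqs. (8)–(16) already map onto standard separable filtering; a minimal sketch using cv::sepFilter2D with sampled FIR kernels:

    // Separable computation of the Gaussian-Laplacian image (Eq. (16)).
    #include <opencv2/opencv.hpp>
    #include <cmath>

    static cv::Mat kernel(double sigma, int deriv)   // g, g', g'' of Eqs. (8)-(10)
    {
        int half = (int)std::ceil(4.0 * sigma);      // support s = 8*sigma
        cv::Mat k(1, 2 * half + 1, CV_64F);
        for (int x = -half; x <= half; ++x) {
            double s2 = sigma * sigma, e = std::exp(-x * x / (2 * s2));
            double v = (deriv == 0) ? e
                     : (deriv == 1) ? -(x / s2) * e
                     :                (x * x / (s2 * s2) - 1.0 / s2) * e;
            k.at<double>(x + half) = v;
        }
        return k;
    }

    cv::Mat gaussianLaplacian(const cv::Mat& I, double sigma)
    {
        cv::Mat gxx, gyy;
        // g''(x)g(y) * I : second-derivative row kernel, Gaussian column kernel
        cv::sepFilter2D(I, gxx, CV_64F, kernel(sigma, 2), kernel(sigma, 0));
        // g''(y)g(x) * I : Gaussian row kernel, second-derivative column kernel
        cv::sepFilter2D(I, gyy, CV_64F, kernel(sigma, 0), kernel(sigma, 2));
        return gxx + gyy;                            // Eq. (16)
    }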
3.4. Recursive filtering

Now we have separated the original 2-D convolution problems into a series of 1-D Gaussian derivative convolutions, and the operations per pixel are reduced from s² to 2s. However, the computational complexity is still directly proportional to the kernel size (s = 8σ). Recursive filtering was proposed to drastically reduce this computational effort [32–34]. The idea of recursive filtering is to approximate an N-order causal finite impulse response (FIR) filter

H(z) = Σ_{k=0}^{N} h(k) z^(−k)   (17)

by an n-order infinite impulse response (IIR) filter:

H_a(z) = (Σ_{k=0}^{m} b(k) z^(−k)) / (1 + Σ_{k=1}^{n} a(k) z^(−k))   (18)

(18) yields an n-order recursive system:

y(i) = Σ_{k=0}^{m} b(k) x(i − k) − Σ_{k=1}^{n} a(k) y(i − k)   (19)

where x(i) is the input sequence and y(i) is the output. The operations per element are therefore reduced from N + 1 to m + n + 1 (usually m, n ≤ 4), which is independent of N. For large N, recursive filtering is very computationally efficient. In our implementation, the causal parts of the Gaussian and its derivatives are approximated by a four-order IIR filter (k ≥ 0):

h_a^+(k) = (a₀ cos(ω₀k/σ) + a₁ sin(ω₀k/σ)) e^(−b₀k/σ) + (c₀ cos(ω₁k/σ) + c₁ sin(ω₁k/σ)) e^(−b₁k/σ)   (20)

whose z transform takes the form

h_a^+(z) = (n_0^+ + n_1^+ z⁻¹ + n_2^+ z⁻² + n_3^+ z⁻³) / (1 + d_1^+ z⁻¹ + d_2^+ z⁻² + d_3^+ z⁻³ + d_4^+ z⁻⁴)   (21)

where {n_i^+}_0^3 and {d_i^+}_1^4 are determined by a₀, a₁, b₀, b₁, c₀, c₁, ω₀, ω₁, σ. The z transform of the anti-causal part h_a^−(z) takes the form

h_a^−(z) = (n_1^− z + n_2^− z² + n_3^− z³ + n_4^− z⁴) / (1 + d_1^− z + d_2^− z² + d_3^− z³ + d_4^− z⁴)   (22)

where {n_i^−}_1^4 and {d_i^−}_1^4 are deduced from {n_i^+}_0^3 and {d_i^+}_1^4 according to the symmetry or antisymmetry of the kernel. To apply recursive Gaussian filtering, (19) is performed from left to right on the sequence x(i) using h_a^+(z) to get y^+(i):

y^+(i) = n_0^+ x(i) + n_1^+ x(i−1) + n_2^+ x(i−2) + n_3^+ x(i−3) − d_1^+ y^+(i−1) − d_2^+ y^+(i−2) − d_3^+ y^+(i−3) − d_4^+ y^+(i−4)   (23)

and then from right to left using h_a^−(z) to get y^−(i):

y^−(i) = n_1^− x(i+1) + n_2^− x(i+2) + n_3^− x(i+3) + n_4^− x(i+4) − d_1^− y^−(i+1) − d_2^− y^−(i+2) − d_3^− y^−(i+3) − d_4^− y^−(i+4)   (24)

The final recursive filtering output is y(i) = y^+(i) + y^−(i). The coefficients in (20) for the Gaussian and its derivatives can be found in [32].
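For a single scan line, the two passes of Eqs. (23) and (24) amount to the routine below (a sketch with zero-padded boundaries; the arrays np, dp, nm, dm stand for {n_i^+}, {d_i^+}, {n_i^−}, {d_i^−}, whose numerical values from [32] are not reproduced here):

    // One forward-backward pass of the four-order recursive filter
    // (Eqs. (23)-(24)) over a single row.
    #include <cstddef>
    #include <vector>

    std::vector<double> recursiveFilter(const std::vector<double>& x,
                                        const double np[4], const double dp[4],
                                        const double nm[4], const double dm[4])
    {
        const std::ptrdiff_t n = (std::ptrdiff_t)x.size();
        std::vector<double> yp(n, 0.0), ym(n, 0.0);

        // Zero-padded accessors for out-of-range indices.
        auto X  = [&](std::ptrdiff_t i) { return (i >= 0 && i < n) ? x[i]  : 0.0; };
        auto YP = [&](std::ptrdiff_t i) { return (i >= 0)          ? yp[i] : 0.0; };
        auto YM = [&](std::ptrdiff_t i) { return (i < n)           ? ym[i] : 0.0; };

        for (std::ptrdiff_t i = 0; i < n; ++i)       // causal pass, Eq. (23)
            yp[i] = np[0]*X(i) + np[1]*X(i-1) + np[2]*X(i-2) + np[3]*X(i-3)
                  - dp[0]*YP(i-1) - dp[1]*YP(i-2) - dp[2]*YP(i-3) - dp[3]*YP(i-4);

        for (std::ptrdiff_t i = n - 1; i >= 0; --i)  // anti-causal pass, Eq. (24)
            ym[i] = nm[0]*X(i+1) + nm[1]*X(i+2) + nm[2]*X(i+3) + nm[3]*X(i+4)
                  - dm[0]*YM(i+1) - dm[1]*YM(i+2) - dm[2]*YM(i+3) - dm[3]*YM(i+4);

        for (std::ptrdiff_t i = 0; i < n; ++i)       // y = y+ + y-
            yp[i] += ym[i];
        return yp;
    }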
Fig. 6. Principle of 3-D triangulation (x–z plane view).
4. Marker pose tracking

If the marker is successfully detected in a stereo image pair (left and right images), 3-D triangulation is performed to reconstruct the 3-D coordinates of each dot center. Assume the stereo camera system has been calibrated and stereo-rectified so that it has a configuration of two parallel-looking cameras with identical intrinsic parameter matrices:

K = [ f 0 c_x ; 0 f c_y ; 0 0 1 ]   (25)

where f is the focal length and (c_x, c_y) is the principal point of the camera. The 3-D coordinates (x, y, z) of a 2-D correspondence (x_l, y_l) and (x_r, y_r) are calculated using the simple geometric relations shown in Fig. 6:

x = (b/2) · (x_l + x_r − 2c_x)/(x_l − x_r),  y = (b/2) · (y_l + y_r − 2c_y)/(x_l − x_r),  z = fb/(x_l − x_r)   (26)

where the subscripts "l" and "r" represent the left and right images, respectively, and b is the baseline of the stereo camera (i.e., the distance between the optical centers of the two cameras). In this way, the 3-D coordinates of all marker dot points in the camera frame are calculated, denoted by x_i, i = 1, 2, . . ., M. They are then registered to the coordinates y_i, i = 1, 2, . . ., M in the marker frame (determined by the marker descriptor) using a point-based registration method [35] to obtain the rotation R and the translation t by minimizing

min_{R,t} Σ_{i=1}^{M} ||y_i − R x_i − t||²   (27)
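The section condenses into the sketch below. Triangulation follows Eq. (26) directly; for the registration step we sketch the SVD-based closed-form solution (Kabsch/Umeyama) in place of the quaternion method of Horn [35] cited above; both minimize Eq. (27):

    // Sketch of Section 4: triangulate dot centers with Eq. (26), then
    // recover (R, t) by a closed-form point registration.
    #include <opencv2/opencv.hpp>
    #include <vector>

    cv::Point3d triangulate(cv::Point2d pl, cv::Point2d pr,
                            double f, cv::Point2d c, double b)
    {
        double disp = pl.x - pr.x;                        // disparity
        return { 0.5 * b * (pl.x + pr.x - 2 * c.x) / disp,
                 0.5 * b * (pl.y + pr.y - 2 * c.y) / disp,
                 f * b / disp };                          // Eq. (26)
    }

    void registerPoints(const std::vector<cv::Point3d>& X,   // camera frame
                        const std::vector<cv::Point3d>& Y,   // marker frame
                        cv::Mat& R, cv::Mat& t)
    {
        const int M = (int)X.size();
        cv::Point3d cx(0,0,0), cy(0,0,0);                 // centroids
        for (int i = 0; i < M; ++i) { cx += X[i]; cy += Y[i]; }
        cx *= 1.0 / M;  cy *= 1.0 / M;

        cv::Mat H = cv::Mat::zeros(3, 3, CV_64F);         // cross-covariance
        for (int i = 0; i < M; ++i) {
            cv::Mat a = (cv::Mat_<double>(3,1) << X[i].x-cx.x, X[i].y-cx.y, X[i].z-cx.z);
            cv::Mat b = (cv::Mat_<double>(3,1) << Y[i].x-cy.x, Y[i].y-cy.y, Y[i].z-cy.z);
            H += a * b.t();
        }
        cv::Mat W, U, Vt;
        cv::SVD::compute(H, W, U, Vt);                    // H = U diag(W) Vt
        cv::Mat D = cv::Mat::eye(3, 3, CV_64F);
        D.at<double>(2,2) = cv::determinant(Vt.t() * U.t());  // reflection guard
        R = Vt.t() * D * U.t();                           // argmin of Eq. (27)
        t = (cv::Mat_<double>(3,1) << cy.x, cy.y, cy.z)
          - R * (cv::Mat_<double>(3,1) << cx.x, cx.y, cx.z);
    }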
Fig. 7. Synthesized marker images and coarse marker detection under different noise levels.
5. Experiments and results

We have implemented the dot array marker detection algorithm and the stereo visual tracker in C++ with the help of OpenCV. Except for the sub-pixel localization, all the operations in Algorithm 1 can be implemented with OpenCV functions. In this section, we evaluate the proposed algorithm on synthetic data and real data, respectively, and then show real applications based on the proposed method.

5.1. Synthetic data evaluation
The advantage of evaluating the algorithm on synthetic data is that the ground truth is known. A 3 × 3 dot array is synthesized on a 640 × 480 image with known coordinates of the dot centers, as shown in Fig. 7. The radius of each dot is 10 pixels and the dot spacing is 40 pixels. White noise with standard deviation σ_n = 0.05k, k = 0, 2, . . ., 16 is added to the synthesized image (the intensity range of the image is [0, 1]), as shown in the first row of Fig. 7. For each σ_n, we repeat adding independent identically distributed (iid)
Fig. 8. (a) Detected marker region and sub-pixel localization. (b) One dot in (a) showing coarse edge points and found fine edge points. (c) Feature detection errors under different noise levels. (d) Motion blurred marker image. (e) One dot in (d) showing coarse edge points and found fine edge points. (f) Feature detection errors under motion blur artifacts with different orientations. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
Fig. 9. Synthesized marker images and the ground truth (green circles) undergoing random perspective distortion. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
noise instances 200 times and run the marker detection algorithm to obtain error statistics. We denote by e_i the error of localizing dot center i, i = 1, . . ., 9, which is the distance root mean square between the detected dot center and the ground truth over the 200 repetitions, and take

e_n = sqrt((1/9) Σ_{i=1}^{9} e_i²)

as the feature detection error (pixel) under noise level σ_n. The algorithm succeeded in detecting the marker in all the cases. The second row of Fig. 7 shows the coarse detection results of the marker under different noise levels. Fig. 8(a) and (b) shows the sub-pixel localization results under noise level σ_n = 0.2, where the red and green points are coarse and fine edge points, respectively. The red ellipses are fitted from the green points, and their centers are extracted as the feature points. Blue arrows show the gradient directions along which the sub-pixel edge points are localized. The contour of the coarse edge points is unsmooth due to the threshold effect on the discrete image. After refining the edge points, the contour becomes smooth and close to an ellipse. Furthermore, the ellipse fitting imposes a shape constraint on the contour to further reduce the localization uncertainty. Fig. 8(c) shows the feature detection error e_n in pixels with respect to the noise level σ_n, where the blue/red curve is with/without sub-pixel localization. We can see that the feature detection error increases as the noise level goes up, and the error with sub-pixel localization is smaller than that without sub-pixel localization. To evaluate the algorithm against motion blur artifacts, the synthesized marker image is convolved with a motion blur filter whose orientation θ_m varies from 0° to 162° with a step of 18°. As a result, we get ten motion-blurred images. For each motion-blurred image, white noise with σ_n = 0.1 is added, and we repeat adding iid noise instances 200 times to obtain the error statistics. Fig. 8(d) shows one of the motion-blurred images with white noise. Fig. 8(e) shows the coarse edge points and the found sub-pixel edge points. Fig. 8(f) plots the feature detection errors in pixels under the motion blur artifacts, where the blue/red curve is with/without sub-pixel
localization. As can be seen, under anisotropic noise or artifacts the error with sub-pixel localization is clearly smaller than that without sub-pixel localization. Because the marker image in the real case suffers from perspective distortion, it is necessary to investigate the accuracy under various perspective distortions. The undistorted synthesized image is warped using a randomly generated 3 × 3 projective matrix. The resulting images are further disturbed by adding white noise with σ_n = 0.1. Fig. 9 shows the synthesized marker images undergoing random perspective distortion together with the ground truth dot centers. Note that the ground truth of the distorted dot centers is still known by applying the projective matrix to the original ground truth. 2000 distorted marker images were generated and the feature points of the 9 dot centers were extracted. The root mean square error of localizing each dot center was calculated from the 2000 samples and is shown in Fig. 10. The feature extraction error with sub-pixel localization is lower than 0.1 pixel.
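The perspective-distortion test can be reproduced along the following lines (a sketch; the corner-jitter range is an illustrative assumption, not the exact parameter used in our evaluation, and the input image is assumed to be CV_32F in [0, 1]):

    // Warp the synthetic marker with a random homography, add white
    // noise, and map the ground truth through the same homography.
    #include <opencv2/opencv.hpp>
    #include <vector>

    cv::Mat distort(const cv::Mat& img, const std::vector<cv::Point2f>& gt,
                    std::vector<cv::Point2f>& gtWarped, cv::RNG& rng)
    {
        // Random homography: jitter the four image corners.
        std::vector<cv::Point2f> src = { {0,0}, {639,0}, {639,479}, {0,479} }, dst;
        for (const auto& p : src)
            dst.emplace_back(p.x + rng.uniform(-60.f, 60.f),
                             p.y + rng.uniform(-60.f, 60.f));
        cv::Mat H = cv::getPerspectiveTransform(src, dst);

        cv::Mat warped, noise(img.size(), CV_32F);
        cv::warpPerspective(img, warped, H, img.size());
        cv::randn(noise, 0.0, 0.1);                 // sigma_n = 0.1
        warped += noise;

        cv::perspectiveTransform(gt, gtWarped, H);  // mapped ground truth
        return warped;
    }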
5.2. Real data evaluation

A USB3.0 CMOS camera (UI-3240CP, IDS Imaging Development Systems GmbH) with a resolution of 1280 × 1024 was used to capture a 3 × 3 dot array marker at a distance of 700 mm. The total size of the marker is 32 mm × 32 mm. 200 images were acquired while keeping the marker static to investigate the uncertainty in localizing the dot centers. Let (x_i, y_i) and (x′_i, y′_i) denote the extracted centers of dot i, i = 1, . . ., 9 with and without sub-pixel localization, respectively. The statistics (standard deviation σ, and range of variation Δ, which is the absolute difference between the maximum and the minimum) of (x_i, y_i) and (x′_i, y′_i), i = 1, . . ., 9 are summarized in Table 1. From the table we can see that the uncertainty (both σ and Δ) of feature extraction without sub-pixel localization is much larger (one order of magnitude) than that with sub-pixel localization.
Fig. 10. Mean square error of dot feature extraction under perspective distortion. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
Table 1
Feature extraction uncertainty (pixel); σ = standard deviation, Δ = range of variation.

Dot index   x_i: σ / Δ       y_i: σ / Δ       x′_i: σ / Δ      y′_i: σ / Δ
1           0.009 / 0.053    0.009 / 0.046    0.065 / 0.317    0.029 / 0.117
2           0.008 / 0.044    0.011 / 0.063    0.032 / 0.168    0.038 / 0.169
3           0.007 / 0.038    0.008 / 0.042    0.028 / 0.166    0.015 / 0.071
4           0.008 / 0.046    0.008 / 0.053    0.033 / 0.187    0.020 / 0.074
5           0.006 / 0.037    0.007 / 0.046    0.052 / 0.205    0.030 / 0.187
6           0.008 / 0.045    0.008 / 0.049    0.050 / 0.304    0.013 / 0.105
7           0.006 / 0.027    0.010 / 0.051    0.017 / 0.141    0.039 / 0.153
8           0.007 / 0.041    0.008 / 0.044    0.023 / 0.112    0.019 / 0.082
9           0.006 / 0.038    0.009 / 0.046    0.039 / 0.188    0.033 / 0.216

Table 2
Uncertainty evaluation with sub-pixel localization.

     x (mm)   y (mm)   z (mm)   α (°)    β (°)    γ (°)
σ    0.001    0.002    0.006    0.047    0.046    0.003
Δ    0.005    0.015    0.040    0.309    0.311    0.020

Table 3
Uncertainty evaluation without sub-pixel localization.

     x (mm)   y (mm)   z (mm)   α (°)    β (°)    γ (°)
σ    0.003    0.002    0.024    0.224    0.184    0.014
Δ    0.018    0.016    0.160    1.856    1.186    0.103
Using the proposed marker detection algorithm, we succeeded in suppressing the feature point variation below 0.1 pixel in localizing static features, which greatly improved the 3-D triangulation precision in our implemented stereo visual tracker.

5.3. Performance in real applications

Based on the proposed marker detection algorithm, a stereo visual tracker consisting of two USB3.0 cameras (UI-3240CP, IDS Imaging Development Systems GmbH) was developed, as shown in Fig. 11(a). The camera has a frame rate of 60 frames per second (fps) at an image resolution of 1280 × 1024. The stereo camera was calibrated and stereo-rectified using a standard technique. The position and orientation of a dot array marker, given its descriptor, can be tracked in real time according to the algorithm described in Section 4. It took approximately 8 ms to detect and extract the marker features with sub-pixel localization from a 1280 × 1024 image on an Intel Core i7-4820K (3.7 GHz) CPU computer. Fig. 11(b) shows the simultaneous tracking of six different markers. To investigate the uncertainty propagation from the feature extraction to the pose estimation, the Euler poses of a static 3 × 3 dot array marker, denoted by (x, y, z, α, β, γ), were measured using the stereo visual tracker at a distance of about 700 mm. 3000 samples were acquired to obtain the standard deviation σ and the variation range Δ of (x, y, z, α, β, γ). For comparison, the statistical results with and without sub-pixel localization are summarized in Tables 2 and 3. As we can see from the tables, the uncertainty of localizing feature points largely influences the 3-D tracking precision in the depth direction (z) and the rotations around the axes in the marker plane (α, β). The uncertainties in determining z, α, β are markedly reduced with the sub-pixel localization. The stereo visual
tracker shown in Fig. 11(a) has already been tested and used in an image-guided surgical navigation system for oral and maxillofacial surgery developed by our group [36,37], yielding a 3-D tracking error of less than 0.3 mm. We also implemented a stereo visual tracker on a stereo laparoscope for endoscopic ultrasound probe tracking. In laparoscopic surgery, the laparoscope and surgical instruments are inserted into the body via two or three small incisions made in the abdomen. Surgeons perform surgery under the guidance of the laparoscopic video. Endoscopic ultrasound is usually used to examine interior tissue which cannot be seen by the laparoscope, for example vessels embedded in the tissue. It is useful to overlay the ultrasound image on the view of the laparoscope to visualize complicated structures, which requires endoscopic ultrasound probe tracking under the laparoscope. Fig. 12(a) and (b) shows the stereo laparoscope (LS101D, Shinko Optical) and the endoscopic ultrasound probe (UST-5550, ALOKA) used in our implementation. A 2 × 11 dot array marker with a dot spacing of 3 mm was adhered to a plastic mount which was further attached to the ultrasound probe. The feasibility of the proposed marker detection algorithm for surgical instrument tracking inside the body was confirmed in an in vivo animal experiment. Fig. 12(c) and (d) shows the marker detection results under stereo laparoscopic views inside the abdomen of a pig. Fig. 12(e) shows the detected coarse and fine edge points (red and green contours). Fig. 12(f) shows a magnified view of one dot indicated by the purple rectangle, from which we can see that the coarse contour (red points) suffers a serious threshold effect, while the fine contour (green points) is clearly improved by the sub-pixel localization. Because we do not know the true feature locations, we compared the residuals of the homography estimation (minimizing geometric error) between the marker plane and the image plane using
Fig. 11. (a) Physical setup of the stereo visual tracker. (b) Simultaneous tracking of six markers.
Fig. 12. (a) Stereo laparoscope. (b) Endoscopic ultrasound probe. (c) and (d) In vivo marker detection on a stereo pair. (e) Coarse (red) and fine (green) edge points. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
the extracted 21 dot centers with and without the sub-pixel localization. The former and latter residuals were 0.39 pixel and 0.42 pixel, respectively. Both can be considered highly accurate, although the residual with sub-pixel localization is slightly smaller than that without.

6. Discussion and conclusion

In this paper, a coarse-to-fine marker detection algorithm with sub-pixel edge localization is presented. The dot array pattern is detected and matched to a predefined descriptor in a fast way using a simple threshold and hierarchical contour analysis. The resulting dot contours are coarse edge points which are supposed to be close to the "true" edges. If the marker has been successfully matched with one of the predefined descriptors, the marker region (a bounding box containing only the marker pattern) is marked as a region of interest (ROI), within which the coarse edge points are refined by first finding a nearby "zero-crossing pixel" (whose 3 × 3 neighborhood contains zero-crossings of the Gaussian–Laplacian of the image) and then solving a quartic equation to obtain the exact sub-pixel edge points. An ellipse is finally fitted to the sub-pixel edge points, and its center is regarded as the dot center. The proposed algorithm was evaluated against both synthetic and real image data, and also in real applications where stereo trackers were implemented based on the proposed marker detection algorithm. The synthetic data evaluation aimed to test the accuracy of the algorithm against different noise levels, motion blurs, and various perspective distortions. The algorithm yielded a feature detection error of less than 0.1 pixel (up to noise level
σ_n ≤ 0.35) with real-time performance. In the evaluation on real image data, the precision (uncertainty) in localizing static dot features was measured. The results show that the uncertainty was markedly reduced by performing the sub-pixel localization. To further demonstrate how the uncertainty in localizing 2-D dot features propagates to the 3-D pose estimation of the marker, a stereo visual tracker composed of two CMOS cameras was implemented and its precision in tracking a static marker was evaluated. The results show that the proposed sub-pixel localization method significantly suppressed the 3-D pose tracking uncertainty in the depth direction (a decrease of 75% in z) and the rotations (decreases of 79%, 75%, and 79% in α, β, γ, respectively). For the application to surgical instrument tracking, we confirmed the feasibility of the marker tracking under stereo laparoscopic views in an in vivo animal experiment. The residual of estimating the homography between the image plane and the marker plane was 0.39 pixel, which suggests highly accurate model fitting. The reason why the residual errors with and without sub-pixel localization did not show much difference can be explained as follows. The residual error reflects the accuracy of the model fitting, rather than the absolute accuracy of feature detection (an offset applied to one feature set will not change the residual error). In addition, the 2-D homography is a plane-to-plane transformation, whose estimation is less sensitive to 2-D feature localization error compared with 3-D triangulation in stereo tracking. Furthermore, the number of points for homography estimation was large (21 points), which further reduced the residual error and made the difference between with and without sub-pixel localization small. Regarding the influence caused by illumination variation or specular reflection, sometimes a simple threshold cannot coarsely
segment out the dot marker. In such cases, the threshold operation in Algorithm 1 can easily be replaced with a Sobel operator to extract a binary edge image at pixel level, and the rest of the pipeline remains the same. The Sobel operator is a 3 × 3 separable kernel which provides a fast (but coarse) way to detect intensity changes. In this way, we look at the relative intensity change instead of the absolute intensity to coarsely segment out the marker pattern for robustness, at the cost of slightly increased computation. Regarding the recursive filtering, the operation count per pixel is 16 because we use four-order IIR filters to approximate the original Gaussian and its derivatives. This computation equals that of a 16-length Gaussian kernel with scale σ = 2, which means the recursive filtering reduces the computation only for σ > 2. For σ ≤ 2, Gaussian kernels are directly convolved with the image.

Conflict of interest

None.

References

[1] Z. Zhang, A flexible new technique for camera calibration, IEEE Trans. Pattern Anal. Mach. Intell. 22 (11) (2000) 1330–1334, http://dx.doi.org/10.1109/34.888718.
[2] J. Helferty, C. Zhang, G. McLennan, W. Higgins, Videoendoscopic distortion correction and its application to virtual guidance of endoscopy, IEEE Trans. Med. Imaging 20 (7) (2001) 605–617, http://dx.doi.org/10.1109/42.932745.
[3] V. Lepetit, F. Moreno-Noguer, P. Fua, EPnP: an accurate O(n) solution to the PnP problem, Int. J. Comput. Vis. 81 (2) (2009) 155–166, http://dx.doi.org/10.1007/s11263-008-0152-6.
[4] F. Zhou, H.B.-L. Duh, M. Billinghurst, Trends in augmented reality tracking, interaction and display: a review of ten years of ISMAR, in: Proceedings of the Seventh IEEE/ACM International Symposium on Mixed and Augmented Reality, ISMAR '08, IEEE Computer Society, Washington, DC, USA, 2008, pp. 193–202, http://dx.doi.org/10.1109/ISMAR.2008.4637362.
[5] B. Espiau, F. Chaumette, P. Rives, A new approach to visual servoing in robotics, IEEE Trans. Robot. Autom. 8 (3) (1992) 313–326, http://dx.doi.org/10.1109/70.143350.
[6] F. Nageotte, P. Zanne, C. Doignon, M. de Mathelin, Visual servoing-based endoscopic path following for robot-assisted laparoscopic surgery, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 2364–2369, http://dx.doi.org/10.1109/IROS.2006.282647.
[7] U. Jayarathne, A. McLeod, T. Peters, E. Chen, Robust intraoperative US probe tracking using a monocular endoscopic camera, in: K. Mori, I. Sakuma, Y. Sato, C. Barillot, N. Navab (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2013, Vol. 8151 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2013, pp. 363–370, http://dx.doi.org/10.1007/978-3-642-40760-4_46.
[8] P. Pratt, A. Marco, C. Payne, A. Darzi, G.-Z. Yang, Intraoperative ultrasound guidance for transanal endoscopic microsurgery, in: N. Ayache, H. Delingette, P. Golland, K. Mori (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2012, Vol. 7510 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2012, pp. 463–470, http://dx.doi.org/10.1007/978-3-642-33415-3_57.
[9] J. West, C.R. Maurer Jr., Designing optically tracked instruments for image-guided surgery, IEEE Trans. Med. Imaging 23 (5) (2004) 533–545, http://dx.doi.org/10.1109/TMI.2004.825614.
[10] J. Sánchez-Margallo, F. Sánchez-Margallo, J. Pagador, I. Oropesa, M. Lucas, E. Gómez, J. Moreno, Technical evaluation of a third generation optical pose tracker for motion analysis and image-guided surgery, in: K. Drechsler, M. Erdt, M. Linguraru, C. Oyarzun Laura, K. Sharma, R. Shekhar, S. Wesarg (Eds.), Clinical Image-Based Procedures. From Planning to Intervention, Vol. 7761 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2013, pp. 75–82, http://dx.doi.org/10.1007/978-3-642-38079-2_10.
[11] H. Kato, M. Billinghurst, I. Poupyrev, K. Imamoto, K. Tachibana, Virtual object manipulation on a table-top AR environment, in: IEEE and ACM International Symposium on Augmented Reality, 2000, pp. 111–119, http://dx.doi.org/10.1109/ISAR.2000.880934.
[12] M. Fiala, ARTag, a fiducial marker system using digital techniques, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 2005, pp. 590–596, http://dx.doi.org/10.1109/CVPR.2005.74.
[13] X. Zhang, S. Fronz, N. Navab, Visual marker detection and decoding in AR systems: a comparative study, in: Proceedings of the International Symposium on Mixed and Augmented Reality, 2002, pp. 97–106, http://dx.doi.org/10.1109/ISMAR.2002.1115078.
[14] OpenCV reference: http://opencv.org/
[15] L. Lucchese, S. Mitra, Using saddle points for subpixel feature detection in camera calibration targets, in: Asia-Pacific Conference on Circuits and Systems, Vol. 2, 2002, pp. 191–195, http://dx.doi.org/10.1109/APCCAS.2002.1115151.
[16] S. Bennett, J. Lasenby, ChESS – quick and robust detection of chess-board features, Comput. Vis. Image Understand. 118 (2014) 197–210, http://dx.doi.org/10.1016/j.cviu.2013.10.008.
[17] A. De la Escalera, J.M. Armingol, Automatic chessboard detection for intrinsic and extrinsic camera parameter calibration, Sensors 10 (3) (2010) 2027–2044, http://dx.doi.org/10.3390/s100302027.
[18] W. Sun, X. Yang, S. Xiao, W. Hu, Robust checkerboard recognition for efficient nonplanar geometry registration in projector-camera systems, in: Proceedings of the Fifth ACM/IEEE International Workshop on Projector Camera Systems, PROCAMS '08, ACM, New York, USA, 2008, pp. 2:1–2:7, http://dx.doi.org/10.1145/1394622.1394625.
[19] J. Xia, J. Long Xiong, X. Xu, H. Qin, A multiscale sub-pixel detector for corners in camera calibration targets, in: International Conference on Intelligent Computation Technology and Automation (ICICTA), Vol. 1, 2010, pp. 196–199, http://dx.doi.org/10.1109/ICICTA.2010.429.
[20] A. Huertas, G. Medioni, Detection of intensity changes with subpixel accuracy using Laplacian–Gaussian masks, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8 (5) (1986) 651–664, http://dx.doi.org/10.1109/TPAMI.1986.4767838.
[21] H. Zhou, A.M. Wallace, P.R. Green, Efficient tracking and ego-motion recovery using gait analysis, Signal Process. 89 (12) (2009) 2367–2384, http://dx.doi.org/10.1016/j.sigpro.2009.04.010 (Special Section: Visual Information Analysis for Security).
[22] H. Zhou, X. Li, A. Sadka, Nonrigid structure-from-motion from 2-D images using Markov chain Monte Carlo, IEEE Trans. Multimedia 14 (1) (2012) 168–177, http://dx.doi.org/10.1109/TMM.2011.2170406.
[23] W. Zhou, M. Fei, H. Zhou, K. Li, A sparse representation based fast detection method for surface defect detection of bottle caps, Neurocomputing 123 (2014) 406–414, http://dx.doi.org/10.1016/j.neucom.2013.07.038.
[24] MVTec Halcon: http://www.halcon.com/
[25] S. Suzuki, K. Abe, Topological structural analysis of digitized binary images by border following, Comput. Vis. Graph. Image Process. 30 (1) (1985) 32–46, http://dx.doi.org/10.1016/0734-189X(85)90016-7.
[26] Y. Nomura, M. Sagara, H. Naruse, A. Ide, Edge location to subpixel precision and analysis, Syst. Comput. Japan 22 (9) (1991) 70–81, http://dx.doi.org/10.1002/scj.4690220908.
[27] E. Lyvers, O. Mitchell, M. Akey, A. Reeves, Subpixel measurements using a moment-based edge operator, IEEE Trans. Pattern Anal. Mach. Intell. 11 (12) (1989) 1293–1309, http://dx.doi.org/10.1109/34.41367.
[28] Y.-D. Qu, C.-S. Cui, S.-B. Chen, J.-Q. Li, A fast subpixel edge detection method using Sobel–Zernike moments operator, Image Vis. Comput. 23 (1) (2005) 11–17, http://dx.doi.org/10.1016/j.imavis.2004.07.003.
[29] F. Da, H. Zhang, Sub-pixel edge detection based on an improved moment, Image Vis. Comput. 28 (12) (2010) 1645–1658, http://dx.doi.org/10.1016/j.imavis.2010.05.003.
[30] F. Bouchara, S. Ramdani, Subpixel edge refinement using deformable models, J. Opt. Soc. Am. A: Opt. Image Sci. 26 (4) (2009) 820–832, http://dx.doi.org/10.1364/JOSAA.26.000820.
[31] V. Berzins, Accuracy of Laplacian edge detectors, Comput. Vis. Graph. Image Process. 27 (2) (1984) 195–210, http://dx.doi.org/10.1016/S0734-189X(84)80043-2.
[32] R. Deriche, Recursively Implementating the Gaussian and its Derivatives, Rapport de recherche RR-1893, INRIA, 1993.
[33] L. van Vliet, I. Young, P. Verbeek, Recursive Gaussian derivative filters, in: Proceedings of the Fourteenth International Conference on Pattern Recognition, Vol. 1, 1998, pp. 509–514, http://dx.doi.org/10.1109/ICPR.1998.711192.
[34] S. Tan, J.L. Dale, A. Johnston, Performance of three recursive algorithms for fast space-variant Gaussian filtering, Real-Time Imaging 9 (3) (2003) 215–228, http://dx.doi.org/10.1016/S1077-2014(03)00040-8.
[35] B.K.P. Horn, Closed-form solution of absolute orientation using unit quaternions, J. Opt. Soc. Am. A: Opt. Image Sci. 4 (4) (1987) 629–642, http://dx.doi.org/10.1364/JOSAA.4.000629.
[36] J. Wang, H. Suenaga, L. Yang, H. Liao, E. Kobayashi, T. Takato, I. Sakuma, Real-time marker-free patient registration and image-based navigation using stereovision for dental surgery, in: H. Liao, C. Linte, K. Masamune, T. Peters, G. Zheng (Eds.), Augmented Reality Environments for Medical Imaging and Computer-Assisted Interventions, Vol. 8090 of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2013, pp. 9–18, http://dx.doi.org/10.1007/978-3-642-40843-4_2.
[37] J. Wang, H. Suenaga, K. Hoshi, L. Yang, E. Kobayashi, I. Sakuma, H. Liao, Augmented reality navigation with automatic marker-free image registration using 3-D image overlay for dental surgery, IEEE Trans. Biomed. Eng. 61 (4) (2014) 1295–1304, http://dx.doi.org/10.1109/TBME.2014.2301191.