
Computer Vision and Image Understanding 105 (2007) 121–130 www.elsevier.com/locate/cviu

Shape from silhouette outlines using an adaptive dandelion model

Xin Liu *, Hongxun Yao, Wen Gao

School of Computer Science and Technology, Harbin Institute of Technology, Harbin, PR China

Received 31 May 2005; accepted 12 September 2006; available online 3 November 2006

This paper is an expanded and improved version of a paper presented at 3DIM 2005. The work is supported by the Program for New Century Excellent Talents in University, NCET-05-0334, and by NSFC under Contract Nos. 60472043 and 60332010/F01.

* Corresponding author. E-mail addresses: [email protected] (X. Liu), [email protected] (H. Yao), [email protected] (W. Gao).

doi:10.1016/j.cviu.2006.09.003

Abstract

In this paper we study the problem of shape from silhouette outlines (SFO) using a novel adaptive dandelion model. SFO reconstructs a 3D model using only the outermost silhouette contours and produces a solid called the "bounding hull". The dandelion model is composed of a pencil of organized line segments emitted from a common point, which samples the bounding hull surface in regularly distributed orientations with the ending points serving as sampling points. The orientations and the topology of the line segments are derived from a geodesic sphere. The lengths of the line segments are computed by cutting rays in 2D with silhouette outlines. Based on the subdivision structure of the geodesic sphere, the sampling resolution can be refined adaptively until the desired precision is achieved. Finally, a manifold mesh is extracted from the dandelion model.

© 2006 Elsevier Inc. All rights reserved.

Keywords: Shape from silhouette outlines; Bounding hull; Dandelion model

1. Introduction

In computer vision, shape from silhouettes (SFS) is a well-known approach that reconstructs a 3D model from a set of calibrated images using only silhouette information. The result of SFS is usually a visual hull [1], which is the intersection of all viewing cones. Compared with other 3D reconstruction approaches such as shape from stereo, shading, or shadows, SFS has unique advantages. First, SFS is not fastidious about the object to be modeled, because it does not depend on evident geometric features, distinguishable textures, or any reflection model. Second, it does not require strict imaging conditions, so it is very easy to set up a 3D shape capture system based on SFS. Third, SFS is essentially robust: good results can be obtained even with imperfectly extracted silhouettes.


Despite the limited precision of the visual hull, SFS has extensive applications in 3D object modeling [2–4].

In practice we found that many objects to be modeled by SFS are (approximately) spherical-terrain-like (STL): their surfaces can be described by single-valued functions in a spherical coordinate system. Human heads are good examples. Such STL objects, also referred to as "star-shaped" objects although nuances may exist, have been studied in the context of surface reconstruction from volumetric data [5–7], but to our knowledge no algorithm specially optimized for reconstructing the visual hulls of STL objects from silhouettes has appeared. Instead, applications reconstructing STL objects often use common SFS algorithms [4].

In this paper we study the problem of shape from silhouette outlines (SFO), which is a variant of SFS. SFO reconstructs a 3D model using only the outermost silhouette contours, i.e., the silhouette outlines. It produces a "bounding hull", which is the visual hull of the generated STL solid of the object. For (approximately) STL objects, SFO can be used instead of SFS to produce (approximately) the same result while enjoying all the advantages of our specially designed algorithm. For other objects, the bounding hull produced by SFO serves as an STL superset of the visual hull.


If such a superset is enough for a specific application, our algorithm can also be used. Collision detection and avoidance in virtual environments constructed by image-based rendering (IBR) is an example of such an application.

The problem of SFO is solved by a novel adaptive dandelion model, which is composed of a pencil of organized line segments emitted from a common point, O. The line segments sample the bounding hull surface in regularly distributed orientations, with the ending points standing for sampling points. The orientations and the topology of the line segments are derived from a geodesic sphere. The lengths of the line segments are computed by cutting rays in 2D with silhouette outlines. Because concurrent rays in 3D space project into concurrent rays in any 2D image, all possible intersections between projected rays and silhouette outlines can be pre-computed efficiently by scanning the silhouette images in one pass, which makes 2D ray cuts very fast. Based on the subdivision structure of the geodesic sphere, the sampling resolution can easily be adapted to the local complexity of the bounding hull surface. We propose an algorithm that can directly estimate the representation precision from outlines. The SFO algorithm runs recursively until the desired precision is achieved. Finally, a manifold mesh model of the bounding hull is extracted from the dandelion model. The SFO algorithm has an asymptotic time complexity of O(j · k), where j is the number of input images and k is the number of sampling points.

To sum up, our SFO algorithm based on the adaptive dandelion model has the combined advantages of

• speediness,
• adaptive resolution,
• controllable precision, and
• producing manifold meshes.

2. Previous work

In this section, we briefly review previous work on SFS that is related to our algorithm. SFS was first proposed by Baumgart in 1974 and has since seen more than 30 years of development. Existing algorithms can be classified according to their rationales.

Surface-based approaches, pioneered by Baumgart [8], calculate a polygonal surface model of the visual hull by intersecting viewing cones. The basic algorithm first approximates silhouette contours by polygons. Then a set of viewing cones is created by extruding these polygons for each view. Finally, the viewing cones are intersected in 3D space. Observing that the viewing cone has a fixed scaled cross-section, Matusik et al. proposed an efficient algorithm [9] that reduces 3D intersections to simpler intersections in 2D image space. Because some viewing cone boundaries are not well defined, surface-based approaches often produce incomplete or corrupted surface models. To solve this problem, recent researchers have analyzed the topology of a polygonal visual hull and engaged in reconstructing exact visual hulls [10,11]. Franco and Boyer [11] first reconstructed an incomplete visual hull from viewing edges, and then used local orientation and connectivity rules to search for missing surface points and connections. Finally, the contour of each face is identified and the exact visual hull is obtained.

Volumetric approaches, first proposed by Martin and Aggarwal [12], compute a volumetric model of the visual hull by volume carving. The basic algorithm first discretizes the initial space containing the object into small cubic cells, called voxels. Each voxel is projected into all images; if it falls outside the silhouette region in any image, the voxel is carved away. After all voxels have been tested for occupancy, the remaining voxels constitute the volumetric visual hull. The basic algorithm can be accelerated significantly by an octree structure [13,14] and by performing voxel carving in a hierarchical coarse-to-fine fashion [2,15]. A polygonization algorithm, such as marching cubes [16,17], can be used to extract a mesh surface from the volumetric model. Volumetric approaches are very robust and easy to implement, but they are computationally expensive and suffer from quantization artifacts. Most recently, Erol et al. proposed an implicit surface visual hull (ISVH) [18] aiming at eliminating the quantization artifacts.

Boyer et al. proposed a hybrid approach that integrates both volumetric and surface-based techniques [19]. This approach first computes sampling points on the visual hull surface. Then Delaunay triangulation is applied to the sampling points to obtain a convex hull represented by tetrahedral cells. These cells are carved according to silhouette information. The final surface model is extracted from the remaining tetrahedral cells.

Some recent SFS algorithms [20–22] make use of the duality between points and planes. Kang et al. [21] partitioned 3D scene-space into an array of cubes and estimated a 3D quadric surface patch in the four-dimensional homogeneous dual space consisting of tangent planes. Then a single higher-degree algebraic surface is fitted to all or many of those quadric patches to obtain an estimate of the complex object surface. Brand et al. [22] estimated depth information on contours by solving for locally consistent estimates of curvature at points that are found to be nearby on the dual manifold. The visual hull mesh is constructed by intersecting the tangent planes grazing the visual hull.

The image-based visual hull (IBVH) [23,24] represents a 3D solid by line segments along the viewing rays passing through the pixels of the desired image. The visual hull is used largely as an imposter surface onto which textures are mapped. The algorithm can compute high-quality textured renderings in real time, but it does not produce a view-independent geometric model. Matusik further augmented the IBVH with view-dependent opacity, constructing an opacity hull [25]. The dandelion model also uses line segments to represent 3D solids, as IBVH does, but the model is object-centered, so that sampling points distribute evenly on the object surface and line segments intersect (rather than are tangent to) the object surface, making ray cuts better posed. The dandelion model is also view-independent.

3. Basic problems of SFO

SFO reconstructs the bounding hull of an object from silhouette outlines. For an in-depth study of the problem, we first give some formalized definitions. In the following text, we use o, p, and p′ to denote points in 2D space, and O, P, P′, P₀, and P₁ to denote points in 3D space.

Definition 1. The silhouette S of an object V in an image is the 2D region S = Prj(V) into which the object V projects.

Definition 2. A 2D region P is spherical-terrain-like (STL) about o if ∀p ∈ P ⇒ op ⊆ P.

Definition 3. The generated STL region about o of a 2D region P is gen(P) = {p : ∃p′ ∈ P such that p ∈ op′}.

In Definitions 2 and 3, the phrase "about o" is often omitted without ambiguity for simplicity of description. Fig. 1 illustrates the silhouette and the generated STL silhouette of a teapot. From Definitions 2 and 3, it can easily be seen that

P ⊆ gen(P),    P = gen(P) ⟺ P is STL.    (1)

Definition 4. The outline of a silhouette S about o is the contour of gen(S) about o.

Definition 5. A 3D solid X is spherical-terrain-like (STL) about O if ∀P ∈ X ⇒ OP ⊆ X. If a solid is STL about every point of a large continuous 3D region, we say the solid is well STL.

Definition 6. The generated STL solid about O of a 3D solid X is Gen(X) = {P : ∃P′ ∈ X such that P ∈ OP′}.

In Definitions 5 and 6, the phrase "about O" is often omitted without ambiguity for simplicity of description. From Definitions 5 and 6, it can easily be seen that

X ⊆ Gen(X),    X = Gen(X) ⟺ X is STL.    (2)

Definition 7. The visual hull of an object with respect to n images {Iᵢ} is the maximal solid consistent with all silhouettes in {Iᵢ}. In this paper n is always supposed to be a finite number.

Definition 8. The bounding hull of an object about O with respect to n images {Iᵢ} is the maximal solid consistent with all generated STL silhouettes about o = Prj(O) in {Iᵢ}.

In the above definition, Prj(·) is unique for each input image.

Definition 9. The standard bounding hull of an object V is the bounding hull about the geometric center of V.

Fig. 1. The silhouette (a) and the generated STL silhouette (b) of a teapot. The cross in the silhouette region denotes the center point o.
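To make Definition 3 concrete, the following is a minimal sketch (ours, not from the paper) of computing gen(S) for a binary silhouette mask about a center o. It discretizes the polar angle about o, records the maximal radius of silhouette pixels per angle in one pass, and then keeps every pixel whose radius does not exceed the recorded maximum for its angle. All names and the bin count are illustrative assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Sketch of Definition 3: gen(S) of a binary mask about (ox, oy).
// A pixel p belongs to gen(S) iff some silhouette pixel p' lies on the
// ray from o through p at a radius >= |op|, i.e. |op| <= R(angle of p).
std::vector<std::uint8_t> generatedSTL(const std::vector<std::uint8_t>& mask,
                                       int w, int h, double ox, double oy,
                                       int bins = 720) {
    const double PI = 3.14159265358979323846;
    auto binOf = [&](double dx, double dy) {
        int b = (int)((std::atan2(dy, dx) + PI) / (2 * PI) * bins);
        return std::min(std::max(b, 0), bins - 1);
    };
    std::vector<double> R(bins, 0.0);              // max radius per angle bin
    for (int y = 0; y < h; ++y)                    // one pass over the mask
        for (int x = 0; x < w; ++x)
            if (mask[y * w + x]) {
                double dx = x - ox, dy = y - oy;
                int b = binOf(dx, dy);
                R[b] = std::max(R[b], std::hypot(dx, dy));
            }
    std::vector<std::uint8_t> gen(w * h, 0);       // keep pixels up to R(angle)
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            double dx = x - ox, dy = y - oy;
            if (std::hypot(dx, dy) <= R[binOf(dx, dy)]) gen[y * w + x] = 1;
        }
    return gen;
}
```

By Eq. (1), a silhouette is STL about o exactly when this procedure returns the input mask unchanged.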

From Definition 9 we know the standard bounding hull is unique. In practice, the standard bounding hull of an object is the usual reconstruction target.

The above definitions are formal and strict, but they may still leave one wondering: what exactly is the bounding hull of an object? We answer this question with the following theorem.

Lemma 1. The silhouette of the generated STL solid of an object about O is identical to the generated STL silhouette about o = Prj(O) in any view.

Proof. Let V stand for the object, S for the silhouette in any view, and S′ for the silhouette of Gen(V) in the same view. (i) ∀p ∈ S′, ∃P ∈ Gen(V) such that p = Prj(P). ∃P₀ ∈ V such that P ∈ OP₀. p₀ = Prj(P₀) ∈ S ⇒ op₀ ⊆ gen(S). p ∈ op₀ ⇒ p ∈ gen(S). Therefore S′ ⊆ gen(S). (ii) ∀p ∈ gen(S), ∃p₀ ∈ S such that p ∈ op₀. ∃P₀ ∈ V such that p₀ = Prj(P₀). OP₀ ⊆ Gen(V) ⇒ op₀ ⊆ S′ ⇒ p ∈ S′. Therefore gen(S) ⊆ S′. From (i) and (ii), gen(S) = S′. □

Theorem 1. The bounding hull of an object V about O is the visual hull of Gen(V) about O.

Proof. From Lemma 1, in any view the silhouette of Gen(V) is equal to the generated STL silhouette of V. Therefore, both the bounding hull and the visual hull of Gen(V) are the maximal solid consistent with the same silhouettes. So they are the same. □

With Theorem 1, the concept of the bounding hull is clarified. According to Eq. (2), if an object V is STL about O, Gen(V) = V, and therefore the bounding hull equals the visual hull; SFO can then be used to replace SFS without losing precision. Generally speaking, if V is approximately STL about O, i.e., Gen(V) ≈ V, then the visual hull of Gen(V) approximates the visual hull of V, and SFO can also produce satisfactory results.

We continue with an important property of the bounding hull.

Theorem 2. A bounding hull about O is STL about O.

Proof. Let BH stand for the bounding hull and S for the silhouette of the object in any view. If BH is not STL about O, ∃P ∈ BH such that OP ⊄ BH. Let BH′ = BH ∪ OP. We have BH ⊂ BH′. In any view, p = Prj(P) ∈ S ⇒ op = Prj(OP) ⊆ gen(S) ⇒ OP is consistent with all generated STL silhouettes ⇒ BH′ is consistent with all generated STL silhouettes. But according to Definition 8, BH is the maximal solid consistent with all generated STL silhouettes. A contradiction. □


The visual hull has been well studied and successfully applied in many fields, so it is important to know the relationship between the visual hull and the bounding hull of an object.

Lemma 2. Let X₀ and X₁ stand for two 3D solids, and VH₀ and VH₁ for their visual hulls. X₀ ⊆ X₁ ⇒ VH₀ ⊆ VH₁.

Proof. If VH₀ ⊄ VH₁, ∃P: P ∈ VH₀ but P ∉ VH₁. Let S₀ stand for the silhouette of X₀ and S₁ for the silhouette of X₁ in any view. X₀ ⊆ X₁ ⇒ S₀ ⊆ S₁. In any view p = Prj(P) ∈ S₀, but in at least one view I, p = Prj(P) ∉ S₁. Therefore S₀ ⊄ S₁ in I. This is a contradiction. □

Theorem 3. If VH is the visual hull of an object V and BH is the bounding hull of V about O, then VH ⊆ BH, and VH = BH ⟺ all silhouettes are STL about o = Prj(O).

Proof. Let S stand for the silhouette of V in any view. (i) According to Theorem 1, BH is the visual hull of Gen(V). V ⊆ Gen(V). According to Lemma 2, VH ⊆ BH. (ii) Suppose VH = BH but in some view the silhouette S is not STL: ∃p ∈ S and p′ ∈ op such that p′ ∉ S. ∃P ∈ VH such that p = Prj(P). VH = BH, so from Theorem 2, VH is STL. Therefore OP ⊆ VH ⇒ op ⊆ S ⇒ p′ ∈ S. This is a contradiction. So VH = BH ⇒ all silhouettes are STL. (iii) If all silhouettes are STL, S = gen(S) in any view. So VH = BH. □

Theorem 3 gives the lower bound of a bounding hull and the necessary and sufficient condition under which the bounding hull equals the visual hull. This condition is of special interest because, when it is satisfied, SFO can be used to replace SFS without losing precision. We further give the upper bound of a bounding hull in terms of the convex hull, as defined below.

Definition 10. A solid X is convex if ∀P₀ ∈ X and ∀P₁ ∈ X ⇒ P₀P₁ ⊆ X.

Definition 11. The convex hull of an object V is the minimal convex solid containing V.

Theorem 4. If CH is the convex hull of an object V, BH is the bounding hull of V about O with O ∈ CH, and VCH is the visual hull of CH, then BH ⊆ VCH.

Proof. ∀P ∈ Gen(V), ∃P′ ∈ V such that P ∈ OP′. O ∈ CH and P′ ∈ CH ⇒ P ∈ CH ⇒ Gen(V) ⊆ CH. According to Lemma 2, BH ⊆ VCH. □

We know the geometric center must be inside the convex hull, so the visual hull of the convex hull of an object is an upper bound of the standard bounding hull.

4. The dandelion model

According to Theorem 2, the bounding hull is STL about O. From Definition 5 it follows that the bounding hull surface can be described by a single-valued function R = ρ(θ, φ) in the spherical coordinate system centered at O, where θ and φ are the azimuthal and elevation angles, respectively. The dandelion model samples the function R = ρ(θ, φ) in regularly distributed orientations. It is composed of a pencil of organized line segments emitted from O. The orientations of the line segments are from O to the vertices of a geodesic sphere centered at O. The topology of the line segments is represented by the triangles of the geodesic sphere. The lengths of the line segments are the values of R = ρ(θ, φ). The ending points of the line segments stand for the sampling points on the bounding hull surface.

To reconstruct the bounding hull from silhouette outlines, O is designated in the world coordinate system. Generally speaking, O should be positioned inside the convex hull of the object. In a calibrated 3D capture system, it is of course not difficult to estimate the coordinates of such a point. To reconstruct the standard bounding hull, O can be efficiently estimated by the initial volume estimation techniques in the volumetric SFS literature [26,3]. Thanks to the adaptive ability of our SFO algorithm, for well (approximately) STL objects the result is not sensitive to the selection of O, provided it does not damage the STL property of the silhouettes too much. The lengths of the line segments are initialized to the radius BR of a bounding sphere large enough to contain the bounding hull, and then cut by silhouette outlines. Because the running time is independent of BR, it can be set to a larger value than necessary. The initial dandelion model, with all line segments set to the same length as shown in Fig. 2, looks vividly like a dandelion.

The geodesic sphere is a polygonal approximation of a sphere. It can be constructed recursively by subdividing the facets of a regular icosahedron. In each subdivision, each side of a triangle breaks into two equal segments; three new vertices are generated and lifted up to the spherical surface, and four sub-triangles are generated to substitute for the original one. We call a geodesic sphere each triangle of which has undergone n subdivisions an n-GeoSphere. Fig. 3 shows the n-GeoSpheres for n = 0, 1, and 2. An n-GeoSphere consists of approximately equal-sized triangles across the whole surface, so the orientations from the center to the vertices are distributed approximately evenly. The dandelion model based on an n-GeoSphere is called an even dandelion model. An even dandelion model samples the bounding hull surface with approximately evenly distributed orientations.

Fig. 2. The initial dandelion model.


Fig. 3. (a) A 0-GeoSphere; (b) a 1-GeoSphere; (c) a 2-GeoSphere.
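The subdivision step can be sketched as follows; this is our own minimal illustration, not the paper's code, and the vertex and triangle types, names, and the midpoint cache are assumptions. Caching midpoints per undirected edge ensures that neighboring triangles share the new vertices, which is what keeps the resulting sphere, and later the extracted mesh, manifold.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3 toUnitSphere(const Vec3& v) {       // lift a point onto the sphere
    double n = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / n, v.y / n, v.z / n};
}

// One subdivision step of a geodesic sphere: every triangle splits into 4,
// and each edge midpoint is lifted onto the unit sphere. Midpoints are
// cached per undirected edge so that adjacent triangles share one vertex.
void subdivideOnce(std::vector<Vec3>& verts,
                   std::vector<std::array<int, 3>>& tris) {
    std::map<std::pair<int, int>, int> midCache;
    auto midpoint = [&](int a, int b) {
        std::pair<int, int> key = std::minmax(a, b);
        auto it = midCache.find(key);
        if (it != midCache.end()) return it->second;
        verts.push_back(toUnitSphere({(verts[a].x + verts[b].x) / 2,
                                      (verts[a].y + verts[b].y) / 2,
                                      (verts[a].z + verts[b].z) / 2}));
        return midCache[key] = (int)verts.size() - 1;
    };
    std::vector<std::array<int, 3>> out;
    for (const auto& t : tris) {
        int ab = midpoint(t[0], t[1]);
        int bc = midpoint(t[1], t[2]);
        int ca = midpoint(t[2], t[0]);
        out.push_back({t[0], ab, ca});   // 4 sub-triangles replace the original
        out.push_back({t[1], bc, ab});
        out.push_back({t[2], ca, bc});
        out.push_back({ab, bc, ca});
    }
    tris.swap(out);
}
```

In the adaptive model described next, this uniform step is replaced by per-triangle subdivision driven by the precision test of Section 6.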

Because the distance from the center point and the complexity of the local bounding hull surface may vary greatly, sampling the bounding hull surface with evenly distributed orientations may not be reasonable. So we introduce the adaptive dandelion model, which is derived from a geodesic sphere whose facets are subdivided different numbers of times to satisfy the precision requirement of the surface representation.

The data structure used to represent an adaptive dandelion model includes the center point and growable lists of line segments, edges, and triangles. A line segment is represented by a unit vector denoting its orientation and a length. The edges are organized as binary trees; the two children of an edge, if they exist, are the edges generated from it. A triangle is represented by its vertices and their opposite edges. A recursive SFO algorithm based on the adaptive dandelion model is shown below.

SFO(float precision, int maxDepth) {
    for each image Ii
        Ri = ComputeSilhouetteOutline(Ii);
    construct an n-GeoSphere;
    line_segment L;
    for all vertices Vi of the n-GeoSphere {
        L.orientation = Vi;
        L.length = BR;
        RayCut(L);
        put L into the line segment list;
    }
    put all edges of the n-GeoSphere into the edge list;
    for all triangles △ABCi of the n-GeoSphere
        subdivide(△ABCi, precision, maxDepth - n);
    MeshExtraction();
}

subdivide(triangle △ABC, float precision, int maxDepth) {
    if ((maxDepth == 0) || IsPrecise(△ABC, precision)) {
        put △ABC into the triangle list;
        return;
    }
    line_segment L;
    for each edge ei of △ABC {
        if (ei has no child) {
            compute mid point mi;
            normalize(mi);
            L.orientation = mi;
            L.length = BR;
            RayCut(L);
            put L into the line segment list;
            create child edges sei1, sei2, and put them into the edge list;
        }
        else
            mi = the common vertex of the child edges;
    }
    create the edges of the central triangle and put them into the edge list;
    for each of the 4 sub-triangles △i
        subdivide(△i, precision, maxDepth - 1);
}

This algorithm comprises three serial steps: image processing (computing outlines), model building, and mesh extraction. Suppose the number of input images is j and the image resolution is constant. The time complexity of image processing is O(j). The number of triangles visited (checked for precision) and the number of triangles in the dandelion model are both linear in the number of sampling points. Let k be the number of sampling points. The model building algorithm has an asymptotic running time of O(i · k), where i is the time complexity of IsPrecise() and RayCut(). With pre-computed R-Functions the two routines can both be finished in O(j). Thus, the time complexity of model building is O(j · k). Mesh extraction can be finished in O(k). So the asymptotic time complexity of SFO is O(j · k).

5. Ray cut

The ray cut algorithm computes the length of a line segment in the dandelion model. Because the line segment has a fixed starting point and a variable ending point, we refer to it as a ray. The algorithm works as follows. First, the ray is projected into each 2D image space. Second, the projected ray is intersected with the outline in each image. Finally, the intersected length in each image is lifted back to 3D, and the shortest length computed over all images is recorded as the length of the line segment.

5.1. Ray cut in 2D

From projective geometry [27], we know that concurrent rays emitted from O in 3D space project into concurrent rays emitted from o = Prj(O) in a 2D image. The 2D concurrent rays can be indexed by a single parameter, the polar angle u, if we set up a polar coordinate system centered at o. In this coordinate system the silhouette outline is described by a single-valued function r = R(u), which we call the R-Function. Fig. 4 shows the silhouette and the R-Function of a partial human head model, which was derived from a cylindrical range map scanned by a laser range finder. In practice, the R-Function is sampled at discrete polar angles and represented by a 1D vector. At each polar angle u, R(u) is set to the maximal radius of the silhouette points at that angle.

Fig. 4. The silhouette (a) and the R-Function (b) of a partial human head model. The cross in the silhouette region denotes the center point o.
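The one-pass construction of the R-Function vector can be sketched like this (our illustration; the names and the angular resolution are assumptions). It is the same polar bookkeeping as the gen(S) sketch in Section 3, but only the 1D vector of maximal radii is kept:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// One pass over an m x m silhouette image: R[u] holds the maximal radius
// of silhouette pixels whose polar angle about (ox, oy) falls in bin u.
// Time complexity O(m^2), as stated in the text.
std::vector<double> computeRFunction(const std::vector<std::uint8_t>& mask,
                                     int w, int h, double ox, double oy,
                                     int bins /* e.g. 720 for 0.5 degrees */) {
    const double PI = 3.14159265358979323846;
    std::vector<double> R(bins, 0.0);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (mask[y * w + x]) {
                double dx = x - ox, dy = y - oy;
                int u = (int)((std::atan2(dy, dx) + PI) / (2 * PI) * bins);
                u = std::min(std::max(u, 0), bins - 1);
                R[u] = std::max(R[u], std::hypot(dx, dy));
            }
    return R;
}

// A 2D ray cut is then a single table lookup with the ray's polar angle.
double cutRadius(const std::vector<double>& R, double angle) {
    const double PI = 3.14159265358979323846;
    int bins = (int)R.size();
    int u = (int)((angle + PI) / (2 * PI) * bins);
    return R[((u % bins) + bins) % bins];   // wrap the periodic angle
}
```

The experiments in Section 8 sample the R-Function every 0.5 degrees, i.e., 720 bins.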

126

X. Liu et al. / Computer Vision and Image Understanding 105 (2007) 121–130

With the pre-computed R-Function, the intersection between a projected ray and the outline can be obtained simply by looking up the R-Function vector with u. Supposing the resolution of each silhouette image is m × m, an R-Function can be computed from a silhouette image with a time complexity of O(m²).

5.2. Lifting back

The length of a 3D line segment can be computed from the length of its 2D projection by trigonometry. As shown in Fig. 5, the 3D line segment OA projects into the 2D line segment oa, where a is the intersection of the projected ray with the outline. From the camera parameters, we can know ‖OE‖, ∠OEA, and ∠OAE. According to the law of sines, the length of the 3D line segment OA deduced from oa is

‖OA‖ = ‖OE‖ · sin(∠OEA) / sin(∠OAE).    (3)
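A sketch of Eq. (3) in code, under our own conventions (not the paper's implementation): O is the dandelion center, E the view point, d the unit 3D direction of the ray, and u the unit direction from E through the 2D cut point a, back-projected with the camera parameters. The two angles in Eq. (3) are recovered from dot products:

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Lifting back (Eq. (3)): the triangle O-E-A has a known side ||OE|| and
// known angles at O (between d and OE) and at E (between u and EO), so
//   ||OA|| = ||OE|| * sin(angle OEA) / sin(angle OAE).
// d and u are assumed to be unit vectors. For near-degenerate rays the
// angle at A approaches zero and the result exceeds BR; such cuts are the
// "ineffective" cases discussed next and should be discarded by the caller.
double liftBack(const Vec3& O, const Vec3& E, const Vec3& d, const Vec3& u) {
    const double PI = 3.14159265358979323846;
    Vec3 OE = {E.x - O.x, E.y - O.y, E.z - O.z};
    double lenOE = std::sqrt(dot(OE, OE));
    double angAOE = std::acos(dot(d, OE) / lenOE);   // angle at O
    Vec3 EO = {-OE.x, -OE.y, -OE.z};
    double angOEA = std::acos(dot(u, EO) / lenOE);   // angle at E
    double angOAE = PI - angAOE - angOEA;            // angles sum to pi
    return lenOE * std::sin(angOEA) / std::sin(angOAE);
}
```

RayCut() would evaluate this for every image and keep the minimum of the returned lengths, clamped to BR.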

The effectiveness of a ray cut is determined by the angle between the 3D rays OA and OE. Fig. 6a shows the section plane determined by OA and OE. The big circle denotes a bounding sphere of the bounding hull, whose radius is BR, and the bold short line segment in the image I denotes the generated STL silhouette. If ∠AOE falls outside the range of the effective angle, denoted by the gray regions, the length of the 3D line segment computed from the outline must exceed BR and is useless. ∠EOB can be computed by the law of sines:

∠EOB = π − ∠OEB − ∠OBE,    sin(∠OBE) = sin(∠OEB) · ‖OE‖ / BR.    (4)

In Eq. (4), ∠OEB is determined by o = Prj(O), the intersection between Prj(OA) and the outline, and the camera parameters.

Fig. 5. Lifting back. E denotes the view point.

The range of the ineffective angle can be confined conservatively by a single value Θ as [0, Θ) ∪ (π − Θ, π], supposing IR = min R(u) > 0 (IR is different for each image). Θ is called the angle of limitation. It can be computed for each image I as follows. As shown in Fig. 6b, a ray emitted from E and moving along the inscribed circle centered at o with radius IR in the image plane generates a cone; o = Prj(O); Ep ⊥ I; the ray from p through o intersects the inscribed circle at b (if o and p coincide, b can be any point on the circle). It can be seen that ∠oEb is the minimal angle between a ray of the cone and oE. Thus Θ can be computed by substituting ∠oEb for ∠OEB in Eq. (4).

The ray cut algorithm repeats for all input images. So, with pre-computed R-Functions, the time complexity of the ray cut algorithm is O(j), where j is the number of input images.

Fig. 6. Estimating the effectiveness of ray cut.

Fig. 7. (a) The pyramid, cone and cone cap; (b) the projective wedge.


6. The computation of precision

How can we know whether a triangular facet is a good approximation of the corresponding region of the bounding hull surface? Here we propose an algorithm that computes the precision directly from silhouette outlines.

As shown in Fig. 7a, every 3 adjacent line segments with ending points A, B, and C and lengths ‖OA‖ > 0, ‖OB‖ > 0, and ‖OC‖ > 0 in the dandelion model define a triangular pyramid, denoted Prm_ABC. The solid cut from the bounding hull by the 3 lateral planes of the pyramid is called a cone and denoted Cone_ABC. The region of the bounding hull surface cut by the 3 lateral planes of the pyramid is called a cone cap and denoted Cc_ABC. If A, B, and C project into a, b, and c in an image with polar angles ∠a ≥ ∠b ≥ ∠c and ∠a − ∠c < π, as shown in Fig. 7b, the wedge obtained by taking the portion of the image with polar angle in [∠c, ∠a] is called the projective wedge of Prm_ABC. The portion of the generated STL silhouette inside the projective wedge is called the relevant silhouette of Prm_ABC and denoted Rs_ABC. The projection of Prm_ABC is called the silhouette of Prm_ABC and denoted S_Prm_ABC.

Definition 12. A cone cap Cc_ABC is non-concave if Cone_ABC ⊇ Prm_ABC, and non-convex if Cone_ABC ⊆ Prm_ABC.

Theorem 5. A cone cap Cc_ABC is non-convex if, in some view, the 3 vertices of △ABC project into a line and Rs_ABC ⊆ S_Prm_ABC.

Proof. For the view in which A, B, and C project into a line and Rs_ABC ⊆ S_Prm_ABC, suppose a, b, and c are the projections of A, B, and C. If Cc_ABC is not non-convex, ∃P: P ∈ Cone_ABC but P ∉ Prm_ABC. Let p = Prj(P). p ∈ Rs_ABC ⇒ p ∈ S_Prm_ABC. Supposing OP intersects △ABC at a point P′ and p′ = Prj(P′), we have ‖OP′‖ < ‖OP‖. From the hypothesis ∠a − ∠c < π, it can be seen that OP is not perpendicular to the image plane, and therefore ‖op′‖ < ‖op‖. p′ ∈ ca ⇒ p ∉ S_Prm_ABC. This is a contradiction. □

Theorem 6. A cone cap Cc_ABC is non-concave if S_Prm_ABC ⊆ Rs_ABC in every view.

Proof. If Cc_ABC is not non-concave, ∃P: P ∈ Prm_ABC but P ∉ Cone_ABC. Let BH denote the bounding hull and BH′ = BH ∪ Prm_ABC. We have BH ⊂ BH′. S_Prm_ABC ⊆ Rs_ABC in every view ⇒ Prm_ABC is consistent with all generated STL silhouettes ⇒ BH′ is consistent with all generated STL silhouettes. According to Definition 8, BH is the maximal solid consistent with all generated STL silhouettes. A contradiction. □

Theorems 5 and 6 give sufficient conditions for the precise representation of a cone cap Cc_ABC by a triangular facet △ABC. Suppose a, b, and c are the projections of A, B, and C in a view with polar angles ∠a ≥ ∠b ≥ ∠c and ∠a − ∠c < π, as shown in Fig. 8. Let ρ(q, l) denote the signed distance between a point q and a directed line l, which is negative on the left and positive on the right as we walk forward along l. Let p denote a point on the outline, and let

L = ca if ρ(b, ca) ≤ 0;    L = {cb, ba} if ρ(b, ca) > 0;

ρ(p, L) = ρ(p, ca) if ρ(b, ca) ≤ 0 and ∠c ≤ ∠p ≤ ∠a;
ρ(p, L) = ρ(p, cb) if ρ(b, ca) > 0 and ∠c ≤ ∠p ≤ ∠b;
ρ(p, L) = ρ(p, ba) if ρ(b, ca) > 0 and ∠b < ∠p ≤ ∠a.

If |ρ(b, ca)| < PRECISION and max_{∠c ≤ ∠p ≤ ∠a} ρ(p, ca) < PRECISION in one view, and min_{∠c ≤ ∠p ≤ ∠a} ρ(p, L) > −PRECISION in all views, we consider △ABC a good approximation of Cc_ABC, which need not be subdivided further.

In practice, the periodicity of the polar angle should be handled carefully. ∠a − ∠c ≥ π occurs when OE falls inside the pyramid Prm_ABC, where E is the viewpoint. ‖OA‖ = 0, ‖OB‖ = 0, or ‖OC‖ = 0 occurs when O is on the bounding hull surface. In these cases, the precision can be analyzed similarly.
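A sketch of the per-view test behind IsPrecise(), under our reading of the criteria above (the names and data layout are ours, not the paper's). For simplicity it takes the minimum of the distances to cb and ba instead of switching on the polar angle of p, which only makes the test stricter:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec2 { double x, y; };

// Signed distance from q to the directed line p0 -> p1: negative on the
// left, positive on the right, assuming a y-up coordinate frame.
static double signedDist(const Vec2& q, const Vec2& p0, const Vec2& p1) {
    double dx = p1.x - p0.x, dy = p1.y - p0.y;
    double cross = dx * (q.y - p0.y) - dy * (q.x - p0.x);
    return -cross / std::hypot(dx, dy);
}

// Per-view precision test for triangle abc (projections of A, B, C with
// angle(a) >= angle(b) >= angle(c)); `outline` holds the R-Function samples,
// as 2D points, whose polar angles lie in [angle(c), angle(a)].
// Returns the "all views" condition; *flat reports the "one view" condition.
static bool preciseInView(const Vec2& a, const Vec2& b, const Vec2& c,
                          const std::vector<Vec2>& outline,
                          double precision, bool* flat) {
    double db = signedDist(b, c, a);                 // rho(b, ca)
    double maxCA = -1e300, minL = 1e300;
    for (const Vec2& p : outline) {
        maxCA = std::max(maxCA, signedDist(p, c, a));
        double dL = (db <= 0) ? signedDist(p, c, a)  // L = ca
                              : std::min(signedDist(p, c, b),
                                         signedDist(p, b, a));
        minL = std::min(minL, dL);
    }
    *flat = std::fabs(db) < precision && maxCA < precision;
    return minL > -precision;
}
```

IsPrecise(△ABC, precision) would then return true when the one-view condition holds in at least one image and the all-views condition holds in every image that is not skipped by the optimization described next.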

If, in an image I, a, b, and c all project inside the shrunk inscribed circle of the generated STL silhouette, centered at o with radius IR − PRECISION, then Rs_ABC ⊇ S_Prm_ABC must hold, and image I can be ignored for the precision computation. Based on this principle, most computations with a big ∠coa can be avoided. The precision computation algorithm scans the R-Function over the angle ∠coa covered by the projective wedge in each of the j images. If the width of the R-Function is w, O(j · w) point-to-line distance computations and O(j) projections are involved. Note that the mean value of ∠coa is small. If w is treated as a constant, the time complexity of the precision computation is O(j).

7. Mesh extraction

A mesh model can easily be extracted from the dandelion model by taking the ending points of the line segments and the triangles of the dandelion model. The triangles whose edges have children, occurring on the boundaries between triangles of different resolutions (see Fig. 9a), should be processed to avoid cracks in the constructed mesh model. For a triangle △ABC with child edges, all child edges are found from the edge trees and the vertices on them are listed in the order A → B → C → A. Then a new vertex is generated by cutting the ray from O through the centroid of △ABC. Finally, a fan of triangles from the new vertex to the vertices on the child edges is generated to replace △ABC. Fig. 9b shows the constructed triangular mesh.

A manifold mesh is a polygonal mesh in which each vertex is surrounded by a single fan of triangles; a strict definition can be found in [28]. Manifold meshes are preferred in computer graphics because they are convenient for mesh post-processing such as simplification, compression, and smoothing. If O is not on the bounding hull surface, the mesh generated by the approach presented above is naturally a closed manifold mesh. If O is on the bounding hull surface, O may be a singular vertex [28], surrounded by more than one triangle fan. To generate a manifold mesh, O is duplicated, with each triangle fan sharing its own copy.
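The crack-repair step just described can be sketched as follows (our own data layout, not the paper's code): `ring` lists, in order A → B → C → A, the corner vertices plus any midpoint vertices found on child edges, and `center` is the new vertex obtained by cutting the ray from O through the centroid of △ABC.

```cpp
#include <array>
#include <vector>

// Replace a triangle that borders finer-resolution neighbors by a fan
// around `center`, so every midpoint vertex on its child edges is used
// and no T-junction (crack) remains in the extracted mesh.
std::vector<std::array<int, 3>>
fanTriangulate(int center, const std::vector<int>& ring) {
    std::vector<std::array<int, 3>> fan;
    int n = (int)ring.size();
    fan.reserve(n);
    for (int i = 0; i < n; ++i)
        fan.push_back({center, ring[i], ring[(i + 1) % n]});
    return fan;
}
```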

Fig. 8. The projection of a pyramid.

Fig. 9. (a) Adjacent triangles with different resolutions; (b) the constructed triangular mesh.

8. Experiments

We tested our algorithm by experiments on both synthetic images and real photos, and made a simple comparison against the exact polyhedral visual hull (EPVH) algorithm and the voxel-carving algorithm.

Franco's EPVH library (version 1.0.0) [11] was used to test the performance of EPVH. The voxel-carving algorithm was implemented based on the techniques presented in [2], which use an octree and the half-distance transform to accelerate the reconstruction process. A marching cubes algorithm was used to extract a mesh surface model from the voxel-based model.

All experiments used 36 images captured around the object by either a virtual camera or a real camera. Each image is 600 × 398 pixels. For SFO, the R-Functions were sampled every 0.5 degrees and the center points were designated by a human operator. We set PRECISION = 1.6 pixels and maxDepth = 7, which means 327,680 sampling points at most. For voxel-carving, the initial volumes were designated by a human operator. We set the maximal resolution to 256³ voxels to obtain good enough results. We report the number of triangles of each reconstructed model and the running time of reconstruction (from binary silhouette images to a triangular mesh model in main memory). The programs ran on a desktop PC with an Intel Pentium IV 3.2 GHz CPU and 1 GB main memory.

8.1. Experiments on synthetic images

We first tested the performance of SFO on synthetic images. The objects to be reconstructed were a cone, a pear, and a cartoon fish, which stand for regular artificial objects, smooth natural objects, and non-STL objects, respectively. The results are shown in Table 1 and Fig. 10.

From the results we can see that our algorithm is very efficient. This is because the computation of a dandelion model involves only some simple projections, back projections, and simple 2D operations; intersections between projected rays and outlines can be obtained directly from R-Functions. The reconstructed models are precise and smooth (see the 3rd row of Fig. 10), because each sampling point is allowed to slide along its fixed orientation and is located exactly on the bounding hull surface. The algorithm produces high-quality visual hulls if the object is STL. In the case of a non-STL object, it produces a bounding hull, which is an STL superset of the visual hull, as shown by the right image in the 3rd row of Fig. 10. Thanks to the adaptive strategy and precision computation algorithms, the numbers of triangles are well controlled and the distribution of sampling points is quite reasonable, with more points in complex regions and fewer points in planar regions.

Fig. 10. Experiments on synthetic images. The 1st row shows the original models; the 2nd and 3rd rows show the reconstructions of SFO; the 4th and 5th rows show the reconstructions of EPVH; the last row shows the reconstructions of voxel-carving.

Table 1
Numeric results of reconstruction from synthetic images (time in milliseconds)

                 Cone                 Pear                 Fish
                 Triangles   Time     Triangles   Time     Triangles   Time
SFO              4896        375      1928        281      30,250      1562
EPVH             900         147      3812        587      10,280      2337
Voxel-carving    357,160     25,906   338,260     25,584   285,888     20,248


8.2. Experiments on real photos

To experiment with real photos, we used an acquisition system composed of a fixed Nikon D70 camera with a Nikkor 50 mm f/1.8 lens and a turntable. Both the camera and the turntable were pre-calibrated. Objects to be modeled were placed on the turntable, and 36 photos were taken, one every 10 degrees, for each object. The results of reconstructing 2 head statues by SFO are shown in Fig. 11. In these experiments, the bottom parts of the silhouettes were eliminated to make them approximately STL. The reconstructed models contained 23,854 and 10,606 triangles; the running times were 1297 and 765 ms, respectively. The results are very satisfying in both efficiency and quality.

8.3. Comparison against EPVH and voxel-carving algorithms

We present experimental results of the EPVH and voxel-carving algorithms in Table 1 and Fig. 10. These experiments are based on the same input data as described in Section 8.1. SFO, EPVH, and voxel-carving work in very different ways, and each algorithm has its advantages and disadvantages, as suggested by the experimental results.

The asymptotic time complexities of SFO and EPVH are O(j · k) and O(j² · q²), respectively, where j stands for the number of images, k for the number of sampling points, and q for the number of silhouette contour vertices, so SFO has a lower time complexity than EPVH. In most cases, SFO runs faster than EPVH, as suggested by Table 1. The output mesh of EPVH is usually composed of many elongated, irregular triangles (see the 4th row of Fig. 10), which is disadvantageous for most applications. In comparison, the output mesh of SFO is often well composed, as shown in the 2nd row of Fig. 10. Although not too often, EPVH may fail and produce cracked models, because it depends on an error-prone searching scheme to recover missing vertices and edges. This problem makes it hard to use in critical applications. SFO never fails, because the dandelion model has a pre-arranged topology and computing a sampling point on the bounding hull is actually a one-unknown-parameter problem.

Compared with voxel-carving, SFO is much more efficient because it only needs to compute sampling points on the bounding hull surface. Each sampling point is precisely located on the bounding hull surface and the resulting mesh model is a linear interpolation of the sampling points, so a good result can be obtained at a relatively low resolution for smooth surfaces. In contrast, voxel-carving has to check the occupancy of all voxels. Because it never knows which point in a voxel is precisely on the visual hull, the result suffers from quantization artifacts. A satisfying result can only be produced at a high enough resolution, which implies a great amount of computation; a trade-off between speed and quality must be considered. In our experiments, each reconstruction process cost more than 20 s at a high resolution of 256³ voxels, but the results still show apparent artifacts (see the 6th row of Fig. 10).

9. Conclusion

In this paper we have studied the problem of shape from silhouette outlines (SFO). The result of SFO, called a bounding hull, is the visual hull of the generated spherical-terrain-like (STL) solid of an object. The bounding hull about a point O equals the visual hull if and only if all silhouettes are STL about the projections of O. A novel adaptive dandelion model was proposed to represent the bounding hull and solve the SFO problem. The model is composed of a pencil of organized line segments emitted from a center point. The line segments sample the bounding hull surface in regularly distributed orientations, from the center O to the vertices of a geodesic sphere. The lengths of the line segments are computed by ray cuts with silhouette outlines. The SFO algorithm based on the dandelion model has the combined advantages of speediness, adaptive resolution, controllable precision, and producing manifold meshes. SFO provides a good choice for reconstructing (approximately) STL objects.

Fig. 11. Reconstruction from real photos by SFO. The 1st row shows one of 36 input images for each experiment; the 2nd row shows the reconstructions of SFO.


For other objects, SFO can also be used if the bounding hull is enough for specific applications, such as collision detection and avoidance in a virtual environment constructed by image-based rendering (IBR). We believe that, with these advantages, the SFO algorithm as well as the dandelion model will find extensive applications in many fields.

References

[1] A. Laurentini, The visual hull concept for silhouette-based image understanding, IEEE Transactions on Pattern Analysis and Machine Intelligence 16 (2) (1994) 150–162.
[2] R. Szeliski, Real-Time Octree Generation from Rotating Objects, Technical Report 90/12, Digital Equipment Corporation, Cambridge Research Lab, 1990.
[3] A.Y. Mülayim, Y. Ulas, V. Atalay, Silhouette-based 3D model reconstruction from multiple images, IEEE Transactions on Systems, Man and Cybernetics, Part B 33 (4) (2003) 582–591.
[4] G. Zeng, S. Paris, M. Lhuillier, L. Quan, Study of volumetric methods for face reconstruction, in: Proceedings of IEEE Intelligent Automation Conference, 2003.
[5] C. Unsalan, A. Ercil, A new robust and fast implicit polynomial fitting technique, in: Proceedings of M2VIP'99, 1999.
[6] P. Matula, D. Svoboda, Spherical object reconstruction using star-shaped simplex meshes, in: Energy Minimization Methods in Computer Vision and Pattern Recognition, Lecture Notes in Computer Science, vol. 2134, Springer Verlag, 2001, pp. 608–620.
[7] J. Li, Three Dimensional Shape Modeling: Segmentation, Reconstruction and Registration, Ph.D. thesis, The University of Michigan, 2002.
[8] B. Baumgart, Geometric Modeling for Computer Vision, Ph.D. thesis, Stanford University, 1974.
[9] W. Matusik, C. Buehler, L. McMillan, Polyhedral visual hulls for real-time rendering, in: Proceedings of the 12th Eurographics Workshop on Rendering, 2001, pp. 115–125.
[10] S. Lazebnik, E. Boyer, J. Ponce, On computing exact visual hulls of solids bounded by smooth surfaces, in: IEEE Computer Society International Conference on Computer Vision and Pattern Recognition, vol. 1, 2001, pp. I-156–I-161.
[11] J.-S. Franco, E. Boyer, Exact polyhedral visual hulls, in: Proceedings of the Fourteenth British Machine Vision Conference, 2003, pp. 329–338.
[12] W.N. Martin, J.K. Aggarwal, Volumetric description of objects from multiple views, IEEE Transactions on Pattern Analysis and Machine Intelligence 5 (2) (1983) 150–158.

[13] C.L. Jackins, S.L. Tanimoto, Oct-trees and their use in representing three-dimensional objects, Computer Graphics and Image Processing 14 (3) (1980) 249–270.
[14] H.H. Chen, T.S. Huang, A survey of construction and manipulation of octrees, Computer Vision, Graphics, and Image Processing 43 (1988) 409–431.
[15] G. Slabaugh, B. Culbertson, T. Malzbender, A survey of methods for volumetric scene reconstruction from photographs, in: Proceedings of the International Workshop on Volume Graphics, 2001, pp. 81–100.
[16] W.E. Lorensen, H.E. Cline, Marching cubes: A high resolution 3D surface construction algorithm, Computer Graphics 21 (4) (1987) 163–169.
[17] C. Montani, R. Scateni, R. Scopigno, Discretized marching cubes, in: Visualization '94 Proceedings, IEEE Computer Society Press, 1994, pp. 281–287.
[18] A. Erol, G. Bebis, R.D. Boyle, M. Nicolescu, Visual hull construction using adaptive sampling, in: Proceedings of the IEEE Workshop on Application of Computer Vision, January 2005, pp. 234–241.
[19] E. Boyer, J.-S. Franco, A hybrid approach for computing visual hulls of complex objects, in: IEEE Computer Society International Conference on Computer Vision and Pattern Recognition, vol. 1, 2003, pp. I-695–I-701.
[20] G. Gross, A. Zisserman, Quadric reconstruction from dual-space geometry, in: Proceedings of the International Conference on Computer Vision, 1998, pp. 25–31.
[21] K. Kang, J.-P. Tarel, R. Fishman, B.D. Cooper, A linear dual-space approach to 3D surface reconstruction from occluding contours using algebraic surface, in: International Conference on Computer Vision, vol. 1, 2001, pp. 198–204.
[22] M. Brand, K. Kang, D. Cooper, Algebraic solution for the visual hull, in: IEEE Computer Society International Conference on Computer Vision and Pattern Recognition, vol. 1, 2004, pp. 30–35.
[23] W. Matusik, C. Buehler, R. Raskar, Image-based visual hulls, in: SIGGRAPH'00 Proceedings, New Orleans, July 2000, pp. 369–374.
[24] W. Matusik, C. Buehler, L. McMillan, Efficient View-Dependent Sampling of Visual Hull, MIT Technical Memo, MIT Laboratory for Computer Science, Cambridge, MA 02141, 2002.
[25] W. Matusik, H. Pfister, A. Ngan, P. Beardsley, R. Ziegler, L. McMillan, Image-based 3D photography using opacity hulls, in: Proceedings of SIGGRAPH, 2002, pp. 427–437.
[26] A.Y. Mülayim, O. Ozun, V. Atalay, F. Schmitt, On the silhouette based 3D reconstruction and initial bounding cube estimation, in: Vision Modeling and Visualisation, 2000, pp. 11–18.
[27] J. Semple, G. Kneebone, Algebraic Projective Geometry, Oxford University Press, 1952.
[28] A. Gueziec, G. Taubin, F. Lazarus, W. Horn, Converting sets of polygons to manifold surfaces by cutting and stitching, in: Proceedings of IEEE Visualization'98, 1998, pp. 383–390.