Computers & Graphics 32 (2008) 624–631
Geometry-optimized virtual human head and its applications

Yong-Jin Liu a,*, Matthew Ming-Fai Yuen b

a Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, PR China
b Department of Mechanical Engineering, The Hong Kong University of Science and Technology, Hong Kong, PR China

* Corresponding author. Tel.: +86 10 62780807; fax: +86 10 62771138. E-mail addresses: [email protected] (Y.-J. Liu), [email protected] (M.M.-F. Yuen).
Keywords: Deformable model; Head model; Parameterization
Abstract

The creation of realistic virtual human heads is a challenging problem in computer graphics, due to the diversity and individuality of human beings. In this paper, an easy-to-use geometry-optimized virtual human head is presented. By using a uniform two-step refinement operator, the virtual model has a natural multiresolution structure. At the coarsest level, the model is represented by a control mesh $M^0$ that serves as the anatomical structure of the head. To achieve fine geometric details over $M^0$, a multi-level detail set is assigned to the corresponding control vertices in $M^0$, with the aid of a uniquely defined local frame field. We show that, with a carefully designed control mesh, the presented virtual model not only captures the physical characteristics of the head but is also optimal in the geometric sense. A further contribution is a set of diverse applications of the virtual model, showing that (1) the model deformation is smooth and natural, and (2) the virtual model is easy to use.
1. Introduction
With the advance of virtual reality techniques, virtual communities of human inhabitants have drawn a lot of research attention. In a large-scale inhabited virtual environment, thousands of different head geometries are usually required to assemble a vast population. While end users in these virtual worlds have invested a large amount of effort, time and money in building large social networks and creating virtual goods and other digital assets, they find it difficult to reuse these assets across heterogeneous virtual worlds. One way to obtain various head geometries is to use scanning technology, but the scanning process is expensive and the scanned data frequently exhibit artifacts that need to be repaired manually. A more efficient and low-cost way is to deform a canonical head model based on some personal data, such as anthropometric measurements, range images or photographs. Several candidate canonical models have been proposed in previous work and are summarized in Section 2. The quality of the resulting head geometries strongly depends on the representation of the canonical model. In this paper we present an easy-to-use canonical head model with diverse applications, showing that distinct, plausible and detailed head geometries can be captured easily and automatically.
2. Related work

Surface representation. Free-form shape modeling techniques mainly use three classes of representations: polygonal meshes, parametric surfaces and implicit surfaces. In the context of head modeling, very few parametric models have been proposed: a typical work is presented in [1], which uses a B-spline surface as the shape representation. To the best of the authors' knowledge, there are no implicit head models. Compared to parametric surfaces, polygonal meshes offer more flexibility to model very fine details on the head. However, applications of polygonal models are confronted with difficulties in attaining local and smooth deformation.

Deformation techniques. A number of deformation techniques have been proposed for polygonal head model editing. Usually a small set of personalized feature vertices is first specified and then different techniques are used to determine the positions of the remaining mesh vertices. Kurihara and Arai [2] projected the vertices onto a cylindrical surface and interpolated the positions of non-feature vertices with the Delaunay triangulation of the feature vertices. Ip and Yin [3] looked for the n nearest feature vertices around each non-feature vertex. Akimoto and Suenaga [4] and Pighin et al. [5] used a scattered data interpolation function built over the feature vertices. Lee and Magnenat-Thalmann [6] used a Dirichlet free-form deformation. Lee et al. [7] and Kahler et al. [8] used an underlying muscle level to control the mesh deformation. A more recent work that utilizes the anatomical structure of the head is presented in [9].
The multi-layer editing of head models with a physical structure can also be made compliant with MPEG-4 facial definition parameters [10].

Multiresolution models. To take advantage of both parametric surfaces and polygonal meshes (i.e., so that the canonical model possesses very fine geometric details and can be deformed very smoothly), and differently from the work in [9] that uses radial basis functions, we propose a subdivision-surface-based canonical head model. For a detailed survey of the field of subdivision surfaces, the reader is referred to [11] and the references therein. Subdivision surfaces have been successfully applied in character design and animation [12]. In this paper, a further development of subdivision surfaces with multi-level scalar geometric details is proposed as a novel canonical model of the virtual human head. In terms of mathematical representation, our model is closely related to displaced subdivision surfaces [13] and normal meshes [14]. However, a displaced subdivision surface carries only one level of scalar geometric details, and in normal meshes the inherent local frame of each scalar detail is not as smooth as our subdivision-rule-based local frames. As will become clear in Sections 3 and 4, our proposed model encodes geometric details at multiple levels and deforms very smoothly. These desired properties make our canonical model particularly suitable for virtual head modeling. Diverse examples are presented in this paper, showing the effectiveness of the proposed canonical model. The presented work is an extension of our previous work [15–17], with the following important improvements:
- The feature-based approach in [15,16] can only provide physically optimal control meshes, which may be geometrically very bad. We study this phenomenon carefully by providing examples, which demonstrate that the presented canonical model not only captures the physical characteristics of the head but is also optimal in the geometric sense.
- With the sketch-based technique in [17], we extend the basic approach in [16] to non-photorealistic modeling and broaden the applications of the presented techniques.
3. A multi-layer canonical head model

The proposed canonical model is based on the following observations. First, all human heads share a common anatomical structure. Second, facial features, such as the mouth, eyes, nose and ears, are very similar between individuals when examined locally. Third, the final head model can be regarded as the anatomical structure with fine facial features attached.
Fig. 1. The control mesh, representing the anatomical structure of human head, consists of 59 vertices and 56 triangles.
Thus, to model the head efficiently, with the help of anthropometric studies [18,19], one can deform the anatomical structure and have the facial features move along with it. Based on these observations, we realize the anatomical structure of the head in a base control mesh $M^0$ as shown in Fig. 1. This control mesh consists of 59 vertices and 56 triangles. We use a laser-scanned point cloud $P$ to capture the full geometric details of the head. The moving least squares method [20] is used with the kernel $\theta(r) = e^{-r^2/h^2}$ to reconstruct a $C^1$ continuous surface $S(P)$ that interpolates $P$. The positions of the 59 vertices are manually specified in $P$. The specification task can be further reduced to locating 37 vertices in 28 triangles by making the control mesh symmetric. Note that although the control mesh is symmetric, the deformed head models need not be symmetric; see Fig. 7 for an example.

To attach facial features to the control mesh $M^0$, starting with $M^0$ consisting of the control vertices $v^0_1, \ldots, v^0_{59}$, the refined mesh $M^i$ with vertices $v^i_1, \ldots, v^i_{\#(i)}$ is computed on refinement levels $i = 1, 2, 3, 4$, where $\#(i)$ denotes the number of vertices in $M^i$. The refinement rules from $M^i$ to $M^{i+1}$ involve two steps (refer to Fig. 2).

Step 1. To quickly converge the shape to a smooth surface, we use butterfly interpolatory subdivision [11] to determine intermediate positions of the new vertices $\tilde{v}^{i+1}_{\#(i)+1}, \ldots, \tilde{v}^{i+1}_{\#(i+1)}$ in $\tilde{M}^{i+1}$. The positions of the old vertices $v^{i+1}_1, \ldots, v^{i+1}_{\#(i)}$ are not changed. The modified butterfly rules generate a $C^1$ smooth surface on arbitrary triangle meshes.

Step 2. Using the butterfly average rules, each new vertex is sent to its limit position on the smooth surface and a local Frenet frame $(t^1_j, t^2_j, n_j)$ can be uniquely determined for each new vertex $\tilde{v}^{i+1}_j$. To capture the full details of the head shape $S(P)$, each newly generated vertex is moved in its local Frenet frame along the normal direction. The resulting new position of each vertex is $v^{i+1}_j = \tilde{v}^{i+1}_j + d_j n_j$, where $v^{i+1}_j \in S(P)$ and $d_j$ is a scalar displacement in the normal direction.

The process of canonical mesh refinement is illustrated in Fig. 3. The refined canonical meshes quickly converge to a smooth surface capturing the full head details. From Fig. 3 it is also clear that the major difference between a head model and a facial model is the presence of the complex ear shape. The coarse-to-fine hierarchy $(M^0, M^1, M^2, M^3, M^4)$ of the canonical mesh possesses a natural multiresolution structure $(M^0, D^0, D^1, D^2, D^3)$, where $D^i$ is the $i$th-level scalar detail set. The cardinality of $D^i$ is $\#(i) - \#(i-1)$. The multiresolution functionality can be written as $M^{i+1} = M^i \oplus D^i$, where the reconstruction operation $\oplus$ is specified by the two-step refinement rules shown in Fig. 2.

Control-mesh-driven smooth deformation. To this end, the proposed canonical head model can be formally defined as an anatomical structure (also called the control mesh) $M^0$ together with multi-level geometric details $(D^0, D^1, D^2, D^3)$ (cf. Figs. 3 and 4).
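As an illustration of the two-step refinement, the following Python sketch shows how one refinement level could compute the scalar detail set $D^i$. It is a minimal sketch only, not the paper's implementation: the helpers `butterfly_subdivide`, `limit_frame` and `closest_point_on_surface` are assumed placeholders standing in for the modified butterfly scheme of [11], its limit/average rules, and a query against the MLS surface $S(P)$.

```python
import numpy as np

def capture_details(vertices, faces, butterfly_subdivide, limit_frame,
                    closest_point_on_surface):
    """One refinement level M^i -> M^{i+1} of the canonical model (sketch).

    Assumed helper signatures (placeholders supplied by the caller):
      butterfly_subdivide(V, F)      -> (V_tilde, F_new, new_idx): intermediate
                                        new-vertex positions from the modified
                                        butterfly stencils, plus new connectivity.
      limit_frame(V_tilde, F_new, j) -> (t1, t2, n): local Frenet frame at new
                                        vertex j from the butterfly average rules.
      closest_point_on_surface(p, n) -> point of S(P) hit by the normal ray
                                        through p.
    """
    V_tilde, F_new, new_idx = butterfly_subdivide(vertices, faces)
    details = np.zeros(len(new_idx))          # scalar detail set D^i
    V_next = np.array(V_tilde, dtype=float)   # old vertices are kept unchanged
    for k, j in enumerate(new_idx):
        t1, t2, n = limit_frame(V_tilde, F_new, j)
        target = closest_point_on_surface(V_tilde[j], n)   # point on S(P)
        d = np.dot(np.asarray(target) - V_tilde[j], n)     # displacement along the normal
        details[k] = d
        V_next[j] = V_tilde[j] + d * n        # v_j^{i+1} = v~_j^{i+1} + d_j * n_j
    return V_next, F_new, details
```

Only the scalar $d_j$ is stored per new vertex; the frame itself is recomputed from the subdivision rules whenever the control mesh changes, which is what allows the details to be reused under deformation.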
Fig. 2. The two-step refinement rules to capture head details with the anatomical structure in Fig. 1.
Fig. 3. The refined canonical mesh that associates head details to the control vertices in M0 .
Given a set of anthropometric head data of individuals, the control mesh $M^0$ is deformed to drive the canonical model. This amounts to specifying a few vertex positions in $M^0$ and thus can be done very efficiently. Several ways to automatically specify vertex positions will be presented in the next section. Given a deformed control mesh $\hat{M}^0$, the deformation process can be described by
$$\hat{M}^1 = \hat{M}^0 \oplus D^0, \quad \hat{M}^2 = \hat{M}^1 \oplus D^1, \quad \hat{M}^3 = \hat{M}^2 \oplus D^2, \quad \hat{M}^4 = \hat{M}^3 \oplus D^3.$$

Fig. 4. Canonical model deformation.
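Conversely, the stored details can be replayed on an edited control mesh to synthesize a deformed head one level at a time, i.e., $\hat{M}^{i+1} = \hat{M}^i \oplus D^i$. The sketch below is again only illustrative and reuses the placeholder helpers assumed in the previous sketch.

```python
def reconstruct_level(V_hat, F, details, butterfly_subdivide, limit_frame):
    """Apply one level of stored details D^i to a (deformed) mesh level (sketch)."""
    V_tilde, F_new, new_idx = butterfly_subdivide(V_hat, F)
    V_next = V_tilde.copy()
    for k, j in enumerate(new_idx):
        _, _, n = limit_frame(V_tilde, F_new, j)    # frame recomputed on the deformed mesh
        V_next[j] = V_tilde[j] + details[k] * n     # add the stored scalar displacement
    return V_next, F_new

# Usage sketch: starting from an edited control mesh (V0_hat, F0) and the
# detail sets D = [D0, D1, D2, D3], the full deformed head M^4 is obtained by
#   V, F = V0_hat, F0
#   for Di in D:
#       V, F = reconstruct_level(V, F, Di, butterfly_subdivide, limit_frame)
```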
The control-mesh-driven deformation is smooth, since the $C^1$ butterfly subdivision scheme is embedded in each refinement level. An example of canonical model deformation is presented in Fig. 4.

Mesh parameterization. Parameterization between detailed surface meshes and simple domains has diverse applications in computer graphics and geometry processing [21]. The multiresolution structure inherent in the canonical head model offers a natural parameterization of the detailed meshes $M^i$, $i \geq 0$, over the control mesh $M^0$, i.e., the two-step refinement rules shown in Fig. 2 create a piecewise linear mapping between $M^i$ and $M^0$. Fig. 5 shows several examples of canonical model parameterization over the control mesh. In the next section, parameterization is used as an effective tool for mapping textures onto surfaces.

Physical and geometric aspects of the canonical model. In the canonical model deformation process, a prescribed uniform refinement operator is used, which leads to a specific connectivity (so-called semi-regular or subdivision connectivity) inherent in the hierarchical meshes $M^i$. Uniform refinement means that the distribution of the control vertices in the anatomical structure must be uniform so that the refined canonical meshes have a good triangle aspect ratio, a property that is desirable in many computer graphics applications. Accordingly, a desired canonical head model $M^0 = (K^0, V^0)$ should not only take care of the topological structure $K^0$ (which encodes the mesh connectivity information) but also take the geometric aspect $V^0$ (i.e., the positions of all vertices) into consideration.
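Returning to the parameterization over $M^0$: because every refined vertex is created on an edge of the previous level, its location in the base domain can be tracked combinatorially as a (base triangle, barycentric coordinates) pair. The following bookkeeping sketch is our illustration, not code from the paper, and makes the simplifying assumption that a split edge lies inside a single base triangle (edges shared by two base triangles need an obvious special case).

```python
def subdivide_parameterization(param, edge_to_new_vertex):
    """Propagate (base_triangle, barycentric) coordinates through one 1-to-4 split.

    param: dict vertex_id -> (base_tri_id, (b0, b1, b2)) on the control mesh M^0.
    edge_to_new_vertex: dict (va, vb) -> id of the new vertex created on that edge.
    Edges whose endpoints lie in different base triangles sit on a base edge of
    M^0 and would need the shared-edge coordinates; that case is omitted here.
    """
    new_param = dict(param)                       # old vertices keep their coordinates
    for (va, vb), vnew in edge_to_new_vertex.items():
        tri_a, bary_a = param[va]
        tri_b, bary_b = param[vb]
        if tri_a == tri_b:
            bary = tuple(0.5 * (a + b) for a, b in zip(bary_a, bary_b))
            new_param[vnew] = (tri_a, bary)       # edge midpoint in the base domain
    return new_param
```

Texture or attribute values assigned on $M^0$ can then be carried to any $M^i$ through this map, which is consistent with how textures are mapped onto the refined head in the next section.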
Fig. 5. Parameterization of several heads with the canonical structure.
Fig. 6. A physically feasible canonical model that is not geometrically good; compare with Fig. 3.
Fig. 6 shows a physically feasible canonical model that is not geometrically good. This reveals one novelty of our carefully designed canonical model shown in Fig. 3. Note that the canonical model only needs to be generated once. In all the downstream applications presented in the next section, all one needs to do is manipulate the 59 control vertices in $M^0$.
4. Applications

By collecting individual head information from any of (1) anthropometric data [18,19], (2) range images [9] or (3) 2D photographs [6,22], it is convenient to drive the canonical head model thanks to the small number of control vertices in $M^0$. Below we show several applications of this simple strategy, demonstrating that the presented canonical model is a useful tool in computer graphics.

Feature head space generation. A metamorphosis, or morphing, is a process of continuously transforming one shape into another. One difficulty in morphing techniques is finding the correspondence between two shapes. With the presented canonical head model, all customized models have the same anatomical structure, i.e., the same topology of the control mesh. Since there is a one-to-one correspondence between the vertices of two customized control meshes $M^0(0)$ and $M^0(1)$, a simple linear interpolation
$$v^0_j(t) = (1-t)\,v^0_j(0) + t\,v^0_j(1), \qquad v^0_j(0) \in M^0(0), \; v^0_j(1) \in M^0(1), \qquad 0 \leq t \leq 1, \; 1 \leq j \leq 59,$$
can be used to obtain a continuous transformation $M^0(t)$. Accordingly, associated with the multi-level details $(D^0, D^1, D^2, D^3)$, the continuous transformation $M^4(t)$ offers the desired morphing between two customized head models. This idea can easily be extended to an $n$-dimensional vector space in which a number of sample models $(e_1, e_2, \ldots, e_n)$ serve as shape bases. In this featured vector space, any customized model with coordinates $(x_1, x_2, \ldots, x_n)$ has its shape driven by the control mesh
$$M^0(x_1, x_2, \ldots, x_n) = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n.$$
Fig. 7 shows a four-dimensional featured vector space of canonical heads.
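Since every customized head shares the control-mesh topology, both morphing and the featured vector space reduce to per-vertex arithmetic on the 59 control vertices, followed by the usual detail reconstruction. A minimal numpy sketch (the 59 x 3 array layout is our assumption):

```python
import numpy as np

def morph_control_mesh(V0_a, V0_b, t):
    """Linear morph M^0(t) between two customized control meshes (59 x 3 arrays)."""
    return (1.0 - t) * np.asarray(V0_a, float) + t * np.asarray(V0_b, float)

def blend_control_mesh(bases, coords):
    """Featured vector space: M^0(x1,...,xn) = sum_k x_k * e_k.

    bases:  (n, 59, 3) array of sample control meshes e_1..e_n.
    coords: length-n array of blending coordinates x_1..x_n.
    """
    bases = np.asarray(bases, dtype=float)
    coords = np.asarray(coords, dtype=float)
    return np.tensordot(coords, bases, axes=1)   # (59, 3) blended control mesh

# The morphed or blended control mesh is then refined with the stored details
# (D^0..D^3), as in the earlier reconstruction sketch, to obtain the full head M^4.
```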
Fig. 7. A four-dimensional featured vector space of canonical head models.
Implicit modeling. Since the $M^4$-level model is a dense mesh with the full details of the head, the adaptive method of [23] can be used to scan-convert the canonical head into a signed distance field (SDF). The SDF is an implicit model with which operations such as offsetting, blending and Boolean combinations can easily be performed. Fig. 8 shows an example of the offset operation. To do this, let the SDF of the model $M^4$ be denoted by $D$. A level surface (also called an isocontour or isosurface) can be traversed using the function $F$ defined as $F(x) = D(x) - d$, $d \in \mathbb{R}$. The surface $F^{-1}(0)$ is then the desired offset of $D^{-1}(0)$ in the metric induced by the signed distance values.

Photorealistic head modeling. A three-dimensional graphics representation of a virtual human head carries not only geometric information but may also carry other attributes, such as color. Rather than using an expensive laser scanner to capture the full details of the head, vivid head models can also be obtained from a small set of family-made photographs. This technique can obviously broaden the use of virtual humans among a large population of non-expert users. Below we present such an application.

Given a pair of photographs, a front view $F$ and a side view $S$, we first identify a set of salient points $\{f_i\}$ in $F$ and $\{s_i\}$ in $S$. Each point in $\{f_i\}$ or $\{s_i\}$ has a corresponding vertex in $M^0$, and at least four pairs $(f_i, s_j)$ correspond to four distinct vertices in $M^0$, denoted by $(f_{i_k}, s_{j_k}) \leftrightarrow v_k$, $k = 1, 2, 3, 4$. Then, using stereo vision with epipolar geometry [22], the 3D positions of the $v_k$ can be determined. If a control vertex $v_k$ in $M^0$ has only one corresponding salient point in $\{f_i\}$ or $\{s_i\}$, a ray through that salient point is constructed in space and $v_k$ is moved to the nearest point on that ray. Note that, although we have not implemented it, the salient point identification in the photographs can be performed fully automatically using the structured snake algorithm in [6].

After the geometry of the control mesh $M^0$ has been determined, a view-independent texture map is created from the given photographs in three steps. First, the side-view photograph is flipped to give another side-view photograph. Then, based on the recovered camera parameters, a quasi-dense set of points in space near the control mesh is colored by projecting each point into the three image planes and blending the pixel colors at the three projection points in a weighted manner. Finally, by splitting the control mesh at the back of the head, the 3D mesh $M^0$ is flattened into a 2D canonical mesh $M^0_{2D}$. Applying the same multiresolution reconstruction operator to both $M^0$ and $M^0_{2D}$ yields a photorealistic head model. Fig. 9 shows such an example, demonstrating that our method is easy to use and that the results are satisfactory to common users.
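For the offset operation above, the shift $F(x) = D(x) - d$ can be applied directly to a sampled distance field, and the zero set of $F$ is the offset surface. The sketch below uses a regular grid and scikit-image's marching cubes purely for illustration; the paper itself works with adaptively sampled distance fields [23].

```python
import numpy as np
from skimage import measure   # assumption: scikit-image is available for isosurface extraction

def offset_surface(sdf_grid, d, spacing=(1.0, 1.0, 1.0)):
    """Extract the offset surface F^{-1}(0), with F(x) = D(x) - d, from a sampled SDF.

    sdf_grid: 3D numpy array of signed distances D sampled on a regular grid.
    d: offset distance in the metric induced by the signed distance values.
    """
    F = np.asarray(sdf_grid, dtype=float) - d        # shifting the field offsets the zero level set
    verts, faces, _, _ = measure.marching_cubes(F, level=0.0, spacing=spacing)
    return verts, faces
```

Equivalently, one could extract the level set of the original field at `level=d`; shifting the field first simply keeps the offset in the form $F^{-1}(0)$ used in the text.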
Fig. 8. Offset a canonical head using the signed distance field [23].
Fig. 9. Photorealistic head modeling by texture mapping family-made photographs.
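As a simplified illustration of recovering control vertices from the front and side photographs: the paper uses full epipolar geometry [22], but if the two views are treated as orthographic and axis-aligned (an assumption of ours, not the paper's method), a control vertex can be assembled directly from its 2D picks.

```python
import numpy as np

def vertex_from_orthogonal_views(front_xy, side_zy, scale=1.0, origin=(0.0, 0.0, 0.0)):
    """Assemble a 3D control vertex from a front-view pick (x, y) and a side-view pick (z, y).

    Assumes orthographic, axis-aligned views sharing a common vertical axis; the
    y coordinate is averaged between the two picks to reduce picking error.
    """
    x, y_front = front_xy
    z, y_side = side_zy
    y = 0.5 * (y_front + y_side)
    return scale * np.array([x, y, z]) + np.asarray(origin, dtype=float)
```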
Non-photorealistic head modeling. In recent years, non-photorealistic modeling has attracted much attention in 3D cartoon design and game development. In our study, we are particularly interested in generating 3D non-photorealistic models from ancient portraits. Since a portrait is sketch-based, it is a typical case of non-photorealistic modeling, and much of the photorealistic modeling pipeline can be reused. Refer to Fig. 10. Given a sketched portrait, we flip it to obtain another view. The salient edges in the portraits are then detected by applying the standard Canny operator. Consequently, the 3D control mesh and the 2D texture map are built by searching along the salient edges and matching similar points. This process can be sped up with user intervention; the features that need to be specified are described in [17]. Afterwards, the remaining procedure is the same as for photorealistic modeling.
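The salient-edge step of the sketch-based pipeline can be reproduced with a standard Canny detector; a minimal OpenCV illustration is given below (the threshold values are placeholders to be tuned per portrait).

```python
import cv2

def salient_edges(portrait_path, low=50, high=150):
    """Detect salient edges on a portrait and on its horizontally flipped copy."""
    img = cv2.imread(portrait_path, cv2.IMREAD_GRAYSCALE)
    flipped = cv2.flip(img, 1)                  # second 'view' obtained by mirroring
    edges = cv2.Canny(img, low, high)
    edges_flipped = cv2.Canny(flipped, low, high)
    return edges, edges_flipped
```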
5. Conclusions

The human head is a significant part of the human body, and many efforts have been devoted to computer-aided modeling of virtual human heads. However, realistic head modeling is still a challenging task. In this paper, a geometry-optimized canonical model of the virtual human head is presented. The canonical model has a natural multiresolution structure. The anatomical structure of the head is represented at the coarsest level $M^0$. By decomposing the fully detailed head model into $M^0$ and a multi-level detail set, distinct customized head models can be generated by controlling the vertex positions of $M^0$. Since $M^0$ has very few vertices, this deformation is very efficient. By encoding the scalar details in uniquely defined local frames, the deformation is also smooth. Diverse examples are presented, showing that the presented canonical model is easy to use and effective in several important applications, including both photorealistic and non-photorealistic head modeling.

There are several directions for future work. Currently the canonical model can only describe a static mouth; it would be interesting to investigate how to add time-varying control points around the mouth, which would lead to a talking-head model [24]. Another direction is to model realistic hair styles. With the aid of texture mapping, the presented canonical model can describe some simple short male hair styles; however, long hair is better treated as a separate part [25].
Fig. 10. Non-photorealistic head modeling from ancient portraits.
Acknowledgment

This work was supported by a project funded by the Tsinghua Basic Research Foundation, the National Natural Science Foundation of China (Project nos. 60603085 and 60736019), and the National High Technology Research and Development Program of China (Project nos. 2006AA01Z304 and 2007AA01Z336).

References

[1] DeCarlo D, Metaxas D, Stone M. An anthropometric face model using variational techniques. In: Proceedings of SIGGRAPH'98, 1998. p. 67–74.
[2] Kurihara T, Arai K. A transformation method for modeling and animation of the human face from photographs. In: Proceedings of Computer Animation'91, 1991. p. 45–58.
[3] Ip H, Yin L. Constructing a 3D individualized head model from two orthogonal views. Visual Computer 1996;12(5):254–66.
[4] Akimoto T, Suenaga Y. Automatic creation of 3D facial models. IEEE Computer Graphics and Applications 1993;13(5):16–22.
[5] Pighin F, Hecker J, Lischinski D, Szeliski R, Salesin D. Synthesizing realistic facial expressions from photographs. In: Proceedings of SIGGRAPH'98, 1998. p. 75–84.
[6] Lee W, Magnenat-Thalmann N. Fast head modeling for animation. Image and Vision Computing 2000;18(4):355–64.
[7] Lee Y, Terzopoulos D, Waters K. Constructing physics-based facial models of individuals. In: Proceedings of Graphics Interface'93, 1993. p. 1–8.
[8] Kahler K, Haber J, Yamauchi H, Seidel H. Head shop: generating animated head models with anatomical structure. In: Proceedings of the ACM SIGGRAPH Symposium on Computer Animation, 2002. p. 55–64.
[9] Zhang Y, Sim T, Tan CL. From range data to animated anatomy-based faces: a model adaption method. In: Fifth international conference on 3-D digital imaging and modeling (3DIM05), 2005. p. 343–50.
[10] Chen C, Prakash E. Personalized cyber face: a novel facial modeling approach using multi-level radial basis function. In: International conference on cyberworlds, 2005. p. 23–30.
[11] Zorin D, Schroder P. Subdivision for modeling and animation. In: SIGGRAPH course notes, 2000.
[12] DeRose T, Kass M, Truong T. Subdivision surfaces in character animation. In: Proceedings of SIGGRAPH'98, 1998. p. 85–94.
[13] Lee A, Moreton H, Hoppe H. Displaced subdivision surfaces. In: Proceedings of SIGGRAPH'00, 2000. p. 85–94.
[14] Guskov I, Vidimce K, Sweldens W, Schroder P. Normal meshes. In: Proceedings of SIGGRAPH'00, 2000. p. 95–102.
[15] Liu YJ, Yuen M, Xiong S. A feature-based approach for individualized human head modeling. Visual Computer 2002;18(5–6):368–81.
[16] Liu YJ. Complex shape modeling with point-sampled geometry. PhD dissertation, Hong Kong University of Science and Technology, 2003.
[17] Liu YJ, Tang K, Joneja A. Sketch-based free-form deformation with a fast and stable numerical engine. Computers & Graphics 2005;29(5):778–93.
[18] Enciso R, Shaw A, Neumann U, Mah J. 3D head anthropometric analysis. In: Proceedings of SPIE Medical Imaging'05, 2005. p. 5029:590–7.
[19] Farkas L. Anthropometry of the head and face. New York: Raven Press; 1994.
[20] Levin D. Mesh-independent surface interpolation. In: Brunnett G, Hamann B, Muller H, Linsen L, editors. Geometric modeling for scientific visualization. Berlin: Springer. p. 37–49.
[21] Sheffer A, Praun E, Rose K. Mesh parameterization methods and their applications. Foundations and Trends in Computer Graphics and Vision 2006;2(2):105–71.
[22] Hartley R, Zisserman A. Multiple view geometry in computer vision. 2nd ed. Cambridge University Press; 2004.
[23] Frisken S, Perry R, Rockwood A, Jones T. Adaptively sampled distance fields: a general representation of shape for computer graphics. In: Proceedings of SIGGRAPH'00, 2000. p. 249–54.
[24] Beskow J. Talking heads: models and applications for multimodal speech synthesis. Doctoral thesis, Department of Speech, Music and Hearing, Royal Institute of Technology (KTH), 2003.
[25] Ward K, Bertails F, Kim T, Marschner S, Cani M, Lin M. A survey on hair modeling: styling, simulation, and rendering. IEEE Transactions on Visualization and Computer Graphics 2007;13(2):213–34.