COMPUTER VISION AND IMAGE UNDERSTANDING
Vol. 72, No. 3, December, pp. 328–339, 1998
Article No. IV970668
Direct Visible Surface Interpolation

Baining Guo
Intel Corporation, Microcomputer Graphics Laboratory, RN6-35, 2200 Mission College Blvd., Santa Clara, California 95052
E-mail: [email protected]

and

Joseph Liu
Department of Computer Science, York University, 4700 Keele Street, North York, Ontario, Canada M3J 1P3

Received October 2, 1996; accepted October 10, 1997
The computation of visible surfaces is usually formulated in a regularization framework based on thin-plate and membrane splines. When discretized, this formulation leads to large sparse linear systems. Most surface interpolation methods solve these sparse systems with iterative methods. Here we explore the use of direct methods. Through a careful analysis of the regularization operator, we derive direct methods that efficiently make use of all zeros in the sparse discretization of the operator. Experimental results show that, compared with iterative interpolation methods, the direct methods we present are competitive in general, and they provide significant speed-ups for problems involving discontinuities. In addition to their use in visible-surface interpolation, the presented methods also support very efficient time integration for deformable surfaces. © 1998 Academic Press

Key Words: regularization; surface interpolation; variational principles; sparse matrix analysis; discontinuities; deformable models; finite elements.
1. INTRODUCTION

The goal of visible-surface interpolation is to produce a complete description of a surface that is only partially constrained by available data. This computation plays an important role in early vision processes such as shape-from-shading, structure-from-motion, and stereopsis [13, 14, 24, 22, 27]. The physical models for visible-surface interpolation also form the basis of deformable surfaces, which are widely used for recovering shapes and nonrigid motion [26]. In recent years, deformable surfaces with high-order finite elements have become popular in the computer-aided design (CAD) community [4, 20].

Visible-surface interpolation is usually formulated in a regularization framework, and the surfaces we seek are characterized by the Euler–Lagrange equation [24]. When discretized with finite-element techniques, this equation gives rise to a large sparse linear system. Most existing interpolation methods solve the sparse system using iterative methods [13, 24, 22, 27, 18, 15]. A problem with iterative methods is that they converge very slowly, because the sparse system for visible-surface interpolation is poorly conditioned. In many cases, preconditioners can introduce significant speed-ups [22, 27, 15], but general techniques for designing such preconditioners remain poorly understood. For example, there are no effective preconditioners for interpolation problems involving discontinuities [15].

The difficulties with iterative methods led us to consider direct methods as an alternative approach to visible-surface interpolation. Clearly, a naive direct method is likely to be expensive. Even after taking into account the banded structures of sparse matrices for interpolation [15], the time complexity is still O(N²) on a regular grid of N node variables. The methods we derive make use of all zeros in the sparse matrices and find visible-surface representations in O(N^1.5) time with O(N log N) storage. Our derivations are based on a careful analysis of the regularization operator for thin-plate and membrane splines, and the analysis is applicable to general controlled-continuity stabilizers over regular grids [24].

To make use of all zeros in the sparse matrices, we employ our knowledge about the regularization operator to order the rows and columns of the matrices in such a way that the number of nonzeros introduced in the matrix factorization process is significantly reduced. Two ordering strategies are studied: the minimum degree ordering [11] and the multisection ordering [2]. Experimental results show that, compared with the interpolation methods in the literature, direct methods based on these ordering strategies are competitive in general, and they provide significant speed-ups when the input data contain prescribed discontinuities.

In addition to their use in visible-surface interpolation, the presented methods also support very efficient time integration for deformable surfaces. Implicit methods in conjunction with matrix factorization are known to provide stable time integration [25, 26], but matrix factorization without exploiting all zeros in sparse matrices quickly becomes too expensive for finely discretized surfaces. The direct methods presented here can generate compact matrix factorizations that allow fast and stable updates of deformable surfaces. Moreover, in practice the material properties of the thin plates change rarely if at all [26], which implies the same matrix factorization may be used for many time
steps. With these factors taken into consideration, the methods we present are among the most attractive ones for implementing deformable surfaces. For example, one of the presented methods needs only about 0.3 s to update a thin-plate spline surface over a regular grid of 120 × 120 on a Sun Sparc 10.

In previous work on surface interpolation, much effort has been devoted to improving the convergence rates of iterative methods. In some of his early work, Terzopoulos used multigrid methods to speed up the convergence by orders of magnitude over single-grid approaches [24]. Szeliski further accelerated the computation by preconditioning the matrices with hierarchical basis functions [22]. Recently, various wavelet-based preconditioning methods have become popular [18, 27, 15]. In their latest work on wavelet preconditioning [15], Lai and Vemuri adapted preconditioners to the specific problems being solved and achieved significant speed-ups over previous iterative methods.

The remainder of this paper is organized as follows: Section 2 briefly discusses the physical models for regularization and some of the main difficulties in preconditioner design. Section 3 describes a direct interpolation method based on minimum degree ordering. Section 4 further explores an ordering strategy for regular grids. Section 5 presents experimental results on direct interpolation methods and compares them with the interpolation methods found in the literature. This section also explains how the presented methods allow fast and stable time integration for deformable surfaces. Finally, Section 6 concludes with discussions about future work.

2. REGULARIZATION AND PRECONDITIONING

In this section, we first describe visible-surface interpolation in a regularization framework. Then, we examine some of the main difficulties in designing preconditioners for iterative methods.

2.1. The Controlled-Continuity Stabilizer

In the regularization framework, the visible surface we seek minimizes some energy functional. The controlled-continuity stabilizer uses the functional E(v) = S(v) + D(v) [24, 3] to determine a visible surface v(x, y), where the smoothness energy S(v) over a 2D domain Ω is

S(v) = (1/2) ∬_Ω ρ(x, y){τ(x, y)(v_xx² + 2v_xy² + v_yy²) + [1 − τ(x, y)](v_x² + v_y²)} dx dy,

and D(v) is a penalty functional measuring the discrepancy between the visible surface v(x, y) and the data constraints. The continuity control functions ρ(x, y) and τ(x, y) at a point (x, y) determine the local smoothness of v(x, y) at that point. Specifically, the surface v(x, y) is locally a membrane spline when τ(x, y) = 0, a thin-plate spline when τ(x, y) = 1, and discontinuous when ρ(x, y) = 0. As for the functional D(v), its form depends on the data constraints. For example,
D(v) = (1/2) Σ_i α_i (v(x_i, y_i) − c_i)²
is a positively weighted sum of squares for positional constraints of the form v(x_i, y_i) ≈ c_i [24].

The variational principles imply that the surface v(x, y) minimizing E(v) satisfies the Euler–Lagrange equation

ρ(x, y){τ(x, y)Δ²v(x, y) − [1 − τ(x, y)]Δv(x, y)} + α(x, y)v(x, y) = α(x, y)c(x, y),    (1)
where Δ is the Laplace operator and Δ² is the biharmonic operator. Note that we have modeled the positional constraints with the functions α(x, y) and c(x, y); modeling orientation constraints is also straightforward.

Using finite-element techniques, we can discretize the above regularization model. For the sake of simplicity, let us use a k × k regular grid as in [24]. With this discretization, the energy functional E(v) becomes a positive-definite quadratic form E(v) = (1/2)vᵗKv − bᵗv + c, with the vector v of size N = k² representing the visible surface. The discretized Euler–Lagrange equation is the linear system Kv = b, where the stiffness matrix K is a sparse N × N matrix. The solution of Kv = b is the discrete visible surface. As mentioned earlier, the system Kv = b is poorly conditioned, and current iterative methods rely on preconditioning to achieve reasonable convergence rates.

2.2. Difficulties with Preconditioner Design

Several researchers have noted that the regularization process may be regarded as applying a low-pass filter to the data (e.g., see [23, 15]). For example, assuming the functions ρ, τ, and α in the Euler–Lagrange equation are constants, we can apply the Fourier transform to both sides of Eq. (1) and obtain the corresponding equation in the frequency domain ω = (ω_1, . . . , ω_N) as
v̂(ω) = [α / (ρτ‖ω‖⁴ + ρ(1 − τ)‖ω‖² + α)] ĉ(ω).    (2)
Because ρ, τ, and α are constants, the regularization filter is linear and shift-invariant. Lai and Vemuri proposed an effective preconditioner design for this sort of filter [15].

In general, the functions ρ(x, y), τ(x, y), and α(x, y) are not constants, as is the case for surface reconstruction with prescribed discontinuities. In that case, the continuity control functions ρ(x, y) and τ(x, y) vary from point to point according to the locations of positional and orientational discontinuities. For such ρ(x, y) and τ(x, y), the regularization filter is nonlinear and spatially varying, which makes it difficult to design and analyze preconditioners.
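To make the discretized model Kv = b concrete before turning to solvers, the following sketch assembles a small instance and solves it. It is a minimal illustration, not the authors' code: it uses a standard 5-point finite-difference stencil for the membrane case (τ = 0, ρ = 1 everywhere), and the grid size, weight α, and constraint data are invented for the example.

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

k = 16                        # k x k grid, N = k^2 node variables
N = k * k
idx = lambda i, j: i * k + j  # grid node -> vector index

# Smoothness term S(v): 5-point membrane stencil (tau = 0, rho = 1).
K = sp.lil_matrix((N, N))
for i in range(k):
    for j in range(k):
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < k and 0 <= jj < k:
                K[idx(i, j), idx(i, j)] += 1.0
                K[idx(i, j), idx(ii, jj)] -= 1.0

# Penalty term D(v): positional constraints v(x_i, y_i) ~ c_i with weight alpha.
alpha = 10.0
constraints = [((3, 3), 1.0), ((12, 8), -0.5), ((7, 13), 2.0)]  # invented data
b = np.zeros(N)
for (i, j), c in constraints:
    K[idx(i, j), idx(i, j)] += alpha
    b[idx(i, j)] += alpha * c

v = spsolve(K.tocsr(), b)       # the discrete visible surface
print(v.reshape(k, k)[3, 3])    # pulled toward 1.0 by the nearby constraint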
3. DIRECT SURFACE INTERPOLATION

In this section, we describe the main stages of an interpolation method adapted from direct methods for general sparse systems. The emphasis is on the different adaptation choices and their impact. Throughout the discussion, we identify the key issue of ordering, which will be explored further in the next section.

3.1. Overview

Let us first provide an overview of the various phases involved in direct visible-surface interpolation. The first step of the interpolation is to discretize the Euler–Lagrange equation into a linear system Kv = b as described above. Note that the stiffness matrix K depends on the material properties of the surface through the continuity control functions.

The linear system Kv = b is sparse, symmetric, and positive-definite. Direct methods solve such a system by computing the Cholesky factorization K = LLᵗ, where the Cholesky factor L is a lower triangular matrix with positive diagonal entries. The solution vector x can then be obtained by forward and backward substitutions in the triangular systems Ly = b and Lᵗx = y. Readers are referred to the sparse matrix books by Duff et al. [6] and George and Liu [10] for background information.

When K is sparse, some entries that are initially zero in the matrix K may become nonzero in the factor L due to the factorization. Such entries of L are referred to as fills. For obvious concerns of storage and time efficiency, it is desirable to have a small number of fills and to exploit all zeros in the factor matrix L. Since the matrix K is positive definite, it is well known that row or column interchanges are not required to maintain numerical stability. However, permutations by row/column interchanges can have a drastic impact on the amount of fill [10]. Since numerical stability is not a concern, for any given permutation matrix P, we can solve the equivalent system

(PKPᵗ)(Px) = Pb.    (3)
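The effect of the permutation in Eq. (3) on fill can be observed numerically. The sketch below is illustrative only: it uses SciPy's SuperLU interface (an LU code, standing in for a sparse Cholesky solver) on a 5-point grid operator with a small diagonal shift, and compares the natural ordering against a minimum degree variant.

import scipy.sparse as sp
from scipy.sparse.linalg import splu

k = 30
lap1d = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(k, k))
K = (sp.kronsum(lap1d, lap1d) + 0.1 * sp.eye(k * k)).tocsc()  # SPD grid operator

# Same system, two orderings; only the amount of fill in L differs.
lu_nat = splu(K, permc_spec="NATURAL")
lu_mmd = splu(K, permc_spec="MMD_AT_PLUS_A")   # a minimum degree ordering

print("nnz(L), natural ordering:       ", lu_nat.L.nnz)
print("nnz(L), minimum degree ordering:", lu_mmd.L.nnz)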
The permutation P is often referred to as an ordering of the sparse matrix K. Therefore, we want to choose an ordering P such that the Cholesky factor of PKPᵗ has less fill than that of K.

As shown in Fig. 1, there are four distinct phases in the overall direct solution of Kx = b.

(a) Ordering: Find a fill-reducing ordering P for K.
(b) Symbolic factorization: Determine the structure of the Cholesky factor L of PKPᵗ and set up the data structure to store only the nonzeros of L.
(c) Numerical factorization: Place the numerical values of K into the data structure of L and factor K to obtain the numerical values of the nonzeros of L.
(d) Numerical substitution: Solve Ly = Pb and Lᵗz = y, and then set x = Pᵗz.

It is appropriate here to point out that early usage of direct methods in the literature did not exploit all possible zeros in the factor matrix. Instead, only zeros outside certain regions of the matrix were ignored. Band methods store and operate on zeros and nonzeros within the band of the matrix. Since most sparse matrices have variable bands, the exploitation of zeros outside the variable bands leads to the envelope or profile methods.

The sparse matrices in visible-surface computation are far from fully banded: each row of K has at most 13 nonzeros, yet the bandwidth of K is 3k for a k × k grid. Figure 2 illustrates the structure of the stiffness matrix K for a 5 × 5 grid.

FIG. 2. The structure of the stiffness matrix for a 5 × 5 grid.

Because of the structure of K, general sparse methods that take advantage of all zeros are better suited for interpolation than band or envelope methods. Indeed, it can be shown that band/envelope methods using the best band/envelope ordering require O(k⁴) arithmetic operations and incur O(k³) fills [10]. On the other hand, general sparse methods with the best fill-reducing ordering use only O(k³) floating-point operations to produce Cholesky factors with O(k² log₂ k) nonzeros.

3.2. Elimination Graph
The ordering and symbolic factorization steps do not require the numerical values of the given matrix K. Indeed, only the structure of the matrix is required. The matrix structure can be conveniently studied using graph theory. The graph of an
N × N symmetric matrix K, usually denoted by G(K), is a labeled undirected graph of N nodes with an edge between two nodes i and j if and only if the corresponding entry K_ij is nonzero.

The numerical factorization of a sparse matrix K can be modeled as a sequence of structural modifications to its associated graph G(K). The elimination of a variable i in numerical Gaussian elimination corresponds to the deletion of the node i and its incident edges from the graph. Furthermore, the fills incurred from this step in the matrix are associated with adding new edges to the remaining graph. Indeed, edges are added to the graph so that the neighbors of the eliminated node i become pairwise-connected. This transformed graph is referred to as an elimination graph. In other words, the entire numerical factorization can be modeled as a sequence of such elimination graphs.
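This graph model is easy to simulate. The sketch below performs symbolic elimination on an adjacency-set representation of G(K) and counts the fill edges created; the path-graph example at the end shows how strongly the fill depends on the elimination order.

def symbolic_fill(graph, order):
    """Eliminate nodes in the given order, connecting the neighbors of each
    eliminated node pairwise, and return the number of fill edges created."""
    adj = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    fills = 0
    for v in order:
        nbrs = adj.pop(v)
        for u in nbrs:
            adj[u].discard(v)
        for u in nbrs:                 # make the neighbors pairwise-connected
            for w in nbrs:
                if u != w and w not in adj[u]:
                    adj[u].add(w)
                    adj[w].add(u)
                    fills += 1
    return fills

# A path graph 1-2-3-4: eliminating interior nodes first creates fill,
# eliminating the endpoints first creates none.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(symbolic_fill(path, [2, 3, 1, 4]))  # 2 fill edges
print(symbolic_fill(path, [1, 4, 2, 3]))  # 0 fill edges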
3.3. Symbolic Phase

In the first symbolic step, a permutation matrix for K, or an ordering of the graph G(K), is determined. Since we consider only general sparse methods, for a given graph G(K) we want to find an ordering P that will minimize the amount of fill in its Cholesky factor. Unfortunately, this is a very hard combinatorial problem (an NP-complete problem [10]). In practice, we have to rely on some heuristic ordering schemes.

Perhaps the most successful and widely used ordering method for reducing fills is the minimum degree ordering. It is a greedy scheme that at each step numbers a node of minimum degree in the current elimination graph. The scheme can be described in its simplest form as:

MINIMUM-DEGREE ORDERING
G = G(K)
while there are nodes left in G do
    find a node x of minimum degree
    eliminate x from the graph and make its neighbors pairwise-connected
    G = resulting elimination graph
end while

The scheme is popular due to its simplicity and its remarkable ability to produce reasonably good orderings for a wide spectrum of problem classes. More importantly, a number of enhancements to the basic scheme have been made over the years to produce an efficient implementation. Readers are referred to the survey paper by George and Liu [11] on the development of this minimum degree ordering algorithm. The experimental results reported in this paper are obtained by using the multiple minimum degree code in [16].

Another effective approach for finding fill-reducing orderings is the nested dissection scheme [9]. It is based on a divide-and-conquer paradigm, where the graph G(K) is decomposed into two roughly equal halves by removing a subset of nodes (called the separator). The graph is to be reordered so that the nodes in each half are numbered contiguously and the nodes in the separator are numbered last. In terms of the reordered matrix, this gives rise to a bordered block diagonal matrix. This ordering is relevant because the zero off-diagonal block will be preserved after factorization. The dissection idea can be applied recursively to the two subgraphs with a successive set of separators such that each step preserves a block of zero entries. This produces a nested sequence of dissections of the graph G(K), and hence the name "nested dissection." The quality of the ordering by nested dissection depends crucially on the separator size and, to a lesser extent, on the difference in sizes of the two halves. This is a subject that we shall return to later.

After a fill-reducing ordering is obtained in the ordering step, the structure of the Cholesky factor L of PKPᵗ is determined in the symbolic factorization phase. This could be done by carrying out the Cholesky factorization symbolically. However, the time complexity would be the same as that of numeric factorization. Efficient symbolic factorization schemes have been devised with much lower complexity. We use the one described in [10], which has a time complexity of O(nnz(L)), where nnz(L) is the number of nonzeros in L. It should be noted that the symbolic factorization step usually requires the least amount of computation compared to the other three steps of the direct method.

3.4. Numeric Phase

The most fundamental unit of numerical computation in the Cholesky factorization of a symmetric positive-definite matrix K is

K_ij = K_ij − (K_is K_sj)/K_ss.    (4)

The overall factorization scheme can be described as a triple-nested loop around the above statement, with the loop indices i, j, and s. Depending on which loop index is at the outer loop, we have three forms of Cholesky factorization:

Left-Looking Method: with j in the outer loop. The columns of L are computed one by one, where the modification to the current column is performed by a matrix–vector product from the previous columns.

Bordering Method: with i in the outer loop. The rows of L are computed one by one.

Right-Looking Method: with s in the outer loop. The columns of L are computed one by one, and the modification from the current column is applied to the remaining partially factored submatrix.

The three forms of factorization are illustrated in Fig. 3. The left-looking column algorithm is the most commonly used method in commercially available sparse matrix software [8, 12]. In this algorithm, column j of K remains unchanged until the outer-loop index becomes j. Then the algorithm modifies column j by applying updates from every column s < j of L for which L_js ≠ 0. Note that the columns used to update column j come from the left of the current column, hence the name "left-looking."
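As an illustration of the left-looking organization, here is a dense (sparsity-ignoring) sketch of the factorization with j in the outer loop; each column of L receives updates only from columns to its left, exactly as in Eq. (4). The matrix is an arbitrary small SPD example.

import numpy as np

def left_looking_cholesky(K):
    """Dense left-looking Cholesky K = L L^t, computed column by column."""
    n = K.shape[0]
    L = np.zeros_like(K, dtype=float)
    for j in range(n):                    # outer loop over columns of L
        col = K[j:, j].astype(float).copy()
        for s in range(j):                # updates from columns to the left
            if L[j, s] != 0.0:
                col -= L[j, s] * L[j:, s]
        L[j, j] = np.sqrt(col[0])
        L[j + 1:, j] = col[1:] / L[j, j]
    return L

K = np.array([[4.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]])
L = left_looking_cholesky(K)
print(np.allclose(L @ L.T, K))            # True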
FIG. 3. Three forms of Cholesky factorization.
A competitive version of the right-looking approach is the multifrontal method of Duff and Reid [7]. In essence, the multifrontal method factors a large sparse matrix by performing partial factorizations on a sequence of smaller dense matrices, called "frontal matrices." Although the motivation of the pioneering work of Duff and Reid was the solution of sparse indefinite systems, the approach is equally effective for sparse positive-definite systems. Since the bulk of the work is performed on dense frontal matrices, efficient kernels from dense solvers can be used. Moreover, the method has good locality properties, so it can make effective use of cache; it has also proven to be efficient on machines with virtual memory and paging.

Recent advances in the numerical factorization step are related to the notion of supernodes, first discussed in [1]. Intuitively, a supernode is a set of contiguous columns in the Cholesky factor L that have essentially the same column sparsity structure. Supernodes have been used to organize numerical factorization with matrix–vector (BLAS-2) and matrix–matrix (BLAS-3) multiplication kernels. Their usage has helped reduce memory traffic by making efficient reuse of data in cache. A state-of-the-art implementation of the numerical scheme has been described by Ng and Peyton [17]. Their scheme can be viewed as a left-looking block algorithm. The numerical experiments reported in Section 5 were obtained with an adapted version of Ng–Peyton's block software code.

The last step in the overall solution method is numerical substitution. The complexity of this step is O(nnz(L)), and for most practical problems it is significantly less than that of the sparse numerical factorization. The forward and backward substitutions usually take a small fraction of the total time to solve a sparse system, and the time for sparse factorization can be amortized over a sequence of triangular solves. This makes direct methods attractive for time-varying data, particularly deformable surfaces made of thin-plate splines [26]. We shall return to deformable surfaces later.
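The amortization argument is simple to demonstrate: factor once, then reuse the factor for every new right-hand side. A dense sketch (with an invented SPD matrix) follows; a sparse code behaves the same way, with the triangular solves costing O(nnz(L)).

import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
K = M @ M.T + n * np.eye(n)          # invented symmetric positive-definite matrix

L = np.linalg.cholesky(K)            # factor once: K = L L^t

def solve(b):
    """Forward and backward substitution: L y = b, then L^t x = y."""
    y = solve_triangular(L, b, lower=True)
    return solve_triangular(L.T, y, lower=False)

for _ in range(5):                   # the factorization is reused for each b
    b = rng.standard_normal(n)
    assert np.allclose(K @ solve(b), b)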
4. ORDERING FOR REGULAR GRID

Finding a fill-reducing ordering is crucial to the success of direct methods. As mentioned in the previous section, there are two broad ordering approaches: the minimum degree ordering and the nested dissection ordering. In this section, we describe the ordering strategy we recommend and use for direct visible-surface interpolation on regular grids. The ordering is based on the recent multisection ordering approach of Ashcraft and Liu [2]. It can be viewed as a hybrid of the minimum degree and nested dissection orderings.

4.1. Nested Dissection Ordering

The grid matrix arising from our study of the surface-reconstruction problem is a 13-point operator: each node is connected to its eight neighbors in the NW, N, NE, E, SE, S, SW, and W directions and to the four nodes a distance 2 away in the N, E, S, and W directions [24]; see Fig. 4. For the use of nested dissection on such square grids, it is easy to find good separators. Since each interior node is connected to some nodes a distance 2 away, we need width-2 dissectors.
FIG. 4. A node (dark shade) with its neighbors (light shade): 5-point operator (left) for membrane splines and 13-point operator (right) for thin-plate splines.
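For reference, the neighbor structure of Fig. 4 (right) is easy to enumerate. The sketch below builds the adjacency structure of the 13-point operator on a k × k grid (the node itself plus twelve neighbors gives the 13 nonzeros per row noted in Section 3.1); it is illustrative, with boundary nodes simply having fewer neighbors.

# Offsets of the 13-point thin-plate stencil (Fig. 4, right): the eight
# immediate neighbors plus the four nodes a distance 2 away in N, E, S, W.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1),
           (-2, 0), (2, 0), (0, -2), (0, 2)]

def grid_graph(k):
    """Adjacency structure of the 13-point stiffness matrix on a k x k grid."""
    adj = {(i, j): set() for i in range(k) for j in range(k)}
    for i in range(k):
        for j in range(k):
            for di, dj in OFFSETS:
                ii, jj = i + di, j + dj
                if 0 <= ii < k and 0 <= jj < k:
                    adj[(i, j)].add((ii, jj))
    return adj

adj = grid_graph(5)
print(max(len(nbrs) for nbrs in adj.values()))  # 12 neighbors for interior nodes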
FIG. 5. Width-2 nested dissection with “+”-separators.
One such choice is the use of width-2 "+"-separators consisting of vertical and horizontal dissectors [9]. The grid examples in Fig. 5 show two and four recursive levels of "+"-separators on a 25 × 25 grid.

It has been recognized that, for 5-point grid graphs, the conventional nested dissection ordering with "+"-separators can be significantly improved by the use of diagonal dissectors [9, 5]. In the regularization framework based on the controlled-continuity stabilizer, the 5-point operator corresponds to membrane splines, whereas the 13-point operator corresponds to thin-plate splines [24]. As with the 5-point operator, it is desirable to use diagonal separators, but this time with separators of width 2. The top-level separator consists of a set of parallel diagonal grid points running from the north-west corner to the south-east corner. The next-level separator is a width-2 diagonal dissector from the north-east corner to the south-west corner. We shall refer to such separators as width-2 "x"-separators, since two levels of diagonal dissectors correspond to the symbol "x." In Fig. 6, two and four levels of "x"-separators are illustrated on two 25 × 25 grids.
FIG. 6. Width-2 nested dissection with “x”-separators.
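A simplified recursive sketch of nested dissection with width-2 "+"-separators is given below; it is not the authors' implementation, and it simply alternates vertical and horizontal two-line dissectors, numbering both halves before the separator. Combined with the fill simulation of Section 3.2 and the 13-point graph above, it can be used to reproduce ordering comparisons in miniature.

def nested_dissection(rows, cols, vertical=True):
    """Order the nodes of a subgrid: recurse on the two halves first,
    then number the width-2 separator last."""
    if len(rows) <= 2 or len(cols) <= 2:        # small block: number directly
        return [(i, j) for i in rows for j in cols]
    if vertical:                                 # remove two adjacent columns
        m = len(cols) // 2
        sep, left, right = cols[m - 1:m + 1], cols[:m - 1], cols[m + 1:]
        return (nested_dissection(rows, left, False)
                + nested_dissection(rows, right, False)
                + [(i, j) for i in rows for j in sep])
    else:                                        # remove two adjacent rows
        m = len(rows) // 2
        sep, top, bottom = rows[m - 1:m + 1], rows[:m - 1], rows[m + 1:]
        return (nested_dissection(top, cols, True)
                + nested_dissection(bottom, cols, True)
                + [(i, j) for i in sep for j in cols])

k = 25
order = nested_dissection(list(range(k)), list(range(k)))
assert len(order) == k * k and len(set(order)) == k * k  # a true permutation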
TABLE 1
Performance of Different Dissectors

Ordering method            FactorNonz    FactorOpns
ND with "+"-dissector       1,208,573     100.061M
MMD                           952,005      66.592M
ND with "x"-dissector         901,899      64.299M
Experimental results in Table 1 on a 120-by-120 13-point grid clearly show the significant advantage of the "x"-dissector over the "+"-dissector. For comparison, we have also included the results from the minimum degree ordering MMD. The top-level separator of the "x"-dissector has roughly the same size as that of the "+"-dissector, and they both subdivide the grid into two almost equal portions. The same can be said of the next level of separators. The advantage of the "x"-dissectors can be readily seen from the next few levels. Indeed, after two levels of dissection, the triangular subgraphs from "x"-dissectors have smaller dissectors than the rectangular subgraphs from "+"-dissectors. This intuitively explains the superior performance of "x"-dissectors. It is also relevant to mention that the use of such dissectors can easily be generalized to rectangular grids.

The way the "x"-dissectors are generated deserves some elaboration. For a k-by-k grid, a direct determination of a nested dissection with width-2 "x"-dissectors is not as simple as it seems. We have instead used an elegant strategy due to Stan Eisenstat (communicated to us by Cleve Ashcraft). A nested dissection ordering for a (2k − 1)-by-(2k − 1) grid with width-2 "+"-dissectors is first generated. The original k-by-k grid is then rotated 45° and embedded into this larger grid. The desired dissection ordering is then inherited from the larger grid.

4.2. Multisection Ordering

In [2], a new approach is proposed that combines the advantages of the minimum degree and nested dissection orderings. It relies on the notion of a multisector, which is a generalization of a bisector. A multisector Φ is a subset of vertices whose removal disconnects the graph into two or more components. Each connected component Ω remaining after the removal of the multisector is called a domain. For example, Fig. 6 (right) shows a multisector with 12 domains.

The ordering strategy is a two-stage ordering. The first stage orders the nodes in each domain separately using some fill-reducing ordering. The second stage considers the elimination graph after the elimination of the domain nodes; in other words, this elimination graph consists of nodes only from the multisector. This stage then numbers the multisector nodes in the elimination graph using some fill-reducing ordering. It is clear that such a multisection ordering depends on three factors:

• the choice of the multisector Φ,
• the choice of fill-reducing ordering for the domains,
• the choice of fill-reducing ordering for the multisector.
We shall use the notation MS(domain-ordering, multisector-ordering, Φ) to represent the corresponding multisection ordering.

The way we obtain our multisector is by combining a number of levels of "x"-dissectors. Experimental results show that the optimal choice is between six and seven levels of dissectors. In all experiments to be reported in the next section, we always use a multisector consisting of seven levels of "x"-dissectors.

In terms of domain orderings, we use the nested dissection ordering ND with width-2 "x"-dissectors. This choice is reasonable since each interior domain is square (tilted 45°) and the best ordering for it is based on nested dissection. For domains on the boundary of the square grid, we could have used minimum degree on them; but for simplicity of implementation, we use nested dissection as the same ordering strategy for all domains. The impact of a better minimum degree ordering on boundary domains is rather insignificant. On the other hand, we use the minimum degree ordering MMD on the elimination graph associated with the multisector.

So the overall multisection ordering used can be expressed in our notation as MS(ND, MMD, Φ), where the multisector Φ consists of seven levels of "x"-dissectors. Henceforth, we shall simply use MS to represent this multisection ordering. In Table 2 we provide some statistics on the performance of MS for the 120-by-120 13-point grid problem. Note that there is a 25% reduction in factorization operations when compared with MMD.

TABLE 2
Performance of Multisection Ordering

Ordering method            FactorNonz    FactorOpns
MMD                           952,005      66.592M
ND with "x"-dissector         901,899      64.299M
MS                            828,820      49.726M

5. EXPERIMENTAL RESULTS AND DISCUSSIONS

In this section, we report the performance of the presented methods and compare them with the interpolation methods found in the literature. We also explain how the presented methods allow fast and stable numerical implementations of deformable surfaces.

5.1. Direct Methods

Figure 7 (top) shows a series of synthetic range measurements input to a visible-surface computation over a regular grid of 60 × 60. These input data include sparse data, dense data, and sparse data with discontinuities.
FIG. 7. Some input data (top) and the visible surfaces computed from them (bottom). In the top right figure, the dark line segment indicates a prescribed positional discontinuity.

In the sparse data set (sparse.60) there are 20 positional constraints, whereas in the dense data set (dense.60) there are 400 such constraints. For the sparse data set with discontinuities (discon.60), the prescribed edge of positional discontinuity is illustrated in Fig. 7, top right. Figure 7 (bottom) shows the resulting visible surfaces. For the data with positional discontinuity, ρ(x, y) = 0.0 along the line of discontinuity and ρ(x, y) = 1.0 elsewhere. For the sparse and dense data without discontinuity, we set ρ(x, y) = 1.0 everywhere. The other continuity control function, τ(x, y), is set to 1.0. As mentioned, direct methods can efficiently handle arbitrary continuity control functions ρ(x, y) and τ(x, y).

Tables 3 and 4 summarize timings (in seconds) taken on a Sun Sparc 10, with 53.2 SPECint92 and 67.8 SPECfp92. Table 3 records experiments with the minimum degree ordering. The table includes the above data sets, as well as additional data
sets. These additional data sets are similar to those described above but are of larger sizes. In the table, a FactorOpn of 6.695M means that 6.695 million operations are performed to factorize the corresponding matrix. The symbolic time is the total time for ordering and symbolic factorization. As expected, the solve times are short: direct methods can solve a sparse system very efficiently once the matrix has been factorized. We shall explain later how to take advantage of this fact when working with deformable models [26]. From Table 3, we also see the acceleration induced by discontinuities. To further analyze the effects of discontinuity on the computation time, we varied the length of the discontinuity edge in the data set discon.60 while keeping all other data static.
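The acceleration can be reproduced qualitatively with a simplified model. In the sketch below, a prescribed discontinuity is imposed by dropping the couplings that cross the edge (the ρ = 0 case, here crudely modeled on a 5-point operator rather than the full finite-element treatment); as the cut lengthens, the factor has fewer nonzeros.

import scipy.sparse as sp
from scipy.sparse.linalg import splu

k = 40
idx = lambda i, j: i * k + j

def stiffness(cut_len):
    """5-point operator with a diagonal shift; vertical couplings crossing a
    horizontal discontinuity edge of length cut_len are dropped (rho = 0)."""
    K = sp.lil_matrix((k * k, k * k))
    for i in range(k):
        for j in range(k):
            K[idx(i, j), idx(i, j)] = 4.1
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if not (0 <= ii < k and 0 <= jj < k):
                    continue
                crosses = (dj == 0 and min(i, ii) == k // 2 and j < cut_len)
                if not crosses:
                    K[idx(i, j), idx(ii, jj)] = -1.0
    return K.tocsc()

for cut in (0, k // 2, k):     # longer discontinuity edge -> sparser factor
    lu = splu(stiffness(cut), permc_spec="MMD_AT_PLUS_A")
    print(f"cut length {cut:2d}: nnz(L) = {lu.L.nnz}")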
TABLE 3
Direct Method Using Minimum Degree Ordering (Times in s)

Data         # Nodes   FactorNonz   FactorOpn   Symb. time   Fac. time   Sol. time
sparse.60      3600      172657      6.695M       0.2          0.58        0.05
dense.60       3600      172657      6.695M       0.2          0.57        0.06
discon.60      3600      149663      4.424M       0.15         0.43        0.05
sparse.90      8100      468837     25.202M       0.48         2.05        0.15
dense.90       8100      468837     25.202M       0.47         2.03        0.16
discon.90      8100      402086     15.670M       0.5          1.38        0.14
sparse.120    14400      952005     66.592M       0.88         5.03        0.32
dense.120     14400      952005     66.592M       0.92         4.98        0.30
discon.120    14400      827342     41.687M       0.87         3.41        0.27

TABLE 4
Direct Method Using Multisection Ordering (Times in s)

Data         # Nodes   FactorNonz   FactorOpn   Symb. time   Fac. time   Sol. time
sparse.60      3600      159654      5.530M       0.11         0.48        0.05
dense.60       3600      159654      5.530M       0.11         0.50        0.05
discon.60      3600      148640      4.499M       0.11         0.42        0.04
sparse.90      8100      430029     20.750M       0.25         1.79        0.13
dense.90       8100      430029     20.750M       0.26         1.81        0.14
discon.90      8100      406378     17.387M       0.26         1.53        0.13
sparse.120    14400      843220     49.311M       0.52         3.89        0.29
dense.120     14400      843220     49.311M       0.53         3.87        0.28
discon.120    14400      795226     40.529M       0.53         3.34        0.29
As the edge length increases in the sequence 0, 15, 30, and 45, we observe that the computation time drops in the sequence 0.83, 0.71, 0.63, and 0.55.

Table 4 records experiments with the multisection ordering with "x"-dissectors. Notice the reduced time for ordering, as well as the significant additional speed-ups for all cases but those with discontinuities.

FIG. 8. Top left, a range image of a statuette. Top right, positional constraints from a sparse (10%) sample of the range image. Bottom, the reconstructed surface (wireframe and shaded).

Figure 8 illustrates an experiment with real range data. Figure 8 (top left) shows a 128 × 128 range image of a statuette, taken from the NRCC 3D Image Database [21]. In this image, we randomly sample 10% of the data points and use the sample as positional constraints for visible-surface computation. In this example, we did not prescribe any discontinuity; doing so can improve the quality of the visible surface. Figure 9 shows another experiment, with a mask, also taken from the NRCC database.

5.2. Comparisons with Iterative Methods

The iterative interpolation methods found in the literature [24, 22, 18, 27, 15] are mostly measured by numbers of iterations. Comparisons with direct methods require actual timings. To obtain timing estimates from the numbers of iterations, we have implemented a version of the conjugate-gradient method for visible-surface interpolation. A special feature of our implementation is that it makes the best effort to reduce Δt, the average time per iteration. For an algorithm in the literature, we take its running time to be the reported number of iterations multiplied
by Δt as estimated by our implementation. This running time is an underestimate, since our implementation aims to minimize Δt, not the number of iterations.

To reduce the time of each iteration, we use the fact that the dominating cost of a conjugate-gradient iteration is that of the multiplication Kz_k, where K is the N × N stiffness matrix and z_k is some intermediate N-vector of iteration k. In addition to ensuring that Kz_k is calculated only once per iteration, we also try to minimize the cost of the calculation. Sparse matrices are usually placed in application-dependent storage structures to save space and facilitate a restricted set of operations. Knowing that we only need the matrix K for the multiplication Kz_k, we choose a storage structure that facilitates that multiplication. More specifically, we use a storage structure similar to that of the Yale Sparse Matrix Package [8]. The relevant routines for building such storage structures and forming the product Kz_k may be found in [19].
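For reference, a compressed sparse row structure in the spirit of the Yale package stores the nonzeros of K row by row, so the product Kz_k is a single pass over the nonzeros. A minimal sketch with illustrative data:

import numpy as np

def csr_matvec(indptr, indices, data, z):
    """y = K z for K in compressed sparse row (Yale-style) storage."""
    n = len(indptr) - 1
    y = np.zeros(n)
    for i in range(n):                         # one pass over the nonzeros
        for p in range(indptr[i], indptr[i + 1]):
            y[i] += data[p] * z[indices[p]]
    return y

# K = [[4, 1, 0], [1, 3, 2], [0, 2, 5]] in CSR form:
indptr  = np.array([0, 2, 5, 7])
indices = np.array([0, 1, 0, 1, 2, 1, 2])
data    = np.array([4.0, 1.0, 1.0, 3.0, 2.0, 2.0, 5.0])
print(csr_matvec(indptr, indices, data, np.array([1.0, 1.0, 1.0])))  # [5. 6. 7.]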
When comparing with iterative interpolation methods, we focus on the two most recent sources [27, 15]. Yaou and Chang give detailed performance evaluations of various iterative methods: multigrid [24], hierarchical basis [22], and bi-orthogonal wavelet basis transfer [27]. On a 32 × 32 grid with sparse positional constraints, the fastest method they studied, a wavelet basis transfer method, needs more than 50 iterations for a visible surface to get close to its final shape. With similar data sets, our iterative implementation indicates that 0.015 s per iteration is necessary, and thus a conservative estimate is a total time of 0.75 = 50 × 0.015 s on the Sun Sparc 10. In comparison, the direct method based on multisection ordering takes 0.12 s.

Yaou and Chang do not consider discontinuities. This missing aspect is explored in [15], where Lai and Vemuri also provide a very effective preconditioner design for constant continuity control functions ρ(x, y) and τ(x, y). Let us first consider the cases in which discontinuities are absent. On a 64 × 64 grid with sparse positional constraints, Lai and Vemuri's method needs 10 iterations for a visible surface to get close to its final shape [15]. With similar data sets, our iterative implementation indicates that 0.07 s per iteration is necessary, and thus a conservative estimate is a total time of 0.7 = 10 × 0.07 s on the Sun Sparc 10. In comparison, the direct method based on multisection ordering takes 0.65 s.

As for data sets with discontinuities, Lai and Vemuri's preconditioner is not designed to handle them and is thus less effective. For a 64 × 64 grid with both sparse positional constraints and prescribed discontinuities, their method needs more than 60 iterations [15]. With our iterative implementation, we give a conservative estimate of 4.2 = 60 × 0.07 s total time on the Sun Sparc 10. In comparison, the direct method based on multisection ordering takes 0.6 s.

In summary, the presented methods for direct visible-surface interpolation are competitive in general, and they provide significant speed-ups when the input data contain prescribed discontinuities.

FIG. 9. Top left, positional constraints from a sparse (10%) sample of the range image shown in the top right figure. Bottom, the reconstructed surface (wireframe and shaded).

5.3. Deformable Surfaces

A deformable surface v(x, y, t) is defined by v = [v_1(x, y, t), v_2(x, y, t), v_3(x, y, t)] satisfying the continuum mechanical equation [26]
μ ∂²v/∂t² + γ ∂v/∂t + δE(v)/δv = f(v, t),    (5)
where μ is the mass density of the surface, γ is the damping density of the ambient medium, and f(v, t) represents the external forces. The third term on the left-hand side is the variational derivative of the elastic energy functional

E(v) = ∬_Ω (w_20‖v_xx‖² + w_11‖v_xy‖² + w_02‖v_yy‖² + w_10‖v_x‖² + w_01‖v_y‖²) dx dy.    (6)
The weighting functions w_20, w_11, and w_02 control the rigidities of the surface, whereas w_01 and w_10 control its tension. In the following we restrict our discussion to deformable surfaces with elastic energy functionals based on thin-plate and membrane splines. These kinds of deformable surfaces are among the most frequently used [26].

The explicit Euler method may be used for time integration of deformable surfaces. Unfortunately, as pointed out in [26], explicit methods quickly become unstable when internal and external forces are increased, making it necessary to use tiny time steps. One solution is to use the alternating direction implicit (ADI) method [19], as was done in [26]. However, there are two difficulties with ADI in the context of deformable surfaces. First, the cross term w_11‖v_xy‖² in E(v) cannot be handled by ADI and must be treated explicitly (this was pointed out
to us by Demetri Terzopoulos). Second, ADI is not applicable to deformable surfaces on nonregular grids such as triangular grids.

Another solution for stable time integration of deformable surfaces is to use an implicit method in conjunction with matrix factorization [19]. Denote by v^t the discrete-time approximation of v(x, y, t) [19]. For an implicit method, the main computation at a time step t is the solution of a sparse system of the form [19, 25]

A v^(t+Δt) = g^t,    (7)

where A is the stiffness matrix K plus a diagonal matrix accounting for the mass density and damping of the medium. Because A has the same structure as K for visible-surface interpolation, the direct methods we present can be used to obtain v^(t+Δt) efficiently.

The experimental results above indicate that the main cost for direct methods is that of matrix factorization. For deformable surfaces made of thin-plate and membrane splines, the matrix A changes only when the material properties (e.g., rigidity) of the surfaces change. Since such changes happen rarely if at all [26], the same matrix factorization can be used for many time steps. In these steps, the vector v^(t+Δt) can be obtained very quickly for each new vector g^t, as we have seen earlier in the experiments. This advantage of implicit methods has been pointed out by others before (e.g., [25]), but early direct methods do not exploit all possible zeros in factorizing matrices (only zeros outside certain regions or bands are ignored) and thus become very expensive for finely discretized deformable surfaces.
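Schematically, the time integration of Eq. (7) then looks as follows; the sketch uses an invented stand-in for A and an illustrative forcing term, with SciPy's SuperLU again standing in for a sparse Cholesky code. The point is structural: one factorization, then only forward and backward substitutions per step.

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

k = 60
N = k * k
lap1d = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(k, k))
K = sp.kronsum(lap1d, lap1d)              # stand-in stiffness matrix
A = (K + 2.0 * sp.eye(N)).tocsc()         # K plus a mass/damping diagonal

lu = splu(A, permc_spec="MMD_AT_PLUS_A")  # factor once

rng = np.random.default_rng(1)
v = np.zeros(N)
for step in range(100):                   # many time steps, one factorization
    g = 2.0 * v + 0.01 * rng.standard_normal(N)  # illustrative right-hand side
    v = lu.solve(g)                       # triangular solves only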
6. CONCLUSIONS

We have described methods for directly performing visible-surface interpolation within the regularization framework based on thin-plate and membrane splines. Through a careful analysis of the regularization operator, we derived direct methods that efficiently make use of all zeros in the sparse discretization of the operator. Experimental results show that, compared with iterative interpolation methods, the direct methods we present are competitive in general, and they provide significant speed-ups for problems involving prescribed discontinuities. Problems with prescribed discontinuities are very difficult for iterative methods, as the discontinuities require spatially variant continuity control functions, making the usually slow convergence hard to accelerate. In contrast, Tables 3 and 4 show that prescribed discontinuities actually lead to slightly faster computation for the direct methods presented. In addition to their use in visible-surface interpolation, the presented methods also support very efficient time integration for deformable surfaces, which are widely used for recovering shapes and nonrigid motion.

An interesting topic for future research is to explore the use of direct methods in other early vision problems, such as shape-from-shading and optical flow. These problems are usually formulated in terms of regularization and solved with iterative techniques (e.g., see [23]).
ACKNOWLEDGMENTS

We thank Professor Demetri Terzopoulos from the University of Toronto for very helpful discussions. Many thanks to Tim McInerney from the University of Toronto for helping us obtain some NRCC test data. Shang-Hong Lai from the University of Florida kindly answered some questions at an early stage of this research. We also thank Drs. Esmond Ng and Barry Peyton of Oak Ridge National Laboratory for allowing us to use their sparse numerical solution software. Special thanks go to Dr. Cleve Ashcraft of Boeing Information and Support Services for numerous insightful discussions on the role and the generation of grid "x"-dissectors. Finally, we are grateful to the anonymous referees for their constructive critique and detailed editorial comments. This work was partly supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
REFERENCES

1. C. Ashcraft, R. Grimes, J. Lewis, B. Peyton, and H. Simon, Progress in sparse matrix methods for large sparse linear systems on vector supercomputers, Int. J. Supercomputer Appl. 1, 1987, 10–30.
2. C. Ashcraft and J. W. H. Liu, Robust Ordering of Sparse Matrices Using Multisection, Technical Report CS-96-01, Department of Computer Science, York University, North York, Ontario, Canada, 1996.
3. R. M. Bolle and B. C. Vemuri, On three-dimensional surface reconstruction methods, IEEE Trans. Pattern Anal. Mach. Intelligence 13(1), 1991, 1–13.
4. G. Celniker and D. Gossard, Deformable curve and surface finite elements for free-form shape design, Computer Graphics 25, 1991, 257–266.
5. I. S. Duff, A. M. Erisman, and J. K. Reid, On George's nested dissection method, SIAM J. Numerical Anal. 13, 1976, 686–695.
6. I. S. Duff, A. M. Erisman, and J. K. Reid, Direct Methods for Sparse Matrices, Oxford University Press, London, 1987.
7. I. S. Duff and J. K. Reid, The multifrontal solution of indefinite sparse symmetric linear equations, ACM Trans. Math. Software 9, 1983, 302–325.
8. S. C. Eisenstat, M. C. Gursky, M. H. Schultz, and A. H. Sherman, The Yale Sparse Matrix Package, I. The symmetric code, Int. J. Numer. Meth. Engrg. 18, 1982, 1145–1151.
9. J. A. George, Nested dissection of a regular finite element mesh, SIAM J. Numerical Anal. 10, 1973, 345–363.
10. J. A. George and J. W. H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, Englewood Cliffs, NJ, 1981.
11. J. A. George and J. W. H. Liu, The evolution of the minimum degree ordering algorithm, SIAM Rev. 31, 1989, 1–19.
12. J. A. George, J. W. H. Liu, and E. Ng, User Guide for SPARSPAK: Waterloo Sparse Linear Equations Package, Research Report CS-78-30 (revised 1980), Department of Computer Science, University of Waterloo, Waterloo, Ontario, Canada, 1980.
13. W. E. L. Grimson, Computational experiments with a feature based stereo algorithm, IEEE Trans. Pattern Anal. Mach. Intelligence 7(1), 1985, 17–34.
14. W. Hoff and N. Ahuja, Surfaces from stereo: Integrating feature matching, disparity estimation, and contour detection, IEEE Trans. Pattern Anal. Mach. Intelligence 11(2), 1989, 121–136.
15. S. H. Lai and B. C. Vemuri, Physically based and adaptive preconditioners for early vision, in Physics Based Modeling Workshop in Computer Vision, Cambridge, MA, June 1995.
16. J. W. H. Liu, Modification of the minimum degree algorithm by multiple elimination, ACM Trans. Math. Software 11, 1985, 141–153.
17. E. G. Ng and B. W. Peyton, Block sparse Cholesky algorithms on advanced uniprocessor computers, SIAM J. Scientific Statistical Comput. 14, 1993, 1034–1056.
18. A. P. Pentland, Interpolation using wavelet bases, IEEE Trans. Pattern Anal. Mach. Intelligence 16(4), 1994, 410–414.
19. W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes: The Art of Scientific Computing, Cambridge Univ. Press, Cambridge, UK, 1986.
20. H. Qin and D. Terzopoulos, D-NURBS: A physics-based geometric design framework, IEEE Trans. Visualization Computer Graphics 2(1), 1996, 85–96.
21. M. Rioux and L. Cournoyer, The NRCC Three-Dimensional Image Data Files, Technical Report CNRC No. 29077, National Research Council of Canada, 1988.
22. R. Szeliski, Fast surface interpolation using hierarchical basis functions, IEEE Trans. Pattern Anal. Mach. Intelligence 12(6), 1990, 513–528.
23. D. Terzopoulos, Image analysis using multigrid relaxation methods, IEEE Trans. Pattern Anal. Mach. Intelligence 8(2), 1986, 129–139.
24. D. Terzopoulos, The computation of visible-surface representations, IEEE Trans. Pattern Anal. Mach. Intelligence 10(4), 1988, 417–438.
25. D. Terzopoulos, J. Platt, A. Barr, and K. Fleischer, Elastically deformable models, Computer Graphics 21, 1987, 205–213.
26. D. Terzopoulos, A. P. Witkin, and M. Kass, Constraints on deformable models: Recovering 3D shape and nonrigid motion, Artificial Intelligence 36(1), 1988, 91–123.
27. M. H. Yaou and W. T. Chang, Fast surface interpolation using multiresolution wavelet transform, IEEE Trans. Pattern Anal. Mach. Intelligence 16(7), 1994, 673–688.