Signal Processing 81 (2001) 2249–2252
www.elsevier.com/locate/sigpro
Fast communication
Effective nonlinear approach for optical flow estimation
Jong-dae Kim a,∗, Jongwon Kim b
a Division of Information and Communication Engineering, Hallym University, 1 Okchon-dong, Chunchon, Kangwon-do 200-702, South Korea
b Biomedlab. Co., 1-49 Dongsung-dong, Jongno-gu, Seoul 110-510, South Korea
Received 20 April 2001
Abstract
An effective nonlinear iterative scheme is proposed for optical flow estimation. Unlike the existing nonlinear approach, the proposed scheme has a simple structure similar to that of the prevalent linear solutions. Both nonlinear approaches are evaluated theoretically and experimentally, verifying that the proposed method has better error performance than the existing one. © 2001 Published by Elsevier Science B.V.
Keywords: Optical flow estimation; Nonlinear motion constraint equation; Single-step Newton method
∗ Corresponding author. Division of Information & Communication Engineering, Hallym University, 1 Okchon-dong, Chunchon, Kangwon-do 200-702, South Korea. E-mail address: [email protected] (J.-d. Kim).

1. Introduction
Optical flow or image flow, which is usually represented by an array of pixel velocities, has been a dominant tool for dynamic scene analysis. Feature-based and gradient-based methods are the major tools for optical flow estimation. The latter methods assume that the local variation of the image intensity function is preserved during the sampling interval between image frames. This assumption constrains the pixel-wise motion as follows:

$$f(x, y, t) - f(x - u, y - v, t - 1) = 0, \tag{1}$$

where $u(x, y)$ and $v(x, y)$ are the horizontal and vertical displacements at position $(x, y)$, respectively. While Eq. (1) is the nonlinear form of the motion constraint (nonlinear MCE), its first-order approximation by Taylor expansion yields the linear motion constraint equation (linear MCE):

$$f_x u + f_y v + f_t \approx 0, \tag{2}$$

where $f_x$ and $f_y$ are the spatial gradients of $f(x - u, y - v, t - 1)$ and $f_t = f(x, y, t) - f(x, y, t - 1)$. $f_t$ is commonly referred to as the frame difference or the temporal gradient. As the optical flow estimation problem is usually ill-posed with either MCE alone, it is regularized with an additional constraint. The smoothness constraint is commonly adopted, resulting in the following
0165-1684/01/$ - see front matter © 2001 Published by Elsevier Science B.V. PII: S0165-1684(01)00103-7
J.-d. Kim, J. Kim / Signal Processing 81 (2001) 2249–2252
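optimization problem:

$$\min_{u,v} \iint \left[ d(u, v)^2 + \alpha^2 \left( |\nabla u|^2 + |\nabla v|^2 \right) \right] dx\, dy, \tag{3}$$

where $d(u, v)$ denotes the left-hand side of the linear MCE (Eq. (2)) or of the nonlinear one (Eq. (1)), and $\alpha$ is the smoothness constant. Applying the variational method, we arrive at the following system of partial differential equations (PDEs) [1,2]:

$$\nabla^2 u = \frac{1}{\alpha^2} f_x\, d(u, v), \qquad \nabla^2 v = \frac{1}{\alpha^2} f_y\, d(u, v). \tag{4}$$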
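As a rough illustration of the functional in Eq. (3), the following Python sketch evaluates a discretized version of it with the linear MCE of Eq. (2) as d(u, v). The function names, the list-of-rows field layout, and the forward-difference approximation of the gradients are assumptions made for this example and are not taken from the paper.

```python
def linear_mce(fx, fy, ft, u, v):
    """Left-hand side of the linear MCE, Eq. (2), at one pixel."""
    return fx * u + fy * v + ft


def flow_energy(fx, fy, ft, u, v, alpha):
    """Discrete counterpart of the functional in Eq. (3).

    All fields are lists of rows of equal size. The gradients of u and v
    are approximated by forward differences inside the grid (an assumption).
    """
    rows, cols = len(u), len(u[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            d = linear_mce(fx[r][c], fy[r][c], ft[r][c], u[r][c], v[r][c])
            total += d * d  # data term d(u, v)^2
            if c + 1 < cols:  # horizontal differences of u and v
                du = u[r][c + 1] - u[r][c]
                dv = v[r][c + 1] - v[r][c]
                total += alpha ** 2 * (du * du + dv * dv)
            if r + 1 < rows:  # vertical differences of u and v
                du = u[r + 1][c] - u[r][c]
                dv = v[r + 1][c] - v[r][c]
                total += alpha ** 2 * (du * du + dv * dv)
    return total


def constant_field(value, rows=3, cols=3):
    """Hypothetical helper: a rows-by-cols field filled with one value."""
    return [[value] * cols for _ in range(rows)]


# A horizontal ramp image moving one pixel to the right: fx = 1, fy = 0, ft = -1.
fx, fy, ft = constant_field(1.0), constant_field(0.0), constant_field(-1.0)
exact = flow_energy(fx, fy, ft, constant_field(1.0), constant_field(0.0), alpha=2.0)
zero = flow_energy(fx, fy, ft, constant_field(0.0), constant_field(0.0), alpha=2.0)
print(exact, zero)  # prints 0.0 9.0
```

The true constant flow (1, 0) satisfies the linear MCE at every pixel and is perfectly smooth, so both terms vanish, whereas the zero flow leaves a unit data residual at each of the nine pixels.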
While employing the linear form of the MCE for $d(u, v)$ results in a linear system in the discrete space, the nonlinear MCE yields a nonlinear system that requires a nonlinear approach. For the former, many iterative solutions have been proposed, such as Gauss–Seidel relaxation (GSR), successive over-relaxation (SOR) and local relaxation (LR) [2,4,5]. There are, however, only a few solutions for the latter. One possible nonlinear approach is to linearize the nonlinear system of Eqs. (4) with an approximation [1]. After the previous frame $f(x, y, t - 1)$ is registered with the approximate optical flow $(u_0, v_0)$, the residue $(\delta u, \delta v)$ can be framed in a linear form as follows:

$$\begin{aligned}
\nabla^2 \delta u &\approx \frac{1}{\alpha^2} f_x \{ d(u_0, v_0) + f_x\, \delta u + f_y\, \delta v \} - \nabla^2 u_0, \\
\nabla^2 \delta v &\approx \frac{1}{\alpha^2} f_y \{ d(u_0, v_0) + f_x\, \delta u + f_y\, \delta v \} - \nabla^2 v_0,
\end{aligned} \tag{5}$$

where $d(u_0, v_0)$ is usually referred to as the displaced frame difference or registered temporal gradient. Enkelmann further simplifies the system of Eqs. (5) by assuming that the residues are spatially flat, that is, $\nabla^2 \delta u = \delta u$ and $\nabla^2 \delta v = \delta v$. As the system of Eqs. (5) is decoupled pixel-wise under this assumption, the residues can be solved for directly. The calculated residue $(\delta u, \delta v)$ is added to the approximation $(u_0, v_0)$ to obtain a more accurate approximation. The refined approximation is again used to register the previous frame, yielding another linear system for the further residues, which results in a kind of successive registration scheme (SRS). The method proposed by Enkelmann can be regarded as an SRS, except for its oriented smoothness constraint [1]. Note that SRS is composed of the residual calculation and the registration of the previous frame.
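The registration step mentioned above amounts to evaluating the displaced frame difference d(u, v) = f(x, y, t) − f(x − u, y − v, t − 1) of Eq. (1) at non-integer positions. The sketch below shows one way this could be done in Python. The paper states only that an interpolation is needed, so the choice of bilinear interpolation, the clamping at the image border, and all function names are assumptions of this example.

```python
import math


def bilinear(img, x, y):
    """Sample img (a list of rows, indexed img[y][x]) at a real-valued (x, y).

    Coordinates outside the image are clamped to the border (an assumption).
    """
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1.0 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1.0 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1.0 - ay) * top + ay * bottom


def displaced_frame_difference(curr, prev, u, v):
    """d(u, v) = f(x, y, t) - f(x - u, y - v, t - 1) at every pixel, as in Eq. (1)."""
    h, w = len(curr), len(curr[0])
    return [[curr[y][x] - bilinear(prev, x - u[y][x], y - v[y][x])
             for x in range(w)]
            for y in range(h)]


# A horizontal ramp shifted right by half a pixel between the two frames.
prev = [[float(x) for x in range(4)] for _ in range(3)]
curr = [[x - 0.5 for x in range(4)] for _ in range(3)]
half = [[0.5] * 4 for _ in range(3)]
zero = [[0.0] * 4 for _ in range(3)]
dfd = displaced_frame_difference(curr, prev, half, zero)
```

With the correct flow (0.5, 0), the displaced frame difference vanishes at interior pixels; only the left border column, where the sample position is clamped, keeps a nonzero residual.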
2. Nonlinear relaxation
While SRS is one useful way to solve the nonlinear problem, a simply structured iteration can be derived by applying the single-step Newton method (SNM) directly to the nonlinear system of Eqs. (4). Using the 5-point formula for the Laplacian operator, the system of Eqs. (4) is approximated in the discrete space as follows [3]:

$$\bar{u}_i - u_i = \frac{1}{\alpha^2} f_{x_i}\, d_i(u_i, v_i), \qquad \bar{v}_i - v_i = \frac{1}{\alpha^2} f_{y_i}\, d_i(u_i, v_i), \tag{6}$$

where $(\bar{u}_i, \bar{v}_i)$ denotes the average displacement of the 4-neighborhood pixels and $i$ is the pixel index in lexicographical or chequer-board order. By introducing vector notation for the displacement and the spatial gradients, the pair of nonlinear equations in Eqs. (6) can be represented as follows:

$$l_i(z_i) = z_i - \bar{z}_i + \frac{1}{\alpha^2}\, g_i\, d_i(z_i) = 0, \tag{7}$$

where $l_i = (l_{u_i}, l_{v_i})^t$, $z_i = (u_i, v_i)^t$, $g_i = (f_{x_i}, f_{y_i})^t$ and $\bar{z}_i = (\bar{u}_i, \bar{v}_i)^t$. From Eq. (7), the problem of Eq. (3) can be stated as follows:

$$L(Z) = (l_0, \ldots, l_i, \ldots, l_N)^t = 0, \tag{8}$$

where $N$ is the number of pixels. As the Jacobian of the system $L$ is a large sparse matrix with a form similar to the system matrix of the linear version of Eqs. (6), we can apply the following SNM to solve the system of Eq. (8) [6]:

$$Z^{n+1} = Z^n - \omega W^{-1} L(Z^n), \tag{9}$$
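where $W$ consists of the 2 × 2 block-diagonal part of the Jacobian of the system $L$ and $\omega$ is the relaxation parameter, as in SOR or LR. The above equation can be rewritten for each displacement component:

$$\begin{aligned}
u_i^{n+1} &= (1 - \omega)\, u_i^n + \omega \left\{ \bar{u}_i^n - \frac{f_{x_i} \left( f_{x_i} \bar{u}_i^n + f_{y_i} \bar{v}_i^n + \left( d(u_i^n, v_i^n) - f_{x_i} u_i^n - f_{y_i} v_i^n \right) \right)}{\alpha^2 + f_{x_i}^2 + f_{y_i}^2} \right\}, \\
v_i^{n+1} &= (1 - \omega)\, v_i^n + \omega \left\{ \bar{v}_i^n - \frac{f_{y_i} \left( f_{x_i} \bar{u}_i^n + f_{y_i} \bar{v}_i^n + \left( d(u_i^n, v_i^n) - f_{x_i} u_i^n - f_{y_i} v_i^n \right) \right)}{\alpha^2 + f_{x_i}^2 + f_{y_i}^2} \right\}.
\end{aligned} \tag{10}$$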
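The per-pixel update of Eqs. (10) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the relaxation parameter omega is a fixed scalar rather than one chosen by local relaxation, and the neighbour average simply ignores neighbours that fall outside the grid.

```python
def neighbor_average(field, r, c):
    """Mean of the 4-neighbours of (r, c) that lie inside the grid.

    Skipping out-of-grid neighbours is an assumption about border handling.
    """
    rows, cols = len(field), len(field[0])
    vals = [field[rr][cc]
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if 0 <= rr < rows and 0 <= cc < cols]
    return sum(vals) / len(vals) if vals else field[r][c]


def snm_update(u, v, u_bar, v_bar, fx, fy, d, alpha, omega):
    """One application of Eqs. (10) at a single pixel.

    u, v         current displacement estimate at the pixel
    u_bar, v_bar 4-neighbourhood averages of the current estimate
    fx, fy       spatial gradients of the registered previous frame
    d            displaced frame difference d(u, v) at the current estimate
    """
    denom = alpha ** 2 + fx ** 2 + fy ** 2
    common = (fx * u_bar + fy * v_bar + (d - fx * u - fy * v)) / denom
    u_new = (1.0 - omega) * u + omega * (u_bar - fx * common)
    v_new = (1.0 - omega) * v + omega * (v_bar - fy * common)
    return u_new, v_new


# One step from a zero flow with a unit horizontal gradient and d = -1.
print(snm_update(0.0, 0.0, 0.0, 0.0, 1.0, 0.0, -1.0, alpha=1.0, omega=1.0))
# prints (0.5, 0.0)
```

When the estimate equals its neighbourhood average and the displaced frame difference is zero, the update returns the estimate unchanged, which is the fixed point expected from Eq. (7).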
It is interesting to note that Eqs. (10) have a simple form similar to those of the linear case, except for the frame difference term. Replacing the term $d(u_i^n, v_i^n) - f_{x_i} u_i^n - f_{y_i} v_i^n$ with the frame difference $f_t$ gives SOR for the linear case [4]. As LR can also be employed to accelerate the convergence, this paper uses LR to determine the local relaxation parameter, as in [4].

3. Results and discussions
As the iteration behaviour depends greatly on the smoothness constant $\alpha$, a comparison of the methods at a single fixed constant would be ambiguous. Therefore, their steady-state errors are investigated over a range of values of the smoothness constant. All three methods are tested on artificial and real frame pairs generated with artificial motion fields. One of the frame pairs is shown in Fig. 1. The frame shown in Fig. 1(a) is translated by (1,0) pixels, rotated by 6°, and zoomed by 1.1 times, producing the second frame of Fig. 1(b). In Fig. 2, the vertical axis denotes the mean square of the steady-state error of the estimated optical flow in pixels/frame and the horizontal axis represents the smoothness constant. The result denoted by 'linear' is for the linear approach, in which the linear form is used for the MCE and the LR of [4] is applied. Fig. 2 shows that the minimum errors of both nonlinear approaches are much smaller than that of the linear approach. The existing nonlinear approach, SRS, however, becomes worse than the linear approach for some values of the smoothness constant. Even though there are methods to determine the optimum smoothness constant, such as cross validation and the L-curve method, it is not possible to select the constant that yields the minimum steady-state error [4]. Therefore, SRS is not preferable in terms of the stability against the smoothness
Fig. 1. The frame pair generated by an artificial motion.
Fig. 2. Comparison of the mean square error of the estimated optical flow according to the smoothness constant.
constant. On the contrary, the proposed method always has better error performance than the linear approach, regardless of the value of the smoothness constant. As the computational cost depends greatly on the number of iterations and on the interpolation needed to calculate the displaced frame difference, the nonlinear approaches must be more expensive than the linear ones. The proposed method, however, should be more cost-effective than SRS, because it needs fewer iterations for a given estimation error. The proposed method is also expected to show better qualitative behaviour, for example in the frame difference and the registered frame, as in [4]. Similarly, its performance on real image sequences is expected to be the best if the L-curve method is selected as the comparison tool [4]. The proposed scheme can easily be combined with techniques such as the hierarchical or temporal iterative methods, because most of them are based on the linear approach. Simply replacing their temporal gradient term with that of the proposed method will improve their performance.

References
[1] W. Enkelmann, Investigations of multigrid algorithms for the estimation of optical flow fields in image sequences, Computer Vision, Graphics, and Image Processing 43 (1988) 150–177.
[2] B.K.P. Horn, B.G. Schunck, Determining optical flow, Artificial Intelligence 17 (1981) 185–203.
[3] W. Hackbusch, Iterative Solution of Large Sparse Systems of Equations, Springer-Verlag, New York, NY, 1995.
[4] J.D. Kim, S.K. Mitra, A local relaxation method for optical flow estimation, Signal Processing: Image Commun. 11 (1) (1997) 21–38.
[5] C.-C.J. Kuo, B.C. Levy, B.R. Musicus, A local relaxation method for solving elliptic PDEs on mesh-connected arrays, SIAM J. Sci. Statist. Comput. 8 (1987) 550–573.
[6] T. Meis, U. Marcowitz, Numerical Solution of Partial Differential Equations, Springer-Verlag, New York, 1981.