Analysis of multiframe super-resolution reconstruction for image anti-aliasing and deblurring

Image and Vision Computing 23 (2005) 393–404

Zhaozhong Wang*, Feihu Qi
Department of Computer Science and Engineering, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China
Received 25 June 2003; received in revised form 18 October 2004; accepted 8 November 2004

Abstract

Novel theoretical results for super-resolution reconstruction (SRR) are presented for the case of arbitrary image warping. The SRR model is reasonably separated into two parts: anti-aliasing and deblurring. The anti-aliasing part is proved to be well-posed. The ill-posedness of the entire SRR process is shown to be caused mainly by the deblurring part. The motion estimation error results in a multiplicative perturbation to the warping matrix, and the perturbation bound is derived. The common regularization algorithms used in SRR are analyzed through the discrete Picard condition, which provides a theoretical measure for limits on SRR. Experiments and examples are supplied to validate the presented theories. © 2004 Elsevier B.V. All rights reserved.

Keywords: Super-resolution; Ill-posedness; Perturbation; Discrete Picard condition; Regularization; Anti-aliasing; Deblurring

1. Introduction

The multiframe super-resolution reconstruction (SRR) problem was originally proposed by Tsai and Huang [28]. They recovered a high-resolution (HR) unaliased image from a sequence of low-resolution (LR) images. These LR images are down-sampled and noise-free versions of the HR one, degraded by the spatial aliasing effect. Later work by other authors generalized this idea to blurred and noisy images; the corresponding algorithms try to perform anti-aliasing and deblurring simultaneously. Most algorithms for the classical image restoration (CIR) problem were borrowed and then modified to solve the SRR problem. These algorithms include, for example, the iterative methods [6,17], the Bayesian methods using the MAP criterion [4,11,23] and the POCS-based methods [21,25]. See [19] for a recent survey.

In addition to the algorithmic developments mentioned above, theoretical aspects of SRR have also been studied. The studied problems mainly concern the ill-posedness of SRR, including the uniqueness and

continuity of the solution. Based upon the generalized sampling theorem of Papoulis [18], Ur and Gross [29] proved that for translational motion among LR images, the anti-aliasing problem via SRR has a unique solution and is well-posed under certain conditions. Kim et al. [12] also presented results on the existence of a unique solution in similar cases. The theories in [12,29] were derived using knowledge of the frequency domain characteristics of the translational motion model. General theories for arbitrary motion are absent, but needed in practice. The continuity of the solution is studied through several related issues, such as the limits on SRR and the model perturbations. Baker and Kanade [2] stated that the performance of SRR degrades rapidly as the magnification factor increases. Lin and Shum [14] calculated a bound on the magnification factor caused by model perturbations and data errors. Zhao and Sawhney [33] analyzed the effect of motion estimation error on SRR. Other theoretical studies also exist in the literature. Shekarforoush [24] studied SRR from the multichannel point of view. Ng and Bose [16] proposed analyses regarding the treatment of image boundaries, the convergence of iterative algorithms, etc. This paper studies the theoretical aspects of SRR, focusing on the discrete ill-posedness and the performance limits on SRR. The work is mostly relevant to Baker and


Kanade's [2] and Lin and Shum's [14]. But we study the general case of arbitrary image motion, rather than global or local translations [14]. The technical thread is also different from those in [2,14], and the results are novel. Specifically, we use a 'complex to simple, then back to complex' thread to analyze the SRR problem; such a treatment coincides with the basic idea in the development of mathematics and natural science. The singular value decomposition (SVD) is used as a tool in the analysis. The contributions are summarized as follows. First, we prove that the anti-aliasing part of SRR is essentially a discrete well-posed problem under arbitrary motion. Second, the motion estimation error results in a multiplicative perturbation to the warping matrix, and the problem remains well-posed if the perturbation is rank preserving and the estimation error is small. Third, if the matrix in the aliasing part is of full rank, the ill-posedness of the entire SRR problem is caused mainly by the deblurring part; but if such a matrix is rank deficient, its properties may heavily affect the performance of SRR. Finally, in addition to the limits on SRR presented in [2,14], the SRR performance is mathematically limited by the discrete Picard condition [10].

The paper is organized as follows. In Section 2, we first discuss a 'separable' SRR model, which reduces the complex model to a simpler one, called the kernel part of SRR. Then the kernel model is discussed, including the rank, the norm and the condition number of the underlying matrix. In Section 3, the kernel model is relaxed from its strict assumptions and extended back to the complex model; the perturbations caused by the motion estimation error and by the blurring effect are discussed successively. The effect of the discrete Picard condition on SRR is pointed out in Section 4. Section 5 provides experimental demonstrations and Section 6 the conclusions.

2. Model analysis: from complex to simple

We need to select an appropriate model for the analysis. In SRR modeling, a commonly accepted assumption is that there are three kinds of imaging operations: warping, blurring and decimating, although some models do not consider the warping (e.g. [22]). But there are at least two different assumptions about the order of these operations. One kind of assumption is that the imaging process involves warping followed by blurring and decimating to generate LR images from the HR one [2,6,11,17,30,33]. Denoting $f$ as the HR image to be reconstructed, $\mathcal{F}_i$ the warping operator, $\mathcal{B}$ the blurring operator, and $\mathcal{D}$ the down-sampling operator, this model can be written as [6]

$s_i = \mathcal{D}\mathcal{B}\mathcal{F}_i f + n_i, \quad i \in \mathbb{Z},$  (1)

where $s_i$ denotes the LR image, and $n_i$ the noise.

The second kind of assumption is that the HR image is blurred first, then warped and decimated to yield the LR images [13,20,29]:

$s_i = \mathcal{D}\mathcal{G}_i\mathcal{B} f + n_i, \quad i \in \mathbb{Z},$  (2)

where $\mathcal{G}_i$, which is generally different from $\mathcal{F}_i$ [31], denotes the warping operator for the blurred image $\mathcal{B}f$. There are also algorithms adopting a fusion-deblur [5] or an interpolation-restoration [13] scheme for SRR; these algorithms are likewise based on the blurring–warping imaging process in (2). Both models (1) and (2) are widely used in the SRR community.

2.1. The separable SRR model

In search for insight into the analysis of physically meaningful models, consideration will be given to the blurring–warping model (2). There are two important reasons for this choice. First, model (2) describes the real imaging process when the dominant blur is either atmospheric turbulence or motion blur, as in satellite imaging or aerial photography [13]. Second, Wang and Qi [31] have shown that, even if the real imaging process is the warping–blurring one in (1) (the case when the camera blur is dominant), model (2) is still preferable when the motion has to be estimated from the LR images. Moreover, while analyzing model (1) is difficult, model (2) is convenient for theoretical analysis, since it can be separated into two sub-models, one of which is well-posed (as we show below).

For convenience we first establish the discrete form of model (2). The operators $\mathcal{B}$, $\mathcal{G}_i$ and $\mathcal{D}$ are generally assumed to be linear [6], so model (2) can be discretized as a system of linear equations in matrix–vector form. Assume that each frame in the LR sequence contains $m \times n$ pixels, and that the HR image has $M \times N$ pixels. Denote by $r \in \mathbb{Z}^+$ the magnification factor (or decimation ratio). We represent a discrete image as a vector by stacking the columns of the image array. Let the $MN \times 1$ vector $f$ be the vectorized form of the HR image, and let the $mn \times 1$ vectors $s_i$, $i = 0, 1, \ldots, l-1$, be the vectorized LR images. The blurring–warping model (2) can then be written as

$s_i = D G_i B f + n_i, \quad i = 0, 1, \ldots, l-1,$  (3)

where $D \in \mathbb{R}^{mn \times MN}$, $G_i \in \mathbb{R}^{MN \times MN}$ and $B \in \mathbb{R}^{MN \times MN}$ are the matrix representations of the operators $\mathcal{D}$, $\mathcal{G}_i$ and $\mathcal{B}$, respectively, $n_i$ is the discretized noise, and $l$ is the number of LR frames. The imaging process in Eq. (3) can be decomposed into

$s_i = D G_i g + n_i, \quad i = 0, 1, \ldots, l-1, \qquad \text{and} \qquad g = B f.$  (4)


We call model (3) the separable imaging model for SRR. The sub-model $g = Bf$ is a CIR problem, and Eq. (4), which reflects the term 'multiframe super-resolution', is the kernel model of SRR. Applying multiframe information is the major distinction between SRR and CIR. We shall first analyze the kernel model (4), focusing on its well-posedness under arbitrary image motion. Then, in Section 3, we shall return to the entire model (3). The properties of the kernel model remain valid there and play an important role in the analysis of (3).

Fig. 1. Down-sampling structure and the kernel model illustration. Small dots are the pixels of the non-decimated image. Big dots are the sampled pixels, which form the LR image.
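To make the separable model in Eqs. (2)–(4) concrete, the following Python sketch simulates one blur–warp–decimate frame. It is only an illustration under simple assumptions (a Gaussian blur, a single global shift per frame, an arbitrary noise level); the helper name `lr_frame` and the parameter values are ours, not the paper's.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, shift

def lr_frame(f, motion, r, sigma=1.0, noise_std=0.01, rng=None):
    """Simulate one LR frame via the blurring-warping model (2):
    blur the HR image, warp it, then decimate by factor r."""
    rng = np.random.default_rng() if rng is None else rng
    g = gaussian_filter(f, sigma)           # g = B f  (the CIR sub-model)
    warped = shift(g, motion, order=1)      # G_i g    (per-frame motion)
    s = warped[::r, ::r]                    # D: keep every r-th sample
    return s + noise_std * rng.standard_normal(s.shape)   # add noise n_i

# Example: 5 LR frames of a random 24x24 HR image, decimated by r = 2,
# using the integer motions that also appear in Fig. 2 of the paper.
f = np.random.rand(24, 24)
frames = [lr_frame(f, (dy, dx), r=2) for dy, dx in
          [(0, 0), (-1, 0), (0, -1), (-1, -1), (1, -1)]]
```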

2.2. The kernel model of SRR

For quantitative study, we need to describe the matrix representations of the operators explicitly. Generally, the warping operator $\mathcal{G}_i$ in Eq. (2) can be defined as [27]

$g_i(x) = (\mathcal{G}_i g)(x) \triangleq g(x - d_i(x)),$  (5)

where $x \in \mathbb{R}^2$ is the continuous coordinate of an image point, and $d_i(x)$ is the 2D displacement field, which may be estimated from the LR images. Let $p \in \mathbb{Z}$ be the index of a discretized and vectorized image point; then we may write the displacement as $d_i(x) = (u_{i,p}, v_{i,p})$. The vector $(u_{i,p}, v_{i,p})$ may vary with $i$ and $p$, so it represents arbitrary motion. For simplicity, we assume here that the components of $(u_{i,p}, v_{i,p})$ are integers, i.e. the motion between non-decimated images has integer-pixel accuracy. The analysis for non-integer motion will be given in Section 3.

Based on Eq. (5), the warping matrix $G_i$ in Eq. (4) can be expressed through its entries $G_i(p, q)$ as

$G_i(p, q) = \begin{cases} 1, & q = p - M u_{i,p} - v_{i,p}, \\ 0, & \text{otherwise}, \end{cases}$  (6)

for all $p = 0, 1, \ldots, MN-1$ and $i = 0, 1, \ldots, l-1$. That is, there is only one non-zero entry in each row $p$ of the matrix $G_i$. We must keep the non-zero entries $q$ within the available image range. There are several ways to do this. A mathematically convenient one is to wrap the motion vector circularly, e.g. set $u_{i,p} = u_{i,p} - N$ if $\lfloor p/M \rfloor - u_{i,p} < 0$, and $u_{i,p} = u_{i,p} + N$ if $\lfloor p/M \rfloor - u_{i,p} > N-1$.

To describe the decimation matrix $D$, we first define a displacement $(x_0, y_0)$, which is the offset of the first sampling point from the origin of the non-decimated image plane, see Fig. 1. We restrict $x_0 < r$ and $y_0 < r$, otherwise some samples may be located out of range. Let $c \triangleq y_0 + M x_0$. If $c = 0$, the first sample is the first point of the non-decimated image. Define

$\xi_p \triangleq p - \lfloor p/m \rfloor m = \mathrm{mod}(p, m), \qquad \zeta_p \triangleq \lfloor p/m \rfloor.$  (7)

The $\xi_p$ and $\zeta_p$ are respectively the row and column indices of the point $p$ in the LR image. We then have

$D(p, q) = \begin{cases} 1, & q = c + \xi_p r + \zeta_p M r, \\ 0, & \text{otherwise}, \end{cases}$  (8)

for all $p = 0, 1, \ldots, mn-1$. There is also only one non-zero entry in each row of $D$.

Now we combine the $l$ equations in (4). Define

$s \triangleq \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_{l-1} \end{bmatrix}, \qquad H \triangleq \begin{bmatrix} D G_0 \\ D G_1 \\ \vdots \\ D G_{l-1} \end{bmatrix},$  (9)

where $s$ is an $lmn \times 1$ vector and the matrix $H \in \mathbb{R}^{lmn \times MN}$ may be termed the aliasing matrix. In this way the $l$ equations in (4) can be written as

$s = H g + n,$  (10)

where $n$ is the stacked noise vector. The matrix $H$ can be derived from Eqs. (6) and (8) as

$H(imn + p, q) = \begin{cases} 1, & q = c + \xi_p r + \zeta_p M r - M u_{i,p} - v_{i,p}, \\ 0, & \text{otherwise}, \end{cases}$  (11)

for any $i = 0, 1, \ldots, l-1$ and $p = 0, 1, \ldots, mn-1$. Here $p$ is the location of the $p$th point in the $i$th LR image. The single non-zero entry in each row of $H$ is determined by its row index $imn + p$, the down-sampling structure $(c, r)$ and the motion vector $(u_{i,p}, v_{i,p})$.

2.3. Well-posedness of the kernel model

We derive properties of the matrix $H$ in this section. Denote by $e_j \in \mathbb{R}^{MN}$, $j = 0, 1, \ldots, lmn-1$, the row vectors of $H$, where $j$ is the row index of $H$ (i.e. the index of a measured pixel ranging over all measured LR images). For convenience in the following derivations, define

$q_j \triangleq \sum_{k=0}^{j} e_k.$  (12)

We call $q_j$ the partial sum of the row vectors of $H$ from index 0 to $j$. One row vector in the aliasing matrix $H$ corresponds to one pixel in an LR image. If a row vector does not belong to the subspace spanned by the others, the pixel related to this row vector may be termed an innovation.
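The structure of the aliasing matrix in Eqs. (6)–(11) is easy to realize in code. The sketch below is a minimal construction assuming column-major vectorization, per-frame (rather than per-pixel) integer motion, and a simplified circular wrap; the helper name `build_H` is ours, not the paper's.

```python
import numpy as np
from scipy.sparse import lil_matrix

def build_H(M, N, r, motions, c=0):
    """Aliasing matrix H of Eq. (11): one unit entry per row, placed by the
    down-sampling offset c, factor r, and integer motion (u, v) of each frame."""
    m, n = M // r, N // r
    l = len(motions)
    H = lil_matrix((l * m * n, M * N))
    for i, (u, v) in enumerate(motions):
        for p in range(m * n):
            xi, zeta = p % m, p // m        # row / column of pixel p in the LR grid
            q = c + xi * r + zeta * M * r - M * u - v
            H[i * m * n + p, q % (M * N)] = 1.0   # simple wrap (the paper wraps per axis)
    return H.tocsr()

# The five integer motions used for H in Fig. 2 of the paper:
H = build_H(24, 24, 2, [(0, 0), (-1, 0), (0, -1), (-1, -1), (1, -1)])
print(H.shape)   # (720, 576), i.e. 5 * 12^2 rows and 24^2 columns
```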


2.3.1. The rank of the matrix H

Here we judge whether a pixel carries new information or not, and draw conclusions about the rank of $H$.

Theorem 1. For the aliasing matrix $H$ in Eq. (11),

(1) the row vector $e_j \notin \mathrm{span}\{e_0, e_1, \ldots, e_{j-1}\}$, $j = 1, \ldots, lmn-1$, if and only if the partial sum satisfies

$q_{j-1}(c + \xi_p r + \zeta_p M r - M u_{i,p} - v_{i,p}) = 0,$  (13)

where $p = \mathrm{mod}(j, mn)$ and $i = \lfloor j/mn \rfloor$;

(2) let $lmn \geq MN$; then $H$ is of full rank if and only if $q_{lmn-1}(q) \neq 0$ for all $q \in \{0, 1, \ldots, MN-1\}$, where $q_{lmn-1}$ is the sum of all the row vectors of $H$.

The proof of this theorem is given in Appendix A. The theorem gives a necessary and sufficient condition for a pixel to become an innovation. It is well known that for the existence of a unique solution in general least squares problems, the rank of $H$ should be full. Although one may obtain good estimates even if the rank is deficient [2,9], we shall show later that the rank of $H$ places a limit on the performance of SRR.

Theorem 1 also gives a new method to compute the rank of $H$: it implies that the number of non-zero entries in the sum vector $q_{lmn-1}$ is equal to the rank of the matrix $H$, namely

$\mathrm{rank}(H) = \mathrm{card}\{q : q_{lmn-1}(q) \neq 0\},$

where 'card' denotes the cardinality of a set. Thus the rank of $H$ can be computed easily.

Now we give a corollary of Theorem 1 for global translational motion. It shows how the existing results in [12,29] are subsumed in our framework as special cases. For global translations, all pixels in an image have the same displacement, so the motion vector $(u_{i,p}, v_{i,p})$ reduces to $(u_i, v_i)$, which is a function of the frame number $i$ only.

Corollary 1. Let the motion between LR images be translations, and let $(M, N) = (rm, rn)$, $0 \leq x_0, y_0 < r$. The aliasing matrix $H$ is of full rank if

$(u_i, v_i) \in \{x_0 - k\}_{k=0}^{r-1} \times \{y_0 - k\}_{k=0}^{r-1},$  (14)

for $i = 0, 1, \ldots, r^2-1$, where $(x_0, y_0)$ is the offset of the sampling structure, and $\times$ denotes the Cartesian product of the two sets.

The proof is given in Appendix B. Note that the condition in Eq. (14) is sufficient only; necessity holds if we restrict the motion vectors to ensure that the indices cannot be located out of range.

Corollary 1 implies that the minimum number of observed images required for $H$ to be of full rank is $r^2$; see also [14].

2.3.2. The condition number of H

We consider a further question: does the solution of the kernel model (4) depend continuously on the data? In other words, we need to know whether the aliasing matrix $H$ is well conditioned. We study the 2-norm $\|\cdot\|_2$ (also known as the spectral norm [9]) of the matrix $H$ and the corresponding condition number.

Theorem 2. For the aliasing matrix $H \in \mathbb{R}^{lmn \times MN}$ ($lmn \geq MN$), $\|H\|_2 = \sqrt{\mu}$, where $\mu$ is the maximal element in the sum of all row vectors of $H$, i.e. $\mu \triangleq \max_i q_{lmn-1}(i)$. If $H$ is of full rank, then $\|H^+\|_2 = 1$, where $H^+$ is the pseudo-inverse of $H$.

The proof can be found in Appendix C. It follows immediately from Theorem 2 that, if the matrix $H$ is of full rank, the 2-norm condition number $\kappa_2(H)$ [9] is given by

$\kappa_2(H) := \|H\|_2 \|H^+\|_2 = \sqrt{\mu}.$  (15)

It is known that $\kappa_2(H)$ is a measure for the continuity of the solution [9]. Since $\sqrt{\mu}$ remains a small number in most SRR problems, we can conclude that the matrix $H$ is well conditioned, and in turn the SRR using Eq. (10) is well-posed. That is, small perturbations (e.g. the noise $n$) in the measurement $s$ cannot give rise to unacceptably large changes in the solution. Consequently, the model (10) can be easily regularized and solved.

At this point, we point out that the matrix singular value decomposition (SVD) [9] is a very useful tool for insightful analysis. For an arbitrary matrix $A \in \mathbb{R}^{m \times n}$ with $m \geq n$, the SVD of $A$ has the form

$A = U \Sigma V^T = \sum_{i=1}^{n} u_i \sigma_i v_i^T,$  (16)

where $U = [u_1, \ldots, u_n]$ and $V = [v_1, \ldots, v_n]$ are matrices with orthonormal columns, $U^T U = V^T V = I_n$ ($I_n$ is the $n \times n$ identity matrix), and $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$ with $\sigma_1 \geq \cdots \geq \sigma_n \geq 0$. The $\sigma_i$ are the singular values of $A$. The basic properties of the aliasing matrix $H$ can be easily expressed using its SVD. The rank of $H$, for example, equals the number of non-zero singular values. Denoting by $\sigma_{\max}$ and $\sigma_{\min}$ the largest and the smallest non-zero singular values of $H$, respectively, it is known that [9]

$\|H\|_2 = \sigma_{\max}, \qquad \|H^+\|_2 = 1/\sigma_{\min},$  (17)

and $\kappa_2(H) = \sigma_{\max}/\sigma_{\min}$. Thus $\sigma_{\max} = \sqrt{\mu}$ and $\sigma_{\min} = 1$ for a full-rank $H$. See Fig. 2 for a plot of the singular values of $H$. The benefit of the SVD will become clearer in the following sections.

The above claims about the rank and the condition number of $H$ are simple because $H$ has the nice row-stochastic structure (a single unit entry per row). But these results are fundamental and remain valid in the more complex models discussed below.
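Theorems 1 and 2 are easy to check numerically on the small example above. The snippet below reuses the hypothetical `build_H` helper from Section 2.2 and verifies that the rank equals the number of non-zero entries of $q_{lmn-1}$ and that the extreme singular values are $\sqrt{\mu}$ and 1; it is an illustrative check, not an experiment from the paper.

```python
import numpy as np

H = build_H(24, 24, 2, [(0, 0), (-1, 0), (0, -1), (-1, -1), (1, -1)]).toarray()

q = H.sum(axis=0)                        # q_{lmn-1}: sum of all row vectors, Eq. (12)
rank_by_theorem1 = np.count_nonzero(q)   # Theorem 1: rank = #{q : q_{lmn-1}(q) != 0}
mu = q.max()                             # mu = max_i q_{lmn-1}(i)

sv = np.linalg.svd(H, compute_uv=False)
print(rank_by_theorem1, np.linalg.matrix_rank(H))  # both 576 for this motion set
print(sv.max(), np.sqrt(mu))                       # ||H||_2 = sqrt(mu)  (Theorem 2)
print(sv[sv > 1e-10].min())                        # smallest non-zero sigma = 1
```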


3. Model synthesis: from simple to complex

In this section, we relax the assumptions on model (10) and consider more complex cases. We first discuss perturbations of the kernel model, and then incorporate the blurring effect.

3.1. The perturbed model and non-integer warping

Theorems 1 and 2 were both derived under the assumption of integer warping. In this section the non-integer warping is handled; this case can be included in a general framework for analyzing the perturbed model with errors in the warping matrices, as we show below. That is, we allow in this section that (1) there are motion estimation errors in the warping matrices, and (2) the image displacements are non-integer, and the warping matrices are formed by interpolations of higher order than the nearest-neighbor one in Eq. (6).

3.1.1. Multiplicative perturbation of warping

First, we consider perturbations in model (10) caused by motion estimation errors. We can generalize Eq. (5) to

$(\hat{\mathcal{G}}_i g)(x) \triangleq g(x - d_i(x) - \Delta d_i(x)),$  (18)

where $\hat{\mathcal{G}}_i$ denotes the perturbed warping operator caused by the inaccurate motion $d_i(x) + \Delta d_i(x)$, $d_i(x)$ is the true motion, and $\Delta d_i(x)$ the motion estimation error. Define a perturbation warping operator from the error $\Delta d_i(x)$ as

$(\Delta\mathcal{G}_i g)(y) \triangleq g(y - \Delta d_i(x)),$

where $y = x - d_i(x)$. Assuming that there is no occlusion, it is not difficult to prove that the perturbed warping $\hat{\mathcal{G}}_i$ can be expressed as

$\hat{\mathcal{G}}_i = \mathcal{G}_i\, \Delta\mathcal{G}_i,$  (19)

where $\mathcal{G}_i$ is the real warping operator defined in Eq. (5). Eq. (19) indicates that the motion estimation error actually results in a multiplicative perturbation to the warping, although the error is frequently treated as additive Gaussian noise in the literature [7,33]. Based on Eq. (18), the SRR kernel model (10) can be generalized to the following perturbed form:

$s = \hat{H} g + n,$  (20)

where the perturbed aliasing matrix $\hat{H}$ is given by

$\hat{H} \triangleq \begin{bmatrix} D G_0 \Delta G_0 \\ \vdots \\ D G_{l-1} \Delta G_{l-1} \end{bmatrix};$  (21)

here $G_i$, $i = 0, \ldots, l-1$, are the warping matrices defined in Eq. (6), and $\Delta G_i$ are the matrix representations of the operators $\Delta\mathcal{G}_i$, which may be constructed by a certain higher-order (e.g. bilinear) interpolation. The non-integer motion $(u_{i,p}, v_{i,p}) \in \mathbb{R}^2$ can be equivalently viewed as a perturbation of the integer motion, that is, we set in Eq. (18)

$d_i(x) = ([u_{i,p}], [v_{i,p}]), \qquad \Delta d_i(x) = (u_{i,p} - [u_{i,p}], v_{i,p} - [v_{i,p}]),$  (22)

where $[u_{i,p}]$ and $[v_{i,p}]$ denote the nearest integers to $u_{i,p}$ and $v_{i,p}$, respectively. Thus the analysis for non-integer warping reduces to a special case of that for general perturbed warping.

Fig. 2. Plots of singular values of aliasing matrices with integer or non-integer warping. $\hat{H}_1$ and $\hat{H}_2$ are two perturbed versions of $H$. All these matrices have $5 \times 12^2$ rows and $24^2$ columns.

3.1.2. Perturbation bound

How are the properties of the perturbed matrix $\hat{H}$ changed from those of $H$? This question is crucial in analyzing the perturbed model (21). Specifically, we have Theorem 3.

Theorem 3. For the matrices $H, \hat{H} \in \mathbb{R}^{lmn \times MN}$ given respectively in Eqs. (9) and (21),

$\|\hat{H} - H\|_2 \leq \alpha \sqrt{MN}\, \|H\|_2,$  (23)

and if $\mathrm{rank}(\hat{H}) = \mathrm{rank}(H)$, then

$\frac{\|\hat{H}^+ - H^+\|_2}{\|\hat{H}^+\|_2} \leq \frac{1 + \sqrt{5}}{2}\, \alpha \sqrt{MN}\, \kappa_2(H),$  (24)

where

$\alpha := \max_{0 \leq i \leq l-1} \|\Delta G_i - I\|_F,$  (25)

with $\|\cdot\|_F$ denoting the matrix Frobenius norm.

The proof can be found in Appendix D. The norm $\|H\|_2$ and the condition number $\kappa_2(H)$ are given in Theorem 2 and Eq. (15), respectively. Theorem 3 reveals the deviation between the perturbed matrix $\hat{H}$ (resp. its pseudo-inverse $\hat{H}^+$) and the true matrix $H$ (resp. $H^+$) caused by the motion errors $\Delta G_i$.

Note that the condition $\mathrm{rank}(\hat{H}) = \mathrm{rank}(H)$ must be met; otherwise $\|\hat{H}^+ - H^+\|_2$ may grow unboundedly even if $\alpha$ is small [26]. The matrix $\hat{H}$ may be termed a rank-preserved perturbation of $H$ if $\mathrm{rank}(\hat{H}) = \mathrm{rank}(H)$. Under this condition, we can conclude that if the motion error is small (e.g. caused by the fractional part of the motion, as in Eq. (22)), the properties of $\hat{H}$ are close to those of $H$.

We can see Theorem 3 more clearly by using the SVD. From the theorem, Eq. (17) and the inequality $|\|\hat{H}\|_2 - \|H\|_2| \leq \|\hat{H} - H\|_2$, it is easy to obtain the following perturbation bounds on the singular values:

$|\hat{\sigma}_{\max} - \sigma_{\max}| \leq \alpha \sqrt{MN}\, \sigma_{\max}, \qquad |\hat{\sigma}_{\min} - \sigma_{\min}| \leq \frac{1 + \sqrt{5}}{2}\, \alpha \sqrt{MN}\, \sigma_{\max},$

where $\hat{\sigma}_{\max}$ and $\hat{\sigma}_{\min}$ are the largest and the smallest non-zero singular values of $\hat{H}$, respectively. As an illustration, Fig. 2 shows the plots of singular values for aliasing matrices with integer or perturbed non-integer warping, where the matrix $H$ is formed by the translational motions {[0, 0], [−1, 0], [0, −1], [−1, −1], [1, −1]}, $\hat{H}_1$ by {[0, 0], [−0.92, 0.13], [0.18, −0.96], [−0.89, −0.91], [1.04, −0.82]}, and $\hat{H}_2$ by {[0, 0], [−1.4, 0.4], [0.3, −0.8], [−0.9, −1.2], [1.6, −1.1]}. For $\hat{H}_1$, the $\alpha$ value defined in Eq. (25) equals 6.47, and for $\hat{H}_2$, $\alpha = 19.76$. All three matrices have full rank 576 ($= 24^2$), and the above inequalities are easily validated. In addition, we have $\kappa_2(H) = 1.414$, $\kappa_2(\hat{H}_1) = 1.899$, and $\kappa_2(\hat{H}_2) = 5.641$.

The above theory and experiment indicate that the integer motion reflects the 'principal part' of the warping and plays the central role in SRR; the fractional part of non-integer motion has only a minor effect. This is one reason why we treated the integer-warping matrix first in the previous section. The computed condition numbers confirm that the perturbed aliasing matrices are still well conditioned if they are rank preserved and the perturbations are small.

Theorem 3 can be used in the perturbation analysis of the uncertain model (20), as well as of an extended model combining the blurring effect (shown later).
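The quantity $\alpha$ in Eq. (25) can also be evaluated numerically. The sketch below builds each $\Delta G_i$ as a bilinear-interpolation matrix for the fractional part of the motions used for $\hat{H}_1$; this construction and the helper name `shift_matrix` are our own assumptions, so the resulting $\alpha$ only approximates the value 6.47 reported above (the exact number depends on the interpolation and boundary handling).

```python
import numpy as np

def shift_matrix(M, N, du, dv):
    """Matrix of a global shift by (du, dv) (columns, rows) with bilinear
    weights and circular wrapping; used here as Delta G_i for a fractional shift."""
    G = np.zeros((M * N, M * N))
    fu, fv = np.floor(du), np.floor(dv)
    au, av = du - fu, dv - fv
    for p in range(M * N):
        row, col = p % M, p // M
        for dc, wc in ((fu, 1 - au), (fu + 1, au)):
            for dr, wr in ((fv, 1 - av), (fv + 1, av)):
                q = ((col - int(dc)) % N) * M + ((row - int(dr)) % M)
                G[p, q] += wc * wr
    return G

M = N = 24
integer_motion = [(0, 0), (-1, 0), (0, -1), (-1, -1), (1, -1)]     # motion of H
estimated =      [(0, 0), (-0.92, 0.13), (0.18, -0.96), (-0.89, -0.91), (1.04, -0.82)]

# Delta G_i encodes only the fractional residual between estimated and integer motion
deltas = [shift_matrix(M, N, u - ui, v - vi)
          for (u, v), (ui, vi) in zip(estimated, integer_motion)]
alpha = max(np.linalg.norm(dG - np.eye(M * N), 'fro') for dG in deltas)
print(alpha)   # roughly of the order of the 6.47 reported for H_hat_1
```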

Lin and Shum [14] presented perturbation analyses for local translational motion; our contribution here is a perturbation bound for arbitrary motion.

3.2. Combining the blurring effect

In this section we drop a further assumption and allow imaging blur. Thus we substitute the equation $g = Bf$ of model (4) back and return to the general model (3), which can be rewritten as

$s = W f + n = H B f + n,$  (26)

where $B \in \mathbb{R}^{MN \times MN}$ is the blurring matrix, and $H$ is defined in Eq. (9) or (21), i.e. it denotes either the true or the perturbed matrix according to the context below.

3.2.1. Ill-posedness of the entire model

If the imaging blur is formed by a linear shift-invariant point-spread function (PSF) $h(x)$, then the blurring matrix $B$ can be constructed from $h(x)$ as a convolution matrix; see, for example, [1,16]. It is known that $B$ is generally ill conditioned, and its condition number has been well studied [8,15]. An upper bound on the condition number $\kappa_2(B)$ is given by the ratio of the maximum to the minimum value of the amplitude spectrum of the PSF $h$ [15], i.e.

$\kappa_2(B) \leq \max|F(h)| / \min|F(h)|,$  (27)

where $F(h)$ denotes the Fourier transform of $h$. In addition, $\kappa_2(B)$ generally grows quickly as the HR image size increases [15]. Fig. 3(a) illustrates the singular values of some blurring matrices; the corresponding condition numbers are computed in Section 5.1.

Fig. 3. Logarithmic-scale plots of singular values of various blurring and warping matrices. (a) Singular values of the blurring matrices generated by various filtering kernels. (b) The matrix $H$ is the $\hat{H}_2$ of Fig. 2, and $B$ is generated by the $3 \times 3$ Gaussian kernel. (c) $H$ is generated from image rotation of 2° per frame, and $B$ is generated by the $3 \times 3$ box kernel.

Now, by combining the matrices $H$ and $B$, we can obtain properties of the matrix $W = HB \in \mathbb{R}^{lmn \times MN}$ in model (26). It is easy to verify that

$\|W\|_2 \leq \|H\|_2 \|B\|_2,$

and, if $B$ is non-singular,

$\|W^+\|_2 = \|B^{-1} H^+\|_2 \leq \|H^+\|_2 \|B^{-1}\|_2.$


Thus, by using (15) and (27), we obtain that, if $H$ is of full rank and unperturbed and $B$ is non-singular,

$\kappa_2(W) \leq \kappa_2(H)\,\kappa_2(B) \leq \sqrt{\mu}\, \frac{\max|F(h)|}{\min|F(h)|}.$  (28)

The condition number $\sqrt{\mu}$ of $H$ is small in general, so the ill-posedness of the entire SRR problem is caused mainly by the ill-conditionedness of the blurring matrix $B$. But if $H$ is rank deficient, then its properties (the distribution of its singular values, for example) may heavily affect the ill-conditionedness of $W$. We give illustrations of these claims in Section 5 and Fig. 3.

Due to the discrete ill-posedness, SRR needs well-designed regularization algorithms. A variety of direct and iterative numerical regularization methods have been proposed [16,19]. Most of these methods seek to either compute or approximate a certain regularized solution, namely the solution $f_\lambda$ of the Tikhonov regularization

$f_\lambda = \arg\min_f \{\|W f - s\|_2^2 + \lambda \|L f\|_2^2\},$  (29)

where $\lambda$ is the regularization parameter, and $L$ is typically either the identity matrix or a well-conditioned discrete approximation to a certain differential operator. There are other choices for $L$; see e.g. [16]. The solution of (29) is known to be

$f_\lambda = (W^T W + \lambda L^T L)^{-1} W^T s.$

Ng and Bose [16] have presented good discussions on the convergence rate and other computational aspects of such regularization algorithms.
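For small problems, the Tikhonov solution of Eq. (29) can be formed directly from the normal equations, as sketched below; for realistic image sizes one would use sparse or iterative solvers, as discussed in [16]. The helper name `tikhonov` and the value of $\lambda$ shown are illustrative, not the paper's implementation.

```python
import numpy as np

def tikhonov(W, s, lam, L=None):
    """Solve f_lambda = argmin ||W f - s||^2 + lam * ||L f||^2 via the normal equations."""
    n = W.shape[1]
    L = np.eye(n) if L is None else L
    A = W.T @ W + lam * (L.T @ L)
    return np.linalg.solve(A, W.T @ s)

# Illustrative use with quantities built by the earlier hypothetical helpers:
# W = H.toarray() @ B          # B: blurring matrix, H: aliasing matrix
# f_hat = tikhonov(W, s, lam=1e-2)
```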

4. The discrete Picard condition (DPC)

An important theoretical issue appears, however, to have been neglected in the SRR literature. In the mathematical community, it is known that a necessary condition for solving a discrete ill-posed problem satisfactorily is the discrete Picard condition (DPC) [10]. The DPC is not met trivially in SRR problems, so it provides a good measure for limits on SRR, as we show below.

To interpret the DPC, we first review the concept of the generalized singular value decomposition (GSVD) [10], which is useful in analyzing the regularized solutions of Eq. (29). Without loss of generality, assume that $L \in \mathbb{R}^{p \times MN}$ and $lmn \geq MN \geq p$ (note that $W \in \mathbb{R}^{lmn \times MN}$). The GSVD of the matrix pair $(W, L)$ is a decomposition of $W$ and $L$ of the form

$W = U \begin{bmatrix} \Sigma & 0 \\ 0 & I_{MN-p} \end{bmatrix} X^{-1}, \qquad L = V\,[\Phi \;\; 0]\, X^{-1},$

where $U = [u_1, \ldots, u_{MN}] \in \mathbb{R}^{lmn \times MN}$ and $V = [v_1, \ldots, v_p] \in \mathbb{R}^{p \times p}$ with $U^T U = I_{MN}$ and $V^T V = I_p$, $X \in \mathbb{R}^{MN \times MN}$ is non-singular, $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_p)$ and $\Phi = \mathrm{diag}(\phi_1, \ldots, \phi_p)$. Moreover,

$0 \leq \sigma_1 \leq \cdots \leq \sigma_p \leq 1, \qquad 1 \geq \phi_1 \geq \cdots \geq \phi_p \geq 0,$

and they are normalized so that $\sigma_i^2 + \phi_i^2 = 1$, $i = 1, \ldots, p$. The generalized singular values $\gamma_i$ of $(W, L)$ are defined as the ratios

$\gamma_i = \sigma_i / \phi_i, \quad i = 1, \ldots, p.$

When $L$ is the identity matrix $I_{MN}$, the generalized singular values of $(W, I_{MN})$ are identical to the singular values of $W$. Now we can introduce Theorem 4.

Theorem 4 (The discrete Picard condition [10]). The unperturbed observation data $s$ in (29) with the regularization matrix $L$ satisfy the discrete Picard condition if and only if the Fourier coefficients $|u_i^T s|$, $i = 1, \ldots, MN$, on average decay to zero faster than the generalized singular values $\gamma_i$.

Detailed discussions of this theorem are presented by Hansen [10]. Here we give an intuitive explanation. For simplicity, consider the least squares problem $\min_f \|W f - s\|_2$ based on Eq. (26), and assume that $W$ has no exactly zero singular values. Using the SVD defined in Eq. (16), it is easy to show that the solution is given by [9]

$\hat{f} = \sum_{i=1}^{n} \frac{u_i^T s}{\sigma_i}\, v_i.$  (30)

If the Fourier coefficients $|u_i^T s|$ do not decay as fast as the singular values $\sigma_i$, the solution $\hat{f}$ is dominated by the terms in the sum corresponding to the smaller $\sigma_i$. Such terms become more and more oscillatory as the $\sigma_i$ decrease [10], and more and more sensitive to noise. As a result, the solution $\hat{f}$ may appear completely random and be meaningless. The purpose of a regularization method, as mentioned in Theorem 4, is to damp or filter out the contributions to the solution from the small generalized singular values. If the DPC fails for the underlying problem, it is generally impossible to compute a satisfactory solution by means of Tikhonov regularization or any related method [10]. For other kinds of regularization, conditions analogous to the DPC also exist.

Given a regularization scheme (the matrix $L$ is fixed) in SRR, the system matrix $W$ varies with the setting of the magnification factor; this may cause the DPC to fail. In addition, a rank-deficient aliasing matrix $H$, caused by inappropriate motion types or large motion estimation errors, may result in failure of the DPC as well. Hence we need to check the DPC before performing SRR. In practical applications, one may compare the Fourier coefficients $|u_i^T s|$ with the generalized singular values $\gamma_i$ through the following moving geometric means [10]:

$\rho_i \triangleq \frac{1}{\gamma_i} \left( \prod_{j=i-q}^{i+q} |u_j^T s| \right)^{1/(2q+1)}, \quad i = q+1, \ldots, MN-q,$  (31)


where $q$ is a small integer. The DPC is satisfied when the $\rho_i$ decay monotonically to zero. We give demonstrations of using the DPC in SRR in Section 5.
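A practical DPC check in the spirit of Eq. (31) (with $L = I$, so that $\gamma_i = \sigma_i$, similar to the picard routine in Hansen's Regularization Tools) can be coded directly from the SVD of $W$; the window size $q$ and the helper name `dpc_ratios` are illustrative choices, not the paper's implementation.

```python
import numpy as np

def dpc_ratios(W, s, q=2):
    """Moving geometric mean of |u_i^T s| divided by sigma_i (cf. Eq. (31), L = I)."""
    U, sigma, _ = np.linalg.svd(W, full_matrices=False)
    beta = np.abs(U.T @ s)                    # Fourier coefficients |u_i^T s|
    rho = np.full_like(sigma, np.nan)
    for i in range(q, len(sigma) - q):
        window = beta[i - q:i + q + 1]
        rho[i] = np.exp(np.mean(np.log(window + 1e-300))) / sigma[i]
    return sigma, beta, rho

# sigma, beta, rho = dpc_ratios(W, s_noisy)
# Plot sigma, beta and rho (cf. Fig. 4(g)-(i)) and inspect where rho stops decaying.
```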

5. Experimental demonstrations

5.1. The effects of H and B on the ill-posedness of SRR

Here we perform experiments to illustrate the roles of the matrices $H$ and $B$ in the discrete ill-posedness of SRR problems. As in Fig. 2, we use a synthetic HR image of size $24 \times 24$ and 5 LR images of size $12 \times 12$. The results are shown in Fig. 3.

In Section 2.3 we proved that the matrix $H$ is well conditioned if it is of full rank; in addition, we showed in Section 3.1 and Fig. 2 that the perturbed version $\hat{H}$ of $H$ is still well conditioned if it is rank preserved and the motion estimation error is small. On the other hand, Section 3.2 indicates that the matrix $B$ is generally ill conditioned. Fig. 3(a) illustrates this observation with plots of the singular values of some blurring matrices. The subfigure shows that the singular values of $B$ decay very quickly as the PSF size increases. The condition numbers are, in turn, 97.11, $1.73 \times 10^3$, $1.09 \times 10^{18}$, and $4.52 \times 10^{19}$. Note that the higher the magnification factor set for SRR using Eq. (26), the larger the PSF kernel size (e.g. the box kernel size [2,14]).

In addition, we found in Section 3.2 that if $H$ is of full rank, then the ill-conditionedness of $W$ in Eq. (26) is almost entirely caused by the ill-conditionedness of $B$. Fig. 3(b) demonstrates this case. It shows that, even when the matrix $H$ takes its perturbed version and the matrix $B$ is nearly well conditioned ($\kappa_2(B) = 97.11$), the behaviour of $W$ is still dominated by and close to that of $B$.

However, if $H$ (or its perturbation $\hat{H}$) is rank deficient, the ill-conditionedness of $W$ may be heavily affected by that of $H$, as shown in Fig. 3(c). Here the rank of $H$ is low ($= 553$, less than $24^2 = 576$), and the singular values of $H$ decay faster than those of $B$. Consequently, the singular value distribution of $W$ deviates strongly from that of $B$. Clearly the image rotation used in Fig. 3(c) is in general worse for SRR than image translation. A good motion type for SRR should keep the singular value distribution of $H$ as 'flat' as possible.

5.2. Performance limits and the DPC

We now demonstrate the use of the DPC to identify the performance limits on SRR. For simplicity, we set the matrix $L = I_{MN}$ in Eq. (29), so the generalized singular values $\gamma_i$ in Theorem 4 and Eq. (31) reduce to the singular values $\sigma_i$ of the matrix $W$. In practice, we only know the perturbed data, rather than the true observation $s$. Thus, in checking the DPC, we may replace the true $s$ by the noisy data to compute the ratios $\rho_i$ in Eq. (31). Such a setting makes sense and is practically useful, since it reveals the DPC of the underlying unperturbed problem above the noise level [10].

First, we show how the DPC can be used to indicate the limits on SRR when the magnification factor changes; see Fig. 4 for illustrations. Here the experimental setting is similar to that in [2]. The SRR is performed by solving the regularized problem (29) using a certain iterative method; see, for example, [16].

Fig. 4. The DPC for SRR under various magnifications. The HR image is the $96 \times 96$ Lena image, which is translated, blurred with Gaussian filters, decimated with different factors $r$, and contaminated with white noise to generate the LR observations in (a), (c) and (e) for $r = 2$, 6 and 12, respectively. The corresponding reconstructed images are shown in (b), (d) and (f), respectively. The first 5000 singular values $\sigma_i$ of $W$, the Fourier coefficients $|u_i^T s|$, and the geometric means $|u_i^T s|/\sigma_i$ for the DPCs under $r = 2$, 6 and 12 are shown in (g), (h) and (i), respectively.


Fig. 5. The DPC for SRR under arbitrary motion and a fixed magnification. (a) The HR image, the first frame of the Carphone video, of size $96 \times 96$. (b) One of the 5 LR frames, which are generated from the first 5 Carphone frames by blurring with a Gaussian, decimating with $r = 2$, and contaminating with white noise. (c) The bilinear interpolation of (b). The SRR results using the block-matching and the robust motion estimators are shown in (d) (25.05 dB PSNR) and (e) (29.30 dB), respectively. (f) The reconstruction from 5 synthetic LR frames with perturbed translational motion (33.78 dB). The plots of the DPCs corresponding to the reconstructions in (d), (e) and (f) are shown in (g), (h) and (i), respectively.

We can see from Fig. 4 that, if the magnification factor is small (e.g. rZ2 in Fig. 4(g)), the DPC is satisfied for almost all induces i. This indicates that high-frequency information in the original image can be reconstructed fairly well. But in Fig. 4(h), the indices satisfying the DPC are less than 3000, and in Fig. 4(i), this value reduces nearly to 1000. Therefore, as the magnification factor increases, fewer and fewer highfrequency indices could satisfy the DPC and could be recovered from regularization methods as in Eq. (29), so the reconstructed image seems over smooth. The plots of the DPCs as in Fig. 4 may provide quantitative measures for limits on SRR. The performance degradation with the increase of magnification factor has been discussed in [2,14]. Here our contributions are: (1) We use the DPC to give a clearer interpretation to this problem. (2) We show that the limit on SRR as the magnification factor increases is essentially similar to the limit on CIR, since the SRR with large magnification factors is equivalent to the CIR with large PSF sizes if the aliasing matrix H has the ideal properties, see Section 5.1. Next, we check the DPC for SRR under arbitrary motion with motion estimation errors, see Fig. 5. We estimate the motion using block-match method (which has large error, so the perturbed matrix H^ is not rank preserved). The corresponding plot in Fig. 5(g) indicates that, for indices larger than 1000, the DPC is difficult to be satisfied. This predicts that the reconstruction quality is low (the result in Fig. 5(d) validates this). We then use a more accurate robust method [3] to estimate the motion. The corresponding DPC

plot is in Fig. 5(h), which shows that the indices satisfying the DPC is larger than 2000. Thus the regularization can obtain a better result, as shown in Fig. 5(e). The improvement from Fig. 5(d) to (e) is attributed to the more accurate motion estimation. But Fig. 5(h) shows that the DPC still fails for higher frequency indices. This may not be completely overcome by further increasing the estimation accuracy, since the matrix H in the underlying problem is rank deficient (due to the faulty motion among the 5 LR frames). For comparison, we use another 5 synthetic blurred and noisy LR frames generated from Fig. 5(a) by perturbed translations. In this case, the matrix H is almost rank full, although there still exist motion errors. Fig. 5(i) shows that more indices than those in (h) satisfy the DPC, so the reconstruction in Fig. 5(f) is better. Anyway, for all cases of the SRR problems, it is helpful to check the DPC before performing the reconstruction, while having constructed the matrix WZHB by selecting a magnification factor and then estimating the image motion. The DPC can predict whether the reconstruction is meaningful, and how the reconstruction quality may be.

6. Conclusions

This paper has discussed theoretical aspects of multiframe SRR in general settings, including arbitrary image motion, blurring and noise effects, and the perturbations caused by motion estimation errors. The anti-aliasing part of SRR is essentially a discrete well-posed problem.


Small motion errors will not give rise to unacceptably large changes in the reconstruction if the perturbation is rank preserved. The ill-posedness of the entire SRR problem is caused mainly by the deblurring part. Specifically, we have the following conclusions and application guidelines.

The discrete Picard condition provides a theoretical measure of the limits on SRR. To perform SRR with a Tikhonov-like regularization in practice, one should first check the DPC to ensure that the reconstruction is meaningful. The DPC also predicts the reconstruction quality and gives guidance for choosing regularization parameters [10].

If the aliasing matrix $H$ is of full rank, then the limit on SRR is essentially similar to the limit on traditional image restoration; both are caused by the ill-conditionedness of the blurring matrix $B$. SRR with a large magnification factor is equivalent to image restoration with a large PSF size. The discussion in [2] belongs to this case. Large magnification factors may cause the DPC to fail.

On the other hand, if the matrix $H$ is rank deficient, its properties may heavily affect the performance of SRR, and the DPC may fail even at low magnifications. For a fixed magnification factor, the rank of $H$ is determined by the number of frames, the type of warping among the images, and the motion estimation accuracy. One should select frames with an appropriate warping type so that the rank of $H$ is as high as possible and the singular values of $H$ decay as slowly as possible, and one should improve the motion estimation so that the rank of the perturbed matrix $\hat{H}$ stays as close to that of $H$ as possible.

In this paper we have assumed that the blurring matrix $B$ is exactly known. This assumption may be relaxed in future work to allow for perturbations of $B$ caused by PSF estimation errors.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 60072029.

Appendix A. Proof of Theorem 1

Proof. (1) Sufficiency. From the definition of the matrix $H$, all row vectors $e_j$ of $H$ are unit vectors of the space $\mathbb{R}^{MN}$, i.e. the entries of $e_j$ are all zeros except the entry $e_j(\lambda_j) = 1$, where

$\lambda_j \triangleq c + \xi_p r + \zeta_p M r - M u_{i,p} - v_{i,p},$  (A1)

with $p = \mathrm{mod}(j, mn)$ and $i = \lfloor j/mn \rfloor$. Thus, by the definition of the partial sum $q_{j-1}$ in Eq. (12), Eq. (13) implies that $e_k(\lambda_j) = 0$ for all $k = 0, \ldots, j-1$. Hence, for any $a_k \in \mathbb{R}$, $\sum_{k=0}^{j-1} a_k e_k(\lambda_j) = 0$. But $e_j(\lambda_j) = 1$, so $e_j$ cannot be expressed as a linear combination of the previous $j$ row vectors, i.e. $e_j \notin \mathrm{span}\{e_0, e_1, \ldots, e_{j-1}\}$.

Necessity. Suppose that $q_{j-1}(\lambda_j) \neq 0$; then there exists $k \in \{0, \ldots, j-1\}$ such that $e_k(\lambda_j) = 1$. Since $e_k$ is also a unit vector, it holds that $e_j = e_k \in \mathrm{span}\{e_0, e_1, \ldots, e_{j-1}\}$. This contradicts $e_j \notin \mathrm{span}\{e_0, e_1, \ldots, e_{j-1}\}$. Therefore we must have $q_{j-1}(\lambda_j) = 0$.

(2) Sufficiency. According to the proof of (1), the condition $q_{lmn-1}(q) \neq 0$ for all $q \in \{0, 1, \ldots, MN-1\}$ implies that all the $MN$ unit vectors of the space $\mathbb{R}^{MN}$ belong to $\mathrm{span}\{e_0, e_1, \ldots, e_{lmn-1}\}$. That is, $\mathrm{span}\{e_0, e_1, \ldots, e_{lmn-1}\} = \mathbb{R}^{MN}$. Hence

$\mathrm{rank}(H) = \dim \mathrm{span}\{e_0, e_1, \ldots, e_{lmn-1}\} = MN,$

i.e. $H$ is of full rank.

Necessity. Suppose there exists $q \in \{0, 1, \ldots, MN-1\}$ such that $q_{lmn-1}(q) = 0$. By (1) again, there is a unit vector $e \in \mathbb{R}^{1 \times MN}$ with $e(q) = 1$ such that $e \notin \mathrm{span}\{e_0, e_1, \ldots, e_{lmn-1}\}$. This implies that $\mathrm{span}\{e_0, e_1, \ldots, e_{lmn-1}\}$ is a proper subspace of $\mathbb{R}^{MN}$, so its dimension is less than $MN$. This contradicts $\mathrm{rank}(H) = MN$. □

Appendix B. Proof of Corollary 1

Proof. Denote by $A$ the set on the right-hand side of Eq. (14). Consider the sum $q_{lmn-1}$ of all row vectors of $H$, where $l = \mathrm{card}(A) = r^2$. Note that there are $MN = mnr^2$ entries in the vector $q_{lmn-1}$. Replacing $(u_i, v_i)$ in Eq. (A1) by the elements of the set $A$, one can easily verify: the motion vector $(u_0, v_0) = (x_0, y_0) \in A$ makes the $mn$ entries $\xi_p r + \zeta_p M r$, $p = 0, 1, \ldots, mn-1$, of the sum $q_{lmn-1}$ non-zero; the motion vector $(u_1, v_1) = (x_0, y_0 - 1) \in A$ makes the $mn$ entries $\xi_p r + \zeta_p M r + 1$, $p = 0, 1, \ldots, mn-1$, of $q_{lmn-1}$ non-zero; and the rest follows similarly. Hence the entire set of $r^2$ motion vectors in $A$ ensures that $q_{lmn-1}(i) \neq 0$ for all $i = 0, 1, \ldots, lmn-1$. Therefore the matrix $H$ is of full rank according to Theorem 1. □

Appendix C. Proof of Theorem 2

Proof. Let $g = [a_0, a_1, \ldots, a_{MN-1}]^T$ be the HR image. From Eq. (10), the elements of the vector $b \triangleq s - n$ are given by

$b(j) = e_j g = a_{\lambda_j}, \quad j = 0, 1, \ldots, lmn-1,$

where $e_j$ is the $j$th row vector of $H$, and $\lambda_j$ is defined in Eq. (A1). The above equation shows that the elements of $b$ are simply a rearrangement (with repetitions) of the elements of $g$. For the sum of the row vectors of $H$ defined in Eq. (12), assume that $q_{lmn-1}(i) = \mu_i$, $0 \leq \mu_i \leq \mu$, $i = 0, 1, \ldots, MN-1$. It can be seen that $\mu_i$ counts the number of times $a_i$ appears in the vector $b$. Thus we have

$\sum_{j=0}^{lmn-1} a_{\lambda_j}^2 = \sum_{i=0}^{MN-1} \mu_i a_i^2 \leq \mu \sum_{i=0}^{MN-1} a_i^2,$

where the latter inequality becomes an equality if $a_i = 0$ for all $i$ with $\mu_i \neq \mu$. The above inequality implies that $\|b\|_2^2 \leq \mu \|g\|_2^2$. Therefore

$\|H\|_2 := \max_{\|g\|_2 \neq 0} \frac{\|H g\|_2}{\|g\|_2} = \frac{\sqrt{\mu}\,\|g\|_2}{\|g\|_2} = \sqrt{\mu}.$

If $H$ is of full rank, we have $\mu_i \geq 1$, $i = 0, 1, \ldots, MN-1$, according to Theorem 1. This implies that

$\sum_{i=0}^{MN-1} a_i^2 \leq \sum_{i=0}^{MN-1} \mu_i a_i^2,$

that is, $\|g\|_2^2 \leq \|b\|_2^2$, with equality if $a_i = 0$ for all $i$ with $\mu_i > 1$. Since $g = H^+ b$, we finally obtain $\|H^+\|_2 = 1$ by derivations similar to the above. □

Appendix D. Proof of Theorem 3

Proof. For an arbitrary matrix $A \in \mathbb{R}^{m \times n}$, its Frobenius norm is [9]

$\|A\|_F := \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{1/2},$

and the relationship between the 2-norm and the F-norm is

$\|A\|_2 \leq \|A\|_F \leq \sqrt{n}\, \|A\|_2.$

Thus, for the matrices $H, \hat{H} \in \mathbb{R}^{lmn \times MN}$, we have

$\|\hat{H} - H\|_2^2 \leq \|\hat{H} - H\|_F^2 = \sum_{i=0}^{l-1} \|D G_i (\Delta G_i - I)\|_F^2 \leq \sum_{i=0}^{l-1} \|D G_i\|_F^2\, \|\Delta G_i - I\|_F^2 \leq \alpha^2 \sum_{i=0}^{l-1} \|D G_i\|_F^2 = \alpha^2 \|H\|_F^2 \leq \alpha^2 \left(\sqrt{MN}\, \|H\|_2\right)^2,$

where $\alpha$ is defined in Eq. (25). This proves inequality (23). Next, if $\mathrm{rank}(\hat{H}) = \mathrm{rank}(H)$, then from the general perturbation theorem for the pseudo-inverse [26,32], we have

$\|\hat{H}^+ - H^+\|_2 \leq \frac{1 + \sqrt{5}}{2}\, \|\hat{H} - H\|_2\, \|\hat{H}^+\|_2\, \|H^+\|_2.$

Using the bound on $\|\hat{H} - H\|_2$ derived above and the definition of $\kappa_2(H)$ in Eq. (15), we obtain inequality (24) in Theorem 3. □

References

[1] H.C. Andrews, B.R. Hunt, Digital Image Restoration, Prentice Hall, Englewood Cliffs, NJ, 1977.
[2] S. Baker, T. Kanade, Limits on super-resolution and how to break them, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (9) (2002) 1167–1183.
[3] M.J. Black, P. Anandan, A framework for the robust estimation of optical flow, Proceedings of the International Conference on Computer Vision, 1993, pp. 231–236.
[4] S. Borman, R.L. Stevenson, Simultaneous multi-frame MAP super-resolution video enhancement using spatio-temporal priors, Proceedings of the IEEE International Conference on Image Processing, Kobe, Japan 3 (1999) 469–473.
[5] M.-C. Chiang, T.E. Boult, Efficient super-resolution via image warping, Image and Vision Computing 18 (2000) 1761–1771.
[6] M. Elad, A. Feuer, Restoration of a single superresolution image from several blurred, noisy, and undersampled measured images, IEEE Transactions on Image Processing 6 (12) (1997) 1646–1658.
[7] M. Elad, A. Feuer, Super-resolution reconstruction of image sequences, IEEE Transactions on Pattern Analysis and Machine Intelligence 21 (9) (1999) 817–834.
[8] K. Forbes, V.V. Anh, Condition of system matrices in image restoration, Journal of the Optical Society of America A 11 (6) (1994) 1727–1735.
[9] G.H. Golub, C.F. Van Loan, Matrix Computations, third ed., Johns Hopkins University Press, Baltimore, MD, 1996.
[10] P.C. Hansen, The discrete Picard condition for discrete ill-posed problems, BIT 30 (1990) 658–672.
[11] R.C. Hardie, K.J. Barnard, E.E. Armstrong, Joint MAP registration and high-resolution image estimation using a sequence of undersampled images, IEEE Transactions on Image Processing 6 (12) (1997) 1621–1633.
[12] S.P. Kim, N.K. Bose, H.M. Valenzuela, Recursive reconstruction of high resolution image from noisy undersampled multiframes, IEEE Transactions on Acoustics, Speech and Signal Processing 38 (1990) 1013–1027.
[13] S. Lertrattanapanich, N.K. Bose, High resolution image formation from low resolution frames using Delaunay triangulation, IEEE Transactions on Image Processing 11 (12) (2002) 1427–1441.
[14] Z. Lin, H.-Y. Shum, Fundamental limits of reconstruction-based superresolution algorithms under local translation, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (1) (2004) 83–97.
[15] F. Milinazzo, C. Zala, I. Barrodale, On the rate of growth of condition numbers for convolution matrices, IEEE Transactions on Acoustics, Speech, and Signal Processing 35 (4) (1987) 471–475.
[16] M.K. Ng, N.K. Bose, Mathematical analysis of super-resolution methodology, IEEE Signal Processing Magazine 20 (3) (2003) 62–74.
[17] N. Nguyen, P. Milanfar, G. Golub, A computationally efficient superresolution image reconstruction algorithm, IEEE Transactions on Image Processing 10 (4) (2001) 573–583.
[18] A. Papoulis, Generalized sampling expansion, IEEE Transactions on Circuits and Systems 24 (11) (1977) 652–654.
[19] S.C. Park, M.K. Park, M.G. Kang, Super-resolution image reconstruction: a technical overview, IEEE Signal Processing Magazine 20 (3) (2003) 21–36.
[20] A.J. Patti, Y. Altunbasak, Artifact reduction for set theoretic super resolution image reconstruction with edge adaptive constraints and higher-order interpolants, IEEE Transactions on Image Processing 10 (1) (2001) 179–186.
[21] A.J. Patti, M.I. Sezan, A.M. Tekalp, Superresolution video reconstruction with arbitrary sampling lattices and nonzero aperture time, IEEE Transactions on Image Processing 6 (8) (1997) 1064–1076.
[22] D. Rajan, S. Chaudhuri, Simultaneous estimation of super-resolved scene and depth map from low resolution defocused observations, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (9) (2003) 1102–1117.
[23] R.R. Schultz, R.L. Stevenson, Extraction of high-resolution frames from video sequences, IEEE Transactions on Image Processing 5 (6) (1996) 996–1011.
[24] H. Shekarforoush, R. Chellappa, Data-driven multi-channel superresolution with application to video sequences, Journal of the Optical Society of America A 16 (3) (1999) 481–492.
[25] H. Stark, P. Oskoui, High-resolution image recovery from image-plane arrays, using convex projections, Journal of the Optical Society of America A 6 (1989) 1715–1726.
[26] G.W. Stewart, On the perturbation of pseudo-inverses, projections, and linear least squares problems, SIAM Review 19 (1977) 634–662.
[27] A.M. Tekalp, Digital Video Processing, Prentice Hall, Englewood Cliffs, NJ, 1995.
[28] R.Y. Tsai, T.S. Huang, Multiframe image restoration and registration, in: T.S. Huang (Ed.), Advances in Computer Vision and Image Processing, vol. 1, JAI Press, Greenwich, 1984, pp. 317–339.
[29] H. Ur, D. Gross, Improved resolution from sub-pixel shifted pictures, CVGIP: Graphical Models and Image Processing 54 (2) (1992) 181–186.
[30] Z. Wang, F. Qi, Super-resolution video restoration with model uncertainties, Proceedings of the IEEE International Conference on Image Processing 2 (2002) 853–856.
[31] Z. Wang, F. Qi, On ambiguities in super-resolution modeling, IEEE Signal Processing Letters 11 (8) (2004) 678–681.
[32] P.-Å. Wedin, Perturbation theory for pseudo-inverses, BIT 13 (1973) 217–232.
[33] W. Zhao, H.S. Sawhney, Is super-resolution with optical flow feasible?, Proceedings of the European Conference on Computer Vision 1 (2002) 599–613.