Nonlinear Analysis 71 (2009) 4357–4363
Quasi-random sampling for signal recovery

Mirosław Pawlak a,*, Ewaryst Rafajłowicz b

a Department of Electrical & Computer Engineering, The University of Manitoba, Winnipeg, Manitoba, Canada, R3T 2N2
b Institute of Computer Engineering, Control and Robotics, Wrocław University of Technology, Wrocław, Poland

Article history: Received 11 June 2008; Accepted 13 February 2009
Abstract

The problem of reconstructing band-limited signals from nonuniformly sampled and noisy observations is studied. It is proposed to sample a signal at quasi-random points, i.e., points forming a deterministic sequence whose properties resemble those of uniformly distributed random variables. Such quasi-random points can be generated easily and efficiently. A reconstruction method based on a modified Whittaker–Shannon cardinal expansion is proposed and its asymptotic properties are examined. In particular, sufficient conditions for the convergence of the mean integrated squared error are found. The key difference between the proposed reconstruction algorithm and the classical Whittaker–Shannon scheme is that the sampling rate and the reconstruction rate are treated differently. This distinction is necessary to ensure consistency of the reconstruction algorithm in the presence of noise.
MSC: 94A12; 41A45; 62G07; 41A25

Keywords: Band-limited signals; Nonlinear sampling; Quasi-random points; Orthogonal series; Noisy data; Reconstruction; Convergence
1. Introduction and preliminaries

Throughout the paper we assume that a signal f(t) has a bounded spectrum, i.e., that its Fourier transform F(ω) vanishes outside of a finite interval. Any signal with this property is referred to as band-limited. A fundamental result in signal processing (see [1–4]) is that any band-limited signal f(t) can be perfectly recovered from its discrete values f(kπ/Ω), k = 0, ±1, ±2, .... In fact, the celebrated Whittaker–Shannon sampling theorem says that

f(t) = \sum_{k=-\infty}^{\infty} f\!\left(k\frac{\pi}{\Omega}\right)\operatorname{sinc}\!\left(\Omega\left(t - k\frac{\pi}{\Omega}\right)\right),   (1.1)

where sinc(t) = sin(t)/t, and Ω is the bandwidth of f(t), i.e.,

f(t) = \frac{1}{2\pi}\int_{-\Omega}^{\Omega} F(\omega)\,e^{i\omega t}\,d\omega.   (1.2)

Here F(ω) is a square integrable function on (−Ω, Ω). In the sequel we shall denote the class of functions satisfying (1.2) by BL(Ω).
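For illustration, the following sketch (a minimal example; the test signal, bandwidth, and truncation levels are our own assumptions, not taken from the paper) evaluates a truncated version of the cardinal series (1.1) on a noise-free band-limited signal.

```python
import numpy as np

# Illustrative band-limited test signal with bandwidth Omega (our choice):
# f(t) = (sin(Omega*t/2)/(Omega*t/2))^2 has a Fourier transform supported on [-Omega, Omega].
Omega = np.pi
f = lambda t: np.sinc(Omega * t / (2 * np.pi)) ** 2   # np.sinc(x) = sin(pi*x)/(pi*x)

def cardinal_series(t, n):
    """Truncated Whittaker-Shannon expansion (1.1) using samples f(k*pi/Omega), |k| <= n."""
    k = np.arange(-n, n + 1)
    tk = k * np.pi / Omega                               # sampling instants k*pi/Omega
    # sinc(Omega*(t - tk)) with sinc(u) = sin(u)/u, expressed through np.sinc
    basis = np.sinc(Omega * (t[:, None] - tk[None, :]) / np.pi)
    return basis @ f(tk)

t = np.linspace(-5.0, 5.0, 1001)
for n in (10, 50, 200):
    err = np.max(np.abs(cardinal_series(t, n) - f(t)))
    print(f"n = {n:4d}, max truncated-series error on [-5, 5]: {err:.2e}")
```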
In practical applications of the sampling formula in (1.1) one has to truncate the series in (1.1). Moreover, the samples f(kπ/Ω) are often observed in the presence of noise, i.e., we have noisy observations y_k = f(kπ/Ω) + z_k, where z_k is a zero mean noise process with a finite variance. This leads to the following naive reconstruction scheme

f_n(t) = \sum_{|k|\le n} y_k\,\operatorname{sinc}\!\left(\Omega\left(t - k\frac{\pi}{\Omega}\right)\right),   (1.3)
where 2n + 1 is the number of observations taken into account. The fundamental question that arises is whether f_n(t) can be a consistent estimate of f(t), i.e., whether ϱ(f_n, f) → 0 as n → ∞, in a certain probabilistic sense, for some distance measure ϱ. Since f(t) is assumed to be square integrable, the natural measure of distance between f_n(t) and f(t) is the mean integrated squared error

\mathrm{MISE}(f_n) = E\int_{-\infty}^{\infty}\left(f_n(t) - f(t)\right)^2 dt.   (1.4)
This represents the energy of the error signal f_n(t) − f(t). It was shown in [5] that

\mathrm{MISE}(f_n) = \sigma^2\,\frac{\pi}{\Omega}\,(2n+1) + \frac{\pi}{\Omega}\sum_{|k|>n} f^2\!\left(k\frac{\pi}{\Omega}\right).   (1.5)
This readily implies that MISE(f_n) → ∞ as n → ∞. This unpleasant property of the estimate f_n(t) is caused by the presence of the noise process in the observed data and by the fact that f_n(kπ/Ω) = y_k, i.e., f_n(t) interpolates the noisy observations. It is clear that one should avoid interpolation schemes in the presence of noise, since they retain the random errors. The aim of this paper is to propose a consistent estimate of f(t) that is a smooth correction of the naive algorithm in (1.3). This task is carried out by sampling the signal at irregularly spaced quasi-random points and by carefully selecting the number of terms in the sampling series. Conditions for the consistency of our estimate are established. Throughout the paper we assume that the data are generated from the following model

y_k = f(\tau_k) + z_k,\qquad k = 0, \pm 1, \pm 2, \ldots,   (1.6)

where τ_k are quasi-random points, and z_k is an uncorrelated noise process with Ez_k = 0, var(z_k) = σ² < ∞.
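As a quick numerical illustration of why interpolation-type schemes fail under noise, the following sketch (the test signal, noise level, and integration grid are our illustrative assumptions) implements the naive estimator (1.3) on noisy uniform samples and reports its empirical integrated squared error, which grows with n in line with (1.5).

```python
import numpy as np

rng = np.random.default_rng(0)
Omega, sigma = np.pi, 0.3                              # assumed bandwidth and noise level
f = lambda t: np.sinc(Omega * t / (2 * np.pi)) ** 2    # band-limited test signal

def naive_estimate(t, n):
    """Naive truncated Whittaker-Shannon reconstruction (1.3) from noisy samples."""
    k = np.arange(-n, n + 1)
    tk = k * np.pi / Omega
    y = f(tk) + sigma * rng.standard_normal(tk.size)   # noisy observations y_k
    basis = np.sinc(Omega * (t[:, None] - tk[None, :]) / np.pi)
    return basis @ y

t = np.linspace(-200.0, 200.0, 8001)
dt = t[1] - t[0]
for n in (10, 40, 160):
    ise = np.sum((naive_estimate(t, n) - f(t)) ** 2) * dt   # crude integrated squared error
    print(f"n = {n:4d}, empirical ISE of the naive scheme: {ise:.3f}")
```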
The problem of recovering a nonparametric function in the model (1.6) has been an active area of research in recent years, see, e.g., [6,7]. Nonparametric curve estimation techniques such as kernel, spline, and orthogonal series estimates have been studied extensively. Nevertheless, the estimated functions have usually been assumed to be defined on a finite interval. In a number of signal processing and communication problems, the assumption that f(t) is defined on a finite interval has little physical significance. In fact, the function f(t) has to be interpreted as a signal defined everywhere in time; moreover, the signal has finite energy and bounded frequency content. These requirements are met by the class BL(Ω) considered in our paper. Moreover, it has been commonly assumed that the sampling scheme in (1.6) is linear, i.e., that the sampling points are specified as τ_k = k∆ for some sampling period ∆. In this paper, we propose a nonlinear sampling scheme based on the theory of quasi-random sequences. The use of quasi-random sequences as the sampling scheme has several salient features, listed below.

(1) If a quasi-random sequence is generated as described in the next section, then adding new sampling points does not require changing the positions of the sampling points that were placed earlier. The advantage of this property can be fully appreciated when one samples spatial objects such as images. Note also that in the case of the regular sampling scheme with τ_k = k∆ we would need to rearrange all the previous samples.
(2) When a smoothing method is used to reconstruct a signal, the accuracy of the method inevitably depends on the quadrature error of approximating integrals by the corresponding finite sums. It is known that the quadrature error incurred by quasi-random points is considerably smaller than that incurred by classical uniformly distributed random sequences. This fact is extensively utilized in this paper.
(3) Samples taken at quasi-random points can be transmitted in a secure way, since they are useless unless the irrational number used for designing the quasi-random points is known.

2. Sampling with quasi-random points

The notion of quasi-random sequences was originally established in the theory of numerical integration (see [8,9]). A sequence of real numbers {x_j, j = 1, 2, ...} is said to be a quasi-random sequence in [0, 1] if for every function b(x) that is continuous on [0, 1] we have

\lim_{n\to\infty}\frac{1}{n}\sum_{j=1}^{n} b(x_j) = \int_{0}^{1} b(x)\,dx.   (2.1)
Quasi-random sequences are also called equidistributed sequences, since (2.1) means that the sequence {x_j} behaves like uniformly distributed random variables. Nevertheless, an important property of quasi-random sequences is that they are more uniform than random uniform sequences, which tend to clump. A consequence of this fact is that the accuracy of approximating integrals based on quasi-random sequences is superior to the accuracy obtained with random sequences. This basic observation plays a key role in our developments concerning the signal recovery problem. Numerous sequences have been shown to have the property in (2.1). The simplest way (and one which is sufficient for our purposes) of generating a quasi-random sequence is the following (see [8,9]):

x_j = \operatorname{frac}(j\vartheta),   (2.2)

where ϑ is an irrational number and frac(·) denotes the fractional part of a number. A good choice of ϑ is (√5 − 1)/2 (see [10] for an extensive discussion on the choice of ϑ). The signal sampling problem requires a rescaled version of quasi-random sequences. Hence, let us re-scale {x_j} to the observation interval [−T, T]. Thus, we adopt the following sampling scheme

\tau_j = T\,\operatorname{sgn}(j)\,\operatorname{frac}(|j|\,\vartheta),\qquad j = 0, \pm 1, \pm 2, \ldots, n,   (2.3)
where sgn(·) denotes the sign of a number. Since band-limited signals are defined on the whole real line, we must let the observation horizon T increase. In fact, we treat T as a function of n such that T(n) → ∞ as n → ∞. In order to establish the consistency of our reconstruction algorithm we must control the growth of T(n); see Section 4 for details. The approximation property (2.1) of quasi-random sequences, applied to the sequence defined in (2.3), gives

\frac{2T}{2n+1}\sum_{|j|\le n} f(\tau_j) \approx \int_{-T}^{T} f(t)\,dt   (2.4)

for the class of all continuous functions on the real line.
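The following sketch (a minimal illustration; the golden-ratio generator and the test integrand are assumptions consistent with (2.2) and (2.4), not code from the paper) generates the sampling points (2.3) and checks the quadrature property (2.4) numerically.

```python
import numpy as np

theta = (np.sqrt(5.0) - 1.0) / 2.0          # quadratic irrational suggested for (2.2)

def quasi_random_points(n, T):
    """Sampling points tau_j = T*sgn(j)*frac(|j|*theta), j = -n..n, as in (2.3)."""
    j = np.arange(-n, n + 1)
    return T * np.sign(j) * np.mod(np.abs(j) * theta, 1.0)

# Check the quadrature property (2.4) for a smooth integrand on [-T, T].
g = lambda t: np.exp(-t ** 2)               # illustrative continuous test function
T = 4.0
exact = np.sqrt(np.pi)                      # int over the real line; tail beyond |t|>4 is negligible
for n in (100, 1000, 10000):
    tau = quasi_random_points(n, T)
    approx = 2 * T / (2 * n + 1) * np.sum(g(tau))
    print(f"n = {n:6d}, quasi-random quadrature error: {abs(approx - exact):.2e}")
```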
3. Reconstruction algorithm

It has been known since the work of Hardy [11], see also [12], that the representation in (1.1) can be viewed as the orthogonal expansion in BL(Ω), i.e., we have

f(t) = \sum_{k=-\infty}^{\infty} c_k\, s_k(t),   (3.1)
where the functions

s_k(t) = \operatorname{sinc}\!\left(\frac{\pi}{h}(t - kh)\right),\qquad k = 0, \pm 1, \ldots,

form an orthogonal and complete system in BL(Ω) provided that h ≤ π/Ω. The Fourier coefficient c_k is defined as

c_k = h^{-1}\int_{-\infty}^{\infty} f(t)\,s_k(t)\,dt.
It is also clear that c_k = f(kh) and that the case h = π/Ω corresponds to (1.1). Our estimation technique relies on (3.1) with c_k replaced by a quadrature approximation of the integral ∫_{−∞}^{∞} f(t)s_k(t) dt over a finite interval, and with an appropriate truncation of the infinite series in (3.1). In this paper we approximate the integral with the help of quasi-random sequences. Hence, we obtain the following estimate of f(t):

\tilde f(t) = \sum_{|k|\le N} \tilde c_k\, s_k(t),   (3.2)

\tilde c_k = \frac{2T}{(2n+1)h}\sum_{|j|\le n} y_j\, s_k(\tau_j),   (3.3)
where {τ_j} is the quasi-random sequence defined in (2.3). In (3.2) the parameter N defines the number of terms in the expansion that are taken into account, and 2n + 1 is the sample size. This truncation parameter plays an important role in our asymptotic analysis, i.e., N depends on n in such a way that N(n) → ∞ at a controlled rate. It is also worth noting that the sampling rate is nonuniform (defined by the discrepancy of the quasi-random sequence in (2.3)) and different from the reconstruction rate h. The latter is assumed to be constant and not greater than π/Ω. In the next section we present conditions on N and T ensuring that MISE(f̃) tends to zero as n → ∞.
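A compact implementation of the estimator (3.2)–(3.3) may look as follows (a sketch under our own assumptions about the signal, noise, and tuning parameters N, T, whose choice the paper discusses only asymptotically).

```python
import numpy as np

def reconstruct(y, tau, t, N, T, n, h):
    """Estimate f(t) via (3.2)-(3.3) from noisy samples y_j = f(tau_j) + z_j."""
    k = np.arange(-N, N + 1)
    # s_k(x) = sinc(pi/h * (x - k*h)), written with np.sinc(u) = sin(pi*u)/(pi*u)
    s = lambda x: np.sinc((x[:, None] - k[None, :] * h) / h)
    c_tilde = 2 * T / ((2 * n + 1) * h) * (s(tau).T @ y)   # coefficients (3.3)
    return s(t) @ c_tilde                                  # truncated expansion (3.2)

# Illustrative use (all concrete numbers below are our assumptions).
rng = np.random.default_rng(1)
Omega, h, sigma = np.pi, 1.0, 0.3              # here h = pi/Omega
f = lambda t: np.sinc(Omega * t / (2 * np.pi)) ** 2
n, T, N = 2000, 20.0, 15                       # sample size 2n+1, horizon T, truncation N
theta = (np.sqrt(5.0) - 1.0) / 2.0
j = np.arange(-n, n + 1)
tau = T * np.sign(j) * np.mod(np.abs(j) * theta, 1.0)   # quasi-random points (2.3)
y = f(tau) + sigma * rng.standard_normal(tau.size)

t = np.linspace(-10.0, 10.0, 2001)
f_hat = reconstruct(y, tau, t, N, T, n, h)
print("max |f_hat - f| on [-10, 10]:", np.max(np.abs(f_hat - f(t))))
```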
4. The MISE consistency
From the orthogonality of {s_k(t)} and by the fact that ∫_{−∞}^{∞} s_k²(t) dt = h, we have

\mathrm{MISE}(\tilde f) = h\sum_{|k|\le N} E(\tilde c_k - c_k)^2 + h\sum_{|k|>N} c_k^2.   (4.1)
Assuming that the sequence N(n) is chosen in such a way that

N(n) \to \infty \quad \text{as } n \to \infty,   (4.2)

for the second term in (4.1) we have

h\sum_{|k|>N} c_k^2 \to 0 \quad \text{as } n \to \infty.   (4.3)
The first term in (4.1) can be further decomposed as follows:

h\sum_{|k|\le N} \operatorname{var}(\tilde c_k) + h\sum_{|k|\le N}\left(E\tilde c_k - c_k\right)^2.   (4.4)
The decomposition of the error given in (4.1) and (4.4) allows us to examine separately the variance and the bias of the estimate f̃(t). These terms are examined in the next two paragraphs.

Variance: Let us denote

\mathrm{VAR}(\tilde f) = h\sum_{|k|\le N}\operatorname{var}(\tilde c_k).

The asymptotic behavior of VAR(f̃) is given in the following lemma.

Lemma 1. Let h ≤ π/Ω. Suppose that T(n) → ∞. Then for n sufficiently large we have

\mathrm{VAR}(\tilde f) = \sigma^2\,\frac{2T}{2n+1}\,(2N+1)\,(1+o(1)).
Proof. Let us first observe that

\operatorname{var}(\tilde c_k) = \frac{4T^2\sigma^2}{h^2(2n+1)^2}\sum_{|j|\le n} s_k^2(\tau_j).   (4.5)
Since the sequence {τ_j} is quasi-random, due to (2.4) we have

\frac{2T}{2n+1}\sum_{|j|\le n} s_k^2(\tau_j) = \int_{-T}^{T} s_k^2(t)\,dt + o(1),   (4.6)
where the error of the above approximation can be shown to tend to zero as n → ∞. On the other hand, by the facts that ∫_{−∞}^{∞} s_k²(t) dt = h and that s_k²(t) decays as 1/t², we obtain

\int_{-T}^{T} s_k^2(t)\,dt = h + O(T^{-1}).
Thus, for n sufficiently large we have

\operatorname{var}(\tilde c_k) = \frac{2T\sigma^2}{h^2(2n+1)}\,(h + o(1)).

This completes the proof of Lemma 1. □
The next paragraph concerns the bias component of the error.

Bias: Owing to (4.1) and (4.4),

\mathrm{BIAS}(\tilde f) = h\sum_{|k|\le N}\left(E\tilde c_k - c_k\right)^2 + h\sum_{|k|>N} c_k^2.   (4.7)
Let us note that the first term in the above decomposition is due to the approximation of the integral defining c_k by the summation formula. On the other hand, the second term in (4.7) represents the error due to the truncation of the expansion in (3.1). We have already noted that this term tends to zero as N → ∞. Hence, let us concentrate on the first term in (4.7), which is denoted as bias(f̃). For a single summand of bias(f̃) we have

|E\tilde c_k - c_k| = h^{-1}\left|\frac{2T}{2n+1}\sum_{|j|\le n} f(\tau_j)s_k(\tau_j) - \int_{-\infty}^{\infty} f(t)s_k(t)\,dt\right|
\le h^{-1}\left|\frac{2T}{2n+1}\sum_{|j|\le n} f(\tau_j)s_k(\tau_j) - \int_{-T}^{T} f(t)s_k(t)\,dt\right| + h^{-1}\left|\int_{|t|>T} f(t)s_k(t)\,dt\right|.   (4.8)
It is clear that for every f ∈ BL(Ω) the second term in the above decomposition tends to zero as T → ∞. In order, however, to choose an appropriate T and to quantify the rate at which this term vanishes, we need an assumption on the decay of f(t) as |t| → ∞. Thus, let us assume that

(F) there exist r ≥ 0 and a constant C_f > 0 such that for |t| sufficiently large we have |f(t)| ≤ C_f/|t|^{r+1}.

This assumption can also be expressed in the frequency domain by requiring that the Fourier transform F(ω) of f(t) has r derivatives on [−Ω, Ω]. By virtue of Assumption (F) we readily obtain the following bound for the second term in (4.8):
\left|\int_{|t|>T} f(t)s_k(t)\,dt\right| \le \bar C_f/T^r,   (4.9)

where C̄_f = 2C_f/r. To bound the first term in the decomposition in (4.8) we need the fundamental Koksma–Hlawka inequality (see [9]) on the size of the numerical error of quasi-random Monte Carlo integration. In our case this result reads as follows:
\left|\frac{2T}{2n+1}\sum_{|j|\le n} f(\tau_j)s_k(\tau_j) - \int_{-T}^{T} f(t)s_k(t)\,dt\right| \le 2T\,V_T(fs_k)\,D_n^*.   (4.10)
Here V_T(fs_k) is the total variation of f(t)s_k(t) over the interval [−T, T]. On the other hand, D_n^* denotes the so-called discrepancy of the first n terms of the quasi-random sequence {τ_j} (see [9]). The discrepancy of {τ_j} measures the ability of the sequence to approximate the uniform distribution on [0, 1]. It is worth noting that, in our case, we have a quasi-random sequence on [−T, T]. Nevertheless, it can be shown that the discrepancy is independent of T. It is known that if the constant ϑ (the constant defining our quasi-random sequence {τ_j}, see (2.3)) is a quadratic irrational number (e.g. ϑ = (√5 − 1)/2), then we have

D_n^* = O(\log(n)/n).   (4.11)
Consequently, we obtain

\left|\frac{2T}{2n+1}\sum_{|j|\le n} f(\tau_j)s_k(\tau_j) - \int_{-T}^{T} f(t)s_k(t)\,dt\right| \le C\,2T\,V_T(fs_k)\,\log(n)/n,   (4.12)

for some universal constant C. Owing to the aforementioned considerations, it remains to evaluate the total variation V_T(fs_k) of f(t)s_k(t). To do so, let us observe that
V_T(fs_k) \le M\left(V_T(f) + V_T(s_k)\right),   (4.13)

where V_T(g) denotes the total variation of a function g(t) on [−T, T], and M = max(sup_{t∈R}|f(t)|, sup_{t∈R}|s_k(t)|). It is known that |s_k(t)| ≤ 1 and that for f ∈ BL(Ω) we have |f(t)| ≤ ‖f‖(Ω/π)^{1/2}, where ‖f‖ is the L²-norm of f(t), see [13]. Hence, M = max(‖f‖(Ω/π)^{1/2}, 1). Concerning the term V_T(f) + V_T(s_k) in (4.13), we note that by the Cauchy–Schwarz inequality we have
V_T(f) = \int_{-T}^{T} |f^{(1)}(t)|\,dt \le (2T)^{1/2}\left(\int_{-\infty}^{\infty} |f^{(1)}(t)|^2\,dt\right)^{1/2},
where f^{(1)}(t) denotes the derivative of f(t). Since ‖f^{(1)}‖ ≤ Ω‖f‖ (Bernstein's inequality, see [13]), we obtain

V_T(f) \le \sqrt{2}\,\Omega\,\|f\|\,T^{1/2}.

Observing that s_k ∈ BL(π/h) and using identical arguments, we get

V_T(s_k) \le \pi\sqrt{\frac{2}{3h}}\,T^{1/2}.
Collecting the above results concerning the bias term, we obtain the following.
Lemma 2. Under Assumption (F),

\operatorname{bias}(\tilde f) \le (2N+1)\left(C_1 T^{-2r} + C_2 T^3 \log^2(n)/n^2\right),
where C_1, C_2 are constants independent of T, N and n.

Finally, we obtain

\mathrm{MISE}(\tilde f) \le (2N+1)\left(C_1 T^{-2r} + C_2 T^3 \log^2(n)/n^2 + C_3 T/n\right) + h\sum_{|k|>N} c_k^2,   (4.14)
for some constants C_1, C_2, C_3. A careful analysis of the bound in (4.14) yields the following main result of this paper.

Theorem 1. Let f ∈ BL(Ω) and let Assumption (F) be satisfied with r > 0. Let T(n) → ∞ and N(n) → ∞. Suppose that T(n) does not grow faster than √n. Let, moreover,

\frac{N(n)T(n)}{n} \to 0, \qquad N(n)\,T^{-2r}(n) \to 0.
Then MISE(f̃) → 0 as n → ∞.

The conditions required on the parameters T(n) and N(n) in Theorem 1 impose some restrictions on their growth. In fact, let T(n) = n^α and N(n) = n^β for some positive α, β. Then the conditions of Theorem 1 are met if

\beta < 2r\alpha, \qquad \beta < 1 - \alpha, \qquad \alpha \le 1/2.

This defines a convex polygonal region in the (α, β) plane.
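As a small illustration (our own helper, not part of the paper), the following sketch checks whether given exponents (α, β) satisfy the conditions of Theorem 1 for a given decay parameter r; the exponent 1/(2r + 1) used in the example corresponds to the order of the horizon T(n) minimizing the right-hand side of (4.14), as noted below.

```python
def theorem1_conditions(alpha, beta, r):
    """Check the growth conditions of Theorem 1 for T(n) = n**alpha, N(n) = n**beta."""
    return beta < 2 * r * alpha and beta < 1 - alpha and alpha <= 0.5

# Example with r = 1: the rate-optimizing horizon exponent is 1/(2r + 1) (see below).
r = 1
alpha_opt = 1 / (2 * r + 1)
for alpha, beta in [(alpha_opt, 0.25), (0.5, 0.6), (0.25, 0.5)]:
    print(f"alpha={alpha:.3f}, beta={beta:.2f}:",
          "admissible" if theorem1_conditions(alpha, beta, r) else "not admissible")
```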
It is also worth noting that the bound given in (4.14) allows us to choose T(n) so as to minimize the upper bound. In fact, some algebra shows that the T(n) minimizing the right-hand side of (4.14) is of order n^{1/(2r+1)}. The problem of choosing the tuning parameters T(n) and N(n), and the corresponding convergence rates, is left for future research.

5. Concluding remarks

In this paper we have proposed an algorithm for recovering a band-limited signal observed under noise. Assuming that the signal is a square integrable function, sufficient conditions for the convergence of the mean integrated squared error have been established. The distinguishing feature of the proposed approach is that it is based on nonuniform samples taken at quasi-random points. When quasi-random sequences are applied to the problem of numerical evaluation of integrals, they yield the approximation rate O(log(n)/n) for the class of functions of bounded variation. This rate is superior to the rate O(1/√n) that characterizes the usual numerical algorithms and classical Monte Carlo methods. This advantage of quasi-random sequences appears to carry over to the problem of signal sampling and recovery. In our consistency results, we assume that the reconstruction rate h is constant and can be chosen as large as π/Ω. One could also consider the case where h = h(n) and h(n) → 0 as n → ∞. Estimates with variable h would be needed for the problem of recovering not necessarily band-limited functions. Furthermore, let us mention that the results of this paper can be extended to the d-dimensional case, where the orthogonal system can be obtained in the form of a product of sinc functions, i.e., s_k(t) = \prod_{i=1}^{d} s_{k_i}(t_i), where k = (k_1, k_2, ..., k_d) and t = (t_1, t_2, ..., t_d); see [4, Chapter 6] for a discussion of sampling theorems in higher dimensions. Here we only mention that multidimensional quasi-random sequences can be generated in a relatively straightforward way. Moreover, they exhibit the favorable discrepancy of order O(n^{-1}(log(n))^d) for any d. Yet another extension of our results concerns reconstructing signals observed with some blurring, i.e., when one observes
y_k = \int_{-\infty}^{\infty} K(s, \tau_k)\,f(s)\,ds + z_k,

where K(s, t) is some kernel function. In particular, when K(s, t) = K(s − t) we can model the common situation in which a noisy band-limited signal with missing frequencies is observed. This problem is left for future research.

Acknowledgment

The work of E. Rafajłowicz was supported by the Research and Development Grant from the Ministry of Science and Higher Education of Poland.
References

[1] R.N. Bracewell, The Fourier Transform and Its Applications, 2nd revised ed., McGraw-Hill Book Company, New York, 1986.
[2] P.L. Butzer, R.L. Stens, Sampling theory for not necessarily band-limited functions: A historical overview, SIAM Rev. 34 (1992) 40–53.
[3] A.J. Jerri, The Shannon sampling theorem – its various extensions and applications: A tutorial review, Proc. IEEE 65 (1977) 1565–1596.
[4] R.J. Marks II, Introduction to Shannon Sampling and Interpolation Theory, Springer-Verlag, New York, 1991.
[5] M. Pawlak, E. Rafajłowicz, On restoration of band-limited signals, IEEE Trans. Inform. Theory 40 (1994) 1490–1503.
[6] R.L. Eubank, Spline Smoothing and Nonparametric Regression, Marcel Dekker, New York, 1988.
[7] L. Györfi, M. Kohler, A. Krzyżak, H. Walk, A Distribution-Free Theory of Nonparametric Regression, Springer-Verlag, New York, 2002.
[8] P.J. Davis, P. Rabinowitz, Methods of Numerical Integration, Academic Press, Orlando, 1984.
[9] L. Kuipers, H. Niederreiter, Uniform Distribution of Sequences, Wiley, New York, 1974.
[10] E. Rafajłowicz, R. Schwabe, Equidistributed designs in nonparametric regression, Statist. Sinica 13 (2003) 129–142.
[11] G.H. Hardy, Notes on special systems of orthogonal functions (IV): The orthogonal functions of Whittaker's cardinal series, Proc. Camb. Philos. Soc. 37 (1941) 331–348.
[12] F. Stenger, Approximation via Whittaker's cardinal function, J. Approx. Theory 17 (1976) 222–240.
[13] A. Papoulis, Limits on bandlimited signals, Proc. IEEE 55 (1967) 1677–1686.