Signal Processing 162 (2019) 180–188
On recovery of discrete time signals from their periodic subsequences

Nikolai Dokuchaev^{a,b}

^a School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, GPO Box U1987, Perth, 6845 Western Australia
^b ITMO University, St. Petersburg, 197101 Russian Federation
Article history: Received 8 November 2018; Revised 3 April 2019; Accepted 3 April 2019; Available online 4 April 2019.

Abstract: The paper investigates recoverability of discrete time signals (sequences) from their subsequences sampled at m-periodic points. It is shown that this recoverability is associated with arbitrarily small spectrum degeneracy of the subsequences amended via insertion of dummy elements and bundled into a system. It appears that, for a given m, signals of a general kind can be approximated by signals with this feature. A recovery algorithm is suggested, and some robustness of this recovery is established.
2010 MSC: 94A20; 94A12; 93E10

Keywords: Discrete time signals; Digital signals; Sampling; Recovery from subsequences; Spectrum degeneracy

© 2019 Elsevier B.V. All rights reserved.
1. Introduction

Problems of recovery of signals from incomplete observations have been studied intensively in different settings. This theory is now rapidly developing, to capture the growing requirements for the speed of processing and to overcome existing limitations. In general, the possibility of recovery of a signal from a sample is usually associated with restrictions on the class of underlying signals, such as restrictions on the spectrum or sparsity. For continuous-time signals, the classical Sampling Theorem establishes that a band-limited signal can be recovered without error from a discrete sample taken with a sampling rate that is at least twice the maximum frequency of the signal (the Nyquist critical rate). Respectively, a signal of a general type cannot be approximated by processes that are recoverable from their samples taken with insufficient frequency. This leads to major limitations for data compression and recovery. There are many works devoted to generalizations of the sampling theorem and to relaxation of the restrictions imposed by the Nyquist rate; see, e.g., the reviews in [15,23] and the literature therein. For example, a subsequence sampled at sparse enough periodic points can be removed from observations of an oversampling sequence [11]. In addition, a band-limited function is uniquely defined by a semi-infinite half of any periodic
oversampling sequence [10,11,22]. There is also the so-called Papoulis approach [18], which allows one to reduce the sampling rate for continuous time signals given additional measurements at the sampling points; this approach was extended to multidimensional signals [6]. A general model for signals in unions of subspaces, covering a variety of sampling settings including sparsity and band-limitedness, was suggested in [14]. A connection between band-limitedness and recoverability from samples was established for the fractional Fourier transform [1,24,25].

In general, there is a difference between the problem of uniqueness of recovery from a sample and the problem of existence of a stable recovery algorithm. For continuous-time signals, this was investigated in the framework of the approach based on the so-called Landau criterion [12,13]; see [12,13,16] and a recent literature review in [17]. It was found that there are arbitrarily sparse discrete uniqueness sets in the time domain for a fixed spectrum range. Originally, it was shown in [12] that there is a uniqueness set of sampling points representing small deviations of integers for classes of functions with an arbitrarily large measure of the spectrum range, given that there are periodic gaps in the spectrum and that the spectrum range is bounded. The implied sampling points were not equidistant and have to be calculated. This result was extended to functions with an unbounded spectrum range and to sampling points allowing a convenient explicit representation [16]. Some generalizations for the multidimensional case were obtained in [20,21]. As was emphasized in [13], the uniqueness results do
not imply stable data recovery: any sampling below the Landau critical rate cannot be stable. The Landau rate mentioned here is a generalization of the critical Nyquist rate to the case of stable recovery, non-equidistant sampling, and disconnected spectrum gaps.

For the recovery of finite discrete-time signals from incomplete observations, the most advanced method is the so-called compressive sensing, which exploits sparsity of the signals; see, e.g., [2–5]. However, the problem of recovery of discrete time signals from their observed periodic subsequences has not been considered in the literature, with the exception of the work [19]. A possible reason for this is that this recovery requires quite severe restrictions on the spectrum. For example, the missing odd terms of digital signals represented by infinite sequences with Z-transform supported on the set {e^{iω}}_{ω∈(−π/2,π/2]} can be recovered from observations of the even terms, but this recoverability vanishes for a wider spectrum. Clearly, the class of band-limited sequences with this support of the Z-transform is quite tiny and is not dense in ℓ_2. Recoverability of discrete time signals from their m-periodic subsequences for m > 2 would require even more severe restrictions on their Z-transforms. This problem was attacked in [19]; the approach therein requires observing quite a large number of subsequences, and this number increases as the spectrum gap decreases.

The present paper targets the problem of determining classes of discrete time signals that can be recovered from their m-periodic subsequences for arbitrarily large m. More precisely, the goal is to find, for a given integer m ≥ 2, a class of sequences x ∈ ℓ_2 featuring the following properties: (i) x is uniquely defined by its subsequence x(km); (ii) any finite part of x can be recovered from a large enough finite part of the subsequence x(km), and there is a robust recovery procedure; (iii) these sequences x are everywhere dense in ℓ_2, i.e., they can approximate any sequence from ℓ_2.

We found a sought class of these sequences x, defined via arbitrarily small spectrum degeneracy on T = {z ∈ C : |z| = 1} for the Z-transforms of their m-periodic subsequences amended via insertion of dummy elements (Definition 4 and Theorems 1 and 2 below). The corresponding feature of x can be regarded as a special kind of spectrum degeneracy. It can be clarified that this degeneracy does not assume that there are gaps in the spectrum of the signals x featuring it: their Z-transforms can be separated from zero on T. Since these sequences are everywhere dense in ℓ_2, this spectrum degeneracy can be imperceptible.

In addition to the uniqueness result, the paper also suggests a method of recovery of finite parts {x(t)}_{|t|≤M} of signals from their truncated periodic subsequences {x(mt)}_{t: M<|mt|≤N}. In other words, this allows us to replace observations of the entire signal on a certain time interval by observations of a periodic subsequence of this signal on a larger interval; the choice of a large m or M requires a larger N = N(m, M). In addition, a smaller recovery error also requires a larger N. However, the procedure is robust with respect to noise contamination and data truncation (Theorem 3). The proofs presented here use a predictor representing a multi-step version of the one-step predictor from [7]. The choice of the predictor is not unique; for example, the predictor from [8] could also be used.
As mentioned above, a related problem of recovery of missing values of band-limited signals from observations of subsequences was studied in [19]. The approach in [19] requires observing quite a large number of subsequences, and this number increases as the spectrum gap decreases. Our approach is quite different: it requires observing just one subsequence sampled at m-periodic points to recover a sequence with arbitrarily imperceptible spectrum degeneracy in the sense of Definition 4.
The paper is organized as follows. Section 2 presents some definitions and preliminary results on predictability of sequences. Section 3 presents a special type of spectrum degeneracy for sequences and the main results on recoverability of sequences from subsequences sampled at periodic points. Section 4 discusses the numerical implementation and shows some examples. Section 5 contains the proofs. Section 6 presents some discussion.

2. Definitions and preliminary results

2.1. Definitions
We denote by Z the set of all integers. Let Z[k,l] = {t ∈ Z : k ≤ t ≤ l} for k, l ∈ Z. In addition, let Z[k,∞) = {t ∈ Z : t ≥ k} and Z(−∞,k] = {t ∈ Z : t ≤ k}.

For a set G ⊂ Z and r ∈ [1, ∞], we denote by ℓ_r(G) the Banach space of complex valued sequences {x(t)}_{t∈G} such that

‖x‖_{ℓ_r(G)} = ( \sum_{t∈G} |x(t)|^r )^{1/r} < +∞ for r ∈ [1, +∞), and ‖x‖_{ℓ_∞(G)} = sup_{t∈G} |x(t)| < +∞.

We denote ℓ_r = ℓ_r(Z). We denote by B_r(ℓ_2) the closed ball of radius r > 0 in ℓ_2.

Let D = {z ∈ C : |z| < 1} and D^c = {z ∈ C : |z| > 1}. For x ∈ ℓ_2, we denote by X = Z x the Z-transform

X(z) = \sum_{k=−∞}^{∞} x(k) z^{−k},  z ∈ C.

Respectively, the inverse Z-transform x = Z^{−1}X is defined as

x(t) = (1/2π) \int_{−π}^{π} X(e^{iω}) e^{iωt} dω,  t ∈ Z.

For x ∈ ℓ_2, the trace X|_T is defined as an element of L_2(T). Here T = {z ∈ C : |z| = 1}.

Let H_r(D^c) be the Hardy space of functions that are holomorphic on D^c, including the point at infinity, with finite norm

‖h‖_{H_r(D^c)} = sup_{ρ>1} ‖h(ρe^{iω})‖_{L_r(−π,π)},  r ∈ [1, +∞].
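For a finitely supported sequence, the trace X|_T can be evaluated directly from these definitions. A minimal sketch in Python (the helper name and interface are ours, not from the paper):

import numpy as np

def trace_on_T(x, t0, omegas):
    """X(e^{iw}) = sum_t x(t) e^{-iwt} for x supported on t0, t0+1, ..."""
    t = t0 + np.arange(len(x))
    return np.asarray(x) @ np.exp(-1j * np.outer(t, omegas))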
In this paper, we focus on signals without spectrum gaps of positive measure on T but with a spectrum gap at an isolated point of T, where their Z-transforms vanish with a certain rate. For β ∈ (−π, π], q > 1, c > 0, and ω ∈ (−π, π], set

ϱ(ω, β, q, c) = exp( c / |e^{iω} − e^{iβ}|^q ).

For r > 0, let X_β(q, c, r) be the set of all x ∈ ℓ_2 such that

\int_{−π}^{π} |X(e^{iω})|^2 ϱ(ω, β, q, c) dω ≤ r,  (1)
where X = Z x.

Let K_+ be the class of functions h : Z → R such that h(t) = 0 for t < 0 and such that H = Z h ∈ H_∞(D^c). Let K_− be the class of functions h : Z → R such that h(t) = 0 for t > 0 and such that H = Z h ∈ H_∞(D).

2.2. Preliminary results: Predictability of sequences

Definition 1 (Predictability). Let X ⊂ ℓ_2 be a class of processes. We say that the class X is uniformly ℓ_2-predictable if, for any integer n > 0 (or n < 0), there exist ϑ, ψ ∈ ℓ_∞ and a sequence {h_j}_{j≥0} ⊂ K_+ (or a sequence {h_j}_{j≥0} ⊂ K_− if n < 0) such that

\sum_{t∈Z} |x(t + n) − x̂_j(t)|^2 → 0 as j → +∞

uniformly over x ∈ X, where

x̂_j(t) = ϑ(t) \sum_{s∈Z} h_j(t − s) ψ(s) x(s).
In Definition 1, the case where n > 0 and h_j ∈ K_+ corresponds to causal extrapolation, where the estimate of x(t + n) is based on the observations {x(s)}_{s≤t}, and the case where n < 0 and h_j ∈ K_− corresponds to anti-causal extrapolation, where the estimate of x(t + n) is based on the observations {x(s)}_{s≥t}.

Lemma 1. For any β ∈ (−π, π], q > 1, c > 0, and r > 0, the class of sequences X_β(q, c, r) is uniformly predictable on finite horizon in the sense of Definition 1.

Proposition 1. In the setting of Lemma 1, the conditions of Definition 1 are satisfied for ψ(t) = e^{i(β−π)t}, ϑ(t) = ψ(t)^{−1}, and for predicting kernels defined as h_n = h_{n,j} = Z^{−1}H_n, where

H_n(z) = z^n V(z)^n for n > 0,  H_n(z) = z^n V(1/z)^{|n|} for n < 0,  (2)

and where

V(z) = 1 − exp( −γ / (z + α) ),  z ∈ C.

Here α = α(γ) = 1 − γ^{−r}, where r > 0 is fixed and where γ = γ_j → +∞.

These predicting kernels represent a modification of the kernels introduced in [7] for n = 1. It can be noted that the kernels h_n are real valued, since H_n(z̄) = \overline{H_n(z)}.
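The kernels h_n = Z^{−1}H_n have no simple closed form, but for n > 0 they can be approximated by sampling H_n on T and applying an inverse FFT. A minimal numerical sketch (the grid size N and the FFT-based inversion are our implementation choices, not prescribed by the paper):

import numpy as np

def predicting_kernel(n, gamma, alpha, N=4096):
    """Approximate h_n = Z^{-1} H_n for n > 0, where H_n(e^{iw}) = e^{iwn} V(e^{iw})^n
    and V(z) = 1 - exp(-gamma / (z + alpha)), via an inverse DFT on an N-point grid."""
    z = np.exp(2j * np.pi * np.arange(N) / N)
    H = z**n * (1 - np.exp(-gamma / (z + alpha)))**n
    return np.fft.ifft(H).real      # h_n is real valued, so the imaginary part is ~0

h1 = predicting_kernel(1, gamma=3.2, alpha=0.5)
print(np.abs(h1).max())             # ~70 for these parameters; cf. Section 4

The entries h1[0], h1[1], ... approximate h_1(0), h_1(1), ...; the kernel is causal, so the entries at the top of the array (negative times, wrapped by the DFT) are close to zero.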
2.2.1. Robustness of prediction

Consider the problem of predicting {x(t)}_{t∈Z[−M,M]} using observed noise-contaminated sequences x̃ = x + η, where x ∈ X and where η ∈ ℓ_2 represents a noise. We consider two cases: (i) the case of forward predicting, where the truncated traces {x̃(t)}_{t∈Z[−N,−M−1]} are observable, and (ii) the case of backward predicting, where the truncated traces {x̃(t)}_{t∈Z[M+1,N]} are available. Here N > M is a given integer.

Definition 2 (Robust predictability). Let X ⊂ ℓ_2 be a set of sequences. We say that the class X allows uniform and robust prediction if, for any integer M > 0 and any ε > 0, there exist ρ > 0, N_0 > M, ϑ ∈ ℓ_∞, a set of sequences {ψ_t(·)}_{t∈Z[−M,M]} ⊂ ℓ_∞, and a set of kernels {h_t(·)}_{t∈Z[−M,M]} ⊂ K_+ for forward predicting (or a set {h_t(·)}_{t∈Z[−M,M]} ⊂ K_−, respectively, for backward predicting) such that

max_{t∈Z[−M,M]} |x(t) − x̂(t)| ≤ ε

for all x ∈ X, N > N_0, and η ∈ B_ρ(ℓ_2). Here

x̂(t) = ϑ(t) \sum_{s∈T} h_t(t − s) ψ_t(s) x̃(s),  t ∈ Z[−M,M],

where T = Z[−N,−M−1] for forward predicting and T = Z[M+1,N] for backward predicting, respectively.

The following lemma shows that this prediction is robust with respect to noise contamination and data truncation.

Lemma 2. For given β ∈ (−π, π], q > 1, c > 0, r > 0, the class X_β(q, c, r) allows uniform and robust prediction in the sense of Definition 2 with ψ_t(s) = e^{i(β−π)s}, ϑ(t) = ψ_t(t)^{−1}, and with the kernels h_t defined as h_n in Proposition 1, where n = t + M + 1 for forward predicting and n = t − M − 1 for backward predicting.
3. The main results: Recoverability of sequences from subsequences

Definition 3 (Recoverability from a set of observations). Let X ⊂ ℓ_2 be a class of sequences, and let T ⊂ Z be a set. We say that the class X is uniformly recoverable from the observations of x on T if, for any integer M > 0, there exists a set of kernels {h_j(t, s)}_{t∈Z[−M,M], s∈T} ⊂ R, j ≥ 0, such that

max_{t∈Z[−M,M]} |x(t) − x̂_j(t)| → 0 as j → +∞

uniformly over x ∈ X. Here

x̂_j(t) = \sum_{s∈T} h_j(t, s) x(s),  t ∈ Z[−M,M].  (3)

Up to the end of this paper, we assume that we are given an integer m ≥ 2. We consider recovering sequences from their subsequences sampled at m-periodic points.
Definition 4 (Braided spectrum degeneracy). Let q > 1, c > 0, and r > 0 be given. We say that x ∈ ℓ_2 features m-braided spectrum degeneracy if, for d ∈ Z[−m+1,m−1], there exist numbers β_d ∈ (−π, π] and sequences y_d ∈ ℓ_2 such that the following holds:

(i) All β_d are different for different d, and the y_d have Z-transforms vanishing at β_d such that y_d ∈ X_{β_d}(q, c, r).
(ii) Some traces of the sequences y_d coincide, as follows:

y_d(t) = y_0(t),  t ≤ 0,  d > 0,
y_d(t) = y_0(t),  t ≥ 0,  d < 0.

(iii) The sequence x is formed from the shifted sequences y_d as follows (see the code sketch after this definition):

x(t) = \sum_{d=0}^{m−1} y_d((t + d)/m) I_{(t+d)/m ∈ Z},  t ≥ 0,
x(t) = \sum_{d=−m+1}^{0} y_d((t + d)/m) I_{(t+d)/m ∈ Z},  t ≤ 0.  (4)

We denote by β̄ the corresponding set {β_d}_{d=−m+1}^{m−1}, and we denote by B the set of all β̄ with the features described above. We denote by P_{m,β̄}(q, c, r) the set of all sequences x featuring this degeneracy, and we denote

P_{m,β̄} = ∪_{q>1, c>0, r>0} P_{m,β̄}(q, c, r),  P_m = ∪_{β̄∈B} P_{m,β̄}.
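To make the interleaving in (4) concrete: for every t exactly one admissible d satisfies (t + d)/m ∈ Z, so each x(t) is copied from a single y_d. A minimal sketch (the dictionary-of-callables interface is ours, not from the paper):

def braid(y, m, T):
    """Assemble x(t) for |t| <= T from the amended subsequences via (4).
    y maps d in {-m+1, ..., m-1} to a callable y_d; Definition 4(ii) must hold,
    so that the two branches of (4) agree at t = 0."""
    x = {}
    for t in range(-T, T + 1):
        d = (-t) % m if t >= 0 else -(t % m)   # the unique admissible shift
        x[t] = y[d]((t + d) // m)              # x(t) = y_d((t + d)/m)
    return x

# For m = 2: x(2k) = y_0(k); x(2k - 1) = y_1(k), k >= 1; x(2k + 1) = y_{-1}(k), k <= -1.
x = braid({0: lambda k: k, 1: lambda k: k + 0.5, -1: lambda k: k - 0.5}, m=2, T=3)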
Remark 1. By Lemma 1, extrapolations of semi-infinite halves of y_d ∈ X_{β_d}(q, c, r) are unique, and yet the halves of different y_d coincide by Definition 4(ii). This is not a contradiction, since the points of spectrum degeneracy e^{iβ_d} are different for different d.

Theorem 1. For any β̄ ∈ B, any x ∈ ℓ_2, and any ε > 0, there exists x̂ ∈ P_{m,β̄} such that

‖x − x̂‖_{ℓ_2} ≤ ε.  (5)

In particular, the set P_m is everywhere dense in ℓ_2.

Theorem 2. For any β̄ ∈ B, q > 1, c > 0, r > 0, and M ≥ 0, the class P_{m,β̄}(q, c, r) is uniformly recoverable from the observations on the set T = {t ∈ Z : t/m ∈ Z, |t| > M} in the sense of Definition 3.

Corollary 1. (i) Any sequence x ∈ P_m is uniquely defined by its subsequence {x(tm)}_{t∈Z}. (ii) For any M ≥ 0, any sequence x ∈ P_m is uniquely defined by its subsequence {x(mt)}_{t∈Z, |t|>M/m}.

Proposition 2. Under the assumptions of Theorem 2, the conditions of Definition 3 are satisfied by the following procedure.
(i) (a) For t ∈ Z[0,M], identify d ∈ Z[0,m−1] and k ∈ Z, k ≥ 0, such that x(t) = y_d(k), where t = mk − d. (b) For t ∈ Z[0,M], predict x(t) = y_d(k) using the available observations {y_d(p)}_{p<−M/m} = {x(mp)}_{p<−M/m} and the n-step predicting algorithm from Proposition 1 with ψ_d(t) = e^{i(β_d−π)t}, ϑ_d(t) = ψ_d(t)^{−1}, and with the predicting kernels defined by (2) with γ = γ_j → +∞, where n = k − max{p ∈ Z : p < −M/m} for t ≥ 0.

(ii) (a) For t ∈ Z[−M,−1], identify d ∈ Z[−m+1,−1] and k ∈ Z, k ≤ 0, such that x(t) = y_d(k), t = mk − d. (b) For t ∈ Z[−M,−1], predict (backward) x(t) = y_d(k) using the available observations {y_d(p)}_{p>M/m} = {x(mp)}_{p>M/m} with ψ_d(t) = e^{i(β_d−π)t}, ϑ_d(t) = ψ_d(t)^{−1}, and with the predicting kernels defined by (2) with γ = γ_j → +∞, where n = k − min{p ∈ Z : p > M/m}.

(iii) The kernels h_j(t, s) presented in Definition 3 can be represented via the sequences ϑ_d, ψ_d, and h_n = h_{n,j} with γ = γ_j → +∞. (A sketch of the index bookkeeping of this procedure is given after the following comparison.)

Let us compare these results with the result of [19], where a method was suggested for the recovery of missing values of oversampling sequences from observations of periodic subsequences for band-limited continuous time functions. The method of [19] is also applicable to any band-limited sequences from ℓ_2. The algorithm in [19] requires observing quite a large number of subsequences, and this number increases as the size of the spectrum gap on T decreases. Theorems 1 and 2 ensure recoverability with just one subsequence, for sequences that are everywhere dense in ℓ_2.
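The index bookkeeping in step (i) of this procedure reduces to modular arithmetic. A minimal sketch (the function name and interface are ours):

def recovery_plan(t, M, m):
    """For t in [0, M]: the subsequence index d, the position k with x(t) = y_d(k)
    (i.e. t = m*k - d), and the forward prediction horizon n of Proposition 2(i)."""
    d = (-t) % m                  # d in {0, ..., m-1} such that (t + d)/m is an integer
    k = (t + d) // m
    p_max = (-M - 1) // m         # max{p in Z : p < -M/m}
    return d, k, k - p_max

d, k, n = recovery_plan(t=3, M=4, m=2)   # -> d = 1, k = 2, n = 5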
Fig. 1. Traces of the predicting kernel h1 (t) for γ = 3.2 and α = 0.5.
Robustness of recovery

Consider the problem of recovery of the set {x(t)}_{t∈Z[−M,M]}, where M ∈ Z[0,+∞), from observed subsequences of noise-contaminated sequences x̃ = x + η, where x ∈ P_{m,β̄}(q, c, r) and where η ∈ ℓ_2 represents a noise. Suppose that only the truncated traces {x̃(t)}_{t: t/m∈Z, M<|t|≤N} are available, where N > 0 is an integer. The following theorem shows that the recovery of a finite part of a sequence from P_m from its subsequence sampled at m-periodic points is robust with respect to noise contamination and data truncation.
Theorem 3. Let β̄ ∈ B, q > 1, c > 0, and r > 0 be given. Then for any integer M > 0 and any ε > 0, there exist ρ > 0 and N_0 > 0 such that

max_{t∈Z[−M,M]} |x(t) − x̂(t)| ≤ ε  ∀x ∈ P_{m,β̄}(q, c, r), ∀η ∈ B_ρ(ℓ_2),

for any N > N_0 and for

x̂(t) = \sum_{s∈Z: M<|ms|≤N} h_j(t, s) x̃(ms),  t ∈ Z[−M,M],
where the kernels h_j(t, s) are the ones mentioned in Proposition 2(iii).

4. On numerical implementation

To prove the theoretical results on uniqueness and recoverability presented above, we used the predicting kernels h_n defined in Proposition 1. It can be noted that this predicting method is neither unique nor optimal; other methods could also be used.

Let us outline some essential features of the kernels h_n. Their transfer functions H_n(e^{iω}), H_n = Z h_n, are supposed to approximate the transfer functions e^{iωn} corresponding to the shift n steps forward. The functions H_n are defined by (2), where V is such that

|1 − V(e^{iω})| = exp( −γ (cos ω + α) / ((cos ω + α)^2 + sin^2 ω) ).

We may expect a good approximation of the transfer function e^{iωn} by H_n(e^{iω}) where |1 − V(e^{iω})| is small, which we may have only for large enough γ and only for ω ∈ (−W(α), W(α)), where W(α) = arccos(−α); we have that |1 − V(e^{iω})| < 1 therein.
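These two regimes of |1 − V(e^{iω})| are easy to check numerically. A minimal sketch (the grid resolution is our choice):

import numpy as np

gamma, alpha = 3.2, 0.5
w = np.linspace(-np.pi, np.pi, 10001)
V = 1 - np.exp(-gamma / (np.exp(1j * w) + alpha))
W = np.arccos(-alpha)                      # band edge W(alpha), here 2*pi/3
print(np.abs(1 - V)[np.abs(w) < W].max())  # < 1 on (-W, W)
print(np.abs(1 - V).max())                 # = exp(gamma/(1 - alpha)) ~ 601.85 at w = pi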
Fig. 2. The distance |H_1(e^{iω}) − e^{iω}| for H_1 = Z h_1 with γ = 3.2, α = 0.5, on [−π, π].
On the other hand,

max_{ω∈(−π,π]: |ω|>W(α)} |1 − V(e^{iω})| = exp( γ/(1 − α) ).
Clearly, this value grows fast when either γ grows or α approaches 1. Some examples:

• If γ = 6 and α = √3/2 ≈ 0.8660, then W(α) = 5π/6, |1 − V(e^{iπ})| = 2.81·10^{19}, and ‖h_n‖_∞ = 1.66·10^{17}. (In this case, we have γ^{−r} = 1 − √3/2 in (2).)
• If γ = 6 and α = 0.5, then W(α) = 2π/3, |1 − V(e^{iπ})| = 162754.79, and ‖h_n‖_∞ = 7823.89.
• If γ = 3.2 and α = 0.5, then W(α) = 2π/3, |1 − V(e^{iπ})| = 601.85, and ‖h_n‖_∞ = 70.194.

Fig. 1 shows the values of h_1(t) for γ = 3.2 and α = 0.5. Fig. 2 shows the distance |H_1(e^{iω}) − e^{iω}| from the transfer function e^{iω} of the ideal one-step predictor for γ = 3.2 and α = 0.5. Respectively, we have that: (i) ‖h_n‖_∞ is large if either γ is large or α is close to 1; (ii) the values |h_n(t)| decay slowly as t → +∞. These features mean that an accurate prediction would require high computational power for calculations with large numbers and with long traces of observations. Some examples of numerical experiments for the predicting algorithm based on the transfer function (2) can be found in [9].

To illustrate the possibility of recovering a sequence from a subsequence, we created an example of recovery of the terms at odd times from observations at even times. This corresponds to our setting with m = 2. More exactly, we consider the prediction of x(1) given the observations {x(t)}_{t≤0: t/2∈Z} for x ∈ P_{m,β̄}(q, c, r).

The setting discussed above does not involve stochastic processes and probability measures; it is oriented toward extrapolation of sequences in a pathwise deterministic setting. However, to provide sufficiently large sets of input sequences for statistical estimation, we used signals x generated via Monte-Carlo simulation as a stochastic process. Furthermore, the theorems presented above focus on processes y_d with spectrum degeneracy located at isolated points of T; for the numerical examples, however, we considered processes y_d with spectrum gaps on T of positive measure. We created the input sequences x and y_d for each Monte-Carlo simulation using the following steps.

(i) Create a path x̄ of a discrete time Gaussian process as
z(t) = A z(t − 1) + η(t),  x̄(t) = cᵀ z(t).

Here z(t) is a process with values in R^ν, where ν > 0 is an integer, and c ∈ R^ν. The process η represents a noise with values in R^ν, and A is a matrix with values in R^{ν×ν}. In each simulation, we selected random and mutually independent ν ∈ {1,...,10}, A, c, and η. We selected A and c with independent components from the uniform distribution on (0, 1), with the spectrum of A in the circle 0.8·T. The process η was selected as a stochastic discrete time Gaussian white noise with values in R^ν such that Eη(t) = 0 and E|η(t)|^2 = 1.

(ii) Create sequences ȳ_d for d = −1, 0, 1, as described in Definition 4, such that (4) holds with m = 2, x = x̄, and y_d = ȳ_d.

(iii) Transform the sequences ȳ_d into sequences y_d such that y_1(t)|_{t≤0} = y_0(t)|_{t≤0}, y_{−1}(t)|_{t≥0} = y_0(t)|_{t≥0}, and Y_d(e^{iω}) = 0 for e^{iω} ∈ J_d(δ), where Y_d = Z y_d, and where

J_{−1}(δ) = {e^{iω} : min(|ω − π/4|, |ω + π/4|) < δ_{−1}},
J_0(δ) = {e^{iω} : min(|ω − π/2|, |ω + π/2|) < δ_0},
J_1(δ) = {e^{iω} : min(|ω − π|, |ω + π|) < δ_1},

for some selected δ_d > 0. For this step, we used the formulas from the proof of Theorem 1 below; for the calculation of the Z-transforms required in these formulas, we used truncated sequences defined for t ∈ Z[−N,N] for some N > 0.

(iv) Using y_d, create the process x defined by (4). In particular, this means that x(1) = y_1(1) and x(2t) = y_1(t) = y_0(t) for t ≤ 0.

Finally, for each set of sequences (x, y_{−1}, y_0, y_1), we calculated the estimate of x(1) as the prediction ŷ_1(1) of y_1(1) using the observations y_1(t)|_{t≤0}. (Recall that y_1(t)|_{t≤0} = y_0(t)|_{t≤0} = x(2t)|_{t≤0}.) We used truncated predicting kernels h_n(t) defined on t ∈ Z[−2M,2M]. We calculated the average of the relative error

|ŷ_1(1) − x(1)| / ( (1/(2N+1)) \sum_{t=−N}^{N} x(t)^2 )^{1/2}.
For 5000 Monte-Carlo trials with N = 70, γ = 3.2, δ_{−1} = π/18, δ_0 = π/9, δ_1 = π/3, the average of this error was 0.083185. The standard deviation of the error was 0.20763, and the median of the error was 0.061408. The calculation on a standard PC took several hours; the most time-consuming part was the construction of y_d in Step (iii). As can be seen from Fig. 1, the predicting kernel has a heavy tail that was excluded by the truncation; actually, only the first cluster of large values of h_1 was involved in the calculations. Further improvement of the accuracy would require more computational power.
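The path generation in Step (i) can be sketched as follows (rescaling A to spectral radius 0.8 is our reading of the requirement that the spectrum of A lie in the circle 0.8·T; the seed and the function name are ours):

import numpy as np

rng = np.random.default_rng(0)

def simulate_path(n):
    """Step (i): x_bar(t) = c^T z(t), where z(t) = A z(t-1) + eta(t)."""
    nu = int(rng.integers(1, 11))               # dimension nu in {1, ..., 10}
    A = rng.uniform(0.0, 1.0, (nu, nu))
    A *= 0.8 / np.max(np.abs(np.linalg.eigvals(A)))   # spectrum of A in 0.8*T
    c = rng.uniform(0.0, 1.0, nu)
    z = np.zeros(nu)
    x = np.empty(n)
    for i in range(n):
        z = A @ z + rng.standard_normal(nu)     # eta: Gaussian white noise
        x[i] = c @ z
    return x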
Fig. 3. |Y_0(e^{iω})|, |Y_1(e^{iω})|, and |X(e^{iω})|, where Y_0 = Z y_0, Y_1 = Z y_1, X = Z x, and where x is formed from the simulated {y_d}_{d=−1,0,1} according to (4).
Fig. 3 shows the absolute values |Y_0(e^{iω})|, |Y_1(e^{iω})|, and |X(e^{iω})|, where Y_0 = Z y_0, Y_1 = Z y_1, X = Z x, and where x is compounded from the simulated signals y_0, y_{−1}, and y_1. In particular, it can be seen that the spectrum of x does not vanish on T, even though x features spectrum degeneracy in the sense of Definition 4, meaning that it is compounded from processes with spectrum gaps.

5. Proofs

Proof of Lemma 1. The lemma follows immediately from Proposition 1 given below.

Proof of Proposition 1 for the case where β = π. The proof represents a modification of the proof from [7], where the case n = 1 was considered. It suffices to consider predicting n steps forward with n > 0; the case where n < 0 can be considered similarly. Let W(α) = arccos(−α), let D_+(α) = (−W(α), W(α)), and let D(α) = [−π, π] \ D_+(α). We have that cos(W(α)) + α = 0, that cos(ω) + α > 0 for ω ∈ D_+(α), and that cos(ω) + α < 0 for ω ∈ D(α). It was shown in Dokuchaev [7] that the following holds:

(a1) V(z) ∈ H_∞(D^c) and zV(z) ∈ H_∞(D^c).
(a2) V(e^{iω}) → 1 for all ω ∈ (−π, π) as γ → +∞.
(a3) If ω ∈ D_+(α), then Re [γ/(e^{iω} + α)] > 0, |V(e^{iω}) − 1| ≤ 1, and |V(e^{iω})| ≤ 2.
(a4) For any c > 0, there exists γ_0 > 0 such that, for any γ ≥ γ_0,

|V(e^{iω}) − 1| ϱ(ω, π, q, c)^{−1} ≤ 1  ∀ω ∈ D(α).

Clearly, this implies that the following holds for any n ≥ 0:

(b1) V(z)^n ∈ H_∞(D^c) and H_n(z) = z^n V(z)^n ∈ H_∞(D^c).
(b2) V(e^{iω})^n → 1 and H_n(e^{iω}) → e^{inω} for all ω ∈ (−π, π) as γ → +∞.
(b3) If ω ∈ D_+(α), then Re [γ/(e^{iω} + α)] > 0 and |(V(e^{iω}) − 1)^n| ≤ 1.

This suggests that H_n(e^{iω}) approximates e^{inω} as γ → +∞ on a part of T, and therefore that h_n = Z^{−1}H_n can possibly be used as the kernels of linear n-step predictors. We prove below that these kernels are as required in Proposition 1.

Let p = {p_k} be such that a^n − 1 = \sum_{k=0}^{n} p_k (a − 1)^k. In this case,

V(e^{iω})^n − 1 = \sum_{k=0}^{n} p_k (V(e^{iω}) − 1)^k.

Furthermore, we have that

|V(e^{iω})^n − 1| ≤ \sum_{k=0}^{n} |p_k| |V(e^{iω}) − 1|^k

and

ϱ(ω, π, q, nc/2)^{−1} = ϱ(ω, π, q, c)^{−n/2} ≤ ϱ(ω, π, q, c)^{−k/2},  k = 1, ..., n − 1.

Let C(n) = \sum_{k=0}^{n} |p_k|. By the choice of the function ϱ and by property (a4) listed above, it follows that, for any c > 0, there exists γ_0 > 0 such that, for any γ > γ_0,

|V(e^{iω}) − 1|^k ϱ(ω, π, q, c)^{−k/2} ≤ 1  ∀ω ∈ (−π, π),  k = 1, ..., n.

This implies that the following analog of property (a4) holds:

(b4) For any c > 0, there exists γ_0 > 0 such that

|V(e^{iω})^n − 1| ϱ(ω, π, q, c)^{−1/2} ≤ C(n)  ∀ω ∈ (−π, π), ∀γ > γ_0.

Consider an arbitrary x ∈ X_π(q, c, r). We denote X = Z x and X̂ = H_n X. Let y(t) = x(t + n) and X_n = Z y. In this case, X_n(z) = z^n X(z). We have that

‖X̂(e^{iω}) − X_n(e^{iω})‖_{L_1(−π,π)} = I_1 + I_2,

where

I_1 = \int_{D(α)} |X̂(e^{iω}) − X_n(e^{iω})| dω,  I_2 = \int_{D_+(α)} |X̂(e^{iω}) − X_n(e^{iω})| dω.

We have that

I_1 = ‖X̂(e^{iω}) − X_n(e^{iω})‖_{L_1(D(α))} = ‖(H_n(e^{iω}) − e^{iωn}) X(e^{iω})‖_{L_1(D(α))}
 ≤ ‖(V(e^{iω})^n − 1) ϱ(ω, π, q, c)^{−1/2}‖_{L_2(D(α))} ‖X(e^{iω}) ϱ(ω, π, q, c)^{1/2}‖_{L_2(−π,π)}.

Clearly, α = α(γ) → 1 and mes D(α) → 0 as γ → +∞. It follows that, for any c > 0,

\int_{D(α)} |V(e^{iω})^n − 1|^2 ϱ(ω, π, q, c)^{−1} dω → 0 as γ → +∞.  (6)

By (6), it follows that I_1 → 0 as γ → +∞ uniformly over x ∈ X_π(q, c, r).

Let us estimate I_2. We have that

I_2 = \int_{D_+(α)} |(H_n(e^{iω}) − e^{iωn}) X(e^{iω})| dω ≤ \int_{D_+(α)} |e^{iωn} (V(e^{iω})^n − 1) X(e^{iω})| dω
 ≤ ‖V(e^{iω})^n − 1‖_{L_2(D_+(α))} ‖X(e^{iω})‖_{L_2(−π,π)}.

Further, I_{D_+(α)}(ω) |V(e^{iω})^n − 1| → 0 a.e. as γ → +∞. By the properties of V,

I_{D_+(α)}(ω) |V(e^{iω})^n − 1| ≤ 2^n C(n).

From the Lebesgue Dominance Theorem, it follows that ‖V(e^{iω})^n − 1‖_{L_2(D_+(α))} → 0 as γ → +∞. It follows that I_1 + I_2 → 0 uniformly over x ∈ X_π(q, c, r). Hence

sup_{k∈Z} |x(k + n) − x̂(k)| → 0 as γ → +∞

uniformly over x ∈ X_π(q, c, r), where the signal x̂ is the output of the linear predictor defined by the kernel h_n = Z^{−1}H_n as

x̂(k) = \sum_{p=−∞}^{k} h_n(k − p) x(p).

Hence the predicting kernels h_n = Z^{−1}H_n constructed for γ → +∞ are as required. This completes the proof of Proposition 1 for the case where β = π.

Proof of Proposition 1 for an arbitrarily selected β ∈ (−π, π]. Let θ = β − π, and let

ψ(t) = e^{iθt},  ϑ(t) = ψ(t)^{−1},  x_β(t) = ψ(t) x(t),  t ∈ Z.

Let x ∈ X_β(q, c, r) for some β, q, c, r. Let X_β = Z x_β and X = Z x. In this case,

X_β(e^{iω}) = \sum_{k∈Z} e^{−iωk} x_β(k) = \sum_{k∈Z} e^{−iωk} ψ(k) x(k) = \sum_{k∈Z} e^{−iωk+iθk} x(k) = X(e^{iω−iθ}).

Hence x_β ∈ X_π(q, c, r). In particular, we have that

X_β(e^{iπ}) = X(e^{iπ−iθ}) = X(e^{iβ}) = 0.

As we have established above, the set of {x_β} is predictable in the sense of Definition 1. Let h_n be the corresponding predicting kernel whose existence is required by Definition 1 for the case where ψ(t) ≡ 1, such that the approximation x̂_β(t) of x_β(t + n) is given by

x̂_β(t) = \sum_{k=−∞}^{t} h_n(t − k) x_β(k).

These kernels were constructed in the proof above for the special case β = π. It follows that

x̂(t) = ϑ(t) \sum_{k=−∞}^{t} h_n(t − k) ψ(k) x(k).

This completes the proof of Proposition 1 as well as of Lemma 1.
Proof of Lemma 2. It suffices to show that the error of recovery of a single term x(n), for a given integer n > 0, from the sequence {x(k)}_{k∈Z, k≤0} can be made arbitrarily small. The case where n < 0 and where {x(k)}_{k∈Z, k≥0} is observable can be considered similarly.

We assume that N > n. Let us consider an input sequence x̃ ∈ ℓ_2 such that

x̃ = x + η̃,  (7)

where η̃(k) = η(k) I_{|k|≤N} − x(k) I_{|k|>N}. In this case, (7) implies that x̃(k) = (x(k) + η(k)) I_{|k|≤N}.

Let X = Z x, let X̃ = Z x̃, and let H̃ = Z η̃. Let σ = ‖H̃(e^{iω})‖_{L_1(−π,π)}; this parameter represents the intensity of the noise.

Let us assume first that β = π, and let an arbitrarily small ε > 0 be given. Let x̄(t) = Z^{−1}(H_n X)(t) and x̂(t) = Z^{−1}(H_n X̃)(t), where n = n(t) = M + 1 + t. These values represent the estimates of x(t) based on the observations {x(k)}_{k<−M} and {x̃(k)}_{k<−M} respectively, defined by the kernels h_n = Z^{−1}H_n as described in Definition 1 with ψ ≡ 1. Assume that the parameters (γ, r) of H_n in (2) are selected such that

sup_{t∈Z[−M,M]} |x̄(t) − x(t)| ≤ ε/2  ∀x ∈ X_π(q, c, r);  (8)

this choice is possible by Proposition 1. The definitions imply that x̂(t) = x̄(t) + Z^{−1}(H_n H̃)(t). Hence

|x̂(t) − x(t)| ≤ |x̄(t) − x(t)| + |Z^{−1}(H_n H̃)(t)| ≤ ε/2 + E_η,

where

E_η = max_{n∈Z[1,2M+1]} (2π)^{−1} ‖(H_n(e^{iω}) − e^{iωn}) H̃(e^{iω})‖_{L_1(−π,π)} ≤ σ(κ + 1),

and where

κ = max_{n∈Z[1,2M+1]} sup_{ω∈[−π,π]} |H_n(e^{iω})|.

We have that σ → 0 as N → +∞ and ‖η‖_{ℓ_2} → 0. If N is large enough and ‖η‖_{ℓ_2} is small enough that σ(κ + 1) ≤ ε/2, then sup_{t∈Z[−M,M]} |x̂(t) − x(t)| ≤ ε. This completes the proof of Lemma 2 for the case where β = π.

The proof for the case where β ≠ π follows from the case where β = π, since the processes x(t), x̄(t), x̃(t), and x̂(t) presented in the proof for β = π are simply converted to the processes e^{−iθt}x(t), e^{−iθt}x̄(t), e^{−iθt}x̃(t), and e^{−iθt}x̂(t) for θ = β − π, similarly to the proof of Proposition 1. This completes the proof of Lemma 2.

Remark 2. By the properties of H_n, we have that κ → +∞ as γ → +∞. This implies that, for any given σ > 0, the error bound ε/2 + E_η above will actually increase if ε → 0. This means that, in practice, the predictor should not target too small a size of the error, since it is impossible to ensure that σ = 0, due to inevitable data truncation.

To proceed further, let us introduce mappings M_d : ℓ_2 → ℓ_2 for d ∈ Z[−m+1,m−1] such that, for x ∈ ℓ_2, the sequence x_d = M_d x is defined by insertion of |d| new members equal to x(0), as follows:

(i) x_0 = x;
(ii) for d > 0:
 x_d(k) = x(k), k < 0,
 x_d(k) = x(0), k = 0, 1, ..., d,
 x_d(k) = x(k − d), k ≥ d + 1;
(iii) for d < 0:
 x_d(k) = x(k), k > 0,
 x_d(k) = x(0), k = d, d + 1, ..., 0,
 x_d(k) = x(k − d), k ≤ d − 1.

Let subseq_{m,s} : ℓ_2 → ℓ_2 be the mapping representing extraction of a subsequence such that y(k) = x(km − s) for all k ∈ Z, for y = subseq_{m,s} x.
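A quick sanity check of the mappings M_d and subseq_{m,s} (sequences represented as callables; an illustrative interface, not from the paper):

def amend(x, d):
    """x_d = M_d x: insert |d| extra copies of x(0) next to the origin."""
    if d == 0:
        return x
    if d > 0:
        return lambda k: x(k) if k < 0 else (x(0) if k <= d else x(k - d))
    return lambda k: x(k) if k > 0 else (x(0) if k >= d else x(k - d))

def subseq(x, m, s=0):
    """y = subseq_{m,s} x, i.e. y(k) = x(k*m - s)."""
    return lambda k: x(k * m - s)

# For d > 0, y_d = subseq_{m,0} M_d x satisfies y_d(k) = y_0(k) for all k <= 0,
# which is the coincidence of halves used in the proof of Theorem 1 below:
x = lambda t: float(t)
y0, y2 = subseq(amend(x, 0), 3), subseq(amend(x, 2), 3)
assert all(y2(k) == y0(k) for k in range(-5, 1))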
Proof of Theorem 1. Let I = Z[−m+1,m−1]. In the proof below, we consider d ∈ I. For δ > 0, consider a set {J_d(δ)}_{d∈I} of mutually disjoint open measurable subsets of T such that mes J_d(δ) ≤ δ.

Let x ∈ ℓ_2 be arbitrarily selected. Let x_d = M_d x, let y_d = subseq_{m,0} x_d, and let Y_d = Z y_d. Further, let

Ŷ_0(e^{iω}) = Y_0(e^{iω}) I_{ω∉J_0(δ)} − \sum_{p∈I, p≠0} Y_p(e^{iω}) I_{ω∈J_p(δ)},

let

ξ_d = y_d − y_0,  Ξ_d = Z ξ_d,

and let

Ŷ_d = Y_d + Ŷ_0 − Y_0,  d ≠ 0.

Let ŷ_d = Z^{−1}Ŷ_d for d ∈ I. Since y_d(k) = y_0(k) for k ≤ 0 and d > 0, and y_d(k) = y_0(k) for k ≥ 0 and d < 0, it follows that ξ_d(k) = 0 for k ≤ 0 and d > 0, and ξ_d(k) = 0 for k ≥ 0 and d < 0. In addition, Ξ_0 ≡ 0 and ξ_0 ≡ 0. Hence, since ŷ_d − ŷ_0 = Z^{−1}(Y_d − Y_0) = ξ_d,

ŷ_d(k) = ŷ_0(k) for k ≤ 0 and d > 0,  and  ŷ_d(k) = ŷ_0(k) for k ≥ 0 and d < 0.

Further, by the definitions, we have that

Ŷ_d(e^{iω}) = Y_d(e^{iω}) − Y_0(e^{iω}) + Y_0(e^{iω}) I_{ω∉J_0(δ)} − \sum_{p∈I, p≠0} Y_p(e^{iω}) I_{ω∈J_p(δ)} = Y_d(e^{iω}) − \sum_{p∈I} Y_p(e^{iω}) I_{ω∈J_p(δ)}.

Hence

Ŷ_d(e^{iω}) = Y_d(e^{iω}) I_{ω∉J_d(δ)} − \sum_{p∈I, p≠d} Y_p(e^{iω}) I_{ω∈J_p(δ)}.

Since the sets J_d(δ) are mutually disjoint, it follows that Ŷ_d(e^{iω})|_{ω∈J_d(δ)} ≡ 0 for all d. Hence ŷ_d ∈ X_{β_d}(q, c, r) for all d and for some r > 0. Let x̂ be defined by (4) with ŷ_d in place of y_d. By the definitions, it follows that, for any δ > 0, q > 1, c > 0, we have that x̂ ∈ ∪_{r>0} P_{m,β̄}(q, c, r) for β̄ = {β_d} such that e^{iβ_d} ∈ J_d(δ).

Clearly, for d = 0, we have that

‖Ŷ_d(e^{iω}) − Y_d(e^{iω})‖_{L_2(−π,π)} → 0 as δ → 0.  (9)

By the definition of Ŷ_d for d ≠ 0, it follows that (9) holds for all d ∈ I. It follows that

‖ŷ_d − y_d‖_{ℓ_2} → 0 as δ → 0.

Let the mappings τ_d : Z → Z be defined by (4) such that

y_d(k) = x(τ_d(k)),  ŷ_d(k) = x̂(τ_d(k)),  k ∈ Z, d ∈ I.

By the definitions, it follows that

ŷ_d(k) − y_d(k) = x̂(τ_d(k)) − x(τ_d(k)),  k ∈ Z.  (10)

Finally, we obtain (5) from (9) and (10). This completes the proof of Theorem 1.

Proof of Theorem 2. The theorem follows from Proposition 2, whose proof is given below.

Proof of Proposition 2. Let x ∈ P_{m,β̄}(q, c, r) and d ∈ Z[−m+1,m−1].
By the definitions, (4) holds with y_d = subseq_{m,0} x_d, where x_d = M_d x. For d ≥ 0, we have that y_d(k) = x(km − d) for k ≥ 1 and
y_d(k) = x(km) for k ≤ 0. For d < 0, we have that y_d(k) = x(km − d) for k ≤ −1 and y_d(k) = x(km) for k ≥ 0. Let Y_d be the set of all these y_d for all x ∈ P_{m,β̄}(q, c, r). By Lemma 1, the sets Y_d are uniformly predictable in the sense of Definition 1. It follows that, for any M > 0 and d ≥ 0 (or d ≤ 0), the sets Y_d are uniformly recoverable in the sense of Definition 3 from the observations of {y_d(k)} on {k ∈ Z : k < −M/m} (or from the observations of {y_d(k)} on {k ∈ Z : k > M/m} if d < 0).

Let us consider the case where d ≥ 0. By Proposition 1, y_d(k) for k ≥ 0 can be recovered with selected ϑ_d, ψ_d ∈ ℓ_∞ and with kernels h_{n_+(k)}, where n_+(k) = k − max{l ∈ Z : lm < −M}, based on the observations {y_d(k)}_{k<−M/m} = {y_0(k)}_{k<−M/m} = {x(mk)}_{k<−M/m}, as described in Definition 3, with the estimate

ŷ_d(k) = ϑ_d(k) \sum_{s<−M/m, s∈Z} h_{n_+(k)}(k − s) ψ_d(s) y_d(s) = ϑ_d(k) \sum_{s<−M/m, s∈Z} h_{n_+(k)}(k − s) ψ_d(s) x(ms),

which can be arbitrarily precise, i.e., such that

|ŷ_d(k) − y_d(k)| → 0 as γ → +∞.

This allows us to recover x(t) for t ∈ Z[0,M] from the observations {x(mk)}_{k<−M/m}. For this, we select d ∈ Z[0,m−1] such that (t + d)/m ∈ Z, i.e., x(t) = y_d(k) and t = km − d for some k ∈ Z. In this case, we can accept the value ŷ_d(k) = x̂(t) as an estimate of x(t) for t = τ_d(k), i.e.,

x̂(t) = ϑ_d(k(t)) \sum_{s<−M/m, s∈Z} h_{n_+(k(t))}(k(t) − s) ψ_d(s) x(ms),  (11)

where t ∈ Z[0,M] and d ∈ Z[0,m−1] are such that (t + d)/m ∈ Z, and where k(t) = τ_d^{−1}(t) = (t + d)/m.

Similar arguments establish recoverability of x(t) for t ∈ Z[−M,−1] from the observations {x(mk)}_{k>M/m}. The value ŷ_d(k) = x̂(t) is again an estimate of x(t) for t = τ_d(k), i.e.,

x̂(t) = ϑ_d(k(t)) \sum_{s>M/m, s∈Z} h_{n_−(k(t))}(k(t) − s) ψ_d(s) x(ms),  (12)

where t ∈ Z[−M,−1] and d ∈ Z[−m+1,−1] are such that (t + d)/m ∈ Z, and where k(t) = τ_d^{−1}(t) = (t + d)/m.

The representation of x̂(t) as

x̂(t) = I_{t∈Z[0,M]} \sum_{d∈Z[0,m−1]} ŷ_d(k(t)) I_{(t+d)/m∈Z} + I_{t∈Z[−M,−1]} \sum_{d∈Z[−m+1,−1]} ŷ_d(k(t)) I_{(t+d)/m∈Z},

combined with (11)–(12), gives that the kernels h_j(t, s) required in Definition 3 are defined via ψ_d, ϑ_d, and h_n = h_{n,j} calculated with γ = γ_j. This completes the proof of Proposition 2; Theorem 2 follows immediately.

Corollary 1 follows immediately from Theorem 2.

Proof of Theorem 3. The theorem follows from Lemma 2, applied to the robust prediction of the corresponding subsequences, similarly to the proof of Proposition 2.

Remark 3. In the setting of Corollary 1(i), the predictability of the process y_0 is not required. In this case, one can select J_0(δ) = ∅ in the proof of Theorem 1.

Remark 4. Instead of the predicting kernels h_1 tailored to processes with Z-transform vanishing at z = −1, one can use kernels oriented toward other processes. For example, similar kernels can be constructed for processes with Z-transform vanishing at z = 1, with V(z) replaced by 1 − exp(−γ/(α − z)).

Remark 5. If the processes y_d have fixed spectrum gaps on T of non-zero measure, then one can use the predictors described in Proposition 1 with γ → +∞ and with a fixed α ∈ (−1, 1) independent of γ; this α is defined by the size of the gaps.

Remark 6. Let δ > 0, and let U_{m,β̄}(δ) be the set of all x ∈ ℓ_2 represented as in Definition 4 with y_d ∈ ℓ_2 such that Y_d(e^{iω}) = 0 if |e^{iω} − e^{iβ_d}| ≤ δ, for Y_d = Z y_d, where β̄ = {β_d} with different β_d ∈ (−π, π]. Then the set U_{m,β̄}(δ) represents a closed subspace of the Hilbert space ℓ_2, and the process x̂ constructed in the proof of Theorem 1 represents a projection of x onto this subspace.

Remark 7. Since the kernels h_n are real valued, the results of the paper remain valid if we assume that all processes in the time domain are real valued and that the ℓ_r are spaces of real valued sequences.

6. Discussion and future developments

The paper shows that recoverability of sequences from their subsequences sampled at m-periodic points is associated with a certain spectrum degeneracy of a new kind, and that, for a fixed m, a sequence of a general kind can be approximated by sequences featuring this degeneracy. Some robustness of the recovery of finite samples is established.

For the recovery of discrete time signals from a relatively small amount of incomplete observations, the most common approach is compressive sensing. It was found that recovery of a sparse finite signal with N values and with S << N nonzero values at unknown locations requires M = 2S observations of the Fourier coefficients if N is a prime number, and that the required M increases quite slowly with S if N is non-prime; see, e.g., [2–5] and the literature therein. This approach has been proven to be effective for sparse signals and numerically sustainable; however, the set of admissible sparse signals is relatively tiny. The approach suggested in the present paper has the following distinctive feature: sparsity of the admissible signals is not required, and the required "braided" degeneracy can be arbitrarily small and even imperceptible.

This was only the very first step in attacking the problem; the numerical implementation for data compression and recovery is yet to be developed. There are many open questions and additional opportunities. The choice of a predictor, the choice of a system of amended subsequences {y_d}, and the choice of a version of the "braided" degeneracy are all non-unique. Other systems {y_d} and other predictors could be used; for example, the predictors suggested in [8] or in Remark 4 could be implemented. Further, the results on recovery from subsequences can be applied to samples of continuous time signals. Analogs of the braided spectrum degeneracy can be defined for continuous time functions. Also, it could be interesting to extend this approach to signals defined on discrete multidimensional lattices; the current approach relies on the natural concept of 1D prediction and is not directly applicable in that case.
Acknowledgements

This work was supported by grant 08-08 of the Government of the Russian Federation. In addition, the author thanks the Handling Editor and the anonymous referees for their valuable suggestions, which helped to improve the paper.
References

[1] A. Bhandari, P. Marziliano, Sampling and reconstruction of sparse signals in fractional Fourier domain, IEEE Signal Process. Lett. 17 (3) (2010) 221–224.
[2] T. Cai, G. Xu, J. Zhang, On recovery of sparse signals via ℓ1 minimization, IEEE Trans. Inf. Theory 55 (7) (2009) 3388–3397.
[3] E. Candès, J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory 52 (2) (2006) 489–509.
[4] E. Candès, J. Romberg, Sparsity and incoherence in compressive sampling, Inverse Probl. 23 (3) (2007) 969–985.
[5] D.L. Donoho, Compressed sensing, IEEE Trans. Inf. Theory 52 (2006) 1289–1306.
[6] K.F. Cheung, R.J. Marks II, Imaging sampling below the Nyquist density without aliasing, J. Opt. Soc. Amer. 7 (1) (1990) 92–105.
[7] N. Dokuchaev, Predictors for discrete time processes with energy decay on higher frequencies, IEEE Trans. Signal Process. 60 (11) (2012) 6027–6030.
[8] N. Dokuchaev, On predictors for band-limited and high-frequency time series, Signal Process. 92 (10) (2012) 2571–2575.
[9] N. Dokuchaev, Near-ideal causal smoothing filters for the real sequences, Signal Process. 118 (1) (2016) 285–293.
[10] P.G.S.G. Ferreira, Incomplete sampling series and the recovery of missing samples from oversampled bandlimited signals, IEEE Trans. Signal Process. 40 (1) (1992) 225–227.
[11] P.G.S.G. Ferreira, Sampling series with an infinite number of unknown samples, in: SampTA'95, 1995 Workshop on Sampling Theory and Applications, 1995, pp. 268–271.
[12] H.J. Landau, A sparse regular sequence of exponentials closed on large sets, Bull. Amer. Math. Soc. 70 (1964) 566–569.
[13] H.J. Landau, Sampling, data transmission, and the Nyquist rate, Proc. IEEE 55 (10) (1967) 1701–1706.
[14] Y.M. Lu, M.N. Do, A theory for sampling signals from a union of subspaces, IEEE Trans. Signal Process. 56 (6) (2008) 2334–2345.
[15] M. Mishali, Y.C. Eldar, Sub-Nyquist sampling: bridging theory and practice, IEEE Signal Process. Mag. 28 (6) (2011) 98–124.
[16] A. Olevskii, A. Ulanovskii, Universal sampling and interpolation of band-limited signals, Geom. Funct. Anal. 18 (3) (2008) 1029–1052.
[17] A. Olevskii, A. Ulanovskii, Functions with Disconnected Spectrum: Sampling, Interpolation, Translates, Univ. Lect. Ser. 46, Amer. Math. Soc., Providence, RI, 2016.
[18] A. Papoulis, Generalized sampling expansion, IEEE Trans. Circuits Syst. 24 (11) (1977) 652–654.
[19] E. Serpedin, Subsequence based recovery of missing samples in oversampled bandlimited signals, IEEE Trans. Signal Process. 48 (2) (2000) 580–583.
[20] A. Ulanovskii, On Landau's phenomenon in R^n, Math. Scand. 88 (1) (2001) 72–78.
[21] A. Ulanovskii, Sparse systems of functions closed on large sets in R^n, J. Lond. Math. Soc. 63 (2) (2001) 428–440.
[22] P.P. Vaidyanathan, On predicting a band-limited signal based on past sample values, Proc. IEEE 75 (8) (1987) 1125–1127.
[23] P.P. Vaidyanathan, Generalizations of the sampling theorem: seven decades after Nyquist, IEEE Trans. Circuits Syst. I 48 (9) (2001) 1094–1109.
[24] D. Wei, L.Y. Mi, Generalized sampling expansions with multiple sampling rates for lowpass and bandpass signals in the fractional Fourier transform domain, IEEE Trans. Signal Process. 64 (18) (2016) 4861–4874.
[25] X.G. Xia, On bandlimited signals with fractional Fourier transform, IEEE Signal Process. Lett. 3 (3) (1996) 72–74.