Copyright © IFAC Adaptive Systems in Control and Signal Processing, Grenoble, France, 1992
ADAPTIVE CHANNEL ESTIMATION FOR MAXIMUM LIKELIHOOD SEQUENCE ESTIMATION K. Wesolowski, Technical University of Poznan, Institute of Electronics and Communications, 60-965 Poznan, Poland
Abstract The paper reviews the problem of adaptive channel estimation for maximum likelihood sequence estimation. It explains the necessity of channel estimation for the operation of the ML receiver, reviews a few of the most important receiver structures, and discusses several adaptation algorithms as well as joint blind adaptive sequence and channel estimation. Keywords: adaptive estimation, adaptive identification, data sequence estimation
INTRODUCTION Intersymbol interference (ISI) is one of the main impairments in digital transmission over bandlimited or multipath communication channels. It is observed at the receiver input as the overlapping of the responses to subsequent data symbols sent by a transmitter. Among the receiver structures applied to minimize the disturbing influence of intersymbol interference, the maximum likelihood (ML) receiver (Forney, 1972) achieves the highest performance. All versions of an ML receiver require estimates of the channel impulse response for their operation. Several channel estimation algorithms are the subject of this review paper. We begin with a data transmission channel model and a short introduction to ML sequence estimation. Subsequently we present the Viterbi algorithm and the channel estimation algorithms. We also discuss modifications of the Viterbi algorithm which either fully implement the ML receiver or approximate its operation, and we consider the adaptive whitened matched filter in front of the ML detector. Finally we present the newest developments in ML reception: joint blind ML sequence and channel estimation.
Figure 1: The baseband equivalent model of a data transmission system (the data impulse train $\sum_i d_i \delta(t - iT)$ drives the channel $h(t)$; the noise $z(t)$ is added to form the receiver input $x(t)$)
CHANNEL MODEL AND ML SEQUENCE ESTIMATION

Let us consider a baseband equivalent transmission corrupted by intersymbol interference (ISI) and additive Gaussian noise. The system model is presented in Fig. 1. All variables and functions of time are complex-valued, thus allowing for consideration of all one- and two-dimensional linear modulations usually applied in transmission over channels introducing intersymbol interference. Symbol $T$ denotes the signalling period. The data symbols $d_n$ are equally probable, mutually independent and selected from a finite alphabet of size $M$. Function $h(t)$ is the impulse response of the baseband equivalent channel, which represents the cascade connection of a transmit filter, transmission channel and receive filter. Let us assume that the impulse response $h(t)$ spans $L$ signalling periods. Thus

$$x(t) = \sum_{i=0}^{L} d_i h(t - iT) + z(t)$$

or, after sampling once each $T$ seconds,

$$x(nT) = \sum_{i=0}^{L} d_i h((n - i)T) + z(nT), \qquad x_n = \sum_{i=0}^{L} h_{n-i} d_i + z_n = r_n + z_n \qquad (1)$$
where the sample $r_n$ is described by the formula

$$r_n = \sum_{i=0}^{L} h_i d_{n-i} \qquad (2)$$

Let us note that the sequence of unobservable terms $\{r_n\}$ ($n = 1, 2, \ldots$) is a sample function of a Markov chain. The full channel output signal is its noisy version. The channel described by (2) can be understood as an automaton with the states determined by the current vector $(d_n, d_{n-1}, \ldots, d_{n-L})$. The transitions among states in subsequent time instants are due to the new symbol $d_{n+1}$ entering and modifying the data vector. Let us assume that the noise samples are mutually independent. Our aim is to estimate the transmitted data sequence $\mathbf{d}_n = \{d_1, \ldots, d_n\}$ upon observation of $\mathbf{x}_n = \{x_1, \ldots, x_n\}$. In order to perform this task we maximize the $n$-dimensional probability density function $p(\mathbf{x}_n | \mathbf{d}_n)$. The receiver which performs this maximization is called the maximum likelihood receiver. Since $\mathbf{x}_n = \mathbf{r}_n + \mathbf{z}_n$, where $\mathbf{z}_n = \{z_1, z_2, \ldots, z_n\}$ is a sequence of statistically independent Gaussian variables, we obtain

$$p(\mathbf{x}_n | \mathbf{d}_n) = (2\pi\sigma_z^2)^{-n/2} \prod_{i=1}^{n} \exp\left[-|x_i - r_i|^2 / 2\sigma_z^2\right] \qquad (3)$$

where $\sigma_z^2$ is the variance of the noise samples. One can easily conclude from (3) that maximization of $p(\mathbf{x}_n | \mathbf{d}_n)$ is equivalent to minimization of the squared Euclidean distance $\Lambda_n$ between the sequence $\mathbf{x}_n$ and that sequence $\mathbf{r}_n$ which would be received if a particular sequence $\mathbf{d}_n$ were transmitted, i.e. we minimize

$$\Lambda_n = \sum_{i=1}^{n} |x_i - r_i|^2 \qquad (4)$$

Let us note that finding the optimal $\mathbf{d}_n$ which ensures the minimum of $\Lambda_n$ is equivalent to selecting the corresponding $\mathbf{r}_n$. However, to calculate the sequence $\mathbf{r}_n$ the knowledge of $h_i$ ($i = 0, 1, \ldots, L$) is necessary. It often happens that the channel impulse response is not known at the beginning of the transmission or that it varies with time. Thus, start-up estimation of the channel impulse response and tracking of its changes are necessary for appropriate operation of an ML receiver applying one of the forms of the Viterbi algorithm.

THE VITERBI ALGORITHM The Viterbi algorithm is a version of dynamic programming developed for decoding of convolutional codes. Forney (1972) discovered that it can also be applied to detection of digital signals corrupted by ISI. The Viterbi algorithm minimizes the squared distance (4) using one of the possible automaton descriptions, the trellis diagram.

The channel described by (2) has $M^L$ states. From each state at the $(n-1)$-th moment, $M$ transitions to the states at the $n$-th timing instant are possible. Each state at the $n$-th moment is characterized by the cost $c_n^k$ ($k = 1, \ldots, M^L$) of reaching it along the shortest route on the trellis (i.e. the route having the smallest $\Lambda_n$) from the state in which the channel remained at the moment $n = 1$. The shortest routes reaching all the states are often called survivors. Let us denote the cost of the transition between the $j$-th state at the $(n-1)$-th moment and the $k$-th state at the $n$-th moment as

$$\lambda_n^{jk} = |x_n - r_n^{jk}|^2 \qquad (5)$$

where $r_n^{jk}$ is the channel response to that particular data sequence $(d_n, d_{n-1}, \ldots, d_{n-L})$ which is assigned to the transition between the states $j$ and $k$ and the survivor for the state $j$ at the $(n-1)$-th moment. At the $n$-th moment, for each state $k$ ($k = 1, \ldots, M^L$), the Viterbi algorithm finds the shortest route (survivor) and the state $j^*$ from the $(n-1)$-th moment by calculating

$$c_n^k = \min_j \left( c_{n-1}^j + \lambda_n^{jk} \right) \qquad (6)$$

The survivor for the state $k$ at the $n$-th moment consists of the data symbol $d_n$ associated with the transition $j^* \to k$ and the survivor for the state $j^*$ at the $(n-1)$-th moment. The new cost $c_n^k$ is recursively assigned to the state $k$.

It has been noted that $D = 3L$ to $5L$ timing instants back, with respect to the current time index, all the survivors originate from a common route with probability close to one. Thus, the common part of the survivors corresponds to a single data sequence. In consequence, at the $n$-th timing instant the data symbol estimate $\hat{d}_{n-D}$ can be transferred to the output of the Viterbi detector.

ML RECEIVER BASED ON VITERBI ALGORITHM AND ADAPTIVE CHANNEL ESTIMATOR We have already stressed that knowledge of the channel impulse response is crucial for the operation of the detector due to (4) and (5). Magee and Proakis (1973) proposed the Viterbi receiver supplemented by adaptive estimation of the channel impulse response based on detected symbols. Fig. 2 presents the scheme of such a receiver. The channel estimator has the structure of an adaptive transversal filter generating the estimate of the channel output $\hat{x}_{n-D_1}$ and the estimate of the channel impulse response $\hat{h}_n$ based on the input signal vector $(\hat{d}_{n-D}, \hat{d}_{n-D-1}, \ldots, \hat{d}_{n-D-L})$ and the reference signal $x_{n-D_1}$. Let us note that the channel output which serves as the reference signal is appropriately delayed by $D_1 \ge D$ symbol periods.
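The Viterbi recursion with the transition cost (5) and the cost update (6) can be illustrated by a short Python sketch. It is not taken from the paper: the function names `channel_output` and `viterbi_mlse` are hypothetical, the channel taps are assumed to be known exactly, the symbols preceding the first observation are assumed to be zero, and the decision is taken by tracing back the cheapest survivor at the end of the block instead of releasing symbols with a fixed delay $D$.

```python
def channel_output(d, h):
    """Noiseless samples r_n = sum_i h_i d_{n-i}, as in (2), with zeros before the start."""
    return [sum(h[i] * d[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(d))]


def viterbi_mlse(x, h, alphabet):
    """Return the symbol sequence that minimizes sum_n |x_n - r_n|^2 (distance (4))."""
    L = len(h) - 1
    # A state holds the L most recent symbols, newest first.
    # Assumption: the channel memory is filled with zeros before the first symbol.
    start = (0,) * L
    cost = {start: 0.0}       # c_{n-1}^j for every reachable state j
    survivor = {start: []}    # symbols along the survivor that ends in state j
    for xn in x:
        new_cost, new_survivor = {}, {}
        for state, c_prev in cost.items():
            for d in alphabet:
                # Channel response r_n^{jk} for the transition caused by symbol d.
                r = h[0] * d + sum(h[i] * state[i - 1] for i in range(1, L + 1))
                c = c_prev + abs(xn - r) ** 2      # update (6) with metric (5)
                nxt = ((d,) + state)[:L]           # state reached by this transition
                if nxt not in new_cost or c < new_cost[nxt]:
                    new_cost[nxt] = c
                    new_survivor[nxt] = survivor[state] + [d]
        cost, survivor = new_cost, new_survivor
    best = min(cost, key=cost.get)                 # cheapest survivor at the end
    return survivor[best]


# Usage with a hypothetical two-tap channel and binary symbols.
symbols = [1, -1, -1, 1, 1, -1]
taps = [1.0, 0.5]
received = channel_output(symbols, taps)
detected = viterbi_mlse(received, taps, [-1, 1])
```

Storing the whole symbol history with each survivor keeps the sketch short; a practical detector would instead keep only the last $D$ symbols per state, as discussed in the text.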
Figure 2: Basic adaptive ML receiver with Viterbi detector

Usually the channel is non-minimum phase, thus the main tap of the transversal channel estimator is located somewhere in the middle of the tapped delay line. In consequence $D_1 > D$.

Magee and Proakis (1973) proposed minimization of the mean squared error¹ $\varepsilon_n = E[|x_{n-D_1} - \hat{x}_{n-D_1}|^2]$ as the criterion for the adjustment of the estimated coefficient vector $\hat{h}_n$. Knowing that the estimator's output signal is

$$\hat{x}_{n-D_1} = \hat{h}_n' \hat{d}_{n-D} \qquad (7)$$

we obtain after standard manipulations

$$\hat{h}_{opt} = R_d^{-1} v \qquad (8)$$

where $R_d = E[d_{n-D}^* d_{n-D}']$ is the data autocorrelation matrix and $v = E[x_{n-D_1} d_{n-D}^*]$ is the cross-correlation vector. Assuming, as previously, that the data symbols are zero mean and statistically independent, they are uncorrelated as well. In consequence $R_d = E[|d_n|^2] I$ and we obtain a particularly advantageous case: the correlation matrix is diagonal. It is obvious that it is also positive definite. Thus, the LMS algorithm

$$\hat{h}_{n+1} = \hat{h}_n + \alpha e_n \hat{d}_{n-D}^*, \qquad e_n = x_{n-D_1} - \hat{x}_{n-D_1} \qquad (9)$$

leads to the global minimum if the step size $\alpha$ is appropriately small. In the general case the speed of convergence of the LMS algorithm depends on the eigenvalue spread of the input signal autocorrelation matrix. For the channel estimator all the eigenvalues are equal to $E[|d_n|^2]$. As a result the convergence speed is the same for "good" and "bad" channels, which are characterized by "small" and "large" eigenvalue spread of the channel output autocorrelation matrix, respectively.

¹ $E[\cdot]$ denotes expectation, $(\cdot)^*$ complex conjugation, $(\cdot)'$ vector transposition, $(\cdot)^\dagger$ conjugate transposition.

In most cases initial settings of the channel estimator are required before the regular operation of the detector. Thus, the transmitter sends a data training sequence which is known to the receiver. In regular operation, algorithm (9) is needed only to track time variations of the channel characteristics. The latter task can create some difficulties if the channel is relatively fast time-variant. Let us note that due to the delay introduced by the Viterbi detector, algorithm (9) incorporates a substantial delay in its loop. One way to avoid this disadvantage is to apply less delayed but also less reliable data symbols derived from the survivor having the lowest cost. This can lead to incorrect estimation of the channel impulse response. Another approach is to use algorithm (9) supplemented by a linear predictor compensating at least part of the delay caused by the Viterbi detector (Clark, Hariharan, 1989).

There exist applications for which rapid start-up channel estimation based on a short reference data sequence is required. Mobile digital TDMA radio is a good example of such a system. The solutions of this problem can be divided into two classes.

The first is the application of the least squares (LS) tap adjustment algorithm according to the criterion

$$\min_{\hat{h}_n} \sum_{i=1}^{n} |x_i - \hat{h}_n' d_i|^2 \qquad (10)$$

which leads to the solution

$$\hat{h}_n = R_n^{-1} v_n \qquad (11)$$

where $R_n = \sum_{i=1}^{n} d_i^* d_i'$ and $v_n = \sum_{i=1}^{n} x_i d_i^*$. The literature on recursive, computationally efficient solutions of (11) is very rich and will not be quoted here. Although LS algorithms are very fast during the start-up stage, intensive investigations of nonstationary channel estimation (McLaughlin, Mulgrew, Cowan, 1987) led to the conclusion that the LMS algorithm follows the changes in the channel impulse response as effectively as LS algorithms.

Let us note that the correlation matrix $R_n$ is calculated based on the known training sequence. For the start-up procedure with a given training sequence length $N$ the matrix $R_N^{-1}$ used in (11) can be precomputed and stored. For some special training sequences the matrix is not even required. This case occurs if $R_N$ is diagonal (Crozier, Falconer, Mahmoud, 1991).

The second solution is to apply a carefully selected periodic training sequence which in the absence of noise would allow precise calculation of the samples of the finite impulse response. Let the period of the training sequence be $N \ge L + 1$. The sequence $x = (x_1, x_2, \ldots, x_N)'$ of length $N$, being the response of the channel to a periodic training sequence $(d_1, d_2, \ldots, d_N)$, $(d_i = d_{N+i})$,
in the presence of white zero-mean Gaussian noise samples $z = (z_1, z_2, \ldots, z_N)'$ can be described in the matrix form as

$$x = S h + z \qquad (12)$$

where $S$ is the $N \times N$ matrix formed from the training sequence and $h = (h_1, \ldots, h_N)'$. $S$ is a circulant matrix. If $S$ is nonsingular (the training sequence can easily be selected to ensure this property) it has an inverse $S^{-1}$ which is also a circulant matrix. We obtain from (12)

$$\hat{h} = S^{-1} x = h + S^{-1} z \qquad (13)$$

Since the noise is zero-mean, the estimator $\hat{h}$ of the channel impulse response vector $h$ is unbiased. It can be easily proven (Milewski, 1983; Clark, Zhu, Joshi, 1984) that the mean squared error of this estimate is minimized if the matrix $(S^{-1})^\dagger (S^{-1})$ is a scalar matrix. In consequence the sequence $(d_1, d_2, \ldots, d_N)$ should be self-orthogonal. The multiplication $\hat{h} = S^{-1} x$ can be effectively performed by FFT and IFFT operations, as was shown by Milewski (1983).

Let us note that for self-orthogonal periodic sequences

$$S^{-1} = \frac{1}{N\sigma_d^2} \begin{pmatrix} d_N^* & d_1^* & \cdots & d_{N-1}^* \\ d_{N-1}^* & d_N^* & \cdots & d_{N-2}^* \\ \vdots & \vdots & & \vdots \\ d_1^* & d_2^* & \cdots & d_N^* \end{pmatrix} \qquad (14)$$

where $N\sigma_d^2 = \sum_{i=1}^{N} |d_i|^2$. Equation (13) can then be understood as the calculation of a correlation between the received sequence $x$ and the training sequence. Namely, we have

$$\hat{h}_i = \frac{1}{N\sigma_d^2} \sum_{k=1}^{N} x_k d_{k-i}^*, \qquad i = 1, \ldots, N \qquad (15)$$

Equation (15) is particularly easy to implement and is often used in start-up channel estimation. The only requirement for this method is that the shifted versions of the training sequence have to be self-orthogonal; the assumption about periodicity of the training sequence is not necessary.

Figure 3: ML receiver with adaptive front-end filter and channel estimator

MODIFICATIONS OF THE VITERBI RECEIVER As we have shown previously, the Viterbi algorithm for data sequence estimation has $M^L$ states. It is computationally unacceptable when $L$ is of the order of tens or $M$ is higher than two. Thus, several modifications of the original receiver have been developed which have significant consequences for the adaptive algorithm in the receiver. Generally they can be divided into two categories.

The goal of modifications which belong to the first category is to shorten the length $L + 1$ of the impulse response seen by the Viterbi detector to an acceptable value. This is realized by adding a linear adaptive transversal filter with coefficients $f_n$ in front of the Viterbi detector (see Fig. 3). In the simpler case the impulse response $h_n$ can be preselected and kept constant (Qureshi, Newhall, 1973). Better performance is obtained if the vector $h_n$ is adaptively adjusted jointly with the linear filter $f_n$ (Falconer, Magee, 1976). Falconer and Magee (1976) reported application of the LMS gradient algorithm to both adaptive filters. Generally, the linear filter in front of the detector which forces shortening of the impulse response length $L + 1$ realizes some kind of channel equalization. In consequence, the signal to noise ratio in front of the Viterbi detector is lower than in the system drawn in Fig. 2, and the price paid for simplification of the detector is deterioration of the detection performance. This drawback is at least partially avoided if an adaptive decision feedback equalizer is applied in front of the Viterbi detector. Lee and Hill (1977) proposed such a nonlinear structure (see Fig. 4). Wesolowski (1987) extended its application to multipoint QAM signals.
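The correlation estimate (15) can be checked numerically with a minimal Python sketch. The function names, the channel taps and the training sequence are illustrative assumptions and not part of the paper: the sequence (1, 1, 1, -1) is used because its periodic autocorrelation vanishes for all nonzero shifts, and indices run from 0 as is usual in code, whereas (15) counts from 1. Noise is omitted so that the estimate should reproduce the taps.

```python
def periodic_channel_output(d, h):
    """One period of the noiseless channel response to the periodic sequence d."""
    N = len(d)
    return [sum(h[i] * d[(k - i) % N] for i in range(len(h))) for k in range(N)]


def correlation_estimate(x, d):
    """Estimate the taps by correlating x with cyclic shifts of d, as in (15)."""
    N = len(d)
    energy = sum(abs(s) ** 2 for s in d)   # equals N * sigma_d^2
    return [
        sum(x[k] * complex(d[(k - i) % N]).conjugate() for k in range(N)) / energy
        for i in range(N)
    ]


d = [1, 1, 1, -1]              # self-orthogonal periodic sequence, N = 4 (assumed)
h = [0.9, 0.4 - 0.2j, 0.1j]    # hypothetical channel taps, L + 1 = 3 <= N
x = periodic_channel_output(d, h)
h_hat = correlation_estimate(x, d)   # first three entries match h, the fourth is zero
```

Because the sequence is self-orthogonal, no matrix inversion is needed; each tap costs $N$ multiply-accumulate operations.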
The second category of modifications relies on changes made "inside" the Viterbi detector. The prominent examples of such receiver structures are due to Duell-Hallen, Heegard (1989) and
Eyuboglu, Qureshi (1988). Among other modifications, they both apply decision feedback inside the trellis diagram. The number of states is substantially decreased by subtracting part of the intersymbol interference term (2) using the data sequence associated with each survivor. Equation (2) used to calculate metric (5) is modified in the following way

$$r_n^{jk} = \sum_{i=0}^{K} h_i d_{n-i} + \sum_{i=K+1}^{L} h_i \hat{d}_{n-i} \qquad (16)$$

where $K$ is an appropriately selected length ($K < L$) which ensures a compromise between computational complexity and performance, and $\hat{d}_{n-i}$ denotes a symbol taken from the survivor. However, the idea of decision feedback works well only if a substantial part of the impulse response energy is contained in the first $K + 1$ samples $h_i$. The whitened matched filter (WMF) is the filter which should be applied in front of the detector to fulfil this requirement. Let us recall that the cascade connection of the WMF and the Viterbi detector is the optimum receiver. The WMF ensures that the joint channel and WMF impulse response is causal and minimum phase and that the output noise is white. In general, the WMF synthesis is not straightforward. The linear feedforward filter of the MMSE decision feedback equalizer approximates the WMF well if the SNR is high (Wesolowski, 1987). However, it is also convenient to synthesize the WMF adaptively after obtaining the estimate of the channel impulse response. Such an approach was reported by Clark and Hau (1984) and Mulgrew (1991). Clark and Hau based their method on the representation of the z-transform channel transfer function in the form

[equation (17) not recoverable from the source]

where

$$H_1(z) = G (1 + \alpha_0 z^{-1})(1 + \alpha_1 z^{-1}) \cdots (1 + \alpha_{L-m} z^{-1}) \qquad (18)$$

and

[equation not recoverable from the source]

$\alpha_i$ ($|\alpha_i| < 1$, $i = 0, \ldots, L - m$) is the negative of a root of $H(z)$, and $\beta_i$ ($|\beta_i| < 1$, $i = 1, \ldots, m$) is the negative of the reciprocal of a root of $H(z)$. $G$ is the gain factor. $H_2(z)$ represents that part of the channel transfer function which is characterized by roots located outside the unit circle. The cascade connection of the channel $H(z)$ and the WMF should be minimum phase, so that all its roots should lie inside the unit circle. The transfer function of the WMF should be

[equation not recoverable from the source]

and the joint transfer function of the channel and the WMF is

[equation not recoverable from the source]

$W(z)$ is an all-pass function with $|W(z)| = 1$. Assuming that the samples $h_i$ ($i = 0, \ldots, L$) of the channel impulse response have already been estimated, Clark and Hau proposed a simple and original method of recursive identification of the subsequent roots $\beta_i$ ($i = 1, \ldots, m$). The algorithm is well suited for DSP software implementation. Mulgrew's (1991) algorithm is based on the Kalman filtering approach.

Figure 4: Decision feedback equalizer with Viterbi detector

JOINT BLIND CHANNEL AND DATA SEQUENCE ESTIMATION Recent developments in blind equalization gave the incentive to investigate joint blind estimation of the channel parameters and the data sequence. The results of research in this field have been reported only very recently. Kawas Kaleh and Vallet (1991) proposed to apply the so-called Expectation-Maximization algorithm to joint estimation of the data sequence and channel impulse response. The channel estimates are obtained through iterative maximization of a function related to the Kullback-Leibler information measure. Optimum symbol-by-symbol decisions on information symbols are derived after obtaining the channel parameters by maximization of the selected likelihood function. The receiver presented by Iltis, Shynk and Giridhar (1991) consists of $M^{L+1}$ conditional Kalman or LMS channel estimators. The symbol decisions are made using maximum a posteriori probability metrics, similarly to the receiver proposed by Abend and Fritchman (1970).

Ushirokawa et al. (1991) proposed the most practical and conceptually the simplest algorithm of blind maximum likelihood detection with channel estimation. They considered the Viterbi detector with an individually estimated channel impulse response at each trellis state, contrary to
the conventional Viterbi detector using a single channel estimator. The set of estimated impulse responses is applied in the metric calculations (2) and (5), so the decision sequence is never used by the channel estimators. The authors consider two methods of channel impulse response estimation. The first is the direct least squares solution of the linear equation describing the $N$-element block of the channel output as in (12). However, this time the data matrix $S_n$ depends on the time index $n$ and is given by the expression

[equation not recoverable from the source]

The least squares solution is ($N$ is not necessarily equal to $L$)

$$\hat{h} = [S_n^\dagger S_n]^{-1} S_n^\dagger x_n \qquad (22)$$

The authors also mention the possibility of applying adaptive channel estimators for each trellis state, driven by the data sequence associated with the corresponding survivor. Although much more complicated than the generic structure shown in Fig. 2, the blind Viterbi equalizer proposed by Ushirokawa et al. allows for more effective operation on fast time-variant channels due to the lack of delay in the estimator's adjustment algorithm.

CONCLUSIONS Several structures of a maximum likelihood receiver and associated channel estimation algorithms have been considered. The LMS, LS and noniterative algorithms are the most important for estimation of the channel impulse response. The newest developments in TDMA digital mobile radio indicate that for the first time the Viterbi algorithm with an adaptive channel estimator will find application in a consumer product: a mobile station in the form of a handheld phone or car telephone set.

REFERENCES
Clark, A. P. and S. Hariharan (1989). Adaptive channel estimator for an HF radio link. IEEE Trans. Commun., 37, 918-926.
Clark, A. P. and S. F. Hau (1984). Adaptive adjustment of receiver for distorted digital signals. IEE Proc.-F, 131, 526-536.
Clark, A. P., Zhu, Z. C. and J. K. Joshi (1984). Fast start-up channel estimation. IEE Proc.-F, 131, 375-382.
Crozier, S. N., Falconer, D. D. and S. A. Mahmoud (1991). Least sum of squared errors (LSSE) channel estimation. IEE Proc.-F, 138, 371-378.
Duell-Hallen, A. and Ch. Heegard (1989). Delayed decision-feedback sequence estimation. IEEE Trans. Commun., 37, 428-436.
Eyuboglu, M. V. and S. U. H. Qureshi (1988). Reduced-state sequence estimation with set partitioning and decision feedback. IEEE Trans. Commun., 36, 13-20.
Falconer, D. D. and F. R. Magee (1976). Evaluation of decision feedback equalization and Viterbi algorithm detection for voiceband data transmission - part 1. IEEE Trans. Commun., 24, 1130-1139.
Forney, G. D. (1972). Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference. IEEE Trans. Inform. Theory, 18, 363-378.
Iltis, R. A., Shynk, J. J. and K. Giridhar (1991). Recursive Bayesian algorithms for blind equalization. Proc. of Twenty-Fifth Asilomar Conf. on Signals, Systems and Computers.
Kawas Kaleh, G. and R. Vallet (1991). Joint parameters estimation and symbols detection for digital communication over linear or nonlinear unknown channels. Submitted to IEEE Trans. Commun.
Lee, W. U. and F. S. Hill, Jr. (1977). A maximum-likelihood sequence estimator with decision-feedback equalization. IEEE Trans. Commun., 25, 971-979.
Magee, F. R. and J. G. Proakis (1973). Adaptive maximum-likelihood sequence estimation for digital signalling in the presence of intersymbol interference. IEEE Trans. Inform. Theory, 19, 120-124.
McLaughlin, S., Mulgrew, B. and C. Cowan (1987). Performance comparison of least squares and least mean squares algorithms as HF channel estimators. Proc. IEEE Int. Conf. ASSP.
Milewski, A. (1983). Periodic sequences with optimal properties for channel estimation and fast start-up equalization. IBM J. Res. Develop., 27, 426-431.
Mulgrew, B. (1991). An adaptive whitened matched filter. Proc. of Int. Conf. ASSP.
Qureshi, S. U. H. and E. E. Newhall (1973). An adaptive receiver for data transmission over time-dispersive channels. IEEE Trans. Inform. Theory, 19, 448-457.
Ushirokawa, A., Furuya, Y., Isa, H., Oda, H. and Y. Sato (1991). Viterbi equalization on time-varying channel. Proc. of Makuhari Int. Conf. on High Technology, 101-104.
Wesolowski, K. (1987). An efficient DFE & ML suboptimum receiver for data transmission over dispersive channels using two-dimensional signal constellations. IEEE Trans. Commun., 35, 336-339.