Copyright © IFAC Programmable Devices and Systems, Ostrava, Czech Republic, 2003
TURBO CODE PARADIGM AND SISO IMPLEMENTATIONS
Karel Vlcek
VSB Technical University of Ostrava, Czech Republic
[email protected]
Abstract: The paper introduces the soft-input soft-output (SISO) concept of error-control systems, which represents the basic principle of turbo codes. Turbo codes are the most powerful forward error correcting (FEC) technology commercially available; their iterative decoding process is called turbo decoding. A block pseudo-random interleaver construction based on random search is designed using permutations on each row. Turbo decoding is a sub-optimal decoding; it is not a maximum likelihood decoding, and it is therefore important to base the choice of component codes on their convergence properties at low signal-to-noise ratios. Copyright © 2003 IFAC
Keywords: Forward error correcting (FEC), Turbo codes, Soft-input soft-output (SISO), Sub-optimal decoding, Block pseudo-random interleaver
1. INTRODUCTION

Turbo codes represent a new class of codes that were introduced in 1993 by a group of researchers from France Telecom, along with a practical decoding algorithm. The basic principle is to use two low-complexity component encoders, separated by an interleaver function, which introduces diversity. The data sequence is encoded twice, but not in the same order. The encoding process produces the two parity sequences.

At reception, two elementary decoders are implemented, one processing the data in natural order, the other processing the permuted data. Each elementary decoder produces reliability information, called extrinsic information, which can be used by the other decoder to improve its performance. This iterative process is called turbo decoding. Implementing a turbo code generally leads to a gain of 2 to 4 dB compared to classical FEC solutions.

2. THE MOST PROBABLE AND THE MAXIMUM LIKELIHOOD ESTIMATION

The received message is expressed by a signal r(t). It is possible to suppose that it is the sum of the useful signal and the noise:

r(t) = z(t) + n(t).

Let us suppose that the symbol synchronisation is fulfilled, i.e. it is known at what times the changes of the modulated signal begin and end. Without loss of generality, we will suppose that the estimation of a symbol occurrence in the received signal takes place in the interval (0, T). The estimation is done on the basis of the relation

r(t) = s(t, X) + n(t), \quad X \in \{a_i\}, \quad t \in (0, T).

The noise is expressed as a sum of several independent signals, which are supposed to have a normal (Gaussian) distribution. If the bandwidth of the noise is greater than the bandwidth of the transmitted signal, the noise can be regarded as so-called white noise, whose power spectral density is considered to be a constant value. This abstraction can be used even if the density is not constant outside the bandwidth of the input circuits, because in that case the result of the receiving process is not influenced. Estimation theory is used for the optimal reception of the input signal.

Let the vector P be a vector of unknown random parameters which influences the vector R, which can be considered as a vector of measured random values. If the statistical dependency between the measured values and the sought parameters is known, the conditional probability density of the vector of measured values can be expressed as

f_{R|P}(r, p) = f_{R_1, \ldots, R_n | P_1, \ldots, P_t}(r_1, \ldots, r_n, p_1, \ldots, p_t).

A "good" estimate of P can be obtained by a method suitable for the chosen criterion on the unknown parameters. The most frequently used criterion is the most probable estimation.
This criterion of quality is called the "maximum a posteriori probability" (MAP) criterion. It is frequently used in digital communication because it minimises the probability of error in symbol reception. The process of MAP estimation will first be derived for the case in which the vector P takes discrete values:

P = \{p_i\}.

The probability that P has the value p_i, given that the value r of the vector R has been measured, is expressed by Bayes' formula. The most probable estimate is therefore

\hat{P}_{MAP}(r) = \arg\max_{p_i} \Pr\{P = p_i \mid R = r\} = \arg\max_{p_i} \frac{f_{R|P}(r, p_i) \cdot \Pr(P = p_i)}{f_R(r)} = \arg\max_{p_i} f_{R|P}(r, p_i) \cdot \Pr(P = p_i).

The value f_R(r) does not depend on the value of the parameter P, so it can be dropped. The usual prescription for the maximum of a function is

\arg\max_x f(x) = \arg f(x_{\max}) = x_{\max}.

3. THE MAP ESTIMATION

If the unknown parameters take continuous values, it is not possible to use the method of maximising the probability, because the number of possible values grows without bound and the probability of any particular value of P tends to zero.

In this case we seek the estimate that is most probably near to the real value, which is equivalent to the condition of maximising the probability density f_{P|R}. The most probable estimate is written as

\hat{P}_{MAP}(r) = \arg\max_p f_{P|R}(p, r) = \arg\max_p \left( f_{R|P}(r, p) \cdot f_P(p) \right).

When we seek the maximum over a parameter, differentiation is used in most cases. The MAP estimation is then expressed by an equivalence in which the derivative of the scalar function denotes the vector of derivatives with respect to the elements of the parameter vector:

\frac{\partial}{\partial p} \left( f_{R|P}(r, p) \cdot f_P(p) \right) \Big|_{p = \hat{P}_{MAP}} = 0.

For practical calculation it is more useful to seek the maximum of the logarithm of the density instead of the maximum of the density itself. The equation of estimation is formed in this case as

\frac{\partial}{\partial p} \ln \left( f_{R|P}(r, p) \cdot f_P(p) \right) \Big|_{p = \hat{P}_{MAP}} = 0.

This system of equations is called the likelihood equations. All the parameters are included in the equation system; in practice, however, we often need only some of them, which opens the way to a significant simplification of the solution.

Elimination of the uninteresting parameters before seeking the estimates of the interesting ones is very productive in this case. During this process the vector of parameters is split into the vector Z of interesting parameters and the vector N of uninteresting parameters. The MAP estimation is then based on the marginal distribution obtained from the joint density f_{ZNR}:

f_{ZR}(z, r) = \int f_{ZNR}(z, n, r) \, dn,

where the integration over the vector n is a multidimensional integral over the components of the vector. The MAP estimation of the interesting parameters is given by

\hat{Z}_{MAP} = \arg\max_z \int f_{ZNR}(z, n, r) \, dn = \arg\max_z \int f_{R|ZN}(r, z, n) \cdot f_{ZN}(z, n) \, dn.

As far as the interesting and uninteresting parameters are independent, we can write f_{ZN}(z, n) = f_Z(z) \cdot f_N(n).

The search for the MAP estimate succeeds under the supposition that the conditional probability density f_{R|P}(r, p) is known, together with a priori knowledge of the distribution f_P(p) of the unknown parameters.
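As an illustration of the discrete MAP rule above, the following sketch (in Python, with purely illustrative Gaussian likelihoods and priors that are not taken from the paper) selects the hypothesis p_i that maximises f_{R|P}(r, p_i) · Pr(P = p_i); the common factor f_R(r) is dropped, since it does not change the arg max.

    import math

    def gauss_pdf(r, mean, var):
        # Gaussian likelihood f_{R|P}(r, p) whose mean depends on the parameter p
        return math.exp(-(r - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

    def map_estimate(r, hypotheses, prior, var=1.0):
        # argmax over p_i of f_{R|P}(r, p_i) * Pr(P = p_i); f_R(r) is omitted
        return max(hypotheses, key=lambda p: gauss_pdf(r, p, var) * prior[p])

    hypotheses = [-1.0, +1.0]                    # assumed antipodal amplitudes
    prior = {-1.0: 0.5, +1.0: 0.5}               # uniform prior
    print(map_estimate(0.3, hypotheses, prior))  # -> 1.0

With the uniform prior used here, the MAP decision coincides with the maximum likelihood decision, which is exactly the equivalence noted below.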
If this a priori information is not accessible, we must use another criterion of estimation quality, the so-called "maximum likelihood" (ML) criterion. It is defined by the prescription

\hat{P}_{ML}(r) = \arg\max_p \Pr(R = r \mid P = p) = \arg\max_p f_{R|P}(r, p).

Similarly, we can write the relations for the maximum seeking:

\frac{\partial}{\partial p} f_{R|P}(r, p) \Big|_{p = \hat{P}_{ML}} = 0, \qquad \frac{\partial}{\partial p} \ln f_{R|P}(r, p) \Big|_{p = \hat{P}_{ML}} = 0.

The comparison of the MAP and ML estimations shows that the two expressions are equivalent if the unknown parameters have a uniform distribution. This also gives the rule for the elimination of unknown random parameters.

4. DECOMPOSITION INTO BASE FUNCTIONS

The received signal r(t), defined on the interval (0, T), is decomposed into base functions. The decomposition is written using a row vector of the base functions,

\varphi(t) = [\varphi_1(t), \ldots, \varphi_n(t)], \quad t \in (0, T).

The base functions must fulfil the condition of orthonormality, which is expressed by the equivalence

\int_0^T \varphi^T(t) \cdot \varphi(t) \, dt = E,

where E is the unit matrix and the symbol T in the exponent denotes the operation of transposition of the matrix \varphi(t).

5. ORTHONORMAL SYSTEM

The completeness of the orthonormal system is described for a general function g(t) by the equivalence

\int_0^T \varphi^T(t) \cdot g(t) \, dt = 0 \iff g(t) = 0, \quad t \in (0, T).

The decomposition of the received sequence r(t) on the interval t \in (0, T) into the complete orthonormal system of partial sequences is described by the equation

r(t) = \varphi(t) \cdot R, \quad t \in (0, T),

where R is the vector of co-ordinates. The elements of the vector R are the coefficients of the decomposition into the elected system \varphi(t). These elements of the decomposition are calculated by the equations

R_i = \int_0^T \varphi_i(t) \cdot r(t) \, dt, \quad i = 1, 2, \ldots

The symbol of correspondence between the coefficient sequence and the function is used in the sense of the quadratic mean.

The MAP (maximum a posteriori) decoding is done by multiplication of the received signal function s(t, X) by the functions \varphi^T(t) for the transmitted symbol estimation. After integration it is possible to write the equivalence

R = g(X) + N,

where

g(X) = \int_0^T \varphi^T(t) \cdot s(t, X) \, dt \quad \text{and} \quad N = \int_0^T \varphi^T(t) \cdot n(t) \, dt.

The variable f_{R|X} is the probability density (in other words, the likelihood) expressing that the random symbol X has the same value as the symbol a_j. The conditioned random vector is given by the equations

R \mid \{X = a_j\} = g(a_j) + N, \qquad f_{R|X}(r, a_j) = f_N(r - g(a_j)),

where g(a_j) is a non-random vector differing for the various indexes j. For the white Gaussian noise the density is written as

f_{R|X}(r, a_j) = \frac{1}{(2\pi C_0)^{n/2}} \exp\left( -\frac{1}{2 C_0} (r - g(a_j))^T \cdot (r - g(a_j)) \right),

where C_0 is the dissipation (variance) of the noise. The estimation of the symbol is expressed as

\hat{X}_{MAP} = \arg\max_{a_j} f_{R|X}(r, a_j).

The maximum of the exponential function is attained for the maximal value of its argument, so in this case

\hat{X}_{MAP} = \arg\max_{a_j} \left( -\frac{1}{2 C_0} (r - g(a_j))^T \cdot (r - g(a_j)) \right) = \arg\min_{a_j} \left( (r - g(a_j))^T \cdot (r - g(a_j)) \right).

The equations can be rewritten using the transformation to the time region:

(r - g(a_j))^T \cdot (r - g(a_j)) = (r - g(a_j))^T \cdot \int_0^T \varphi^T(t) \cdot (r(t) - s_j(t)) \, dt = \int_0^T \left( \varphi(t) \cdot (r - g(a_j)) \right)^T \cdot (r(t) - s_j(t)) \, dt = \int_0^T (r(t) - s_j(t))^2 \, dt.

The previous derivation shows the MAP estimation \hat{X} of the transmitted symbol X. The symbol is represented by that signal element s_j(t) of the modulated signal which has the minimal energy distance from r(t). We will write

\hat{X}_{MAP} = \arg\min_{a_j} \int_0^T (r(t) - s_j(t))^2 \, dt.

We can rewrite the equation under algebraic rearrangement as

\hat{X}_{MAP} = \arg\max_{a_j} \left( \int_0^T r(t) \cdot s_j(t) \, dt - \frac{E_j}{2} \right),

where E_j is the energy of the signal element which represents the symbol a_j.

The derived algorithm is implemented in the turbo code system, based on an iterative decoding algorithm.
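The derived decision rule can be sketched numerically as follows (Python with NumPy); the candidate waveforms, noise level, and sampling grid are illustrative assumptions, not values from the paper. The sketch evaluates both the minimal energy distance form and the equivalent correlation form with the energy term E_j/2.

    import numpy as np

    T, n = 1.0, 1000
    t = np.linspace(0.0, T, n, endpoint=False)
    dt = T / n

    s = {                                  # candidate signal elements s_j(t)
        0: np.sin(2 * np.pi * 5 * t),      # represents symbol a_0
        1: -np.sin(2 * np.pi * 5 * t),     # represents symbol a_1
    }

    rng = np.random.default_rng(1)
    r = s[1] + 0.8 * rng.standard_normal(n)   # received r(t) = s_1(t) + n(t)

    # X_MAP = argmin_j  int_0^T (r(t) - s_j(t))^2 dt   (rectangle rule)
    dist = {j: np.sum((r - sj) ** 2) * dt for j, sj in s.items()}
    x_map = min(dist, key=dist.get)

    # Equivalent correlation form: argmax_j ( int r(t) s_j(t) dt - E_j / 2 )
    corr = {j: np.sum(r * sj) * dt - 0.5 * np.sum(sj ** 2) * dt
            for j, sj in s.items()}
    assert max(corr, key=corr.get) == x_map
    print(x_map)                            # -> 1 with this noise realisation

The two forms agree because expanding the squared distance leaves only the correlation term and the signal energy depending on the candidate symbol.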
6. CONCEPT OF EXTRINSIC AND INTRINSIC INFORMATION

The analytical separation of the extrinsic and intrinsic parts of the information at the output of the decoder of an error-control coding system is ensured by the analytical method of [1]. The method is based on the standard code characteristic measurement, the BER (Bit Error Ratio). In other words, it is based on the probability of the first error occurrence and its equivalent expression by the soft output value of the decoder [3].

Error-control coding systems with hard decisions express the output value as the average BER related to the channel capacity, or equivalently to the Signal to Noise Ratio (SNR) of the channel with Additive White Gaussian Noise (AWGN).

Another way of describing the error correction ability is introduced in [3]. It is based on the calculation of a decoder output value called the IPC (Information Processing Characteristics). In the IPC, the mutual information from the encoder input to the decoder output is considered as a variable rate R in dependence on the channel capacity C of the investigated system.

The IPC characterises the code with respect to the used model of the information channel under the conditions of optimal decoding. If the IPC characterises the sequence of the message symbol by symbol during decoding in the initial order, then the IPC_T(C) describes the message with interleaving. The values of the symbols at the output of a decoder with hard decisions express the probability of the error occurrence, the BER. As is evident from [3], the IPC curves are applicable even to very simple codes with short words. The IPC gives the basic information for the configuration of a concatenated code system and makes the selection of the systematic code easier; the losses caused by the errors are minimised in this way. The IPC shows, into the bargain, that even convolutional codes with small memory are extremely near to ideal codes in the case in which R > C.

For code systems with concatenated codes, in which the decoders exchange with one another only the extrinsic information, the way of splitting the intrinsic and extrinsic parts of the information is substantial. It is shown in the following text how it is possible to separate the intrinsic and extrinsic parts of the decoded message by soft decision, and how the IPC characteristics are related to the application of EXIT charts [1].

7. SEPARATION OF EXTRINSIC AND INTRINSIC INFORMATION

The considerations will be limited to a binary code system with a systematic code with code ratio R < 1, for which the applied channel without memory is reached in a natural way. In this case, all binary symbols of the message are transmitted twice.

Let us consider a symbol in the information part of the systematic code with the intrinsic probability P_I calculated after decoding, and the other symbols in the message generated by the code, with the extrinsic probability P_E calculated by the decoding. If we consider a channel without memory, the intrinsic probability P_I and the extrinsic probability P_E are independent.

The double transmission of the symbols of the message can be modelled by two independent information channels, for example by two parallel binary symmetric channels with bit error rates \varepsilon_1 = P_I and \varepsilon_2 = P_E. This model is equivalent to a discrete channel without memory, the capacity of which is

C_{sum} = 1 - e_2(\varepsilon_1) - e_2(\varepsilon_2) + e_2\left[ (1 - \varepsilon_1)(1 - \varepsilon_2) + \varepsilon_1 \varepsilon_2 \right],

where e_2(\cdot) is the binary entropy function:

e_2(x) := -x \cdot \log_2(x) - (1 - x) \cdot \log_2(1 - x).
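A small sketch in Python (with assumed, illustrative error rates) evaluates the binary entropy e_2(x) and the capacity C_sum of the double transmission modelled by the two parallel binary symmetric channels:

    import math

    def e2(x):
        # binary entropy function, with e2(0) = e2(1) = 0 by convention
        if x in (0.0, 1.0):
            return 0.0
        return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

    def c_sum_bsc(eps1, eps2):
        # C_sum = 1 - e2(eps1) - e2(eps2) + e2((1-eps1)(1-eps2) + eps1*eps2)
        return (1.0 - e2(eps1) - e2(eps2)
                + e2((1.0 - eps1) * (1.0 - eps2) + eps1 * eps2))

    # assumed intrinsic/extrinsic error rates P_I = 0.1, P_E = 0.2:
    print(c_sum_bsc(0.1, 0.2))   # capacity of the combined channel
    print(1.0 - e2(0.1))         # capacity of the first channel alone

The combined capacity exceeds that of either channel alone, reflecting the gain obtained from the double transmission of each message symbol.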
The intrinsic and extrinsic information can also be considered in another example, a model with binary phase modulation over BPSK-AWGN channels. Two parallel BPSK-AWGN channels with quadratic (Gaussian) noise are created by these models, and the expression for the combined channel capacity is obtained in the same way.

The most efficient way of modelling the computation of the two-direction transmission is the application of two parallel binary erasure channels (BECs). Their capacity is linearly dependent on the erasure ratio p (C = 1 - p), and the resulting equivalent channel is of the BEC type too. Multiplication and summation then yield the "capacity summation":

C_{sum} = 1 - (1 - C_1) \cdot (1 - C_2).

The results are very similar; we can calculate the "capacity summation" with sufficient accuracy if we know only the capacities of the parallel channels, without any knowledge of their detailed behaviour. The application of this principle to the calculation of the average value IPC_E(C) from IPC_I(C) corresponds to the code system with the systematic code. The mutual information, which is equal to IPC_I(C), is a function of the intrinsic and extrinsic information; the function IPC_E(C) can therefore be calculated from IPC_I(C).
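A minimal sketch of this "capacity summation" for two parallel binary erasure channels (the erasure ratios are illustrative assumptions):

    def bec_capacity(p):
        # capacity of a binary erasure channel with erasure ratio p
        return 1.0 - p

    def c_sum_bec(c1, c2):
        # C_sum = 1 - (1 - C1)(1 - C2): a bit survives unless erased twice
        return 1.0 - (1.0 - c1) * (1.0 - c2)

    c1, c2 = bec_capacity(0.4), bec_capacity(0.3)
    print(c_sum_bec(c1, c2))    # -> 0.88, from the component capacities alone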
8. DECODING ALGORITHMS AND ARCHITECTURES

The decoding process is inherently iterative. One is able to decode the two encoded streams with an iterative process using two soft-in soft-out decoders, one corresponding to each of the encoders. The most used decoding algorithm is the MAP decoder. This is the reason why the convolutional codes are used in the recursive systematic form.

Figure 1: Encoder for turbo codes (the data enter the first component encoder directly as Input 1 and the second one through the interleaver as Input 2, producing Output 1 and Output 2).

Figure 2: Decoder for turbo codes (two SISO decoders exchange the extrinsic information E1 and E2 through the interleaver and de-interleaver).

The decoding process must perform an information exchange between the decoders. The output is the log-likelihood ratio for the information bits i_t:

\Lambda(i_t) = \log \frac{\Pr\{i_t = 1\}}{\Pr\{i_t = 0\}} = \log \frac{\Pr_{apriori}\{i_t = 1\}}{\Pr_{apriori}\{i_t = 0\}} + \Lambda'(i_t),

that is,

\Lambda(i_t) = \Lambda_{apriori}(i_t) + \Lambda'(i_t).

The output from the first decoding in decoder D1 can be used directly as the a priori probabilities of decoder D2. On the other hand, if we return to decoder D1 for the second iteration, we cannot use the output from decoder D2 directly, since part of it came from decoder D1 in the first place; this part is mainly \Lambda_{apriori}. We must therefore subtract the \Lambda_{apriori} value before the results are used in decoder D1. However, there is still some positive feedback, since \Lambda' depends on the a priori probabilities for the surrounding information bits.

Due to this, we cannot assure that the iterative decoding algorithm will converge; the a priori probabilities must be thoroughly analysed.
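The information exchange described above can be sketched as follows (Python with NumPy). The component SISO decoders are abstracted as callables returning the full log-likelihood ratios Λ = Λ_apriori + Λ'; the function names and the fixed iteration count are assumptions for illustration, not the paper's implementation.

    import numpy as np

    def turbo_decode(siso1, siso2, chan_llr, perm, n_iter=8):
        # Exchange only the extrinsic part L' between decoders D1 and D2.
        inv_perm = np.argsort(perm)                  # de-interleaver
        apriori1 = np.zeros_like(chan_llr)           # no prior knowledge at start
        for _ in range(n_iter):
            full1 = siso1(chan_llr, apriori1)        # D1: natural order
            ext1 = full1 - apriori1                  # subtract a priori part
            apriori2 = ext1[perm]                    # interleave for D2
            full2 = siso2(chan_llr[perm], apriori2)  # D2: permuted order
            ext2 = full2 - apriori2                  # subtract again for D1
            apriori1 = ext2[inv_perm]                # de-interleave
        return full2[inv_perm]                       # sign gives the bit decisions

Subtracting the a priori term before each exchange is exactly the step derived above; without it, decoder D1 would receive its own information back and the positive feedback would be much stronger.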
9. INTERLEAVING AND DE-INTERLEAVING

The turbo coding scheme usually has a few code words of low weight. Pseudo-random interleaving enhances the performance in this situation, and the code system can then be bounded quite accurately. Special coding procedures are designed to fight against burst errors, interleaving being one of them. This method consists of reordering the symbols before the parallel concatenation of codes. The receiver performs the inverse operation, called de-interleaving. If the interleaving depth is large, we may treat errors on the output as independent.

The simplest interleaver separates the symbols of the code sequence by a fixed number of positions. It can be implemented as a memory array: the symbols are written into the array by rows and read out by columns. If the array has j columns, the adjacent symbols of the same code sequence are separated by j - 1 symbols on the interleaver output.
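A minimal sketch of this memory-array interleaver in Python with NumPy (the array dimensions are illustrative; in this write-by-rows, read-by-columns orientation, adjacent symbols leave the interleaver separated by the number of rows):

    import numpy as np

    def block_interleave(symbols, rows, cols):
        # write by rows, read by columns; len(symbols) == rows * cols
        return np.asarray(symbols).reshape(rows, cols).T.ravel()

    def block_deinterleave(symbols, rows, cols):
        # inverse operation performed at the receiver
        return np.asarray(symbols).reshape(cols, rows).T.ravel()

    seq = np.arange(12)                       # toy code sequence 0..11
    ilv = block_interleave(seq, rows=3, cols=4)
    print(ilv)                                # [0 4 8 1 5 9 2 6 10 3 7 11]
    assert np.array_equal(block_deinterleave(ilv, rows=3, cols=4), seq)
    # a burst of adjacent channel errors is spread over distant code positions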
Indeed, in this case the probability of some combination e_1, e_2, ..., e_n of errors in the block with interleaved positions is equal to

P_j(e_1, e_2, \ldots, e_n) = \pi \prod_{i=1}^{n} \left( P(e_i) \cdot P^{j} \right) \mathbf{1}',

where \pi is the stationary probability vector of the channel state, P(e_i) is the matrix transition probability of the error e_i, and \mathbf{1}' is a column vector of ones. Due to the Markov chain regularity,

\lim_{j \to \infty} P^{j} = \mathbf{1}' \pi,

therefore P^{j} is approximately equal to \mathbf{1}' \pi for large values of j. Asymptotically, the relationship takes the form

P_{\infty}(e_1, e_2, \ldots, e_n) = \prod_{i=1}^{n} p(e_i),

and when the interleaving depth tends to infinity, the interleaved errors are statistically independent. However, in order to achieve better performance, the interleaving depth might have to be large, thus requiring a large capacity of the encoder and decoder memory as well as a large delay in delivering the information to the user. It is possible to improve the interleaving efficiency by using some sophisticated interleaving techniques.
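The convergence of P^j to the rank-one matrix 1'π used above can be checked numerically; the following sketch assumes an illustrative two-state (good/bad) burst-error channel:

    import numpy as np

    P = np.array([[0.95, 0.05],    # good state -> good/bad
                  [0.30, 0.70]])   # bad state  -> good/bad

    # stationary vector pi solves pi P = pi with sum(pi) = 1
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmax(np.real(w))])
    pi /= pi.sum()

    for j in (1, 5, 50):
        Pj = np.linalg.matrix_power(P, j)
        gap = np.max(np.abs(Pj - np.outer(np.ones(2), pi)))
        print(j, gap)              # the gap shrinks geometrically with j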
The error level depends on the specific interleaver, and it can be lowered by improving the interleaver construction instead of using pseudo-random interleavers. Many of the codes that are actually implemented are not quite effective in an actual channel due to the grouping of errors.

The interleaving may be treated as a particular case of coding with zero redundancy. For instance, the product code of an (n_1, k_1)-code and an (n_2, k_2)-code represents the (n_2, k_2)-code with canonical interleaving of depth n_1.

The simplest choice of interleaving is the block interleaver, where the information is written by rows and read by columns. However, two input words of low weight would give some very unfortunate patterns in this interleaver. The pseudo-random interleaver is the best solution of the block interleaver, and pseudo-random interleaving is standard for turbo codes. In spite of this, it is possible to find interleavers that are slightly better than the pseudo-random ones. If we consider the interleaver construction used in [5], we can construct a block interleaver with permutations of each row. These interleavers have a large number of the most critical patterns with regard to a specific code. With this approach we have found an interleaver for which we have a tight upper bound on the minimum distance of the complete turbo code system.

10. CONCLUSION

We must conclude that the final result after turbo decoding is a sub-optimal decoding due to the loops in the decision process. For low signal-to-noise ratios we may even see that the decoding does not converge to anything close to the transmitted word. The results are applied to examples of parallel, serial, and hybrid concatenation of codes, and to self-concatenated codes, over AWGN and Rayleigh fading channels. Design rules for concatenated codes with interleavers over AWGN and Rayleigh fading channels will be presented.

11. ACKNOWLEDGEMENTS

Support for the GACR project (No. 102/01/1531) is gratefully acknowledged.

LITERATURE

[1] Turin, W.: Digital Transmission Systems. McGraw-Hill (1999), ISBN 0-07-065534-0.
[2] Divsalar, D., Pollara, F.: Serial and Hybrid Concatenated Codes with Applications. Jet Propulsion Lab., under contract with NASA.
[3] Huettinger, S., Huber, J., Johannesson, R., Fischer, R.: Information Processing in Soft-Output Decoding. Proc. of the 39th Annual Allerton Conf. (Oct. 2001).
[4] Andersen, J. D.: Selection of Component Codes for Turbo Coding Based on Convergence Properties. http://www.tele.dtu.dkHda
[5] Berrou, C., Glavieux, A.: Near Optimum Error Correcting Coding and Decoding: Turbo-codes. IEEE Transactions on Communications, vol. 44, no. 10 (Oct. 1996).
[6] Viterbi, A. J.: Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm. IEEE Trans. Inf. Theory, vol. IT-13 (1967).
[7] Vlcek, K., et al.: Turbo Coding Performance and Implementation. Proc. of the IEEE DDECS 2000 Workshop (5-7 April 2000, Smolenice, Slovakia), ISBN 80-968320-3-4, p. 71.
[8] Vlcek, K.: Compression and Error Control Coding in Multimedia Communication. BEN Technicka literatura, Prague (2000), ISBN 80-86056-68-6.
[9] Vlcek, K.: Turbo Codes and Radio-data Transmission. Proc. of the IFAC PDS 2001 Workshop, Elsevier Sci. Ltd., pp. 22-29.
[10] Vlcek, K.: Turbo Code SISO Iterative Decoder Implementations. In: Socrates Workshop 2002 Proceedings, Intensive Training Programme in Electronic System Design (Chania, Crete, Greece, 2-9 November 2002), Musil, V. (ed.), Novotny, Brno 2002, ISBN 80-214-2217-3, pp. 117-120.
[11] Vlcek, K.: Iterative Decoding for Data Transmission and Storage Devices. Proc. of the 25th Internat. Conf. TD 2002-DIAGON 2002 (21-22 May 2002, Academia Centre, Tomas Bata University, Zlin, Czech Republic), ISBN 80-7318-076-6, pp. 150-156.