Robust differential modulations for asynchronous cooperative systems


Signal Processing 105 (2014) 30–42


Signal Processing journal homepage: www.elsevier.com/locate/sigpro

Alfonso Cano, Eduardo Morgado, Javier Ramos, Antonio J. Caamaño*

Department of Signal Theory and Communications, Universidad Rey Juan Carlos, Camino del Molino s/n, 28943 Fuenlabrada, Madrid, Spain

Article info

Article history: Received 2 December 2013; received in revised form 9 May 2014; accepted 18 May 2014; available online 26 May 2014

Keywords: Cooperative communications; Carrier frequency offset; Synchronization; OFDM; Differential modulation; Decode-and-forward

Abstract

Accurate channel state information and time/frequency synchronization are challenging to acquire in mobile ad hoc cooperative set-ups, where resource-constrained relaying terminals rapidly (dis)join cooperation. This paper derives low-complexity differential modulations and cooperative transmission schemes so that detection at the receiver is feasible without channel knowledge or synchronization information. A simple distributed doubly-differential (DD) time-division multiplexing (TDM) scheme using diagonal space–time (ST) unitary codes is derived to bypass multiple carrier frequency offsets (CFOs) and collect space diversity. Precoded differential orthogonal frequency division multiplexing (OFDM) transmissions are derived to additionally bypass timing offsets and collect multipath diversity using either DD diagonal or single-differential (SD) orthogonal ST block codes (OSTBCs). DD diagonal mappings suffer from considerable coding gain and signal-to-noise ratio (SNR) loss due to unitary constellation constraints and noise enhancement at the receiver. SD-OSTBCs achieve higher performance with reduced decoding complexity, at the cost of reduced CFO mitigation range. Simple and robust non-coherent selective and adaptive transmissions are also included, showing that full space and multipath diversity can still be achieved even with decoding errors at relays. Simulations corroborate theoretical claims. © 2014 Elsevier B.V. All rights reserved.

☆ This work was supported by the Spanish Ministry of Science and Innovation (MICINN) through project TEC2010-19263. The work of the first author was also supported by the Spanish Science and Technology Foundation (FECYT) Postdoctoral Grant (2007-0790).
* Corresponding author. Tel.: +34914887248; fax: +34914887500.

http://dx.doi.org/10.1016/j.sigpro.2014.05.023 0165-1684/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Cooperative schemes using distributed single-antenna relays have been shown to be an effective alternative to enable spatial diversity along with resilience against shadowing and coverage enhancement [1,2]. These schemes typically operate under the assumption that cooperating terminals are perfectly synchronized in both time and frequency and that the receiver has exact channel state information from all cooperating paths. However, this might be difficult to guarantee in uncoordinated mobile ad hoc set-ups, where coherence time is shorter and resource-limited terminals are unable to efficiently mitigate local oscillator drifts [3,4]. Other schemes operate with non-synchronized terminals but under the assumption that both channel and delay state information from all cooperating paths are available at the destination node [5]. Without perfect synchronization or accurate channel knowledge, existing coherent cooperative schemes suffer from considerable performance degradation. This is particularly critical for schemes using distributed orthogonal space–time block codes (OSTBCs) in which all relays transmit simultaneously, requiring accurate synchronization at the symbol level [2].

As already recognized in [4,6], orthogonal frequency division multiplexing (OFDM) transmissions can effectively combat time synchronization mismatches by treating them as multipath and increasing the cyclic prefix (CP) length accordingly. Distributed versions of co-located space–time–frequency codes can then be employed to collect space and multipath diversity [4]. However, if carrier frequency offset (CFO) is also present, OFDM loses its orthogonality. The presence of CFO can be caused by (i) relative movement of terminals, which causes Doppler shifts, and (ii) mismatches between local oscillators at relaying and destination terminals. Compared to co-located multi-antenna systems, CFO is particularly challenging in distributed set-ups since different relays (antennas) experience different CFOs. Existing works have focused on algorithms to estimate or mitigate CFO effects via training [7–9], but assume perfect time synchronization. The schemes in [6,10,11] deal with both time and frequency mismatches via equalization assuming single-carrier [10,11] or OFDM transmissions [6], with known channel fading and assuming no relaying errors. The scheme in [12] deals with both time and frequency mismatches in OFDM-based cooperative systems, but it estimates the channel and CFO via training, whereas our objective is to bypass both channel and CFO estimation at the receiver.

This paper considers a different approach, designing specific modulations at the relays so that low-complexity detection at the receiver is feasible without channel knowledge or synchronization information. Differential modulations are employed, which define a recursion at the transmitter such that detection at the receiver can be accomplished using previously-received symbols.
Differential modulations have distinctive implementation advantages in low-complexity (distributed) dynamic environments [3]. Single-differential (SD) schemes are able to bypass channel knowledge but fail if severe CFO is present. Double-differential (DD) schemes are able to bypass both channel and CFO at the cost of reduced signal-to-noise ratio (SNR) at the receiver side. Multi-antenna (OFDM) versions of SD and DD recursions for point-to-point links (same CFO for all transmitting antennas) are available in the literature [13–16], showing that differential recursions are able not only to bypass channel and CFO knowledge but also to achieve space (and multipath) diversity. We must point out that the coding schemes developed in this paper deal with cooperative systems with different CFOs among relays. The problem of different CFOs is specific to cooperative scenarios: coding schemes previously proposed for non-cooperative systems, where there is only one CFO between source and destination nodes, are not directly applicable to cooperative systems with multiple relays. In this paper, we design the following specific modulations:

• Assuming relays transmit frames using time-division multiplexing (TDM), a DD-TDM scheme is developed that deals with multiple CFOs among relays. Space diversity can be collected by suitable space–time (ST) mapping and interleaving across relays. Compared to existing distributed DD designs, this simple DD-TDM scheme allows for distinct CFOs from different cooperating paths [17] and does not require repetition coding [18,19].

• When both CFO and timing/multipath effects are present, a DD-TDM OFDM scheme is derived. Multipath and space diversity can be collected using coding across both relays and OFDM blocks, whereas orthogonality among OFDM subcarriers is preserved using a suitable block precoding. The resulting scheme can be seen as a distributed extension of the one in [16], designed for point-to-point links.

• For long multipath spread or an increased number of relays, TDM-based schemes suffer from considerable coding gain loss due to diagonal unitary constellation constraints and reduced SNR at the receiver. For that reason a simple distributed SD-OSTBC OFDM scheme is also derived. Relays simultaneously transmit signals that are differentially encoded and transmitted within OFDM symbols. Different from ST–frequency systems in which differentially-encoded blocks are transmitted in different OFDM transmissions [13], the SD-OSTBC OFDM scheme derived here has differentially-encoded blocks closer to each other and thus is more robust to CFO effects. This scheme also features low-complexity decoding and collects both multipath and space diversity.

To deal with decoding errors at relays, simple and practically-appealing decode-and-forward (DF) protocols are derived. Analog amplify-and-forward based schemes, as in [20], require analog signal processing and storage at relays because these relaying terminals amplify the received signal and retransmit it, whereas in DF protocols relaying terminals decode frames prior to re-encoding and retransmitting them to the destination. Two DF schemes are considered: (i) select-and-forward (SF) [1,2], whereby relays only transmit frames if correctly decoded; and (ii) link-adaptive-relaying (LAR) [2,18], whereby relays weigh the power of the transmitted frame according to the instantaneous power of the received signal. SF and LAR protocols were originally developed for coherent frequency-flat cooperative schemes. They are extended here to non-coherent transmissions under multipath channels. Performance analysis and simulated tests will show that both LAR and SF protocols are robust against intermediate decoding errors and enable the maximum possible diversity available in the distributed set-up.

This work extends our results obtained in [21], where we explored the use of DD modulations in cooperative systems to obtain low-complexity detection at the receiver, bypassing channel and CFO estimation. In the present work, we consider not only different CFOs among relays, but also timing/multipath effects, again without channel knowledge or synchronization information.

The rest of the paper is organized as follows. Section 2 presents the general system model, including CFO, timing offsets and fading effects. Section 3 considers the case when only CFO is present, derives the DD-TDM scheme and analyzes its performance in the error-free case. Section 4 considers the case when both CFO and timing/multipath effects are present, and derives the main novel contributions of the present paper: two distributed differential OFDM modulations that collect both space and multipath diversity in the error-free case. The DD-TDM OFDM scheme, derived in Section 4.1, is robust to any CFO range. The SD-OSTBC OFDM scheme, derived in Section 4.2, achieves higher coding gain and features simpler decoding complexity, at the expense of limited CFO mitigation range. Section 5 presents cooperative protocols to deal with decoding errors at relaying nodes, and further analyzes their performance to prove that, also in the non-error-free case, our two differential OFDM schemes keep collecting full diversity. Simulated results are provided in Section 6, and Section 7 concludes the paper.

Notation: Upper (lower) bold face letters will be used for matrices (column vectors); calligraphic letters will be used for sets; $(\cdot)^T$, $(\cdot)^*$ and $(\cdot)^H$ are the transpose, conjugate and conjugate transpose (Hermitian) of a vector or matrix, respectively; $[\cdot]_k$ is the $k$th entry of a vector; $\otimes$ denotes the Kronecker product; $\mathbf{I}_N$ denotes the $N \times N$ identity matrix; $\mathbf{1}_N$ ($\mathbf{1}_{N \times M}$) is the $N \times 1$ ($N \times M$) all-one vector (matrix); $\mathbf{0}_N$ ($\mathbf{0}_{N \times M}$) is the $N \times 1$ ($N \times M$) all-zero vector (matrix); $\mathrm{diag}(\mathbf{X}_1, \ldots, \mathbf{X}_N)$ is a matrix with matrices $\mathbf{X}_1, \ldots, \mathbf{X}_N$ on its diagonal; $\mathbf{D}_{\mathbf{x}}$ is a diagonal matrix with the elements of vector $\mathbf{x}$ on its diagonal; $\|\cdot\|$ is the Frobenius norm; $|\cdot|$ is the cardinality of a set; $\mathbf{F}_N$ is the $N \times N$ Fast Fourier Transform (FFT) matrix with $[\mathbf{F}_N]_{k+1,n+1} := N^{-1/2} \exp(-j2\pi nk/N)$; and $\mathcal{CN}(\mu, \sigma^2)$ denotes the complex Gaussian distribution with mean $\mu$ and variance $\sigma^2$.

2. System model and problem statement

With reference to Fig. 1, consider a set of $R$ relaying terminals $\{T_r\}_{r=1}^{R}$ collaborating to transmit a block of symbols $\mathbf{s}$ to an access point or destination (D). Each element of this block belongs to a set $\mathcal{A}_s$ with cardinality $|\mathcal{A}_s|$ and thus transports $\log_2 |\mathcal{A}_s|$ information bits. Vector $\mathbf{s}$ can be made available to $\{T_r\}_{r=1}^{R}$ through a previous broadcast transmission by a source S (as in Fig. 1) or through successive broadcast phases among relaying terminals as in [2]. For simplicity, it will be assumed that block $\mathbf{s}$ is the same at all terminals; i.e., no transmission error occurred during the broadcasting phase. Section 5 will develop and analyze cooperation protocols for the case in which $\{T_r\}_{r=1}^{R}$ have different decoded blocks $\{\tilde{\mathbf{s}}_r\}_{r=1}^{R}$. Block $\mathbf{s}$ is encoded at each relay $T_r$ and cooperatively sent to the destination. The $n$th sample at D, namely $y(n)$, is the noisy output of the fading-and-CFO equivalent channel, with input being the transmitted symbol $t_r(n)$ per relay $T_r$; i.e.,

$$y(n) = \sum_{r=1}^{R} e^{j\omega_r n} \sum_{\ell=0}^{L-1} h_{r,\ell}\, t_r(n-\ell) + z(n) \tag{1}$$

where $\omega_r$ and $h_{r,\ell}$ are the normalized CFO and the equivalent channel for link $T_r$–D respectively, whereas $z(n)$ is the channel noise. The CFO $\omega_r$ is given by $\omega_r = 2\pi T_s f_r$, with $T_s$ being the sampling period and $f_r$ being the physical frequency offset in Hertz. The CFO is different across terminals, is assumed to remain constant for the duration of the transmission, and is modeled as uniform within the interval $(-1/(2T_s), 1/(2T_s)]$. The channel $h_{r,0}, \ldots, h_{r,L-1}$ is the effective impulse response per relay $T_r$, with $L = \lceil (\tau_c^{\max} + \tau_s^{\max})/T_s \rceil + 1$ including both the maximum channel delay spread ($\tau_c^{\max}$) and the maximum sampling position error ($\tau_s^{\max}$). Note that in general $h_{r,\ell}$ might be correlated across index $\ell$. However, to simplify performance analysis, $h_{r,\ell}$ will be assumed independent and identically distributed (i.i.d.) zero-mean complex Gaussian; i.e., $h_{r,\ell} \sim \mathcal{CN}(0, \sigma_{r,\ell}^2 \gamma)$, with $\gamma_r := \gamma \sum_{\ell=0}^{L-1} \sigma_{r,\ell}^2$ the average SNR of link $T_r$–D. Simulated tests in Section 6 will consider different $T_r$–D link profiles. The noise term $z(n)$ is modeled as white Gaussian, $z(n) \sim \mathcal{CN}(0, 1)\ \forall n$.

Note that the model in (1) is similar to that of [22], where a basis expansion model is employed to model Doppler shifts. Note also, however, that here the frequency of the complex exponentials (basis) is unknown. If it were known, only the fading coefficients would need to be bypassed, and so simple SD schemes developed for time-varying (TV) channels could be applied. The challenge here is to design modulation strategies at $\{T_r\}_{r=1}^{R}$ such that detection at D can be accomplished without knowledge of either $h_{r,\ell}$ or $\omega_r\ \forall r$ while at the same time exploiting the maximum degrees of freedom that the $R$ independent fades enable. In other words, the modulation scheme should also be designed so as to achieve the maximum spatial diversity enabled by the distributed set-up. The next section deals with a simpler model in which multipath effects are not considered.
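As a rough numerical companion to the I/O model in (1), the following sketch evaluates the double sum for a handful of samples. It is only an illustration under stated assumptions: the function name `received_samples`, the 0-based time index, and the toy values are hypothetical and not taken from the paper.

```python
import cmath

def received_samples(t, h, omega, noise=None):
    """Evaluate the I/O model (1) for n = 0, ..., N-1 (0-based index, an assumption).

    t[r][n]  : samples transmitted by relay r (taken as zero before n = 0)
    h[r][l]  : L-tap effective channel of link T_r - D
    omega[r] : normalized CFO of relay r, in radians per sample
    noise[n] : optional noise samples z(n); None means noiseless
    """
    num_relays, num_samples = len(t), len(t[0])
    y = []
    for n in range(num_samples):
        acc = noise[n] if noise is not None else 0j
        for r in range(num_relays):
            # inner sum over channel taps: sum_l h_{r,l} t_r(n - l)
            conv = sum(h[r][l] * t[r][n - l]
                       for l in range(len(h[r])) if n - l >= 0)
            # each relay contributes with its own rotating phase e^{j w_r n}
            acc += cmath.exp(1j * omega[r] * n) * conv
        y.append(acc)
    return y

# Two relays send the same constant symbol over flat channels; the distinct
# CFOs alone make the superposition at D change from sample to sample.
y_demo = received_samples(t=[[1, 1, 1, 1], [1, 1, 1, 1]],
                          h=[[1.0], [1.0]],
                          omega=[0.0, cmath.pi / 2])
```

In this toy case the two contributions add constructively at one sample and cancel two samples later, which illustrates why simultaneous transmissions with unknown, distinct CFOs cannot be handled as a single static channel.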
It will be shown that, when multipath effects are not considered, diagonal space–time matrices can robustly bypass CFO and channel knowledge. The scheme described below will serve as a basis for the most general I/O model described in (1).

Fig. 1. R-relay cooperation network.

3. DD-TDM transmissions under multiple CFOs

In the present section and in Section 4 we focus on the relaying phase under the assumption that symbols $\mathbf{s}$ are available error-free at all relaying terminals. The baseband system model is depicted in Fig. 2. We will start from the outer (DD encoder) to the inner (parallel-to-serial (P/S) conversion) stages at the transmitters' side, and proceed through the channel to the inner (serial-to-parallel (S/P) conversion) and outer (DD decoder) stages at the receiver.

Fig. 2. DD-TDM system model.

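3.1. DD recursion and interleaving

Assume that the block $\mathbf{s}$ at the input of the DD modulator is of length $K-2$ information symbols. Each symbol in $\mathbf{s}$ is mapped to a unitary diagonal matrix $\mathbf{D}_{\mathbf{v}_k}$, with $k = 3, \ldots, K$, of size $R \times R$, picked from a constellation $\mathcal{V}$ with $|\mathcal{V}| = |\mathcal{A}_s|$ as in [14]. The constellation $\mathcal{V}$ critically affects the system performance, as will be shown at the end of this section. The scenario considered in [14] is non-distributed, with multiple antennas at both transmitter and receiver; we therefore suggest applying the constellations designed there directly to the distributed scenario, where the multiple transmit antennas are replaced by the relays and the destination node has a single antenna. Matrix $\mathbf{D}_{\mathbf{v}_k}$ is used to yield $K$ DD modulated blocks $\mathbf{x}_k$ according to the recursions

$$\mathbf{g}_k = \begin{cases} \mathbf{D}_{\mathbf{v}_k} \mathbf{g}_{k-1}, & k = 3, \ldots, K \\ \mathbf{1}_R, & k = 2 \end{cases} \tag{2a}$$

and

$$\mathbf{x}_k = \begin{cases} \mathbf{D}_{\mathbf{g}_k} \mathbf{x}_{k-1}, & k = 3, \ldots, K \\ \mathbf{1}_R, & k = 1, 2. \end{cases} \tag{2b}$$

Similar recursions are also used in [14,23]. Recall that, in our notation, $\mathbf{D}_{\mathbf{g}_k}$ is a diagonal matrix with the elements of vector $\mathbf{g}_k$ on its diagonal. Note that $\mathbf{x}_k$ ranges over $k = 1, \ldots, K$, whereas $\mathbf{D}_{\mathbf{v}_k}$ ranges over $k = 3, \ldots, K$, as it transports the $K-2$ symbols in $\mathbf{s}$. Elements of $\mathbf{x}_k$ are transmitted by different relays so as to enable spatial diversity. In practice, however, transmission is arranged in frames (or blocks) and so interleaving becomes necessary, as shown next.

Symbols $\mathbf{x}_k$ are concatenated to build a block $\mathbf{x} := [\mathbf{x}_1^T, \ldots, \mathbf{x}_K^T]^T$ of size $KR \times 1$ that is block-interleaved using a matrix $\boldsymbol{\Theta}_K$ to form $\mathbf{t} := \boldsymbol{\Theta}_K \mathbf{x}$. The $KR \times KR$ matrix $\boldsymbol{\Theta}_K$ is defined so that $[\mathbf{t}]_{(r-1)K+k} = [\mathbf{x}]_{(k-1)R+r}$ and can be compactly written as

$$\boldsymbol{\Theta}_K := [\mathbf{I}_R \otimes \mathbf{e}_1, \ldots, \mathbf{I}_R \otimes \mathbf{e}_K] \tag{3}$$

with $\mathbf{e}_k$ being the $k$th column of $\mathbf{I}_K$. The interleaved block is $\mathbf{t} := [\mathbf{t}_1^T, \ldots, \mathbf{t}_R^T]^T$. Relay $T_r$ transmits the subblock $\mathbf{t}_r$ during the $r$th time slot. Thus, after P/S conversion, the $n$th sample of the transmitted signal $t_r(n)$ contains the entries of $\mathbf{t}_r$ during time slot $(r-1)K+1, \ldots, rK$ and is zero outside that interval. The duration of the entire transmission is $N = KR$.

3.2. DD receiver and performance analysis

In the absence of multipath effects, the I/O relationship in (1) reduces to

$$y(n) = \sum_{r=1}^{R} e^{j\omega_r n} h_r t_r(n) + z(n), \quad n = 1, \ldots, RK. \tag{4}$$

Let the received block at D after S/P conversion be denoted $\tilde{\mathbf{y}}$. This block can be partitioned into $R$ subblocks as $\tilde{\mathbf{y}} := [\tilde{\mathbf{y}}_1^T, \ldots, \tilde{\mathbf{y}}_R^T]^T$, each corresponding to a different relay, with

$$\tilde{\mathbf{y}}_r = h_r e^{j\omega_r (r-1)K} \mathbf{D}_{\omega_r} \mathbf{t}_r + \tilde{\mathbf{z}}_r \tag{5}$$

where $\mathbf{D}_{\omega_r} := \mathrm{diag}(e^{j\omega_r}, \ldots, e^{j\omega_r K})$ and $\tilde{\mathbf{z}}_r$ is the noise term. With reference to Fig. 2, the $KR \times 1$ block $\tilde{\mathbf{y}}$ is passed through the de-interleaving matrix $\boldsymbol{\Theta}_K^T$ to form $\mathbf{y} := \boldsymbol{\Theta}_K^T \tilde{\mathbf{y}} := [\mathbf{y}_1^T, \ldots, \mathbf{y}_K^T]^T$. Each $R \times 1$ subblock $\mathbf{y}_k$ is given by

$$\mathbf{y}_k = \mathbf{D}_h \mathbf{D}_\omega \mathbf{D}_{\omega_k} \mathbf{x}_k + \mathbf{z}_k \tag{6}$$

with $[\mathbf{D}_h]_{r,r} := h_r$, $[\mathbf{D}_\omega]_{r,r} := e^{j\omega_r (r-1)K}$ and $[\mathbf{D}_{\omega_k}]_{r,r} := e^{j\omega_r k}$ accounting for channel and CFO. The noise vector $\mathbf{z}_k$ remains white Gaussian with the same power, as permutations do not affect its distribution. Observe that $\mathbf{D}_h$ and $\mathbf{D}_\omega$ are independent of $k$, whereas matrix $\mathbf{D}_{\omega_k}$ changes with $k$. Consider three consecutive received blocks $\mathbf{y}_k$, $\mathbf{y}_{k-1}$, $\mathbf{y}_{k-2}$, expressed as [cf. (2) and (6)]

$$\mathbf{y}_k = \mathbf{D}_h \mathbf{D}_\omega \mathbf{D}_{\omega_{k-1}} \mathbf{D}_{\omega_1} \mathbf{D}_{\mathbf{g}_{k-1}}^2 \mathbf{D}_{\mathbf{v}_k} \mathbf{x}_{k-2} + \mathbf{z}_k \tag{7a}$$

$$\mathbf{y}_{k-1} = \mathbf{D}_h \mathbf{D}_\omega \mathbf{D}_{\omega_{k-1}} \mathbf{D}_{\mathbf{g}_{k-1}} \mathbf{x}_{k-2} + \mathbf{z}_{k-1} \tag{7b}$$

$$\mathbf{y}_{k-2} = \mathbf{D}_h \mathbf{D}_\omega \mathbf{D}_{\omega_{k-1}} \mathbf{D}_{\omega_{-1}} \mathbf{x}_{k-2} + \mathbf{z}_{k-2}. \tag{7c}$$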
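The recursions in (2a)-(2b) and the block interleaver in (3) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the small cyclic diagonal constellation (generator exponents 1 and 3 over 4 points) are assumptions chosen only to have unit-modulus diagonal entries.

```python
import cmath

def dd_encode(v_blocks, num_relays):
    """Doubly-differential recursions (2a)-(2b).

    v_blocks: K-2 diagonals (lists of R unit-modulus numbers) for D_{v_3}..D_{v_K}.
    Returns [x_1, ..., x_K], each a list of R complex entries.
    """
    g = [1 + 0j] * num_relays                            # g_2 = 1_R
    x = [[1 + 0j] * num_relays, [1 + 0j] * num_relays]   # x_1 = x_2 = 1_R
    for v in v_blocks:                                   # k = 3, ..., K
        g = [v[r] * g[r] for r in range(num_relays)]             # g_k = D_{v_k} g_{k-1}
        x.append([g[r] * x[-1][r] for r in range(num_relays)])   # x_k = D_{g_k} x_{k-1}
    return x

def interleave(x_blocks):
    """Block interleaver (3): [t]_{(r-1)K+k} = [x]_{(k-1)R+r}; returns [t_1, ..., t_R]."""
    K, R = len(x_blocks), len(x_blocks[0])
    return [[x_blocks[k][r] for k in range(K)] for r in range(R)]

# Toy example with R = 2 relays and K = 5 blocks (3 information symbols).
R = 2
# Hypothetical 4-point cyclic diagonal constellation (an assumption for illustration).
V = [[cmath.exp(2j * cmath.pi * u * q / 4) for u in (1, 3)] for q in range(4)]
symbols = [1, 3, 2]
x = dd_encode([V[q] for q in symbols], R)
t = interleave(x)   # t[r] is the subblock sent by relay r+1 in its own time slot
```

After interleaving, relay $T_r$ holds the $r$th entry of every $\mathbf{x}_k$, so each relay's slot carries one complete differential chain that experiences a single CFO.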

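Compared to [14], here the relative phase is expressed in terms of the diagonal matrix $\mathbf{D}_\omega$ instead of a scalar. As mentioned in [14] in the context of DD designs for co-located systems, the maximum-likelihood (ML) decoder for $\mathbf{D}_{\mathbf{v}_k}$ given $\mathbf{y}_k$, $\mathbf{y}_{k-1}$ and $\mathbf{y}_{k-2}$ in (7) may depend on the frequency offsets. To avoid this problem, a heuristic detector that decodes $\mathbf{D}_{\mathbf{v}_k}$ bypassing CFO knowledge, and whose performance is surprisingly close to that of the ML detector, is derived next. Note that, as shown in Appendix A,

$$\mathbf{D}_{\mathbf{y}_k} \mathbf{y}_{k-1}^* = \mathbf{D}_{\mathbf{v}_k} \mathbf{D}_{\mathbf{y}_{k-1}} \mathbf{y}_{k-2}^* + \mathbf{z}'_k, \tag{8}$$

where, recalling our notation, $\mathbf{D}_{\mathbf{y}_k}$ is a diagonal matrix with the elements of vector $\mathbf{y}_k$ on its diagonal, and

$$\mathbf{z}'_k = \mathbf{D}_{\mathbf{z}_k} \mathbf{y}_{k-1}^* + \mathbf{D}_{\mathbf{z}_{k-1}}^* \mathbf{y}_k + \mathbf{D}_{\mathbf{z}_{k-1}}^* \mathbf{z}_k - \mathbf{D}_{\mathbf{v}_k} \left[ \mathbf{D}_{\mathbf{z}_{k-1}} \mathbf{y}_{k-2}^* + \mathbf{D}_{\mathbf{z}_{k-2}}^* \mathbf{y}_{k-1} + \mathbf{D}_{\mathbf{z}_{k-2}}^* \mathbf{z}_{k-1} \right]. \tag{9}$$

Discarding high-order noise terms, $\mathbf{z}'_k$ can be approximated as Gaussian with covariance matrix $\boldsymbol{\Sigma}_k := E[\mathbf{z}'_k \mathbf{z}'^H_k] = \mathbf{D}_{\mathbf{y}_k} \mathbf{D}_{\mathbf{y}_k}^* + 2\mathbf{D}_{\mathbf{y}_{k-1}} \mathbf{D}_{\mathbf{y}_{k-1}}^* + \mathbf{D}_{\mathbf{y}_{k-2}} \mathbf{D}_{\mathbf{y}_{k-2}}^*$. This Gaussian approximation allows one to write a detector for $\mathbf{D}_{\mathbf{v}_k}$ from (7) as

$$\hat{\mathbf{D}}_{\mathbf{v}_k} = \arg\min_{\mathbf{D}_{\mathbf{v}}} \left\{ \left\| \boldsymbol{\Sigma}_k^{-1/2} \left( \mathbf{D}_{\mathbf{y}_k} \mathbf{y}_{k-1}^* - \mathbf{D}_{\mathbf{v}} \mathbf{D}_{\mathbf{y}_{k-1}} \mathbf{y}_{k-2}^* \right) \right\|^2 \right\} \tag{10}$$

$\forall k = 3, \ldots, K$. De-mapping $\hat{\mathbf{D}}_{\mathbf{v}_k}$, a decoded block $\hat{\mathbf{s}}$ is obtained at the destination. Note that this receiver only requires knowledge of the previously-received blocks $\mathbf{D}_{\mathbf{y}_k}$, $\mathbf{D}_{\mathbf{y}_{k-1}}$ and $\mathbf{D}_{\mathbf{y}_{k-2}}$, bypassing channel or CFO knowledge.

The average Pairwise Error Probability (PEP) will be used as the performance metric. The PEP is defined as the probability that the detector (10) confuses $\mathbf{s}$ for another $\hat{\mathbf{s}}$, averaged over all channel realizations, and is denoted as $\Pr(\mathbf{s} \to \hat{\mathbf{s}})$. We are interested in its diversity order, defined as the rate of decay on a logarithmic scale as a function of the SNR (in dB); see, e.g., [2,14]. The following proposition states the diversity order achieved by (10).

Proposition 1 (Diversity of error-free DD-TDM). The diversity order achieved by the detector in (10) assuming error-free relays is $R$; i.e., $\Pr(\mathbf{s} \to \hat{\mathbf{s}}) \le (G_c \gamma)^{-R}$.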
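To make the CFO-bypassing property of the detector in (10) concrete, the sketch below runs a noiseless end-to-end toy chain with a different, randomly drawn CFO and fading coefficient per relay. Everything specific here is an assumption for illustration: the function name `dd_detect`, the 4-point cyclic diagonal constellation, the random seed and the omission of noise.

```python
import cmath
import random

def dd_detect(y_k, y_k1, y_k2, constellation):
    """Heuristic detector (10): index of the D_v minimising the weighted distance."""
    best, best_cost = None, float("inf")
    for q, v in enumerate(constellation):
        cost = 0.0
        for r in range(len(y_k)):
            a = y_k[r] * y_k1[r].conjugate()    # entry r of D_{y_k} y*_{k-1}
            b = y_k1[r] * y_k2[r].conjugate()   # entry r of D_{y_{k-1}} y*_{k-2}
            # entry (r, r) of the diagonal covariance Sigma_k
            w = abs(y_k[r]) ** 2 + 2 * abs(y_k1[r]) ** 2 + abs(y_k2[r]) ** 2
            cost += abs(a - v[r] * b) ** 2 / w
        if cost < best_cost:
            best, best_cost = q, cost
    return best

random.seed(0)
R, K, Q = 2, 6, 4
# Hypothetical cyclic diagonal constellation; exponents coprime with Q keep D_v - D_v' full rank.
constellation = [[cmath.exp(2j * cmath.pi * u * q / Q) for u in (1, 3)] for q in range(Q)]
sent = [random.randrange(Q) for _ in range(K - 2)]

# DD encoder, recursions (2a)-(2b)
g = [1 + 0j] * R
x = [[1 + 0j] * R, [1 + 0j] * R]
for q in sent:
    g = [constellation[q][r] * g[r] for r in range(R)]
    x.append([g[r] * x[-1][r] for r in range(R)])

# Unknown fading and per-relay CFOs; noiseless version of (6) with 1-based k
h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(R)]
omega = [random.uniform(-cmath.pi, cmath.pi) for _ in range(R)]
y = [[h[r] * cmath.exp(1j * omega[r] * (r * K + k)) * x[k - 1][r] for r in range(R)]
     for k in range(1, K + 1)]

# Detection uses three consecutive received blocks only (no h, no omega)
decoded = [dd_detect(y[k], y[k - 1], y[k - 2], constellation) for k in range(2, K)]
```

In the noiseless case the products $y_k y_{k-1}^*$ and $y_{k-1} y_{k-2}^*$ share the same factor $|h_r|^2 e^{j\omega_r}$ per relay, so their ratio is exactly the transmitted diagonal entry and `decoded` matches `sent` regardless of the CFO values drawn.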


The coding gain coefficient $G_c = G_c(\sigma_1^2, \ldots, \sigma_R^2; \mathcal{V})$ absorbs different relative SNRs among $T_r$–D links, constellation distances and potential noise correlation effects, and will be analyzed via simulations in Section 6. The diagonal structure of the coding matrices effectively separates the different CFOs from the relays into distinct time slots and thus the resulting system in (10) is also diagonal. The diversity order depends on the rank of $\mathbf{D}_{\mathbf{v}_k} - \hat{\mathbf{D}}_{\mathbf{v}_k}$. If $\mathcal{V}$ is designed to guarantee that $\mathbf{D}_{\mathbf{v}_k} - \hat{\mathbf{D}}_{\mathbf{v}_k}$ is full rank, or equivalently $[\mathbf{D}_{\mathbf{v}_k}]_{r,r} - [\hat{\mathbf{D}}_{\mathbf{v}_k}]_{r,r} \neq 0\ \forall r$ and $\forall\, \mathbf{D}_{\mathbf{v}_k} \neq \hat{\mathbf{D}}_{\mathbf{v}_k}$ in $\mathcal{V}$, then the detector in (10) achieves the maximum possible diversity $R$. The details of the proof are similar to those in [14], where diagonal matrices are also employed, and thus are omitted here. Note however that when relays suffer from decoding errors the transmitted block is different from $\mathbf{D}_{\mathbf{v}_k}$, and thus full rank may be lost. Section 5 will deal with this scenario in detail. Before that, we extend the analysis of this section to OFDM transmissions.

4. Differential OFDM transmissions under multiple CFOs and timing/multipath effects

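In this section, two differential schemes are derived to cope with both CFO and timing/multipath effects using OFDM transmissions. The first one is based on differential encoding across OFDM blocks and can be seen as a distributed generalization of the scheme in [16] using diagonal ST codes as in the previous section. The second scheme differentially encodes symbols within OFDM blocks using OSTBCs and allows for simultaneous relay transmissions.

4.1. DD-TDM OFDM scheme: DD-TDM using OFDM

Assume the block of symbols $\mathbf{s}$ at $T_r$ is now of size $M(K-2)$; that is, it has $K-2$ subblocks of size $M$ each. The $m$th symbol of the $k$th subblock is mapped to an $R \times R$ unitary diagonal matrix $\mathbf{D}_{\mathbf{v}_{m,k}}$ and differentially encoded as in (2):

$$\mathbf{g}_{m,k} = \begin{cases} \mathbf{D}_{\mathbf{v}_{m,k}} \mathbf{g}_{m,k-1}, & k = 3, \ldots, K \\ \mathbf{1}_R, & k = 2 \end{cases} \tag{11a}$$

$$\mathbf{x}_{m,k} = \begin{cases} \mathbf{D}_{\mathbf{g}_{m,k}} \mathbf{x}_{m,k-1}, & k = 3, \ldots, K \\ \mathbf{1}_R, & k = 1, 2. \end{cases} \tag{11b}$$

The block $\mathbf{x}_k := [\mathbf{x}_{1,k}^T, \ldots, \mathbf{x}_{M,k}^T]^T$ is first permuted as $\tilde{\mathbf{x}}_k := \boldsymbol{\Theta}_M \mathbf{x}_k := [\tilde{\mathbf{x}}_{1,k}^T, \ldots, \tilde{\mathbf{x}}_{R,k}^T]^T$, with $\tilde{\mathbf{x}}_{r,k}$ of size $M \times 1$; see Fig. 3. Different from the previous section, $\tilde{\mathbf{x}}_{r,k}$ is now precoded prior to transmission as

$$\bar{\mathbf{x}}_{r,k} := (\mathbf{1}_L \otimes \mathbf{F}_M) \tilde{\mathbf{x}}_{r,k} \tag{12}$$

where $\mathbf{F}_M$ is an $M \times M$ FFT matrix. Note that the precoding in (12) amounts to repeating $\mathbf{F}_M \tilde{\mathbf{x}}_{r,k}$ $L$ times. This will be instrumental to diagonalize the resulting CFO-plus-multipath channel. The $ML \times 1$ block $\bar{\mathbf{x}}_{r,k}$ in (12) is transmitted using OFDM. Mathematically, this is expressed using an IFFT matrix followed by an $L$-length CP-inserting matrix $\mathbf{T}_{ML}$. The resulting OFDM block $\mathbf{t}_{r,k}$ is thus given by

$$\mathbf{t}_{r,k} := \mathbf{T}_{ML} \mathbf{F}_{ML}^H \bar{\mathbf{x}}_{r,k} \tag{13}$$

where $\mathbf{T}_{ML} = [\tilde{\mathbf{T}}_{ML}, \mathbf{I}_{ML}]^T$ and $\tilde{\mathbf{T}}_{ML} = [\mathbf{0}_{L \times (M-1)L}, \mathbf{I}_L]^T$. Fig. 3 shows the overall scheme. Subblocks $\mathbf{t}_{r,k}$ are concatenated across index $k$ to form the block $\mathbf{t}_r$ of length $(M+1)LK$ to be transmitted by relay $r$ through the channel in (1). Fig. 4 shows the resulting frame structure.

Fig. 3. DD-TDM OFDM modulation scheme.

Fig. 4. DD-TDM OFDM transmitted frame.

Upon CP removal and S/P conversion the received signal is $\tilde{\mathbf{y}} := [\tilde{\mathbf{y}}_{1,1}^T, \ldots, \tilde{\mathbf{y}}_{1,K}^T, \ldots, \tilde{\mathbf{y}}_{R,K}^T]^T$, with each $ML \times 1$ subblock $\tilde{\mathbf{y}}_{r,k}$ given by

$$\tilde{\mathbf{y}}_{r,k} = e^{j\omega_r I_{r,k}} \mathbf{D}_{\omega_r} \mathbf{H}_r \mathbf{F}_{ML}^H \bar{\mathbf{x}}_{r,k} + \tilde{\mathbf{z}}_{r,k} \tag{14}$$

where $I_{r,k} = (ML + L)((r-1)K + k - 1) + L$; $[\mathbf{D}_{\omega_r}]_{m,m} := e^{j\omega_r m}$; $\mathbf{H}_r$ is an $ML \times ML$ circulant matrix with first column $[\mathbf{h}_r^T, \mathbf{0}_{(M-1)L}^T]^T$, where $\mathbf{h}_r := [h_{r,0}, \ldots, h_{r,L-1}]^T$; and $\tilde{\mathbf{z}}_{r,k}$ is the additive white Gaussian noise (AWGN). Note that the number of subcarriers in this case is simply $M \cdot L$. In the absence of CFO, $\mathbf{H}_r \mathbf{F}_{ML}^H$ in (14) can be diagonalized by passing $\tilde{\mathbf{y}}_{r,k}$ through an FFT block at the receiver side. However, the presence of $\mathbf{D}_{\omega_r}$ destroys the orthogonality among subcarriers. In Appendix B it is shown that, by exploiting the precoding in (12), the expression in (14) can be rewritten as

$$\tilde{\mathbf{y}}_{r,k} = \sqrt{L}\, e^{j\omega_r I_{r,k}} \mathbf{D}_{\omega_r} (\tilde{\mathbf{x}}_{r,k} \otimes \mathbf{I}_L) \mathbf{h}_r + \tilde{\mathbf{z}}_{r,k}. \tag{15}$$

This expression shows that the equivalent channel seen at the receiver is now diagonal; i.e., orthogonality is preserved. Note however that entries in $\tilde{\mathbf{x}}_{r,k}$ are coded across index $r$. Since the differential recursions in (11) are carried across indexes $m, k$, the block $\tilde{\mathbf{y}}_k := [\tilde{\mathbf{y}}_{1,k}^T, \ldots, \tilde{\mathbf{y}}_{R,k}^T]^T$ is de-interleaved to obtain $\mathbf{y}_k := \boldsymbol{\Theta}_{ML}^T \tilde{\mathbf{y}}_k = [\mathbf{y}_{1,k}^T, \ldots, \mathbf{y}_{M,k}^T]^T$, where the $RL \times 1$ subblock $\mathbf{y}_{m,k}$ is given by

$$\mathbf{y}_{m,k} = \sqrt{L}\, \tilde{\mathbf{D}}_{\omega} \tilde{\mathbf{D}}_{\omega_{m,k}} (\mathbf{I}_L \otimes \mathbf{D}_{\mathbf{x}_{m,k}}) \mathbf{h} + \mathbf{z}_{m,k} \tag{16}$$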

where now $[\tilde{\mathbf{D}}_{\omega}]_{m,m} := e^{j\omega_{\mathrm{mod}\{m,R\}} \tilde{I}_{m,k}}$ and $[\tilde{\mathbf{D}}_{\omega_{m,k}}]_{m,m} := e^{j\omega_{\mathrm{mod}\{m,R\}} \lceil m/R \rceil}$, with $\tilde{I}_{m,k} := (ML + L)((\mathrm{mod}\{m,R\} - 1)K + k - 1) + mL$ and $\mathbf{h} := [h_{1,0}, \ldots, h_{R,0}, \ldots, h_{R,L-1}]^T$. Subblocks $\mathbf{y}_{m,k}$ can be further partitioned as $\mathbf{y}_{m,k} = [\mathbf{y}_{m,k,1}^T, \ldots, \mathbf{y}_{m,k,L}^T]^T$, and so can be $\tilde{\mathbf{D}}_{\omega}$, $\tilde{\mathbf{D}}_{\omega_{m,k}}$ and $\mathbf{z}_{m,k}$. Following similar steps as in the previous section, it is not difficult to show that $\mathbf{D}_{\mathbf{y}_{m,k,\ell}} \mathbf{y}_{m,k-1,\ell}^* = \mathbf{D}_{\mathbf{v}_{m,k}} \mathbf{D}_{\mathbf{y}_{m,k-1,\ell}} \mathbf{y}_{m,k-2,\ell}^* + \mathbf{z}'_{m,k,\ell}$. Approximating the noise $\mathbf{z}'_{m,k,\ell}$ as Gaussian with covariance matrix $\boldsymbol{\Sigma}_{m,k,\ell} := \mathbf{D}_{\mathbf{y}_{m,k,\ell}} \mathbf{D}_{\mathbf{y}_{m,k,\ell}}^* + 2\mathbf{D}_{\mathbf{y}_{m,k-1,\ell}} \mathbf{D}_{\mathbf{y}_{m,k-1,\ell}}^* + \mathbf{D}_{\mathbf{y}_{m,k-2,\ell}} \mathbf{D}_{\mathbf{y}_{m,k-2,\ell}}^*$, the following detector can be derived [cf. (10)]:

$$\hat{\mathbf{D}}_{\mathbf{v}_{m,k}} = \arg\min_{\mathbf{D}_{\mathbf{v}}} \sum_{\ell=0}^{L-1} \left\| \boldsymbol{\Sigma}_{m,k,\ell}^{-1/2} \left( \mathbf{D}_{\mathbf{y}_{m,k,\ell}} \mathbf{y}_{m,k-1,\ell}^* - \mathbf{D}_{\mathbf{v}} \mathbf{D}_{\mathbf{y}_{m,k-1,\ell}} \mathbf{y}_{m,k-2,\ell}^* \right) \right\|^2 \tag{17}$$

$\forall k = 3, \ldots, K$; $m = 1, \ldots, M$. As in (10), the search is over the same alphabet $\mathcal{V}$, and so the detection rule in (17) incurs the same complexity as the one in (10). De-mapping $\hat{\mathbf{D}}_{\mathbf{v}_{m,k}}$, a decoded block $\hat{\mathbf{s}}$ is obtained at the destination. The detector in (17) collects both multipath and space diversity, as highlighted in the following proposition.

Proposition 2 (Diversity of error-free DD-TDM OFDM). The diversity order achieved by the detector in (17) assuming error-free relays is $RL$; i.e., $\Pr(\mathbf{s} \to \hat{\mathbf{s}}) \le (G_c \gamma)^{-RL}$.

Multipath diversity is enabled thanks to the repetition structure in (12). To show this, notice that the detector in (17) can be alternatively written as

$$\hat{\mathbf{D}}_{\mathbf{v}_{m,k}} = \arg\min_{\mathbf{D}_{\mathbf{v}}} \left\{ \left\| \boldsymbol{\Sigma}_{m,k}^{-1/2} \left( \mathbf{D}_{\mathbf{y}_{m,k}} \mathbf{y}_{m,k-1}^* - (\mathbf{I}_L \otimes \mathbf{D}_{\mathbf{v}}) \mathbf{D}_{\mathbf{y}_{m,k-1}} \mathbf{y}_{m,k-2}^* \right) \right\|^2 \right\} \tag{18}$$

with $\boldsymbol{\Sigma}_{m,k}^{-1/2}$ formed by diagonally concatenating the matrices $\boldsymbol{\Sigma}_{m,k,\ell}^{-1/2}$. The diversity is now given by the rank of $(\mathbf{I}_L \otimes (\mathbf{D}_{\mathbf{v}} - \tilde{\mathbf{D}}_{\mathbf{v}}))$. Since the Kronecker product of full-rank matrices is still full rank, the rank of $(\mathbf{I}_L \otimes (\mathbf{D}_{\mathbf{v}} - \tilde{\mathbf{D}}_{\mathbf{v}}))$ is $LR$. It is worth stressing that the diversity result holds true when the $L$ entries of $\mathbf{h}_r$ are i.i.d. Gaussian, as stated in Section 2. If no i.i.d. multipath is present and the spread $L$ is only given by synchronization mismatches, then the entries of $\mathbf{h}_r$ are correlated and multipath diversity is lost. Still, the repetition structure in (12) is required to bypass CFO mismatches. Simulations in Section 6 will further elaborate on this.

As far as throughput is concerned, a total of $2KR$ channel uses ($KR$ channel uses to share information among cooperating terminals, plus another $KR$ channel uses to transmit to the destination) are required to transmit $K-2$ information symbols. This is 1/2 of that of a point-to-point MIMO system with $T = R$ transmit antennas, which would use $KT$ channel uses to transmit the same $K-2$ symbols. This reduction is understood as the price to pay for having cheap distributed cooperating terminals instead of a single, more expensive multi-antenna transmitter.

Note that the two DD-TDM strategies described so far separate transmissions in time slots to avoid inter-terminal interference. This is effected using the ST mappings in (2) and (11), constrained to be unitary diagonal. However, if the number of antennas or the transmission rate increases, the minimum constellation distance decreases rapidly. Moreover, diagonal mappings incur a decoding complexity that increases exponentially with the number of transmit relays $R$. Next, it is shown that one can design an efficient differential scheme by having terminals simultaneously transmit signals within OFDM symbols, with low-complexity decoding, using SD-OSTBC modulations.
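Returning to the precoding step of Section 4.1, the equivalence between (14) and (15) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: the helper names, the direct $O(N^2)$ DFT, the 0-based CFO phase reference and the toy values of $M$, $L$, the symbols and the channel are all hypothetical.

```python
import cmath
import math

def unitary_dft(x, inverse=False):
    """Unitary N-point DFT (F_N) or its inverse (F_N^H), computed directly."""
    N = len(x)
    sign = 1 if inverse else -1
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / N) for n in range(N))
            / math.sqrt(N) for k in range(N)]

def precoded_ofdm_block(x_tilde, L):
    """F_{ML}^H (1_L kron F_M) x_tilde: precoding (12) plus the IFFT of (13), CP omitted."""
    X = unitary_dft(x_tilde)                  # F_M x_tilde
    return unitary_dft(X * L, inverse=True)   # list repetition stacks L copies, then IFFT

def through_channel(block, h, omega, start=0):
    """Circular convolution with h (the channel after CP removal) followed by a CFO rotation."""
    N = len(block)
    return [cmath.exp(1j * omega * (start + n))
            * sum(h[l] * block[(n - l) % N] for l in range(len(h)))
            for n in range(N)]

M, L = 3, 2
x_tilde = [1 + 0j, 1j, -1 + 0j]   # toy unit-modulus symbols of one relay
h = [0.5 + 0.2j, -0.3j]           # toy L-tap channel
omega = 0.4                       # toy normalized CFO

time_block = precoded_ofdm_block(x_tilde, L)
y = through_channel(time_block, h, omega)
```

The repetition in the frequency domain turns the time-domain block into the symbols of $\tilde{\mathbf{x}}_{r,k}$ scaled by $\sqrt{L}$ and separated by $L-1$ zeros, so each symbol is followed by its own copy of the channel impulse response and the CFO only rotates samples individually; sample $pL+\ell$ of `y` equals $\sqrt{L}\,\tilde{x}[p]\,h[\ell]$ times a phase term, in line with (15).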

4.2. SD-OSTBC OFDM scheme: simultaneous transmissions using OSTBCs

In the following it will be assumed that the CFO is less than half of the subcarrier spacing. Note that for larger values one can consider null-subcarrier based algorithms to correct the integer part of the offset prior to transmission [24]. The premise here is that, if information is coded within an OFDM block instead of across blocks as is typically done in space–time–frequency codes, the CFO seen is considerably reduced, and so SD-OSTBCs can be employed.

As before, we start from the outer differential encoder. The block of symbols $\mathbf{s}$ is now split into $K-1$ subblocks of size $M$ each, with the $m$th entry of the $k$th subblock mapped to a set of $P$ unitary complex scalars $\{v_{m,k,1}, \ldots, v_{m,k,P}\}$, each transporting $(1/P) \log_2 |\mathcal{A}_s|$ bits, with which the following matrix can be constructed [15]:

$$\mathbf{V}_{m,k} = \frac{1}{\sqrt{P}} \sum_{p=1}^{P} \boldsymbol{\Phi}_p \mathrm{Re}\{v_{m,k,p}\} + j \boldsymbol{\Psi}_p \mathrm{Im}\{v_{m,k,p}\} \tag{19}$$

where $\boldsymbol{\Phi}_p$ and $\boldsymbol{\Psi}_p$ are unitary matrices given in [15]. Having mapped $\mathbf{s}$ to blocks $\{\{\mathbf{V}_{m,k}\}_{m=1}^{M}\}_{k=2}^{K}$, the latter are differentially encoded as

$$\mathbf{X}_{m,k} = \begin{cases} \mathbf{V}_{m,k}^T \mathbf{X}_{m,k-1}, & k = 2, \ldots, K \\ \mathbf{I}_R, & k = 1. \end{cases} \tag{20}$$

The resulting $KR \times R$ coded block $\mathbf{X}_m := [\mathbf{X}_{m,1}^T, \ldots, \mathbf{X}_{m,K}^T]^T$ is further processed as follows:

$$\tilde{\mathbf{X}}_m = (\mathbf{1}_L \otimes \mathbf{F}_K) \mathbf{X}_m. \tag{21}$$

Each relay $T_r$ will transmit the $r$th column of $\tilde{\mathbf{X}}_m$, namely $\tilde{\mathbf{x}}_{m,r}$, using OFDM. The $m$th block transmitted by the $r$th relay is given by

$$\mathbf{t}_{m,r} = \mathbf{T}_{KL} \mathbf{F}_{KL}^H \tilde{\mathbf{x}}_{m,r}. \tag{22}$$
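The single-differential recursion in (20) can be illustrated for $R = 2$ relays with a toy example. The sketch below assumes the $2 \times 2$ Alamouti matrix as one instance of an orthogonal design with $P = 2$ unit-modulus (QPSK) scalars; the specific $\boldsymbol{\Phi}_p$, $\boldsymbol{\Psi}_p$ of [15] are not reproduced here. It further assumes a flat channel ($L = 1$), no CFO, no noise and no OFDM precoding, and decodes with a plain minimum-distance search over all candidate matrices; all function names are hypothetical.

```python
import itertools
import math

QPSK = [1 + 0j, 1j, -1 + 0j, -1j]

def alamouti(v1, v2):
    """Unitary 2x2 orthogonal design built from two unit-modulus scalars (assumed form)."""
    s = 1 / math.sqrt(2)
    return [[s * v1, s * v2], [-s * v2.conjugate(), s * v1.conjugate()]]

def transpose_times(V, X):
    """Return V^T X for a 2x2 matrix V and a matrix X with 2 rows."""
    cols = len(X[0])
    return [[sum(V[i][a] * X[i][b] for i in range(2)) for b in range(cols)]
            for a in range(2)]

def sd_encode(pairs):
    """Recursion (20): X_1 = I_R and X_k = V_k^T X_{k-1}."""
    X = [[[1 + 0j, 0j], [0j, 1 + 0j]]]
    for v1, v2 in pairs:
        X.append(transpose_times(alamouti(v1, v2), X[-1]))
    return X

def sd_detect(y_k, y_prev):
    """Pick the symbol pair minimising || y_k - V^T y_{k-1} ||^2 over all candidates."""
    def cost(pair):
        pred = transpose_times(alamouti(*pair), [[y_prev[0]], [y_prev[1]]])
        return sum(abs(y_k[a] - pred[a][0]) ** 2 for a in range(2))
    return min(itertools.product(QPSK, repeat=2), key=cost)

sent = [(1j, -1 + 0j), (1 + 0j, -1j), (-1 + 0j, -1 + 0j)]
X = sd_encode(sent)
h = [0.8 - 0.3j, -0.2 + 1.1j]   # toy flat fading coefficients, unknown to the detector
y = [[sum(Xk[a][b] * h[b] for b in range(2)) for a in range(2)] for Xk in X]
decoded = [sd_detect(y[k], y[k - 1]) for k in range(1, len(y))]
```

Because $\mathbf{y}_k = \mathbf{X}_k \mathbf{h} = \mathbf{V}_k^T \mathbf{X}_{k-1} \mathbf{h} = \mathbf{V}_k^T \mathbf{y}_{k-1}$ in this noiseless setting, the search recovers the transmitted pairs without ever using `h`.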

Observe that, different from the DD-TDM scheme, $\mathbf{t}_{m,r}$ does not depend on the index $k$ used for differential recursions. This is because differential encoding is carried out within (as opposed to across) OFDM blocks. Subblocks $\mathbf{t}_{m,r}$ are stacked to obtain $\mathbf{t}_r := [\mathbf{t}_{1,r}^T, \ldots, \mathbf{t}_{M,r}^T]^T$. All relays $\{T_r\}_{r=1}^{R}$ transmit $\{\mathbf{t}_r\}_{r=1}^{R}$ simultaneously through the channel in (1). After S/P conversion and CP removal, the $m$th $KL \times 1$ OFDM block at the receiver, labeled $\tilde{\mathbf{y}}_m$, is given by [cf. (1)]

$$\tilde{\mathbf{y}}_m = \sum_{r=1}^{R} e^{j\omega_r ((m-1)KL + mL)} \mathbf{D}_{\omega_r} \mathbf{H}_r \mathbf{F}_{KL}^H \tilde{\mathbf{x}}_{m,r} + \tilde{\mathbf{z}}_m \tag{23}$$


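where $[\mathbf{D}_{\omega_r}]_{k,k} = e^{j\omega_r k}$. Each row $\tilde{\mathbf{x}}_{m,r}$ was repeated $L$ times in (21). Note that in this case the number of subcarriers is $R \cdot K \cdot L$. Mimicking the steps in Appendix B, each term within the sum (23) can be rewritten as

$$\tilde{\mathbf{y}}_m = \sqrt{L} \sum_{r=1}^{R} e^{j\omega_r ((m-1)KL + mL)} \mathbf{D}_{\omega_r} (\tilde{\mathbf{x}}_{m,r} \otimes \mathbf{I}_L) \mathbf{h}_r + \tilde{\mathbf{z}}_m. \tag{24}$$

Vector $\tilde{\mathbf{y}}_m$ can be partitioned in $K$ subgroups of size $R$ each; i.e., $\tilde{\mathbf{y}}_m := [\tilde{\mathbf{y}}_{m,1}^T, \ldots, \tilde{\mathbf{y}}_{m,K}^T]^T$. In the absence of CFO, $\tilde{\mathbf{y}}_{m,k} = \sqrt{L} (\mathbf{X}_{m,k} \otimes \mathbf{I}_L) \mathbf{h} + \tilde{\mathbf{z}}_{m,k}$. Clearly, the ST matrix $\mathbf{X}_{m,k}$ is repeated $L$ times. Let us further split $\tilde{\mathbf{y}}_{m,k}$ into $L$ subgroups by defining $\mathbf{y}_{m,k} := \boldsymbol{\Theta}_L^T \tilde{\mathbf{y}}_{m,k} := [\mathbf{y}_{m,k,1}^T, \ldots, \mathbf{y}_{m,k,L}^T]^T$. In the absence of CFO, each subblock $\mathbf{y}_{m,k,\ell}$ is given by

$$\mathbf{y}_{m,k,\ell} = \sqrt{L}\, \mathbf{X}_{m,k} \tilde{\mathbf{h}}_\ell + \mathbf{z}_{m,k,\ell} \tag{25}$$

with $\tilde{\mathbf{h}}_\ell := [h_{1,\ell}, \ldots, h_{R,\ell}]^T$. From (25), the following ML SD detector can be derived:

$$\hat{\mathbf{V}}_{m,k} = \arg\min_{\mathbf{V}} \sum_{\ell=0}^{L-1} \left\| \mathbf{y}_{m,k,\ell} - \mathbf{V}^T \mathbf{y}_{m,k-1,\ell} \right\|^2 \tag{26}$$

which searches over all $|\mathcal{A}_s|$ possible matrices $\mathbf{V}$. After standard manipulations, and exploiting the structure of $\mathbf{V}$ in (19), a simple ML detector for each $v_{m,k,p}$ can be derived [15]:

$$\hat{v}_{m,k,p} = \arg\max_{v} \left\{ \sum_{\ell=0}^{L-1} \mathrm{Re}\{\mathrm{tr}(\boldsymbol{\Phi}_p \mathbf{y}_{m,k,\ell}^* \mathbf{y}_{m,k-1,\ell}^T) \mathrm{Re}\{v\}\} + \mathrm{Re}\{\mathrm{tr}(j\boldsymbol{\Psi}_p \mathbf{y}_{m,k,\ell}^* \mathbf{y}_{m,k-1,\ell}^T) \mathrm{Im}\{v\}\} \right\} \tag{27}$$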

where the search is now carried over $|v| = |A_s|^{1/P}$ symbols in $v_{m,k,p}$. De-mapping $\hat{v}_{m,k,p}$, a decoded block $\hat{s}$ is obtained at the destination. Interestingly, the diversity achieved is still the same, as summarized next.

Proposition 3 (Diversity of CFO/error-free SD-OSTBC OFDM). The diversity order achieved by the detector in (17) assuming error-free relays and in the absence of CFO is $RL$; i.e., $\Pr(s \to \hat{s}) \le (G_c \gamma)^{-RL}$.

In the absence of CFO, $y_{m,k,\ell} = V_{m,k}^{T} X_{m,k-1} \tilde{h}_\ell + z_{m,k,\ell}$. Stacking across index $\ell$, $y_{m,k} = (I_L \otimes V_{m,k}^{T})(I_L \otimes X_{m,k-1}) \tilde{h} + z_{m,k}$ with $\tilde{h} := [\tilde{h}_1^{T}, \ldots, \tilde{h}_L^{T}]^{T}$. Using the decoder in (26), which is equivalent to the one in (27), it is not difficult to see that once again the diversity order depends on the rank of $(I_L \otimes (V_{m,k} - \hat{V}_{m,k}))$. Since OSTBCs are full rank mappings, Proposition 3 follows.

Simulations in Section 6 will show that SD-OSTBC OFDM efficiently bypasses a wide range of CFO values. Tests will also show that, while maximum diversity is achieved, SD-OSTBC OFDM features higher coding gains than the DD-TDM OFDM in Section 4.1. This is because: (i) OSTBC matrices are not constrained to be diagonal, and so have larger minimum distance among constellation points; and (ii) SD detection as in (26) incurs only a 3 dB SNR loss, as opposed to DD detection as in (10) or (17), which incurs at least a 6 dB SNR loss; see (9), [14].

Note that SD-OSTBC OFDM transmissions differentially code information within OFDM blocks, and thus the phase difference due to CFO seen between consecutive subblocks is considerably reduced with respect to differentially coding across OFDM blocks. Specifically, with OFDM blocks of size $RKL$, the CFO seen from subblock $y_{m,k}$ to subblock $y_{m,k-1}$ is a fraction $1/K$ of what would be seen if differential recursions were carried out across OFDM blocks. Hence, for sufficiently large $K$, CFO effects are negligible. This is because the CFO is assumed to be less than half of the subcarrier spacing, and integer deviations can be efficiently corrected in OFDM transmissions via null-subcarrier methods [24].

Regarding the possibility of exploiting time diversity, the CFO-and-multipath model in (1) can also include time-varying channels as a special case. Note however that in general this scheme would not be able to collect Doppler diversity. For that matter, a more sophisticated approach would be needed, involving differential modulation across space, frequency, and time as in [22]. However, simulations will show that for large $L$ or $R$, the extra diversity that could be gained using additional coding may be negligible for typical SNR ranges, while at the same time the coding gain loss would increase exponentially due to higher constellation mappings. This suggests that coding across space and frequency while bypassing frequency drifts strikes the right trade-off between diversity and coding gain advantages.

As far as computational complexity is concerned, both the DD-TDM (10) and DD-TDM OFDM (17) decoders are ML demodulators; therefore, the computational complexity of both methods grows exponentially with the size of $D_{v_k}$ (that is, the number of multiplications/additions grows exponentially with the number of cooperating relays $R$). It is worth noting that, compared to existing point-to-point MIMO non-coherent modulation schemes, this design entails the same decoding complexity. Furthermore, if needed, the computational complexity of both algorithms may be brought down to polynomial complexity (the number of multiplications/additions then grows with $R^3$), without any major change in the setting, by using lattice reduction methods such as the one proposed in [25].
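The exhaustive search in (26) can be illustrated in the noise-free, CFO-free case. The sketch below is hypothetical in its choice of code (an Alamouti-type unitary matrix with QPSK scalars for $R = P = 2$, standing in for (19)) and of channel values; it only shows that minimizing $\sum_\ell \| y_{m,k,\ell} - V^{T} y_{m,k-1,\ell} \|^2$ recovers the transmitted pair without channel knowledge at the destination.

```python
import cmath
import itertools
import math
import random

QPSK = [cmath.exp(1j * math.pi * (2 * i + 1) / 4) for i in range(4)]


def ostbc(v1, v2):
    """Assumed 2x2 Alamouti-type unitary matrix (stand-in for eq. (19))."""
    c = 1 / math.sqrt(2)
    return [[c * v1, c * v2], [-c * v2.conjugate(), c * v1.conjugate()]]


def apply_t(v, y):
    """Compute V^T y for a 2x2 matrix V and a length-2 vector y."""
    return [v[0][0] * y[0] + v[1][0] * y[1],
            v[0][1] * y[0] + v[1][1] * y[1]]


def detect(y_prev, y_curr):
    """Eq. (26): try every candidate pair, summing the metric over the L paths."""
    def metric(pair):
        v = ostbc(*pair)
        return sum(abs(c - p) ** 2
                   for yp, yc in zip(y_prev, y_curr)
                   for c, p in zip(yc, apply_t(v, yp)))
    return min(itertools.product(QPSK, repeat=2), key=metric)


rng = random.Random(0)
L = 2  # number of multipath taps (subgroups indexed by l)
# With X_{k-1} = I_R, each y_{k-1,l} is simply the unknown channel vector.
y_prev = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
          for _ in range(L)]
sent = (QPSK[1], QPSK[3])
# Noise-free differential relation y_{k,l} = V^T y_{k-1,l}
y_curr = [apply_t(ostbc(*sent), yp) for yp in y_prev]
decoded = detect(y_prev, y_curr)
```

The search here visits all 16 candidate pairs; the per-symbol detector in (27) avoids this joint search by exploiting the orthogonal structure of the code.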
In the case of DD-TDM OFDM, the multipath diversity increases the exponential complexity of the decoder, and the number of multiplications/additions grows exponentially with $RL$. For the special case of SD-OSTBC OFDM transmissions, the search in (27) is linear in the number of cooperating relays, as is also the case in point-to-point MIMO systems with, e.g., differential Alamouti schemes (thus the number of operations grows with $R$). The computational complexity of the decoding scheme of [17] is also linear in the number of relays. However, as can be seen in Fig. 5, that method performs worse than the one proposed here, because it assumes the same CFO across terminals. In other words, our proposed design requires equal complexity to solve a more general and complex problem (different CFOs). The same can be said of the detector in [13], whose complexity is polynomial in the number of relays (the number of multiplications/additions grows with $R^3$), but whose performance is lacking and practically insensitive to changes in SNR. Finally, when compared with coherent designs, the differential modulation schemes proposed in this work require neither channel estimation nor CFO correction blocks at the receiver. Thus, they considerably simplify the receiver. Although this is also true for point-to-point systems, it becomes more critical in cooperative setups, which are based on the premise of cheap low-complexity cooperating relays.

Fig. 5. DD-TDM vs. no cooperation and [17] with no errors for R = 2, 3. Also included asymptotes with diversity of order 2 (narrow gray line) and 3 (thick gray line).

5. Relaying errors

The synchronization problem in the broadcasting phase is well known, because this phase consists of a set of point-to-point links. Therefore, we focused on the relaying phase, and differential modulations robust to CFO and multipath were derived in the previous section under the assumption that symbols $s$ are available error-free at all terminals. In this section it is shown that these modulations can also be robust against errors at intermediate steps; i.e., when $s$ is erroneously decoded at the relays because of the presence of CFO in the source-relay links or any other reason. For that matter, two DF cooperating protocols will be applied to the previously developed differential modulations: (i) selective forwarding (SF) [1,2] and (ii) link-adaptive regenerative (LAR) transmissions [2,18]. OFDM transmissions (either DD-TDM or SD-OSTBC) for the $T_r$–$D$ link will be considered. Non-OFDM transmissions in Section 2 fall as a special case of the analysis here and thus are omitted. The $S$–$T_r$ link will be assumed multipath with channel coefficients $h_{r,0}^{(s)}, \ldots, h_{r,L-1}^{(s)}$. Since the $S$–$T_r$ channel is non-cooperative, the source $S$ can employ point-to-point DD-OFDM transmissions as in [16].

5.1. Selective and adaptive transmissions

In SF, cooperating relays decode the source's block $s$ and, if correctly decoded, they transmit the re-encoded signal $t_r$ in (13) or (22). If relay $T_r$ incurs an error, $t_r$ is not transmitted. Decoding errors can be verified using, e.g., cyclic redundancy check (CRC) codes, which are relatively bandwidth-efficient. Mathematically, the SF scheme can be described using a scalar $\alpha_r^{SF}$ that multiplies $t_r$ and is given by

$$\alpha_r^{SF} = \begin{cases} 1 & \text{if } \hat{s}_r = s \\ 0 & \text{if } \hat{s}_r \neq s \end{cases} \tag{28}$$

where $\hat{s}_r$ is the decoded block at relay $T_r$. Note that $\alpha_r$ could be defined for each entry of $s$, instead of the entire block. However, in practice transmission and error detection via CRC codes are carried out in frames, and so the definition in (28) is more appealing in practice. Also, note that discarding entire frames instead of individual symbols always increases the end-to-end error rate, and so the definition in (28) can be seen as a "worst-case" cooperation scenario in terms of error performance.

In LAR, the decoded frame is always transmitted by all relays regardless of whether $\hat{s}_r$ equals $s$ or not, but with power weighted according to an estimate of the instantaneous channel reliability using the coefficient

$$\alpha_r^{LAR} = \begin{cases} 1 & \text{if } \phi_r \ge \bar{\gamma}_r \\ \phi_r / \bar{\gamma}_r & \text{if } \phi_r < \bar{\gamma}_r \end{cases} \tag{29}$$

where $\bar{\gamma}_r := \gamma \sum_{\ell=0}^{L-1} \sigma_{r,\ell}^2$ is the average $T_r$–$D$ SNR (cf. (1)) and $\phi_r := (y_r^{(s)})^{H} y_r^{(s)} / P - 1$, with $y_r^{(s)}$ being the received signal at $T_r$ from the source, which is assumed of length $P$. With this definition, and since $y_r^{(s)}$ is defined as in [16], it holds that at high SNR $(y_r^{(s)})^{H} y_r^{(s)} \approx P \sum_{\ell=0}^{L-1} |h_{r,\ell}^{(s)}|^2 + P$ (the variance of the noise at $T_r$ is assumed, without loss of generality, to be 1). Or equivalently, $\phi_r \approx \sum_{\ell=0}^{L-1} |h_{r,\ell}^{(s)}|^2$, which is the instantaneous SNR of link $S$–$T_r$. In words, relays transmit frames at full power when the $S$–$T_r$ link is more reliable than the average $T_r$–$D$ link; otherwise, $\alpha_r^{LAR}$ scales power down to mitigate errors. Note that since $\bar{\gamma}_r$ varies at a large scale, having it available at relays is certainly affordable. The LAR protocol can be further improved by setting $\alpha_r^{LAR} = 1$ if no error is detected, as in the SF protocol (since the wireless protocol may be employing CRC codes at upper layers anyway); otherwise the block is transmitted with $\alpha_r^{LAR}$ as in (29). This strategy outperforms LAR, because it guarantees that correctly decoded blocks will always be transmitted with full power.

Both SF and LAR protocols only affect the transmitted power: the former does it in an on–off fashion whereas the latter does it adaptively. In any case, the destination "sees" an equivalent channel fading $h_r^{\alpha} := \sqrt{\alpha_r}\, h_r$ per relay $T_r$, with $\alpha_r$ as in (28) or (29). Thus, decoding as in (10) and (17) or (27) can still be performed without knowledge of $\{\alpha_r\}_{r=1}^{R}$; i.e., the decoding is still non-coherent. Most interestingly, both SF and LAR strategies prevent these detectors from incurring diversity loss even when relays have errors, as shown next.

5.2. Performance analysis

Define the set $E$ of terminals with erroneously decoded block $\tilde{s}_r$ and its complementary set $\bar{E}$ containing the terminals that successfully decoded $s$. The instantaneous error probability of the $S$–$T_r$ link can be bounded as $\Pr(T_r \in E) \le \exp(-\delta_r^2 \sum_{\ell=0}^{L-1} |h_{r,\ell}^{(s)}|^2)$, where the non-zero constant $\delta_r^2$ depends on the constellation employed and the number of repetitions $L$ [16]. Since $\delta_r^2$ is independent of the fading realizations, it will be henceforth ignored. Assuming $h_{r,\ell}^{(s)}$ is independent across $S$–$T_r$ links, the probability of having a set $E$ of relaying terminals incorrectly decoding $s$ is, with the exception of a multiplicative constant,

$$P_{e,h}(E) \le \exp\left( -\sum_{r \in E} \sum_{\ell=0}^{L-1} |h_{r,\ell}^{(s)}|^2 \right). \tag{30}$$
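For reference alongside the error model above, the SF and LAR weights in (28) and (29) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the comparison against the source block stands in for a CRC check (a real relay does not know $s$), and unit noise variance is assumed as in the text.

```python
import math


def alpha_sf(decoded_block, source_block):
    """Eq. (28): forward at full power only when the relay decoded correctly.

    Comparing against the source block is a stand-in for a CRC check.
    """
    return 1.0 if decoded_block == source_block else 0.0


def reliability_estimate(y_s):
    """phi_r = (y^H y) / P - 1 for the length-P block received from the source
    (unit noise variance assumed)."""
    return sum(abs(v) ** 2 for v in y_s) / len(y_s) - 1.0


def alpha_lar(phi_r, gamma_bar_r):
    """Eq. (29): full power if phi_r >= gamma_bar_r, else scale by the ratio."""
    return 1.0 if phi_r >= gamma_bar_r else phi_r / gamma_bar_r


def equivalent_gain(alpha_r, h_r):
    """Equivalent fading sqrt(alpha_r) * h_r seen by the destination."""
    return math.sqrt(alpha_r) * h_r
```

A noisy estimate $\phi_r$ can turn negative, a case that (29) does not address; the sketch leaves its handling to the caller rather than guessing at a clamping rule.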

Whenever $T_r$ is in $E$, the ST mapping at $T_r$ entails errors. We will henceforth treat both DD-TDM and SD-OSTBC OFDM systems in a unified manner. Let $\tilde{s} := [\tilde{s}_1^{T}, \ldots, \tilde{s}_R^{T}]^{T}$ be the set of all blocks estimated at all relays. If $T_r \in \bar{E}$, then $\tilde{s}_r = s$; however, if $T_r \in E$, then $\tilde{s}_r \neq s$. Appendix C shows that using either DD-TDM or SD-OSTBC OFDM transmissions, the conditional probability of decoding $\hat{s}$ provided that $s$ was transmitted by $S$ and $\tilde{s}$ was transmitted by $\{T_r\}_{r=1}^{R}$ is given by, with the exception of multiplicative constants,

$$\Pr(s, \tilde{s} \to \hat{s} \mid h) \le Q\left( \frac{\sum_{r \in \bar{E}} \alpha_r \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2 - \sum_{r \in E} \alpha_r \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2}{\sqrt{\sum_{r=1}^{R} \alpha_r \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2}} \right). \tag{31}$$

Considering all possible error events $E$, each with probability $P_{e,h}(E)$, the overall conditional PEP can be written as

$$\Pr(s \to \hat{s} \mid h) \le \sum_{E} P_{e,h}(E) \Pr(s, \tilde{s} \to \hat{s} \mid h, E) \tag{32}$$

with $P_{e,h}(E)$ and $\Pr(s, \tilde{s} \to \hat{s} \mid h, E)$ given in (30) and (31), respectively. It is worth noting that when $E = \emptyset$ and $\alpha_r = 1$, the expression in (32) reduces to

$$\Pr(s \to \hat{s} \mid h) \le Q\left( \sqrt{\sum_{r=1}^{R} \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2} \right) \tag{33}$$

whose expectation over the channel fading clearly achieves diversity $RL$, indirectly proving Propositions 2 and 3. Most interestingly, the expectation of (32) achieves this same diversity order even when $E \neq \emptyset$, as shown in the following proposition.

Proposition 4 (Diversity order of SF and LAR OFDM transmissions). The average of the PEP in (32) using the weight $\alpha_r$ as defined in (28) or (29) has diversity of order $RL$; i.e., $\Pr(s \to \hat{s}) \le (G_c \gamma)^{-RL}$.

Proof. See Appendix D.

Thus, the diversity order is independent of the intermediate errors $\tilde{s}_r$ per relay $T_r$ and preserves the multipath diversity. Note that if one single link has reduced diversity, the overall diversity order is affected. For example, if the $S$–$T_r$ link has diversity 1 (instead of $L$), the overall diversity is reduced to $R$ [cf. Appendix D]. Simulations in Section 6 will also include this case. Since both SF/LAR schemes achieve the same diversity order, we will compare their relative coding gain performance through simulations in the ensuing section.

6. Simulations

This section compares the performance of the distributed differential schemes in Sections 3, 4.1 and 4.2 with and without relaying errors under different system assumptions.

6.1. Error-free DD-TDM transmissions

Fig. 5 compares the symbol-error-rate (SER) vs. SNR curves of the DD-TDM scheme for $R = 2, 3$ and $N = 500$. The constellation size is $|A_s| = 2^{\rho R}$, where $\rho$ is the transmission rate (not including the source transmission) in bits per channel use (bpcu). The rates considered here are $\rho = 0.5, 1$ for $R = 2$ and $\rho = 2/3, 1$ for $R = 3$. The CFO $\omega_r$ is uniform in $[-\pi, \pi]$. Scalar non-cooperative DD (i.e., a simple DD transmission, with only a source and a destination node, and without any space–time block coding) and the DD-OSTBC scheme in [17] for $R = 2$, $\rho = 0.5$ are also included for comparison. Fig. 5 also includes two asymptotes with diversity order 2 and 3, respectively. These asymptotes corroborate that DD-TDM achieves diversity of order $R$ at high SNR (from 15 dB and 20 dB with $R = 2$ and $R = 3$, respectively), whereas scalar non-cooperative DD transmissions are stuck with diversity 1. The DD-OSTBC scheme in [17] degrades considerably, as it is designed assuming all CFOs are the same. It can also be observed that coding gain losses of DD-TDM increase with $R$ and $\rho$. This is because the constellation size $|A_s| = 2^{\rho R}$ increases and the minimum constellation distance reduces; see [14].

6.2. Error-free OFDM transmissions

Figs. 6 and 7 compare the SER vs. SNR curves of the DD-TDM OFDM and SD-OSTBC OFDM schemes for $R = 2, 3$ and $L = 2, 3$. For comparison, the same transmission rates as in Fig. 5 are employed, with Fig. 6 fixing $\rho = 0.5, 2/3$ and Fig. 7 fixing $\rho = 1$. SD-OSTBC OFDM uses slightly higher rates so that scalar PSK constellations can be employed for $v_{m,k,p}$ in (19). Note that because of the $L$-fold repetition in the precoding (12) and (21), the constellation size is now $|A_s| = 2^{\rho R L}$. The CFO is set to $[-\pi, \pi]/N_c$, where $N_c$ is the number of subcarriers. For DD-TDM OFDM, $N_c = LM = 500$; for SD-OSTBC OFDM, $N_c = RKL = 900$. The (multipath) channel is i.i.d. Rayleigh with $\sigma_{r,\ell}^2 = 1\ \forall r, \ell$. SD space–frequency codes using OSTBCs or diagonal matrices as in [13], coding across OFDM blocks, are included for comparison.

Fig. 6. DD-TDM OFDM (solid line) vs. SD-OSTBC OFDM (dashed line) with no errors for R = 2, 3, L = 2, 3 and ρ = 0.5, 2/3. Also included asymptotes with diversity of order 4 (narrow gray line), 6 (thick gray line) and 9 (thickest gray line).

Fig. 7. DD-TDM OFDM (solid line) vs. SD-OSTBC OFDM (dashed line) with no errors for R = 2, 3, L = 2, 3 and ρ = 1. Also included asymptotes with diversity of order 4 (narrow gray line), 6 (thick gray line) and 9 (thickest gray line).

As shown, both DD-TDM OFDM and SD-OSTBC OFDM achieve higher diversity orders, whereas non-CFO-aware SD schemes degrade severely. Figs. 6 and 7 also include three asymptotes with diversity order 4, 6 and 9. These asymptotes corroborate that SD-OSTBC OFDM tends to achieve diversity of order $RL$ at medium SNR (in the range from 10 dB to 15 dB), whereas DD-TDM OFDM needs higher SNR to approach this diversity order (especially with $\rho = 1$) because of its higher coding gain losses. Coding gain losses become considerably large as $R$ and $L$ grow, especially for the DD-TDM OFDM scheme, as seen in Fig. 7, since the constellation size grows exponentially with $\rho R L$. In contrast, SD-OSTBC OFDM offers reduced coding gain losses, which in turn allow diversity gains to "kick in" earlier. This is because: (i) SD decoding is employed, reducing the SNR loss by 3 dB; and (ii) OSTBC mappings are not constrained to be unitary diagonal.
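To make the exponential growth of the constellation size concrete, the short sketch below evaluates $|A_s| = 2^{\rho R}$ and $|A_s| = 2^{\rho R L}$ for rates quoted in Sections 6.1 and 6.2. The function name is hypothetical, and exact fractions are used only to avoid floating-point rounding in the exponent.

```python
from fractions import Fraction


def constellation_size(rho, relays, paths=1):
    """|A_s| = 2^(rho * R * L); paths = 1 gives the DD-TDM case 2^(rho * R)."""
    bits = Fraction(rho) * relays * paths
    if bits.denominator != 1:
        raise ValueError("rho * R * L must be an integer number of bits")
    return 2 ** bits.numerator


# DD-TDM (no precoding): rates of Section 6.1
dd_tdm = {
    ("1/2", 2): constellation_size(Fraction(1, 2), 2),
    ("1", 2): constellation_size(1, 2),
    ("2/3", 3): constellation_size(Fraction(2, 3), 3),
    ("1", 3): constellation_size(1, 3),
}

# OFDM with L-fold repetition: largest case considered, rho = 1, R = L = 3
ofdm_largest = constellation_size(1, 3, 3)
```

With $\rho = 1$ and $R = L = 3$ the search space already holds 512 points, against 8 points for DD-TDM with $R = 3$, which is consistent with the larger coding gain losses reported for DD-TDM OFDM at high $\rho R L$.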


6.3. Robustness to CFO drifts

As stated in Section 4.2, SD-OSTBC OFDM schemes operate under the assumption that the CFO is less than half of the subcarrier spacing. This assumption is tested in Fig. 8, which shows the SER as a function of the CFO, measured in number of subcarriers $\beta$; i.e., $\omega_r \in [-\pi, \pi]\beta/N_c$ with $N_c = 512$, for SNR = 10, 15 dB. The DD-TDM OFDM scheme is included for comparison. As shown, with small CFO deviations (i.e., when the CFO is less than half of the subcarrier spacing; and, in this case, when the CFO varies within a range of 20 subcarriers ($\beta < 20$)), SD-OSTBC OFDM outperforms DD-TDM OFDM because of its reduced coding gain losses, as justified in the previous section. DD-TDM OFDM, on the other hand, is robust to any CFO range. The same curves were obtained for $N_c = 64$, showing that a relatively small number of subcarriers suffices to achieve robust performance against CFO.

Fig. 8. DD-TDM OFDM (solid line) vs. SD-OSTBC OFDM (dashed line) with different CFO ranges β for SNR = 10, 15 and R = 2.

6.4. Robustness to multipath effects

The major drawback of the precoding employed in (12) is that coding gain losses become considerably large as $L$ increases, as already seen in Figs. 6 and 7. We further elaborate on this effect by comparing the performance of DD-TDM with and without precoding and OFDM transmissions for different values of $L$ and multipath profiles. The per-path variance is now set to $\sigma_{r,\ell}^2 = d^{\ell-1}$, with $0 \le d \le 1$. Fig. 9 shows the SER as a function of $d$ for different values of $L$ with the SNR fixed to 20 dB. When $d = 0$, then $[\sigma_{r,0}^2, \ldots, \sigma_{r,L-1}^2] = [1, 0, \ldots, 0]$; i.e., there is no multipath effect and the SER of the DD-TDM scheme coincides with that of Fig. 5 at 20 dB. DD-TDM OFDM is outperformed in this case since no multipath diversity is present to mitigate the coding gain loss. The other extreme case is $d = 1$, in which the SER of the DD-TDM OFDM scheme coincides with that of Fig. 7 at 20 dB. DD-TDM without OFDM suffers from severe degradation, as it was not designed to deal with multipath. For intermediate values of $d$ it can be observed that DD-TDM rapidly degrades and is outperformed by DD-TDM OFDM, even for small $L$ or $d$. DD-TDM OFDM experiences faster performance improvement for small $L$, because diversity gains "kick in" earlier; for large $L$, coding gain losses predominate.

Fig. 9. DD-TDM with and without OFDM with different multipath profiles for SNR = 20 and R = 2.

6.5. SF and LAR protocols

SF and LAR protocols to combat relaying errors are tested here. Fig. 10 shows the SER vs. SNR curves of DD-TDM OFDM for $R = 2, 3$, $L = 2$ and $\rho = 1, 2/3$. The $S$–$T_r$ link is also assumed multipath with $L = 2$, with scalar OFDM transmissions as in [16] and SNR 3 dB larger than that of the $T_r$–$D$ link. Recall that SF and LAR adjust the transmitted power using different coefficients $\alpha_r^{SF}$ and $\alpha_r^{LAR}$ as in (28) and (29). Thus, for a fair comparison, the overall transmission power is normalized to be the same. This is done by numerically computing the power consumption of each scheme, and adjusting the overall SNR accordingly. Compared to Fig. 5, relaying errors inevitably degrade system performance. The good news here is that diversity $RL$ is still achievable, as predicted in Section 5.2. LAR-based transmissions attain larger coding gains because they consider "soft" information at the receiver. SF, however, discards entire blocks, here of size $(K-2)LM = 2940$, even when only a few symbols were erroneously decoded at the relays. Frequency-flat ($L = 1$) scalar DD transmissions for link $S$–$T_r$ are also included for comparison. In this case diversity is reduced from $RL$ to $R$, as the $S$–$T_r$ link only collects diversity 1, and consequently becomes more unreliable (there are more relaying errors).

Fig. 11 repeats this test using SD-OSTBC OFDM transmissions. Note that in some cases ($R = 3$, or $R = 2$ from 17 dB) LAR is outperformed by SF. This is because transmissions with errors affect the OSTBC matrix orthogonality. This effect is not present when using DD-TDM OFDM transmissions (see Fig. 10) because the diagonal matrix structure is always preserved; the transmitted matrix is always diagonal. In any case, LAR transmissions are still able to combat errors more efficiently than the non-adaptive DF [2] (setting $\alpha_r = 1\ \forall r$), which is also included for comparison. Figs. 10 and 11 also include two asymptotes with diversity order 4 and 6, respectively. These asymptotes corroborate that SF and LAR SD-OSTBC OFDM tend to achieve diversity of order $RL$ at high SNR (from 16 dB and 20 dB with $R = 2$ and $R = 3$, respectively), whereas SF and LAR DD-TDM OFDM need higher SNR to approach this diversity order because of their higher coding gain losses.

Fig. 10. SF vs. LAR DD-TDM OFDM for R = 2, 3 and L = 2. Also included asymptotes with diversity of order 4 (narrow gray line) and 6 (thick gray line).

Fig. 11. SF vs. LAR vs. DF SD-OSTBC OFDM for R = 2, 3 and L = 2. Also included asymptotes with diversity of order 4 (narrow gray line) and 6 (thick gray line).

7. Conclusions

Differential modulation schemes that do not require channel state or synchronization information are developed. With the DD-TDM scheme, TDM transmissions using diagonal DD ST mappings are shown to effectively bypass channel and CFO knowledge at the destination. Precoded differential OFDM transmissions can additionally bypass timing offsets using either: (i) the DD-TDM OFDM scheme (diagonal DD coding across OFDM blocks); or (ii) the SD-OSTBC OFDM scheme (SD-OSTBC within OFDM blocks). DD-TDM has the lowest complexity, but it is vulnerable to timing errors among relays or multipath effects. The differential OFDM schemes deal with timing/multipath effects and collect space and multipath diversity: DD-TDM OFDM transmissions are robust to any CFO range, whereas SD-OSTBC OFDM transmissions achieve higher coding gains and have simpler decoding complexity, at the expense of a limited CFO mitigation range. Performance analysis assuming adaptive/selective decode-and-forward relaying is also provided, showing that diversity can still be achieved even when errors are present. Simulations corroborated theoretical claims and showed that the SD-OSTBC OFDM scheme using SF, due to its simple decoding, relative robustness, and considerably higher coding gain, can be preferable in applications with small CFO deviations.

Appendix A. Derivation of (7) and (8)

Eqs. (7a)–(7c) are derived from (6) by substituting and expanding the recursive equations in (2a) and (2b). Specifically, from (2b) we can expand $x_k$, for $k \ge 3$, as

$$x_k = D_{g_k} x_{k-1} = D_{g_k} D_{g_{k-1}} x_{k-2}$$

and, from (2a), we can expand $g_k$ as

$$D_{g_k} D_{g_{k-1}} x_{k-2} = D_{v_k} D_{g_{k-1}} D_{g_{k-1}} x_{k-2} = D_{v_k} D_{g_{k-1}}^{2} x_{k-2},$$

where the latter equality is possible because $D_{g_{k-1}}$ is a diagonal matrix. Substituting $x_k = D_{v_k} D_{g_{k-1}}^{2} x_{k-2}$ and rewriting $D_\omega^{k} = D_\omega^{k-1} D_\omega^{1}$ yields (7a). To derive (7b), we only need to expand $x_{k-1} = D_{g_{k-1}} x_{k-2}$. To derive (7c), we only need to rewrite $D_\omega^{k-2} = D_\omega^{k-1} D_\omega^{-1}$.

Eq. (8) is derived by expanding the products $D_{y_k} y_{k-1}^{*}$ and $D_{y_{k-1}} y_{k-2}^{*}$. Substituting $y_k$ and $y_{k-1}$ from (7a) and (7b), respectively, and ignoring the noise term for simplicity, yields

$$D_{y_k} y_{k-1}^{*} = D_h D_\omega D_\omega^{k-1} D_\omega^{1} D_{v_k} D_{g_{k-1}}^{2} D_{x_{k-2}} \left( D_h D_\omega D_\omega^{k-1} D_{g_{k-1}} x_{k-2} \right)^{*} = D_{v_k} D_{|h|^2} D_\omega^{1} g_{k-1}. \tag{A.1}$$

The last equality follows because, except for $D_h$, all matrices involved are diagonal and unitary, and so: (1) we can freely apply the commutative property and (2) conjugate multiplication is the identity vector/matrix:

$$D_\omega^{k-1} (D_\omega^{k-1})^{*} = I; \qquad D_{x_{k-2}} x_{k-2}^{*} = 1; \qquad D_{g_{k-1}}^{2} D_{g_{k-1}}^{*} = D_{g_{k-1}}.$$

Likewise, substituting $y_{k-1}$ and $y_{k-2}$ from (7b) and (7c), respectively, we obtain:

$$D_{y_{k-1}} y_{k-2}^{*} = D_h D_\omega D_\omega^{k-1} D_{g_{k-1}} D_{x_{k-2}} \left( D_h D_\omega D_\omega^{k-1} D_\omega^{-1} x_{k-2} \right)^{*} = D_{|h|^2} D_\omega^{1} g_{k-1}. \tag{A.2}$$

As seen, (A.1) and (A.2) are the same except for the $D_{v_k}$ term; hence (in the absence of noise):

$$D_{y_k} y_{k-1}^{*} = D_{v_k} D_{y_{k-1}} y_{k-2}^{*}. \tag{A.3}$$

Introducing noise, and after straightforward calculations, the term $z'_k$ in (9) can be likewise derived.

Appendix B. Derivation of (15)

Matrix $H_r$ is circulant and can be diagonalized by (pre)multiplying by (I)FFT matrices; i.e., $H_r F_{ML}^{H} = F_{ML}^{H} D_{h_r}$ with $D_{h_r} := \sqrt{ML}\, \mathrm{diag}(F_{ML}^{(1:L)} h_r)$ and $F_{ML}^{(1:L)}$ being a matrix formed by the first $L$ columns of $F_{ML}$. Using this fact in (14), and substituting (12), it holds that

$$H_r F_{ML}^{H} x_{r,k} = \sqrt{ML}\, F_{ML}^{H}\, \mathrm{diag}\left(1_L \otimes (F_M \tilde{x}_{r,k})\right) F_{ML}^{(1:L)} h_r. \tag{B.1}$$

In [16, Appendix A] it is shown that for any $M \times 1$ vector $u$ it further holds that

$$\sqrt{ML}\, F_{ML}^{H} (I_L \otimes \mathrm{diag}(u)) F_{ML}^{(1:L)} = \sqrt{L} \left( (F_M^{H} u) \otimes I_L \right). \tag{B.2}$$

Substituting $u = F_M \tilde{x}_{r,k}$, Eq. (15) follows.

Appendix C. Derivation of (31)

Consider first the DD-TDM OFDM scheme. For simplicity, we set $M = 1$ and thus subindex $m$ will be henceforth removed. Define vector $r_{k,\ell} := \Sigma_{k,\ell}^{-1/2} D_{y_{k,\ell}} y_{k-1,\ell}^{*}$. The detector in (17) can be written as

$$\hat{D}_{v_k} = \arg\min_{D_v} \sum_{\ell=0}^{L-1} \left\| r_{k,\ell} - D_v r_{k-1,\ell} \right\|^2. \tag{C.1}$$

Let $\tilde{D}_{v_k}$ be the matrix transmitted by the relays, which is different from the error-free $D_{v_k}$. With appropriate constants, the block error probability $\Pr(s, \tilde{s} \to \hat{s} \mid h)$ can be bounded by the symbol error probability. Since there is a one-to-one mapping from each symbol (entry) in $s$, $\tilde{s}$ and $\hat{s}$ to $D_{v_k}$, $\tilde{D}_{v_k}$ and $\hat{D}_{v_k}$, the block error probability is thus given by

$$\Pr(s, \tilde{s} \to \hat{s} \mid h) \le \Pr\left( \sum_{\ell=0}^{L-1} \left\| r_{k,\ell} - \hat{D}_{v_k} r_{k-1,\ell} \right\|^2 < \sum_{\ell=0}^{L-1} \left\| r_{k,\ell} - D_{v_k} r_{k-1,\ell} \right\|^2 \right) \tag{C.2}$$

with $\tilde{D}_{v_k}$ being the equivalent matrix transmitted by all relays. We approximate $r_{k,\ell}$ as $r_{k,\ell} \approx \sqrt{L}\, D_{v_k} \Sigma_{k,\ell}^{-1/2} \tilde{D}_{\omega_{k,\ell}} h_\ell^{(\alpha)} + z''_{k,\ell}$, with $[h_\ell^{(\alpha)}]_r := \sqrt{\alpha_r}\, |h_{r,\ell}|$ and $z''_{k,\ell}$ Gaussian. Ignoring $\Sigma_{k,\ell}$, the probability of error in (C.2) can be written as $\Pr(X > 0)$, where $X$ is Gaussian with respective mean and variance

$$\mu := \sum_{\ell=0}^{L-1} \left( \left\| (\tilde{D}_{v_k} - \hat{D}_{v_k}) h_\ell^{(\alpha)} \right\|^2 - \left\| (\tilde{D}_{v_k} - D_{v_k}) h_\ell^{(\alpha)} \right\|^2 \right), \qquad \sigma := \sum_{\ell=0}^{L-1} \left\| (\tilde{D}_{v_k} - D_{v_k}) h_\ell^{(\alpha)} \right\|^2.$$

The probability of error in (C.2) can then be bounded as

$$\Pr(s, \tilde{s} \to \hat{s} \mid h) \le Q\left( \frac{\sum_{\ell=0}^{L-1} \left\| (\tilde{D}_{v_k} - \hat{D}_{v_k}) h_\ell^{(\alpha)} \right\|^2 - \sum_{\ell=0}^{L-1} \left\| (\tilde{D}_{v_k} - D_{v_k}) h_\ell^{(\alpha)} \right\|^2}{\sqrt{2 \sum_{\ell=0}^{L-1} \left\| (\tilde{D}_{v_k} - \hat{D}_{v_k}) h_\ell^{(\alpha)} \right\|^2 + 2 \sum_{\ell=0}^{L-1} \left\| (\tilde{D}_{v_k} - D_{v_k}) h_\ell^{(\alpha)} \right\|^2}} \right). \tag{C.3}$$

The terms in (C.3) can be bounded as, up to multiplicative constants,

$$\sum_{\ell=0}^{L-1} \left\| (\tilde{D}_{v_k} - \hat{D}_{v_k}) h_\ell^{(\alpha)} \right\|^2 \ge \sum_{r \in \bar{E}} \alpha_r \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2,$$

$$\sum_{\ell=0}^{L-1} \left\| (\tilde{D}_{v_k} - D_{v_k}) h_\ell^{(\alpha)} \right\|^2 \le \sum_{r \in E} \alpha_r \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2.$$

Substituting in (C.3) and using the inequality

$$Q\left( \frac{a - b}{\sqrt{a + b}} \right) \le Q\left( \frac{a' - b'}{\sqrt{a' + b'}} \right)$$

whenever $a' \le a$ and $b' \ge b$, Eq. (31) follows. A similar argument can be carried out for OSTBCs by simply substituting $r_{k,\ell} := y_{k,\ell}$ in (C.1) and approximating it (in the absence of CFO) after Eq. (C.2) as $r_{k,\ell} \approx \sqrt{L}\, V_k^{T} X_{k-1} h_\ell^{(\alpha)} + z''_{k,\ell}$.

Appendix D. Proof of Proposition 4

In the SF protocol, $\alpha_r \in \{0, 1\}$ and thus the expectation of (32) over the channel can be split as the product of two expectations

$$E_h[\Pr(s \to \hat{s} \mid h)] \le \sum_{E} E_h\left[ \exp\left( -\sum_{r \in E} \sum_{\ell=0}^{L-1} |h_{r,\ell}^{(s)}|^2 \right) \right] E_h\left[ \exp\left( -\sum_{r \in \bar{E}} \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2 \right) \right]. \tag{D.1}$$

Using the Rayleigh channel model, the expectations in (D.1) can be easily computed yielding, up to multiplicative constants,

$$E_h[\Pr(s \to \hat{s} \mid h)] \le \sum_{E} \gamma^{-|E|L}\, \gamma^{-(R-|E|)L} \le \gamma^{-RL}. \tag{D.2}$$
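The last inequality in (D.2) holds only up to a constant. The following worked step makes that constant explicit; the count of error events is the only ingredient added here. Each summand equals $\gamma^{-RL}$ regardless of $|E|$, and there are $2^{R}$ subsets $E$ of the $R$ relays:

```latex
\sum_{E} \gamma^{-|E|L}\,\gamma^{-(R-|E|)L}
  = \sum_{E} \gamma^{-RL}
  = 2^{R}\,\gamma^{-RL}
```

Since $2^{R}$ does not depend on $\gamma$, it only shifts the error curve and leaves the slope, and hence the diversity order $RL$, unchanged.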


The second factor in each summand of (D.2) shows that diversity at the destination is reduced to $(R - |E|)L$. However, the probability of that happening decays with diversity $|E|L$, as seen in the first factor. The product of both factors shows that full diversity of order $RL$ is achieved.

Next we deal with $\alpha_r$ defined as in (29). In this case both $P_{e,h}(E)$ and $\Pr(s, \tilde{s} \to \hat{s} \mid h, E)$ are coupled through $\alpha_r$, and the expectation of (32) cannot be partitioned into a product of two terms. Alternatively, one can use the definition of $\alpha_r$ in (29) to bound the terms inside (31) as follows:

$$\sum_{r \in \bar{E}} \alpha_r \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2 \ge \frac{\min_{\ell, r \in \bar{E}} \{ |h_{r,\ell}|^2, \gamma \}}{\gamma} \left( \sum_{r \in \bar{E}} \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2 \right) \tag{D.3}$$

$$\sum_{r \in E} \alpha_r \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2 \le \left( \sum_{r \in E} \alpha_r \right) \left( \sum_{r \in E} \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2 \right) \le \left( \sum_{r \in E} \sum_{\ell=0}^{L-1} |h_{r,\ell}^{(s)}|^2 \right) \left( \sum_{r \in E} \sum_{\ell=0}^{L-1} \frac{|h_{r,\ell}|^2}{\gamma} \right). \tag{D.4}$$

For compactness, define the terms inside (D.3) and (D.4) as

$$\lambda_{\bar{E}} := \sum_{r \in \bar{E}} \sum_{\ell=0}^{L-1} |h_{r,\ell}|^2, \qquad \kappa_{E} := \sum_{r \in E} \sum_{\ell=0}^{L-1} \frac{|h_{r,\ell}|^2}{\gamma},$$

$$\lambda_{E} := \sum_{r \in E} \sum_{\ell=0}^{L-1} |h_{r,\ell}^{(s)}|^2, \qquad \kappa_{\bar{E}} := \frac{\min_{\ell, r \in \bar{E}} \{ |h_{r,\ell}|^2, \gamma \}}{\gamma}.$$

With these definitions, the product in (32) can be bounded as

$$\Pr(s \to \hat{s} \mid h) \le \exp(-\lambda_E)\, Q\left( \frac{\kappa_{\bar{E}} \lambda_{\bar{E}} - \kappa_E \lambda_E}{\sqrt{2 \kappa_{\bar{E}} \lambda_{\bar{E}} + 2 \kappa_E \lambda_E}} \right) \tag{D.5}$$

where the inequality

$$Q\left( \frac{a - b}{\sqrt{a + b}} \right) \le Q\left( \frac{a' - b'}{\sqrt{a' + b'}} \right)$$

whenever $a' \le a$ and $b' \ge b$ has been used. The random variable $\lambda_{\bar{E}}$ ($\lambda_E$) is Gamma-distributed with $|\bar{E}|L$ ($|E|L$) degrees of freedom. The random variables $\kappa_E$ and $\kappa_{\bar{E}}$ are independent of the SNR $\gamma$, since they are divided by it. In [2, Lemma 2] it is shown that an expression of the form (D.5) with the above-mentioned conditions can be bounded by a Gamma-distributed random variable with $|\bar{E}|L + |E|L = RL$ degrees of freedom. Averaging over the channel distribution, the diversity result of Proposition 4 follows.

References

[1] J.N. Laneman, G.W. Wornell, Distributed space–time-coded protocols for exploiting cooperative diversity in wireless networks, IEEE Trans. Inf. Theory 49 (October (10)) (2003) 2415–2425.
[2] A. Cano, T. Wang, A. Ribeiro, G.B. Giannakis, Link-adaptive distributed coding for multi-source cooperation, EURASIP J. Adv. Signal Process., 2008, http://dx.doi.org/10.1155/2008/352796.
[3] M. El-Hajjar, L. Hanzo, Dispensing with channel estimation…Low complexity noncoherent virtual MIMOs, IEEE Veh. Technol. Mag. 5 (June) (2010) 42–48.
[4] Y. Mei, Y. Hua, A. Swami, B. Daneshrad, Combating synchronization errors in cooperative relays, in: Proceedings of International Conference on Acoustics, Speech and Signal Processing, Philadelphia, March 2005.
[5] M.R. Bhatnagar, M. Debbah, A. Hjorungnes, A simple scheme for delay-tolerant decode-and-forward based cooperative communication, in: Proceedings of IEEE International Symposium on Information Theory, Seoul, July 2009.
[6] X. Li, F. Ng, T. Han, Carrier frequency offset mitigation in asynchronous cooperative OFDM transmissions, IEEE Trans. Signal Process. 56 (February (2)) (2008) 675–685.
[7] H. Mehrpouyan, S.D. Blostein, Bounds and algorithms for multiple frequency offset estimation in cooperative networks, IEEE Trans. Wirel. Commun. 10 (April (4)) (2011) 1300–1311.
[8] L.B. Thiagarajan, S. Sun, P.H.W. Fung, C.K. Ho, Multiple carrier frequency offset and channel estimation for distributed relay networks, in: Proceedings of IEEE Global Communications Conference, Miami, December 2010.
[9] D. Veronesi, D.L. Goeckel, Multiple frequency offset compensation in cooperative wireless systems, in: Proceedings of IEEE Global Communications Conference, San Francisco, December 2006.
[10] A. Yadav, V. Tapio, M. Juntti, J. Karjalainen, Timing and frequency offsets compensation in relay transmission for 3GPP LTE uplink, in: Proceedings of International Conference on Communications, Cape Town, South Africa, May 2010.
[11] H. Wang, X.-G. Xia, Q. Yin, Distributed space–frequency codes for cooperative communication systems with multiple carrier frequency offsets, IEEE Trans. Wirel. Commun. 8 (February (2)) (2009) 1045–1055.
[12] Q. Huang, M. Ghogho, J. Wei, P. Ciblat, Practical timing and frequency synchronization for OFDM-based cooperative systems, IEEE Trans. Signal Process. 58 (July (7)) (2010) 3706–3716.
[13] Q. Ma, C. Tepedelenlioglu, Z. Liu, Differential space–time–frequency coded OFDM with maximum diversity, IEEE Trans. Wirel. Commun. 4 (September (5)) (2005) 2232–2243.
[14] Z. Liu, G.B. Giannakis, B.L. Hughes, Double differential space–time block coding for time-varying fading channels, IEEE Trans. Commun. 49 (September (9)) (2001) 1529–1539.
[15] G. Ganesan, P. Stoica, Differential modulation using space–time block codes, IEEE Signal Process. Lett. 9 (February (2)) (2002) 57–60.
[16] X. Ma, Low-complexity block double-differential design for OFDM with carrier frequency offset, IEEE Trans. Commun. 53 (December (12)) (2005) 2129–2138.
[17] M.R. Bhatnagar, A. Hjørungnes, Distributed double-differential orthogonal space–time code for cooperative networks, in: Proceedings of IEEE Global Communications Conference, New Orleans, December 2008.
[18] T. Wang, A. Cano, G.B. Giannakis, Link-adaptive cooperative communications without channel state information, in: Proceedings of MILCOM Conference, Washington D.C., October 2006.
[19] N. Yi, Y. Ma, R. Tafazolli, Doubly differential communication assisted with cooperative relay, in: Proceedings of Vehicular Technology Conference, Singapore, May 2008.
[20] M.R. Bhatnagar, A. Hjörungnes, L. Song, Cooperative communications over flat fading channels with carrier offsets: a double-differential modulation approach, EURASIP J. Adv. Signal Process., 2008, http://dx.doi.org/10.1155/2008/531786.
[21] A. Cano, E. Morgado, A. Caamaño, J. Ramos, Distributed double-differential modulation for cooperative communications under CFO, in: Proceedings of IEEE Global Communications Conference, Washington D.C., November 2007.
[22] A. Cano, X. Ma, G.B. Giannakis, Block-differential modulation for doubly-selective wireless fading channels, IEEE Trans. Commun. 53 (December (12)) (2005) 2157–2166.
[23] P. Stoica, J. Liu, J. Li, Maximum-likelihood double differential detection clarified, IEEE Trans. Inf. Theory 50 (March (3)) (2004) 572–576.
[24] M. Morelli, A.N. D'Andrea, U. Mengali, Frequency ambiguity resolution in OFDM systems, IEEE Commun. Lett. 4 (April) (2000) 134–136.
[25] K.L. Clarkson, W. Sweldens, A. Zhang, Fast multiple-antenna differential decoding, IEEE Trans. Commun. 49 (February) (2001) 253–261.