ROD-based hybrid TH precoding and combining for mmWave large-scale MIMO systems

Digital Signal Processing 93 (2019) 102–114 Contents lists available at ScienceDirect Digital Signal Processing www.elsevier.com/locate/dsp ROD-bas...


Digital Signal Processing 93 (2019) 102–114

Contents lists available at ScienceDirect

Digital Signal Processing www.elsevier.com/locate/dsp

ROD-based hybrid TH precoding and combining for mmWave large-scale MIMO systems ✩

Xiaoyu Bai a, Fulai Liu a,b,∗, Ruiyan Du a,b, Xiaodong Kan a,b, Yixin Xu a, Yanshuo Zhang a,b

a School of Computer Science and Engineering, Northeastern University, Shenyang, 110819, China
b Institute of Engineering Optimization & Smart Antenna, Northeastern University at Qinhuangdao, 066004, China

Article info

Article history: Available online 24 July 2019

Keywords: Millimeter-wave; Hybrid precoder; Tomlinson-Harashima precoding; Nonlinear precoding; MIMO

Abstract

Hybrid precoding is one of the key techniques for millimeter wave (mmWave) large-scale multiple-input multiple-output (MIMO) systems. This paper considers a nonlinear hybrid precoding architecture which consists of a nonlinear unit, a reductive digital precoder and a constant modulus radio frequency (RF) precoder, and presents a novel hybrid Tomlinson-Harashima (TH) precoding and combining algorithm. Firstly, because the sum rates maximization problem is intractable for such a nonlinear hybrid precoding architecture, a tractable three-stage optimization problem is constructed through the lower bound of the sum rates, which allows the digital precoding matrix, the RF precoding matrix and the RF combining matrix to be optimized sequentially and independently. Then, in order to solve the three-stage optimization problem effectively, a novel row orthogonal decomposition (ROD) is defined. Based on the ROD, a necessary and sufficient condition for the optimal digital precoding matrix is obtained, and a near-optimal RF precoding matrix is derived. Finally, the optimization of the RF combining matrix is reformulated as a unimodular quadratic program and solved by a generalized power method. Theoretical analyses and simulations indicate that the proposed ROD-based hybrid TH precoding and combining algorithm offers higher sum rates and a lower bit error rate than previous works at a comparable complexity. © 2019 Elsevier Inc. All rights reserved.

1. Introduction

The recent proliferation of smart mobile devices has resulted in an unprecedented growth of data traffic in wireless communications. For example, the Cisco Visual Networking Index shows that global mobile data traffic will increase sevenfold between 2017 and 2022 [1]. Since the spectrum resources below 6 GHz are almost completely occupied, it is very challenging to meet the increasing communication demands with the conventional commercial frequency bands. One of the most efficient solutions is to transmit data with millimeter wave (mmWave) large-scale multiple-input multiple-output (MIMO) systems, owing to their high data rates, enormous idle spectrum resources and compact hardware structure [2].

✩ This work was supported by the Natural Science Foundation of Hebei Province (Grant No. F2016501139) and the Fundamental Research Funds for the Central Universities (Grant No. N172302002 and No. N162304002).
∗ Corresponding author at: Institute of Engineering Optimization & Smart Antenna, Northeastern University at Qinhuangdao, 066004, China. E-mail address: [email protected] (F. Liu).

https://doi.org/10.1016/j.dsp.2019.07.010
1051-2004/© 2019 Elsevier Inc. All rights reserved.

A fully digital precoding scheme for MIMO systems requires one individual radio frequency (RF) chain per antenna element [3–5], which is prohibitively complex and costly at mmWave frequencies. To address this issue, a hybrid precoding architecture has been widely considered for mmWave MIMO systems in recent years. The hybrid precoding architecture divides the precoder into a low-dimensional digital signal processor and a high-dimensional analog signal processor, so that it requires a significantly lower number of RF chains than the fully digital counterpart [6–9].

In the past few years, several hybrid precoding algorithms for multi-user mmWave MIMO systems have been proposed. The first prevalent category of hybrid precoding is the codebook-based method [10–12], in which the columns of the RF precoding matrix are selected from a predefined codebook. An equally spaced grouping scheme based on the discrete Fourier transform (DFT) codebook is proposed in [10]; the algorithm first divides the DFT codebook into groups, and then selects the group which maximizes the sum rates to construct the RF precoding matrix. A spatial rotation algorithm is proposed in [11], which refines the angles of the DFT beams and effectively improves the performance of the DFT codebook-based hybrid precoding method. Besides, the impacts of the instantaneous channel state information (CSI) and hybrid CSI are studied for the codebook-based hybrid precoding method in [12], and it is shown that the hybrid CSI is sufficient to achieve the first-order gain provided by massive MIMO systems in most cases. The second category of hybrid precoding is the non-codebook-based method, which usually solves a relaxed problem of the hybrid precoding matrix first, and then adjusts the solution according to the hardware constraints [13–16]. For example, a singular value decomposition (SVD)-based hybrid precoding algorithm is given in [13], which derives the analog precoding matrix and the digital precoding matrix via the SVD and the zero forcing (ZF) precoding, respectively. A hybrid block diagonalization (BD) precoding algorithm is developed in [14]; it first maximizes the effective channel gain via the RF precoding matrix, and then implements the BD precoding in the digital domain to suppress the inter-user interference. A joint channel estimation and hybrid precoding algorithm is given in [15], which exploits the strongest angle-of-arrival to design the analog precoding matrix and optimizes the digital precoding matrix through the ZF precoding. Additionally, an iterative hybrid precoding algorithm for low resolution RF phase shifters is developed in [16], which refines the RF precoding matrix and the digital precoding matrix via block coordinate ascent and minimum mean square error (MMSE) precoding, respectively.

However, all the aforementioned algorithms are linear precoding schemes, which may incur a performance loss, especially for an ill-conditioned channel matrix [17]. Fortunately, it has been shown in [18] that this loss can be effectively avoided by adding a perturbation vector to the data streams in advance. Based on this idea, a nonlinear hybrid MMSE-vector perturbation (MMSE-VP) scheme is proposed in [19]. However, the RF precoding matrix in [19] is implemented by phase shifters and power amplifiers simultaneously, which is energy hungry. To solve this problem, a DFT codebook-based nonlinear hybrid MMSE-VP precoding algorithm is further given in [20], whose RF precoding matrix is based only on energy-efficient phase shifters. Though the hybrid MMSE-VP precoding algorithms offer significant improvement over linear methods, a Lenstra-Lenstra-Lovász basis reduction and a Branching-Reduction-and-Bounding method are used to solve for the perturbation vector and the RF precoding matrix in [19,20], which involve high complexities. In order to reduce the computational costs, a low-complexity nonlinear hybrid block diagonal geometric mean decomposition (BD-GMD) Tomlinson-Harashima (TH) precoding algorithm is proposed in [21], in which an orthogonal matching pursuit (OMP) algorithm is used to decompose the fully digital TH precoding matrix into the product of the RF precoding matrix and the digital precoding matrix. However, the OMP algorithm restricts the column vectors of the RF precoding matrix to a predefined codebook; if the codewords within the codebook are far from the optimal solution of the RF precoding matrix, the system performance inevitably degrades.

Against this backdrop, a novel hybrid TH precoding and combining algorithm is proposed in this paper. The main contributions can be summarized as follows:

• A tractable optimization problem of the precoding and combining matrices is first constructed through the lower bound of the sum rates, and then the problem is further transformed into an equivalent three-stage optimization problem which allows the digital precoding matrix, the RF precoding matrix and the RF combining matrix to be optimized sequentially and independently.
• To solve the aforementioned three-stage optimization problem effectively, a novel row orthogonal decomposition (ROD), which represents the orthonormal bases of the row space of a matrix, is defined. Based on the newly defined ROD, the necessary and sufficient condition for the optimal digital precoding matrix is derived and a near-optimal RF precoding matrix is given. Then, by utilizing the asymptotic orthogonality of different user channels, the optimization of the RF combining matrix is reformulated as a unimodular quadratic program and solved by a generalized power method.
• The sum rates and bit error rate (BER) of the presented algorithm are evaluated by theoretical analyses and simulations. Results indicate that the performance loss of the proposed algorithm is slight. Compared with the previous hybrid precoding methods, the proposed algorithm improves the sum rates and reduces the BER significantly with comparable computational costs.

The rest of this paper is organized as follows. In Section 2, the system and channel models are described. Section 3 explains the proposed hybrid TH precoding algorithm. In Section 4, the asymptotic performance of the proposed algorithm is analyzed. Section 5 evaluates the performance of the proposed algorithm through several simulations, and Section 6 concludes the paper.

Throughout this paper, $\mathbf{A}$ is a matrix, $\mathbf{a}$ is a vector and $a$ is a scalar. $|a|$, $\angle a$ and $a^{*}$ are the magnitude, argument and conjugate of the complex number $a$, respectively. The field of complex numbers is represented by $\mathbb{C}$. $|\mathbf{A}|$ denotes the determinant of $\mathbf{A}$, $\|\mathbf{A}\|_F$ is its Frobenius norm, $\mathrm{rank}(\mathbf{A})$ stands for the rank of $\mathbf{A}$, $\mathrm{Tr}(\mathbf{A})$ represents the trace of $\mathbf{A}$, and $\mathbf{A}^{-1}$, $\mathbf{A}^{\dagger}$, $\mathbf{A}^{T}$ and $\mathbf{A}^{H}$ are its inverse, Moore-Penrose pseudo-inverse, transpose and conjugate transpose, respectively. $\mathcal{R}(\mathbf{A})$ and $\mathcal{N}(\mathbf{A})$ are the column space and nullspace of $\mathbf{A}$. $[\mathbf{A}]_{m,n}$ stands for the $(m,n)$th element of the matrix $\mathbf{A}$. $e^{j\angle\mathbf{A}}$ is a matrix whose $(m,n)$th element is equal to $e^{j\angle[\mathbf{A}]_{m,n}}$. $\mathbf{I}$ and $\mathbf{0}$ stand for the identity matrix and zero matrix, respectively. $\mathrm{diag}\{a_1, \cdots, a_N\}$ is a diagonal matrix with the entries in $\{a_1, \cdots, a_N\}$ on its diagonal, $\mathrm{diag}\{\mathbf{A}\}$ represents a diagonal matrix with diagonal elements given by $[\mathbf{A}]_{1,1}, \cdots, [\mathbf{A}]_{N,N}$, and $\mathrm{diag}\{\mathbf{A}_1, \cdots, \mathbf{A}_N\}$ is a block-diagonal matrix with the elements in $\{\mathbf{A}_1, \cdots, \mathbf{A}_N\}$ as the diagonal blocks. $\|\mathbf{a}\|$ denotes the Euclidean norm of the vector $\mathbf{a}$, and $\mathbf{a}(i)$ is the $i$th entry of the vector $\mathbf{a}$. $\mathrm{E}[\cdot]$ is used to denote expectation, and $\mathcal{CN}(a, b)$ is the complex Gaussian distribution with mean $a$ and covariance $b$.

2. System model

Consider a multi-user mmWave MIMO system with a hybrid TH precoder as shown in Fig. 1 [21], in which a base station (BS) equipped with $N_t$ antennas and $N_{RF}$ RF chains simultaneously serves $K$ users, each with $N_r$ antennas and one RF chain. The BS employs a hybrid TH precoder to send an $N_s \times 1$ signal vector $\mathbf{s} = [s_1, \cdots, s_{N_s}]^T$ to the users, where each entry of $\mathbf{s}$ is chosen from the M-ary quadrature amplitude modulation (M-QAM) constellation set

$$\mathcal{A} = \left\{ s_R + j s_I \,\middle|\, s_R, s_I \in \left\{ \pm\sqrt{\tfrac{3}{2(M-1)}},\ \pm 3\sqrt{\tfrac{3}{2(M-1)}},\ \cdots,\ \pm(\sqrt{M}-1)\sqrt{\tfrac{3}{2(M-1)}} \right\} \right\}.$$

It is assumed that all the data streams are independent and have unit power, i.e., $\mathrm{E}\{\mathbf{s}\mathbf{s}^H\} = \mathbf{I}$. The signal vector $\mathbf{s}$ is first fed into a nonlinear unit which consists of an $N_s \times N_s$ feedback matrix $\mathbf{B}$ and a modulo operator. The feedback matrix $\mathbf{B}$ is a strictly lower triangular matrix and the modulo operator $\mathrm{MOD}_M(x)$ is defined as [22]

$$\mathrm{MOD}_M(x) = x - 2\tau\left( \left\lfloor \frac{\Re(x)}{2\tau} + \frac{1}{2} \right\rfloor + j\left\lfloor \frac{\Im(x)}{2\tau} + \frac{1}{2} \right\rfloor \right)$$

where $\tau = \sqrt{M}\sqrt{\tfrac{3}{2(M-1)}}$, $\lfloor x \rfloor$ is the largest integer not exceeding $x$, and $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and imaginary parts of a complex number, respectively. The output vector $\tilde{\mathbf{s}}$ of the nonlinear unit can be expressed as [23]


Fig. 1. A multi-user mmWave MIMO system with a hybrid TH precoder.

$$\tilde{s}_i = \mathrm{MOD}_M\left( s_i - \sum_{k=1}^{i-1} [\mathbf{B}]_{i,k}\,\tilde{s}_k \right), \quad i = 1, \cdots, N_s.$$

It is evident that $\tilde{s}_i$ no longer belongs to the M-QAM constellation points, hence the power of $\tilde{\mathbf{s}}$ differs from that of $\mathbf{s}$. Invoking [24], we have $\mathrm{E}\{\tilde{\mathbf{s}}\tilde{\mathbf{s}}^H\} = \alpha^{-1}\mathrm{E}\{\mathbf{s}\mathbf{s}^H\}$, where $\alpha = \frac{M-1}{M}$. After the nonlinear operation, the vector $\tilde{\mathbf{s}}$ is multiplied by the digital precoding matrix $\mathbf{F}_{BB} \in \mathbb{C}^{N_{RF} \times N_s}$ and the analog precoding matrix $\mathbf{F}_{RF} \in \mathbb{C}^{N_t \times N_{RF}}$ sequentially; therefore, the transmitted signal $\mathbf{x}$ can be given by

$$\mathbf{x} = \mathbf{F}_{RF}\mathbf{F}_{BB}\tilde{\mathbf{s}}$$

where the analog precoding matrix $\mathbf{F}_{RF}$ is subject to $|[\mathbf{F}_{RF}]_{m,n}| = \frac{1}{\sqrt{N_t}}$. The transmit power constraint can be expressed as $\|\mathbf{F}_{RF}\mathbf{F}_{BB}\|_F^2 = \alpha P$, where $P$ is the transmit power.

The received signal at the $k$th user prior to the modulo operator can be written as

$$\tilde{y}_k = w_{BB,k}\,\mathbf{w}_{RF,k}^H \mathbf{H}_k \mathbf{F}_{RF}\mathbf{F}_{BB}\tilde{\mathbf{s}} + w_{BB,k}\,\mathbf{w}_{RF,k}^H \mathbf{n}_k$$

where $\mathbf{H}_k \in \mathbb{C}^{N_r \times N_t}$ stands for the channel matrix between the BS and the $k$th user, $\mathbf{w}_{RF,k} \in \mathbb{C}^{N_r \times 1}$ denotes the analog RF combining vector, which is subject to $|\mathbf{w}_{RF,k}(i)| = \frac{1}{\sqrt{N_r}}$, and $w_{BB,k}$ is a scaling factor. The entries of the noise vector $\mathbf{n}_k \in \mathbb{C}^{N_r \times 1}$ are independent and identically distributed (i.i.d.) $\mathcal{CN}(0, \sigma_n^2)$ random variables, in which $\sigma_n^2$ denotes the noise power. For simplicity, the received signals prior to the modulo operator for all users can be reformulated as

$$\tilde{\mathbf{y}} = \mathbf{W}_{BB}\mathbf{W}_{RF}^H \mathbf{H}\mathbf{F}_{RF}\mathbf{F}_{BB}\tilde{\mathbf{s}} + \mathbf{W}_{BB}\mathbf{W}_{RF}^H \mathbf{n}$$

where $\tilde{\mathbf{y}} = [\tilde{y}_1 \cdots \tilde{y}_K]^T$, $\mathbf{H} = [\mathbf{H}_1^H \cdots \mathbf{H}_K^H]^H$, $\mathbf{W}_{BB} = \mathrm{diag}\{w_{BB,1}, \cdots, w_{BB,K}\}$ and $\mathbf{W}_{RF} = \mathrm{diag}\{\mathbf{w}_{RF,1}, \cdots, \mathbf{w}_{RF,K}\}$.

The mmWave propagation channel can be characterized by a Rician channel model consisting of LOS and NLOS components [15,25]

$$\mathbf{H}_k = \sqrt{\frac{v_k}{v_k + 1}}\,\mathbf{H}_{\mathrm{LOS},k} + \sqrt{\frac{1}{v_k + 1}}\,\mathbf{H}_{\mathrm{NLOS},k}$$

where $v_k$ is referred to as the Rician factor, and $\mathbf{H}_{\mathrm{LOS},k}$ and $\mathbf{H}_{\mathrm{NLOS},k}$ denote the LOS component and the NLOS component, respectively. It is assumed that the NLOS component $\mathbf{H}_{\mathrm{NLOS},k}$ consists of $N_{c,k}$ clusters and each cluster is composed of $N_{l,k}$ paths; accordingly, the LOS and NLOS components can be given by [15]

$$\mathbf{H}_{\mathrm{LOS},k} = \alpha_k\,\mathbf{a}_r(\phi_k^r, \theta_k^r)\,\mathbf{a}_t(\phi_k^t, \theta_k^t)^H$$

$$\mathbf{H}_{\mathrm{NLOS},k} = \sqrt{\frac{1}{N_{c,k} N_{l,k}}} \sum_{i=1}^{N_{c,k}} \sum_{l=1}^{N_{l,k}} \alpha_{k,i,l}\,\mathbf{a}_r(\phi_{k,i,l}^r, \theta_{k,i,l}^r)\,\mathbf{a}_t(\phi_{k,i,l}^t, \theta_{k,i,l}^t)^H$$

where $\alpha_k$, $\phi_k^r$ ($\theta_k^r$) and $\phi_k^t$ ($\theta_k^t$) denote the complex gain, the azimuth (elevation) angle of arrival (AOA) and the azimuth (elevation) angle of departure (AOD) of the LOS path associated with the $k$th user. $\alpha_{k,i,l}$, $\phi_{k,i,l}^r$ ($\theta_{k,i,l}^r$) and $\phi_{k,i,l}^t$ ($\theta_{k,i,l}^t$) are the complex gain, the azimuth (elevation) AOA and the azimuth (elevation) AOD of the $(i,l)$th path of the NLOS component associated with the $k$th user. $\mathbf{a}_r(\phi^r, \theta^r)$ and $\mathbf{a}_t(\phi^t, \theta^t)$ represent the array response vectors of the user and the BS, respectively. For an $M \times N$ uniform planar array (UPA) in the $yz$-plane with a half-wavelength inter-element spacing, the array response vector can be expressed as [26]

$$\mathbf{a}(\phi, \theta) = \left[ 1 \ \cdots \ e^{j\pi(m\sin\phi\sin\theta + n\cos\theta)} \ \cdots \ e^{j\pi((M-1)\sin\phi\sin\theta + (N-1)\cos\theta)} \right]^T$$

where $0 \le m \le M-1$ and $0 \le n \le N-1$ are the $y$ and $z$ indices of the antenna elements, respectively.
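The array response and the rank-one LOS component above can be illustrated with a short standard-library sketch. It is only an illustration under stated assumptions: the function names, the angle values, the complex gain and the element ordering (index m outer, index n inner) are hypothetical choices that the paper does not specify.

```python
import cmath
import math


def upa_response(phi, theta, M, N):
    """Array response of an M x N half-wavelength UPA in the yz-plane.

    Each entry is exp(j*pi*(m*sin(phi)*sin(theta) + n*cos(theta))) with
    0 <= m < M and 0 <= n < N. The m-major ordering is an assumption.
    """
    return [
        cmath.exp(1j * math.pi * (m * math.sin(phi) * math.sin(theta)
                                  + n * math.cos(theta)))
        for m in range(M)
        for n in range(N)
    ]


def los_channel(alpha, a_r, a_t):
    """Rank-one LOS channel alpha * a_r * a_t^H as a list of rows."""
    return [[alpha * r * t.conjugate() for t in a_t] for r in a_r]


# Hypothetical angles (radians) and complex gain, chosen only for illustration.
a_t = upa_response(0.3, 1.1, 4, 4)    # BS side, N_t = 16
a_r = upa_response(-0.7, 0.9, 2, 2)   # user side, N_r = 4
H_los = los_channel(0.8 + 0.2j, a_r, a_t)
print(len(H_los), len(H_los[0]))
```

Every entry of the response vector has unit modulus, which is what later allows an RF precoder or combiner built from phase shifters alone to align with a LOS path.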

3. Hybrid TH precoding algorithm

In this section, the nonlinear unit, the hybrid precoder at the BS and the hybrid combiner at the users are considered sequentially. For convenience, the diagram of the hybrid TH precoder is redrawn in Fig. 2, where the digital precoding matrix $\mathbf{F}_{BB}$ is decomposed into a product of the matrices $\mathbf{F}_P \in \mathbb{C}^{N_{RF} \times K}$ and $\mathbf{F}_D \in \mathbb{C}^{K \times K}$, i.e., $\mathbf{F}_{BB} = \mathbf{F}_P\mathbf{F}_D$.

3.1. Problem formulation

Firstly, by regarding the matrix $\mathbf{H}_e = \mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P$ as an effective channel and invoking the fully digital TH precoding algorithm proposed in [27], the feedback matrix $\mathbf{B}$, the digital precoding matrix $\mathbf{F}_D$ and the digital combining matrix $\mathbf{W}_{BB}$ can be given by 1

$$\mathbf{B} = \mathbf{G}_e^{-1}\mathbf{L}_e - \mathbf{I} \tag{1}$$

$$\mathbf{F}_D = \mathbf{Q}_e^H \tag{2}$$

$$\mathbf{W}_{BB} = \mathbf{G}_e^{-1} \tag{3}$$

1 The details of Eqs. (1)–(3) are omitted due to limited space; interested readers are referred to Eq. (14) in [27].
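To see why the choices in Eqs. (1)–(3) remove inter-stream interference, the following sketch runs the nonlinear unit and a noiseless link for a given lower triangular factor. It is a minimal illustration, not the authors' implementation: the function names and the 3 x 3 matrix `L_e` are hypothetical, noise is omitted, and the effective channel is assumed to be already reduced to its lower triangular factor (that is, $\mathbf{F}_D = \mathbf{Q}_e^H$ has been applied).

```python
import math
import random


def th_modulo(x, M):
    """Fold x into the square [-tau, tau) x [-tau, tau) of the M-QAM grid."""
    tau = math.sqrt(M) * math.sqrt(3.0 / (2.0 * (M - 1)))
    re = x.real - 2 * tau * math.floor(x.real / (2 * tau) + 0.5)
    im = x.imag - 2 * tau * math.floor(x.imag / (2 * tau) + 0.5)
    return complex(re, im)


def qam_points(M):
    """Unit-power square M-QAM constellation (M must be a perfect square)."""
    d = math.sqrt(3.0 / (2.0 * (M - 1)))
    root = math.isqrt(M)
    levels = [(2 * i - root + 1) * d for i in range(root)]
    return [complex(a, b) for a in levels for b in levels]


def th_precode(s, L, M):
    """Nonlinear unit with B = G^-1 L - I, i.e. [B]_{i,k} = L[i][k] / L[i][i]."""
    s_tilde = []
    for i in range(len(s)):
        feedback = sum(L[i][k] / L[i][i] * s_tilde[k] for k in range(i))
        s_tilde.append(th_modulo(s[i] - feedback, M))
    return s_tilde


def th_receive(s_tilde, L, M):
    """Noiseless link y = L s_tilde, scaling by W_BB = G^-1, then modulo."""
    out = []
    for i in range(len(s_tilde)):
        y = sum(L[i][k] * s_tilde[k] for k in range(i + 1))
        out.append(th_modulo(y / L[i][i], M))
    return out


M = 16
# Hypothetical lower triangular factor with a positive real diagonal.
L_e = [[1.2, 0, 0],
       [0.5 - 0.3j, 0.9, 0],
       [-0.4 + 0.1j, 0.7j, 1.1]]
random.seed(1)
s = [random.choice(qam_points(M)) for _ in range(3)]
s_hat = th_receive(th_precode(s, L_e, M), L_e, M)
print(max(abs(a - b) for a, b in zip(s, s_hat)))  # numerically close to zero
```

In the noiseless case each user sees its own symbol shifted by an integer multiple of $2\tau$, so the receive-side modulo returns the original constellation point.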


Fig. 2. The diagram of the hybrid TH precoder.

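where $\mathbf{L}_e$ and $\mathbf{Q}_e$ are given by the LQ decomposition 2 (LQD) of the effective channel, $\mathbf{H}_e = \mathbf{L}_e\mathbf{Q}_e$, and $\mathbf{G}_e = \mathrm{diag}\{l_{e,11}, \cdots, l_{e,KK}\}$ is a diagonal matrix in which $l_{e,kk}$ is the $k$th diagonal entry of $\mathbf{L}_e$. Then, the sum rates of the hybrid TH precoder can be expressed as $R = \log\left|\mathbf{I} + \frac{1}{\sigma_n^2}\mathbf{G}_e^2\right|$ [22,27]. Accordingly, the optimization problem of the hybrid precoding matrices $\mathbf{F}_{RF}$, $\mathbf{F}_P$ and the analog combining matrix $\mathbf{W}_{RF}$ can be expressed as

$$
\begin{aligned}
\max_{\mathbf{W}_{RF}, \mathbf{F}_{RF}, \mathbf{F}_P} \quad & \log\left|\mathbf{I} + \frac{1}{\sigma_n^2}\mathbf{G}_e^2\right| \\
\text{s.t.} \quad & \|\mathbf{F}_{RF}\mathbf{F}_P\|_F^2 = \alpha P & \mathrm{C1} \\
& |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j & \mathrm{C2} \\
& |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m & \mathrm{C3}.
\end{aligned}
\tag{4}
$$

However, the objective function of (4) is not an explicit function of the matrix variables $\mathbf{F}_{RF}$, $\mathbf{F}_P$ and $\mathbf{W}_{RF}$, which renders the problem intractable. Fortunately, the objective function of (4) is lower bounded by $\log\left|\frac{1}{\sigma_n^2}\mathbf{G}_e^2\right|$ [16]. Accordingly, the problem (4) can be approximately transformed into the following formulation

$$
\begin{aligned}
\max_{\mathbf{W}_{RF}, \mathbf{F}_{RF}, \mathbf{F}_P} \quad & \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P\mathbf{F}_P^H\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right| \\
\text{s.t.} \quad & \|\mathbf{F}_{RF}\mathbf{F}_P\|_F^2 = \alpha P & \mathrm{C1} \\
& |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j & \mathrm{C2} \\
& |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m & \mathrm{C3},
\end{aligned}
\tag{5}
$$

where $\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P\mathbf{F}_P^H\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right| = \left|\mathbf{H}_e\mathbf{H}_e^H\right| = \left|\mathbf{G}_e^2\right|$. It is noteworthy that the transmit power constraint C1 of (5) involves a product of $\mathbf{F}_{RF}$ and $\mathbf{F}_P$. In order to simplify the constraint C1 further, a notation $\hat{\mathbf{F}}_P = (\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{\frac{1}{2}}\mathbf{F}_P$ is introduced and the problem (5) can be rewritten as

$$
\begin{aligned}
\max_{\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P} \quad & J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P) \\
\text{s.t.} \quad & \|\hat{\mathbf{F}}_P\|_F^2 = \alpha P & \mathrm{C1} \\
& |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j & \mathrm{C2} \\
& |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m & \mathrm{C3},
\end{aligned}
\tag{6}
$$

where $J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P) = \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|$. Inspired by [28], the problem (6) can be rewritten as the following three-stage form

$$
\mathrm{P3}:\ \max_{\mathbf{W}_{RF}} \left\{
\begin{aligned}
&\mathrm{P2}:\ \max_{\mathbf{F}_{RF}} \left\{
\begin{aligned}
&\mathrm{P1}:\ \max_{\hat{\mathbf{F}}_P}\ J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P) \\
&\text{s.t.}\ \|\hat{\mathbf{F}}_P\|_F^2 = \alpha P \quad \mathrm{C1}
\end{aligned}
\right\} \\
&\text{s.t.}\ |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j \quad \mathrm{C2}
\end{aligned}
\right\}
\quad \text{s.t.}\ |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m \quad \mathrm{C3}.
\tag{7}
$$

The problem (7) consists of three subproblems P1, P2 and P3. P1 is a problem of $\hat{\mathbf{F}}_P$ with given $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$; without loss of generality, the optimal solution of P1 can be expressed as $\hat{\mathbf{F}}_{P,opt} = f(\mathbf{F}_{RF}, \mathbf{W}_{RF})$. Then, the objective function of P2 can be given by $J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, f(\mathbf{F}_{RF}, \mathbf{W}_{RF}))$. Similarly, introducing a notation $\mathbf{F}_{RF,opt} = g(\mathbf{W}_{RF})$ to represent the optimal solution of P2, the objective function of P3 is $J(\mathbf{W}_{RF}, g(\mathbf{W}_{RF}), f(g(\mathbf{W}_{RF}), \mathbf{W}_{RF}))$. The following theorem shows the relationship between the optimal solutions of P1, P2, P3 and the optimal solution of (6).

Theorem 1. If $\hat{\mathbf{F}}_{P,opt} = f(\mathbf{F}_{RF}, \mathbf{W}_{RF})$, $\mathbf{F}_{RF,opt} = g(\mathbf{W}_{RF})$ and $\mathbf{W}_{RF,opt}$ are the optimal solutions of P1, P2 and P3 in (7) respectively, then $\{\mathbf{W}_{RF,opt},\ g(\mathbf{W}_{RF,opt}),\ f(g(\mathbf{W}_{RF,opt}), \mathbf{W}_{RF,opt})\}$ is the optimal solution of (6).

Proof 1. For arbitrary matrix variables $\hat{\mathbf{F}}_P$, $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$ that satisfy the constraints in (7), the following inequalities are true

$$J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, f(\mathbf{F}_{RF}, \mathbf{W}_{RF})) \ge J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P) \tag{8}$$

$$J(\mathbf{W}_{RF}, g(\mathbf{W}_{RF}), f(g(\mathbf{W}_{RF}), \mathbf{W}_{RF})) \ge J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, f(\mathbf{F}_{RF}, \mathbf{W}_{RF})) \tag{9}$$

$$J(\mathbf{W}_{RF,opt}, g(\mathbf{W}_{RF,opt}), f(g(\mathbf{W}_{RF,opt}), \mathbf{W}_{RF,opt})) \ge J(\mathbf{W}_{RF}, g(\mathbf{W}_{RF}), f(g(\mathbf{W}_{RF}), \mathbf{W}_{RF})). \tag{10}$$

Combining (8), (9) and (10), it can be easily derived that

$$J(\mathbf{W}_{RF,opt}, g(\mathbf{W}_{RF,opt}), f(g(\mathbf{W}_{RF,opt}), \mathbf{W}_{RF,opt})) \ge J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P).$$

This completes the proof of Theorem 1.

According to Theorem 1, the optimal solution of (6) can be obtained via three stages. Based on this idea, the optimization of the digital precoding matrix $\hat{\mathbf{F}}_P$, the RF precoding matrix $\mathbf{F}_{RF}$ and the RF combining matrix $\mathbf{W}_{RF}$ is discussed in turn in the following.

2 The LQD is equivalent to the well-known QR decomposition, i.e., $\mathbf{H} = \mathbf{L}\mathbf{Q} \Leftrightarrow \mathbf{H}^H = \mathbf{Q}^H\mathbf{R}$.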

3.2. Optimization of the digital precoding matrix $\hat{\mathbf{F}}_P$

Firstly, let us focus on the problem P1 of the digital precoding matrix $\hat{\mathbf{F}}_P$. To solve P1, the following definition and lemma are essential.

Definition 1. For a matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$ ($m \le n$), the decomposition of $\mathbf{A}$ into $\mathbf{A} = \mathbf{P}\mathbf{R}$ is called a row orthogonal decomposition (ROD) when $\mathbf{R} \in \mathbb{C}^{m \times n}$ satisfies $\mathbf{R}\mathbf{R}^H = \mathbf{I}$, and the matrix $\mathbf{R}$ is called the row orthogonal matrix of $\mathbf{A}$.
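One way to obtain an ROD in the sense of Definition 1 is to orthonormalise the rows of the matrix from top to bottom, which yields an LQ-type factorisation. The sketch below does this with classical Gram-Schmidt in plain Python; the function name `rod_via_lq` and the example matrix are hypothetical, and full row rank is assumed (otherwise a division by zero occurs).

```python
def rod_via_lq(A):
    """Return (P, R) with A = P R, R R^H = I and P lower triangular.

    Rows of A are orthonormalised top to bottom (classical Gram-Schmidt).
    A is a list of rows and is assumed to have full row rank.
    """
    m = len(A)
    P = [[0j] * m for _ in range(m)]
    R = []
    for i in range(m):
        v = [complex(x) for x in A[i]]
        for k in range(i):
            # Coefficient of row i of A along the k-th orthonormal row.
            P[i][k] = sum(complex(a) * r.conjugate() for a, r in zip(A[i], R[k]))
            v = [x - P[i][k] * r for x, r in zip(v, R[k])]
        norm = sum(abs(x) ** 2 for x in v) ** 0.5
        P[i][i] = complex(norm)
        R.append([x / norm for x in v])
    return P, R


# Hypothetical 2 x 4 example.
A = [[1 + 1j, 2, 0.5j, -1],
     [0.3, -1j, 1, 2 + 0.5j]]
P, R = rod_via_lq(A)
gram = [[sum(x * y.conjugate() for x, y in zip(ri, rk)) for rk in R] for ri in R]
print(gram)  # close to the 2 x 2 identity
```

Because the resulting P is lower triangular with a positive diagonal, this particular ROD coincides with an LQ decomposition; an SVD-based choice would satisfy Definition 1 equally well.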


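Lemma 1. If $\mathbf{A} \in \mathbb{C}^{m \times n}$ ($m \le n$) has full row rank, then there must be an ROD $\mathbf{A} = \mathbf{P}\mathbf{R}$ such that $\mathbf{P} \in \mathbb{C}^{m \times m}$ is invertible.

Proof 2. Considering the SVD $\mathbf{A} = \mathbf{U}\,[\boldsymbol{\Sigma}\ \ \mathbf{0}]\,[\mathbf{V}_1\ \ \mathbf{V}_2]^H$, let $\mathbf{P} = \mathbf{U}\boldsymbol{\Sigma}$ and $\mathbf{R} = \mathbf{V}_1^H$; then we have $\mathbf{A} = \mathbf{P}\mathbf{R}$ and $\mathbf{R}\mathbf{R}^H = \mathbf{I}$. Because $\mathbf{A}$ has full row rank, $\boldsymbol{\Sigma}$ is invertible, and hence $\mathbf{P}$ is also invertible. This completes the proof of Lemma 1.

With the definitions of the ROD and the row orthogonal matrix, the optimal solution of the problem P1 in (7) can be given by the following theorem.

Theorem 2. For the optimization problem P1 in (7), if the matrix $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}} \in \mathbb{C}^{K \times N_{RF}}$ has full row rank, then the following statements are true.
1) $\hat{\mathbf{F}}_{P,opt} \in \mathbb{C}^{N_{RF} \times K}$ is the optimal solution of the problem P1 if and only if $\sqrt{\frac{K}{\alpha P}}\,\hat{\mathbf{F}}_{P,opt}^H$ is the row orthogonal matrix of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$.
2) The maximum of the problem P1 is given by $K\log\frac{\alpha P}{K} + \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-1}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|$.

Proof 3. The proof is shown in Appendix A.

According to Theorem 2, for fixed RF precoding and combining matrices $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$, $\hat{\mathbf{F}}_{P,opt}$ can be derived by the ROD of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$, and the digital precoding matrix $\mathbf{F}_P$ can be given by

$$\mathbf{F}_P = (\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\hat{\mathbf{F}}_{P,opt}. \tag{11}$$

3.3. Design of the RF precoding matrix $\mathbf{F}_{RF}$

With the digital precoding matrix $\mathbf{F}_P$ given by (11), the optimization problem P2 of the RF precoding matrix $\mathbf{F}_{RF}$ can be rewritten as

$$
\mathrm{P2}:\quad
\begin{aligned}
\mathbf{F}_{RF,opt} = \arg\max_{\mathbf{F}_{RF}} \quad & \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-1}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right| \\
\text{s.t.} \quad & |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}}.
\end{aligned}
$$

Unfortunately, the constant modulus constraint renders the problem P2 non-convex. Inspired by [29], the constant modulus constraint is ignored at first, and then the unconstrained optimal solution is projected onto the constant modulus set. Namely, the problem P2 is approximately transformed into the following two sequential problems

$$\mathbf{F}_{RF,opt}^{uncons} = \arg\max_{\mathbf{F}_{RF}}\ \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-1}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|, \tag{12}$$

$$
\begin{aligned}
\mathbf{F}_{RF,opt} = \arg\min_{\mathbf{F}_{RF}} \quad & \left\|\mathbf{F}_{RF} - \mathbf{F}_{RF,opt}^{uncons}\right\|_F^2 \\
\text{s.t.} \quad & |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}}.
\end{aligned}
\tag{13}
$$

The optimal solution of (12) can be given by the following theorem.

Theorem 3. For the full row rank matrix $\mathbf{W}_{RF}^H\mathbf{H} \in \mathbb{C}^{K \times N_t}$ ($K \le N_t$) with the ROD $\mathbf{W}_{RF}^H\mathbf{H} = \mathbf{P}\mathbf{R}$, the following statements are true.
1) $\mathbf{F}_{RF,opt}^{uncons} \in \mathbb{C}^{N_t \times N_{RF}}$ is the optimal solution of (12) if and only if $\mathbf{R}\,\mathbf{F}_{RF,opt}^{uncons}\left((\mathbf{F}_{RF,opt}^{uncons})^H\mathbf{F}_{RF,opt}^{uncons}\right)^{-1}(\mathbf{F}_{RF,opt}^{uncons})^H\mathbf{R}^H = \mathbf{I}$.
2) The maximum of the problem (12) is given by $\log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right|$.

Proof 4. The proof is shown in Appendix B.

According to Theorem 3, a feasible optimal solution of (12) can be given by $\mathbf{F}_{RF,opt}^{uncons} = [\mathbf{R}^H\ \ \hat{\mathbf{R}}^H]$, where $\mathbf{R} \in \mathbb{C}^{K \times N_t}$ is the row orthogonal matrix of $\mathbf{W}_{RF}^H\mathbf{H}$ and $\hat{\mathbf{R}}^H$ is an $N_t \times (N_{RF} - K)$ matrix whose column vectors are orthonormal bases of the nullspace $\mathcal{N}(\mathbf{R})$. Via the Gram-Schmidt process, such a matrix $\hat{\mathbf{R}}$ can be easily obtained. Then, invoking [29], the optimal solution of (13) can be expressed as $\mathbf{F}_{RF,opt} = \frac{1}{\sqrt{N_t}}e^{j\angle\mathbf{F}_{RF,opt}^{uncons}}$. Namely, the solution of P2 can be approximately given by

$$\mathbf{F}_{RF} = \frac{1}{\sqrt{N_t}}\left[ e^{j\angle\mathbf{R}^H}\ \ e^{j\angle\hat{\mathbf{R}}^H} \right]. \tag{14}$$

3.4. Design of the analog combining matrix $\mathbf{W}_{RF}$

After the design of the precoding matrices $\hat{\mathbf{F}}_P$ and $\mathbf{F}_{RF}$, we finally focus on the problem P3 of the RF combining matrix $\mathbf{W}_{RF}$. By utilizing the RF precoding matrix given by (14), the problem P3 can be expressed as

$$
\mathrm{P3}:\quad
\begin{aligned}
\max_{\mathbf{W}_{RF}} \quad & \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{Z}\mathbf{H}^H\mathbf{W}_{RF}\right| \\
\text{s.t.} \quad & |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m
\end{aligned}
\tag{15}
$$

where

$$\mathbf{Z} = \frac{1}{N_t}\left[ e^{j\angle\mathbf{R}^H}\ \ e^{j\angle\hat{\mathbf{R}}^H} \right]\mathbf{Z}_0^{-1}\begin{bmatrix} e^{j\angle\mathbf{R}} \\ e^{j\angle\hat{\mathbf{R}}} \end{bmatrix}, \qquad \mathbf{Z}_0 = \frac{1}{N_t}\begin{bmatrix} e^{j\angle\mathbf{R}}e^{j\angle\mathbf{R}^H} & e^{j\angle\mathbf{R}}e^{j\angle\hat{\mathbf{R}}^H} \\ e^{j\angle\hat{\mathbf{R}}}e^{j\angle\mathbf{R}^H} & e^{j\angle\hat{\mathbf{R}}}e^{j\angle\hat{\mathbf{R}}^H} \end{bmatrix}.$$

However, due to the intractability of the matrix $\mathbf{Z}$, it is difficult to solve the problem (15) directly. Fortunately, Theorem 3 provides a reasonable upper bound 3 of (15). By adopting the upper bound as a surrogate objective function, the problem P3 can be reformulated as

$$
\begin{aligned}
\max_{\mathbf{W}_{RF}} \quad & \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right| \\
\text{s.t.} \quad & |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m.
\end{aligned}
\tag{16}
$$

Considering the block-diagonal structure of $\mathbf{W}_{RF} = \mathrm{diag}\{\mathbf{w}_{RF,1}, \mathbf{w}_{RF,2}, \cdots, \mathbf{w}_{RF,K}\}$, the objective function of (16) can be expanded to

$$\log\left| \begin{bmatrix} \mathbf{w}_{RF,1}^H\mathbf{H}_1\mathbf{H}_1^H\mathbf{w}_{RF,1} & \cdots & \mathbf{w}_{RF,1}^H\mathbf{H}_1\mathbf{H}_K^H\mathbf{w}_{RF,K} \\ \vdots & \ddots & \vdots \\ \mathbf{w}_{RF,K}^H\mathbf{H}_K\mathbf{H}_1^H\mathbf{w}_{RF,1} & \cdots & \mathbf{w}_{RF,K}^H\mathbf{H}_K\mathbf{H}_K^H\mathbf{w}_{RF,K} \end{bmatrix} \right|.$$

3 The diagonal entries of $\mathbf{Z}_0$ are 1 and the off-diagonal entries are of the form $\frac{1}{N_t}\sum_{i=1}^{N_t}e^{j\beta_i}$. When $N_t$ is large, the off-diagonal entries of $\mathbf{Z}_0$ are far smaller than 1 with high probability [30]. Therefore, the matrix $\mathbf{Z}_0$ can be approximated as $\mathbf{I}$. Accordingly, we have $\mathbf{Z} \approx \frac{1}{N_t}\left( e^{j\angle\mathbf{R}^H}e^{j\angle\mathbf{R}} + e^{j\angle\hat{\mathbf{R}}^H}e^{j\angle\hat{\mathbf{R}}} \right)$. Similarly, the matrix $\mathbf{Z}$ can also be approximated as $\mathbf{I}$ when $N_{RF}$ is sufficiently large. In a nutshell, (16) is the upper bound of (15), and the gap between (16) and (15) goes to zero as $N_t$ and $N_{RF}$ increase.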
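The phase-only projection used in (14) and the claim in footnote 3 that $\mathbf{Z}_0$ approaches the identity can be checked numerically with a small sketch. This is an illustration under assumptions, not the paper's procedure: the helper names are hypothetical, and the unconstrained precoder is replaced by a random complex Gaussian matrix so that the entry phases behave like independent random phases.

```python
import cmath
import math
import random


def project_constant_modulus(F):
    """Keep only the phase of every entry and rescale to 1/sqrt(N_t)."""
    scale = 1.0 / math.sqrt(len(F))
    return [[scale * cmath.exp(1j * cmath.phase(x)) for x in row] for row in F]


def gram(F):
    """Z_0 = F^H F for an N_t x N_RF matrix stored as a list of rows."""
    n_rf = len(F[0])
    return [[sum(row[a].conjugate() * row[b] for row in F) for b in range(n_rf)]
            for a in range(n_rf)]


def max_off_diagonal(Z):
    """Largest magnitude among the off-diagonal entries of a square matrix."""
    n = len(Z)
    return max(abs(Z[a][b]) for a in range(n) for b in range(n) if a != b)


random.seed(0)
for n_t in (16, 256, 4096):
    # Hypothetical stand-in for the unconstrained solution, N_RF = 4 columns.
    F_uncons = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
                for _ in range(n_t)]
    Z0 = gram(project_constant_modulus(F_uncons))
    print(n_t, round(Z0[0][0].real, 6), max_off_diagonal(Z0))
```

The diagonal of $\mathbf{Z}_0$ equals one by construction, while the printed off-diagonal magnitude is expected to shrink roughly like $1/\sqrt{N_t}$, which is the behaviour the footnote relies on.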


Invoking the asymptotic orthogonality of different user channels in a large scale regime [31], the off-diagonal elements of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}$ are very small with high probability; therefore, the problem (16) can be approximately simplified to $K$ independent problems which have the same formulation

$$
\begin{aligned}
\max_{\mathbf{w}_{RF,k}} \quad & \mathbf{w}_{RF,k}^H\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k} \\
\text{s.t.} \quad & |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m.
\end{aligned}
\tag{17}
$$

The problem (17) is a unimodular quadratic program, which is in general NP-hard [32], so the optimal solution cannot be found in polynomial time. Fortunately, by making use of the generalized power method proposed in [32,33], the problem (17) can be approximately solved with the following iteration

$$\mathbf{w}_{RF,k}^{(n+1)} = \frac{1}{\sqrt{N_r}}\,e^{j\angle\left(\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}^{(n)}\right)} \tag{18}$$

where $n$ is the iteration index.

3.5. Computation of the ROD

Now, the problem (7) has been solved: the digital precoding matrix $\mathbf{F}_P$, the analog precoding matrix $\mathbf{F}_{RF}$ and the analog combining matrix $\mathbf{W}_{RF}$ are given by (11), (14) and (18), respectively. However, the ROD involved in (11) and (14) is not unique; many existing decomposition methods fulfil the definition of the ROD, such as the LQD, the SVD and the geometric mean decomposition. It is noteworthy that if the LQD is used to calculate the row orthogonal matrix of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$, we have $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}} = \mathbf{P}\mathbf{R}$, where $\mathbf{P}$ is a lower triangular matrix, and $\mathbf{F}_P = \sqrt{\frac{(M-1)P}{MK}}\,(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{R}^H$. Thereby, the effective channel involved in (1), (2) and (3) can be expressed as

$$\mathbf{H}_e = \mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P = \sqrt{\frac{\alpha P}{K}}\,\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{R}^H = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P}\mathbf{R}\mathbf{R}^H = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P},$$

namely, the effective channel $\mathbf{H}_e$ is a lower triangular matrix. Thus, the LQD of the effective channel $\mathbf{H}_e = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P} = \mathbf{L}_e\mathbf{Q}_e$ in (1), (2) and (3) no longer needs to be calculated and is straightforwardly given by $\mathbf{L}_e = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P}$ and $\mathbf{Q}_e = \mathbf{I}$. In other words, compared with other decomposition methods, computing the row orthogonal matrix via the LQD effectively reduces the computational cost of the proposed algorithm. Accordingly, Table 1 provides the pseudo-code of the proposed ROD-based hybrid TH precoding and combining algorithm.

Table 1. Pseudo-code of the proposed algorithm.

Proposed ROD-based hybrid TH precoding and combining algorithm

Input: the value of $M$ in M-QAM, the number of users $K$, transmit power $P$, channel matrices $\mathbf{H}_1, \mathbf{H}_2, \cdots, \mathbf{H}_K$, iteration threshold $\epsilon$.

Design of the RF combining matrix $\mathbf{W}_{RF}$
  Initialize combining vectors $\mathbf{w}_{RF,1}^{(0)}, \cdots, \mathbf{w}_{RF,K}^{(0)}$
  for $k = 1, 2, \cdots, K$
    Initialize iteration index $n = 0$
    repeat
      $n = n + 1$
      $\mathbf{w}_{RF,k}^{(n)} = \frac{1}{\sqrt{N_r}}\,e^{j\angle\left(\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}^{(n-1)}\right)}$
    until $\left\|\mathbf{w}_{RF,k}^{(n)} - \mathbf{w}_{RF,k}^{(n-1)}\right\| < \epsilon$
  end for
  $\mathbf{W}_{RF} = \mathrm{diag}\left\{\mathbf{w}_{RF,1}^{(n)}, \cdots, \mathbf{w}_{RF,K}^{(n)}\right\}$

Design of the RF precoding matrix $\mathbf{F}_{RF}$
  $\mathbf{H} = \left[\mathbf{H}_1^H, \mathbf{H}_2^H, \cdots, \mathbf{H}_K^H\right]^H$
  Compute the ROD $\mathbf{W}_{RF}^H\mathbf{H} = \mathbf{P}_1\mathbf{R}_1$ via LQD
  for $k = 1, 2, \cdots, N_{RF} - K$
    Generate a random $N_t \times 1$ vector $\hat{\mathbf{r}}_k$
    $\mathbf{R}_1 = \left[\mathbf{R}_1^H\ \ \frac{(\mathbf{I} - \mathbf{R}_1^H\mathbf{R}_1)\hat{\mathbf{r}}_k}{\left\|(\mathbf{I} - \mathbf{R}_1^H\mathbf{R}_1)\hat{\mathbf{r}}_k\right\|}\right]^H$
  end for
  $\mathbf{F}_{RF} = \frac{1}{\sqrt{N_t}}\,e^{j\angle\mathbf{R}_1^H}$

Design of the digital precoding matrix $\mathbf{F}_P$
  Compute the ROD $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}} = \mathbf{P}_2\mathbf{R}_2$ via LQD
  $\mathbf{F}_P = \sqrt{\frac{\alpha P}{K}}\,(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{R}_2^H$

Design of the feedback matrix $\mathbf{B}$, the digital precoding matrix $\mathbf{F}_D$ and the digital combining matrix $\mathbf{W}_{BB}$
  $\mathbf{B} = \mathrm{diag}\{\mathbf{P}_2\}^{-1}\mathbf{P}_2 - \mathbf{I}$
  $\mathbf{F}_D = \mathbf{I}$
  $\mathbf{W}_{BB} = \sqrt{\frac{K}{\alpha P}}\,\mathrm{diag}\{\mathbf{P}_2\}^{-1}$

Output: $\mathbf{B}$, $\mathbf{F}_D$, $\mathbf{F}_P$, $\mathbf{F}_{RF}$, $\mathbf{W}_{RF}$, $\mathbf{W}_{BB}$.

3.6. Complexity analysis

In this section, the complexity of the proposed algorithm is analyzed and compared with several prevalent hybrid precoding methods. According to Table 1, the RF combining matrix $\mathbf{W}_{RF}$ is first determined by the generalized power method, which is of complexity $O(\max\{K N_r^2 N_t, K N_r^2 n\})$, where $n$ denotes the iteration number of the generalized power method. It will be shown in the next section that the iteration number $n$ is far smaller than the number of transmit antennas $N_t$ under normal operating conditions, hence the complexity involved in the RF combining matrix $\mathbf{W}_{RF}$ is of order $O(K N_r^2 N_t)$. Then, the RF precoding matrix $\mathbf{F}_{RF}$ is obtained via the LQD of $\mathbf{W}_{RF}^H\mathbf{H}$ and a Gram-Schmidt process, which is of complexity $O(\max\{K N_r N_t, K^2 N_t, N_t(N_{RF} + K)(N_{RF} - K)\})$. Furthermore, the digital precoding matrix $\mathbf{F}_P$ is given by the LQD of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$, which is of order $O(N_{RF}^2 N_t)$. Finally, the calculation of the feedback matrix $\mathbf{B}$ is of complexity $O(K^2)$. Consequently, the overall complexity of the proposed algorithm is of order $O(\max\{K N_r^2 N_t, N_{RF}^2 N_t\})$.

Table 2 compares the complexities of the proposed algorithm and several previous works in the special case of $N_{RF} = K$, since several previous methods can only be performed in this special case. Evidently, if $K < N_r^2$, the complexities of the proposed algorithm and the hybrid ZF precoding algorithm are of the same order $O(K N_r^2 N_t)$, while the complexities of the hybrid BD precoding algorithm and the hybrid BD-GMD TH precoding algorithm are of orders $O(\max\{K N_r^2 N_t, K^4\})$ and $O(\max\{K N_r^2 N_t, K^2 N_t^2\})$ respectively, which are higher than or equal to the complexity of the proposed algorithm. By contrast, when $K > N_r^2$, the proposed algorithm and the hybrid ZF precoding algorithm are of the same complexity $O(K^2 N_t)$, while the hybrid BD precoding algorithm is of complexity $O(\max\{K^2 N_t, K^4\})$, which is still higher than or equal to that of the proposed algorithm. Besides, the complexity of the hybrid BD-GMD TH precoding algorithm is of order $O(K^2 N_t^2)$; compared with the proposed algorithm, its computational cost suffers from an order of magnitude increase.

Table 2. Complexities of the proposed algorithm and several previous works.

| Algorithm | Complexity |
| --- | --- |
| Proposed hybrid TH precoding | $O(\max\{K N_r^2 N_t, K^2 N_t\})$ |
| Hybrid ZF precoding [13] | $O(\max\{K N_r^2 N_t, K^2 N_t\})$ |
| Hybrid BD precoding [14] | $O(\max\{K N_r^2 N_t, K^2 N_t, K^4\})$ |
| Hybrid BD-GMD TH precoding [21] | $O(\max\{K N_r^2 N_t, K^2 N_t^2\})$ |
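The dominant cost in the analysis above comes from the first step of Table 1, the generalized power iteration (18). A minimal standard-library sketch of that step is given below. The function names, the all-equal starting vector, the tolerance and the iteration cap are assumptions made for illustration; the paper does not fix them.

```python
import cmath
import math
import random


def power_method_combiner(H_k, tol=1e-6, max_iter=500):
    """Iterate w <- exp(j*angle(H_k H_k^H w)) / sqrt(N_r) until w settles.

    H_k is an N_r x N_t channel stored as a list of rows. Returns the
    combining vector and the number of iterations performed.
    """
    n_r = len(H_k)
    scale = 1.0 / math.sqrt(n_r)
    # A = H_k H_k^H is formed once (about N_r^2 * N_t multiplications).
    A = [[sum(x * y.conjugate() for x, y in zip(H_k[a], H_k[b]))
          for b in range(n_r)] for a in range(n_r)]
    w = [complex(scale)] * n_r  # arbitrary constant-modulus starting point
    n = 0
    for n in range(1, max_iter + 1):
        # Each iteration costs about N_r^2 multiplications.
        Aw = [sum(A[a][b] * w[b] for b in range(n_r)) for a in range(n_r)]
        w_new = [scale * cmath.exp(1j * cmath.phase(v)) for v in Aw]
        step = math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(w_new, w)))
        w = w_new
        if step < tol:
            break
    return w, n


def objective(H_k, w):
    """w^H H_k H_k^H w, computed as the squared norm of H_k^H w."""
    return sum(abs(sum(H_k[a][t].conjugate() * w[a] for a in range(len(w)))) ** 2
               for t in range(len(H_k[0])))


random.seed(0)
# Hypothetical 4 x 16 channel with i.i.d. complex Gaussian entries.
H_k = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(16)]
       for _ in range(4)]
w, iters = power_method_combiner(H_k)
print(iters, objective(H_k, w))
```

Forming $\mathbf{H}_k\mathbf{H}_k^H$ once and reusing it in every iteration mirrors the two terms $K N_r^2 N_t$ and $K N_r^2 n$ in the complexity expression for this step.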


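4. Asymptotic sum rates analysis

The asymptotic sum rates of the proposed ROD-based hybrid TH precoding and combining algorithm are analyzed in this section. For simplicity, the LOS component and the NLOS component are considered separately. The sum rates achieved by cooperative users serve as a benchmark.

For the LOS component, the asymptotic sum rates loss of the proposed ROD-based hybrid TH precoding algorithm is characterized by the following theorem.

Theorem 4. For a multi-user mmWave MIMO system as shown in Fig. 1 under an LOS channel environment, the achievable sum rates $R$ obtained by the proposed ROD-based hybrid TH precoding and combining algorithm have the property that

$$\lim_{\mathrm{SNR}\to\infty,\ N_t\to\infty} \left(R - R_{coop}\right) = K\log_2\alpha \tag{19}$$

where $R_{coop}$ stands for the optimal sum rates given by cooperative users under equal power allocation.

Proof 5. The optimal sum rates given by cooperative users under equal power allocation can be expressed as $R_{coop} = \log_2\left|\mathbf{I} + \frac{P}{K\sigma_n^2}\boldsymbol{\Lambda}\right|$, where $P$ is the transmit power and $\boldsymbol{\Lambda}$ is a diagonal matrix with diagonal elements given by the largest $K$ eigenvalues of $\mathbf{H}\mathbf{H}^H$. Accordingly, (19) can be rewritten as

$$
\begin{aligned}
\lim_{\mathrm{SNR}\to\infty,\ N_t\to\infty}\left(R - R_{coop}\right)
&= \lim_{\mathrm{SNR}\to\infty,\ N_t\to\infty}\log_2\frac{\left|\mathbf{I} + \frac{1}{\sigma_n^2}\mathbf{G}_e^2\right|}{\left|\mathbf{I} + \frac{P}{K\sigma_n^2}\boldsymbol{\Lambda}\right|}
\overset{(a)}{=} \lim_{N_t\to\infty}\log_2\frac{\left|\mathbf{G}_e^2\right|}{\left|\frac{P}{K}\boldsymbol{\Lambda}\right|} \\
&= \lim_{N_t\to\infty}\log_2\frac{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P\mathbf{F}_P^H\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|}{\left|\frac{P}{K}\boldsymbol{\Lambda}\right|} \\
&\overset{(b)}{=} \lim_{N_t\to\infty}\log_2\frac{\alpha^K\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\left(\mathbf{F}_{RF}^H\mathbf{F}_{RF}\right)^{-1}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|}{\left|\boldsymbol{\Lambda}\right|} \\
&\overset{(c)}{=} K\log_2\alpha + \lim_{N_t\to\infty}\log_2\frac{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|}{\left|\boldsymbol{\Lambda}\right|}
\end{aligned}
\tag{20}
$$

where (a) follows from $\lim_{a\to\infty}\frac{|\mathbf{I} + a\mathbf{G}_1|}{|\mathbf{I} + a\mathbf{G}_2|} = \frac{|\mathbf{G}_1|}{|\mathbf{G}_2|}$, (b) follows from Theorem 2, and (c) follows from $\lim_{N_t\to\infty}\mathbf{F}_{RF}^H\mathbf{F}_{RF} = \mathbf{I}$ [10,30].

For the LOS channel environment, we have $\mathbf{H} = \left[\mathbf{H}_1^H\ \cdots\ \mathbf{H}_K^H\right]^H$ with $\mathbf{H}_k = \alpha_k\mathbf{a}_r(\phi_k^r,\theta_k^r)\mathbf{a}_t(\phi_k^t,\theta_k^t)^H$. According to the proposed algorithm, for an arbitrary initial vector $\mathbf{w}_{RF,k}^{(0)}$, the combining vector $\mathbf{w}_{RF,k}$ can be given by

$$
\begin{aligned}
\mathbf{w}_{RF,k}^{(1)} &= \frac{1}{\sqrt{N_r}}\,e^{j\angle\left(|\alpha_k|^2\mathbf{a}_r(\phi_k^r,\theta_k^r)\mathbf{a}_t(\phi_k^t,\theta_k^t)^H\mathbf{a}_t(\phi_k^t,\theta_k^t)\mathbf{a}_r(\phi_k^r,\theta_k^r)^H\mathbf{w}_{RF,k}^{(0)}\right)}
= \frac{1}{\sqrt{N_r}}\,e^{j\varphi_k^r}\mathbf{a}_r(\phi_k^r,\theta_k^r) \\
\mathbf{w}_{RF,k}^{(2)} &= \frac{1}{\sqrt{N_r}}\,e^{j\angle\left(|\alpha_k|^2\mathbf{a}_r(\phi_k^r,\theta_k^r)\mathbf{a}_t(\phi_k^t,\theta_k^t)^H\mathbf{a}_t(\phi_k^t,\theta_k^t)\mathbf{a}_r(\phi_k^r,\theta_k^r)^H\frac{1}{\sqrt{N_r}}e^{j\varphi_k^r}\mathbf{a}_r(\phi_k^r,\theta_k^r)\right)}
= \frac{1}{\sqrt{N_r}}\,e^{j\varphi_k^r}\mathbf{a}_r(\phi_k^r,\theta_k^r) \\
&\ \ \vdots
\end{aligned}
$$

where $\angle(\cdot)$ is taken element-wise and $\varphi_k^r$ is the argument of $\mathbf{a}_r(\phi_k^r,\theta_k^r)^H\mathbf{w}_{RF,k}^{(0)}$. Accordingly, the combining matrix $\mathbf{W}_{RF}$ can be given by

$$\mathbf{W}_{RF} = \frac{1}{\sqrt{N_r}}\,\mathrm{diag}\left\{e^{j\varphi_1^r}\mathbf{a}_r(\phi_1^r,\theta_1^r),\ \cdots,\ e^{j\varphi_K^r}\mathbf{a}_r(\phi_K^r,\theta_K^r)\right\}.$$

Then, we have $\mathbf{W}_{RF}^H\mathbf{H} = \sqrt{N_r}\left[e^{j\varphi_1^r}\alpha_1^{*}\mathbf{a}_t(\phi_1^t,\theta_1^t)\ \cdots\ e^{j\varphi_K^r}\alpha_K^{*}\mathbf{a}_t(\phi_K^t,\theta_K^t)\right]^H$. Invoking the property $\lim_{N_t\to\infty}\frac{1}{N_t}\mathbf{a}_t(\phi_i^t,\theta_i^t)^H\mathbf{a}_t(\phi_j^t,\theta_j^t) = 0$ $(i \neq j)$, when $N_t$ tends to infinity, the LQD of $\mathbf{W}_{RF}^H\mathbf{H}$ can be given by

$$\lim_{N_t\to\infty}\mathbf{W}_{RF}^H\mathbf{H} = \lim_{N_t\to\infty}\sqrt{N_tN_r}\,\mathrm{diag}\left\{e^{j\varphi_1}\alpha_1,\ \cdots,\ e^{j\varphi_K}\alpha_K\right\}\times\frac{1}{\sqrt{N_t}}\left[e^{j\varphi_1^t}\mathbf{a}_t(\phi_1^t,\theta_1^t)\ \cdots\ e^{j\varphi_K^t}\mathbf{a}_t(\phi_K^t,\theta_K^t)\right]^H$$

where $\varphi_k^t$ $(k = 1,\cdots,K)$ is any real number and $\varphi_k = \varphi_k^t - \varphi_k^r$. According to (14), the analog precoding matrix $\mathbf{F}_{RF}$ can be expressed as

$$\mathbf{F}_{RF} = \frac{1}{\sqrt{N_t}}\left[e^{j\varphi_1^t}\mathbf{a}_t(\phi_1^t,\theta_1^t)\ \cdots\ e^{j\varphi_K^t}\mathbf{a}_t(\phi_K^t,\theta_K^t)\ \ \hat{\mathbf{r}}_1\ \cdots\ \hat{\mathbf{r}}_{N_{RF}-K}\right]$$

where $\hat{\mathbf{r}}_i$ $(i = 1,\cdots,N_{RF}-K)$ is an $N_t\times 1$ vector with constant modulus entries. The inner product $\frac{1}{N_t}\mathbf{a}_t(\phi_i^t,\theta_i^t)^H\hat{\mathbf{r}}_j$ $(\forall i,j)$ is of the form $\frac{1}{N_t}\sum_{n=1}^{N_t}e^{j\beta_n}$, so when $N_t$ is large, it can be readily seen that $\lim_{N_t\to\infty}\frac{1}{N_t}\mathbf{a}_t(\phi_i^t,\theta_i^t)^H\hat{\mathbf{r}}_j = 0$. Therefore, we have

$$\lim_{N_t\to\infty}\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF} = \lim_{N_t\to\infty}N_tN_r\,\mathrm{diag}\left\{|\alpha_1|^2,\ \cdots,\ |\alpha_K|^2\right\}. \tag{21}$$

With respect to the matrix $\boldsymbol{\Lambda}$ in (20), we recall that $\boldsymbol{\Lambda}$ is a diagonal matrix whose diagonal elements are the largest $K$ eigenvalues of $\mathbf{H}\mathbf{H}^H$. When $N_t$ tends to infinity, the matrix $\mathbf{H}\mathbf{H}^H$ can be given by

$$
\begin{aligned}
\lim_{N_t\to\infty}\mathbf{H}\mathbf{H}^H
&= \lim_{N_t\to\infty}N_t\,\mathrm{diag}\left\{|\alpha_1|^2\mathbf{a}_r(\phi_1^r,\theta_1^r)\mathbf{a}_r(\phi_1^r,\theta_1^r)^H,\ \cdots,\ |\alpha_K|^2\mathbf{a}_r(\phi_K^r,\theta_K^r)\mathbf{a}_r(\phi_K^r,\theta_K^r)^H\right\} \\
&= \lim_{N_t\to\infty}N_tN_r\sum_{k=1}^{K}|\alpha_k|^2\,\tilde{\mathbf{a}}_r(\phi_k^r,\theta_k^r)\tilde{\mathbf{a}}_r(\phi_k^r,\theta_k^r)^H
\end{aligned}
\tag{22}
$$

where $\tilde{\mathbf{a}}_r(\phi_k^r,\theta_k^r) = \frac{1}{\sqrt{N_r}}\left[\mathbf{0}_{1\times(k-1)N_r}\ \ \mathbf{a}_r(\phi_k^r,\theta_k^r)^H\ \ \mathbf{0}_{1\times(K-k)N_r}\right]^H$. Obviously, (22) can be regarded as the eigendecomposition of the matrix $\mathbf{H}\mathbf{H}^H$; consequently, we have

$$\lim_{N_t\to\infty}\left|\boldsymbol{\Lambda}\right| = \lim_{N_t\to\infty}\left|N_tN_r\,\mathrm{diag}\left\{|\alpha_1|^2,\ \cdots,\ |\alpha_K|^2\right\}\right|. \tag{23}$$

Substituting (21) and (23) into (20), it is easy to obtain that

$$\lim_{\mathrm{SNR}\to\infty,\ N_t\to\infty}\left(R - R_{coop}\right) = K\log_2\alpha.$$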

This completes the proof of Theorem 4.

Theorem 4 shows that, for the LOS channel environment, the asymptotic sum rates loss per user of the proposed algorithm depends only on the value of $M$ in $M$-QAM. When a high-order QAM is adopted, the asymptotic sum rates loss per user is very small.

For the NLOS component, invoking (20), the asymptotic sum rates loss of the proposed algorithm can be given by


$$
\begin{aligned}
\lim_{\mathrm{SNR}\to\infty,\ N_t\to\infty}\left(R - R_{coop}\right)
&= K\log_2\alpha + \lim_{N_t\to\infty}\log_2\frac{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|}{\left|\boldsymbol{\Lambda}\right|} \\
&= K\log_2\alpha + \lim_{N_t\to\infty}\Bigg[\log_2\frac{\prod_{k=1}^{K}\sigma_k^{\max}}{\left|\boldsymbol{\Lambda}\right|}
+ \sum_{k=1}^{K}\log_2\frac{\mathbf{w}_{RF,k}^H\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}}{\sigma_k^{\max}} \\
&\qquad + \log_2\frac{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right|}{\prod_{k=1}^{K}\mathbf{w}_{RF,k}^H\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}}
+ \log_2\frac{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|}{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right|}\Bigg]
\end{aligned}
$$

where $\sigma_k^{\max}$ is the largest eigenvalue of $\mathbf{H}_k\mathbf{H}_k^H$. According to the asymptotic orthogonality of different user channels [31], the off-diagonal elements of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}$ tend to zero when $N_t$ goes to infinity. Consequently, we have

$$\lim_{N_t\to\infty}\frac{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right|}{\prod_{k=1}^{K}\mathbf{w}_{RF,k}^H\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}} = 1.$$

Introducing the notations

$$\beta_t = E\left\{\lim_{N_t\to\infty}\frac{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|}{\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right|}\right\},\quad
\beta_r = E\left\{\lim_{N_t\to\infty}\frac{\mathbf{w}_{RF,k}^H\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}}{\sigma_k^{\max}}\right\},\quad
\beta_e = E\left\{\lim_{N_t\to\infty}\frac{\prod_{k=1}^{K}\sigma_k^{\max}}{\left|\boldsymbol{\Lambda}\right|}\right\},$$

the expectation of the asymptotic sum rates loss of the proposed algorithm can be expressed as

$$E\left\{\lim_{\mathrm{SNR}\to\infty,\ N_t\to\infty}\left(R - R_{coop}\right)\right\} = K\log_2\alpha + \log_2\beta_t + K\log_2\beta_r + \log_2\beta_e$$

where $\log_2\beta_t$ and $K\log_2\beta_r$ represent the sum rates loss caused by the constant modulus constraint of the analog precoding matrix $\mathbf{F}_{RF}$ and the analog combining matrix $\mathbf{W}_{RF}$, respectively. The sum rates loss $\log_2\beta_e$ is caused by the noncooperation of the users. Specifically, cooperative users can make use of the virtual channels corresponding to the largest $K$ eigenvalues of $\mathbf{H}\mathbf{H}^H$, while in noncooperative systems, the $k$th user can only take advantage of the virtual channel corresponding to the largest eigenvalue of $\mathbf{H}_k\mathbf{H}_k^H$. Since the sum rates loss $\log_2\beta_e$ is irrelevant to the constant modulus constraint, in the following, we only discuss the sum rates losses $\log_2\beta_t$ and $K\log_2\beta_r$.

Fig. 3 plots the asymptotic sum rates losses per user $\frac{1}{K}\log_2\beta_t$ and $\log_2\beta_r$ when $N_t = 500$. All results are obtained by taking the average of 1000 random channel realizations. The loss $\log_2\beta_r$ caused by the constant modulus analog combining matrix $\mathbf{W}_{RF}$ is only about 0.03–0.05 bits/s/Hz, which is very small. The sum rates loss $\frac{1}{K}\log_2\beta_t$ caused by the constant modulus analog precoding matrix $\mathbf{F}_{RF}$ is about 0.25–0.35 bits/s/Hz, and decreases significantly when the number of RF chains increases.

Fig. 3. The asymptotic sum rates loss per user as functions of $N_{RF}$, $N_r$ and $K$ under the NLOS channel environment.

5. Simulation results

In this section, we present simulation results to evaluate the performance of the proposed ROD-based hybrid TH precoding and combining algorithm. The channel matrix $\mathbf{H}_k$ between the BS and the $k$th user is modeled as a Rician fading channel, and the Rician factor $v_k$ is uniformly distributed between 1 and 10. The AOA and AOD of the LOS component $\mathbf{H}_{LOS,k}$ are uniformly distributed in

$[0, 2\pi)$. The NLOS component $\mathbf{H}_{NLOS,k}$ consists of $N_{c,k} = 5$ clusters, and each cluster is composed of $N_{l,k} = 10$ rays. The average AOA and AOD of the $i$th cluster are uniformly distributed over $[0, 2\pi)$, and the AOA and AOD of the $l$th ray within the $i$th cluster, $\phi_{k,i,l}^r$, $\theta_{k,i,l}^r$, $\phi_{k,i,l}^t$ and $\theta_{k,i,l}^t$, follow the Laplace distribution with a scale parameter of 10°. The complex gain of each ray is assumed to be a $\mathcal{CN}(0, 1)$ random variable. All simulation results are averaged over 1000 random channel realizations.

Firstly, the convergence of the generalized power method (18) is evaluated in Fig. 4, because it has a significant impact on the computational cost of the proposed algorithm. The convergence of the generalized power method is measured by the error $\eta(n) = \frac{1}{K}\sum_{k=1}^{K}\left\|\mathbf{w}_{RF,k}^{(n)} - \mathbf{w}_{RF,k}^{(n-1)}\right\|^2$. It can be seen from Fig. 4 that the error $\eta$ is reduced to $10^{-3}$ within only about 5 iterations, which implies that the proposed algorithm is able to obtain the RF combining matrix $\mathbf{W}_{RF}$ with a low computational complexity.

Fig. 4. The error $\eta$ as functions of iteration number $n$ when $N_t = 100$, $K = 15$.

Next, the sum rates of the proposed algorithm are compared in Fig. 5 with those of the hybrid ZF precoding algorithm [13], the hybrid BD precoding algorithm [14], the hybrid BD-GMD TH precoding algorithm [21], the fully-digital TH precoding algorithm [34] and the optimal sum rates given by cooperative users, when $N_t = 100$, $N_r = 4$, $N_{RF} = K = 15$. It can be seen that, compared with the existing hybrid precoding algorithms, the proposed algorithm offers about 4 dB gain in the high SNR region (SNR > 0 dB), and in the low SNR region (SNR < −10 dB), it still provides about 1–2 dB improvement. Compared with the fully-digital TH precoding algorithm, the proposed algorithm is able to provide comparable sum rates with a limited number of RF chains. Additionally, it is noteworthy that the performance of the hybrid BD-GMD TH precoder is inferior to those of the hybrid ZF precoder and the hybrid BD precoder. This unexpected phenomenon is caused by the assumption that each user only contains a single RF chain. In fact, the hybrid BD-GMD TH precoder is developed for users equipped with multiple RF chains, and it does not work well for single RF chain users.

Fig. 5. The sum rates of different algorithms as functions of SNR when $N_t = 100$, $N_r = 4$, $N_{RF} = K = 15$.

The BER comparison of the aforementioned algorithms is shown in Fig. 6, where the transmitted symbols belong to the 64QAM constellation. It can be observed that in the low SNR region (SNR < −10 dB), the BERs of different algorithms are approximately equal, while in the high SNR region (SNR > 0 dB), the proposed algorithm clearly outperforms the other hybrid precoding algorithms. Specifically, the hybrid ZF precoding algorithm suffers from about 2–3 dB loss compared with the proposed algorithm, and the performance losses of the hybrid BD precoding algorithm and the hybrid BD-GMD TH precoding algorithm are much more severe. In a nutshell, the proposed algorithm offers better performance than previous works in terms of both sum rates and BER.

Fig. 6. The BERs of different algorithms as functions of SNR when $N_t = 100$, $N_r = 4$, $N_{RF} = K = 15$.

In the following, the scalability of the proposed ROD-based hybrid TH precoding and combining algorithm is shown in Fig. 7–Fig. 9 by increasing the number of transmit antennas $N_t$, receive antennas $N_r$ and users $K$, respectively. The sum rates achieved by cooperative users, the hybrid ZF precoding algorithm [13], the hybrid BD precoding algorithm [14], the hybrid BD-GMD TH precoding algorithm [21] and the fully-digital TH precoding algorithm [34] serve as benchmarks. Fig. 7 clearly shows that the proposed algorithm provides at least about 10 bits/s/Hz performance gain compared with other hybrid precoding algorithms over a wide range of $N_t$, while the gap between the sum rates of the proposed algorithm and that of the fully-digital TH precoding algorithm is negligible.

Fig. 7. The sum rates of different algorithms as functions of $N_t$ when $N_r = 4$, $N_{RF} = K = 15$.

Fig. 8 plots the sum rates of various hybrid precoding algorithms as functions of the number of receive antennas $N_r$. When $N_r = 1$, the difference between the sum rates of the proposed algorithm and other hybrid precoding algorithms is less than 10 bits/s/Hz, while when $N_r > 15$, the difference increases to more than 20 bits/s/Hz. Namely, as the number of receive antennas $N_r$ increases, the advantage of the proposed algorithm becomes more and more significant. Due to the high path loss of mmWave frequency bands, terminals operating in the mmWave range are generally equipped with multiple antennas. Therefore, it is expected that the proposed algorithm could yield a relatively high throughput in practical systems.

Fig. 8. The sum rates of different algorithms as functions of $N_r$ when $N_t = 100$, $N_{RF} = K = 15$.

The impact of the number of users $K$ on the sum rates is investigated in Fig. 9. The number of RF chains $N_{RF}$ at the BS is assumed to be equal to the number of users $K$. As the number of users increases, the performances of the hybrid ZF precoding algorithm and the hybrid BD precoding algorithm first reach a peak and then deteriorate rapidly. This severe performance degradation does not occur for the nonlinear precoding algorithms: the sum rates of the proposed algorithm, the fully-digital TH precoding algorithm and the hybrid BD-GMD TH precoding algorithm steadily increase as $K$ becomes large. Moreover, it can be seen that the proposed algorithm outperforms the hybrid BD-GMD TH precoding algorithm significantly for both the SNR = 0 dB and SNR = 20 dB cases. Compared with the fully-digital TH precoding algorithm, the performance loss of the proposed algorithm is much smaller than that of the other hybrid precoding schemes. Namely, the proposed algorithm is able to provide a more reasonable solution for a large number of users.

Fig. 9. The sum rates of different algorithms as functions of $K$ when $N_t = 100$, $N_r = 4$, $N_{RF} = K$.

Finally, to shed light on the stability of the proposed algorithm, Fig. 10 illustrates the sum rates of different precoding algorithms for variable channel parameters. In this setup, all the Rician factors $v_k$ of the channels $\mathbf{H}_k$ $(k = 1, 2, \cdots, K)$ are assumed to be equal, i.e., $v_1 = v_2 = \cdots = v_K = v$. By varying the factor $v$ from 0 to 9, it can be seen that the proposed algorithm is a promising approach for channels dominated by either the LOS component or the NLOS component. Besides, the performance gap between the proposed algorithm and the fully-digital TH precoding algorithm decreases when the Rician factor $v$ increases, which verifies that the proposed hybrid precoding algorithm can make use of more channel capacity in the LOS channel environment.

Fig. 10. The sum rates of different algorithms as functions of $v$ when $N_t = 100$, $N_r = 4$, $N_{RF} = K = 15$.
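The convergence measurement used for Fig. 4 can be illustrated with a short, self-contained sketch. It assumes that the generalized power update (18), which is not reproduced in this part of the paper, has the form used in the proof of Theorem 4, i.e. $\mathbf{w}_{RF,k}^{(n)} = \frac{1}{\sqrt{N_r}}e^{j\angle(\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}^{(n-1)})}$ with an element-wise phase. The helper names (`steering`, `los_channel`, `gram`, `power_step`, `generalized_power_method`), the constant-phase-step array response, and the chosen angles and gains are all hypothetical and serve only as an example; the Rician cluster model of this section is not implemented.

```python
import cmath
import math
import random


def steering(n, phase_step):
    """Unit-modulus array response with a constant phase step (illustrative model)."""
    return [cmath.exp(1j * phase_step * i) for i in range(n)]


def los_channel(alpha, a_r, a_t):
    """Rank-one LOS channel H_k = alpha * a_r * a_t^H, stored as a list of rows."""
    return [[alpha * r * t.conjugate() for t in a_t] for r in a_r]


def gram(H):
    """Return H H^H (an N_r x N_r matrix) for H given as a list of rows."""
    return [[sum(x * y.conjugate() for x, y in zip(ri, rj)) for rj in H] for ri in H]


def power_step(A, w):
    """Assumed update: keep the element-wise phase of A w, with modulus 1/sqrt(N_r)."""
    n = len(w)
    v = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    return [cmath.exp(1j * cmath.phase(x)) / math.sqrt(n) for x in v]


def generalized_power_method(channels, iterations, seed=0):
    """Iterate every user's combiner; return the final vectors and eta(n) per iteration."""
    rng = random.Random(seed)
    grams = [gram(H) for H in channels]
    # Random constant-modulus initial vectors w^(0).
    ws = [[cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) / math.sqrt(len(A)) for _ in A]
          for A in grams]
    errors = []
    for _ in range(iterations):
        new_ws = [power_step(A, w) for A, w in zip(grams, ws)]
        # eta(n) = (1/K) * sum_k || w_k^(n) - w_k^(n-1) ||^2
        eta_n = sum(sum(abs(x - y) ** 2 for x, y in zip(nw, w))
                    for nw, w in zip(new_ws, ws)) / len(ws)
        errors.append(eta_n)
        ws = new_ws
    return ws, errors


# Three illustrative LOS users with N_r = 4 and N_t = 16 (arbitrary angles and gain).
los_users = [los_channel(0.8 + 0.6j, steering(4, 0.7 * (k + 1)), steering(16, 1.3 * (k + 1)))
             for k in range(3)]
w_rf, eta = generalized_power_method(los_users, iterations=4)
print(eta)
```

For rank-one LOS channels such as these, the error should drop to floating-point level from the second iteration onward, which mirrors the fixed point $\mathbf{w}_{RF,k}^{(2)} = \mathbf{w}_{RF,k}^{(1)}$ derived in Section 4. Channels with NLOS components generally need more iterations, as Fig. 4 indicates.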

6. Conclusions

A novel nonlinear hybrid ROD-based TH precoding and combining algorithm is presented in this paper. In the presented algorithm, the necessary and sufficient condition of the optimal digital precoding matrix and a near-optimal RF precoding matrix are derived via a newly defined ROD, and the RF combining matrix is given by a generalized power method. Theoretical analyses and simulations show that, compared with the optimal sum rates of cooperative users, the proposed algorithm yields only a slight performance loss. Besides, compared with the existing hybrid precoders, the proposed algorithm not only offers a steady and significant performance gain for different numbers of antennas and different channel environments at various SNRs, but also avoids the severe performance deterioration for large numbers of users.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A

Proof of Theorem 2. Introducing the notation $\mathbf{H}_{RF} = \mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\left(\mathbf{F}_{RF}^H\mathbf{F}_{RF}\right)^{-1/2}$, the problem $\mathcal{P}_1$ in (7) can be rewritten as

$$\max_{\hat{\mathbf{F}}_P}\ \log\left|\mathbf{H}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{H}_{RF}^H\right|\quad \text{s.t.}\quad \left\|\hat{\mathbf{F}}_P\right\|_F^2 = \alpha P. \tag{24}$$

Considering the ROD $\mathbf{H}_{RF} = \mathbf{P}_{RF}\mathbf{R}_{RF}$, we have

$$
\begin{aligned}
\left|\mathbf{H}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{H}_{RF}^H\right|
&= \left|\mathbf{P}_{RF}\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\mathbf{P}_{RF}^H\right| \\
&= \left|\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right|\left|\mathbf{P}_{RF}\mathbf{P}_{RF}^H\right| \\
&= \left|\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right|\left|\mathbf{H}_{RF}\mathbf{H}_{RF}^H\right|
\end{aligned}
\tag{25}
$$

Therefore, the problem (24) can be reformulated as

$$\max_{\hat{\mathbf{F}}_P}\ \log\left|\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right|\quad \text{s.t.}\quad \left\|\hat{\mathbf{F}}_P\right\|_F^2 = \alpha P. \tag{26}$$

To tackle the problem (26), we consider the following problem firstly,


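$$\max_{\hat{\mathbf{F}}_P}\ \mathrm{Tr}\left\{\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right\}\quad \text{s.t.}\quad \left\|\hat{\mathbf{F}}_P\right\|_F^2 = \alpha P. \tag{27}$$

Introducing the notation $\hat{\mathbf{F}}_P = \left[\hat{\mathbf{f}}_{P,1}\ \hat{\mathbf{f}}_{P,2}\ \cdots\ \hat{\mathbf{f}}_{P,K}\right]$, the problem (27) is equivalent to

$$\max_{\hat{\mathbf{f}}_{P,1},\cdots,\hat{\mathbf{f}}_{P,K}}\ \sum_{k=1}^{K}\hat{\mathbf{f}}_{P,k}^H\mathbf{R}_{RF}^H\mathbf{R}_{RF}\hat{\mathbf{f}}_{P,k}\quad \text{s.t.}\quad \sum_{k=1}^{K}\left\|\hat{\mathbf{f}}_{P,k}\right\|^2 = \alpha P. \tag{28}$$

The optimal $\hat{\mathbf{f}}_{P,k}$ $(k = 1,\cdots,K)$ is a linear combination of the eigenvectors corresponding to the maximum eigenvalue of the matrix $\mathbf{R}_{RF}^H\mathbf{R}_{RF}$. Evidently, the eigenvalues of $\mathbf{R}_{RF}^H\mathbf{R}_{RF}$ are 1 and 0, and the eigenvectors corresponding to 1 are just the column vectors of $\mathbf{R}_{RF}^H$. Therefore, the optimal solution of the problem (28) can be expressed as

$$\hat{\mathbf{f}}_{P,k} = \mathbf{R}_{RF}^H\mathbf{t}_k$$

where $\mathbf{t}_k\in\mathbb{C}^{K\times 1}$ is an arbitrary vector such that $\sum_{k=1}^{K}\|\mathbf{t}_k\|^2 = \alpha P$. Then, the optimal solution of problem (27) can be expressed as

$$\hat{\mathbf{F}}_P = \mathbf{R}_{RF}^H\mathbf{T} \tag{29}$$

where $\mathbf{T} = \left[\mathbf{t}_1\ \mathbf{t}_2\ \cdots\ \mathbf{t}_K\right]$. Substituting (29) into (27), we have $\max\mathrm{Tr}\left\{\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right\} = \mathrm{Tr}\{\mathbf{T}\mathbf{T}^H\} = \sum_{k=1}^{K}\|\mathbf{t}_k\|^2 = \alpha P$. Then, we have

$$
\begin{aligned}
\left|\mathbf{H}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{H}_{RF}^H\right|
&= \left|\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right|\left|\mathbf{H}_{RF}\mathbf{H}_{RF}^H\right| \\
&\overset{(a)}{\leq} \left|\mathrm{diag}\left\{\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right\}\right|\left|\mathbf{H}_{RF}\mathbf{H}_{RF}^H\right| \\
&\overset{(b)}{\leq} \left(\frac{1}{K}\max\mathrm{Tr}\left\{\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right\}\right)^{K}\left|\mathbf{H}_{RF}\mathbf{H}_{RF}^H\right| \\
&= \left(\frac{\alpha P}{K}\right)^{K}\left|\mathbf{H}_{RF}\mathbf{H}_{RF}^H\right|
\end{aligned}
\tag{30}
$$

where (a) follows from Hadamard's inequality and (b) follows from the inequality of arithmetic and geometric means. Now, let us prove the following two lemmas.

Lemma 2. Both (a) and (b) in (30) are equalities if and only if $\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H = \frac{\alpha P}{K}\mathbf{I}$.

Proof 6. It is well-known that (a) is an equality if and only if $\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H$ is a diagonal matrix, and (b) is an equality if and only if the diagonal elements of $\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H$ are equal. Considering the equality $\max\mathrm{Tr}\left\{\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right\} = \alpha P$, both (a) and (b) are equalities if and only if $\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H = \frac{\alpha P}{K}\mathbf{I}$. This completes the proof of Lemma 2.

Lemma 3. The equality $\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H = \frac{\alpha P}{K}\mathbf{I}$ is true if and only if $\sqrt{\frac{K}{\alpha P}}\hat{\mathbf{F}}_P^H$ is a row orthogonal matrix of $\mathbf{H}_{RF}$.

Proof 7. The if part is straightforward. Both $\sqrt{\frac{K}{\alpha P}}\hat{\mathbf{F}}_P^H$ and $\mathbf{R}_{RF}$ are row orthogonal matrices of $\mathbf{H}_{RF}$, i.e., $\mathbf{H}_{RF} = \sqrt{\frac{K}{\alpha P}}\mathbf{P}_P\hat{\mathbf{F}}_P^H = \mathbf{P}_{RF}\mathbf{R}_{RF}$ and $\frac{K}{\alpha P}\hat{\mathbf{F}}_P^H\hat{\mathbf{F}}_P = \mathbf{R}_{RF}\mathbf{R}_{RF}^H = \mathbf{I}$, where $\mathbf{P}_P$ and $\mathbf{P}_{RF}$ are invertible matrices. Then, we have

$$
\begin{aligned}
\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H
&= \frac{K}{\alpha P}\left(\mathbf{P}_{RF}^{-1}\mathbf{P}_P\hat{\mathbf{F}}_P^H\right)\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\left(\hat{\mathbf{F}}_P\mathbf{P}_P^H\mathbf{P}_{RF}^{-H}\right) \\
&= \mathbf{P}_{RF}^{-1}\mathbf{P}_P\hat{\mathbf{F}}_P^H\hat{\mathbf{F}}_P\mathbf{P}_P^H\mathbf{P}_{RF}^{-H}
= \frac{\alpha P}{K}\mathbf{R}_{RF}\mathbf{R}_{RF}^H = \frac{\alpha P}{K}\mathbf{I}.
\end{aligned}
$$

For the only if part, when $\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H = \frac{\alpha P}{K}\mathbf{I}$, we have $\mathrm{Tr}\left\{\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H\right\} = \alpha P$; hence, $\hat{\mathbf{F}}_P$ should be the optimal solution of the problem (27), i.e., $\hat{\mathbf{F}}_P = \mathbf{R}_{RF}^H\mathbf{T}$. Consequently, we have $\mathbf{R}_{RF}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H\mathbf{R}_{RF}^H = \mathbf{R}_{RF}\mathbf{R}_{RF}^H\mathbf{T}\mathbf{T}^H\mathbf{R}_{RF}\mathbf{R}_{RF}^H = \mathbf{T}\mathbf{T}^H = \frac{\alpha P}{K}\mathbf{I}$, namely, $\sqrt{\frac{K}{\alpha P}}\mathbf{T}$ is a unitary matrix. Invoking the ROD $\mathbf{H}_{RF} = \mathbf{P}_{RF}\mathbf{R}_{RF}$, we have

$$
\begin{aligned}
\mathbf{H}_{RF} = \mathbf{P}_{RF}\mathbf{R}_{RF}
&= \frac{K}{\alpha P}\mathbf{P}_{RF}\mathbf{T}\mathbf{T}^H\mathbf{R}_{RF} \\
&= \left(\sqrt{\frac{K}{\alpha P}}\mathbf{P}_{RF}\mathbf{T}\right)\left(\sqrt{\frac{K}{\alpha P}}\mathbf{T}^H\mathbf{R}_{RF}\right) \\
&= \left(\sqrt{\frac{K}{\alpha P}}\mathbf{P}_{RF}\mathbf{T}\right)\left(\sqrt{\frac{K}{\alpha P}}\hat{\mathbf{F}}_P^H\right)
\end{aligned}
$$

where $\frac{K}{\alpha P}\hat{\mathbf{F}}_P^H\hat{\mathbf{F}}_P = \frac{K}{\alpha P}\mathbf{T}^H\mathbf{R}_{RF}\mathbf{R}_{RF}^H\mathbf{T} = \frac{K}{\alpha P}\mathbf{T}^H\mathbf{T} = \mathbf{I}$; in other words, $\sqrt{\frac{K}{\alpha P}}\hat{\mathbf{F}}_P^H$ is a row orthogonal matrix of $\mathbf{H}_{RF}$. This completes the proof of Lemma 3.

Combining Lemma 2 and Lemma 3, the matrix $\hat{\mathbf{F}}_P$ is the optimal solution of (24) if and only if $\sqrt{\frac{K}{\alpha P}}\hat{\mathbf{F}}_P^H$ is the row orthogonal matrix of $\mathbf{H}_{RF}$, and the maximum of (24) is $K\log\frac{\alpha P}{K} + \log\left|\mathbf{H}_{RF}\mathbf{H}_{RF}^H\right|$. The proof is completed.

Appendix B

Proof of Theorem 3. Let $\mathbf{F}_X = \mathbf{F}_{RF}\left(\mathbf{F}_{RF}^H\mathbf{F}_{RF}\right)^{-1/2}$ and $\mathbf{H}_W = \mathbf{W}_{RF}^H\mathbf{H}$; the problem (12) can be rewritten as

$$\max_{\mathbf{F}_X}\ \log\left|\mathbf{H}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{H}_W^H\right|\quad \text{s.t.}\quad \mathbf{F}_X^H\mathbf{F}_X = \mathbf{I},$$

where the constraint $\mathbf{F}_X^H\mathbf{F}_X = \mathbf{I}$ stems from the property that the matrix $\mathbf{F}_{RF}\left(\mathbf{F}_{RF}^H\mathbf{F}_{RF}\right)^{-1/2}$ is always a semi-unitary matrix. Similar to (30) in Appendix A, we have the following inequalities

$$\left|\mathbf{H}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{H}_W^H\right| = \left|\mathbf{R}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{R}_W^H\right|\left|\mathbf{H}_W\mathbf{H}_W^H\right| \leq \left(\frac{1}{K}\max\mathrm{Tr}\left\{\mathbf{R}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{R}_W^H\right\}\right)^{K}\left|\mathbf{H}_W\mathbf{H}_W^H\right|, \tag{31}$$

where $\mathbf{R}_W$ is the row orthogonal matrix of $\mathbf{H}_W$. Let us first consider the following optimization problem

$$\max_{\mathbf{F}_X}\ \mathrm{Tr}\left\{\mathbf{R}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{R}_W^H\right\}\quad \text{s.t.}\quad \mathbf{F}_X^H\mathbf{F}_X = \mathbf{I}. \tag{32}$$

Without loss of generality, the matrix $\mathbf{F}_X$ can be divided into $\mathbf{F}_X = \mathbf{F}_{\parallel} + \mathbf{F}_{\perp}$, where $\mathbf{F}_{\parallel} = \mathbf{R}_W^H\mathbf{R}_W\mathbf{F}_X$ and $\mathbf{F}_{\perp} = \left(\mathbf{I} - \mathbf{R}_W^H\mathbf{R}_W\right)\mathbf{F}_X$. It is well-known that $\mathcal{R}(\mathbf{F}_{\parallel})\subseteq\mathcal{R}(\mathbf{R}_W^H)$ and $\mathcal{R}(\mathbf{F}_{\perp})\subseteq\mathcal{N}(\mathbf{R}_W)$. Accordingly, the problem (32) can be expressed as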


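$$\max_{\mathbf{F}_{\parallel},\mathbf{F}_{\perp}}\ \mathrm{Tr}\left\{\mathbf{R}_W\mathbf{F}_{\parallel}\mathbf{F}_{\parallel}^H\mathbf{R}_W^H\right\}\quad \text{s.t.}\quad \mathbf{F}_{\parallel}^H\mathbf{F}_{\parallel} + \mathbf{F}_{\perp}^H\mathbf{F}_{\perp} = \mathbf{I}. \tag{33}$$

Let $\mathbf{F}_{\parallel} = \mathbf{R}_W^H\mathbf{T}_{\parallel}$ and $\mathbf{F}_{\perp} = \mathbf{R}_{W,\perp}^H\mathbf{T}_{\perp}$, where the columns of $\mathbf{R}_{W,\perp}^H\in\mathbb{C}^{N_t\times(N_t-K)}$ are the orthonormal bases of $\mathcal{N}(\mathbf{R}_W)$, and $\mathbf{T}_{\parallel}$ and $\mathbf{T}_{\perp}$ are $K\times N_{RF}$ and $(N_t-K)\times N_{RF}$ matrices, respectively. The problem (33) can be further rewritten as

$$\max_{\mathbf{T}_{\parallel},\mathbf{T}_{\perp}}\ \mathrm{Tr}\left\{\mathbf{T}_{\parallel}\mathbf{T}_{\parallel}^H\right\}\quad \text{s.t.}\quad \mathbf{T}_{\parallel}^H\mathbf{T}_{\parallel} + \mathbf{T}_{\perp}^H\mathbf{T}_{\perp} = \mathbf{I}.$$

For the positive semidefinite Hermitian matrix $\mathbf{T}_{\parallel}^H\mathbf{T}_{\parallel}$, there must be a unitary matrix $\mathbf{S}\in\mathbb{C}^{N_{RF}\times N_{RF}}$ such that $\mathbf{S}^H\mathbf{T}_{\parallel}^H\mathbf{T}_{\parallel}\mathbf{S} = \boldsymbol{\Sigma}$, where $\boldsymbol{\Sigma}$ is a diagonal matrix with non-negative diagonal elements. Then, we have $\mathbf{S}^H\mathbf{T}_{\perp}^H\mathbf{T}_{\perp}\mathbf{S} = \mathbf{I} - \boldsymbol{\Sigma}$. Obviously, the diagonal elements of the diagonal matrix $\mathbf{I} - \boldsymbol{\Sigma}$ are also non-negative. Invoking $\mathrm{rank}(\boldsymbol{\Sigma}) = \mathrm{rank}(\mathbf{T}_{\parallel}^H\mathbf{T}_{\parallel}) = \mathrm{rank}(\mathbf{T}_{\parallel})\leq K$, $\boldsymbol{\Sigma}\geq 0$ and $\mathbf{I} - \boldsymbol{\Sigma}\geq 0$, it follows that $\max\mathrm{rank}(\boldsymbol{\Sigma}) = K$ and $\max\mathrm{Tr}\{\boldsymbol{\Sigma}\} = K$. Therefore,

$$
\begin{aligned}
\max\mathrm{Tr}\left\{\mathbf{R}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{R}_W^H\right\}
&= \max\mathrm{Tr}\left\{\mathbf{R}_W\mathbf{F}_{\parallel}\mathbf{F}_{\parallel}^H\mathbf{R}_W^H\right\} \\
&= \max\mathrm{Tr}\left\{\mathbf{T}_{\parallel}\mathbf{T}_{\parallel}^H\right\} = \max\mathrm{Tr}\left\{\mathbf{T}_{\parallel}^H\mathbf{T}_{\parallel}\right\} = \max\mathrm{Tr}\{\boldsymbol{\Sigma}\} = K.
\end{aligned}
\tag{34}
$$

Substituting (34) into (31), we have

$$\left|\mathbf{H}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{H}_W^H\right| \leq \left(\frac{1}{K}\max\mathrm{Tr}\left\{\mathbf{R}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{R}_W^H\right\}\right)^{K}\left|\mathbf{H}_W\mathbf{H}_W^H\right| = \left|\mathbf{H}_W\mathbf{H}_W^H\right|$$

with equality if and only if $\mathbf{R}_W\mathbf{F}_X\mathbf{F}_X^H\mathbf{R}_W^H = \mathbf{I}$. This completes the proof of Theorem 3.

References

[1] CISCO, Cisco Visual Networking Index: Forecast and Trends, 2017–2022, White Paper, Nov. 2018.
[2] X. Wang, L. Kong, F. Kong, F. Qiu, M. Xia, G. Chen, Millimeter wave communication: a comprehensive survey, IEEE Commun. Surv. Tutor. 20 (3) (2018) 1616–1653.
[3] N.Q. Nhan, P. Rostaing, K. Amis, L. Collin, E. Radoi, Optimization of linear MIMO precoding assuming MMSE-based turbo equalization, Digit. Signal Process. 75 (2018) 45–55.
[4] P.A.C. Lopes, J.A.B. Gerald, Leakage-based precoding algorithms for multiple streams per terminal MU-MIMO systems, Digit. Signal Process. 75 (2018) 38–44.
[5] X. He, Q. Guo, J. Tong, J. Xi, Y. Yu, Low-complexity approximate iterative LMMSE detection for large-scale MIMO systems, Digit. Signal Process. 60 (2017) 134–139.
[6] R. Zhang, J. Zhang, Y. Gao, H. Zhao, Bussgang decomposition-based sparse channel estimation in wideband hybrid millimeter wave MIMO systems with finite-bit ADCs, Digit. Signal Process. 85 (2019) 29–40.
[7] F. Liu, X. Bai, R. Du, Z. Sun, X. Kan, Givens rotation based column-wise hybrid precoding for millimeter wave MIMO systems, Digit. Signal Process. 88 (2019) 130–137.
[8] J.-C. Chen, Efficient codebook-based beamforming algorithm for millimeter-wave massive MIMO systems, IEEE Trans. Veh. Technol. 66 (9) (2017) 7809–7817.
[9] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, R.W. Heath Jr., Spatially sparse precoding in millimeter wave MIMO systems, IEEE Trans. Wirel. Commun. 13 (3) (2014) 1499–1513.
[10] G. Kwon, H. Park, A joint scheduling and millimeter wave hybrid beamforming system with partial side information, in: IEEE International Conference on Communications, 2016, pp. 1–6.
[11] J. Zhao, F. Gao, W. Jia, S. Zhang, S. Jin, H. Lin, Angle domain hybrid precoding and channel tracking for mmWave massive MIMO systems, IEEE Trans. Wirel. Commun. 16 (10) (2017) 6868–6880.
[12] A. Liu, V.K.N. Lau, Impact of CSI knowledge on the codebook-based hybrid beamforming in massive MIMO, IEEE Trans. Signal Process. 64 (24) (2016) 6545–6556.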

[13] A. Li, C. Masouros, Hybrid precoding and combining design for millimeter-wave multi-user MIMO based on SVD, in: IEEE International Conference on Communications, 2017, pp. 1–6.
[14] W. Ni, X. Dong, Hybrid block diagonalization for massive multiuser MIMO systems, IEEE Trans. Commun. 64 (1) (2016) 201–211.
[15] L. Zhao, D. Wing Kwan Ng, J. Yuan, Multi-user precoding and channel estimation for hybrid millimeter wave systems, IEEE J. Sel. Areas Commun. 35 (7) (2017) 1576–1590.
[16] Z. Wang, M. Li, Q. Liu, A.L. Swindlehurst, Hybrid precoder and combiner design with low resolution phase shifters in mmWave MIMO systems, IEEE J. Sel. Top. Signal Process. 12 (2) (2018) 256–269.
[17] C.B. Peel, B.M. Hochwald, A.L. Swindlehurst, A vector-perturbation technique for near-capacity multiantenna multiuser communication – part I: channel inversion and regularization, IEEE Trans. Commun. 53 (1) (2005) 195–202.
[18] B.M. Hochwald, C.B. Peel, A.L. Swindlehurst, A vector-perturbation technique for near-capacity multiantenna multiuser communication – part II: perturbation, IEEE Trans. Commun. 53 (3) (2005) 537–544.
[19] R. Mai, T. Le-Ngoc, D.H.N. Nguyen, Hybrid MMSE-VP precoding for multi-user massive MIMO systems, in: IEEE International Conference on Communications, 2017, pp. 1–6.
[20] R. Mai, T. Le-Ngoc, D.H.N. Nguyen, Two-timescale hybrid RF-baseband precoding with MMSE-VP for multi-user massive MIMO broadcast channels, IEEE Trans. Wirel. Commun. 17 (7) (2018) 4462–4476.
[21] T.Y. Chang, C.E. Chen, A hybrid Tomlinson-Harashima transceiver design for multiuser mmWave MIMO systems, IEEE Wirel. Commun. Lett. 7 (1) (2018) 118–121.
[22] L. Sun, M.R. McKay, Tomlinson-Harashima precoding for multiuser MIMO systems with quantized CSI feedback and user scheduling, IEEE Trans. Signal Process. 62 (16) (2014) 4077–4090.
[23] L. Sun, M. Lei, Quantized CSI-based Tomlinson-Harashima precoding in multiuser MIMO systems, IEEE Trans. Wirel. Commun. 12 (3) (2013) 1118–1126.
[24] R.F.H. Fischer, Precoding and Signal Shaping for Digital Transmission, Wiley, 2002.
[25] Z. Gao, C. Hu, L. Dai, Z. Wang, Channel estimation for millimeter-wave massive MIMO with hybrid precoding over frequency-selective fading channels, IEEE Commun. Lett. 20 (6) (2016) 1259–1262.
[26] C. Balanis, Antenna Theory, Wiley, 1997.
[27] C. Windpassinger, R.F.H. Fischer, T. Vencel, J.B. Huber, Precoding in multiantenna and multiuser communications, IEEE Trans. Wirel. Commun. 3 (4) (2004) 1305–1316.
[28] A. Alkhateeb, R.W. Heath, Frequency selective hybrid precoding for limited feedback millimeter wave systems, IEEE Trans. Commun. 64 (5) (2016) 1801–1818.
[29] H. Ghauch, T. Kim, M. Bengtsson, M. Skoglund, Subspace estimation and decomposition for large millimeter-wave MIMO systems, IEEE J. Sel. Top. Signal Process. 10 (3) (2016) 528–542.
[30] R. Rajashekar, L. Hanzo, Iterative matrix decomposition aided block diagonalization for mm-Wave multiuser MIMO systems, IEEE Trans. Wirel. Commun. 16 (3) (2017) 1372–1384.
[31] X. Wu, D. Liu, F. Yin, Hybrid beamforming for multi-user massive MIMO systems, IEEE Trans. Commun. 66 (9) (2018) 3879–3891.
[32] M. Soltanalian, P. Stoica, Designing unimodular codes via quadratic optimization, IEEE Trans. Signal Process. 62 (5) (2014) 1221–1234.
[33] N. Boumal, Nonconvex phase synchronization, SIAM J. Optim. 26 (4) (2016) 2355–2377.
[34] S. Lin, W.W.L. Ho, Y.-C. Liang, Block diagonal geometric mean decomposition (BD-GMD) for MIMO broadcast channels, IEEE Trans. Wirel. Commun. 7 (7) (2008) 2778–2789.

Xiaoyu Bai received the B.S. degree from Nanjing University of Science and Technology, Nanjing, China, in 2010 and the M.S. degree from Shanghai Academy of Spaceflight Technology, Shanghai, China, in 2013. He is currently pursuing the Ph.D. degree at the School of Computer Science and Engineering, Northeastern University, Shenyang, China. His research interests include array signal processing and millimeter wave MIMO systems. He is a recipient of the Best Student Paper Award at the 2017 Progress in Electromagnetics Research Symposium (PIERS).

Fulai Liu received the M.S. degree and the Ph.D. degree from Northeastern University, Shenyang, China, in 2002 and 2005, respectively. Since 2010, he has been a professor at Northeastern University, Qinhuangdao, China. His research interests include array signal processing and its applications, cognitive radio, millimeter wave MIMO systems, etc.

Ruiyan Du received the B.S. degree from Hebei Normal University, Shijiazhuang, China, in 1999, the M.S. degree from Yanshan University, Qinhuangdao, China, in 2006, and the Ph.D. degree from Northeastern University, Shenyang, China, in 2012. Since 2012, she has been an assistant professor


at Northeastern University, Qinhuangdao, China. Her research interests include wireless communications and signal processing for communications.

Xiaodong Kan received the B.S. degree from Anhui Normal University, Wuhu, China, in 2017. She is currently pursuing the M.S. degree at Northeastern University. Her research interests include millimeter wave communications and array signal processing.

Yixin Xu received the M.S. degree from Yanshan University, Qinhuangdao, China, in 2011. He is currently pursuing the Ph.D. degree in signal and information processing at Northeastern University. His research interests include signal processing for communications and massive MIMO systems.

Yanshuo Zhang received the B.S. degree from Northeast Petroleum University, Daqing, China, in 2017. She is currently pursuing the M.S. degree at Northeastern University. Her research interests include millimeter wave communications and massive MIMO systems.
