Digital Signal Processing 93 (2019) 102–114
ROD-based hybrid TH precoding and combining for mmWave large-scale MIMO systems ✩
Xiaoyu Bai a, Fulai Liu a,b,∗, Ruiyan Du a,b, Xiaodong Kan a,b, Yixin Xu a, Yanshuo Zhang a,b
a School of Computer Science and Engineering, Northeastern University, Shenyang, 110819, China
b Institute of Engineering Optimization & Smart Antenna, Northeastern University at Qinhuangdao, 066004, China
Article info
Article history: Available online 24 July 2019
Keywords: Millimeter-wave; Hybrid precoder; Tomlinson-Harashima precoding; Nonlinear precoding; MIMO
Abstract
Hybrid precoding is one of the key techniques for millimeter wave (mmWave) large-scale multiple-input multiple-output (MIMO) systems. This paper considers a nonlinear hybrid precoding architecture which consists of a nonlinear unit, a reductive digital precoder and a constant modulus radio frequency (RF) precoder, and presents a novel hybrid Tomlinson-Harashima (TH) precoding and combining algorithm. Firstly, because the sum rates maximization problem is intractable for such a nonlinear hybrid precoding architecture, a tractable three-stage optimization problem is constructed through the lower bound of the sum rates, which allows the digital precoding matrix, the RF precoding matrix and the RF combining matrix to be optimized sequentially and independently. Then, in order to solve the three-stage optimization problem effectively, a novel row orthogonal decomposition (ROD) is defined. Based on the ROD, the necessary and sufficient condition for the optimal digital precoding matrix is obtained, and a near-optimal RF precoding matrix is derived. Finally, the optimization of the RF combining matrix is reformulated as a unimodular quadratic program and solved by a generalized power method. Theoretical analyses and simulations indicate that the proposed ROD-based hybrid TH precoding and combining algorithm offers higher sum rates and a lower bit error rate with a comparable complexity in comparison to previous works. © 2019 Elsevier Inc. All rights reserved.
1. Introduction
The recent proliferation of smart mobile devices has resulted in an unprecedented growth of data traffic in wireless communications. For example, the Cisco Visual Networking Index shows that global mobile data traffic will increase sevenfold between 2017 and 2022 [1]. Since the spectrum resources below 6 GHz have been almost completely occupied, it is very challenging to fulfill the increasing communication demands with the conventional commercial frequency bands. To meet this increase of mobile data traffic, one of the most efficient solutions is to transmit data with millimeter wave (mmWave) large-scale multiple-input multiple-output (MIMO) systems, owing to their high data rate, enormous idle spectrum resources and compact hardware structure [2].
✩ This work was supported by the Natural Science Foundation of Hebei Province (Grant No. F2016501139) and the Fundamental Research Funds for the Central Universities (Grant No. N172302002 and No. N162304002).
∗ Corresponding author at: Institute of Engineering Optimization & Smart Antenna, Northeastern University at Qinhuangdao, 066004, China. E-mail address: [email protected] (F. Liu).
https://doi.org/10.1016/j.dsp.2019.07.010 1051-2004/© 2019 Elsevier Inc. All rights reserved.
A fully digital precoding scheme for MIMO systems requires one individual radio frequency (RF) chain per antenna element [3–5], which is prohibitively complex and costly at mmWave frequencies. To address this issue, a hybrid precoding architecture has been widely considered for mmWave MIMO systems in recent years. The hybrid precoding architecture divides the precoder into a low-dimensional digital signal processor and a high-dimensional analog signal processor, so that it requires a significantly lower number of RF chains than the fully digital counterpart [6–9]. In the past few years, several hybrid precoding algorithms for multi-user mmWave MIMO systems have been proposed. The first prevalent category of hybrid precoding is the codebook-based method [10–12], in which the columns of the RF precoding matrix are selected from a predefined codebook. An equally spaced grouping scheme based on the discrete Fourier transform (DFT) codebook is proposed in [10]; the presented algorithm firstly divides the DFT codebook into groups, and then the group which maximizes the sum rates is selected to construct the RF precoding matrix. A spatial rotation algorithm is proposed in [11], which refines the angles of the DFT beams and effectively improves the performance of the DFT codebook-based hybrid precoding method. Besides, the impacts of the instantaneous channel state information (CSI)
and hybrid CSI are studied for the codebook-based hybrid precoding method in [12], and it is shown that the hybrid CSI is sufficient to achieve the first-order gain provided by massive MIMO systems in most cases. The second category of hybrid precoding is the non-codebook-based method, which usually solves a relaxed version of the hybrid precoding problem first, and then adjusts the solution according to the hardware constraints [13–16]. For example, a singular value decomposition (SVD)-based hybrid precoding algorithm is given in [13], which derives the analog precoding matrix and the digital precoding matrix via the SVD and the zero forcing (ZF) precoding, respectively. A hybrid block diagonalization (BD) precoding algorithm is developed in [14]; the proposed algorithm firstly maximizes the effective channel gain via the RF precoding matrix, and then the BD precoding is implemented in the digital domain to suppress the inter-user interference. A joint channel estimation and hybrid precoding algorithm is given in [15], which exploits the strongest angle-of-arrival to design the analog precoding matrix and optimizes the digital precoding matrix through the ZF precoding. Additionally, an iterative hybrid precoding algorithm for low resolution RF phase shifters is developed in [16], which refines the RF precoding matrix and digital precoding matrix via block coordinate ascent and minimum mean square error (MMSE) precoding, respectively. However, all the aforementioned algorithms are linear precoding methods, which may incur a performance loss, especially for an ill-conditioned channel matrix [17]. Fortunately, it has been shown in [18] that such a phenomenon can be effectively avoided by adding a perturbation vector to the data streams in advance. Based on this idea, a nonlinear hybrid MMSE-vector perturbation (MMSE-VP) scheme is proposed in [19]. However, the RF precoding matrix in [19] is implemented by phase shifters and power amplifiers simultaneously, which is energy hungry. To solve this problem, a DFT codebook-based nonlinear hybrid MMSE-VP precoding algorithm is further given in [20], whose RF precoding matrix is only based on energy-efficient phase shifters. Though the hybrid MMSE-VP precoding algorithms offer a significant improvement compared with linear methods, a Lenstra-Lenstra-Lovász basis reduction and a Branching-Reduction-and-Bounding method are used to solve the perturbation vector and RF precoding matrix in [19,20], which involve high complexities. In order to reduce the computational costs, a low-complexity nonlinear hybrid block diagonal geometric mean decomposition (BD-GMD) Tomlinson-Harashima (TH) precoding algorithm is proposed in [21], in which an orthogonal matching pursuit (OMP) algorithm is used to decompose the fully digital TH precoding matrix into the product of the RF precoding matrix and the digital precoding matrix. However, the OMP algorithm restricts the column vectors of the RF precoding matrix to belong to a predefined codebook; if the codewords within the codebook are far from the optimal solution of the RF precoding matrix, the system performance will inevitably decay. Against this backdrop, a novel hybrid TH precoding and combining algorithm is proposed in this paper. The main contributions can be summarized as follows:
• A tractable optimization problem of the precoding and combining matrices is firstly constructed through the lower bound of the sum rates, and then the problem is further transformed into an equivalent three-stage optimization problem which allows the digital precoding matrix, the RF precoding matrix and the RF combining matrix to be optimized sequentially and independently.
• To solve the aforementioned three-stage optimization problem effectively, a novel row orthogonal decomposition (ROD), which represents the orthonormal bases of the row space of a matrix, is defined. Based on the newly defined ROD, the necessary and sufficient condition for the optimal digital precoding matrix is derived and a near-optimal RF precoding matrix is given. Then, by utilizing the asymptotic orthogonality of different user channels, the optimization of the RF combining matrix is reformulated as a unimodular quadratic program and solved by a generalized power method.
• The sum rates and bit error rate (BER) of the presented algorithm are evaluated by theoretical analyses and simulations. Results indicate that the performance loss of the proposed algorithm is slight. Compared with the previous hybrid precoding methods, the proposed algorithm improves the sum rates and reduces the BER significantly with comparable computational costs.

The rest of this paper is organized as follows. In Section 2, the system and channel models are described. Section 3 explains the proposed hybrid TH precoding algorithm. In Section 4, the asymptotic performance of the proposed algorithm is analyzed. Section 5 evaluates the performance of the proposed algorithm through several simulations, and Section 6 concludes the paper.

Throughout this paper, $\mathbf{A}$ is a matrix, $\mathbf{a}$ is a vector and $a$ is a scalar. $|a|$, $\angle a$ and $a^{*}$ are the magnitude, argument and conjugate of the complex number $a$, respectively. The field of complex numbers is represented by $\mathbb{C}$. $|\mathbf{A}|$ denotes the determinant of $\mathbf{A}$, $\|\mathbf{A}\|_F$ is its Frobenius norm, $\mathrm{rank}(\mathbf{A})$ stands for its rank, $\mathrm{Tr}(\mathbf{A})$ represents its trace, and $\mathbf{A}^{-1}$, $\mathbf{A}^{\dagger}$, $\mathbf{A}^{T}$ and $\mathbf{A}^{H}$ are its inverse, Moore-Penrose pseudo-inverse, transpose and conjugate transpose, respectively. $\mathcal{R}(\mathbf{A})$ and $\mathcal{N}(\mathbf{A})$ are the column space and nullspace of $\mathbf{A}$. $[\mathbf{A}]_{m,n}$ stands for the $(m, n)$th element of the matrix $\mathbf{A}$. $e^{j\angle\mathbf{A}}$ is a matrix whose $(m, n)$th element is equal to $e^{j\angle[\mathbf{A}]_{m,n}}$. $\mathbf{I}$ and $\mathbf{0}$ stand for the identity matrix and zero matrix, respectively. $\mathrm{diag}\{a_1, \cdots, a_N\}$ is a diagonal matrix with the entries in $\{a_1, \cdots, a_N\}$ on its diagonal, $\mathrm{diag}\{\mathbf{A}\}$ represents a diagonal matrix with diagonal elements given by $[\mathbf{A}]_{1,1}, \cdots, [\mathbf{A}]_{N,N}$, and $\mathrm{diag}\{\mathbf{A}_1, \cdots, \mathbf{A}_N\}$ is a block-diagonal matrix with the elements in $\{\mathbf{A}_1, \cdots, \mathbf{A}_N\}$ as the diagonal blocks. $\|\mathbf{a}\|$ denotes the Euclidean norm of the vector $\mathbf{a}$, and $\mathbf{a}(i)$ is the $i$th entry of the vector $\mathbf{a}$. $\mathrm{E}[\cdot]$ is used to denote expectation, and $\mathcal{CN}(a, b)$ is the complex Gaussian distribution with the mean $a$ and the covariance $b$.

2. System model
Consider a multi-user mmWave MIMO system with a hybrid TH precoder as shown in Fig. 1 [21], in which a base station (BS) equipped with $N_t$ antennas and $N_{RF}$ RF chains simultaneously serves $K$ users, each with $N_r$ antennas and one RF chain. The BS employs a hybrid TH precoder to send an $N_s \times 1$ signal vector $\mathbf{s} = [s_1, \cdots, s_{N_s}]^T$ to the users, where each entry of $\mathbf{s}$ is chosen from an $M$-ary quadrature amplitude modulation ($M$-QAM) constellation set

$$\mathcal{A} = \Big\{ s_R + j s_I \;\Big|\; s_R, s_I \in \Big\{ \pm\sqrt{\tfrac{3}{2(M-1)}},\ \pm 3\sqrt{\tfrac{3}{2(M-1)}},\ \cdots,\ \pm(\sqrt{M}-1)\sqrt{\tfrac{3}{2(M-1)}} \Big\} \Big\}.$$

It is assumed that all the data streams are independent and have unit power, i.e., $\mathrm{E}\{\mathbf{s}\mathbf{s}^H\} = \mathbf{I}$. The signal vector $\mathbf{s}$ is firstly fed into a nonlinear unit which consists of an $N_s \times N_s$ feedback matrix $\mathbf{B}$ and a modulo operator. The feedback matrix $\mathbf{B}$ is a strictly lower triangular matrix and the modulo operator $\mathrm{MOD}_M(x)$ is defined as [22]

$$\mathrm{MOD}_M(x) = x - 2\tau\left( \left\lfloor \frac{\Re(x)}{2\tau} + \frac{1}{2} \right\rfloor + j \left\lfloor \frac{\Im(x)}{2\tau} + \frac{1}{2} \right\rfloor \right)$$

where $\tau = \sqrt{M}\sqrt{\tfrac{3}{2(M-1)}}$, $\lfloor x \rfloor$ is the largest integer not exceeding $x$, and $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and imaginary parts of a complex number, respectively. The output vector of the nonlinear unit $\tilde{\mathbf{s}}$ can be expressed as [23]
Fig. 1. A multi-user mmWave MIMO system with a hybrid TH precoder.
$$\tilde{s}_i = \mathrm{MOD}_M\Big( s_i - \sum_{k=1}^{i-1} [\mathbf{B}]_{i,k}\, \tilde{s}_k \Big), \quad i = 1, \cdots, N_s.$$

It is evident that $\tilde{s}_i$ no longer belongs to the $M$-QAM constellation points, hence the power of $\tilde{\mathbf{s}}$ differs from that of $\mathbf{s}$. Invoking [24], we have $\mathrm{E}\{\tilde{\mathbf{s}}\tilde{\mathbf{s}}^H\} = \alpha^{-1}\mathrm{E}\{\mathbf{s}\mathbf{s}^H\}$, where $\alpha = \frac{M-1}{M}$. After the nonlinear operation, the vector $\tilde{\mathbf{s}}$ is multiplied by the digital precoding matrix $\mathbf{F}_{BB} \in \mathbb{C}^{N_{RF} \times N_s}$ and the analog precoding matrix $\mathbf{F}_{RF} \in \mathbb{C}^{N_t \times N_{RF}}$ sequentially; therefore, the transmitted signal $\mathbf{x}$ can be given by

$$\mathbf{x} = \mathbf{F}_{RF}\mathbf{F}_{BB}\tilde{\mathbf{s}}$$

where the analog precoding matrix $\mathbf{F}_{RF}$ is subject to $|[\mathbf{F}_{RF}]_{m,n}| = \frac{1}{\sqrt{N_t}}$. The transmit power constraint can be expressed as $\|\mathbf{F}_{RF}\mathbf{F}_{BB}\|_F^2 = \alpha P$, where $P$ is the transmit power.

The received signal at the $k$th user prior to the modulo operator can be written as

$$\tilde{y}_k = w_{BB,k}\,\mathbf{w}_{RF,k}^H \mathbf{H}_k \mathbf{F}_{RF}\mathbf{F}_{BB}\tilde{\mathbf{s}} + w_{BB,k}\,\mathbf{w}_{RF,k}^H \mathbf{n}_k$$

where $\mathbf{H}_k \in \mathbb{C}^{N_r \times N_t}$ stands for the channel matrix between the BS and the $k$th user. $\mathbf{w}_{RF,k} \in \mathbb{C}^{N_r \times 1}$ denotes the analog RF combining vector, which is subject to $|\mathbf{w}_{RF,k}(i)| = \frac{1}{\sqrt{N_r}}$. $w_{BB,k}$ is a scaling factor. The entries of the noise vector $\mathbf{n}_k \in \mathbb{C}^{N_r \times 1}$ are independent and identically distributed (i.i.d.) $\mathcal{CN}(0, \sigma_n^2)$ random variables, in which $\sigma_n^2$ denotes the noise power. For simplicity, the received signals prior to the modulo operator for all users can be reformulated as

$$\tilde{\mathbf{y}} = \mathbf{W}_{BB}\mathbf{W}_{RF}^H \mathbf{H}\mathbf{F}_{RF}\mathbf{F}_{BB}\tilde{\mathbf{s}} + \mathbf{W}_{BB}\mathbf{W}_{RF}^H \mathbf{n}$$

where $\tilde{\mathbf{y}} = [\tilde{y}_1 \cdots \tilde{y}_K]^T$, $\mathbf{H} = [\mathbf{H}_1^H \cdots \mathbf{H}_K^H]^H$, $\mathbf{W}_{BB} = \mathrm{diag}\{w_{BB,1}, \cdots, w_{BB,K}\}$ and $\mathbf{W}_{RF} = \mathrm{diag}\{\mathbf{w}_{RF,1}, \cdots, \mathbf{w}_{RF,K}\}$.

The mmWave propagation channel can be characterized by a Rician channel model consisting of LOS and NLOS components [15,25]

$$\mathbf{H}_k = \sqrt{\frac{v_k}{v_k+1}}\,\mathbf{H}_{LOS,k} + \sqrt{\frac{1}{v_k+1}}\,\mathbf{H}_{NLOS,k}$$

where $v_k$ is referred to as the Rician factor. $\mathbf{H}_{LOS,k}$ and $\mathbf{H}_{NLOS,k}$ denote the LOS component and the NLOS component, respectively. It is assumed that the NLOS component $\mathbf{H}_{NLOS,k}$ consists of $N_{c,k}$ clusters and each cluster is composed of $N_{l,k}$ paths; accordingly, the LOS and NLOS components can be given by [15]

$$\mathbf{H}_{LOS,k} = \alpha_k\, \mathbf{a}_r(\phi_k^r, \theta_k^r)\, \mathbf{a}_t(\phi_k^t, \theta_k^t)^H$$

$$\mathbf{H}_{NLOS,k} = \sqrt{\frac{1}{N_{c,k} N_{l,k}}} \sum_{i=1}^{N_{c,k}} \sum_{l=1}^{N_{l,k}} \alpha_{k,i,l}\, \mathbf{a}_r(\phi_{k,i,l}^r, \theta_{k,i,l}^r)\, \mathbf{a}_t(\phi_{k,i,l}^t, \theta_{k,i,l}^t)^H$$

where $\alpha_k$, $\phi_k^r$ ($\theta_k^r$) and $\phi_k^t$ ($\theta_k^t$) denote the complex gain, the azimuth (elevation) angle of arrival (AOA) and the azimuth (elevation) angle of departure (AOD) of the LOS path associated with the $k$th user. $\alpha_{k,i,l}$, $\phi_{k,i,l}^r$ ($\theta_{k,i,l}^r$) and $\phi_{k,i,l}^t$ ($\theta_{k,i,l}^t$) are referred to as the complex gain, the azimuth (elevation) AOA and the azimuth (elevation) AOD of the $(i, l)$th path of the NLOS component associated with the $k$th user. $\mathbf{a}_r(\phi^r, \theta^r)$ and $\mathbf{a}_t(\phi^t, \theta^t)$ represent the array response vectors of the user and the BS, respectively. For an $M \times N$ uniform planar array (UPA) in the $yz$-plane with a half-wavelength inter-element spacing, the array response vector can be expressed as [26]

$$\mathbf{a}(\phi, \theta) = \big[\, 1 \ \cdots \ e^{j\pi(m \sin\phi \sin\theta + n\cos\theta)} \ \cdots \ e^{j\pi((M-1)\sin\phi\sin\theta + (N-1)\cos\theta)} \,\big]^T$$

where $0 \le m \le M-1$ and $0 \le n \le N-1$ are the $y$ and $z$ indices of the antenna elements, respectively.
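The nonlinear unit of Section 2 can be illustrated with a minimal, standard-library-only sketch. It assumes a hypothetical 3-stream example with 16-QAM and an arbitrary strictly lower triangular feedback matrix `B`; the function names (`qam_spacing`, `mod_M`, `th_feedback`) are illustrative and do not come from the paper. The sketch applies the modulo operator and the successive pre-subtraction, then checks that applying $(\mathbf{I} + \mathbf{B})$ followed by the same modulo operator returns the original symbols in the noise-free case.

```python
import math

def qam_spacing(M):
    """Smallest coordinate magnitude of a unit-power M-QAM constellation."""
    return math.sqrt(3.0 / (2.0 * (M - 1)))

def mod_M(x, M):
    """Modulo operator MOD_M: wraps real and imaginary parts into [-tau, tau)."""
    tau = math.sqrt(M) * qam_spacing(M)
    re = x.real - 2 * tau * math.floor(x.real / (2 * tau) + 0.5)
    im = x.imag - 2 * tau * math.floor(x.imag / (2 * tau) + 0.5)
    return complex(re, im)

def th_feedback(s, B, M):
    """Compute s~_i = MOD_M(s_i - sum_{k<i} B[i][k] * s~_k) stream by stream."""
    s_tilde = []
    for i, s_i in enumerate(s):
        interference = sum(B[i][k] * s_tilde[k] for k in range(i))
        s_tilde.append(mod_M(s_i - interference, M))
    return s_tilde

# Hypothetical example values (assumptions, not taken from the paper).
M = 16
d = qam_spacing(M)
s = [complex(3 * d, -d), complex(-d, d), complex(d, 3 * d)]
B = [[0, 0, 0],
     [0.8 - 0.3j, 0, 0],
     [-0.5 + 1.1j, 0.4 + 0.2j, 0]]

s_tilde = th_feedback(s, B, M)

# Noise-free check: (I + B) s~ differs from s only by multiples of 2*tau,
# so the receiver-side modulo operator recovers s.
received = [s_tilde[i] + sum(B[i][k] * s_tilde[k] for k in range(i))
            for i in range(len(s))]
recovered = [mod_M(r, M) for r in received]
```

The modulo step keeps every entry of `s_tilde` inside the square of half-width $\tau$, which is why its power differs from that of `s` by the factor $\alpha^{-1}$ mentioned above.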
3. Hybrid TH precoding algorithm
In this section, the nonlinear unit, the hybrid precoder at the BS and the hybrid combiner at the users will be considered sequentially. For convenience, the diagram of the hybrid TH precoder is redrawn in Fig. 2, where the digital precoding matrix $\mathbf{F}_{BB}$ is decomposed into a product of the matrices $\mathbf{F}_P \in \mathbb{C}^{N_{RF} \times K}$ and $\mathbf{F}_D \in \mathbb{C}^{K \times K}$, i.e., $\mathbf{F}_{BB} = \mathbf{F}_P\mathbf{F}_D$.

3.1. Problem formulation
Firstly, by regarding the matrix $\mathbf{H}_e = \mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P$ as an effective channel and invoking the fully digital TH precoding algorithm proposed in [27], the feedback matrix $\mathbf{B}$, the digital precoding matrix $\mathbf{F}_D$ and the digital combining matrix $\mathbf{W}_{BB}$ can be given by¹

$$\mathbf{B} = \mathbf{G}_e^{-1}\mathbf{L}_e - \mathbf{I} \tag{1}$$
$$\mathbf{F}_D = \mathbf{Q}_e^H \tag{2}$$
$$\mathbf{W}_{BB} = \mathbf{G}_e^{-1} \tag{3}$$

¹ The details of Eqs. (1)–(3) are omitted due to the limited space; interested readers are referred to Eq. (14) in [27].
Fig. 2. The diagram of the hybrid TH precoder.
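where $\mathbf{L}_e$ and $\mathbf{Q}_e$ are given by the LQ decomposition² (LQD) of the effective channel $\mathbf{H}_e = \mathbf{L}_e\mathbf{Q}_e$, and $\mathbf{G}_e = \mathrm{diag}\{l_{e,11}, \cdots, l_{e,KK}\}$ is a diagonal matrix in which $l_{e,kk}$ is the $k$th diagonal entry of $\mathbf{L}_e$. Then, the sum rates of the hybrid TH precoder can be expressed as $R = \log\left|\mathbf{I} + \frac{1}{\sigma_n^2}\mathbf{G}_e^2\right|$ [22,27]. Accordingly, the optimization problem of the hybrid precoding matrices $\mathbf{F}_{RF}$, $\mathbf{F}_P$ and the analog combining matrix $\mathbf{W}_{RF}$ can be expressed as

$$\begin{aligned} \max_{\mathbf{W}_{RF},\mathbf{F}_{RF},\mathbf{F}_P}\ & \log\left|\mathbf{I} + \frac{1}{\sigma_n^2}\mathbf{G}_e^2\right| \\ \text{s.t.}\ & \|\mathbf{F}_{RF}\mathbf{F}_P\|_F^2 = \alpha P & \text{C1} \\ & |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j & \text{C2} \\ & |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m & \text{C3.} \end{aligned} \tag{4}$$

However, the objective function of (4) is not an explicit function of the matrix variables $\mathbf{F}_{RF}$, $\mathbf{F}_P$ and $\mathbf{W}_{RF}$, which renders the problem intractable. Fortunately, the objective function of (4) is lower bounded by $\log\left|\frac{1}{\sigma_n^2}\mathbf{G}_e^2\right|$ [16]. Accordingly, the problem (4) can be approximately transformed into the following formulation

$$\begin{aligned} \max_{\mathbf{W}_{RF},\mathbf{F}_{RF},\mathbf{F}_P}\ & \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P\mathbf{F}_P^H\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right| \\ \text{s.t.}\ & \|\mathbf{F}_{RF}\mathbf{F}_P\|_F^2 = \alpha P & \text{C1} \\ & |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j & \text{C2} \\ & |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m & \text{C3,} \end{aligned} \tag{5}$$

where $\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P\mathbf{F}_P^H\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right| = \left|\mathbf{H}_e\mathbf{H}_e^H\right| = \left|\mathbf{G}_e^2\right|$. It is noteworthy that the transmit power constraint C1 of (5) involves a product of $\mathbf{F}_{RF}$ and $\mathbf{F}_P$. In order to simplify the constraint C1 further, a notation $\hat{\mathbf{F}}_P = (\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{\frac{1}{2}}\mathbf{F}_P$ is introduced and the problem (5) can be rewritten as

$$\begin{aligned} \max_{\mathbf{W}_{RF},\mathbf{F}_{RF},\hat{\mathbf{F}}_P}\ & J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P) \\ \text{s.t.}\ & \|\hat{\mathbf{F}}_P\|_F^2 = \alpha P & \text{C1} \\ & |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j & \text{C2} \\ & |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m & \text{C3,} \end{aligned} \tag{6}$$

where $J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P) = \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\hat{\mathbf{F}}_P\hat{\mathbf{F}}_P^H(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|$. Inspired by [28], the problem (6) can be rewritten as the following three-stage form

$$P3:\ \max_{\mathbf{W}_{RF}} \left\{ P2:\ \max_{\mathbf{F}_{RF}} \left\{ P1:\ \max_{\hat{\mathbf{F}}_P} J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P)\ \ \text{s.t.}\ \|\hat{\mathbf{F}}_P\|_F^2 = \alpha P\ \ \text{C1} \right\}\ \text{s.t.}\ |[\mathbf{F}_{RF}]_{i,j}| = \tfrac{1}{\sqrt{N_t}},\ \forall i, j\ \ \text{C2} \right\}\ \text{s.t.}\ |\mathbf{w}_{RF,k}(m)| = \tfrac{1}{\sqrt{N_r}},\ \forall k, m\ \ \text{C3.} \tag{7}$$

The problem (7) consists of three subproblems $P1$, $P2$ and $P3$. $P1$ is a problem of $\hat{\mathbf{F}}_P$ with given $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$; without loss of generality, the optimal solution of $P1$ can be expressed as $\hat{\mathbf{F}}_{P,opt} = f(\mathbf{F}_{RF}, \mathbf{W}_{RF})$. Then, the objective function of $P2$ can be given by $J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, f(\mathbf{F}_{RF}, \mathbf{W}_{RF}))$. Similarly, introducing a notation $\mathbf{F}_{RF,opt} = g(\mathbf{W}_{RF})$ to represent the optimal solution of $P2$, the objective function of $P3$ is $J(\mathbf{W}_{RF}, g(\mathbf{W}_{RF}), f(g(\mathbf{W}_{RF}), \mathbf{W}_{RF}))$. The following theorem shows the relationship between the optimal solutions of $P1$, $P2$, $P3$ and the optimal solution of (6).

Theorem 1. If $\hat{\mathbf{F}}_{P,opt} = f(\mathbf{F}_{RF}, \mathbf{W}_{RF})$, $\mathbf{F}_{RF,opt} = g(\mathbf{W}_{RF})$ and $\mathbf{W}_{RF,opt}$ are the optimal solutions of $P1$, $P2$ and $P3$ in (7) respectively, then $\{\mathbf{W}_{RF,opt},\ g(\mathbf{W}_{RF,opt}),\ f(g(\mathbf{W}_{RF,opt}), \mathbf{W}_{RF,opt})\}$ is the optimal solution of (6).

Proof 1. For arbitrary matrix variables $\hat{\mathbf{F}}_P$, $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$ that satisfy the constraints in (7), the following inequalities are true

$$J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, f(\mathbf{F}_{RF}, \mathbf{W}_{RF})) \ge J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P) \tag{8}$$

$$J(\mathbf{W}_{RF}, g(\mathbf{W}_{RF}), f(g(\mathbf{W}_{RF}), \mathbf{W}_{RF})) \ge J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, f(\mathbf{F}_{RF}, \mathbf{W}_{RF})) \tag{9}$$

$$J(\mathbf{W}_{RF,opt}, g(\mathbf{W}_{RF,opt}), f(g(\mathbf{W}_{RF,opt}), \mathbf{W}_{RF,opt})) \ge J(\mathbf{W}_{RF}, g(\mathbf{W}_{RF}), f(g(\mathbf{W}_{RF}), \mathbf{W}_{RF})). \tag{10}$$

Combining (8), (9) and (10), it can be easily derived that $J(\mathbf{W}_{RF,opt}, g(\mathbf{W}_{RF,opt}), f(g(\mathbf{W}_{RF,opt}), \mathbf{W}_{RF,opt})) \ge J(\mathbf{W}_{RF}, \mathbf{F}_{RF}, \hat{\mathbf{F}}_P)$. This completes the proof of Theorem 1.

According to Theorem 1, the optimal solution of (6) can be obtained via three stages. Based on this idea, the optimization of the digital precoding matrix $\hat{\mathbf{F}}_P$, the RF precoding matrix $\mathbf{F}_{RF}$ and the RF combining matrix $\mathbf{W}_{RF}$ will be respectively discussed in the following.

² The LQD is equivalent to the well-known QR decomposition, i.e., $\mathbf{H} = \mathbf{L}\mathbf{Q} \Leftrightarrow \mathbf{H}^H = \mathbf{Q}^H\mathbf{R}$.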
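Equations (1) and (3) and the lower bound used to pass from (4) to (5) can be sketched numerically. The snippet below assumes a hypothetical $3 \times 3$ lower triangular factor `L_e` with a real positive diagonal and an arbitrary noise power; base-2 logarithms are an assumption made here for concreteness. The helper names are illustrative only.

```python
import math

def th_matrices(L_e):
    """Return diag(G_e), B = G_e^{-1} L_e - I and W_BB = G_e^{-1} for lower triangular L_e."""
    K = len(L_e)
    g = [L_e[k][k] for k in range(K)]  # diagonal entries of G_e
    B = [[L_e[i][j] / g[i] - (1.0 if i == j else 0.0) for j in range(K)]
         for i in range(K)]
    W_BB = [[(1.0 / g[i] if i == j else 0.0) for j in range(K)]
            for i in range(K)]
    return g, B, W_BB

def sum_rate_and_bound(g, noise_power):
    """log2|I + G_e^2/sigma^2| and its lower bound log2|G_e^2/sigma^2| (G_e diagonal)."""
    rate = sum(math.log2(1.0 + abs(x) ** 2 / noise_power) for x in g)
    bound = sum(math.log2(abs(x) ** 2 / noise_power) for x in g)
    return rate, bound

# Hypothetical effective-channel factor (assumption, not from the paper).
L_e = [[2.0, 0.0, 0.0],
       [0.5 - 1.0j, 1.5, 0.0],
       [-0.3 + 0.2j, 0.7j, 1.0]]

g, B, W_BB = th_matrices(L_e)
rate, bound = sum_rate_and_bound(g, noise_power=0.1)

# W_BB * L_e should equal I + B, i.e. the receiver sees a unit-diagonal
# lower triangular channel whose interference the feedback loop pre-cancels.
K = len(L_e)
scaled = [[sum(W_BB[i][k] * L_e[k][j] for k in range(K)) for j in range(K)]
          for i in range(K)]
```

Because each term satisfies $\log_2(1 + x) > \log_2 x$ for $x > 0$, `bound` never exceeds `rate`, and the gap shrinks as the diagonal entries grow relative to the noise power.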
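3.2. Optimization of the digital precoding matrix $\hat{\mathbf{F}}_P$
Firstly, let us focus on the problem $P1$ of the digital precoding matrix $\hat{\mathbf{F}}_P$. To solve $P1$, the following definition and lemma are essential.

Definition 1. For a matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$ ($m \le n$), the decomposition of $\mathbf{A}$ into $\mathbf{A} = \mathbf{P}\mathbf{R}$ is called a row orthogonal decomposition (ROD) when $\mathbf{R} \in \mathbb{C}^{m \times n}$ satisfies $\mathbf{R}\mathbf{R}^H = \mathbf{I}$, and the matrix $\mathbf{R}$ is called the row orthogonal matrix of $\mathbf{A}$.

Lemma 1. If $\mathbf{A} \in \mathbb{C}^{m \times n}$ ($m \le n$) has full row rank, then there must be an ROD $\mathbf{A} = \mathbf{P}\mathbf{R}$ such that $\mathbf{P} \in \mathbb{C}^{m \times m}$ is invertible.

Proof 2. Considering the SVD $\mathbf{A} = \mathbf{U}\,[\boldsymbol{\Sigma}\ \ \mathbf{0}]\,[\mathbf{V}_1\ \ \mathbf{V}_2]^H$, let $\mathbf{P} = \mathbf{U}\boldsymbol{\Sigma}$ and $\mathbf{R} = \mathbf{V}_1^H$; we have $\mathbf{A} = \mathbf{P}\mathbf{R}$ and $\mathbf{R}\mathbf{R}^H = \mathbf{I}$. Because $\mathbf{A}$ has full row rank, $\boldsymbol{\Sigma}$ is invertible, and hence $\mathbf{P}$ is also invertible. This completes the proof of Lemma 1.

With the definitions of the ROD and the row orthogonal matrix, the optimal solution of the problem $P1$ in (7) can be given by the following theorem.

Theorem 2. For the optimization problem $P1$ in (7), if the matrix $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}} \in \mathbb{C}^{K \times N_{RF}}$ has full row rank, then the following statements are true.
1) $\hat{\mathbf{F}}_{P,opt} \in \mathbb{C}^{N_{RF} \times K}$ is the optimal solution of the problem $P1$ if and only if $\sqrt{\frac{K}{\alpha P}}\,\hat{\mathbf{F}}_{P,opt}^H$ is the row orthogonal matrix of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$.
2) The maximum of the problem $P1$ is given by $K\log\frac{\alpha P}{K} + \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-1}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|$.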
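Definition 1 and the scaling in Theorem 2 can be illustrated with a small standard-library sketch. It computes one possible ROD by classical Gram-Schmidt on the rows of a full-row-rank matrix, which yields an LQ-type factorization with a lower triangular $\mathbf{P}$. The matrix `A`, the values of `alpha` and `P_tx`, and the function name `rod_lq` are hypothetical and chosen only for illustration; the routine assumes full row rank and would divide by zero otherwise.

```python
import math

def rod_lq(A):
    """Row orthogonal decomposition A = P R via Gram-Schmidt on the rows of A.

    Returns a lower triangular P (m x m) and R (m x n) with R R^H = I.
    Assumes A has full row rank.
    """
    m, n = len(A), len(A[0])
    P = [[0j] * m for _ in range(m)]
    R = []
    for i in range(m):
        v = list(A[i])
        for k in range(i):
            # Coefficient of row i along the k-th orthonormal row: A_i R_k^H.
            c = sum(A[i][t] * R[k][t].conjugate() for t in range(n))
            P[i][k] = c
            v = [v[t] - c * R[k][t] for t in range(n)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        P[i][i] = complex(norm)
        R.append([x / norm for x in v])
    return P, R

# Hypothetical 2 x 3 full-row-rank matrix (assumption, not from the paper).
A = [[1 + 1j, 2 + 0j, -1j],
     [0.5 + 0j, 1 - 2j, 3 + 0j]]
P, R = rod_lq(A)
m, n = len(A), len(A[0])

# Reconstruction P R and the Gram matrix R R^H for checking the definition.
PR = [[sum(P[i][k] * R[k][t] for k in range(m)) for t in range(n)] for i in range(m)]
RRh = [[sum(R[i][t] * R[j][t].conjugate() for t in range(n)) for j in range(m)]
       for i in range(m)]

# Scaling as in Theorem 2: F_hat = sqrt(alpha * P_tx / K) * R^H has squared
# Frobenius norm alpha * P_tx, i.e. it meets constraint C1 with equality.
alpha, P_tx, K = 15.0 / 16.0, 2.0, m
scale = math.sqrt(alpha * P_tx / K)
F_hat = [[scale * R[k][t].conjugate() for k in range(m)] for t in range(n)]
F_hat_power = sum(abs(x) ** 2 for row in F_hat for x in row)
```

Any other decomposition with orthonormal rows, such as the SVD-based one used in the proof of Lemma 1, would satisfy Definition 1 equally well; the Gram-Schmidt route is used here only because it keeps $\mathbf{P}$ triangular.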
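Proof 3. The proof is shown in Appendix A.

According to Theorem 2, for fixed RF precoding and combining matrices $\mathbf{F}_{RF}$ and $\mathbf{W}_{RF}$, $\hat{\mathbf{F}}_{P,opt}$ can be derived by the ROD of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$, and the digital precoding matrix $\mathbf{F}_P$ can be given by

$$\mathbf{F}_P = (\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\hat{\mathbf{F}}_{P,opt}. \tag{11}$$

3.3. Design of the RF precoding matrix $\mathbf{F}_{RF}$
With the digital precoding matrix $\mathbf{F}_P$ given by (11), the optimization problem $P2$ of the RF precoding matrix $\mathbf{F}_{RF}$ can be rewritten as

$$P2:\ \max_{\mathbf{F}_{RF}} \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-1}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right| \quad \text{s.t.}\ |[\mathbf{F}_{RF}]_{i,j}| = \frac{1}{\sqrt{N_t}}.$$

Unfortunately, the constant modulus constraint renders the problem $P2$ non-convex. Inspired by [29], the constant modulus constraint is ignored at first, and then the unconstrained optimal solution is projected onto the constant modulus set. Namely, the problem $P2$ is approximately transformed into the following two sequential problems

$$\mathbf{F}_{RF,opt}^{uncons} = \arg\max_{\mathbf{F}_{RF}} \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-1}\mathbf{F}_{RF}^H\mathbf{H}^H\mathbf{W}_{RF}\right|, \tag{12}$$

$$\mathbf{F}_{RF,opt} = \arg\min_{\mathbf{F}_{RF}} \left\|\mathbf{F}_{RF} - \mathbf{F}_{RF,opt}^{uncons}\right\|_F^2 \quad \text{s.t.}\ |[\mathbf{F}_{RF}]_{i,j}| = \frac{1}{\sqrt{N_t}}. \tag{13}$$

The optimal solution of (12) can be given by the following theorem.

Theorem 3. For the full row rank matrix $\mathbf{W}_{RF}^H\mathbf{H} \in \mathbb{C}^{K \times N_t}$ ($K \le N_t$) with the ROD $\mathbf{W}_{RF}^H\mathbf{H} = \mathbf{P}\mathbf{R}$, the following statements are true.
1) $\mathbf{F}_{RF,opt}^{uncons} \in \mathbb{C}^{N_t \times N_{RF}}$ is the optimal solution of (12) if and only if $\mathbf{R}\mathbf{F}_{RF,opt}^{uncons}\big((\mathbf{F}_{RF,opt}^{uncons})^H\mathbf{F}_{RF,opt}^{uncons}\big)^{-1}(\mathbf{F}_{RF,opt}^{uncons})^H\mathbf{R}^H = \mathbf{I}$.
2) The maximum of the problem (12) is given by $\log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right|$.

Proof 4. The proof is shown in Appendix B.

According to Theorem 3, a feasible optimal solution of (12) can be given by $\mathbf{F}_{RF,opt}^{uncons} = [\mathbf{R}^H\ \ \hat{\mathbf{R}}^H]$, where $\mathbf{R} \in \mathbb{C}^{K \times N_t}$ is the row orthogonal matrix of $\mathbf{W}_{RF}^H\mathbf{H}$, and $\hat{\mathbf{R}}^H$ is an $N_t \times (N_{RF} - K)$ matrix whose column vectors are orthonormal vectors of the nullspace $\mathcal{N}(\mathbf{R})$. Via the Gram-Schmidt process, such a matrix $\hat{\mathbf{R}}$ can be easily obtained. Then, invoking [29], the optimal solution of (13) can be expressed as $\mathbf{F}_{RF,opt} = \frac{1}{\sqrt{N_t}}e^{j\angle\mathbf{F}_{RF,opt}^{uncons}}$. Namely, the solution of $P2$ can be approximately given by

$$\mathbf{F}_{RF} = \frac{1}{\sqrt{N_t}}\left[ e^{j\angle\mathbf{R}^H}\ \ e^{j\angle\hat{\mathbf{R}}^H} \right]. \tag{14}$$

3.4. Design of the analog combining matrix $\mathbf{W}_{RF}$
After the design of the digital and RF precoding matrices $\hat{\mathbf{F}}_P$ and $\mathbf{F}_{RF}$, we finally focus on the problem $P3$ of the RF combining matrix $\mathbf{W}_{RF}$. By utilizing the RF precoding matrix given by (14), the problem $P3$ can be expressed as

$$P3:\ \max_{\mathbf{W}_{RF}} \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{Z}\mathbf{H}^H\mathbf{W}_{RF}\right| \quad \text{s.t.}\ |\mathbf{w}_{RF,k}(m)| = \frac{1}{\sqrt{N_r}},\ \forall k, m \tag{15}$$

where

$$\mathbf{Z} = \frac{1}{N_t}\left[ e^{j\angle\mathbf{R}^H}\ \ e^{j\angle\hat{\mathbf{R}}^H} \right]\mathbf{Z}_0^{-1}\begin{bmatrix} e^{j\angle\mathbf{R}} \\ e^{j\angle\hat{\mathbf{R}}} \end{bmatrix}, \qquad \mathbf{Z}_0 = \frac{1}{N_t}\begin{bmatrix} e^{j\angle\mathbf{R}}e^{j\angle\mathbf{R}^H} & e^{j\angle\mathbf{R}}e^{j\angle\hat{\mathbf{R}}^H} \\ e^{j\angle\hat{\mathbf{R}}}e^{j\angle\mathbf{R}^H} & e^{j\angle\hat{\mathbf{R}}}e^{j\angle\hat{\mathbf{R}}^H} \end{bmatrix}.$$

However, due to the intractability of the matrix $\mathbf{Z}$, it is difficult to solve the problem (15) directly. Fortunately, Theorem 3 provides a reasonable upper bound³ of (15). By adopting the upper bound as a surrogate objective function, the problem $P3$ can be reformulated as

$$\max_{\mathbf{W}_{RF}} \log\left|\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}\right| \quad \text{s.t.}\ |\mathbf{w}_{RF,k}(m)| = \frac{1}{\sqrt{N_r}},\ \forall k, m. \tag{16}$$

Considering the block-diagonal structure of $\mathbf{W}_{RF} = \mathrm{diag}\{\mathbf{w}_{RF,1}, \mathbf{w}_{RF,2}, \cdots, \mathbf{w}_{RF,K}\}$, the objective function of (16) can be expanded as

$$\log\left|\begin{bmatrix} \mathbf{w}_{RF,1}^H\mathbf{H}_1\mathbf{H}_1^H\mathbf{w}_{RF,1} & \cdots & \mathbf{w}_{RF,1}^H\mathbf{H}_1\mathbf{H}_K^H\mathbf{w}_{RF,K} \\ \vdots & \ddots & \vdots \\ \mathbf{w}_{RF,K}^H\mathbf{H}_K\mathbf{H}_1^H\mathbf{w}_{RF,1} & \cdots & \mathbf{w}_{RF,K}^H\mathbf{H}_K\mathbf{H}_K^H\mathbf{w}_{RF,K} \end{bmatrix}\right|.$$

³ The diagonal entries of $\mathbf{Z}_0$ are 1 and the off-diagonal entries are of the form $\frac{1}{N_t}\sum_{i=1}^{N_t}e^{j\beta_i}$. When $N_t$ is large, the off-diagonal entries of $\mathbf{Z}_0$ are far smaller than 1 with high probability [30]. Therefore, the matrix $\mathbf{Z}_0$ can be approximated as $\mathbf{I}$. Accordingly, we have $\mathbf{Z} \approx \frac{1}{N_t}\left( e^{j\angle\mathbf{R}^H}e^{j\angle\mathbf{R}} + e^{j\angle\hat{\mathbf{R}}^H}e^{j\angle\hat{\mathbf{R}}} \right)$. Similarly, the matrix $\mathbf{Z}$ can also be approximated as $\frac{N_{RF}}{N_t}\mathbf{I}$ when $N_{RF}$ is sufficiently large. In a nutshell, (16) is the upper bound of (15), and the gap between (16) and (15) goes to zero as $N_t$ and $N_{RF}$ increase.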
Invoking the asymptotic orthogonality of different user channels in a large scale regime [31], the off-diagonal elements of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{H}^H\mathbf{W}_{RF}$ are very small with high probability; therefore, the problem (16) can be approximately simplified to $K$ independent problems which have the same formulation

$$\max_{\mathbf{w}_{RF,k}} \mathbf{w}_{RF,k}^H\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k} \quad \text{s.t.}\ |\mathbf{w}_{RF,k}(m)| = \frac{1}{\sqrt{N_r}},\ \forall k, m. \tag{17}$$

The problem (17) is a unimodular quadratic program, which is in general NP-hard [32], so the optimal solution cannot be expected to be found in polynomial time. Fortunately, by making use of the generalized power method proposed in [32,33], the problem (17) can be approximately solved with the following iteration

$$\mathbf{w}_{RF,k}^{(n+1)} = \frac{1}{\sqrt{N_r}}\, e^{j\angle\left(\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}^{(n)}\right)} \tag{18}$$

where $n$ is the iteration index.

3.5. Computation of the ROD
Now, the problem (7) has been solved: the digital precoding matrix $\mathbf{F}_P$, the analog precoding matrix $\mathbf{F}_{RF}$ and the analog combining matrix $\mathbf{W}_{RF}$ are given by (11), (14) and (18), respectively. However, the ROD involved in (11) and (14) is not unique; many existing decomposition methods fulfill the definition of the ROD, such as the LQD, the SVD, the geometric mean decomposition, etc. It is noteworthy that if the LQD is used to calculate the row orthogonal matrix of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$, we have $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}} = \mathbf{P}\mathbf{R}$ where $\mathbf{P}$ is a lower triangular matrix and $\mathbf{F}_P = \sqrt{\frac{(M-1)P}{MK}}\,(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{R}^H$. Thereby, the effective channel involved in (1), (2) and (3) can be expressed as

$$\mathbf{H}_e = \mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}\mathbf{F}_P = \sqrt{\frac{\alpha P}{K}}\,\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{R}^H = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P}\mathbf{R}\mathbf{R}^H = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P},$$

namely, the effective channel $\mathbf{H}_e$ is a lower triangular matrix. Thus, the LQD of the effective channel $\mathbf{H}_e = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P} = \mathbf{L}_e\mathbf{Q}_e$ in (1), (2) and (3) no longer needs to be calculated and can be straightforwardly given by $\mathbf{L}_e = \sqrt{\frac{\alpha P}{K}}\,\mathbf{P}$ and $\mathbf{Q}_e = \mathbf{I}$. In other words, compared with other decomposition methods, computing the row orthogonal matrix via the LQD reduces the computational cost of the proposed algorithm effectively. Accordingly, Table 1 provides the pseudo-code of the proposed ROD-based hybrid TH precoding and combining algorithm.

Table 1. Pseudo-code of the proposed algorithm.
Proposed ROD-based hybrid TH precoding and combining algorithm
Input: the value of $M$ in $M$-QAM, the number of users $K$, transmit power $P$, channel matrices $\mathbf{H}_1, \mathbf{H}_2, \cdots, \mathbf{H}_K$, iteration threshold $\epsilon$.
Design of the RF combining matrix $\mathbf{W}_{RF}$
  Initialize combining vectors $\mathbf{w}_{RF,1}^{(0)}, \cdots, \mathbf{w}_{RF,K}^{(0)}$
  for $k = 1, 2, \cdots, K$
    Initialize iteration index $n = 0$
    repeat
      $n = n + 1$
      $\mathbf{w}_{RF,k}^{(n)} = \frac{1}{\sqrt{N_r}}\, e^{j\angle\left(\mathbf{H}_k\mathbf{H}_k^H\mathbf{w}_{RF,k}^{(n-1)}\right)}$
    until $\left\|\mathbf{w}_{RF,k}^{(n)} - \mathbf{w}_{RF,k}^{(n-1)}\right\| < \epsilon$
  end for
  $\mathbf{W}_{RF} = \mathrm{diag}\{\mathbf{w}_{RF,1}, \cdots, \mathbf{w}_{RF,K}\}$
Design of the RF precoding matrix $\mathbf{F}_{RF}$
  $\mathbf{H} = [\mathbf{H}_1^H, \mathbf{H}_2^H, \cdots, \mathbf{H}_K^H]^H$
  Compute the ROD $\mathbf{W}_{RF}^H\mathbf{H} = \mathbf{P}_1\mathbf{R}_1$ via LQD
  for $k = 1, 2, \cdots, N_{RF} - K$
    Generate a random $N_t \times 1$ vector $\hat{\mathbf{r}}_k$
    $\mathbf{r}_k = \dfrac{(\mathbf{I} - \mathbf{R}_1^H\mathbf{R}_1)\hat{\mathbf{r}}_k}{\left\|(\mathbf{I} - \mathbf{R}_1^H\mathbf{R}_1)\hat{\mathbf{r}}_k\right\|}$
    $\mathbf{R}_1 = \begin{bmatrix}\mathbf{R}_1 \\ \mathbf{r}_k^H\end{bmatrix}$
  end for
  $\mathbf{F}_{RF} = \frac{1}{\sqrt{N_t}}\, e^{j\angle\mathbf{R}_1^H}$
Design of the digital precoding matrix $\mathbf{F}_P$
  Compute the ROD $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}} = \mathbf{P}_2\mathbf{R}_2$ via LQD
  $\mathbf{F}_P = \sqrt{\frac{\alpha P}{K}}\,(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}\mathbf{R}_2^H$
Design of the feedback matrix $\mathbf{B}$, the digital precoding matrix $\mathbf{F}_D$ and the digital combining matrix $\mathbf{W}_{BB}$
  $\mathbf{B} = \mathrm{diag}\{\mathbf{P}_2\}^{-1}\mathbf{P}_2 - \mathbf{I}$
  $\mathbf{F}_D = \mathbf{I}$
  $\mathbf{W}_{BB} = \sqrt{\frac{K}{\alpha P}}\,\mathrm{diag}\{\mathbf{P}_2\}^{-1}$
Output: $\mathbf{B}$, $\mathbf{F}_D$, $\mathbf{F}_P$, $\mathbf{F}_{RF}$, $\mathbf{W}_{RF}$, $\mathbf{W}_{BB}$.

3.6. Complexity analysis
In this section, the complexity of the proposed algorithm is analyzed and compared with several prevalent hybrid precoding methods. According to Table 1, the RF combining matrix $\mathbf{W}_{RF}$ is firstly determined by the generalized power method, which is of complexity $O(\max\{K N_r^2 N_t, K N_r^2 n\})$, where $n$ denotes the iteration number of the generalized power method. It will be shown in the next section that the iteration number $n$ is far smaller than the number of transmit antennas $N_t$ under normal operating conditions, hence the complexity involved in computing the RF combining matrix $\mathbf{W}_{RF}$ is of order $O(K N_r^2 N_t)$. Then, the RF precoding matrix $\mathbf{F}_{RF}$ is obtained via the LQD of $\mathbf{W}_{RF}^H\mathbf{H}$ and a Gram-Schmidt process, which is of complexity $O(\max\{K N_r N_t, K^2 N_t, N_t(N_{RF} + K)(N_{RF} - K)\})$. Furthermore, the digital precoding matrix $\mathbf{F}_P$ is given by the LQD of $\mathbf{W}_{RF}^H\mathbf{H}\mathbf{F}_{RF}(\mathbf{F}_{RF}^H\mathbf{F}_{RF})^{-\frac{1}{2}}$, which is of order $O(N_{RF}^2 N_t)$. Finally, the calculation of the feedback matrix $\mathbf{B}$ is of complexity $O(K^2)$. Consequently, the overall complexity of the proposed algorithm is of order $O(\max\{K N_r^2 N_t, N_{RF}^2 N_t\})$.

Table 2 compares the complexities of the proposed algorithm and several previous works in the special case of $N_{RF} = K$, since several previous methods can only be performed in such a special case. Evidently, if $K < N_r^2$, the complexities of the proposed algorithm and the hybrid ZF precoding algorithm are of the same order $O(K N_r^2 N_t)$, while the complexities of the hybrid BD precoding algorithm and the hybrid BD-GMD TH precoding algorithm are of orders $O(\max\{K N_r^2 N_t, K^4\})$ and $O(\max\{K N_r^2 N_t, K^2 N_t^2\})$ respectively, which are higher than or equal to the complexity of the proposed algorithm. By contrast, when $K > N_r^2$, the proposed algorithm and the hybrid ZF precoding algorithm are of the same complexity $O(K^2 N_t)$, while the hybrid BD precoding algorithm is of complexity $O(\max\{K^2 N_t, K^4\})$, which is still higher than or equal to that of the proposed algorithm. Besides, the complexity of the hybrid BD-GMD TH precoding algorithm is of order $O(K^2 N_t^2)$; compared with the proposed algorithm, the computational cost of the hybrid BD-GMD TH precoding algorithm is higher by a factor of $N_t$.

Table 2. Complexities of the proposed algorithm and several previous works.
Algorithm                              Complexity
Proposed hybrid TH precoding           $O(\max\{K N_r^2 N_t, K^2 N_t\})$
Hybrid ZF precoding [13]               $O(\max\{K N_r^2 N_t, K^2 N_t\})$
Hybrid BD precoding [14]               $O(\max\{K N_r^2 N_t, K^2 N_t, K^4\})$
Hybrid BD-GMD TH precoding [21]        $O(\max\{K N_r^2 N_t, K^2 N_t^2\})$
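The generalized power iteration (18), which dominates the first stage of Table 1, can be sketched in plain Python as follows. The example channel is a hypothetical rank-one LOS-like matrix built from unit-modulus steering vectors with arbitrary phase slopes; the function names, the all-equal initial vector, the default threshold and the iteration cap are assumptions for illustration, not choices made in the paper.

```python
import cmath
import math

def gram(H_k):
    """G = H_k H_k^H for a channel given as a list of N_r rows of length N_t."""
    N_r, N_t = len(H_k), len(H_k[0])
    return [[sum(H_k[a][t] * H_k[b][t].conjugate() for t in range(N_t))
             for b in range(N_r)] for a in range(N_r)]

def objective(G, w):
    """Quadratic form w^H G w, which is real for Hermitian G."""
    n = len(w)
    return sum(w[a].conjugate() * G[a][b] * w[b]
               for a in range(n) for b in range(n)).real

def generalized_power_method(H_k, eps=1e-6, max_iter=1000):
    """Iterate w <- exp(j * angle(G w)) / sqrt(N_r) until the update is below eps."""
    G = gram(H_k)
    N_r = len(G)
    w = [complex(1.0 / math.sqrt(N_r))] * N_r  # assumed initial vector
    n = 0
    for n in range(1, max_iter + 1):
        v = [sum(G[a][b] * w[b] for b in range(N_r)) for a in range(N_r)]
        # cmath.phase(0) is 0, so a zero entry maps to phase 0 (an arbitrary choice).
        w_new = [cmath.exp(1j * cmath.phase(x)) / math.sqrt(N_r) for x in v]
        diff = math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(w_new, w)))
        w = w_new
        if diff < eps:
            break
    return w, n

# Hypothetical rank-one channel H_k = a_r a_t^H (assumption for illustration).
N_r, N_t = 4, 8
a_r = [cmath.exp(1j * math.pi * 0.3 * i) for i in range(N_r)]
a_t = [cmath.exp(1j * math.pi * 0.7 * i) for i in range(N_t)]
H_k = [[a_r[a] * a_t[t].conjugate() for t in range(N_t)] for a in range(N_r)]

w, n_iter = generalized_power_method(H_k)
G = gram(H_k)
w0 = [complex(1.0 / math.sqrt(N_r))] * N_r
gain_initial = objective(G, w0)
gain_final = objective(G, w)
```

For this rank-one example the combiner aligns with the receive steering vector up to a common phase, so `gain_final` reaches $N_r N_t$; for general channels the iteration only guarantees a constant-modulus vector whose objective does not decrease, since $\mathbf{H}_k\mathbf{H}_k^H$ is positive semidefinite.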
X. Bai et al. / Digital Signal Processing 93 (2019) 102–114
4. Asymptotic sum rates analysis

The asymptotic sum rates of the proposed ROD-based hybrid TH precoding and combining algorithm are analyzed in this section. For simplicity, we consider the LOS component and the NLOS component separately. The sum rates achieved by cooperative users serve as a benchmark. For the LOS component, the asymptotic sum rates loss of the proposed ROD-based hybrid TH precoding algorithm is characterized by the following theorem.

Theorem 4. For a multi-user mmWave MIMO system as shown in Fig. 1 under an LOS channel environment, the achievable sum rates $R$ obtained by the proposed ROD-based hybrid TH precoding and combining algorithm have the property that

$$\lim_{\mathrm{SNR}\to\infty,\; N_t\to\infty} \left( R - R_{coop} \right) = K \log_2 \alpha \tag{19}$$
where $R_{coop}$ stands for the optimal sum rates given by cooperative users under equal power allocation.

Proof 5. The optimal sum rates given by cooperative users under equal power allocation can be expressed as $R_{coop} = \log_2 \left| \mathbf{I} + \frac{P}{K \sigma_n^2} \boldsymbol{\Lambda} \right|$, where $P$ is the transmit power and $\boldsymbol{\Lambda}$ is a diagonal matrix with diagonal elements given by the largest $K$ eigenvalues of $\mathbf{H}^H \mathbf{H}$. Accordingly, (19) can be rewritten as
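$$
\begin{aligned}
\lim_{\mathrm{SNR}\to\infty,\; N_t\to\infty} \left( R - R_{coop} \right)
&= \lim_{\mathrm{SNR}\to\infty,\; N_t\to\infty} \log_2 \frac{\left| \mathbf{I} + \frac{1}{\sigma_n^2} \mathbf{G}_e^2 \right|}{\left| \mathbf{I} + \frac{P}{K \sigma_n^2} \boldsymbol{\Lambda} \right|}
\overset{(a)}{=} \lim_{N_t\to\infty} \log_2 \frac{\left| \mathbf{G}_e^2 \right|}{\left| \frac{P}{K} \boldsymbol{\Lambda} \right|} \\
&= \lim_{N_t\to\infty} \log_2 \frac{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_P \mathbf{F}_P^H \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} \right|}{\left| \frac{P}{K} \boldsymbol{\Lambda} \right|} \\
&\overset{(b)}{=} \lim_{N_t\to\infty} \log_2 \frac{\alpha^K \left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} (\mathbf{F}_{RF}^H \mathbf{F}_{RF})^{-1} \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} \right|}{\left| \boldsymbol{\Lambda} \right|} \\
&\overset{(c)}{=} K \log_2 \alpha + \lim_{N_t\to\infty} \log_2 \frac{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} \right|}{\left| \boldsymbol{\Lambda} \right|}
\end{aligned}
\tag{20}
$$

where (a) follows $\lim_{a\to\infty} \frac{|\mathbf{I} + a \mathbf{G}_e^2|}{|\mathbf{I} + a \boldsymbol{\Lambda}|} = \frac{|\mathbf{G}_e^2|}{|\boldsymbol{\Lambda}|}$, (b) follows Theorem 2, and (c) follows $\lim_{N_t\to\infty} \mathbf{F}_{RF}^H \mathbf{F}_{RF} = \mathbf{I}$ [10,30].

For the LOS channel environment, we have $\mathbf{H} = \left[ \mathbf{H}_1^H \; \cdots \; \mathbf{H}_K^H \right]^H$, $\mathbf{H}_k = \alpha_k \mathbf{a}_r(\phi_k^r, \theta_k^r) \mathbf{a}_t(\phi_k^t, \theta_k^t)^H$. According to the proposed algorithm, for an arbitrary initial vector $\mathbf{w}_{RF,k}^{(0)}$, the combining vector $\mathbf{w}_{RF,k}$ can be given by

$$
\begin{aligned}
\mathbf{w}_{RF,k}^{(1)} &= \frac{1}{\sqrt{N_r}} e^{j \angle \left( |\alpha_k|^2 \mathbf{a}_r(\phi_k^r, \theta_k^r) \mathbf{a}_t(\phi_k^t, \theta_k^t)^H \mathbf{a}_t(\phi_k^t, \theta_k^t) \mathbf{a}_r(\phi_k^r, \theta_k^r)^H \mathbf{w}_{RF,k}^{(0)} \right)}
= \frac{1}{\sqrt{N_r}} e^{j \varphi_k^r} \mathbf{a}_r(\phi_k^r, \theta_k^r) \\
\mathbf{w}_{RF,k}^{(2)} &= \frac{1}{\sqrt{N_r}} e^{j \angle \left( |\alpha_k|^2 \mathbf{a}_r(\phi_k^r, \theta_k^r) \mathbf{a}_t(\phi_k^t, \theta_k^t)^H \mathbf{a}_t(\phi_k^t, \theta_k^t) \mathbf{a}_r(\phi_k^r, \theta_k^r)^H \frac{1}{\sqrt{N_r}} e^{j \varphi_k^r} \mathbf{a}_r(\phi_k^r, \theta_k^r) \right)}
= \frac{1}{\sqrt{N_r}} e^{j \varphi_k^r} \mathbf{a}_r(\phi_k^r, \theta_k^r)
\end{aligned}
$$

where $\varphi_k^r$ is the argument of $\mathbf{a}_r(\phi_k^r, \theta_k^r)^H \mathbf{w}_{RF,k}^{(0)}$. Accordingly, the combining matrix $\mathbf{W}_{RF}$ can be given by

$$\mathbf{W}_{RF} = \frac{1}{\sqrt{N_r}} \mathrm{diag}\left\{ e^{j \varphi_1^r} \mathbf{a}_r(\phi_1^r, \theta_1^r), \cdots, e^{j \varphi_K^r} \mathbf{a}_r(\phi_K^r, \theta_K^r) \right\}.$$

Then, we have $\mathbf{W}_{RF}^H \mathbf{H} = \sqrt{N_r} \left[ e^{j \varphi_1^r} \alpha_1^{*} \mathbf{a}_t(\phi_1^t, \theta_1^t) \; \cdots \; e^{j \varphi_K^r} \alpha_K^{*} \mathbf{a}_t(\phi_K^t, \theta_K^t) \right]^H$. Invoking the property $\lim_{N_t\to\infty} \frac{1}{N_t} \mathbf{a}_t(\phi_i^t, \theta_i^t)^H \mathbf{a}_t(\phi_j^t, \theta_j^t) = 0$ $(i \neq j)$, when $N_t$ tends to infinity, the LQD of $\mathbf{W}_{RF}^H \mathbf{H}$ can be given by

$$\lim_{N_t\to\infty} \mathbf{W}_{RF}^H \mathbf{H} = \lim_{N_t\to\infty} \sqrt{N_t N_r}\, \mathrm{diag}\left\{ e^{j \varphi_1} \alpha_1, \cdots, e^{j \varphi_K} \alpha_K \right\} \times \frac{1}{\sqrt{N_t}} \left[ e^{j \varphi_1^t} \mathbf{a}_t(\phi_1^t, \theta_1^t) \; \cdots \; e^{j \varphi_K^t} \mathbf{a}_t(\phi_K^t, \theta_K^t) \right]^H$$

where $\varphi_k^t$ $(k = 1, \cdots, K)$ is any real number and $\varphi_k = \varphi_k^t - \varphi_k^r$. According to (14), the analog precoding matrix $\mathbf{F}_{RF}$ can be expressed as

$$\mathbf{F}_{RF} = \frac{1}{\sqrt{N_t}} \left[ e^{j \varphi_1^t} \mathbf{a}_t(\phi_1^t, \theta_1^t) \; \cdots \; e^{j \varphi_K^t} \mathbf{a}_t(\phi_K^t, \theta_K^t) \;\; \hat{\mathbf{r}}_1 \; \cdots \; \hat{\mathbf{r}}_{N_{RF} - K} \right]$$

where $\hat{\mathbf{r}}_i$ $(i = 1, \cdots, N_{RF} - K)$ is an $N_t \times 1$ vector with constant modulus entries. It is easy to know that the inner product $\frac{1}{N_t} \mathbf{a}_t(\phi_i^t, \theta_i^t)^H \hat{\mathbf{r}}_j$ $(\forall i, j)$ is of the form $\frac{1}{N_t} \sum_{n=1}^{N_t} e^{j \beta_n}$. When $N_t$ is large, it can be readily seen that $\lim_{N_t\to\infty} \frac{1}{N_t} \mathbf{a}_t(\phi_i^t, \theta_i^t)^H \hat{\mathbf{r}}_j = 0$. Therefore, we have

$$\lim_{N_t\to\infty} \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} = \lim_{N_t\to\infty} N_t N_r\, \mathrm{diag}\left\{ |\alpha_1|^2, \cdots, |\alpha_K|^2 \right\}. \tag{21}$$

With respect to the matrix $\boldsymbol{\Lambda}$ in (20), we recall that $\boldsymbol{\Lambda}$ is a diagonal matrix whose diagonal elements are the largest $K$ eigenvalues of $\mathbf{H}^H \mathbf{H}$. When $N_t$ tends to infinity, the matrix $\mathbf{H} \mathbf{H}^H$ can be given by

$$
\begin{aligned}
\lim_{N_t\to\infty} \mathbf{H} \mathbf{H}^H
&= \lim_{N_t\to\infty} N_t\, \mathrm{diag}\left\{ |\alpha_1|^2 \mathbf{a}_r(\phi_1^r, \theta_1^r) \mathbf{a}_r(\phi_1^r, \theta_1^r)^H, \cdots, |\alpha_K|^2 \mathbf{a}_r(\phi_K^r, \theta_K^r) \mathbf{a}_r(\phi_K^r, \theta_K^r)^H \right\} \\
&= \lim_{N_t\to\infty} N_t N_r \sum_{k=1}^{K} |\alpha_k|^2\, \tilde{\mathbf{a}}_r(\phi_k^r, \theta_k^r)\, \tilde{\mathbf{a}}_r(\phi_k^r, \theta_k^r)^H
\end{aligned}
\tag{22}
$$

where $\tilde{\mathbf{a}}_r(\phi_k^r, \theta_k^r) = \frac{1}{\sqrt{N_r}} \left[ \mathbf{0}_{1 \times (k-1) N_r} \;\; \mathbf{a}_r(\phi_k^r, \theta_k^r)^H \;\; \mathbf{0}_{1 \times (K-k) N_r} \right]^H$ is a unit-norm vector. Obviously, (22) can be regarded as the eigendecomposition of the matrix $\mathbf{H} \mathbf{H}^H$. Consequently, we have

$$\lim_{N_t\to\infty} |\boldsymbol{\Lambda}| = \lim_{N_t\to\infty} \left| N_t N_r\, \mathrm{diag}\left\{ |\alpha_1|^2, \cdots, |\alpha_K|^2 \right\} \right|. \tag{23}$$

Substituting (21) and (23) into (20), it is easy to obtain that

$$\lim_{\mathrm{SNR}\to\infty,\; N_t\to\infty} \left( R - R_{coop} \right) = K \log_2 \alpha.$$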
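The fixed-point behaviour of the combining vector used in the LOS argument can be checked numerically. The sketch below is a minimal illustration, assuming the power-method update has the phase-projection form appearing in the derivation, $\mathbf{w} \leftarrow \frac{1}{\sqrt{N_r}} e^{j \angle(\mathbf{H}_k \mathbf{H}_k^H \mathbf{w})}$; the helper names, the one-dimensional steering vectors and the numerical values are hypothetical and chosen only for the example.

```python
import numpy as np

def steering(n, phase_step):
    # Unit-modulus steering vector of a uniform linear array (assumed geometry)
    return np.exp(1j * phase_step * np.arange(n))

def power_iteration(gram, w0, n_iter=10):
    """Phase-projected power iteration keeping |w_i| = 1/sqrt(N_r)."""
    n_r = gram.shape[0]
    w = w0
    errors = []
    for _ in range(n_iter):
        w_new = np.exp(1j * np.angle(gram @ w)) / np.sqrt(n_r)
        errors.append(np.linalg.norm(w_new - w) ** 2)  # change between iterates
        w = w_new
    return w, errors

rng = np.random.default_rng(1)
N_t, N_r = 64, 4                                   # illustrative sizes
alpha_k = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
a_r = steering(N_r, 0.7)
a_t = steering(N_t, -1.1)
h_k = alpha_k * np.outer(a_r, a_t.conj())          # H_k = alpha_k a_r a_t^H

# Arbitrary constant-modulus starting point
w0 = np.exp(1j * rng.uniform(0, 2 * np.pi, N_r)) / np.sqrt(N_r)
w, errors = power_iteration(h_k @ h_k.conj().T, w0)

# Closed form from the derivation: phase of a_r^H w0 times the steering vector
phi = np.angle(a_r.conj() @ w0)
expected = np.exp(1j * phi) * a_r / np.sqrt(N_r)
print(np.allclose(w, expected), errors[:3])
```

For a rank-one LOS channel the iterate stops changing after the first update, which is why the second entry of `errors` is at the level of rounding noise.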
This completes the proof of Theorem 4. Theorem 4 shows that for the LOS channel environment, the asymptotic sum rates loss per user of the proposed algorithm only depends on the value of M in M-QAM. When a high order QAM is adopted, the asymptotic sum rates loss per user is very slight. For the NLOS component, invoking (20), the asymptotic sum rates loss of the proposed algorithm can be given by
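$$
\begin{aligned}
\lim_{\mathrm{SNR}\to\infty,\; N_t\to\infty} \left( R - R_{coop} \right)
&= K \log_2 \alpha + \lim_{N_t\to\infty} \log_2 \frac{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} \right|}{\left| \boldsymbol{\Lambda} \right|} \\
&= K \log_2 \alpha + \lim_{N_t\to\infty} \Bigg\{ \log_2 \frac{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} \right|}{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{H}^H \mathbf{W}_{RF} \right|}
+ \log_2 \frac{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{H}^H \mathbf{W}_{RF} \right|}{\prod_{k=1}^{K} \mathbf{w}_{RF,k}^H \mathbf{H}_k \mathbf{H}_k^H \mathbf{w}_{RF,k}} \\
&\qquad\qquad + \sum_{k=1}^{K} \log_2 \frac{\mathbf{w}_{RF,k}^H \mathbf{H}_k \mathbf{H}_k^H \mathbf{w}_{RF,k}}{\sigma_k^{\max}}
+ \log_2 \frac{\prod_{k=1}^{K} \sigma_k^{\max}}{\left| \boldsymbol{\Lambda} \right|} \Bigg\}
\end{aligned}
$$

where $\sigma_k^{\max}$ is the largest eigenvalue of $\mathbf{H}_k \mathbf{H}_k^H$. According to the asymptotic orthogonality of different user channels [31], the off-diagonal elements of $\mathbf{W}_{RF}^H \mathbf{H} \mathbf{H}^H \mathbf{W}_{RF}$ tend to zero when $N_t$ goes to infinity. Consequently, we have

$$\lim_{N_t\to\infty} \frac{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{H}^H \mathbf{W}_{RF} \right|}{\prod_{k=1}^{K} \mathbf{w}_{RF,k}^H \mathbf{H}_k \mathbf{H}_k^H \mathbf{w}_{RF,k}} = 1.$$

Introduce the notations

$$\beta_t = E\left\{ \lim_{N_t\to\infty} \frac{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} \mathbf{F}_{RF}^H \mathbf{H}^H \mathbf{W}_{RF} \right|}{\left| \mathbf{W}_{RF}^H \mathbf{H} \mathbf{H}^H \mathbf{W}_{RF} \right|} \right\}, \quad
\beta_r = E\left\{ \lim_{N_t\to\infty} \frac{\mathbf{w}_{RF,k}^H \mathbf{H}_k \mathbf{H}_k^H \mathbf{w}_{RF,k}}{\sigma_k^{\max}} \right\}, \quad
\beta_e = E\left\{ \lim_{N_t\to\infty} \frac{\prod_{k=1}^{K} \sigma_k^{\max}}{\left| \boldsymbol{\Lambda} \right|} \right\},$$

the expectation of the asymptotic sum rates loss of the proposed algorithm can be expressed as

$$E\left\{ \lim_{\mathrm{SNR}\to\infty,\; N_t\to\infty} \left( R - R_{coop} \right) \right\} = K \log_2 \alpha + \log_2 \beta_t + K \log_2 \beta_r + \log_2 \beta_e$$

where $\log_2 \beta_t$ and $K \log_2 \beta_r$ represent the sum rates loss caused by the constant modulus constraint of the analog precoding matrix $\mathbf{F}_{RF}$ and the analog combining matrix $\mathbf{W}_{RF}$, respectively. The sum rates loss $\log_2 \beta_e$ is caused by the noncooperation of the users. Specifically, cooperative users can make use of the virtual channels corresponding to the largest $K$ eigenvalues of $\mathbf{H}^H \mathbf{H}$, while for noncooperative systems, the $k$th user can only take advantage of the virtual channel corresponding to the largest eigenvalue of $\mathbf{H}_k \mathbf{H}_k^H$. Since the sum rates loss $\log_2 \beta_e$ is irrelevant to the constant modulus constraint, in the following, we only discuss the sum rates loss $\log_2 \beta_t$ and $K \log_2 \beta_r$.

Fig. 3 plots the asymptotic sum rates losses per user $\frac{1}{K} \log_2 \beta_t$ and $\log_2 \beta_r$ when $N_t = 500$. All results are obtained by taking the average of 1000 random channel realizations. Explicitly, the loss $\log_2 \beta_r$ caused by the constant modulus analog combining matrix $\mathbf{W}_{RF}$ is only about 0.03–0.05 bits/s/Hz, which is very slight. With respect to the sum rates loss $\frac{1}{K} \log_2 \beta_t$ caused by the constant modulus analog precoding matrix $\mathbf{F}_{RF}$, it can be seen that the loss is about 0.25–0.35 bits/s/Hz, and decreases significantly when the number of RF chains increases.

Fig. 3. The asymptotic sum rates loss per user as functions of $N_{RF}$, $N_r$ and $K$ under the NLOS channel environment.

5. Simulation results

In this section, we present simulation results to evaluate the performance of the proposed ROD-based hybrid TH precoding and combining algorithm. The channel matrix $\mathbf{H}_k$ between the BS and the $k$th user is modeled as a Rician fading channel, the Rician factor $v_k$ is uniformly distributed between 1 and 10. The AOA and AOD of the LOS component $\mathbf{H}_{LOS,k}$ are uniformly distributed in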
$[0, 2\pi)$. The NLOS component $\mathbf{H}_{NLOS,k}$ consists of $N_{c,k} = 5$ clusters, each cluster is composed of $N_{l,k} = 10$ rays. The average AOA and AOD of the $i$th cluster are uniformly distributed over $[0, 2\pi)$, the AOA and AOD of the $l$th ray within the $i$th cluster $\phi_{k,i,l}^r$, $\theta_{k,i,l}^r$, $\phi_{k,i,l}^t$ and $\theta_{k,i,l}^t$ follow the Laplace distribution with a scale parameter of $10^\circ$. The complex gain of each ray is assumed to be a $\mathcal{CN}(0, 1)$ random variable. All simulation results are averaged over 1000 random channel realizations.

Firstly, the convergence of the generalized power method (18) is evaluated in Fig. 4, because it has a significant impact on the computational cost of the proposed algorithm. The convergence of the generalized power method is measured by the error $\eta(n) = \frac{1}{K} \sum_{k=1}^{K} \left\| \mathbf{w}_{RF,k}^{(n)} - \mathbf{w}_{RF,k}^{(n-1)} \right\|^2$. It can be seen from Fig. 4 that the error $\eta$ is reduced to $10^{-3}$ with only about 5 iterations, which implies that the proposed algorithm is able to obtain the RF combining matrix $\mathbf{W}_{RF}$ with a low computational complexity.

Fig. 4. The error $\eta$ as functions of iteration number $n$ when $N_t = 100$, $K = 15$.

Next, the sum rates of the proposed algorithm is compared with those of the hybrid ZF precoding algorithm [13], the hybrid BD precoding algorithm [14], the hybrid BD-GMD TH precoding algorithm [21], the fully-digital TH precoding algorithm [34] and the optimal sum rates given by cooperative users in Fig. 5 when $N_t = 100$,
Fig. 5. The sum rates of different algorithms as functions of SNR when $N_t = 100$, $N_r = 4$, $N_{RF} = K = 15$.
Fig. 6. The BERs of different algorithms as functions of SNR when $N_t = 100$, $N_r = 4$, $N_{RF} = K = 15$.
Fig. 7. The sum rates of different algorithms as functions of $N_t$ when $N_r = 4$, $N_{RF} = K = 15$.
Fig. 8. The sum rates of different algorithms as functions of $N_r$ when $N_t = 100$, $N_{RF} = K = 15$.
$N_r = 4$, $N_{RF} = K = 15$. It can be seen that, compared with the existing hybrid precoding algorithms, the proposed algorithm offers about 4 dB gain in the high SNR region (SNR > 0 dB), and in the low SNR region (SNR < −10 dB) it still provides about 1-2 dB improvement. Compared with the fully-digital TH precoding algorithm, the proposed algorithm provides comparable sum rates with a limited number of RF chains. Additionally, it is noteworthy that the performance of the hybrid BD-GMD TH precoder is inferior to those of the hybrid ZF precoder and the hybrid BD precoder. This unexpected phenomenon is caused by the assumption that each user only contains a single RF chain. In fact, the hybrid BD-GMD TH precoder is developed for users equipped with multiple RF chains, and it does not work well for single RF chain users.

The BER comparison of the aforementioned algorithms is exhibited in Fig. 6, where the transmitted symbols belong to the 64QAM constellation. It can be observed that in the low SNR region (SNR < −10 dB) the BERs of the different algorithms are approximately equal, while in the high SNR region (SNR > 0 dB) the proposed algorithm clearly outperforms the other hybrid precoding algorithms. Specifically, the hybrid ZF precoding algorithm suffers from about 2-3 dB loss compared with the proposed algorithm, and the performance losses of the hybrid BD precoding algorithm and the hybrid BD-GMD TH precoding algorithm are much more severe. In a nutshell, the proposed algorithm offers better performance than previous works in terms of both sum rates and BER.

In the following, the scalability of the proposed ROD-based hybrid TH precoding and combining algorithm is shown in Figs. 7–9 by increasing the number of transmit antennas $N_t$, receive antennas $N_r$ and users $K$, respectively. The sum rates achieved by cooperative users, the hybrid ZF precoding algorithm [13], the hybrid BD precoding algorithm [14], the hybrid BD-GMD TH precoding algorithm [21] and the fully-digital TH precoding algorithm [34] serve as benchmarks. Fig. 7 clearly shows that the proposed algorithm provides at least about 10 bits/s/Hz performance gain compared with the other hybrid precoding algorithms over a wide range of $N_t$, while the gap between the sum rates of the proposed algorithm and that of the fully-digital TH precoding algorithm is negligible. Fig. 8 plots the sum rates of various hybrid precoding algorithms as functions of the number of receive antennas $N_r$. Explicitly, when $N_r = 1$, the difference between the sum rates of the
Fig. 9. The sum rates of different algorithms as functions of $K$ when $N_t = 100$, $N_r = 4$, $N_{RF} = K$.
Fig. 10. The sum rates of different algorithms as functions of $v$ when $N_t = 100$, $N_r = 4$, $N_{RF} = K = 15$.
proposed algorithm and the other hybrid precoding algorithms is less than 10 bits/s/Hz, while when $N_r > 15$, the difference increases to more than 20 bits/s/Hz. Namely, as the number of receive antennas $N_r$ increases, the advantage of the proposed algorithm becomes more and more significant. Due to the high path loss of mmWave frequency bands, terminals operating in the mmWave range are generally equipped with multiple antennas. Therefore, it is expected that the proposed algorithm could yield a relatively high throughput in practical systems.

The impact of the number of users $K$ on the sum rates is investigated in Fig. 9. The number of RF chains $N_{RF}$ at the BS is assumed to be equal to the number of users $K$. As the number of users increases, the performances of the hybrid ZF precoding algorithm and the hybrid BD precoding algorithm first reach a peak and then deteriorate rapidly. This severe degradation does not occur for the nonlinear precoding algorithms: the sum rates of the proposed algorithm, the fully-digital TH precoding algorithm and the hybrid BD-GMD TH precoding algorithm steadily increase as $K$ becomes large. Moreover, it can be seen that the proposed algorithm outperforms the hybrid BD-GMD TH precoding algorithm significantly for both the SNR = 0 dB and SNR = 20 dB cases. Compared with the fully-digital TH precoding algorithm, the performance loss of the proposed algorithm is much smaller than that of the other hybrid precoding schemes. Namely, the proposed algorithm provides a more reasonable solution for a large number of users.

Finally, to shed light on the stability of the proposed algorithm, Fig. 10 illustrates the sum rates of different precoding algorithms for variable channel parameters. In this setup, all the Rician factors $v_k$ of the channels $\mathbf{H}_k$ $(k = 1, 2, \cdots, K)$ are assumed to be equal, i.e. $v_1 = v_2 = \cdots = v_K = v$. By varying the factor $v$ from 0 to 9, it is evident that the proposed algorithm is a promising approach for channels dominated by either the LOS component or the NLOS component. Besides, the performance gap between the proposed algorithm and the fully-digital TH precoding algorithm decreases when the Rician factor $v$ increases, which verifies that the proposed hybrid precoding algorithm can make use of more channel capacity in the LOS channel environment.
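To make the role of the Rician factor concrete, the following Python sketch generates one user channel as a weighted mix of an LOS path and a clustered NLOS part. It is only a simplified stand-in for the simulation setup: the mixing rule $\sqrt{v/(1+v)}\,\mathbf{H}_{LOS} + \sqrt{1/(1+v)}\,\mathbf{H}_{NLOS}$, the half-wavelength uniform linear array with a single angle per path, the NLOS normalization, and the function names `ula` and `rician_channel` are all assumptions made for this example and are not stated in this section.

```python
import numpy as np

def ula(n, angle):
    # Assumed half-wavelength ULA response; the paper's steering vectors
    # depend on both azimuth and elevation angles.
    return np.exp(1j * np.pi * np.arange(n) * np.sin(angle))

def rician_channel(n_r, n_t, v, rng, n_clusters=5, n_rays=10, spread_deg=10.0):
    """Hypothetical Rician channel generator with Rician factor v."""
    # LOS part: one path with AOA and AOD uniform in [0, 2*pi)
    aoa, aod = rng.uniform(0, 2 * np.pi, 2)
    h_los = np.outer(ula(n_r, aoa), ula(n_t, aod).conj())

    # NLOS part: clusters of rays with Laplace-distributed angles
    h_nlos = np.zeros((n_r, n_t), dtype=complex)
    scale = np.deg2rad(spread_deg)
    for _ in range(n_clusters):
        mean_aoa, mean_aod = rng.uniform(0, 2 * np.pi, 2)
        for _ in range(n_rays):
            # CN(0, 1) complex gain per ray
            gain = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
            ray_aoa = rng.laplace(mean_aoa, scale)
            ray_aod = rng.laplace(mean_aod, scale)
            h_nlos += gain * np.outer(ula(n_r, ray_aoa), ula(n_t, ray_aod).conj())
    h_nlos /= np.sqrt(n_clusters * n_rays)  # assumed power normalization

    # Assumed Rician mixing rule: v = 0 is pure NLOS, large v approaches LOS
    return np.sqrt(v / (1 + v)) * h_los + np.sqrt(1 / (1 + v)) * h_nlos

h_k = rician_channel(n_r=4, n_t=100, v=5.0, rng=np.random.default_rng(0))
print(h_k.shape)
```

Sweeping `v` in such a generator moves the channel from scattering-dominated to nearly rank-one, which is the regime change that Fig. 10 explores.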
6. Conclusions

A novel nonlinear hybrid ROD-based TH precoding and combining algorithm is presented in this paper. In the presented algorithm, the necessary and sufficient condition of the optimal digital precoding matrix and a near-optimal RF precoding matrix are derived via a newly defined ROD, and the RF combining matrix is given by a generalized power method. Theoretical analyses and simulations show that compared with the optimal sum rates of cooperative users, the proposed algorithm only yields a slight performance loss. Besides, compared with the existing hybrid precoders, the proposed algorithm can not only offer a steady and significant performance gain for different numbers of antennas and different channel environments at various SNRs, but also avoid the severe performance deterioration for large numbers of users.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A

Proof of Theorem 2. Introducing the notation $\mathbf{H}_{RF} = \mathbf{W}_{RF}^H \mathbf{H} \mathbf{F}_{RF} (\mathbf{F}_{RF}^H \mathbf{F}_{RF})^{-\frac{1}{2}}$, the problem $P_1$ in (7) can be rewritten as

$$\max_{\hat{\mathbf{F}}_P} \; \log \left| \mathbf{H}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{H}_{RF}^H \right| \quad \text{s.t.} \quad \left\| \hat{\mathbf{F}}_P \right\|_F^2 = \alpha P. \tag{24}$$

Considering the ROD $\mathbf{H}_{RF} = \mathbf{P}_{RF} \mathbf{R}_{RF}$, we have

$$
\begin{aligned}
\left| \mathbf{H}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{H}_{RF}^H \right|
&= \left| \mathbf{P}_{RF} \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \mathbf{P}_{RF}^H \right| \\
&= \left| \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right| \left| \mathbf{P}_{RF} \mathbf{P}_{RF}^H \right| \\
&= \left| \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right| \left| \mathbf{H}_{RF} \mathbf{H}_{RF}^H \right|
\end{aligned}
\tag{25}
$$

Therefore, the problem (24) can be reformulated as

$$\max_{\hat{\mathbf{F}}_P} \; \log \left| \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right| \quad \text{s.t.} \quad \left\| \hat{\mathbf{F}}_P \right\|_F^2 = \alpha P. \tag{26}$$

To tackle the problem (26), we consider the following problem firstly,
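$$\max_{\hat{\mathbf{F}}_P} \; \mathrm{Tr}\left\{ \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right\} \quad \text{s.t.} \quad \left\| \hat{\mathbf{F}}_P \right\|_F^2 = \alpha P. \tag{27}$$

Introducing the notation $\hat{\mathbf{F}}_P = \left[ \hat{\mathbf{f}}_{P,1} \; \hat{\mathbf{f}}_{P,2} \; \cdots \; \hat{\mathbf{f}}_{P,K} \right]$, the problem (27) is equivalent to

$$\max_{\hat{\mathbf{f}}_{P,1}, \cdots, \hat{\mathbf{f}}_{P,K}} \; \sum_{k=1}^{K} \hat{\mathbf{f}}_{P,k}^H \mathbf{R}_{RF}^H \mathbf{R}_{RF} \hat{\mathbf{f}}_{P,k} \quad \text{s.t.} \quad \sum_{k=1}^{K} \left\| \hat{\mathbf{f}}_{P,k} \right\|^2 = \alpha P. \tag{28}$$

It is easy to know that the optimal solution of $\hat{\mathbf{f}}_{P,k}$ $(k = 1, \cdots, K)$ is a linear combination of the eigenvectors corresponding to the maximum eigenvalue of the matrix $\mathbf{R}_{RF}^H \mathbf{R}_{RF}$. Evidently, the eigenvalues of $\mathbf{R}_{RF}^H \mathbf{R}_{RF}$ are 1 and 0, and the eigenvectors corresponding to 1 are just the column vectors of $\mathbf{R}_{RF}^H$. Therefore, the optimal solution of the problem (28) can be expressed as

$$\hat{\mathbf{f}}_{P,k} = \mathbf{R}_{RF}^H \mathbf{t}_k$$

where $\mathbf{t}_k \in \mathbb{C}^{K \times 1}$ is an arbitrary vector such that $\sum_{k=1}^{K} \left\| \mathbf{t}_k \right\|^2 = \alpha P$. Then, the optimal solution of problem (27) can be expressed as

$$\hat{\mathbf{F}}_P = \mathbf{R}_{RF}^H \mathbf{T} \tag{29}$$

where $\mathbf{T} = \left[ \mathbf{t}_1 \; \mathbf{t}_2 \; \cdots \; \mathbf{t}_K \right]$. Substituting (29) into (27), we have $\max \mathrm{Tr}\left\{ \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right\} = \mathrm{Tr}\{ \mathbf{T} \mathbf{T}^H \} = \sum_{k=1}^{K} \left\| \mathbf{t}_k \right\|^2 = \alpha P$. Then, we have

$$
\begin{aligned}
\left| \mathbf{H}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{H}_{RF}^H \right|
&= \left| \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right| \left| \mathbf{H}_{RF} \mathbf{H}_{RF}^H \right| \\
&\overset{(a)}{\leq} \left| \mathrm{diag}\left\{ \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right\} \right| \left| \mathbf{H}_{RF} \mathbf{H}_{RF}^H \right| \\
&\overset{(b)}{\leq} \left( \frac{1}{K} \max \mathrm{Tr}\left\{ \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right\} \right)^{K} \left| \mathbf{H}_{RF} \mathbf{H}_{RF}^H \right| \\
&= \left( \frac{\alpha P}{K} \right)^{K} \left| \mathbf{H}_{RF} \mathbf{H}_{RF}^H \right|
\end{aligned}
\tag{30}
$$

where (a) follows Hadamard's inequality and (b) follows the inequality of arithmetic and geometric means. Now, let us prove the following two lemmas.

Lemma 2. Both (a) and (b) in (30) are equalities if and only if $\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H = \frac{\alpha P}{K} \mathbf{I}$.

Proof 6. It is well-known that (a) is an equality if and only if $\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H$ is a diagonal matrix, and (b) is an equality if and only if the diagonal elements of $\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H$ are equal. Considering the equality $\max \mathrm{Tr}\left\{ \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right\} = \alpha P$, it is easy to know that both (a) and (b) are equalities if and only if $\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H = \frac{\alpha P}{K} \mathbf{I}$. This completes the proof of Lemma 2.

Lemma 3. The equality $\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H = \frac{\alpha P}{K} \mathbf{I}$ is true if and only if $\sqrt{\frac{K}{\alpha P}} \hat{\mathbf{F}}_P^H$ is a row orthogonal matrix of $\mathbf{H}_{RF}$.

Proof 7. The if part is straightforward. Both $\sqrt{\frac{K}{\alpha P}} \hat{\mathbf{F}}_P^H$ and $\mathbf{R}_{RF}$ are row orthogonal matrices of $\mathbf{H}_{RF}$, i.e., $\mathbf{H}_{RF} = \sqrt{\frac{K}{\alpha P}} \mathbf{P}_P \hat{\mathbf{F}}_P^H = \mathbf{P}_{RF} \mathbf{R}_{RF}$ and $\frac{K}{\alpha P} \hat{\mathbf{F}}_P^H \hat{\mathbf{F}}_P = \mathbf{R}_{RF} \mathbf{R}_{RF}^H = \mathbf{I}$, where $\mathbf{P}_P$ and $\mathbf{P}_{RF}$ are invertible matrices. Then, we have

$$
\begin{aligned}
\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H
&= \frac{K}{\alpha P} \left( \mathbf{P}_{RF}^{-1} \mathbf{P}_P \hat{\mathbf{F}}_P^H \right) \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \left( \hat{\mathbf{F}}_P \mathbf{P}_P^H \mathbf{P}_{RF}^{-H} \right) \\
&= \mathbf{P}_{RF}^{-1} \mathbf{P}_P \hat{\mathbf{F}}_P^H \hat{\mathbf{F}}_P \mathbf{P}_P^H \mathbf{P}_{RF}^{-H}
= \frac{\alpha P}{K} \mathbf{R}_{RF} \mathbf{R}_{RF}^H = \frac{\alpha P}{K} \mathbf{I}.
\end{aligned}
$$

For the only if part, when $\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H = \frac{\alpha P}{K} \mathbf{I}$, we have $\mathrm{Tr}\left\{ \mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H \right\} = \alpha P$, hence $\hat{\mathbf{F}}_P$ should be an optimal solution of the problem (27), i.e., $\hat{\mathbf{F}}_P = \mathbf{R}_{RF}^H \mathbf{T}$. Consequently, we have $\mathbf{R}_{RF} \hat{\mathbf{F}}_P \hat{\mathbf{F}}_P^H \mathbf{R}_{RF}^H = \mathbf{R}_{RF} \mathbf{R}_{RF}^H \mathbf{T} \mathbf{T}^H \mathbf{R}_{RF} \mathbf{R}_{RF}^H = \mathbf{T} \mathbf{T}^H = \frac{\alpha P}{K} \mathbf{I}$, namely, $\sqrt{\frac{K}{\alpha P}} \mathbf{T}$ is a unitary matrix. Invoking the ROD $\mathbf{H}_{RF} = \mathbf{P}_{RF} \mathbf{R}_{RF}$, we have

$$
\mathbf{H}_{RF} = \mathbf{P}_{RF} \mathbf{R}_{RF}
= \frac{K}{\alpha P} \mathbf{P}_{RF} \mathbf{T} \mathbf{T}^H \mathbf{R}_{RF}
= \left( \sqrt{\frac{K}{\alpha P}} \mathbf{P}_{RF} \mathbf{T} \right) \left( \sqrt{\frac{K}{\alpha P}} \mathbf{T}^H \mathbf{R}_{RF} \right)
= \left( \sqrt{\frac{K}{\alpha P}} \mathbf{P}_{RF} \mathbf{T} \right) \left( \sqrt{\frac{K}{\alpha P}} \hat{\mathbf{F}}_P^H \right)
$$

where $\frac{K}{\alpha P} \hat{\mathbf{F}}_P^H \hat{\mathbf{F}}_P = \frac{K}{\alpha P} \mathbf{T}^H \mathbf{R}_{RF} \mathbf{R}_{RF}^H \mathbf{T} = \frac{K}{\alpha P} \mathbf{T}^H \mathbf{T} = \mathbf{I}$; in other words, $\sqrt{\frac{K}{\alpha P}} \hat{\mathbf{F}}_P^H$ is a row orthogonal matrix of $\mathbf{H}_{RF}$. This completes the proof of Lemma 3.

Combining Lemma 2 and Lemma 3, it is easy to know that the matrix $\hat{\mathbf{F}}_P$ is the optimal solution of (24) if and only if $\sqrt{\frac{K}{\alpha P}} \hat{\mathbf{F}}_P^H$ is the row orthogonal matrix of $\mathbf{H}_{RF}$, and the maximum of (24) is $K \log \frac{\alpha P}{K} + \log \left| \mathbf{H}_{RF} \mathbf{H}_{RF}^H \right|$. The proof is completed.

Appendix B

Proof of Theorem 3. Let $\mathbf{F}_X = \mathbf{F}_{RF} (\mathbf{F}_{RF}^H \mathbf{F}_{RF})^{-\frac{1}{2}}$ and $\mathbf{H}_W = \mathbf{W}_{RF}^H \mathbf{H}$, the problem (12) can be rewritten as

$$\max_{\mathbf{F}_X} \; \log \left| \mathbf{H}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{H}_W^H \right| \quad \text{s.t.} \quad \mathbf{F}_X^H \mathbf{F}_X = \mathbf{I},$$

where the constraint $\mathbf{F}_X^H \mathbf{F}_X = \mathbf{I}$ stems from the property that the matrix $\mathbf{F}_{RF} (\mathbf{F}_{RF}^H \mathbf{F}_{RF})^{-\frac{1}{2}}$ is always a semi-unitary matrix. Similar to (30) in Appendix A, we have the following inequalities

$$\left| \mathbf{H}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{H}_W^H \right| = \left| \mathbf{R}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{R}_W^H \right| \left| \mathbf{H}_W \mathbf{H}_W^H \right| \leq \left( \frac{1}{K} \max \mathrm{Tr}\left\{ \mathbf{R}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{R}_W^H \right\} \right)^{K} \left| \mathbf{H}_W \mathbf{H}_W^H \right|, \tag{31}$$

where $\mathbf{R}_W$ is the row orthogonal matrix of $\mathbf{H}_W$. Let us consider the following optimization problem firstly

$$\max_{\mathbf{F}_X} \; \mathrm{Tr}\left\{ \mathbf{R}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{R}_W^H \right\} \quad \text{s.t.} \quad \mathbf{F}_X^H \mathbf{F}_X = \mathbf{I}. \tag{32}$$

Without loss of generality, the matrix $\mathbf{F}_X$ can be divided into $\mathbf{F}_X = \mathbf{F}_{\parallel} + \mathbf{F}_{\perp}$, where $\mathbf{F}_{\parallel} = \mathbf{R}_W^H \mathbf{R}_W \mathbf{F}_X$ and $\mathbf{F}_{\perp} = (\mathbf{I} - \mathbf{R}_W^H \mathbf{R}_W) \mathbf{F}_X$. It is well-known that $\mathcal{R}(\mathbf{F}_{\parallel}) \subseteq \mathcal{R}(\mathbf{R}_W^H)$ and $\mathcal{R}(\mathbf{F}_{\perp}) \subseteq \mathcal{N}(\mathbf{R}_W)$, since $\mathbf{R}_W \mathbf{F}_{\perp} = \mathbf{0}$. Accordingly, the problem (32) can be expressed as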
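$$\max_{\mathbf{F}_{\parallel},\, \mathbf{F}_{\perp}} \; \mathrm{Tr}\left\{ \mathbf{R}_W \mathbf{F}_{\parallel} \mathbf{F}_{\parallel}^H \mathbf{R}_W^H \right\} \quad \text{s.t.} \quad \mathbf{F}_{\parallel}^H \mathbf{F}_{\parallel} + \mathbf{F}_{\perp}^H \mathbf{F}_{\perp} = \mathbf{I}. \tag{33}$$

Let $\mathbf{F}_{\parallel} = \mathbf{R}_W^H \mathbf{T}_{\parallel}$ and $\mathbf{F}_{\perp} = \mathbf{R}_{W,\perp}^H \mathbf{T}_{\perp}$, where the columns of $\mathbf{R}_{W,\perp}^H \in \mathbb{C}^{N_t \times (N_t - K)}$ form an orthonormal basis of $\mathcal{N}(\mathbf{R}_W)$, and $\mathbf{T}_{\parallel}$ and $\mathbf{T}_{\perp}$ are $K \times N_{RF}$ and $(N_t - K) \times N_{RF}$ matrices respectively. The problem (33) can be further rewritten as

$$\max_{\mathbf{T}_{\parallel},\, \mathbf{T}_{\perp}} \; \mathrm{Tr}\left\{ \mathbf{T}_{\parallel} \mathbf{T}_{\parallel}^H \right\} \quad \text{s.t.} \quad \mathbf{T}_{\parallel}^H \mathbf{T}_{\parallel} + \mathbf{T}_{\perp}^H \mathbf{T}_{\perp} = \mathbf{I}.$$

For the positive semidefinite Hermitian matrix $\mathbf{T}_{\parallel}^H \mathbf{T}_{\parallel}$, there must be a unitary matrix $\mathbf{S} \in \mathbb{C}^{N_{RF} \times N_{RF}}$ such that $\mathbf{S}^H \mathbf{T}_{\parallel}^H \mathbf{T}_{\parallel} \mathbf{S} = \boldsymbol{\Sigma}$, where $\boldsymbol{\Sigma}$ is a diagonal matrix with non-negative diagonal elements. Then, we have $\mathbf{S}^H \mathbf{T}_{\perp}^H \mathbf{T}_{\perp} \mathbf{S} = \mathbf{I} - \boldsymbol{\Sigma}$. Obviously, the diagonal elements of the diagonal matrix $\mathbf{I} - \boldsymbol{\Sigma}$ are also non-negative. Invoking the relations $\mathrm{rank}(\boldsymbol{\Sigma}) = \mathrm{rank}(\mathbf{T}_{\parallel}^H \mathbf{T}_{\parallel}) = \mathrm{rank}(\mathbf{T}_{\parallel}) \leq K$, $\boldsymbol{\Sigma} \geq 0$ and $\mathbf{I} - \boldsymbol{\Sigma} \geq 0$, it is easy to know that $\max \mathrm{rank}(\boldsymbol{\Sigma}) = K$ and $\max \mathrm{Tr}\{\boldsymbol{\Sigma}\} = K$. Therefore,

$$
\begin{aligned}
\max \mathrm{Tr}\left\{ \mathbf{R}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{R}_W^H \right\}
&= \max \mathrm{Tr}\left\{ \mathbf{R}_W \mathbf{F}_{\parallel} \mathbf{F}_{\parallel}^H \mathbf{R}_W^H \right\} \\
&= \max \mathrm{Tr}\left\{ \mathbf{T}_{\parallel} \mathbf{T}_{\parallel}^H \right\} = \max \mathrm{Tr}\left\{ \mathbf{T}_{\parallel}^H \mathbf{T}_{\parallel} \right\} = \max \mathrm{Tr}\{\boldsymbol{\Sigma}\} = K.
\end{aligned}
\tag{34}
$$

Substituting (34) into (31), we have

$$\left| \mathbf{H}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{H}_W^H \right| \leq \left( \frac{1}{K} \max \mathrm{Tr}\left\{ \mathbf{R}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{R}_W^H \right\} \right)^{K} \left| \mathbf{H}_W \mathbf{H}_W^H \right| = \left| \mathbf{H}_W \mathbf{H}_W^H \right|$$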
with equality if and only if $\mathbf{R}_W \mathbf{F}_X \mathbf{F}_X^H \mathbf{R}_W^H = \mathbf{I}$. This completes the proof of Theorem 3.

References

[1] CISCO, Cisco Visual Networking Index: Forecast and Trends, 2017–2022, White Paper, Nov. 2018.
[2] X. Wang, L. Kong, F. Kong, F. Qiu, M. Xia, G. Chen, Millimeter wave communication: a comprehensive survey, IEEE Commun. Surv. Tutor. 20 (3) (2018) 1616–1653.
[3] N.Q. Nhan, P. Rostaing, K. Amis, L. Collin, E. Radoi, Optimization of linear MIMO precoding assuming MMSE-based turbo equalization, Digit. Signal Process. 75 (2018) 45–55.
[4] P.A.C. Lopes, J.A.B. Gerald, Leakage-based precoding algorithms for multiple streams per terminal MU-MIMO systems, Digit. Signal Process. 75 (2018) 38–44.
[5] X. He, Q. Guo, J. Tong, J. Xi, Y. Yu, Low-complexity approximate iterative LMMSE detection for large-scale MIMO systems, Digit. Signal Process. 60 (2017) 134–139.
[6] R. Zhang, J. Zhang, Y. Gao, H. Zhao, Bussgang decomposition-based sparse channel estimation in wideband hybrid millimeter wave MIMO systems with finite-bit ADCs, Digit. Signal Process. 85 (2019) 29–40.
[7] F. Liu, X. Bai, R. Du, Z. Sun, X. Kan, Givens rotation based column-wise hybrid precoding for millimeter wave MIMO systems, Digit. Signal Process. 88 (2019) 130–137.
[8] J.-C. Chen, Efficient codebook-based beamforming algorithm for millimeter-wave massive MIMO systems, IEEE Trans. Veh. Technol. 66 (9) (2017) 7809–7817.
[9] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, R.W. Heath Jr., Spatially sparse precoding in millimeter wave MIMO systems, IEEE Trans. Wirel. Commun. 13 (3) (2014) 1499–1513.
[10] G. Kwon, H. Park, A joint scheduling and millimeter wave hybrid beamforming system with partial side information, in: IEEE International Conference on Communications, 2016, pp. 1–6.
[11] J. Zhao, F. Gao, W. Jia, S. Zhang, S. Jin, H. Lin, Angle domain hybrid precoding and channel tracking for mmWave massive MIMO systems, IEEE Trans. Wirel. Commun. 16 (10) (2017) 6868–6880.
[12] A. Liu, V.K.N. Lau, Impact of CSI knowledge on the codebook-based hybrid beamforming in massive MIMO, IEEE Trans. Signal Process. 64 (24) (2016) 6545–6556.
[13] A. Li, C. Masouros, Hybrid precoding and combining design for millimeter-wave multi-user MIMO based on SVD, in: IEEE International Conference on Communications, 2017, pp. 1–6.
[14] W. Ni, X. Dong, Hybrid block diagonalization for massive multiuser MIMO systems, IEEE Trans. Commun. 64 (1) (2016) 201–211.
[15] L. Zhao, D. Wing Kwan Ng, J. Yuan, Multi-user precoding and channel estimation for hybrid millimeter wave systems, IEEE J. Sel. Areas Commun. 35 (7) (2017) 1576–1590.
[16] Z. Wang, M. Li, Q. Liu, A.L. Swindlehurst, Hybrid precoder and combiner design with low resolution phase shifters in mmWave MIMO systems, IEEE J. Sel. Top. Signal Process. 12 (2) (2018) 256–269.
[17] C.B. Peel, B.M. Hochwald, A.L. Swindlehurst, A vector-perturbation technique for near-capacity multiantenna multiuser communication – part I: channel inversion and regularization, IEEE Trans. Commun. 53 (1) (2005) 195–202.
[18] B.M. Hochwald, C.B. Peel, A.L. Swindlehurst, A vector-perturbation technique for near-capacity multiantenna multiuser communication – part II: perturbation, IEEE Trans. Commun. 53 (3) (2005) 537–544.
[19] R. Mai, T. Le-Ngoc, D.H.N. Nguyen, Hybrid MMSE-VP precoding for multi-user massive MIMO systems, in: IEEE International Conference on Communications, 2017, pp. 1–6.
[20] R. Mai, T. Le-Ngoc, D.H.N. Nguyen, Two-timescale hybrid RF-baseband precoding with MMSE-VP for multi-user massive MIMO broadcast channels, IEEE Trans. Wirel. Commun. 17 (7) (2018) 4462–4476.
[21] T.Y. Chang, C.E. Chen, A hybrid Tomlinson-Harashima transceiver design for multiuser mmwave MIMO systems, IEEE Wirel. Commun. Lett. 7 (1) (2018) 118–121.
[22] L. Sun, M.R. McKay, Tomlinson-Harashima precoding for multiuser MIMO systems with quantized CSI feedback and user scheduling, IEEE Trans. Signal Process. 62 (16) (2014) 4077–4090.
[23] L. Sun, M. Lei, Quantized CSI-based Tomlinson-Harashima precoding in multiuser MIMO systems, IEEE Trans. Wirel. Commun. 12 (3) (2013) 1118–1126.
[24] R.F.H. Fischer, Precoding and Signal Shaping for Digital Transmission, Wiley, 2002.
[25] Z. Gao, C. Hu, L. Dai, Z. Wang, Channel estimation for millimeter-wave massive MIMO with hybrid precoding over frequency-selective fading channels, IEEE Commun. Lett. 20 (6) (2016) 1259–1262.
[26] C. Balanis, Antenna Theory, Wiley, 1997.
[27] C. Windpassinger, R.F.H. Fischer, T. Vencel, J.B. Huber, Precoding in multiantenna and multiuser communications, IEEE Trans. Wirel. Commun. 3 (4) (2004) 1305–1316.
[28] A. Alkhateeb, R.W. Heath, Frequency selective hybrid precoding for limited feedback millimeter wave systems, IEEE Trans. Commun. 64 (5) (2016) 1801–1818.
[29] H. Ghauch, T. Kim, M. Bengtsson, M. Skoglund, Subspace estimation and decomposition for large millimeter-wave MIMO systems, IEEE J. Sel. Top. Signal Process. 10 (3) (2016) 528–542.
[30] R. Rajashekar, L. Hanzo, Iterative matrix decomposition aided block diagonalization for mm-Wave multiuser MIMO systems, IEEE Trans. Wirel. Commun. 16 (3) (2017) 1372–1384.
[31] X. Wu, D. Liu, F. Yin, Hybrid beamforming for multi-user massive MIMO systems, IEEE Trans. Commun. 66 (9) (2018) 3879–3891.
[32] M. Soltanalian, P. Stoica, Designing unimodular codes via quadratic optimization, IEEE Trans. Signal Process. 62 (5) (2014) 1221–1234.
[33] N. Boumal, Nonconvex phase synchronization, SIAM J. Optim. 26 (4) (2016) 2355–2377.
[34] S. Lin, W.W.L. Ho, Y.-C. Liang, Block diagonal geometric mean decomposition (BD-GMD) for MIMO broadcast channels, IEEE Trans. Wirel. Commun. 7 (7) (2008) 2778–2789.
Xiaoyu Bai received the B.S. degree from Nanjing University of Science and Technology, Nanjing, China, in 2010 and the M.S. degree from Shanghai Academy of Spaceflight Technology, Shanghai, China, in 2013. Currently, he is pursuing the Ph.D. degree at the School of Computer Science and Engineering, Northeastern University, Shenyang, China. His research interests include array signal processing and millimeter wave MIMO systems. He is a recipient of the Best Student Paper Award at the 2017 Progress in Electromagnetics Research Symposium (PIERS).

Fulai Liu received the M.S. degree and Ph.D. degree from Northeastern University, Shenyang, China, in 2002 and in 2005, respectively. Since 2010, he has been a professor in Northeastern University, Qinhuangdao, China. His research interests include array signal processing and its applications, cognitive radio, millimeter wave MIMO systems, etc.

Ruiyan Du received her B.S. degree from Hebei Normal University, Shijiazhuang, China, in 1999, the M.S. degree from Yanshan University, Qinhuangdao, China, in 2006, and the Ph.D. degree from Northeastern University, Shenyang, China, in 2012. Since 2012, she has been an assistant professor
in Northeastern University, Qinhuangdao, China. Her research interests include wireless communications, signal processing for communications.

Xiaodong Kan received the B.S. degree from Anhui Normal University, Wuhu, China, in 2017. She is currently pursuing the M.S. degree at Northeastern University. Her research interests include millimeter wave communications and array signal processing.

Yixin Xu received the M.S. degree from Yanshan University, Qinhuangdao, China, in 2011. He is currently pursuing the Ph.D. degree in signal and information processing with Northeastern University. His research interests include signal processing for communications and massive MIMO systems.

Yanshuo Zhang received the B.S. degree from Northeast Petroleum University, Daqing, China, in 2017. She is currently pursuing the M.S. degree at Northeastern University. Her research interests include millimeter wave communications and massive MIMO systems.