Efficient beamforming method for downlink MU-MIMO broadcast channels


Accepted Manuscript

Title: Efficient Beamforming Method for Downlink MU-MIMO Broadcast Channels
Author: YAN YANG, HUI YUE
PII: S1434-8411(14)00335-5
DOI: http://dx.doi.org/doi:10.1016/j.aeue.2014.12.002
Reference: AEUE 51331
To appear in: AEUE - International Journal of Electronics and Communications
Received date: 18-2-2014
Revised date: 1-12-2014
Accepted date: 1-12-2014

Please cite this article as: YAN YANG, HUI YUE, Efficient Beamforming Method for Downlink MU-MIMO Broadcast Channels, AEUE - International Journal of Electronics and Communications (2014), http://dx.doi.org/10.1016/j.aeue.2014.12.002 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

YAN YANG^{a,*}, HUI YUE^{b}

a The School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou, 730070, China
b Lanzhou Jiaotong University, Lanzhou, 730070, China


Abstract


This paper investigates sum rate maximization in multiuser MIMO broadcast channels. Because nonlinear dirty paper coding (DPC) has high computational complexity, zero-forcing dirty paper coding (ZF-DPC) has been proposed as a suboptimal alternative. However, the traditional ZF-DPC method requires that the total number of receive antennas be less than or equal to the number of transmit antennas. In this paper, we consider the scenario where the total number of receive antennas may exceed the number of transmit antennas. The optimal data stream allocation requires an exhaustive search over all possibilities, which is prohibitively complex. We propose a greedy data stream allocation scheme that allocates one data stream at each step. The corresponding transmit beamforming vector and receive combining vector are designed to avoid interfering with the previously allocated data streams, and the pre-equalization Tomlinson-Harashima precoder (THP) technique is adopted to pre-cancel the non-causally known interference caused by the previously allocated data streams. Only one pair of beamforming and combining vectors is computed per step, which keeps the computational complexity low. Simulation results show that the proposed method outperforms the methods in the literature.


Keywords: MIMO, broadcast, beamforming, dirty paper coding, zero-forcing.



* Corresponding author: YAN YANG. Mobile phone: +8613893615088. Email address: [email protected].

Preprint submitted to International Journal of Electronics and Communications, December 2, 2014

1. Introduction


In a downlink multiuser multiple-input multiple-output (MU-MIMO) communication system, a base station with multiple antennas transmits data streams to multiple users, each equipped with multiple antennas. Dirty paper coding (DPC) is known to achieve the sum capacity of this system [1], [2]. However, the optimal transmit covariance matrix in DPC is difficult to obtain because the optimization problem is non-concave [2]. To avoid this complex DPC processing, linear processing solutions such as zero-forcing beamforming (ZFBF) [3] and block diagonalization (BD) [4] have been proposed. The sum rates of these zero-forcing methods are close to the sum capacity, and they are easy to implement. However, such simple zero-forcing approaches suffer from noise enhancement. Moreover, they apply only when the total number of receive antennas does not exceed the number of transmit antennas [5].

Another promising approach, zero-forcing DPC (ZF-DPC) [6], is a suboptimal but intuitive way to realize DPC by triangularizing the channel. The nonlinear ZF-DPC method offers a higher sum rate than linear beamforming (such as ZFBF and BD). However, the ZF-DPC method only supports a single receive antenna per user. In [7], successive zero-forcing DPC (SZF-DPC) extends the original ZF-DPC method to the scenario where each user can be equipped with multiple receive antennas. However, the total number of receive antennas is still restricted by the number of transmit antennas, and the user order also affects the achievable throughput substantially. In [8], several user selection algorithms are proposed for the SZF-DPC strategy. In [9], the receive combining technique is adopted, which improves the sum rate and the bit error rate (BER) performance. In [10] and [11], the receive combining technique is introduced to the ZF-DPC method by taking advantage of the duality between the downlink MU-MIMO broadcast channels and the corresponding virtual uplink multiple access channels. However, this iterative method has high computational complexity when the number of users is large.

The zero-forcing successive allocation (ZF-SA) method proposed in [12] finds the transmit beamforming and receive combining vectors of one data stream at each step, for the user who brings the largest increase of the throughput. This technique is attractive because it imposes no constraint on the number of receive antennas or the number of users, and the interference can also be removed completely. However, the ZF-SA method does not fully exploit the information contained in MU-MIMO broadcast channels, since the entire interference is eliminated by the transmit beamforming and receive combining vectors alone. The performance gap between DPC and the ZF-SA method is still significant.

The pre-equalization Tomlinson-Harashima precoder (THP) technique, proposed in [13] and [14], pre-subtracts the non-causally known interference at the transmitter. THP was initially proposed for single-input single-output channels in the presence of inter-symbol interference (ISI), and it is extended to MU-MIMO broadcast channels in [15]. Several criteria have been proposed for THP, including zero-forcing (ZF) [15] and minimum mean square error (MMSE) [16]. In [17], two practical THP implementation algorithms are proposed, and a comprehensive performance analysis is carried out in terms of the error covariance matrix, the sum rate and the computational complexity. In [18] and [19], THP performance is investigated with imperfect channel state information (CSI) at the transmitter. The main idea of THP is that the non-causally known interference produced by the previously precoded symbols is pre-canceled at the transmitter before transmission, and a modulo operation ensures that the transmit power does not exceed the power constraint. THP can also be used to implement the ZF-DPC method.

In this paper, we propose a novel successive allocation method under the assumption of perfect CSI at the base station. One data stream is assigned at each step to the user who brings the largest increase of the global throughput. The non-causally known interference is pre-subtracted through the pre-equalization THP technique before transmission, and the remaining interference is eliminated by the transmit beamforming and receive combining vectors. In the ZF-SA method [12], the whole interference is removed through beamforming vector design alone. The proposed method therefore has more degrees of freedom for choosing appropriate beamforming vectors than the ZF-SA method, which results in sum rate gains. Compared with the SZF-DPC method proposed in [7], receive combining vectors are introduced to further improve the sum rate, and the total number of receive antennas can be larger than the number of transmit antennas. The main contributions of this paper are summarized as follows.


• A new greedy data stream allocation method for multiuser MIMO broadcast channels is proposed. Since the pre-equalization THP technique pre-subtracts the non-causally known interference at the base station, the proposed method achieves a significant sum rate improvement over the ZF-SA method in [12].

• The proposed method can be considered as a practical implementation and a generalization of the SZF-DPC method in [7]. The SZF-DPC method only supports the case where the total number of receive antennas is not more than the number of transmit antennas. In the proposed method, the receive combining technique is adopted and the data streams are assigned to the users successively, so the total number of receive antennas can be larger than the number of transmit antennas.

• One pair of transmit beamforming and receive combining vectors is determined at each step, which is computationally efficient and easy to implement.


The rest of the paper is organized as follows. Section 2 presents the MU-MIMO broadcast channels and the DPC strategy, then reviews the ZF-DPC, SZF-DPC and ZF-SA methods. Section 3 describes the proposed beamforming method and then introduces the pre-equalization THP technique. In Section 4, the computational complexity of the proposed method is analyzed and compared with that of the ZF-SA method. Numerical simulation results are provided in Section 5, followed by concluding remarks in Section 6.

Notation: Standard notations are used in this paper. Bold lowercase and uppercase letters denote vectors and matrices, respectively; $\mathbf{A}^{\dagger}$, $\mathbf{A}^H$ and $\mathbf{A}^T$ represent the pseudo inverse, Hermitian transpose and transpose of matrix $\mathbf{A}$, respectively; $\mathrm{diag}(\mathbf{A})$ and $\mathrm{tr}(\mathbf{A})$ denote the vector containing the diagonal elements of $\mathbf{A}$ and the trace of $\mathbf{A}$, respectively; $|\mathbf{A}|$ and $A_{i,j}$ are the determinant of $\mathbf{A}$ and its element in row $i$ and column $j$, respectively; $\mathbf{I}_i$ is the $i \times i$ identity matrix; $\mathbf{0}$ is the zero vector.


2. System Model and methods in the literature

Consider the downlink MU-MIMO broadcast channels with $K$ users, where a base station equipped with $N_t$ transmit antennas transmits $\sum_k L_k = L$ data streams to the users. The $k$th user has $N_{r,k}$ receive antennas and receives $L_k$ data streams. Since the channel gains vary across users [5], [12], an appropriate $L_k$ ($0 \le L_k \le N_{r,k}$) has to be found for every user $k$ to maximize the global throughput. In this paper, we propose a novel data stream allocation and beamforming design algorithm. The proposed method assigns a number of data streams to each user and designs the corresponding transmit beamforming and receive combining vectors to optimize the global throughput under the total transmit power constraint $P_T$.

In the downlink transmission, the received signal $\mathbf{y}_k \in \mathbb{C}^{L_k \times 1}$ of the $k$th user after the receive combining filter is

$$
\mathbf{y}_k = \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{U}_k^H \mathbf{H}_k \Big( \sum_{l=1,\, l \neq k}^{K} \mathbf{V}_l \sqrt{\mathbf{P}_l}\,\mathbf{x}_l \Big) + \mathbf{U}_k^H \mathbf{n}_k \quad (1 \le k \le K) \tag{1}
$$

where $\mathbf{H}_k \in \mathbb{C}^{N_{r,k} \times N_t}$ denotes the channel between the transmitter and the $k$th user; $\mathbf{x}_k \in \mathbb{C}^{L_k \times 1}$ is the transmit vector of the $k$th user and satisfies $E\{\mathbf{x}_k \mathbf{x}_k^H\} = \mathbf{I}_{L_k}$; $\mathbf{V}_k \in \mathbb{C}^{N_t \times L_k}$ denotes the transmit beamforming matrix with normalized columns $\|\mathbf{v}_{l_k}\| = 1$ ($1 \le l_k \le L_k$); $\mathbf{U}_k \in \mathbb{C}^{N_{r,k} \times L_k}$ is the receive combining matrix with normalized columns $\|\mathbf{u}_{l_k}\| = 1$ ($1 \le l_k \le L_k$); $\mathbf{P}_k$ is a diagonal matrix and $\mathrm{diag}(\mathbf{P}_k) \in \mathbb{R}_{+}^{L_k \times 1}$ represents the power assigned to the data streams of the $k$th user; $\mathbf{n}_k \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_{r,k}})$ is the additive zero-mean complex white Gaussian noise (AWGN) vector with covariance matrix $\sigma^2 \mathbf{I}_{N_{r,k}}$ observed at the $k$th user. The rate of the $k$th user over MU-MIMO broadcast channels can be expressed as [12]

$$
R_k = \log \frac{\left| \sigma^2 \mathbf{I}_{N_{r,k}} + \sum_{j=1}^{K} \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_j \mathbf{P}_j \mathbf{V}_j^H \mathbf{H}_k^H \mathbf{U}_k \right|}{\left| \sigma^2 \mathbf{I}_{N_{r,k}} + \sum_{j=1,\, j \neq k}^{K} \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_j \mathbf{P}_j \mathbf{V}_j^H \mathbf{H}_k^H \mathbf{U}_k \right|} \tag{2}
$$

In (2), $\mathbf{P}_k$ is restricted by the total transmit power $P_T$ at the base station, i.e., $\sum_{k=1}^{K} \mathrm{tr}(\mathbf{P}_k) = P_T$. Under a natural user order, the base station first finds the transmit beamforming vector for the first user. When it then chooses the transmit beamforming vector for the second user, the interference produced by the first user is known non-causally. In [20], an additive white Gaussian noise channel corrupted by interference that is known at the transmitter but unknown at the receiver is modeled as

$$
Y = X + S + Z \tag{3}
$$

where $X$ and $Y$ are the desired and received signals, respectively, $S$ is the interference known non-causally at the transmitter, and $Z$ is the unknown Gaussian noise. It is shown in [20] that the capacity of this channel under the transmit power constraint is the same as if $S$ did not exist. If this DPC strategy is used, so that the interference produced by the previously coded users is pre-canceled perfectly, the sum capacity of MU-MIMO broadcast channels can be calculated as

$$
R_{DPC} = \max_{\{\mathbf{U}_k, \mathbf{V}_k, \mathbf{P}_k\}} \sum_{k=1}^{K} \log \frac{\left| \sigma^2 \mathbf{I}_{N_{r,k}} + \sum_{j=k}^{K} \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_j \mathbf{P}_j \mathbf{V}_j^H \mathbf{H}_k^H \mathbf{U}_k \right|}{\left| \sigma^2 \mathbf{I}_{N_{r,k}} + \sum_{j=k+1}^{K} \mathbf{U}_k^H \mathbf{H}_k \mathbf{V}_j \mathbf{P}_j \mathbf{V}_j^H \mathbf{H}_k^H \mathbf{U}_k \right|} \tag{4}
$$

$$
\text{subject to } \mathbf{U}_k^H \mathbf{U}_k = \mathbf{I}, \quad \mathbf{V}_k^H \mathbf{V}_k = \mathbf{I}, \quad \sum_{k=1}^{K} \mathrm{tr}(\mathbf{P}_k) = P_T, \quad 1 \le k \le K.
$$

Several iterative approaches have been proposed to achieve $R_{DPC}$ [2], [21]; they use the duality between the broadcast channels and the corresponding virtual uplink multiple access channels. However, their computational complexity is high.

2.1. ZF-DPC method

The ZF-DPC method [6] is based on the LQ decomposition of the channel, where each user is assumed to have a single receive antenna. The received signal of the $k$th user in (1) then simplifies to

$$
y_k = \mathbf{h}_k^H \mathbf{v}_k \sqrt{p_k}\, x_k + \mathbf{h}_k^H \sum_{j=1}^{k-1} \mathbf{v}_j \sqrt{p_j}\, x_j + \mathbf{h}_k^H \sum_{j=k+1}^{K} \mathbf{v}_j \sqrt{p_j}\, x_j + n_k \quad (1 \le k \le K) \tag{5}
$$

In (5), the receive combining vector is omitted because each user has a single receive antenna. According to DPC in [6] and [20], the term $\mathbf{h}_k^H \sum_{j=1}^{k-1} \mathbf{v}_j \sqrt{p_j}\, x_j$ is considered as the non-causally known interference, and $\mathbf{h}_k^H \sum_{j=k+1}^{K} \mathbf{v}_j \sqrt{p_j}\, x_j$ is treated as additional noise, which can be removed by appropriate transmit beamforming vectors, i.e.,

$$
\mathbf{h}_k^H \sum_{j=k+1}^{K} \mathbf{v}_j \sqrt{p_j}\, x_j = 0 \tag{6}
$$

If the number of users $K$ is less than or equal to the number of transmit antennas $N_t$ (i.e., $K \le N_t$), and the channel matrix $\mathbf{H}$ is described as

$$
\mathbf{H} = \left[ \mathbf{h}_1, \cdots, \mathbf{h}_K \right]^H \in \mathbb{C}^{K \times N_t} \tag{7}
$$

then the transmit beamforming vectors can be obtained by performing the LQ decomposition of the channel matrix, $\mathbf{H} = \mathbf{L}\mathbf{Q}$. The vector $\mathbf{v}_k$ ($\forall k$) is chosen as the $k$th column of the unitary matrix $\mathbf{Q}^H \in \mathbb{C}^{N_t \times N_t}$, and (5) can be reformulated as

$$
y_k = l_{k,k} \sqrt{p_k}\, x_k + \sum_{j=1}^{k-1} l_{k,j} \sqrt{p_j}\, x_j + n_k \quad (1 \le k \le K) \tag{8}
$$

where $l_{k,j}$ is the element in row $k$ and column $j$ of the lower triangular matrix $\mathbf{L} \in \mathbb{C}^{K \times N_t}$. The DPC principle shows that the capacity of the $k$th user in (8) is the same as if the interference term $\sum_{j=1}^{k-1} l_{k,j} \sqrt{p_j}\, x_j$ did not exist. In practice, the pre-equalization THP technique can be used to remove this non-causally known interference term. The ZF-DPC method only supports a single receive antenna for each user. The user permutation also has a great effect on the throughput, and finding the optimal user order requires an exhaustive search [6].
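The triangularization behind (7) and (8) can be checked numerically. The following is a minimal sketch, assuming NumPy and a random Rayleigh channel with hypothetical sizes $K = 3$, $N_t = 4$; since NumPy has no LQ routine, the LQ factors are obtained here from the QR decomposition of $\mathbf{H}^H$, and only the first $K$ columns of $\mathbf{Q}^H$ are kept.

```python
import numpy as np

rng = np.random.default_rng(0)
K, Nt = 3, 4  # hypothetical sizes with K <= Nt

# Row k of H is h_k^H, the channel of single-antenna user k, as in (7).
H = (rng.standard_normal((K, Nt)) + 1j * rng.standard_normal((K, Nt))) / np.sqrt(2)

# QR of H^H gives H^H = Q1 R, hence H = R^H Q1^H = L Q with L lower triangular.
Q1, R = np.linalg.qr(H.conj().T)   # reduced mode: Q1 is Nt x K, R is K x K
L = R.conj().T                     # K x K lower triangular factor
V = Q1                             # column k is the beamforming vector v_k

# Entry (k, j) of the effective channel is h_k^H v_j = l_{k,j}.
effective = H @ V
print(np.round(np.abs(effective), 3))
```

The printed magnitudes are zero above the diagonal: user $k$ sees no interference from users $j > k$, which is condition (6), while the entries below the diagonal are the known interference that DPC or THP has to handle.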


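2.2. SZF-DPC method

The SZF-DPC method [7] generalizes the ZF-DPC method to the case where each user has multiple receive antennas. The received signal of the $k$th user in (1) is rewritten as

$$
\mathbf{y}_k = \mathbf{H}_k \mathbf{V}_k \sqrt{\mathbf{P}_k}\,\mathbf{x}_k + \mathbf{H}_k \Big( \sum_{j=1}^{k-1} \mathbf{V}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j \Big) + \mathbf{H}_k \Big( \sum_{j=k+1}^{K} \mathbf{V}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j \Big) + \mathbf{n}_k \quad (1 \le k \le K) \tag{9}
$$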

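In (9), the receive combining matrix is omitted. Like the ZF-DPC method, the SZF-DPC method combines DPC with zero-forcing: the non-causally known interference term $\mathbf{H}_k ( \sum_{j=1}^{k-1} \mathbf{V}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j )$ can be removed according to the DPC principle, and $\mathbf{H}_k ( \sum_{j=k+1}^{K} \mathbf{V}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j )$, which is denoted as the residual interference, is eliminated completely by transmit beamforming design, i.e.,

$$
\mathbf{H}_k \Big( \sum_{j=k+1}^{K} \mathbf{V}_j \sqrt{\mathbf{P}_j}\,\mathbf{x}_j \Big) = \mathbf{0} \tag{10}
$$

Define the matrix $\hat{\mathbf{H}}_k$ as

$$
\hat{\mathbf{H}}_k = \left[ \mathbf{H}_1^H \; \mathbf{H}_2^H \; \cdots \; \mathbf{H}_{k-1}^H \right]^H \in \mathbb{C}^{\left( \sum_{j=1}^{k-1} N_{r,j} \right) \times N_t} \tag{11}
$$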

The transmit beamforming matrix $\mathbf{V}_k$ must lie in the null space of $\hat{\mathbf{H}}_k$. In the literature, several matrix decomposition methods have been proposed to find $\mathbf{V}_k$, such as the SVD method in [7] and the SGO method in [8]. The downlink MU-MIMO broadcast channels can then be considered as parallel and interference free, and water-filling power allocation is the optimal solution. The SZF-DPC method also requires an exhaustive search over all user permutations to find the optimal solution. Moreover, the total number of receive antennas is restricted by the number of transmit antennas: a transmit beamforming matrix can be found for each served user only if the dimension of the null space of $\hat{\mathbf{H}}_k$ is larger than zero.
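The antenna restriction can be made concrete with a small numerical sketch. It assumes NumPy, a random channel with hypothetical sizes ($N_t = 6$, three users with two antennas each), and an SVD-based null-space routine (the helper name `null_space_basis` is an illustration, not the exact procedure of [7] or [8]).

```python
import numpy as np

def null_space_basis(M, tol=1e-10):
    """Orthonormal basis of {v : M v = 0}, taken from the right singular vectors of M."""
    _, s, Vh = np.linalg.svd(M)          # full SVD: Vh is Nt x Nt
    rank = int(np.sum(s > tol))
    return Vh[rank:].conj().T            # Nt x (Nt - rank)

rng = np.random.default_rng(1)
Nt, Nr = 6, 2                            # hypothetical sizes
H = [(rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
     for _ in range(3)]

k = 2                                    # third user (0-based), encoded after users 0 and 1
H_hat = np.vstack(H[:k])                 # stacked channels of the earlier users, as in (11)
B = null_space_basis(H_hat)              # candidate subspace for V_k

print(B.shape)                                   # dimension left for user k
print(null_space_basis(np.vstack(H)).shape)      # a hypothetical fourth user
```

With these sizes the third user still has a two-dimensional subspace available, but once the stacked channels of the earlier users have $N_t$ independent rows the null space is empty and no further user can be served, which is the limitation stated above.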


2.3. ZF-SA method

The ZF-SA method proposed in [12] uses the receive combining technique and extends MU-MIMO broadcast channels to a more general case. This method allocates data streams successively to the users: at each step, one data stream is assigned to the user who brings the largest increase of the global throughput. The first data stream is assigned to the user who has the largest data rate. For the $i$th ($2 \le i \le N_t$) data stream allocation, the receive combining vectors of the previously allocated data streams are assumed to be fixed, the original sum rate optimization problem is approximated by a generalized eigenvalue problem, and each possible receive combining vector $\mathbf{u}_{k(i)}$ for the $i$th data stream is calculated over all $K$ users. The details of the calculation of the receive combining vectors are given in [12]. In order to suppress the interference, each transmit beamforming vector $\mathbf{v}_{(l)}$ ($1 \le l \le i$) should be orthogonal to the row vectors $\mathbf{u}_{(j)}^H \mathbf{H}_{\pi(j)}$ ($\forall j \neq l$). For the $k$th ($1 \le k \le K$) candidate user, the matrix $\mathbf{H}_{(i)}^{k}$ is defined as

$$
\mathbf{H}_{(i)}^{k} = \begin{bmatrix} \mathbf{u}_{(1)}^H \mathbf{H}_{\pi(1)} \\ \vdots \\ \mathbf{u}_{(i-1)}^H \mathbf{H}_{\pi(i-1)} \\ \mathbf{u}_{k(i)}^H \mathbf{H}_k \end{bmatrix} \tag{12}
$$

where $\pi(l)$ ($1 \le l \le i-1$) indicates that the $l$th data stream is allocated to the $\pi(l)$th user. Therefore, the transmit beamforming vectors of the already allocated data streams can be obtained via the pseudo inverse of $\mathbf{H}_{(i)}^{k}$, denoted as $\mathbf{H}_{(i)}^{k\dagger}$. Finally, the $i$th data stream is assigned to the user who contributes the largest increase of the total throughput with the previously selected users. ZF-SA can remove the entire interference without any constraint on the number of receive antennas. However, the spatial multiplexing in MU-MIMO broadcast channels is not fully exploited in this way.

3. The proposed method

3.1. Problem statement

In this paper, we adopt the receive combining technique in MU-MIMO broadcast channels and consider the same general case as the ZF-SA method: the base station is equipped with $N_t$ transmit antennas and serves $K$ users, where the $k$th ($1 \le k \le K$) user has $N_{r,k}$ ($N_{r,k} \ge 1$) receive antennas. $\mathbf{H}_k$ denotes the channel between the base station and the $k$th user. The number of data streams available to the $k$th user is $L_k$. In order to eliminate the interference, the total number of transmitted data streams $\sum_{k=1}^{K} L_k = L$ should not exceed the number of transmit antennas. Since the transmit beamforming and receive combining vectors are determined separately for each data stream in the proposed method, the $l$th ($1 \le l \le L$) data stream after the MU-MIMO broadcast channel and the receive combining vector is, in general,

$$
y_{(l)} = \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)} \sqrt{p_{(l)}}\, x_{(l)} + \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \Big( \sum_{j=1}^{l-1} \mathbf{v}_{(j)} \sqrt{p_{(j)}}\, x_{(j)} \Big) + \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \Big( \sum_{j=l+1}^{L} \mathbf{v}_{(j)} \sqrt{p_{(j)}}\, x_{(j)} \Big) + \mathbf{u}_{(l)}^H \mathbf{n}_{\pi(l)} \tag{13}
$$

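As in the ZF-DPC method, if the DPC technique is used, the base station views the interference term $\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} ( \sum_{j=1}^{l-1} \mathbf{v}_{(j)} \sqrt{p_{(j)}}\, x_{(j)} )$ as known non-causally and can pre-subtract it before transmission; the details are given in subsection 3.3. Therefore, the sum capacity (4) can be reformulated as

$$
R_{DPC} = \max_{\{\hat{\mathbf{u}}_{(l)}, \hat{\mathbf{v}}_{(l)}, \hat{p}_{(l)}\}} \sum_{l=1}^{L} \log \left( 1 + \frac{\hat{p}_{(l)} \left| \hat{\mathbf{u}}_{(l)}^H \mathbf{H}_{\pi(l)} \hat{\mathbf{v}}_{(l)} \right|^2}{\sum_{j=l+1}^{L} \hat{p}_{(j)} \left| \hat{\mathbf{u}}_{(l)}^H \mathbf{H}_{\pi(l)} \hat{\mathbf{v}}_{(j)} \right|^2 + \sigma^2} \right) \tag{14}
$$

$$
\text{subject to } \|\hat{\mathbf{u}}_{(l)}\| = 1, \quad \|\hat{\mathbf{v}}_{(l)}\| = 1, \quad \sum_{l=1}^{L} \hat{p}_{(l)} = P_T, \quad 1 \le l \le L.
$$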

Since the maximum of (14) is difficult to obtain, we propose a suboptimal beamforming method in this paper. Instead of maximizing (14) directly, the proposed method removes the residual interference term $\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} ( \sum_{j=l+1}^{L} \mathbf{v}_{(j)} \sqrt{p_{(j)}}\, x_{(j)} )$ in (13) and (14) completely, and maximizes the sum rate in (14) under this constraint.


3.2. The proposed beamforming method

Finding the optimal user permutation requires an exhaustive search [6], which has high complexity. Even if the user order is preselected, data allocation is still a combinatorial optimization problem whose computational complexity is high, especially when the number of users is large. In the proposed method, data streams are allocated successively to the $K$ users. At each step, only one data stream is assigned, to the user who brings the largest increase of the global throughput. To simplify the presentation, we focus on beamforming design and ignore power allocation in this subsection. Note that under the zero-forcing constraint, water-filling power allocation is the optimal solution. The first data stream is assigned to the user who has the largest data rate:

$$
\{\mathbf{u}_{(1)}, \mathbf{v}_{(1)}, \pi(1)\} = \arg\max_{\{\hat{\mathbf{u}}_{(1)}, \hat{\mathbf{v}}_{(1)}, k\}} \log_2 \left( 1 + \frac{\left| \hat{\mathbf{u}}_{(1)}^H \mathbf{H}_k \hat{\mathbf{v}}_{(1)} \right|^2}{\sigma^2} \right) \tag{15}
$$

$$
\text{subject to } \|\hat{\mathbf{u}}_{(1)}\| = 1, \quad \|\hat{\mathbf{v}}_{(1)}\| = 1, \quad 1 \le k \le K.
$$

Clearly, (15) is maximized by choosing $\hat{\mathbf{u}}_{(1)}$ and $\hat{\mathbf{v}}_{(1)}$ as the left and right singular vectors corresponding to the largest singular value of the channel matrix $\mathbf{H}_{\pi(1)}$; we denote them as $\mathbf{u}_{(1)}$ and $\mathbf{v}_{(1)}$, respectively. For the design of the transmit beamforming and receive combining vectors of the $i$th ($2 \le i \le L \le N_t$) data stream, we suppose that the users already selected for the previously assigned $i-1$ data streams are fixed. The rate optimization of the $i$th data stream is performed with

$$
\{\mathbf{u}_{(i)}, \mathbf{v}_{(i)}, \pi(i)\} = \arg\max_{\{\hat{\mathbf{u}}_{(i)}, \hat{\mathbf{v}}_{(i)}, k\}} \log_2 \left( 1 + \frac{\left| \hat{\mathbf{u}}_{(i)}^H \mathbf{H}_k \hat{\mathbf{v}}_{(i)} \right|^2}{\sum_{j=1}^{i-1} \left| \hat{\mathbf{u}}_{(i)}^H \mathbf{H}_k \mathbf{v}_{(j)} \right|^2 + \sigma^2} \right) \tag{16}
$$

$$
\text{subject to } \|\hat{\mathbf{u}}_{(i)}\| = 1, \quad \|\hat{\mathbf{v}}_{(i)}\| = 1, \quad 1 \le k \le K.
$$

As discussed before, the pre-equalization THP technique, which is a practical implementation of DPC, can pre-subtract the interference caused by the previously precoded $i-1$ data streams. If it is adopted at the base station, the interference term $\sum_{j=1}^{i-1} | \hat{\mathbf{u}}_{(i)}^H \mathbf{H}_k \mathbf{v}_{(j)} |^2$ in (16) can be removed completely. Additionally, to prevent the $i$th data stream from interfering with the previously allocated $i-1$ data streams, we force $\mathbf{u}_{(j)}^H \mathbf{H}_{\pi(j)} \hat{\mathbf{v}}_{(i)}$ ($1 \le j < i$) to be zero. Under these two constraints, (16) can be rewritten as

$$
\{\mathbf{u}_{(i)}, \mathbf{v}_{(i)}, \pi(i)\} = \arg\max_{\{\hat{\mathbf{u}}_{(i)}, \hat{\mathbf{v}}_{(i)}, k\}} \log_2 \left( 1 + \frac{\left| \hat{\mathbf{u}}_{(i)}^H \mathbf{H}_k \hat{\mathbf{v}}_{(i)} \right|^2}{\sigma^2} \right) \tag{17}
$$

$$
\text{subject to } \|\hat{\mathbf{u}}_{(i)}\| = 1, \quad \|\hat{\mathbf{v}}_{(i)}\| = 1, \quad 1 \le k \le K, \quad \mathbf{u}_{(j)}^H \mathbf{H}_{\pi(j)} \hat{\mathbf{v}}_{(i)} = 0, \quad 1 \le j < i.
$$

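In (17), $\hat{\mathbf{v}}_{(i)}$ must lie in the null space of

$$
\mathbf{N}_{(i)} = \begin{bmatrix} \mathbf{u}_{(1)}^H \mathbf{H}_{\pi(1)} \\ \vdots \\ \mathbf{u}_{(i-1)}^H \mathbf{H}_{\pi(i-1)} \end{bmatrix} \in \mathbb{C}^{(i-1) \times N_t} \tag{18}
$$

Applying the LQ decomposition to the matrix $\mathbf{N}_{(i)}$ gives

$$
\mathbf{N}_{(i)} = \begin{bmatrix} \mathbf{L}_{(i-1) \times (i-1)} & \mathbf{0}_{(i-1) \times (N_t - i + 1)} \end{bmatrix} \begin{bmatrix} \mathbf{A}_1^H \\ \mathbf{A}_{(i)}^H \end{bmatrix} \tag{19}
$$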

where $\mathbf{L}_{(i-1) \times (i-1)}$ is a lower triangular matrix. The matrix $\mathbf{A}_{(i)} \in \mathbb{C}^{N_t \times (N_t - i + 1)}$ forms an orthonormal basis of the null space of $\mathbf{N}_{(i)}$, i.e., $\mathbf{N}_{(i)} \mathbf{A}_{(i)} = \mathbf{0}$, and satisfies $\mathbf{A}_{(i)}^H \mathbf{A}_{(i)} = \mathbf{I}$. In other words, $\mathbf{A}_{(i)}$ contains an orthonormal basis of the space in which the transmit beamforming vector $\hat{\mathbf{v}}_{(i)}$ of the $i$th data stream must lie at the $i$th step. The optimization problem (17) can then be rewritten as

$$
\{\mathbf{u}_{(i)}, \mathbf{f}_{(i)}, \pi(i)\} = \arg\max_{\{\hat{\mathbf{u}}_{(i)}, \hat{\mathbf{f}}_{(i)}, k\}} \log_2 \left( 1 + \frac{\left| \hat{\mathbf{u}}_{(i)}^H \mathbf{H}_k \mathbf{A}_{(i)} \hat{\mathbf{f}}_{(i)} \right|^2}{\sigma^2} \right) \tag{20}
$$

$$
\text{subject to } \|\hat{\mathbf{u}}_{(i)}\| = 1, \quad \|\hat{\mathbf{f}}_{(i)}\| = 1, \quad 1 \le k \le K.
$$

Clearly, the rate of the $i$th data stream is maximized by choosing $\hat{\mathbf{u}}_{(i)}$ and $\hat{\mathbf{f}}_{(i)}$ as the left and right singular vectors corresponding to the dominant singular value of the matrix $\mathbf{H}_{\pi(i)} \mathbf{A}_{(i)}$; we denote them as $\mathbf{u}_{(i)}$ and $\mathbf{f}_{(i)}$, respectively. The transmit beamforming vector is then $\mathbf{v}_{(i)} = \mathbf{A}_{(i)} \mathbf{f}_{(i)}$. It is easy to show that $\mathbf{v}_{(i)}$ and $\mathbf{u}_{(i)}$ are unit-norm vectors. The $i$th data stream is allocated to the user (among all the users), denoted by $\pi(i)$, who brings the largest increase of the total throughput with the previously selected users. The algorithm stops when there is no increase of the global throughput. To suppress the interference, the total number of allocated data streams $L$ should be less than or equal to the number of transmit antennas $N_t$ (i.e., $\sum_{k=1}^{K} L_k = L \le N_t$).

Since the interference term $\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} ( \sum_{j=l+1}^{L} \mathbf{v}_{(j)} \sqrt{p_{(j)}}\, x_{(j)} )$ in (13) is suppressed completely by the proposed beamforming method, the received signal (13) at the $\pi(l)$th user can be written as

$$
y_{(l)} = \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)} \sqrt{p_{(l)}}\, x_{(l)} + \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \Big( \sum_{j=1}^{l-1} \mathbf{v}_{(j)} \sqrt{p_{(j)}}\, x_{(j)} \Big) + \mathbf{u}_{(l)}^H \mathbf{n}_{\pi(l)} \quad (1 \le l \le L) \tag{21}
$$
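One allocation step, i.e. equations (18) to (20), can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it uses NumPy, obtains the LQ factors of $\mathbf{N}_{(i)}$ from a complete QR decomposition of $\mathbf{N}_{(i)}^H$, and the helper names `null_basis_via_lq` and `best_stream` as well as the channel sizes are hypothetical.

```python
import numpy as np

def null_basis_via_lq(N):
    """Columns form an orthonormal basis of {v : N v = 0}, i.e. A_(i) in (19)."""
    rows = N.shape[0]
    # Complete QR of N^H: N^H = Q R, hence N = R^H Q^H is an LQ decomposition.
    Q, _ = np.linalg.qr(N.conj().T, mode="complete")   # Q is Nt x Nt unitary
    return Q[:, rows:]                                  # Nt x (Nt - rows)

def best_stream(H_list, A):
    """Dominant singular triplet of H_k A over all users k, as in (20)."""
    best = None
    for k, Hk in enumerate(H_list):
        U, s, Vh = np.linalg.svd(Hk @ A)
        if best is None or s[0] > best[0]:
            # v = A f, with f the dominant right singular vector of H_k A.
            best = (s[0], k, U[:, 0], A @ Vh[0].conj())
    return best  # (gain, user index, u, v)

rng = np.random.default_rng(2)
Nt = 4
H_list = [(rng.standard_normal((2, Nt)) + 1j * rng.standard_normal((2, Nt))) / np.sqrt(2)
          for _ in range(2)]

g1, k1, u1, v1 = best_stream(H_list, np.eye(Nt))         # first stream, as in (15)
N2 = (u1.conj() @ H_list[k1])[None, :]                    # N_(2): the single row u1^H H_pi(1)
g2, k2, u2, v2 = best_stream(H_list, null_basis_via_lq(N2))
leak = abs(u1.conj() @ H_list[k1] @ v2)                   # stream 2 seen by stream 1
print(leak)
```

The leakage of the second stream into the first is zero up to rounding, whereas the first stream still interferes with the second; that remaining term is the known interference in (21).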

Equation (21) is exactly the problem described by (3). The pre-equalization THP technique can then be applied to pre-subtract the non-causally known interference term $\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} ( \sum_{j=1}^{l-1} \mathbf{v}_{(j)} \sqrt{p_{(j)}}\, x_{(j)} )$ at the base station. Therefore, the sum rate of the proposed method can be calculated as

$$
R_P = \sum_{l=1}^{L} \log \left( 1 + \frac{p_{(l)} \left| \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)} \right|^2}{\sigma^2} \right) \tag{22}
$$

The transmit power $p_{(l)}$ is optimized by the well-known water-filling algorithm.
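For reference, water-filling over the parallel streams in (22) can be sketched in a few lines of plain Python. The function name and the example gains are illustrative assumptions; each gain stands for $|\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)}|^2 / \sigma^2$ and is assumed to be strictly positive.

```python
def water_filling(gains, total_power):
    """Maximise sum(log(1 + p * g)) subject to sum(p) == total_power and p >= 0."""
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    active = len(order)
    while active > 0:
        # Water level if the 'active' strongest streams share the power.
        level = (total_power + sum(1.0 / gains[i] for i in order[:active])) / active
        if level - 1.0 / gains[order[active - 1]] > 0:
            break
        active -= 1  # the weakest active stream would get non-positive power
    powers = [0.0] * len(gains)
    for i in order[:active]:
        powers[i] = level - 1.0 / gains[i]
    return powers

powers = water_filling([4.0, 1.0, 0.1], total_power=1.0)
print(powers)
```

In this example the weakest stream receives no power, which corresponds to the situation in which adding a further data stream no longer increases the sum rate.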

3.3. Pre-equalization THP technique

We now introduce the pre-equalization THP technique, which pre-subtracts the non-causally known interference at the base station. We suppose that the transmit symbol is

$$
x_{(l)} = a_{(l)} - \sum_{j=1}^{l-1} b_{l,j}\, x_{(j)} \quad (1 \le l \le L) \tag{23}
$$

where $a_{(l)}$ is the information symbol and $b_{l,j}$ is defined as

$$
b_{l,j} = \frac{\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(j)} \sqrt{p_{(j)}}}{\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)} \sqrt{p_{(l)}}} \tag{24}
$$

Using (23) and (24) in (21), we have

$$
y_{(l)} = \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)} \sqrt{p_{(l)}}\, a_{(l)} + \mathbf{u}_{(l)}^H \mathbf{n}_{\pi(l)} \quad (1 \le l \le L) \tag{25}
$$
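The cancellation leading from (23) and (24) to (25) can be verified with a noise-free toy example. In the sketch below, the entry `C[l][j]` stands for the effective coefficient $\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(j)} \sqrt{p_{(j)}}$ (lower triangular after zero-forcing); the numerical values and function names are made up for illustration, and the modulo operation is left out.

```python
def precode(symbols, C):
    """Successive pre-subtraction (23) with b_{l,j} = C[l][j] / C[l][l] as in (24)."""
    x = []
    for l, a in enumerate(symbols):
        interference = sum(C[l][j] / C[l][l] * x[j] for j in range(l))
        x.append(a - interference)
    return x

def receive(x, C):
    """Noise-free version of (21): y_l = sum over j <= l of C[l][j] * x_j."""
    return [sum(C[l][j] * x[j] for j in range(l + 1)) for l in range(len(x))]

# Hypothetical effective channel coefficients (lower triangular).
C = [[2.0, 0.0, 0.0],
     [0.5 - 1.0j, 1.5j, 0.0],
     [-0.7, 0.3 + 0.2j, 1.1]]
a = [1 + 1j, -1 + 1j, 1 - 1j]

y = receive(precode(a, C), C)
residual = [abs(y[l] - C[l][l] * a[l]) for l in range(3)]
print(residual)
```

Each received value equals the direct gain times the information symbol, as in (25). Without a modulo operation, however, the precoded symbols can have larger magnitude than the information symbols, which is why THP adds the modulo device.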

The interference is thus pre-subtracted completely at the base station. Similarly to THP in [15], $\mathbf{b}_{(l)} = [b_{l,1}, \cdots, b_{l,l-1}, 0, \cdots, 0]^T$ and $\tilde{\mathbf{x}}_{(l)} = [x_{(1)}, \cdots, x_{(l-1)}, 0, \cdots, 0]^T$ are defined to pre-subtract the known interference term in (16) and (21), as shown in Figure 1. The channel symbols $x_{(l)}$ are successively generated from the information symbols $a_{(l)}$ in (23). If the information symbol $a_{(l)}$ is uniformly distributed in an $M$-ary QAM constellation, then $E\{\mathbf{a}\mathbf{a}^H\} = \mathbf{I}_L$ with $\mathbf{a} = [a_{(1)}, \cdots, a_{(L)}]^H$. The modulo device is used to ensure that the transmit power does not exceed the power constraint, and $E\{\mathbf{x}\mathbf{x}^H\} = \mathbf{I}_L$ is guaranteed [15].

Figure 1: Block diagram of the transmitter

Figure 2: Block diagram of the receiver

At the $\pi(l)$th user, the received signal after receive combining (25) is reformulated as

$$
y_{(l)} = \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)} \sqrt{p_{(l)}}\, (a_{(l)} + q_{(l)}) + \mathbf{u}_{(l)}^H \mathbf{n}_{\pi(l)} \quad (1 \le l \le L) \tag{26}
$$

where $q_{(l)} \in \{ 2\sqrt{M} \cdot (q_I + i q_Q) \mid q_I, q_Q \in \mathbb{Z} \}$ is introduced by the modulo operator at the transmitter. The signal $y_{(l)}$ is divided by $g_{(l)} = \mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(l)} \sqrt{p_{(l)}}$ before passing through the modulo operator, so that it satisfies the same constellation boundaries as at the transmitter. After the modulo device, the original data symbol is estimated as $\tilde{a}_{(l)}$, as shown in Figure 2. The whole process is summarized in Algorithm 1.
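The role of the modulo device in (26) can be illustrated with a noise-free sketch. It assumes an unnormalised 16-QAM constellation with odd-integer coordinates (so the scaling that gives $E\{\mathbf{a}\mathbf{a}^H\} = \mathbf{I}_L$ is dropped), a symmetric modulo with period $2\sqrt{M}$, and made-up effective coefficients `C[l][j]` standing for $\mathbf{u}_{(l)}^H \mathbf{H}_{\pi(l)} \mathbf{v}_{(j)} \sqrt{p_{(j)}}$; all function names are hypothetical.

```python
import math

def modulo(z, M):
    """Fold real and imaginary parts of z into [-sqrt(M), sqrt(M))."""
    t = 2 * math.sqrt(M)
    fold = lambda r: r - t * math.floor((r + t / 2) / t)
    return complex(fold(z.real), fold(z.imag))

def thp_precode(symbols, C, M):
    """Equation (23) followed by the transmitter-side modulo."""
    x = []
    for l, a in enumerate(symbols):
        interference = sum(C[l][j] / C[l][l] * x[j] for j in range(l))
        x.append(modulo(a - interference, M))
    return x

def thp_detect(y, C, M):
    """Divide by g_(l) = C[l][l], then apply the same modulo as the transmitter."""
    return [modulo(y[l] / C[l][l], M) for l in range(len(y))]

M = 16
C = [[1.0, 0.0, 0.0],
     [2.5 - 1.0j, 0.8, 0.0],
     [1.5j, -3.0, 1.2 + 0.3j]]
a = [3 + 1j, -1 - 3j, 1 + 3j]          # 16-QAM points with odd-integer coordinates

x = thp_precode(a, C, M)
y = [sum(C[l][j] * x[j] for j in range(l + 1)) for l in range(3)]   # noise-free (21)
a_hat = thp_detect(y, C, M)
print(a_hat)
```

The precoded symbols stay inside the square of side $2\sqrt{M}$ even when the interference is strong, and the receiver-side modulo removes the offset $q_{(l)}$, so the detected symbols match the information symbols up to rounding.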

Algorithm 1 The proposed method

Initialization: $L_k = 0$, $\forall k$
1: The proposed beamforming method
   Assign the first data stream to the user who has the largest data rate (15)
   $L_{\pi(1)} = 1$, $i = 2$
   while $i \le N_t$
      Calculate $\mathbf{A}_{(i)}$ (19)
      Calculate $\mathbf{v}_{(i)}$ and $\mathbf{u}_{(i)}$ (20)
      Power allocation by water-filling algorithm
      Calculate the temporary sum rate, denoted as $R_{(i)}$
      if $R_{(i)} > R_{(i-1)}$
         $L_{\pi(i)} = L_{\pi(i)} + 1$
         $i = i + 1$
      else
         break
      end
   end
2: Pre-equalization THP technique
   for $l = 1 : L$
      Calculate $\mathbf{b}_{(l)}$ according to (24)
      Calculate the transmit symbol $x_{(l)}$ according to (23)
   end
Find $L_k$, $\mathbf{V}_k$, $\mathbf{U}_k$, $\mathbf{x}_k$, $1 \le k \le K$
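Part 1 of Algorithm 1 can be sketched compactly in NumPy. The code below is an illustration under assumptions, not the authors' implementation: the function names and the random test channel are hypothetical, the null-space basis is taken from a complete QR decomposition of $\mathbf{N}_{(i)}^H$, and the candidate user is chosen by the largest dominant singular value, which is the maximizer of (20).

```python
import numpy as np

def water_filling(gains, total_power):
    """Water-filling powers for parallel streams with strictly positive gains."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(-gains)
    for active in range(len(gains), 0, -1):
        idx = order[:active]
        level = (total_power + np.sum(1.0 / gains[idx])) / active
        if level > 1.0 / gains[idx[-1]]:
            break
    powers = np.zeros_like(gains)
    powers[idx] = level - 1.0 / gains[idx]
    return powers

def greedy_allocation(H_list, total_power, noise_var=1.0):
    """Greedy stream allocation; returns [(user, u, v), ...], powers and sum rate."""
    Nt = H_list[0].shape[1]
    streams, gains, powers, best_rate = [], [], np.zeros(0), 0.0
    A = np.eye(Nt)                                   # first stream is unconstrained
    while len(streams) < Nt:
        # Dominant singular triplet of H_k A for every user; keep the strongest.
        cands = []
        for k, Hk in enumerate(H_list):
            U, s, Vh = np.linalg.svd(Hk @ A)
            cands.append((s[0], k, U[:, 0], A @ Vh[0].conj()))
        s0, k, u, v = max(cands, key=lambda c: c[0])
        trial_gains = gains + [s0 ** 2 / noise_var]
        trial_powers = water_filling(trial_gains, total_power)
        rate = float(np.sum(np.log2(1.0 + trial_powers * np.array(trial_gains))))
        if rate <= best_rate:
            break                                    # no throughput increase: stop
        streams.append((k, u, v))
        gains, powers, best_rate = trial_gains, trial_powers, rate
        # Rows u_(j)^H H_pi(j) form N; its null space hosts the next beamformer.
        N = np.array([uj.conj() @ H_list[kj] for kj, uj, _ in streams])
        Q, _ = np.linalg.qr(N.conj().T, mode="complete")
        A = Q[:, len(streams):]
    return streams, powers, best_rate

rng = np.random.default_rng(3)
Nt, Nr, K = 4, 2, 3        # six receive antennas in total, more than Nt
Hs = [(rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
      for _ in range(K)]
streams, powers, rate = greedy_allocation(Hs, total_power=100.0)
print([k for k, _, _ in streams], np.round(powers, 2), round(rate, 2))
```

The example deliberately uses more receive antennas in total than transmit antennas. The THP stage of Algorithm 1 would then be applied to the returned streams in their allocation order.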


4. Computational complexity analysis


Floating point operations (flops) can be used as a crude measure of the computational complexity of an algorithm [22], [23]. A real addition, multiplication or division is counted as one flop. A complex addition and a complex multiplication count as two flops and six flops, respectively [24]. In this section, we use flops to characterize the computational complexity of the ZF-SA method and the proposed method, and denote them as $\psi_s$ and $\psi_p$, respectively. To simplify the analysis, we consider the case where all users have the same number of antennas, i.e., $N_{r,k} = N_r$, $\forall k$.

The first step of the proposed method is to find the dominant singular value and the corresponding singular vectors of each channel matrix $\mathbf{H}_k$. In the literature, several fast approaches have been proposed to find the dominant singular vectors, such as computing the dominant eigenvectors, non-linear attacks, and iterative methods [24], [25]. In this paper, the low-rank incremental method [25] is adopted, which needs $6mnr$ flops for a rank-$r$ complex matrix of size $m \times n$. Thus, the total number of flops of the first step is $6KN_tN_r^2$. At the $l$th step ($2 \le l \le L$), the LQ decomposition is performed once, which needs $4(l-1)^2(3N_t - l + 1)$ flops [23]. Then the dominant singular value and the corresponding singular vectors of the matrix $\mathbf{H}_k \mathbf{A}_{(l)}$ are found, which needs approximately $6KN_tN_r^2$ flops. After the transmit beamforming and receive combining design, water-filling power allocation is performed $K$ times, which needs $2l^2 + 6l$ flops each time [22]. The vector $\mathbf{b}_{(l)}$ is obtained for pre-subtracting the non-causally known interference, which needs $(L^2 + L - 2)(4N_tN_r + 3N_r)$ flops. Therefore, the complexity of the proposed method is approximated as

\begin{equation}
\tag{27}
\begin{aligned}
\psi_p \approx{} & 6KLN_tN_r^2 + K(L-1)\left(8KN_t^2N_r + 6KN_r^2 + 6KN_tN_r - 2KN_r + 6KN_r^2\right) \\
& + \sum_{l=2}^{L}\left(-4l^3 + (12N_t + 12)\,l^2 - \left(6KN_r^2 + 8KN_tN_r - 2KN_r + 24N_t + 8\right) l\right)
\end{aligned}
\end{equation}
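The per-operation counts quoted in this section can be collected into a few small helpers, which makes it easier to check individual terms by hand. The following is only a sketch: the function names are hypothetical, and reading the $6mnr$ count with $m = N_r$, $n = N_t$ and $r = N_r$ is our interpretation of the quoted $6KN_tN_r^2$ total rather than something stated explicitly in the text.

```python
# Flop-counting convention used in the text: each real operation is 1 flop.
COMPLEX_ADD_FLOPS = 2  # two real additions
COMPLEX_MUL_FLOPS = 6  # four real multiplications plus two real additions


def svd_step_flops(K: int, Nt: int, Nr: int) -> int:
    """Dominant singular triplets of K channel matrices of size Nr x Nt.

    Uses the quoted 6*m*n*r count with m = Nr, n = Nt, r = Nr
    (assumption: the rank r equals Nr, which reproduces 6*K*Nt*Nr^2).
    """
    m, n, r = Nr, Nt, Nr
    return K * 6 * m * n * r


def lq_step_flops(l: int, Nt: int) -> int:
    """LQ decomposition at step l >= 2: 4 (l-1)^2 (3 Nt - l + 1)."""
    return 4 * (l - 1) ** 2 * (3 * Nt - l + 1)


def waterfilling_flops(l: int) -> int:
    """One water-filling run at step l: 2 l^2 + 6 l."""
    return 2 * l * l + 6 * l


def thp_flops(L: int, Nt: int, Nr: int) -> int:
    """Pre-subtraction vector: (L^2 + L - 2)(4 Nt Nr + 3 Nr)."""
    return (L * L + L - 2) * (4 * Nt * Nr + 3 * Nr)
```

For example, with $K = 20$, $N_t = 8$ and $N_r = 2$, `svd_step_flops(20, 8, 2)` evaluates $6 \cdot 20 \cdot 8 \cdot 2^2 = 3840$ flops for one pass over all users.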

Figure 3: Complexity comparison of the ZF-SA method and the proposed method, $N_t = 8$ and $N_{r,k} = 2$, $\forall k$. (Axes: number of users versus number of flops, $\times 10^6$.)

Now consider the ZF-SA method. Its first step requires the same number of flops as the first step of the proposed method. From the $l$th step ($2 \le l \le L$) onward, the receive combining vectors are obtained by applying the generalized eigenvalue technique over all $K$ users. Forming the required matrix pair of each user needs $16N_t^3 + (20N_r + 3)N_t^2 + (16N_t - 4)N_r^2 - 4N_r + N_t + (4N_t - 1)l^2 - (4N_t + 1)l$ flops, and the dominant generalized eigenvalue and the corresponding generalized eigenvector are then found by the QZ algorithm, which needs $90N_r^3$ flops [26]. In addition, the pseudo-inverse of the composite channel matrix is used to find the transmit beamforming vectors, at a cost of approximately $24lN_t^2 + 48l^2N_t + 54l^2$ flops [22]. At each step, the water-filling algorithm is also run $K$ times, each run requiring $2l^2 + 6l$ flops [22]. Therefore, the total number of flops of the ZF-SA method is approximated as

\begin{equation}
\tag{28}
\begin{aligned}
\psi_s \approx{} & 6KN_tN_r^2 + K(L-1)\left(16N_t^3 + 90N_r^3 + 20N_t^2N_r + 16N_r^2N_t + 3N_t^2 - 4N_r^2\right) \\
& + \sum_{l=2}^{L}\left(52N_t l^2 + 55 l^2 + 24 N_t l\right)
\end{aligned}
\end{equation}

In Fig. 3, we plot the number of flops of the ZF-SA method and the proposed method versus the number of users, where $N_t = 8$, $N_r = 2$, and the total number of available data streams $L$ is assumed to be equal to $N_t$. Fig. 3 shows that the proposed method has substantially lower computational complexity than the ZF-SA method.
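Both complexity counts in this section include a water-filling power allocation step. For readers unfamiliar with it, the following is a minimal sketch of the textbook water-filling rule $p_i = \max(0, \mu - 1/g_i)$ over parallel channels with gains $g_i$ and a total power budget. It is not taken from the paper: the function name, the gain normalization (unit noise variance) and the sorting-based search for the water level $\mu$ are our assumptions.

```python
def water_filling(gains, total_power):
    """Textbook water-filling: p_i = max(0, mu - 1/g_i), sum(p_i) = total_power.

    Assumes every gain is positive and total_power > 0 (illustrative sketch).
    """
    # Visit channels from strongest to weakest, remembering original positions.
    order = sorted(range(len(gains)), key=lambda i: -gains[i])
    floors = [1.0 / gains[i] for i in order]  # "floor heights" 1/g_i, ascending

    powers = [0.0] * len(gains)
    for n in range(len(floors), 0, -1):  # try the n strongest channels
        mu = (total_power + sum(floors[:n])) / n  # candidate water level
        if mu > floors[n - 1]:  # weakest active channel still gets power
            for rank in range(n):
                powers[order[rank]] = mu - floors[rank]
            break
    return powers
```

With gains `[0.1, 2.0, 1.0]` and unit total power, the weakest channel is switched off and the remaining power is split as 0.75 and 0.25 between the two stronger channels.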


5. Simulation results


In this section, simulation results show that the proposed method outperforms the methods in the literature. For simplicity, we assume that each user has the same number of receive antennas.

We first consider MU-MIMO broadcast channels with $N_t = 4$ transmit antennas at the base station and $K = 2$ users under a natural order, where each user is equipped with two receive antennas (i.e., $N_{r,k} = 2$, $\forall k$). Fig. 4 illustrates the sum capacity achieved by the DPC method and the sum rates of the SZF-DPC method, the ZF-SA method and the proposed method. The performance of the proposed method is very close to that of the DPC method, and much better than that of the linear ZF-SA method. The reason is that the beamforming vectors in the proposed method are designed to suppress only part of the interference, so they have many more degrees of freedom than in the ZF-SA method, where the beamforming vectors must remove all of the interference. Because the pre-equalization THP technique at the transmitter pre-cancels the remaining interference, the interference is still removed completely in the proposed method. Thus, the beamforming vectors contribute a larger gain. Fig. 4 also shows that the SZF-DPC method approaches the sum capacity in the high SNR region. In the low SNR region, however, the SZF-DPC method suffers a performance degradation compared with the ZF-SA method.

Next, we consider two different MU-MIMO configurations to evaluate the performance of the proposed method. The base station is first assumed to have four transmit antennas and to serve four users, each equipped with two receive antennas. Since the total number of receive antennas is greater than the number of transmit antennas, SZF-DPC does not work here. In Fig. 5, we show the sum capacity achieved by the DPC method and the average sum rates of the ZF-SA method and the proposed method over $1 \times 10^4$ independent complex Gaussian fading channel realizations. It can be observed that the proposed method outperforms the ZF-SA method and is very close to the DPC method. Fig. 6 illustrates the cumulative distribution function (CDF) of the achievable sum rate of each method at SNR = 10 dB; the performance improvement is also evident.

Finally, a different configuration is simulated. The base station is equipped with eight antennas, and four users are equipped with four receive antennas each. Fig. 7 shows the sum capacity achieved by the DPC method and the average sum rates of the ZF-SA method and the proposed method, and Fig. 8 illustrates the CDF of the achievable sum rate of each method at SNR = 10 dB. Similar conclusions can be drawn as before.
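The Monte Carlo procedure behind such average-rate and CDF curves can be sketched as follows. This is an illustration only and does not implement DPC, ZF-SA or the proposed method: as a stand-in metric it uses the rate of a single stream sent at full power on the largest dominant singular value among the users. The function names, the SNR definition (total transmit power over unit noise variance) and the seed are all assumptions.

```python
import numpy as np


def first_stream_rates(n_trials, K, Nr, Nt, snr_db, seed=0):
    """Single-stream rate on the best user's dominant eigenmode, per trial.

    Stand-in metric for illustration, not the sum rate of any method
    compared in this paper.
    """
    rng = np.random.default_rng(seed)
    shape = (n_trials, K, Nr, Nt)
    # i.i.d. CN(0, 1) entries: real and imaginary parts have variance 1/2 each.
    H = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    # Singular values per user, returned in descending order.
    sigma = np.linalg.svd(H, compute_uv=False)  # shape (n_trials, K, min(Nr, Nt))
    sigma_max = sigma[..., 0].max(axis=1)       # best dominant singular value
    snr = 10.0 ** (snr_db / 10.0)
    return np.log2(1.0 + snr * sigma_max ** 2)


def empirical_cdf(samples):
    """Return sorted samples x and F(x), the fraction of samples <= x."""
    x = np.sort(samples)
    F = np.arange(1, x.size + 1) / x.size
    return x, F


# Setting of Figs. 5 and 6: Nt = 4, K = 4 users with Nr = 2, SNR = 10 dB.
rates = first_stream_rates(10_000, K=4, Nr=2, Nt=4, snr_db=10.0)
x, F = empirical_cdf(rates)
average_rate = rates.mean()
```

Plotting `F` against `x` gives an empirical CDF in the style of Figs. 6 and 8, and averaging `rates` over SNR values gives curves in the style of Figs. 5 and 7, once the stand-in metric is replaced by the sum rate of the method under study.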


Figure 4: Sum rate comparison of the DPC method, the SZF-DPC method, the ZF-SA method and the proposed method, $N_t = 4$, $N_{r,k} = 2$, $\forall k$, and $K = 2$. (Axes: SNR [dB] versus sum rate [bps/Hz].)

Figure 5: Sum rate comparison of the DPC method, the ZF-SA method and the proposed method, $N_t = 4$, $N_{r,k} = 2$, $\forall k$, and $K = 4$. (Axes: SNR [dB] versus sum rate [bps/Hz].)


Figure 6: Cumulative distribution function (CDF) of the achievable sum rate of the DPC method, the ZF-SA method and the proposed method, SNR = 10 dB, $N_t = 4$, $N_{r,k} = 2$, $\forall k$, and $K = 4$. (Axes: sum rate [bps/Hz] versus CDF.)


6. Conclusion


In this paper, a novel transmit beamforming and receive combining method for MU-MIMO broadcast channels is proposed to optimize the global throughput. One data stream is allocated at each step, and the corresponding transmit beamforming and receive combining vectors are designed to remove part of the interference. The residual interference, which is non-causally known at the base station, is pre-subtracted through the pre-equalization THP technique. Compared with the linear successive beamforming method, a substantial sum-rate improvement is achieved. In addition, the proposed method is attractive in practice thanks to its low computational complexity. Simulation results validate this technique.
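The pre-subtraction idea summarized above can be illustrated with a scalar, real-valued, noise-free example of the modulo operation at the core of Tomlinson-Harashima precoding [13], [14]. This is a minimal sketch under our own assumptions, not the precoder of this paper: the function names, the 4-PAM alphabet $\{-3, -1, 1, 3\}$ and the modulo base $\tau = 8$ are illustrative choices.

```python
import math


def thp_mod(x: float, tau: float) -> float:
    """Fold x into the interval [-tau/2, tau/2)."""
    return x - tau * math.floor(x / tau + 0.5)


def thp_precode(symbol: float, interference: float, tau: float) -> float:
    """Pre-subtract the known interference, then fold to bound the power."""
    return thp_mod(symbol - interference, tau)


def thp_receive(y: float, tau: float) -> float:
    """The receiver applies the same modulo to undo the folding."""
    return thp_mod(y, tau)


# Illustrative values: 4-PAM symbol, interference known at the transmitter.
tau = 8.0
symbol, interference = -3.0, 6.2
x = thp_precode(symbol, interference, tau)  # stays within [-4, 4)
y = x + interference                        # channel adds the interference back
recovered = thp_receive(y, tau)             # close to -3.0 up to rounding
```

The transmitted value stays within $[-\tau/2, \tau/2)$ regardless of how large the interference is, which is why the modulo step is preferred over plain subtraction. For complex symbols, the same folding would be applied to the real and imaginary parts separately.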


7. Acknowledgement

The authors would like to thank the anonymous reviewers, whose careful consideration of the manuscript improved the presentation greatly. This study was supported by the Natural Science Foundation of Gansu (1310RJZA050).


Figure 7: Sum rate comparison of the DPC method, the ZF-SA method and the proposed method, $N_t = 8$, $N_{r,k} = 4$, $\forall k$, and $K = 4$. (Axes: SNR [dB] versus sum rate [bps/Hz].)

References

[1] Vishwanath, S., Jindal, N., Goldsmith, A.. Duality, achievable rates, and sum-rate capacity of Gaussian MIMO broadcast channels. IEEE Trans Inf Theory 2003;49(10):2658–2668.

[2] Jindal, N., Rhee, W., Vishwanath, S., Jafar, S.A., Goldsmith, A.. Sum power iterative water-filling for multi-antenna Gaussian broadcast channels. IEEE Trans Inf Theory 2005;51(4):1570–1580.

[3] Peel, C.B., Hochwald, B.M., Swindlehurst, A.L.. A vector-perturbation technique for near-capacity multiantenna multiuser communication-part I: channel inversion and regularization. IEEE Trans Commun 2005;53(1):195–202.

[4] Spencer, Q.H., Swindlehurst, A.L., Haardt, M.. Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels. IEEE Trans Signal Process 2004;52(2):461–471.

[5] Yang, Y.H., Lin, S.C., Su, H.J.. Multiuser MIMO downlink beamforming design based on group maximum SINR filtering. IEEE Trans Signal Process 2011;59(4):1746–1758.


Figure 8: Cumulative distribution function (CDF) of the achievable sum rate of the DPC method, the ZF-SA method and the proposed method, SNR = 10 dB, $N_t = 8$, $N_{r,k} = 4$, $\forall k$, and $K = 4$. (Axes: sum rate [bps/Hz] versus CDF.)

[6] Caire, G., Shamai, S.. On the achievable throughput of a multiantenna Gaussian broadcast channel. IEEE Trans Inf Theory 2003;49(7):1691–1706.

[7] Dabbagh, A.D., Love, D.J.. Precoding for multiple antenna Gaussian broadcast channels with successive zero-forcing. IEEE Trans Signal Process 2007;55(7):3837–3850.

[8] Tran, L.N., Hong, E.K.. Multiuser diversity for successive zero-forcing dirty paper coding: Greedy scheduling algorithms and asymptotic performance analysis. IEEE Trans Signal Process 2010;58(6):3411–3416.

[9] Liu, J., Krzymien, W.A.. A novel nonlinear joint transmitter-receiver processing algorithm for the downlink of multiuser MIMO systems. IEEE Trans Veh Technol 2008;57(4):2189–2204.

[10] Li, X., Bai, B.. ZF-THP combined with receive beamforming for multi-user MIMO downlinks. In: 2009 1st International Conference on Information Science and Engineering (ICISE). 2009, p. 2779–2782.

[11] Gaur, S., Acharya, J., Gao, L.. Enhancing ZF-DPC performance with receiver processing. IEEE Trans Wireless Commun 2011;10(12):4052–4056.

[12] Guthy, C., Utschick, W., Hunger, R., Joham, M.. Efficient weighted sum rate maximization with linear precoding. IEEE Trans Signal Process 2010;58(4):2284–2297.

[13] Tomlinson, M.. New automatic equaliser employing modulo arithmetic. Electronics Letters 1971;7(5):138–139.

[14] Harashima, H., Miyakawa, H.. Matched-transmission technique for channels with intersymbol interference. IEEE Trans Commun 1972;20(4):774–780.

[15] Windpassinger, C., Fischer, R.F.H., Vencel, T., Huber, J.B.. Precoding in multiantenna and multiuser communications. IEEE Trans Wireless Commun 2004;3(4):1305–1316.

[16] Schubert, M., Shi, S.. MMSE transmit optimization with interference pre-compensation. In: Proc. IEEE 61st Vehicular Technology Conference; vol. 2. 2005, p. 845–849.

[17] Zu, K., de Lamare, R.C., Haardt, M.. Multi-branch Tomlinson-Harashima precoding design for MU-MIMO systems: Theory and algorithms. IEEE Trans Commun 2014;62(3):939–951.

[18] Shenouda, M.B., Davidson, T.N.. Tomlinson-Harashima precoding for broadcast channels with uncertainty. IEEE J Sel Areas Commun 2007;25(7):1380–1389.

[19] Huang, M., Zhou, S., Wang, J.. Analysis of Tomlinson-Harashima precoding in multiuser MIMO systems with imperfect channel state information. IEEE Trans Veh Technol 2008;57(5):2856–2867.

[20] Costa, M.. Writing on dirty paper. IEEE Trans Inf Theory 1983;29(3):439–441.

[21] Yu, W.. Sum-capacity computation for the Gaussian vector broadcast channel via dual decomposition. IEEE Trans Inf Theory 2006;52(2):754–759.


[22] Shen, Z., Chen, R., Andrews, J.G., Heath, R.W., Evans, B.L.. Low complexity user selection algorithms for multiuser MIMO systems with block diagonalization. IEEE Trans Signal Process 2006;54(9):3658–3663.

[23] Tran, L.N., Bengtsson, M., Ottersten, B.. Iterative precoder design and user scheduling for block-diagonalized systems. IEEE Trans Signal Process 2012;60(7):3726–3739.

[24] Golub, G.H., Van Loan, C.F.. Matrix computations. 3rd ed.; Baltimore, MD: The Johns Hopkins Univ. Press; 1996.

[25] Baker, C.G., Gallivan, K.A., Van Dooren, P.. Low-rank incremental methods for computing dominant singular subspaces. Linear Algebra and its Applications 2012;436:2866–2888.

[26] Haley, S.B.. The generalized eigenproblem: pole-zero computation. Proceedings of the IEEE 1988;76(2):103–120.
