Optimal spectrum access algorithm based on POMDP in cognitive networks


Int. J. Electron. Commun. (AEÜ) 69 (2015) 943–949 Contents lists available at ScienceDirect International Journal of Electronics and Communications ...



International Journal of Electronics and Communications (AEÜ) journal homepage: www.elsevier.com/locate/aeue

Shibing Zhang ∗, Huijian Wang, Xiaoge Zhang, Zhanghua Cao

School of Electronics and Information, Nantong University, Nantong 226019, China

Article info

Article history: Received 10 November 2014; accepted 24 February 2015

Keywords: Cognitive networks; Spectrum access; Access strategy; Throughput; POMDP

Abstract

Based on the partially observable Markov decision process (POMDP), this paper investigates spectrum access in cognitive networks and proposes an optimal spectrum access algorithm that maximizes the throughput. A suboptimal solution to the optimal spectrum access strategy is obtained with a greedy algorithm. To achieve optimal channel access, the channel occupancy state of the authorized users is estimated from the historical spectrum sensing information. Simulation results show that the proposed algorithm gives the secondary users a higher throughput and the network a higher spectrum utilization. Compared with random access and the optimal cognitive MAC protocol, the optimal spectrum access algorithm improves the throughput by 25–40% and 8–12%, respectively. © 2015 Elsevier GmbH. All rights reserved.

1. Introduction

With the rapid development of wireless communication networks, the demand for spectrum resources keeps increasing. As a result, the conflict between the growing demand of wireless traffic and the scarcity of available spectrum is becoming more and more severe [1]. Cognitive radio (CR) is an intelligent spectrum sharing technology that may be able to resolve this conflict [2]. CR allows cognitive users to access idle licensed bands dynamically, so as to reuse spectrum resources and improve spectrum utilization efficiency. In cognitive radio networks, cognitive users first sense the surrounding network. When they find an idle channel (spectrum hole), they adjust their transceiver frequencies and other parameters to access the idle channel and transmit data [3]. At the same time, the cognitive users must constantly monitor the channel states. If an authorized user (primary user) begins to use the channel, the cognitive users should immediately withdraw from the band so that the primary user is not interfered with. That is to say, cognitive users should access cognitive radio networks opportunistically [4].

∗ Corresponding author. Tel.: +86 51385012630. E-mail address: [email protected] (S. Zhang). http://dx.doi.org/10.1016/j.aeue.2015.02.015 1434-8411/© 2015 Elsevier GmbH. All rights reserved.

In the opportunistic spectrum access scheme, spectrum sensing and spectrum access depend on each other rather than being isolated. Therefore, it is a challenge to design an effective protocol [5]. In fact, the sensing time strongly affects the throughput of cognitive networks, so spectrum sensing and access can be combined when designing the access protocol. In [6], the relationship between the spectrum sensing time, the detection probability and the throughput of the cognitive network is presented, and it is described how to set the optimal sensing time to maximize the throughput. Graph theory can also be used to balance sensing and access so as to maximize the throughput. It is demonstrated that, for a given network density, selecting the most suitable channels to sense effectively improves the throughput of cognitive networks [7]. In order to maximize the average spectrum utilization of cognitive users, an access strategy that jointly optimizes the spectrum sensing time and the detection period is given in [8]. Optimizing the sensing time and the power allocation of the cognitive user can also improve the spectrum sensing capability and the throughput [9–11]. Moreover, adaptive sensing and power allocation methods can mitigate the effect of a long sensing cycle on spectrum utilization [12,13].

When designing the media access control (MAC) protocol in cognitive networks, we should consider the spectrum sensing errors in the physical layer, the constraint on the interference that cognitive users cause to authorized users, and the channel statistics of the authorized channels. All these can be modeled as a partially observable Markov decision process (POMDP) [14,15]. For energy-limited cognitive users, it may be better to transmit the data in a burst mode. In this case, the states of the buffers of the cognitive users can be used as parameters of the POMDP when designing the static optimal sensing and access [16]. An opportunistic spectrum access protocol combining the hidden Markov model (HMM) with POMDP is designed in [17]. In this protocol, the cognitive users estimate the state of the authorized channel in the next slot by means of the HMM and access the network by use of the POMDP without prior spectrum sensing information, which makes the channel access strategy more practical. Q-learning makes full use of past observations and decision experience to optimize the current action. Therefore, it is unnecessary to transform the POMDP into a Markov decision process (MDP) when Q-learning is applied to design the cognitive MAC protocol based on POMDP [18].

When the spectrum sensing strategy of the cognitive user is modeled as a POMDP, it is usually supposed that the channel state transition probabilities are known [19,20]. However, the primary user uses the authorized channel whenever necessary. As a result, the states of the channels occupied by the primary user change from moment to moment and are unknown in practical situations. Moreover, spectrum sensing may be imperfect due to the dynamic changes of the multipath and shadow fading channel.

In this paper, we address opportunistic spectrum access in cognitive networks and propose an optimal spectrum access algorithm to maximize the throughput. The rest of this paper is organized as follows. Section 2 describes the system model. Section 3 presents the optimal spectrum access algorithm. Section 4 shows how to estimate the state transition probabilities. Simulation results are discussed in Section 5. Conclusions are stated in Section 6.

2. System model

Consider a cognitive network with N independent channels, each with bandwidth $B_n$ ($n = 1, 2, \ldots, N$). The primary user carries out data transmission synchronously [21]. In each slot, the state of every channel is denoted by "0" (busy) or "1" (idle). The busy state indicates that the channel is occupied by the primary user in the current slot, and the idle state indicates that the channel may be used by the cognitive user. The transition of each channel state can be modeled as a discrete-time Markov process, as shown in Fig. 1, where $P_{01}$ and $P_{10}$ are the one-step transition probabilities from "0" to "1" and from "1" to "0", respectively. The N channels jointly have $2^N$ states. For convenience, we assume that the statistical characteristics of the primary user spectrum remain unchanged over T slots.

Fig. 1. Markov channel state model.

Let $P_{1,n}$, $n = 1, 2, \ldots, N$, denote the probability that the nth channel is available for the cognitive user during a slot. From Fig. 1, $P_{1,n}$ consists of two parts: one is transferred from the "busy" state (whose probability is $1 - P_{1,n}$) with state transition probability $P_{01}$; the other is transferred from the "idle" state (whose probability is $P_{1,n}$) with state transition probability $1 - P_{10}$. Therefore, the available probability of the nth channel satisfies

$$P_{1,n} = (1 - P_{1,n})P_{01} + P_{1,n}(1 - P_{10}). \tag{1}$$

Solving (1) for $P_{1,n}$, the available probability of the nth channel in the slot is

$$P_{1,n} = \frac{P_{01}}{P_{01} + P_{10}}. \tag{2}$$

Similarly, we can obtain the unavailable probability of the nth channel in the slot

$$P_{0,n} = \frac{P_{10}}{P_{01} + P_{10}}. \tag{3}$$
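As a quick numerical check of (1) and (2), the following minimal Python sketch iterates the balance relation (1) from an arbitrary starting guess and compares the result with the closed form (2). The function name, the starting value 0.5 and the example probabilities are illustrative assumptions, not values from the paper.

```python
def stationary_idle_probability(p01, p10, tol=1e-12, max_iter=10_000):
    """Iterate eq. (1) until the value stops changing and return the fixed point."""
    p = 0.5  # arbitrary starting guess (assumption)
    for _ in range(max_iter):
        nxt = (1.0 - p) * p01 + p * (1.0 - p10)  # right-hand side of eq. (1)
        if abs(nxt - p) < tol:
            return nxt
        p = nxt
    return p


# Example transition probabilities chosen only for illustration.
p01, p10 = 0.2, 0.1
p_idle = stationary_idle_probability(p01, p10)
closed_form = p01 / (p01 + p10)  # eq. (2)
p_busy = 1.0 - p_idle            # should agree with eq. (3)
```

With these example values the iteration settles at the value given by (2), and the complementary probability agrees with (3).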

Assume that we observe the channel states for T slots. Due to energy consumption and hardware limitations, the cognitive user cannot sense all N channels in each slot; it chooses only $N_1$ channels to sense. After sensing, the cognitive user chooses $N_2$ channels to access according to the spectrum sensing results. Obviously, $N_2 \le N_1 \le N$. Since the transition of each channel state is modeled as a discrete-time Markov process and only a limited number of channels can be observed by the cognitive user in each slot, spectrum sensing and access can be modeled as a partially observable Markov decision process (POMDP). The parameters of the POMDP are as follows.

(1) Slots T, each lasting $t_0$.
(2) Set of all available states of the channels, i.e. confidence vector $\Lambda(t) \in \{1, \ldots, M\}$, where $M = 2^N$ is the total number of available channel states.
(3) Channel state transition probability $p_{l,m}$ from state "l" to state "m" during one slot, $l, m = 0, 1, \ldots, M - 1$.
(4) Sensing action $A_1$ and access action $A_2$.
(5) Sensing result $\Theta_{m,n}$, which is the sensing result of channel n when the state is m, $m \in \{0, \ldots, M - 1\}$. The value of $\Theta_{m,n}$ is denoted by $\theta$, $\theta \in \{1, 0\}$.
(6) Optimal channel selection strategy $\pi_m$, $m \in \{0, \ldots, M - 1\}$.
(7) Access throughput $C_{m,n}$, $m \in \{0, \ldots, M - 1\}$, $n \in \{1, \ldots, N\}$.

Therefore, the optimal access strategy based on POMDP is to choose $N_1$ channels from the N channels to sense and $N_2$ channels from the $N_1$ sensed channels to access. In other words, the optimal access strategy is to find the most suitable combination $\{A_1, A_2\}$ that maximizes the total expected throughput over the T observed slots.

3. Optimal spectrum access algorithm

3.1. Sufficient statistic of POMDP

In the cognitive network, the cognitive user can sense only part of the channels. It is therefore impossible to observe the whole Markov process in all slots.
The cognitive user can only estimate the channel states from the historical observations and the present observation results of part of the channels. Let $\Lambda(t) = [\lambda_1(t), \lambda_2(t), \ldots, \lambda_M(t)]$ denote the channel states in slot t, $1 \le t \le T$, where each element $\lambda_m(t)$ of the confidence vector $\Lambda(t)$ is estimated from the historical and present observations. At the beginning of slot t, cognitive user k chooses $N_1$ channels to sense (sensing action $A_1$) and $N_2$ channels to access (access action $A_2$). The confidence vector $\Lambda(t)$ is a sufficient statistic for the optimal decision $\{A_1, A_2\}$ [22]. Therefore, finding the optimal decision $\{A_1, A_2\}$ can be transformed into an optimization over the confidence vector $\Lambda(t)$. The task of cognitive user k is thus to find the strategy that maps $\Lambda(t)$ to $\{A_1, A_2\}$, i.e.

$$\pi(t) = [\pi_1(t), \pi_2(t), \ldots, \pi_T(t)], \tag{4}$$

where $\pi(t): \Lambda(t) \in [0, 1] \rightarrow \{A_1(t), A_2(t)\}$.

3.2. Optimal access strategy

At the beginning of each slot, cognitive user k chooses the optimal channels to sense according to its own spectrum access strategy. If the sensed channel is available, cognitive user k accesses it; if the channel is unavailable, the cognitive user does not access it in this slot. The channel capacity (profit function) achieved when cognitive user k accesses channel n is defined as

$$b_{k,n} = B_n \log_2\left(1 + K_{e_k}\frac{P_{k,n}|H_{k,n}|^2}{B_n \sigma^2}\right), \tag{5}$$

where $P_{k,n}$ is the transmitted power of cognitive user k over channel n, $H_{k,n}$ is the channel gain of channel n to cognitive user k, $\sigma^2$ is the additive white Gaussian noise power per unit bandwidth, and $K_{e_k}$ is the bit error rate required by cognitive user k (generally, a constant). The maximum throughput achieved when cognitive user k accesses channel n in channel state m can be expressed as

$$C_{m,n} = S_n(t)\,b_{k,n} = S_n(t)\,B_n \log_2\left(1 + K_{e_k}\frac{P_{k,n}|H_{k,n}|^2}{B_n \sigma^2}\right), \tag{6}$$

where $S_n(t)$ denotes the state of channel n in slot t, $S_n(t) \in \{0, 1\}$. Optimizing the access decision $\{A_1, A_2\}$ means finding the most suitable channel to access so as to maximize the expected total throughput over the T slots. Thus, the optimal access strategy can be expressed as follows [15]

$$\pi^* = \arg\max_{\pi} E_{\pi}\left[\sum_{t=1}^{T} C_{m,n}(t) \,\Big|\, \Lambda(1)\right], \tag{7}$$

where $\Lambda(1)$ is the initial confidence vector of the T slots. Spectrum access based on POMDP updates the access strategy and makes the optimal access decision according to $\Lambda(t)$ in each slot. Note that $\Lambda(t)$ is the present confidence vector. Hence, the largest cumulative throughput, i.e. the maximum expected throughput from the present slot t over all remaining slots, is obtained from

$$V_t(\Lambda(t)) = \max_{n=1,\ldots,N} E[V_t(\Lambda(t))] = \max_{n=1,\ldots,N} E\left\{\sum_{m=1}^{M}\lambda_m(t)\sum_{l=1}^{M} p_{m,l}\sum_{\theta=0}^{1}\Pr[\Theta_{l,n} = \theta]\,\big(\Theta_{l,n} C_{l,n} + V_{t+1}(\Lambda(t+1))\big)\right\}, \tag{8}$$

where $\Theta_{l,n} C_{l,n}$ is the maximum throughput achieved at slot t, and $V_{t+1}(\Lambda(t+1))$ is the maximum throughput achieved from slot t + 1 onwards. The confidence vector is updated by the Bayesian formula and the total probability formula as follows

$$\lambda_l(t+1) = \frac{\sum_{m=1}^{M}\lambda_m(t)\,p_{m,l}\Pr[\Theta_{l,n} = \theta]}{\sum_{l=1}^{M}\sum_{m=1}^{M}\lambda_m(t)\,p_{m,l}\Pr[\Theta_{l,n} = \theta]}, \tag{9}$$

where $l = 1, 2, \ldots, M$.

3.3. Optimal spectrum access algorithm

Although the confidence vector $\Lambda(t)$ is a sufficient statistic of the POMDP, it is impossible to get all channel states in slot t. A related study [14] has shown that when all channels are independent of each other, the M-dimensional confidence vector $\Lambda(t)$ may be simplified to an N-dimensional belief vector $\Lambda(t) = [p_1(t), \ldots, p_N(t)]$, where $p_n(t)$ is the available probability of channel n at the beginning of slot t. When the greedy algorithm is used for the optimization, the optimal spectrum access algorithm can be simplified as follows. At the beginning of slot t, the throughput achieved when cognitive user k accesses channel n in channel state m can be expressed as

$$C_{m,n} = \big(p_n(t)(1 - p_{10}) + (1 - p_n(t))p_{01}\big)\,b_{k,n}, \tag{10}$$

where $p_n(t)(1 - p_{10}) + (1 - p_n(t))p_{01}$ denotes the probability that channel n is available to cognitive user k in slot t. The channel selected by the optimal strategy, $a^*(t)$, is given by

$$a^*(t) = \arg\max_{n=1,\ldots,N}\Big(\big(p_n(t)(1 - p_{10}) + (1 - p_n(t))p_{01}\big)\,b_{k,n}\Big). \tag{11}$$

When a channel is selected to be sensed, its confidence value is updated according to the corresponding sensing result. Otherwise, the confidence value is updated according to the Markov chain. Therefore, the update of the confidence vector is formulated as

$$\Lambda(t+1) = [p_1(t+1), \ldots, p_N(t+1)], \tag{12}$$

and

$$p_n(t+1) = \begin{cases} 1, & a^*(t) = n,\ \Theta_{m,a^*(t)} = 1 \\ 0, & a^*(t) = n,\ \Theta_{m,a^*(t)} = 0 \\ p_n(t)(1 - p_{10}) + (1 - p_n(t))p_{01}, & a^*(t) \ne n \end{cases} \tag{13}$$

When cognitive user k accesses the selected channel from the beginning of slot t, the maximum throughput achieved can be expressed as

$$\begin{aligned} V_t(\Lambda(t)) &= \big(p_{a^*}(t)(1 - p_{10}) + (1 - p_{a^*}(t))p_{01}\big)b_{k,a^*} + \sum_{\theta=0}^{1}\Pr[\Theta_{m,a^*} = \theta \mid \Lambda, a^*]\,V_{t+1}(\Lambda(t+1)) \\ &= \big(p_{a^*}(t)(1 - p_{10}) + (1 - p_{a^*}(t))p_{01}\big)b_{k,a^*} \\ &\quad + \big[p_{a^*}(t)(1 - p_{10}) + (1 - p_{a^*}(t))p_{01}\big]V_{t+1}(\Lambda(t+1)) \\ &\quad + \big[p_{a^*}(t)p_{10} + (1 - p_{a^*}(t))(1 - p_{01})\big]V_{t+1}(\Lambda(t+1)), \end{aligned} \tag{14}$$

where $a^*$ is short for $a^*(t)$. The practical iteration is given by

$$V_{t+1}(\Lambda(t+1)) = V_t(\Lambda(t)) + \big(p_{a^*}(t)(1 - p_{10}) + (1 - p_{a^*}(t))p_{01}\big)b_{k,a^*}. \tag{15}$$
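One slot of the greedy rule in Section 3.3 can be sketched in Python as follows. This is only an illustration under stated assumptions: channels are indexed from 0, a single channel is sensed and accessed per slot, sensing is taken as error-free, and the `sense` callback, function names and example numbers are hypothetical rather than taken from the paper.

```python
import math


def capacity(bandwidth, k_e, tx_power, gain_sq, noise_psd):
    """Profit function b_{k,n} of eq. (5); gain_sq stands for |H_{k,n}|^2."""
    return bandwidth * math.log2(1.0 + k_e * tx_power * gain_sq / (bandwidth * noise_psd))


def predicted_idle(p, p01, p10):
    """Probability that a channel with belief p is idle after one Markov step."""
    return p * (1.0 - p10) + (1.0 - p) * p01


def greedy_step(belief, rates, p01, p10, sense):
    """One slot of the greedy rule: eqs. (10), (11) and (13).

    belief: list of p_n(t); rates: list of b_{k,n};
    sense: hypothetical callback returning 1 (idle) or 0 (busy) for a channel index.
    Returns (chosen channel, throughput collected, updated belief).
    """
    expected = [predicted_idle(p, p01, p10) * b for p, b in zip(belief, rates)]  # eq. (10)
    chosen = max(range(len(belief)), key=expected.__getitem__)                   # eq. (11)
    theta = sense(chosen)
    reward = theta * rates[chosen]
    # Eq. (13): the sensed channel takes the observed value, the rest follow the Markov chain.
    new_belief = [float(theta) if n == chosen else predicted_idle(p, p01, p10)
                  for n, p in enumerate(belief)]
    return chosen, reward, new_belief


# Three identical unit channels; every parameter set to 1 gives log2(2) = 1 bit/s.
rates = [capacity(1.0, 1.0, 1.0, 1.0, 1.0)] * 3
belief = [0.2, 0.9, 0.5]
chosen, reward, belief_next = greedy_step(belief, rates, p01=0.1, p10=0.2, sense=lambda n: 1)
```

In this example the second channel (index 1) has the largest expected throughput, so it is sensed and accessed, and its belief is set to the observed value while the other two beliefs are propagated one Markov step.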

4. Estimation of state transition probability

From (14) and (15), we know that to achieve the optimal channel access we must first know the channel state transition probabilities $P_{01}$ and $P_{10}$. However, these probabilities change with the primary user's behavior and are unknown beforehand in practice. In this section, we describe how to estimate the channel state transition probabilities from historical information. Let $y_m(t)$, $m = 0, 1, \ldots, M - 1$, denote the probability of the mth state in slot t. Because the transition of the channel state is a stationary random process, the transition probabilities are independent of the slot index. The channel state prediction model can be formulated as

$$Y^T(t) = Y^T(t-1)P, \tag{16}$$

where $P = (p_{l,m})_{M \times M}$ is the channel state transition probability matrix, $Y(t) = [y_0(t), y_1(t), \ldots, y_{M-1}(t)]^T$ is the channel state probability vector, and $Y^T$ denotes the transpose of Y. It is obvious that $\sum_{m=0}^{M-1} y_m(t) = 1$ and $\sum_{m=0}^{M-1} p_{l,m} = 1$. The number of independent elements in matrix P is $M(M - 1) = M^2 - M$. When estimating the channel state transition probabilities, we define the fitting error of the mth state probability in slot t as

$$e_m(t) = y_m(t) - \sum_{l=0}^{M-1}\hat{p}_{l,m}\,y_l(t-1), \tag{17}$$

where $\hat{P} = (\hat{p}_{l,m})_{M \times M}$ is the estimate of the channel state transition probability matrix $P = (p_{l,m})_{M \times M}$. Then, the sum of the squared fitting errors of the mth state probability over all slots is given by

$$Q_m = \sum_{t=0}^{T} e_m^2(t). \tag{18}$$

The sum of the squared fitting errors of all channel state probabilities is given by

$$Q = \sum_{m=0}^{M-1} Q_m = \sum_{m=0}^{M-1}\sum_{t=0}^{T} e_m^2(t). \tag{19}$$

At this point, the estimation of matrix P can be formulated as the least squares problem

$$\min(Q) = \min_{\hat{p}_{l,m}} \sum_{m=0}^{M-1}\sum_{t=0}^{T} e_m^2(t) \quad \text{s.t.} \quad \sum_{m=0}^{M-1}\hat{p}_{l,m} = 1, \quad l = 0, 1, \ldots, M - 1. \tag{20}$$

Introducing the Lagrange multipliers $\lambda_l$ ($l = 0, 1, \ldots, M - 1$), we obtain the Lagrangian function

$$\bar{Q} = \sum_{m=0}^{M-1}\sum_{t=0}^{T} e_m^2(t) + \sum_{l=0}^{M-1}\lambda_l\left(\sum_{m=0}^{M-1}\hat{p}_{l,m} - 1\right). \tag{21}$$

Taking the derivative of the Lagrangian function, we get

$$\frac{\partial \bar{Q}}{\partial \hat{p}_{l,m}} = -2\sum_{t=0}^{T} e_m(t)\,y_l(t-1) + \lambda_l. \tag{22}$$

Letting $\partial \bar{Q} / \partial \hat{p}_{l,m} = 0$, we have

$$\lambda_l = 2\sum_{t=0}^{T} e_m(t)\,y_l(t-1). \tag{23}$$

Taking the constraint of the problem in (20) into account, we obtain

$$\sum_{m=0}^{M-1} e_m(t) = \sum_{m=0}^{M-1} y_m(t) - \sum_{m=0}^{M-1}\sum_{l=0}^{M-1}\hat{p}_{l,m}\,y_l(t-1) = \sum_{m=0}^{M-1} y_m(t) - \sum_{l=0}^{M-1} y_l(t-1) = 1 - 1 = 0. \tag{24}$$

Then, summing (23) over m, we have

$$\sum_{m=0}^{M-1}\lambda_l = 2\sum_{t=0}^{T}\sum_{m=0}^{M-1} e_m(t)\,y_l(t-1) = 0. \tag{25}$$

On the other hand,

$$\sum_{m=0}^{M-1}\lambda_l = M\lambda_l. \tag{26}$$

From (25) and (26), we deduce that $\lambda_l$ is equal to 0. Therefore, the Lagrangian function introduced in (21) is unnecessary, and we can take the derivative of the Q function directly instead of introducing the Lagrange multipliers $\lambda_l$:

$$\frac{\partial \bar{Q}}{\partial \hat{p}_{l,m}} = -2\sum_{t=0}^{T} e_m(t)\,y_l(t-1). \tag{27}$$

Letting $\partial \bar{Q} / \partial \hat{p}_{l,m} = 0$, we have

$$\sum_{t=0}^{T} e_m(t)\,y_l(t-1) = 0. \tag{28}$$

Substituting (17) into (28), and writing j for the summation index inside the fitting error, we get

$$\sum_{t=0}^{T}\left[y_m(t) - \sum_{j=0}^{M-1}\hat{p}_{j,m}\,y_j(t-1)\right]y_l(t-1) = 0, \tag{29}$$

for $l, m = 0, 1, \ldots, M - 1$. These equations can be expressed in matrix form as

$$Y_1^T Y_1 \hat{P} = Y_1^T Y_2, \tag{30}$$

where

$$Y_1 = \begin{bmatrix} y_0(0) & y_1(0) & \cdots & y_{M-1}(0) \\ y_0(1) & y_1(1) & \cdots & y_{M-1}(1) \\ \cdots & \cdots & \cdots & \cdots \\ y_0(T-1) & y_1(T-1) & \cdots & y_{M-1}(T-1) \end{bmatrix}, \tag{31}$$

$$Y_2 = \begin{bmatrix} y_0(1) & y_1(1) & \cdots & y_{M-1}(1) \\ y_0(2) & y_1(2) & \cdots & y_{M-1}(2) \\ \cdots & \cdots & \cdots & \cdots \\ y_0(T) & y_1(T) & \cdots & y_{M-1}(T) \end{bmatrix}. \tag{32}$$

Then, the least squares estimate of the state transition probability matrix is obtained as

$$\hat{P} = (Y_1^T Y_1)^{-1}(Y_1^T Y_2). \tag{33}$$

The transition process of the channel state is a special Markov chain, in which the channel is in exactly one state within each slot. Therefore, exactly one component of the channel state vector Y(t) is "1" and the others are all "0". When the lth component of Y(t) is "1", i.e., the channel is in the lth state, we have

$$Y(t) = (0, \ldots, 0, y_l(t), 0, \ldots, 0)^T = (0, \ldots, 0, 1, 0, \ldots, 0)^T. \tag{34}$$

According to (31) and (32), we can write the block matrices

$$Y_1 = (Y(0)\ Y(1)\ \ldots\ Y(T-1))^T, \tag{35}$$

$$Y_2 = (Y(1)\ Y(2)\ \ldots\ Y(T))^T, \tag{36}$$

and

$$Y_1^T Y_1 = \sum_{t=0}^{T-1} Y(t)\,Y^T(t), \tag{37}$$

$$Y_1^T Y_2 = \sum_{t=0}^{T-1} Y(t)\,Y^T(t+1). \tag{38}$$

If the channel is in the lth state in slot t, we obtain

$$Y(t)\,Y^T(t) = E_{ll}, \tag{39}$$

where $E_{ll}$ is an M-order square matrix whose element in the lth row and lth column is 1 and whose other elements are 0. Let $n_{lm}$ denote the number of times the channel transfers from the lth state to the mth state in one step during the T observed slots. Then $Y_1^T Y_1$ simplifies to the diagonal matrix

$$Y_1^T Y_1 = \begin{bmatrix} \sum_{m=0}^{M-1} n_{0m} & & \\ & \ddots & \\ & & \sum_{m=0}^{M-1} n_{(M-1)m} \end{bmatrix}, \tag{40}$$

where $\sum_{m=0}^{M-1} n_{lm}$ is the number of times the channel is in the lth state during slots 0 to T − 1. If the channel is in the lth state in slot t and in the mth state in slot t + 1, we have

$$Y(t)\,Y^T(t+1) = E_{lm}, \tag{41}$$

where $E_{lm}$ is an M-order square matrix whose element in the lth row and mth column is 1 and whose other elements are 0. Then, we obtain

$$Y_1^T Y_2 = \begin{bmatrix} n_{00} & n_{01} & \cdots & n_{0(M-1)} \\ n_{10} & n_{11} & \cdots & n_{1(M-1)} \\ \cdots & \cdots & \cdots & \cdots \\ n_{(M-1)0} & n_{(M-1)1} & \cdots & n_{(M-1)(M-1)} \end{bmatrix}. \tag{42}$$

According to (33), the statistical estimate of the channel state transition probability is

$$\hat{P}_{lm} = \frac{n_{lm}}{\sum_{m=0}^{M-1} n_{lm}}, \quad l, m = 0, 1, \ldots, M - 1. \tag{43}$$

If the channel has only two states, i.e., M = 2, the statistical estimates of the two state transition probabilities follow from (43) as

$$\hat{P}_{01} = \frac{n_{01}}{n_{00} + n_{01}}, \qquad \hat{P}_{10} = \frac{n_{10}}{n_{10} + n_{11}}. \tag{44}$$

5. Simulation results and analysis

In this section, we discuss the simulation results of the optimal spectrum access (OSA) algorithm proposed in this paper and compare it with the optimal cognitive MAC protocol (OCMP) [14] and the random access (RA) algorithm [18]. To simplify the simulation, we suppose that there is only one primary user and one cognitive user, and that the channel bandwidth is 1 unit. The data are transmitted in binary over 30 slots (T = 30). In the simulation, we first estimate the channel state transition probabilities and then simulate the processes of sensing and accessing. Each simulation is run 10000 times. The channel state transition probabilities shown in the simulation figures are the means of uniformly distributed random variables with variance 0.001.

Fig. 2 compares the normalized throughput for small channel state transition probabilities when the channel number N is 2 and 4, respectively. As seen from Fig. 2, when the channel state transition probabilities are the same, the throughput obtained by the secondary user increases with the number of channels. When the channel number is the same, the smaller the transition probabilities between different channel states ($p_{01}$ and $p_{10}$) are, the more stable the channels are; that is, the less often the primary users access and release the channels, the higher the successful access rate and the larger the throughput.

Fig. 2. Normalized throughputs when the channels are idler in the case of fewer channels. (Throughput of SUs (bit/s) versus slot; curves for N = 2 and N = 4 with p01 = p10 = 0.10, 0.15, 0.20.)

Fig. 3. Normalized throughputs when the channels are idler in the case of more channels. (Throughput of SUs (bit/s) versus slot; curves for N = 10 and N = 11 with p01 = p10 = 0.10, 0.15, 0.20.)
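The estimation step that precedes the sensing and access simulation, the counting estimator of (43) and (44), can be sketched in Python as below. This is a minimal illustration, not the authors' simulation code: the function names, the seed, the sequence length and the example probabilities are assumptions, and the observed state sequence is taken to be error-free.

```python
import random


def estimate_transitions(states, num_states=2):
    """Count one-step transitions n_lm and normalize each row, as in eq. (43)."""
    counts = [[0] * num_states for _ in range(num_states)]
    for prev, cur in zip(states, states[1:]):
        counts[prev][cur] += 1
    estimate = []
    for row in counts:
        total = sum(row)
        # A state that was never left gives no information; its row stays at zero.
        estimate.append([c / total if total else 0.0 for c in row])
    return counts, estimate


def simulate_channel(p01, p10, num_slots, seed=0):
    """Generate a 0/1 (busy/idle) sequence from the two-state Markov model of Fig. 1."""
    rng = random.Random(seed)
    state, states = 1, []
    for _ in range(num_slots):
        states.append(state)
        flip = p10 if state == 1 else p01
        if rng.random() < flip:
            state = 1 - state
    return states


# Hand-checkable example: transitions 0->0, 0->1, 1->1, 1->1, 1->0, 0->1.
counts, p_hat = estimate_transitions([0, 0, 1, 1, 1, 0, 1])

# Longer synthetic run: the estimates should approach the true probabilities.
long_run = simulate_channel(p01=0.2, p10=0.1, num_slots=50_000)
_, p_hat_long = estimate_transitions(long_run)
```

For the short sequence, the counts are n00 = 1, n01 = 2, n10 = 1 and n11 = 2, so (44) gives 2/3 for the busy-to-idle estimate and 1/3 for the idle-to-busy estimate. With only T = 30 slots, as in the simulations, the counts are small and the estimates are correspondingly noisier than in the long synthetic run.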

Fig. 4. Normalized throughputs when the channels are busier in the case of fewer channels. (Throughput of SUs (bit/s) versus slot; curves for N = 2 and N = 4 with p01 = p10 = 0.90, 0.85, 0.80.)

Fig. 3 compares the normalized throughput for small channel state transition probabilities when the channel number N is 10 and 11, respectively. It shows that the more channels there are, the more the throughput is affected by the channel state transition probabilities. When the number of channels is close to 10, the channel state transition probabilities become the main factor affecting the throughput.

Figs. 4 and 5 compare the throughput obtained when the channels are busier (compared with Figs. 2 and 3). In this case the primary user accesses and releases the channels more frequently, but the secondary users can still obtain a good throughput. As with idler channels, the channel state transition probabilities become the main factor affecting the throughput as the number of channels increases.

Fig. 6 presents the effect on the throughput, when N = 3, of the difference between the transition probabilities between different channel states ($p_{01}$, $p_{10}$) and the transition probabilities of staying in the same channel state ($p_{11} = 1 - p_{10}$, $p_{00} = 1 - p_{01}$). The larger the difference is, the larger the throughput is. When the difference is 0 ($p_{01} = 0.5$, $p_{10} = 0.5$), the optimal spectrum access strategy degenerates into the random access strategy.

Fig. 6. Effect of differences of transition probabilities on throughput. (Throughput of SUs (bit/s) versus slot; curves for p01 = p10 = 0.10, 0.30, 0.50, 0.75, 0.95.)

Fig. 7 illustrates the throughputs obtained using the OSA, OCMP and RA algorithms when N = 3. We compare the throughputs in three cases. The first is the case where the channel states are more stable, $p_{01} = 0.1$, $p_{10} = 0.2$; the second is the case where the channel states switch more frequently, $p_{01} = 0.9$, $p_{10} = 0.7$; the third is the case where the transition probability difference is small, $p_{01} = 0.5$, $p_{10} = 0.51$. The OSA algorithm has an obvious throughput advantage over the OCMP and RA algorithms. Compared with the OCMP algorithm, the OSA algorithm proposed in this paper increases the throughput by 8–12%; compared with the RA algorithm, it increases the throughput by 25–40%. The larger the difference is, the larger the throughput margin of the OSA algorithm over the others is.
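The degeneration at $p_{01} = p_{10} = 0.5$ can be illustrated with a few lines of Python. The one-step idle probability used in the greedy rule, $p(1 - p_{10}) + (1 - p)p_{01}$, can be rewritten as $p_{01} + p(1 - p_{01} - p_{10})$, so the factor $|1 - p_{01} - p_{10}| = |p_{11} - p_{01}|$ measures how much the prediction depends on the current belief. The helper names below are illustrative assumptions; the three parameter pairs are the cases quoted for Fig. 7.

```python
def predicted_idle(p, p01, p10):
    """One-step idle probability for a channel whose current belief is p."""
    return p * (1.0 - p10) + (1.0 - p) * p01


def belief_sensitivity(p01, p10):
    """Slope of predicted_idle with respect to p, i.e. |p11 - p01| (illustrative helper)."""
    return abs(1.0 - p01 - p10)


beliefs = [0.0, 0.25, 0.5, 0.75, 1.0]

# With p01 = p10 = 0.5 every belief leads to the same prediction,
# so a greedy choice among equal-rate channels carries no information.
flat = [predicted_idle(p, 0.5, 0.5) for p in beliefs]

# The three cases quoted for Fig. 7.
sensitivities = {
    "case 1": belief_sensitivity(0.1, 0.2),
    "case 2": belief_sensitivity(0.9, 0.7),
    "case 3": belief_sensitivity(0.5, 0.51),
}
```

Under this reading, cases 1 and 2 keep a sizeable dependence on the belief (about 0.7 and 0.6), whereas case 3 leaves almost none (about 0.01), which is in line with the small margin reported when the transition probability difference is small.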

Fig. 5. Normalized throughputs when the channels are busier in the case of more channels. (Throughput of SUs (bit/s) versus slot; curves for N = 10 and N = 11 with p01 = p10 = 0.90, 0.85, 0.80.)

Fig. 7. Comparison of throughputs between OSA and RA. (Throughput of SUs (bit/s) versus slot; curves for OSA, OCMP and RA in Cases 1 to 3.)

From Figs. 2–7, we conclude that the optimal spectrum access strategy proposed in this paper offers an obvious throughput advantage over the optimal cognitive MAC protocol and the random access strategy. Besides the number of channels, the difference between the transition probabilities is the main factor affecting the throughput. As the number of channels increases, the effect of the difference on the throughput becomes more obvious. On the other hand, as the transition probability difference decreases, both the optimal spectrum access and the optimal cognitive MAC protocol algorithms degenerate into the random access algorithm.

6. Conclusion

In this paper, we focus on spectrum access in cognitive networks and propose an optimal access algorithm based on POMDP. In the algorithm, we first estimate the channel state of the primary user from the historical spectrum sensing and decision information, and then optimize the spectrum access strategy of the cognitive user to maximize the throughput of the cognitive network. To reduce the complexity, we use a greedy algorithm to obtain a suboptimal solution of the optimal spectrum access algorithm. Simulation results show that the transition probability difference between different channel states is the key factor affecting the throughput of the network. When the difference is small, the optimal access algorithm degenerates into the random access algorithm. In general, the optimal spectrum access algorithm improves the throughput by 25–40% compared with the random access algorithm and by 8–12% compared with the optimal cognitive MAC protocol algorithm.

Acknowledgements

This study was supported by the National Science Foundation of China under grant 6137111 and 6137112, and the applied basic research project of the Ministry of Transport of China under grant 2014319813220.

References

[1] Zhao N, Li S, Wu Z. Cognitive radio engine design based on ant colony optimization. Wirel Pers Commun 2012;65:15–24.
[2] Akyildiz IF, Lee WY, Vuran MC, Mohanty S. Next generation dynamic spectrum access cognitive radio wireless networks: a survey. Comput Netw 2006;50(3):2127–59.
[3] Karmokar AK, Anpalagan A. Energy-efficient cross-layer design of dynamic rate and power allocation techniques for cognitive green radio networks. Trans Emerg Telecommun Technol 2013;24(7-8):762–76.
[4] Bourdena A, Pallis E, Kormentzas G, Mastorakis G. Efficient radio resource management algorithms in opportunistic cognitive radio networks. Trans Emerg Telecommun Technol 2014;25(8):785–97.
[5] Joshi GP, Nam SY, Kim SW. An analysis of channel access delay in synchronized MAC protocol for cognitive radio networks. Trans Emerg Telecommun Technol 2014;25(5):485–9.
[6] Liang YC, Zeng Y, Edward C, Hoang AT. Sensing-throughput trade off for cognitive radio networks. IEEE Trans Wirel Commun 2008;7(4):1326–37.
[7] Min AW, Shin KG. On sensing-access tradeoff in cognitive radio networks. In: Proceedings of IEEE symposium on new frontiers in dynamic spectrum, April 4–6. 2010. p. 1–12.
[8] Zhang C, Bian L, Zhu Q. Joint optimization of sensing period and sensing interval in cognitive radio networks. J Nanjing Univ Posts Telecommun 2011;31(1):16–22.
[9] Stotas S, Nallanathan A. On the throughput and spectrum sensing enhancement of opportunistic spectrum access cognitive radio networks. IEEE Trans Commun 2012;11(1):97–107.
[10] El-Sherif AA, Liu KJ. Joint design of spectrum sensing and channel access in cognitive radio networks. IEEE Trans Wirel Commun 2011;10(6):1743–53.
[11] Yu H. Optimal channel sensing maximising sum rate in cognitive radio with multiple secondary links. Trans Emerg Telecommun Technol 2013;24(7-8):777–84.
[12] Ding G, Wu Q, Zou Y, Wang J, Gao Z. Joint spectrum sensing and transmit power adaptation in interference-aware cognitive radio networks. Trans Emerg Telecommun Technol 2014;25(2):231–8.
[13] Kae WC. Adaptive sensing technique to maximize spectrum utilization in cognitive radio. IEEE Trans Veh Technol 2010;59(2):992–8.
[14] Zhao Q, Tong L, Swami A, Chen YX. Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: a POMDP framework. IEEE J Select Areas Commun 2007;25(3):589–600.
[15] Sungsoo Park, Daesik H. Optimal spectrum access for energy harvesting cognitive radio networks. IEEE Trans Wirel Commun 2013;12(12):6166–79.
[16] Chen Y, Zhao Q, Swami A. Bursty traffic in energy-constrained opportunistic spectrum access. In: Proceedings of the Global Communications Conference, November 26–30. 2007. p. 4641–6.
[17] Kae WC, Hossain E. Opportunistic access to spectrum holes between packet bursts: a learning-based approach. IEEE Trans Wirel Commun 2011;10(8):2497–509.
[18] Lan ZL, Jiang H, Wu XL. Decentralized cognitive MAC protocol design based on POMDP and Q-Learning. In: Proceedings of 7th international ICST conference on communication and networking, August 8–10. 2012. p. 548–51.
[19] Long X, Gan X, Xu Y, Liu J, Tao M. An estimation algorithm of channel state transition probabilities for cognitive radio systems. In: Proceedings of 3rd international conference on cognitive radio oriented wireless networks and communications, May 15–17. 2008. p. 296–9.
[20] Che Y, Zhang R, Gong Y. On design of opportunistic spectrum access in the presence of reactive primary users. IEEE Trans Commun 2013;61(7):2678–91.
[21] Djonin DV, Zhao Q, Krishnamurthy V. Optimality and complexity of opportunistic spectrum access: a truncated Markov decision process formulation. In: Proceedings of IEEE International Conference on Communications, June 24–28. 2007. p. 5787–92.
[22] Smallwood R, Sondik E. The optimal control of partially observable Markov process over a finite horizon. Oper Res 1971:1071–88.