Optimized cross-layer transmission for scalable video over DVB-H networks


Signal Processing: Image Communication 63 (2018) 81–91

Journal homepage: www.elsevier.com/locate/image

Keyan Deng a,b, Lei Yuan a, Yi Wan a,*, Jie Pan a

a School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China
b College of Electrical Engineering, Northwest Minzu University, Lanzhou 730000, China

Article info

Keywords: Cross-layer transmission; Scalable video; Unequal error protection; Expanding window fountain codes; Hierarchical QPSK; DVB-H

Abstract

We investigate five efficient combinations of expanding window fountain (EWF) coding and hierarchical QPSK (H-QPSK) modulation in a cross-layer fashion, i.e., EWF coding at the application layer and H-QPSK modulation at the physical layer. We evaluate the performance of the five cross-layer transmission schemes for scalable video over Digital Video Broadcasting-Handheld (DVB-H) networks by performing cross-layer optimization based on a genetic algorithm to find the optimal parameters, and we report the optimization results. Based on these results, we propose an adaptive transmission scheme for scalable video over DVB-H networks that uses optimal combinations of the five schemes, i.e., each scheme is used in the specific SNR region where it performs best, to accommodate varying DVB-H channel conditions and achieve the best overall transmission performance. The results show that the proposed adaptive transmission scheme provides much better overall performance over DVB-H networks, especially in low SNR regions, than any single scheme used by itself.

1. Introduction

As the demand for bandwidth-intensive multimedia applications over wireless networks grows rapidly, efficient transmission of compressed multimedia bitstreams over lossy packet networks is required. Scalable coding technologies, e.g., JPEG 2000 [1] and H.264 Scalable Video Coding (SVC) [2], organize multimedia bitstreams into a number of layers of different importance, in which the first and most important layer is referred to as the base layer (BL), followed by progressively less important layers called enhancement layers (ELs). Scalable coding enables the receiver to progressively improve the reconstructed multimedia quality as more data are correctly transmitted and decoded. This characteristic can be exploited to achieve robust transmission of pre-encoded images and videos over lossy packet networks. However, owing to temporal and spatial dependencies in the compressed bitstream, packet losses in scalable coded sequence transmission lead to different levels of quality degradation of the reconstructed source message. In particular, for a scalable coded sequence where the data importance decreases along the sequence, an early packet loss is propagated by the decoder to all future reconstructed packets, resulting in error propagation in both space and time [3]. Therefore, the efficient transmission of scalable coded sequences over lossy packet networks is still a challenge.

Under such a scenario, a scalable coded sequence is usually protected against channel errors by forward error correction (FEC) schemes, which improve the probability of successful data transmission and eliminate costly retransmissions. In recent years, fountain codes [4], such as LT codes [5] and Raptor codes [6], have been proposed as a more flexible and efficient FEC solution for information transmission over lossy packet networks. However, these codes are equal error protection (EEP) codes and perform poorly on scalable coded sequences owing to the unequal importance of data in the scalable bitstream, where more important data require more protection and need to be reconstructed prior to less important data. In other words, a scalable coded sequence calls for error correcting codes with unequal error protection (UEP) and unequal recovery time (URT) capabilities. Furthermore, it has been shown that an FEC scheme that provides UEP can achieve considerable quality improvement [7] and a better overall system performance [8] compared to EEP. As such, UEP has been successfully utilized for the protection of scalable image and video [8].

Recently, fountain codes designed with the UEP property have emerged, e.g., expanding window fountain (EWF) codes. EWF codes [9], which are a novel approach to providing the UEP and URT properties, can protect the different layers in a scalable coded sequence according to their importance at the application layer (AL). In general, error resiliency schemes are performed by independently optimizing the resources accessible at individual network layers.

Corresponding author. E-mail address: [email protected] (Y. Wan).

https://doi.org/10.1016/j.image.2018.02.004 Received 10 August 2017; Received in revised form 5 February 2018; Accepted 5 February 2018 Available online 8 February 2018 0923-5965/© 2018 Elsevier B.V. All rights reserved.


Specifically, many multimedia communication systems exploit AL-FEC to protect against channel errors. However, in order to efficiently utilize scarce radio resources and achieve overall quality of service (QoS) satisfaction, all available resources in a wireless communication system should be optimally utilized together. The cross-layer approach is therefore introduced to optimize the resources jointly and achieve further improvement of QoS, and it is necessary to pursue a cross-layer design and optimization for scalable video transmission over wireless channels.

Cross-layer schemes for robust scalable video wireless communication have been extensively investigated [10–15]. In [10], an adaptive cross-layer protection strategy is achieved by means of Reed–Solomon (RS) codes used as the AL-FEC for robust and efficient scalable video transmission over 802.11 WLANs. Owing to their advantages in terms of complexity, performance, and flexibility over RS codes, fountain codes have been exploited as an AL-FEC solution. Moreover, as fountain codes can adapt to any erasure channel with unknown or varying characteristics, they are especially suitable for packet-level coding at the AL. Fountain-code-based cross-layer schemes for H.264 video transmission over wireless channels are investigated in [11] and [12]. The authors of [11] propose four combinations of cross-layer FEC coding schemes: EEP/UEP LT codes and rate-compatible punctured convolutional (RCPC) codes are utilized at the AL and the physical layer (PL), respectively, to improve the peak signal-to-noise ratio (PSNR) for a given channel bandwidth and signal-to-noise ratio (SNR). In [12], four cross-layer FEC schemes, where systematic Raptor codes and RCPC codes are utilized at the AL and the PL, respectively, are proposed to minimize the video distortion and maximize the video PSNR.

In this paper, inspired by [11] and [12], we propose five efficient combinations of EWF coding and hierarchical QPSK (H-QPSK) modulation in a cross-layer fashion, i.e., EWF coding at the AL and H-QPSK modulation at the PL. We first evaluate the five proposed cross-layer transmission schemes for H.264 SVC bitstreams over Digital Video Broadcasting-Handheld (DVB-H) networks by carrying out cross-layer optimization to find the optimal parameters, which should be adjusted adaptively according to the DVB-H channel condition. The results indicate that each of the five schemes achieves its best performance in a specific SNR region. Subsequently, we propose an adaptive transmission scheme for scalable video over DVB-H networks that uses optimal combinations of these schemes, i.e., we adaptively exploit the five cross-layer transmission schemes in specific SNR regions to accommodate the varying DVB-H channel conditions and achieve the best overall transmission performance.

The main contributions of this paper include the following:
(1) We propose five cross-layer transmission schemes using EWF codes at the AL and H-QPSK modulation at the PL to provide UEP for scalable video over DVB-H networks.
(2) We carry out cross-layer optimization based on a genetic algorithm (GA) for the proposed transmission schemes to find the optimal parameters that should be adjusted adaptively according to the DVB-H channel condition.
(3) We propose an adaptive transmission scheme for scalable video over DVB-H networks, based on the five cross-layer transmission schemes and cross-layer optimization.

The rest of the paper is organized as follows. Section 2 provides background material about random linear coding, EWF codes, H-QPSK modulation, and DVB-H. Section 3 describes the design of EWF codes at the AL and H-QPSK modulation at the PL, proposes five cross-layer transmission schemes over DVB-H networks, and presents the packetization scheme in the DVB-H system. Section 4 presents the cross-layer optimization of the five cross-layer transmission schemes. The optimization results of the five proposed cross-layer transmission schemes and the adaptive transmission scheme for a test scalable video are presented in Section 5. Finally, Section 6 concludes the paper.

2. Background

2.1. Random linear coding (RLC)

Random linear coding (RLC) [16–19] is a class of rateless codes and performs as a near-optimal FEC solution over erasure channels. RLC produces encoded packets over a source message $x = \{x_1, x_2, \ldots, x_k\}$ by random linear combinations of message packets with coefficients randomly selected from a finite field $GF(2^m)$. For example, the $i$th encoded packet $c_i$ can be represented as $c_i = \sum_{j=1}^{k} \alpha_j \cdot x_j$, where $\alpha_j$ is a randomly selected element of $GF(2^m)$. Fountain coding, e.g., LT coding and EWF coding, is a special case of RLC, where $\alpha_j \in \{0, 1\}$ for the $i$th encoded packet $c_i$. New RLC encoded packets can be generated in a rateless fashion at the transmitter until the receiver collects enough of them to decode the source message using Gaussian elimination (GE) [20] decoding. Owing to the rateless characteristic of RLC, expanding window random linear coding (EW RLC) [17,19,21] is used as an AL-FEC solution for UEP of slice-partitioned H.264/AVC video in the DVB-H standard [18]. In the present study, EWF coding is used as a low-complexity EW RLC scheme at the AL to provide UEP for scalable video over DVB-H networks.

2.2. Expanding window fountain (EWF) codes

EWF codes are a novel class of UEP fountain codes based on the idea of "windowing" the information symbols to be transmitted. We consider the transmission of data, e.g., scalable video streaming, partitioned into consecutive source blocks of $k$ symbols. Each source block is divided into $r$ importance classes of size $s_1, s_2, \ldots, s_r$ symbols, respectively, such that $s_1 + s_2 + \cdots + s_r = k$, as shown in Fig. 1. The importance of the classes decreases with the ordering of the symbols, i.e., the $i$th class is more important than the $j$th class if $i < j$. We compactly describe the division into importance classes using the generating polynomial $\Pi(x) = \sum_{i=1}^{r} \Pi_i x^i$, where $\Pi_i = s_i / k$.
Fig. 1. Expanding window fountain (EWF) codes.

Based on such a division, $r$ expanding windows, where each window is contained in the next window, can be defined over each source block. Note that input symbols from the $i$th class of importance belong to the $i$th and all the subsequent windows. Namely, the $i$th window consists of the first $k_i = \sum_{j=1}^{i} s_j$ input symbols, where $k_1 < k_2 < \cdots < k_r = k$ and $s_i = k_i - k_{i-1}$. Therefore, the most important symbol class of size $k_1 = s_1$ is contained in all windows and the $r$th window consists of all the $k$ symbols of the source block.

A new EWF encoded symbol is generated by performing standard LT encoding only on the input symbols from the selected window, where a window is randomly chosen with respect to the window selection probability distribution $\Gamma(x) = \sum_{i=1}^{r} \Gamma_i x^i$, and $\Gamma_i$ is the probability of selecting the $i$th window. For the $j$th expanding window, the degree distribution is $\Omega^{(j)}(x) = \sum_{i=1}^{k_j} \Omega_i^{(j)} x^i$. This procedure is repeated at the EWF encoder for each EWF encoded symbol. Therefore, the most important symbol class of size $s_1$ is protected by all windows, whereas the least important symbol class of size $s_r$ is only protected by the $r$th window. This feature is well suited to UEP. In addition, when $r = 1$, i.e., there exists only a single window and all input symbols are of equal importance, an EWF code becomes a standard LT code for EEP.

For the sake of simplicity, an EWF code can be represented as $F_{EW}(\Pi, \Gamma, \Omega^{(1)}, \ldots, \Omega^{(r)})$. Therefore, when $r = 2$, i.e., the simple case of an EWF code with two importance classes, an EWF code is $F_{EW}(\Pi_1 x + \Pi_2 x^2, \Gamma_1 x + \Gamma_2 x^2, \Omega^{(1)}, \Omega^{(2)})$, where $\Pi_1 + \Pi_2 = 1$ and $\Gamma_1 + \Gamma_2 = 1$.

For a given EWF code $F_{EW}(\Pi, \Gamma, \Omega^{(1)}, \ldots, \Omega^{(r)})$, the asymptotic erasure probability $y_{l,j}$ (as $k \to \infty$) that a source symbol of class $j$ is not recovered after $l$ decoding iterations of the belief propagation iterative decoder with the reception overhead $\varepsilon_r$ is given as ([22], Lemma 3.2):

$$
\begin{aligned}
y_{0,j} &= 1, \\
y_{l,j} &= \exp\left( -(1+\varepsilon_r) \sum_{i=j}^{r} \frac{\Gamma_i}{\sum_{t=1}^{i} \Pi_t} \, \Omega^{(i)\prime}\!\left( 1 - \frac{\sum_{m=1}^{i} \Pi_m \, y_{l-1,m}}{\sum_{t=1}^{i} \Pi_t} \right) \right).
\end{aligned}
\tag{1}
$$
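The recursion in (1) is straightforward to iterate numerically. The following Python sketch is one possible way to do so and is not taken from the paper: the function names are hypothetical, and the degree distribution `TOY_OMEGA` is an illustrative placeholder rather than one of the distributions used by the authors.

```python
import math

def poly_derivative(coeffs, x):
    """Evaluate the derivative of sum_d coeffs[d] * x**d at x.
    `coeffs` maps a degree d to its probability Omega_d."""
    return sum(d * p * x ** (d - 1) for d, p in coeffs.items())

def ewf_asymptotic_erasure(Pi, Gamma, Omegas, eps_r, iterations=200):
    """Iterate Eq. (1) for an EWF code with r = len(Pi) importance classes.

    Pi, Gamma : class proportions and window selection probabilities
    Omegas    : one degree distribution (dict degree -> prob) per window
    eps_r     : reception overhead
    Returns [y_{l,1}, ..., y_{l,r}] after `iterations` decoding iterations.
    """
    r = len(Pi)
    cum_pi = [sum(Pi[: i + 1]) for i in range(r)]  # sum_{t<=i} Pi_t
    y = [1.0] * r                                   # y_{0,j} = 1
    for _ in range(iterations):
        new_y = []
        for j in range(r):
            total = 0.0
            for i in range(j, r):                   # windows that contain class j
                # average erasure probability of the symbols inside window i
                avg = sum(Pi[m] * y[m] for m in range(i + 1)) / cum_pi[i]
                total += Gamma[i] / cum_pi[i] * poly_derivative(Omegas[i], 1.0 - avg)
            new_y.append(math.exp(-(1.0 + eps_r) * total))
        y = new_y
    return y

# Hypothetical toy degree distribution, used for both windows (assumption).
TOY_OMEGA = {1: 0.1, 2: 0.5, 3: 0.4}

if __name__ == "__main__":
    y_mis, y_lis = ewf_asymptotic_erasure(
        Pi=[0.1, 0.9], Gamma=[0.2, 0.8], Omegas=[TOY_OMEGA, TOY_OMEGA], eps_r=0.1
    )
    print("MIS erasure:", y_mis, "LIS erasure:", y_lis)
```

Because every term of the sum in (1) is non-negative and class 1 collects one more term than class 2, the first class can never end up with a larger erasure probability than the second, which is the UEP property in numerical form.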

We use these probabilities to approximate the erasure probability in the finite source block length scenario.

2.3. Hierarchical QPSK (H-QPSK) modulation

Hierarchical modulation uses constellations with non-uniformly spaced signal points [23,24] and maps the information bits onto these points according to their importance [25]. It is a simple and feasible way to provide UEP at the PL. It has been adopted in various standards [26] and utilized to provide unequal transmission reliability to bits of different importance [27,28]. In this paper, in order to protect information with two levels of importance, we consider H-QPSK modulation with a signal constellation that is a combination of two QPSK constellations of different energy [29,30], as shown in Fig. 2. As such, the information bits can be mapped onto the corresponding QPSK constellation points of different energy according to their importance. In Fig. 2, signal points 1, 2, 3, and 4 are for the least important bit (LIB) class, corresponding to the distance $d_L$ between two signal points, while signal points 5, 6, 7, and 8 are for the most important bit (MIB) class, corresponding to the distance $d_M$ between two signal points. For the given signal constellation, the energy of each transmitted QPSK symbol is the squared Euclidean distance of the signal point from the origin.

In order to characterize the modulation constellation, we introduce $\beta$ as the ratio of $d_L$ to $d_M$, i.e., $\beta = d_L / d_M$, where $0 \le \beta \le 1$. The distances $d_L$ and $d_M$ control the performance of the LIB and MIB, respectively, and are tied to each other through $\beta$. Therefore, in the H-QPSK modulation system, $\beta$ is the parameter that determines the energy allocation between the MIB and LIB, and UEP for two importance classes is achieved by tuning $\beta$. The smaller the value of $\beta$, the more energy is provided for the MIB, so that the MIB is better protected than the LIB and its bit error rate is smaller than that of the LIB. Specifically, when $\beta = 1$, H-QPSK reduces to conventional QPSK, i.e., the same energy is provided for the MIB and LIB and the UEP scheme becomes an EEP scheme. In the extreme case $\beta = 0$, all the energy is assigned to the MIB, i.e., only the MIB can be transmitted. Because the H-QPSK scheme provides UEP without adding redundant bits, it does not reduce the transmission rate or efficiency.

Fig. 2. Signal constellation of H-QPSK.
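The geometry described above can be illustrated with a short Python sketch. It assumes, since Fig. 2 is not reproduced here, that each QPSK sub-constellation places its four points at $(\pm d/2, \pm d/2)$, so that adjacent points are a distance $d$ apart; the function names are hypothetical.

```python
import math

def qpsk_points(d):
    """Four QPSK points at (+-d/2, +-d/2); adjacent points are d apart (assumed layout)."""
    h = d / 2.0
    return [complex(sx * h, sy * h) for sx in (1, -1) for sy in (1, -1)]

def hqpsk_constellation(d_m, beta):
    """Return (MIB points, LIB points) of an H-QPSK constellation with d_L = beta * d_M."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return qpsk_points(d_m), qpsk_points(beta * d_m)

def symbol_energy(point):
    """Symbol energy = squared Euclidean distance from the origin."""
    return abs(point) ** 2

if __name__ == "__main__":
    mib, lib = hqpsk_constellation(d_m=math.sqrt(2.0), beta=0.5)
    print("MIB symbol energy:", symbol_energy(mib[0]))
    print("LIB symbol energy:", symbol_energy(lib[0]))
```

Under this layout the symbol energy equals $d^2/2$, so the LIB-to-MIB energy ratio is $\beta^2$: lowering $\beta$ shifts energy toward the MIB class, and $\beta = 1$ makes the two sub-constellations coincide.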

2.4. DVB-H

The Digital Video Broadcasting-Handheld (DVB-H) standard [31] is the European standard for digital video broadcast to handheld terminals. The DVB-H standard defines only the link layer and the physical layer. As DVB-H is based on the existing DVB-Terrestrial (DVB-T) standard, which uses an MPEG-2 packet stream, DVB-H uses the DVB-T physical layer. The additional elements on the link layer are time slicing and an FEC coding mechanism for multiprotocol encapsulated data, referred to as MPE-FEC. Time slicing, implemented by time division multiplexing (TDM), is utilized to reduce the average power consumption of handheld terminals and to provide smooth and seamless frequency handover when a terminal moves from the current service cell to a new cell. Because the energy of handheld terminals is limited, time slicing is mandatory in the DVB-H standard in order to prolong the standby and usage time of handheld terminals. RS-code-based MPE-FEC is proposed to mitigate errors in the wireless environment so as to improve the carrier-to-noise (C/N) performance and Doppler performance in mobile channels and to enhance tolerance to impulse interference. MPE-FEC is optional in the DVB-H standard.

Appended with the corresponding headers, the scalable video data is encapsulated at each layer as illustrated in Fig. 3. Firstly, scalable video symbols in a Group of Pictures (GOP), either with or without AL-FEC coding, are passed to the network layer (NL). Secondly, uncoded or encoded symbols are encapsulated into RTP/UDP packets and then into IP packets at the NL. Thirdly, each IP packet is inserted into an MPE section and then sorted into an MPE frame at the link layer (LL). Finally, each MPE section is divided into transport stream (TS) packets at the PL for transmission over the DVB-H physical layer.

Fig. 3. DVB-H protocol layers.

3. Proposed cross-layer transmission schemes in DVB-H system

In this section, we first discuss the design of EWF codes at the AL and H-QPSK modulation at the PL, and then propose five cross-layer transmission schemes over DVB-H networks. Finally, we present the packetization scheme in the DVB-H system.

3.1. Design of EWF codes at the AL

We consider a particularly simple but important scenario in which the symbols of a source block are divided into two importance classes, namely most important symbols (MIS) and least important symbols (LIS). Consequently, we adopt an EWF code with two expanding windows, i.e., $r = 2$. Furthermore, for the adopted EWF code $F_{EW}(\Pi_1 x + \Pi_2 x^2, \Gamma_1 x + \Gamma_2 x^2, \Omega^{(1)}, \Omega^{(2)})$, we set the degree distribution on the first window as a robust soliton distribution $\Omega^{(1)}(x) = \Omega_{rs}(k_1, 0.5, 0.03)$, where $k_1 = \Pi_1 k$, and on the second window as the Raptor degree distribution proposed in [6]:

$$
\begin{aligned}
\Omega_R(x) = {} & 0.007969x + 0.493570x^2 + 0.166220x^3 + 0.072646x^4 + 0.082558x^5 \\
& + 0.056058x^8 + 0.037229x^9 + 0.055590x^{19} \\
& + 0.025023x^{65} + 0.003135x^{66}.
\end{aligned}
\tag{2}
$$

In the following, we present two examples for the simple case of an EWF code with two importance classes. Asymptotic erasure probabilities, i.e., symbol erasure rates (SER), of the MIS and LIS for the $F_{EW}(0.105x + 0.895x^2, 0.084x + 0.916x^2, \Omega^{(1)}, \Omega^{(2)})$ EWF code with $\Gamma_1 = 0.084$ versus $\varepsilon$, and for the $F_{EW}(0.105x + 0.895x^2, \Gamma_1 x + (1 - \Gamma_1)x^2, \Omega^{(1)}, \Omega^{(2)})$ EWF code with $\varepsilon = 0.05$ versus $\Gamma_1$, are presented in Figs. 4 and 5, respectively. As Figs. 4 and 5 clearly show, EWF codes enable earlier and more reliable recovery of the MIS than of the LIS.

Since a video source block of size $k$ symbols is either passed directly to the PL or encoded with an EWF code before being passed to the PL, an AL frame consists of $k$ uncoded symbols or $(1 + \varepsilon_t)k$ EWF encoded symbols, where $\varepsilon_t$ denotes the EWF coding overhead at the transmitter. Owing to channel-induced losses, some transmitted symbols are not correctly received, so that the reception overhead $\varepsilon_r$ at the receiver is not equal to $\varepsilon_t$, i.e., $\varepsilon_r \neq \varepsilon_t$. Hence, the parameters at the AL include $\Pi_i$, $\Gamma_i$ and $\varepsilon_t$, where $i = 1, 2$. These parameters need to be optimized, and the relationship between $\varepsilon_r$ and $\varepsilon_t$ needs to be derived in the cross-layer setup.

Fig. 4. Asymptotic SER performance of the MIS and LIS versus $\varepsilon$ for the $F_{EW}(0.105x + 0.895x^2, 0.084x + 0.916x^2, \Omega^{(1)}, \Omega^{(2)})$ EWF code.
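To make the two-window encoding procedure of Section 3.1 concrete, the following Python sketch generates EWF encoded symbols. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, degrees larger than the selected window are truncated to the window size, and the demo reuses the Raptor distribution of (2) for both windows instead of the robust soliton distribution that the paper uses on the first window.

```python
import random

# Raptor degree distribution of Eq. (2): degree -> probability.
OMEGA_R = {1: 0.007969, 2: 0.493570, 3: 0.166220, 4: 0.072646, 5: 0.082558,
           8: 0.056058, 9: 0.037229, 19: 0.055590, 65: 0.025023, 66: 0.003135}

def sample_degree(dist, rng, max_degree):
    """Draw a degree from `dist`; truncate to the window size (assumption)."""
    degree = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
    return min(degree, max_degree)

def ewf_encode_symbol(block, k1, gamma1, dists, rng):
    """Generate one EWF encoded symbol for r = 2 windows.

    block  : list of equal-length `bytes` source symbols (MIS first, then LIS)
    k1     : size of the first window (number of MIS)
    gamma1 : probability Gamma_1 of selecting the first window
    dists  : [Omega^(1), Omega^(2)] as dicts degree -> probability
    Returns (window index, sorted neighbour indices, payload).
    """
    window = 0 if rng.random() < gamma1 else 1
    size = k1 if window == 0 else len(block)       # window 2 spans the whole block
    degree = sample_degree(dists[window], rng, size)
    neighbours = rng.sample(range(size), degree)   # distinct symbols in the window
    payload = bytes(len(block[0]))                 # all-zero start value
    for i in neighbours:
        payload = bytes(a ^ b for a, b in zip(payload, block[i]))  # XOR over GF(2)
    return window, sorted(neighbours), payload

if __name__ == "__main__":
    rng = random.Random(0)
    block = [bytes([i] * 4) for i in range(20)]    # 20 toy symbols of 4 bytes
    for _ in range(5):
        print(ewf_encode_symbol(block, k1=5, gamma1=0.084,
                                dists=[OMEGA_R, OMEGA_R], rng=rng))
```

Symbols drawn from the first window only ever combine MIS, which is why the MIS receive protection from both windows while the LIS are covered by the second window alone.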

Fig. 5. Asymptotic SER performance of the MIS and LIS versus $\Gamma_1$ for the $F_{EW}(0.105x + 0.895x^2, \Gamma_1 x + (1 - \Gamma_1)x^2, \Omega^{(1)}, \Omega^{(2)})$ EWF code with reception overhead $\varepsilon = 0.05$.

3.2. Design of H-QPSK at the PL

As mentioned earlier, the energy of each transmitted symbol in the signal constellation is the squared Euclidean distance of the signal point from the origin. Hence the transmitted symbol energies of the MIB and LIB are

$$
E_{sM} = \left( \frac{d_M}{\sqrt{2}} \right)^2, \qquad
E_{sL} = \left( \frac{d_L}{\sqrt{2}} \right)^2.
\tag{3}
$$

Let $E$ be the average energy of the signal constellation of H-QPSK modulation. Throughout this paper, we assume that $E$ remains fixed and is equal to 1. There are two situations for the AL frame with respect to the average energy.

(1) When an AL frame contains an uncoded source block of size $k$ symbols, the proportions of the MIS and LIS in an AL frame are $\Pi_1$ and $\Pi_2$, respectively. Therefore, the average energy of the signal constellation is (see Appendix A)

$$
E = \Pi_1 E_{sM} + \Pi_2 E_{sL} = \frac{d_M^2}{2} \left( \Pi_1 (1 - \beta^2) + \beta^2 \right),
\tag{4}
$$

and, with $E = 1$, we have

$$
d_M^2 = \frac{2}{\Pi_1 (1 - \beta^2) + \beta^2}, \qquad
d_L^2 = \frac{2\beta^2}{\Pi_1 (1 - \beta^2) + \beta^2}.
\tag{5}
$$

(2) When an AL frame contains an EWF-coded source block of size $(1 + \varepsilon_t)k$ symbols, the proportions of the MIS and LIS in an AL frame are $\Gamma_1$ and $\Gamma_2$, respectively. Hence, the average energy of the signal constellation is

$$
E = \Gamma_1 E_{sM} + \Gamma_2 E_{sL} = \frac{d_M^2}{2} \left( \Gamma_1 (1 - \beta^2) + \beta^2 \right),
\tag{6}
$$

and, with $E = 1$, we have

$$
d_M^2 = \frac{2}{\Gamma_1 (1 - \beta^2) + \beta^2}, \qquad
d_L^2 = \frac{2\beta^2}{\Gamma_1 (1 - \beta^2) + \beta^2}.
\tag{7}
$$

When AL frames, either with or without AL-FEC coding, are passed to the LL, cyclic redundancy check (CRC) bits are added to each IP packet to detect any transmission error. We use the CRC-32 defined by the polynomial $1 + x + x^2 + x^4 + x^5 + x^7 + x^8 + x^{10} + x^{11} + x^{12} + x^{16} + x^{22} + x^{23} + x^{26} + x^{32}$. An IP packet with a 12-byte MPE header and a 4-byte CRC-32 is referred to as an MPE section. Then, each MPE section is divided into an integer number of TS packets. Finally, each TS packet is modulated by using H-QPSK at the PL. Thus, the parameter at the PL is $\beta$, which needs to be optimized in the cross-layer setup.

3.3. Cross-layer transmission schemes

Based on the UEP schemes designed at the AL and the PL, we obtain five combinations of cross-layer transmission schemes, as summarized in Table 1. The layouts of the cross-layer transmission schemes over DVB-H networks are illustrated in Fig. 6 for schemes 1 and 2, and in Fig. 7 for schemes 3, 4, and 5.

Table 1. Cross-layer transmission schemes.

Scheme    1         2         3      4      5
AL        No FEC    No FEC    EEP    UEP    UEP
PL        EEP       UEP       EEP    EEP    UEP

In scheme 1, conventional QPSK modulation, i.e., $\beta = 1$, is applied to all the TS packets regardless of their importance at the PL.

In scheme 2, all the TS packets are modulated by using H-QPSK at the PL, based on their importance.

In scheme 3, all the symbols in a source block are encoded by EEP LT coding at the AL, which is a special case of EWF coding with $\Gamma_1 = 0$. Then, all the TS packets are modulated by using conventional QPSK at the PL.

In scheme 4 and scheme 5, UEP coding is applied at the AL. In scheme 4, we add the UEP scheme at the AL to the base scheme 1 setup by using EWF coding. Then all the TS packets are modulated by using conventional QPSK at the PL. Since EWF coding requires extra coding overhead, scheme 4 cannot outperform scheme 1 in the high SNR case.

In scheme 5, UEP is provided for symbols both at the AL and the PL in a cross-layer fashion. We apply EWF coding to protect all the symbols in a source block based on their importance at the AL. Therefore, the encoded symbols in an AL frame have different importance. Then, all the TS packets are modulated by using H-QPSK according to their importance at the PL. As a consequence, this scheme benefits from the UEP property at both the AL and the PL, so the MIS gain more protection from both EWF coding and H-QPSK. We expect this scheme to achieve the best performance in the low SNR case.

3.4. Packetization scheme in DVB-H system

The packetization scheme in the DVB-H system is shown in Fig. 8. One uncoded or encoded symbol is placed in an IP packet. The IP packets are then sorted and placed into the MPE frame, where each IP packet is encapsulated within a single MPE section. Note that the option of using RS codes is switched off. As time slicing is used in the DVB-H system, an MPE frame is transmitted within a single transmission burst by mapping the MPE frame data onto 188-byte physical-layer TS packets.
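As a numerical sanity check on the energy normalisation of Section 3.2, the following Python sketch evaluates the squared distances of (5) and (7) and confirms that the average constellation energy stays at 1. The function names are hypothetical; `p_mis` stands for $\Pi_1$ when the AL frame is uncoded and for $\Gamma_1$ when it is EWF coded.

```python
import math

def hqpsk_distances(p_mis, beta):
    """Squared distances (d_M^2, d_L^2) of Eqs. (5)/(7) for unit average energy.

    p_mis : proportion of MIS in the AL frame (Pi_1 uncoded, Gamma_1 EWF coded)
    beta  : ratio d_L / d_M, with 0 <= beta <= 1
    """
    denom = p_mis * (1.0 - beta ** 2) + beta ** 2
    if denom == 0.0:
        raise ValueError("p_mis = 0 and beta = 0 leave no symbols to carry energy")
    return 2.0 / denom, 2.0 * beta ** 2 / denom

def average_energy(p_mis, beta):
    """Average energy E = p_mis * E_sM + (1 - p_mis) * E_sL, using Eq. (3)."""
    d_m2, d_l2 = hqpsk_distances(p_mis, beta)
    return p_mis * d_m2 / 2.0 + (1.0 - p_mis) * d_l2 / 2.0

if __name__ == "__main__":
    for beta in (0.2, 0.5, 1.0):
        d_m2, d_l2 = hqpsk_distances(p_mis=0.105, beta=beta)
        print(f"beta={beta}: d_M^2={d_m2:.4f}, d_L^2={d_l2:.4f}, "
              f"E={average_energy(0.105, beta):.4f}")
```

For $\beta = 1$ both squared distances equal 2, which is conventional QPSK with unit symbol energy; smaller $\beta$ enlarges $d_M$ at the expense of $d_L$ while the average energy stays fixed.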

4. Cross-layer optimization of the proposed cross-layer transmission schemes

The purpose of cross-layer approaches is to jointly consider error protection strategies at various network layers to improve the transmission efficiency. Therefore, we first derive the symbol erasure probability for our proposed five cross-layer transmission schemes by jointly taking into account the symbol erasure probability at the AL and the PL over the DVB-H channel, and then we formulate the proposed cross-layer transmission as an optimization problem.

Fig. 6. Proposed schemes 1 and 2.

Fig. 7. Proposed schemes 3, 4, and 5.

Fig. 8. Packetization scheme in DVB-H system.

4.1. Erasure probability analysis

For the sake of simplicity, we assume in this paper that each uncoded or encoded symbol is placed into one IP packet and that each MPE section is transmitted over one TS packet through the DVB-H channel. As in [18], to model the channel accurately, we represent TS packet losses based on trace files for different channel SNRs for the DVB-H channel with real-world measurement conducted at [32]. We further assume that each source block of size $k$ symbols contains $k_1$ MIS and $k_2$ LIS such that $k = k_1 + k_2$.

A wireless channel can be viewed as an erasure channel when error-detecting codes are used [33]. Throughout this paper, we assume that channel errors can be detected perfectly by the CRC-32 check, which determines whether an MPE section is received correctly or not. Once an MPE section received over the DVB-H channel fails the CRC check, an erasure is declared, the section is discarded, and the receiver waits for the next MPE section. In other words, only error-free received TS packets at the PL are forwarded to the AL. Therefore, the symbol erasure probability is equivalent to the TS packet error probability at the PL.

We assume that each source block is divided into $l$ layers of lengths $s_1, s_2, \ldots, s_l$ symbols. The importance of data decreases with the ordering of the layers in the source block. Owing to the error propagation effect of scalable video, we assume that a layer can be used to enhance the video quality only if that layer and all the layers before it are received correctly at the AL, i.e., reconstruction of the scalable video at a decoder is based on the correctly received consecutive layers. Let $P_1, P_2, \ldots, P_l$ denote the probabilities that each of the $l$ data layers is correctly received, respectively.

In scheme 1, consecutive source blocks uncoded at the AL are passed directly to the PL. At the PL, $k$ TS packets are modulated by using conventional QPSK regardless of their importance. Let $P_e$ denote the TS packet erasure probability at the PL. We have

$$
P_e = 1 - (1 - P_b)^{8S},
\tag{8}
$$

where $P_b = \frac{1}{2} \operatorname{erfc}\left( \sqrt{\frac{E_b}{N_o}} \right)$ is the bit error probability of QPSK, $\operatorname{erfc}(\cdot)$ is the complementary error function defined as $\operatorname{erfc}(x) \triangleq \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-t^2} \, dt$ [34], and $S$ is the TS packet size, $S = 188$ bytes. We can calculate the probability $P_i$ as follows:

$$
P_i = (1 - P_e)^{s_i},
\tag{9}
$$

where $s_i$ is the number of symbols in the $i$th layer.

In scheme 2, $k$ TS packets are modulated by using the H-QPSK scheme according to their importance at the PL. Let $P_{eM}$ and $P_{eL}$ denote the TS packet erasure probabilities corresponding to the MIS and LIS at the PL, respectively. We have

$$
\begin{aligned}
P_{eM} &= 1 - (1 - P_{bM})^{8S}, \\
P_{eL} &= 1 - (1 - P_{bL})^{8S},
\end{aligned}
\tag{10}
$$

where $P_{bM} = \frac{1}{2} \operatorname{erfc}\left( \sqrt{\left( \frac{E_b}{N_o} \right)_M} \right)$ and $P_{bL} = \frac{1}{2} \operatorname{erfc}\left( \sqrt{\left( \frac{E_b}{N_o} \right)_L} \right)$ are the bit error probabilities corresponding to the MIS and LIS at the PL, respectively, and $\left( \frac{E_b}{N_o} \right)_M$ and $\left( \frac{E_b}{N_o} \right)_L$ are the SNRs corresponding to the MIS and LIS, respectively (see Appendix B):

$$
\left( \frac{E_b}{N_o} \right)_M = \frac{1}{4\sigma_n^2 \left( \Pi_1 (1 - \beta^2) + \beta^2 \right)}, \qquad
\left( \frac{E_b}{N_o} \right)_L = \frac{\beta^2}{4\sigma_n^2 \left( \Pi_1 (1 - \beta^2) + \beta^2 \right)}.
$$

Therefore, for the case of the $i$th layer contained in the first window, we have

$$
P_i = (1 - P_{eM})^{s_i},
\tag{11}
$$

where $s_i$ is the number of symbols in the $i$th layer. For the case of the $j$th layer contained in the second window, we have

$$
P_j = (1 - P_{eL})^{s_j},
\tag{12}
$$

where $s_j$ is the number of symbols in the $j$th layer.

In scheme 3, $k$ symbols in a source block are encoded by using LT coding at the AL, and the corresponding number of output symbols with an overhead $\varepsilon_t$ is $(1 + \varepsilon_t)k$. Then $(1 + \varepsilon_t)k$ TS packets are modulated by using conventional QPSK at the PL. The TS packet erasure probability is the same as in (8), i.e., the MIS and LIS have the same erasure probabilities at the PL. Owing to transmission errors, the reception overhead $\varepsilon_r$ at the receiver is not equal to the LT coding overhead $\varepsilon_t$ at the transmitter. We have

$$
\varepsilon_r = (1 + \varepsilon_t)(1 - P_e) - 1.
\tag{13}
$$

Based on the reception overhead $\varepsilon_r$ at the receiver and the parameters of the selected LT code, we can calculate the (asymptotic) erasure probabilities of input symbols in the first and second windows using (1) at the AL. Owing to the EEP characteristic of LT codes, the MIS and LIS have the same erasure probability at the AL. Let $P_e'$ denote the symbol erasure probability at the AL. By substituting $P_e'$ into (9), we obtain the probabilities that each of the $l$ data layers is correctly received.

In scheme 4, $k$ symbols are encoded by using EWF coding at the AL, and the corresponding number of output symbols with an overhead $\varepsilon_t$ is $(1 + \varepsilon_t)k$. Then, $(1 + \varepsilon_t)k$ TS packets are modulated by using conventional QPSK at the PL. The TS packet erasure probability at the PL is the same as in (8), and the reception overhead $\varepsilon_r$ at the receiver is the same as in (13). Based on the reception overhead $\varepsilon_r$ at the receiver and the parameters of the selected EWF code $F_{EW}(\Pi_1 x + \Pi_2 x^2, \Gamma_1 x + \Gamma_2 x^2, \Omega^{(1)}, \Omega^{(2)})$, we can calculate the (asymptotic) erasure probabilities of input symbols in the first and second windows using (1). Let $P_{eM}'$ and $P_{eL}'$ denote the symbol erasure probabilities in the first and second windows at the AL, respectively. By substituting $P_{eM}'$ and $P_{eL}'$ into (11) and (12), we obtain the probabilities that each of the $l$ data layers is correctly received, respectively.

In scheme 5, $(1 + \varepsilon_t)k$ output encoded symbols are generated by using EWF coding at the AL. Then $(1 + \varepsilon_t)k$ TS packets are modulated by using H-QPSK at the PL. The TS packet erasure probabilities corresponding to the MIS and LIS at the PL are the same as in (10), where

$$
\left( \frac{E_b}{N_o} \right)_M = \frac{1}{4\sigma_n^2 \left( \Gamma_1 (1 - \beta^2) + \beta^2 \right)}, \qquad
\left( \frac{E_b}{N_o} \right)_L = \frac{\beta^2}{4\sigma_n^2 \left( \Gamma_1 (1 - \beta^2) + \beta^2 \right)}.
$$

$$
\Gamma_{r1} = \frac{\Gamma_1 (1 + \varepsilon_t)(1 - P_{eM})}{1 + \varepsilon_r}, \qquad
\Gamma_{r2} = \frac{\Gamma_2 (1 + \varepsilon_t)(1 - P_{eL})}{1 + \varepsilon_r}.
$$

$$
\Gamma_1^* = \arg\max \eta \quad \text{s.t.} \quad
\begin{cases}
P_{eM}' \le 0.01 \\
0 \le \Gamma_1 \le 1.
\end{cases}
$$

$$
\{\Gamma_1^*, \beta^*\} = \arg\max \eta \quad \text{s.t.} \quad
\begin{cases}
P_{eM}' \le 0.01 \\
0 \le \Gamma_1 \le 1 \\
0 \le \beta \le 1.
\end{cases}
$$

Table 2. Number of symbols in correctly decoded consecutive layers of H.264 SVC Stefan.

5. Cross-layer optimization results In this section, we first apply our proposed five cross-layer transmission schemes to H.264 SVC bitstreams transmitted over DVB-H networks and perform cross-layer optimization to evaluate their performances. Subsequently, we propose an adaptive transmission scheme according to the performance of the five cross-layer optimized transmission schemes over DVB-H networks. The H.265/High Efficiency Video Coding (HEVC) standard [36,37], which was developed by Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group and technically finalized in January 2013, is proposed as the newest video coding standard with the goal of achieving higher coding efficiency than existing coding standards like H.262/263/264, especially when operating on high resolution video content. The scalable extension of H.265/HEVC (SHVC) [38,39] is currently an actively developed project within JCT-VC. Due to the scalability of the SHVC similar to the H.264 SVC, the proposed cross-layer transmission schemes can be applied to SHVC bitstreams transmitted over DVB-H networks. As in [35] and [40], we assume that the transmitted CIF Stefan video sequence (the temporal resolution is 30 fps and the spatial resolution is 352 × 288) has a base layer (BL) and fourteen enhancement layers (EL) that progressively improve the overall video quality. The sequence is segmented into GOPs of size 16 frames. As a source block, each GOP data consists of 𝑘 = 3800 symbols of size 50 bytes, i.e., the source block

(16)

where 𝑁𝑈 𝑀(0) = 0, and for 𝑖 > 0, 𝑁𝑈 𝑀(𝑖) is the number of symbols in correctly received consecutive layers. Note that the average number of correctly received consecutive symbols is decided by both the type of the used scalable video coder and the content of the transmitted data. Let 𝑃 (𝑖) is the probability that the first 𝑖 consecutive layers are correctly received as in [35]:

(17)

The effective goodput of the transmission is defined as 𝜂=

𝑁𝑈 𝑀𝑎𝑣𝑔 Total number of transmitted symbols

.

(21)

Since the cross-layer UEP presents a non-linear optimization problem, we can utilize a GA to perform optimization. For simplicity, we use the GA toolbox in Matlab to achieve the optimal parameters.

𝑖=0

⎧1 − 𝑃 , for 𝑖 = 0 ⎪ 𝑖 1 ⎪∏ 𝑃𝑗 ⋅ (1 − 𝑃𝑖+1 ), for 𝑖 = 1, 2, … , 𝑙 − 1 ⎪ 𝑃 (𝑖) = ⎨ 𝑗=1 ⎪∏ 𝑙 ⎪ 𝑃𝑗 , for 𝑖 = 𝑙. ⎪ ⎩ 𝑗=1

(20)

In scheme 5, UEP schemes are both utilized at the AL and the PL, the optimization parameters are 𝛤1 and 𝛽. The optimization function is

Due to the fact that the source block of scalable video contains data of different importance, i.e., the first and most important data (BL) and followed by progressively less important data towards the end of the block (ELs), the larger the portion of the source block recovered from its beginning onwards, the better the quality of received scalable video. In this paper, we employ the effective goodput to measure the transmission efficiency of the received scalable video and closely reflect the quality of the recovered scalable video. The expected number of received symbols in correctly received consecutive layers can be written as: 𝑃 (𝑖) ⋅ 𝑁𝑈 𝑀(𝑖)

(19)

The optimization parameters for scheme 4 is 𝛤1 . The optimization function is

(15)

4.2. Formulation of optimization problem

𝑙 ∑

400 700 875 1155 1550 3800

𝛽 ∗ = arg max 𝜂 { 𝑃𝑒𝑀 ≤ 0.01 𝑠.𝑡. 0 ≤ 𝛽 ≤ 1.

(14)

where 𝛤𝑟1 and 𝛤𝑟2 are the proportions of the MIS and LIS in the correctly received EWF coded symbols, respectively. Based on the reception overhead 𝜀𝑟 at the receiver and the parameters of received EWF code F𝐸𝑊 (𝛱1 𝑥 + 𝛱2 𝑥2 , 𝛤𝑟1 𝑥 + 𝛤𝑟2 𝑥2 , 𝛺(1) , 𝛺(2) ), we can calculate (asymptotic) erasure probabilities of input symbols in ′ ′ denote symbol erasure and 𝑃𝑒𝐿 the first and second windows. Let 𝑃𝑒𝑀 probabilities in the first and second windows at the AL, respectively. ′ ′ into (11) and (12), we can obtain the By substituting 𝑃𝑒𝑀 and 𝑃𝑒𝐿 probabilities that each of 𝑙 data layers are correctly received.

𝑁𝑈 𝑀𝑎𝑣𝑔 =

Number of symbols

BL only BL + 1 EL BL + 2 ELs BL + 3 ELs BL + 4 ELs BL + All ELs

The goal of cross-layer optimization in our proposed schemes is to deliver a scalable video with the highest possible effective goodput. To maximize the effective goodput, we tune the parameters of the UEP schemes at the AL and the PL under the constraint that symbol erasure probability in the first window is less than or equal to 0.01. In scheme 2, the optimization parameter is 𝛽. For this scheme, the optimization function can be written as

Since the MIS and LIS have unequal symbol erasure probabilities at the PL, we have 𝜀𝑟 = 𝜀𝑡 − (1 + 𝜀𝑡 )(𝛤1 𝑃𝑒𝑀 + 𝛤2 𝑃𝑒𝐿 )

Correctly decoded consecutive layers

(18) 88

K. Deng et al.

Signal Processing: Image Communication 63 (2018) 81–91

Table 3 Effective goodput of scheme 1. 𝐸𝑏 ∕𝑁𝑜 (dB)

−2

−1

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

𝜂

0

0

0

0

0

0

0

0

0

0

0

0

0.012

0.490

0.973

0.999

1

1

Table 4 Optimum cross-layer parameters for scheme 2. 𝐸𝑏 ∕𝑁𝑜 (dB)

−2

−1

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

𝛽 𝜂

– 0

– 0

0.005 0.002

0.007 0.078

0.027 0.104

0.088 0.105

0.198 0.105

0.283 0.105

0.371 0.105

0.406 0.105

0.515 0.105

0.622 0.105

0.797 0.107

0.965 0.511

0.986 0.974

1 0.999

1 1

1 1

Table 5 Effective goodput of scheme 3. 𝐸𝑏 ∕𝑁𝑜 (dB)

−2

−1

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

𝜀𝑟 𝜂

– 0

– 0

– 0

– 0

– 0

– 0

– 0

– 0

– 0

– 0

– 0

0.236 0.208

0.292 0.291

0.299 0.302

0.3 0.303

0.3 0.303

0.3 0.303

0.3 0.303

Table 6 Optimum cross-layer parameters for scheme 4. 𝐸𝑏 ∕𝑁𝑜 (dB)

−2

−1

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

𝛤1 𝜀𝑟 𝜂

– – 0

– – 0

– – 0

– – 0

– – 0

– – 0

– – 0

– – 0

– – 0

0.364 −0.594 0.081

0.146 −0.024 0.081

0.018 0.236 0.249

0.015 0.292 0.324

0.015 0.299 0.334

0.015 0.3 0.335

0.015 0.3 0.335

0.015 0.3 0.335

0.015 0.3 0.335 𝐸

dB than scheme 1. In other words, in the SNR region of 11 < 𝑁𝑜𝑏 < 13.5 dB, scheme 1 can correctly transmit the BL and some ELs and even 𝐸 all the ELs in the SNR region of 𝑁𝑜𝑏 ⩾ 13.5 dB. In the SNR region of 𝐸

6 < 𝑁𝑜𝑏 < 11 dB, scheme 4 can correctly transmit much more correct symbols in consecutive layers than scheme 1, resulting in better quality of the reconstructed video. The important reason behind this fact is that EWF coding is used at the AL in scheme 4. On the one hand, more protect can be provided for the BL by using EWF coding at the AL in scheme 𝐸 4 and the BL can be recovered in the SNR region of 6 < 𝑁𝑜𝑏 < 11 dB. On the other hand, due to the error floor performance of EWF decoding, the effective goodput cannot be further improved in the SNR region of 𝐸𝑏 ⩾ 11 dB in scheme 4, whereas the erasure probability at the PL is 𝑁𝑜 𝐸

sufficient low in the SNR region of 𝑁𝑜𝑏 ⩾ 11 dB and most or even all the ELs can be correctly received on the receiver side, implying that it is unnecessary to encode the source block at the AL when the SNR is high enough and scheme 1 can therefore achieve much better performance than scheme 4. From Tables 4 and 6, we can observe that scheme 4 outperforms 𝐸 the scheme 2 in the SNR region of 8 < 𝑁𝑜𝑏 ⩽ 11 dB. However, the performance of scheme 2 is better than that of scheme 4 in the SNR 𝐸 region of 0.5 ⩽ 𝑁𝑜𝑏 ⩽ 8 dB, where the effective goodput is 0.105. That is, scheme 2 can correctly transmit the BL in the SNR region of 0.5 𝐸 ⩽ 𝑁𝑜𝑏 ⩽ 8 dB. The main reason behind this fact is that more protect can be provided for the BL by H-QPSK modulation at the PL in scheme 2 and the erasure probability is low enough to guarantee the recovery of the 𝐸 BL while the ELs cannot be recovered in the SNR region of 0.5 ⩽ 𝑁𝑜𝑏 ⩽ 8 dB. Owing to the conventional QPSK modulation at the PL in scheme 4, the erasure probabilities of the EWF encoded symbols corresponding to the BL and ELs are the same, resulting in all the EWF encoded symbols cannot be correctly received on the receiver side in the SNR region of 𝐸 −2 ⩽ 𝑁𝑜𝑏 ⩽ 6 dB and the effective goodput is 0. Moreover, as the EWF coding is used at the AL in scheme 4, on the one hand, the BL and even some ELs can be recovered by using UEP EWF coding when the erasure 𝐸 probability is improved in the SNR region of 8 < 𝑁𝑜𝑏 ⩽ 11 dB; on the other hand, due to the error floor performance of EWF decoding, the 𝐸 effective goodput cannot be further improved in the SNR region of 𝑁𝑜𝑏 ⩾ 11 dB, whereas the erasure probability at the PL is sufficient low in the 𝐸 SNR region of 𝑁𝑜𝑏 ⩾ 11 dB and scheme 2 can therefore achieve much better performance than scheme 4. We can observe in Tables 4 and 7 that the use of scheme 2 achieves 𝐸 better performance in the SNR region of 1.5 ⩽ 𝑁𝑜𝑏 ⩽ 8 dB, whereas

Fig. 9. Optimization results with 𝜀𝑡 = 0.3.

size is approximately 190 000 bytes. We assume that the BL data of size 𝑘1 = 400 symbols is placed in the first window of two EWF windows. All the ELs, together with the first window, form the second window. Thus, we have 𝛱1 = 400∕3800 = 0.105. The numbers of symbols in correctly decoded consecutive layers are presented in Table 2. 5.1. Discussion of cross-layer optimization results Note that we take 𝜀𝑡 = 0.3, for example, to perform cross-layer optimization by jointly tuning the parameters of the UEP schemes at the AL and the PL to obtain the highest possible effective goodput. We present the optimization results for the proposed schemes in Tables 3, 4, 5, 6, and 7. Fig. 9 shows the optimization results of schemes 1, 2, 3, 4, and 5. Apparently, it is unnecessary to perform optimization for schemes 1 and 3 and the effective goodputs are reported in Tables 3 and 5, respectively. From Tables 3 and 6, we can observe that scheme 1 𝐸 achieves much better performance than scheme 4 for 𝑁𝑜𝑏 ⩾ 11 dB, whereas scheme 4 can achieve better performance for 6 <

𝐸𝑏 𝑁𝑜

< 11 89

K. Deng et al.

Signal Processing: Image Communication 63 (2018) 81–91

Table 7 Optimum cross-layer parameters for scheme 5. 𝐸𝑏 ∕𝑁𝑜 (dB)

−2

−1

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

𝛤1 𝛽 𝜀𝑟 𝛤𝑟1 𝜂

– – – – 0

0.113 0.019 −0.872 1 0.081

0.151 0.073 −0.848 1 0.081

0.18 0.197 −0.847 1 0.081

0.23 0.272 −0.847 1 0.081

0.282 0.366 −0.847 1 0.081

0.293 0.514 −0.847 1 0.081

0.525 0.537 −0.847 1 0.081

0.611 0.782 −0.847 1 0.081

0.364 1 −0.594 0.364 0.081

0.146 1 0.024 0.146 0.081

0.018 1 0.236 0.018 0.249

0.015 1 0.292 0.015 0.324

0.015 1 0.299 0.015 0.334

0.015 1 0.3 0.015 0.335

0.015 1 0.3 0.015 0.335

0.015 1 0.3 0.015 0.335

0.015 1 0.3 0.015 0.335

𝐸

scheme 5 is better than scheme 2 in the SNR region of −1.5 ⩽ 𝑁𝑜𝑏 ⩽ 1.5 dB, where the effective goodput is 0.081. In other words, scheme 5 can correctly transmit the BL whereas scheme 2 cannot in the SNR region of 𝐸 −1.5 ⩽ 𝑁𝑜𝑏 ⩽ 1.5 dB. As the H-QPSK modulation at the PL is both used in scheme 2 and scheme 5, the BL can be recovered in the SNR regions 𝐸 𝐸 of 3 ⩽ 𝑁𝑜𝑏 ⩽ 10.5 dB and 0 ⩽ 𝑁𝑜𝑏 ⩽ 8 dB in scheme 2 and scheme 5, respectively. Meanwhile, due to the UEP EWF coding is used at the AL in scheme 5, on the one hand, the BL can be recovered in the low SNR 𝐸 region, i.e., 0 ⩽ 𝑁𝑜𝑏 ⩽ 2.5 dB, and some ELs can be recovered in the SNR region of 8 ⩽

𝐸𝑏 𝑁𝑜

⩽ 11 dB; on the other hand, the effective goodput 𝐸

cannot be further improved in the SNR region of 𝑁𝑜𝑏 ⩾ 11 dB due to the error floor performance of EWF decoding. Obviously, as the EEP LT coding and conventional QPSK modulation are respectively used at the AL and PL, the performance of scheme 3 is always worse than that of the other schemes in the SNR region of −2 𝐸 ⩽ 𝑁𝑜𝑏 ⩽ 15 dB. Therefore, we may not utilize this scheme in the following adaptive transmission scheme. Fig. 10. Effective goodput performance of adaptive transmission scheme for Stefan video with 𝜀𝑡 = 0.3.
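The scheme 2 optimization in (19) can be illustrated with a small Python sketch. It is not the authors' procedure: an exhaustive grid search over $\beta$ is swapped in for the Matlab GA, the names are hypothetical, the overall $E_b/N_o$ is identified with $1/(4\sigma_n^2)$, and the ELs beyond the fourth are lumped into a single final layer because Table 2 does not list them individually.

```python
import math

PI1 = 400 / 3800                          # proportion of MIS (BL) symbols
NUM = [400, 700, 875, 1155, 1550, 3800]   # Table 2; last entry lumps the remaining ELs (assumption)
BITS_PER_PACKET = 8 * 188                 # 8S for a 188-byte TS packet


def log_packet_ok(ebno_linear):
    """log(1 - P_e) for a TS packet sent with QPSK at the given Eb/No."""
    p_b = 0.5 * math.erfc(math.sqrt(ebno_linear))
    return BITS_PER_PACKET * math.log1p(-p_b)


def scheme2_goodput(ebno_db, beta):
    """Return (eta, P_eM) for scheme 2: no AL coding, H-QPSK at the PL."""
    ebno = 10.0 ** (ebno_db / 10.0)
    denom = PI1 * (1.0 - beta ** 2) + beta ** 2
    log_ok_m = log_packet_ok(ebno / denom)              # MIS carries the BL
    log_ok_l = log_packet_ok(ebno * beta ** 2 / denom)  # LIS carries the ELs
    sizes = [NUM[0]] + [b - a for a, b in zip(NUM, NUM[1:])]
    probs = [math.exp(sizes[0] * log_ok_m)]
    probs += [math.exp(s * log_ok_l) for s in sizes[1:]]
    num_avg, prefix = 0.0, 1.0
    for i, p in enumerate(probs):
        prefix *= p                       # layers 1..i all received
        p_next = probs[i + 1] if i + 1 < len(probs) else 0.0
        num_avg += prefix * (1.0 - p_next) * NUM[i]
    return num_avg / NUM[-1], -math.expm1(log_ok_m)


def best_beta(ebno_db, steps=1000, max_erasure=0.01):
    """Grid search for beta; returns (beta, eta) or None if infeasible."""
    best = None
    for n in range(steps + 1):
        beta = n / steps
        eta, p_em = scheme2_goodput(ebno_db, beta)
        if p_em <= max_erasure and (best is None or eta > best[1]):
            best = (beta, eta)
    return best
```

Under these assumptions, the search finds no feasible $\beta$ at −1 dB and an effective goodput of roughly 0.002 at 0 dB, in line with Table 4. Where $\eta$ is flat in $\beta$ (for example, while only the BL is recoverable), the returned $\beta$ need not coincide with the tabulated GA result.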

5.2. Adaptive transmission scheme for test scalable video

As illustrated in Fig. 9, each of our proposed five cross-layer transmission schemes achieves its best performance in a specific SNR region. We can therefore adaptively select the proper transmission scheme according to the current channel condition to achieve the best overall transmission performance. For the CIF Stefan video, we obtain the following adaptive transmission scheme. In the SNR region of $E_b/N_o > 11$ dB, we use scheme 1. For $8 < E_b/N_o \le 11$ dB, we use scheme 4. We use scheme 2 in the SNR region of $1.5 < E_b/N_o \le 8$ dB and scheme 5 in $E_b/N_o \le 1.5$ dB.
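The selection rule above amounts to a threshold lookup. The following Python sketch is only an illustration with hypothetical names: it encodes the SNR regions stated for the CIF Stefan video with $\varepsilon_t = 0.3$ and reads the corresponding effective goodput from the $\eta$ rows of Tables 3, 4, 6, and 7 at integer SNR values.

```python
SNR_MIN_DB = -2  # first column of Tables 3-7

# Effective goodput rows of Tables 3, 4, 6 and 7, for -2 dB ... 15 dB.
ETA = {
    1: [0] * 12 + [0.012, 0.490, 0.973, 0.999, 1, 1],
    2: [0, 0, 0.002, 0.078, 0.104] + [0.105] * 7
       + [0.107, 0.511, 0.974, 0.999, 1, 1],
    4: [0] * 9 + [0.081, 0.081, 0.249, 0.324, 0.334] + [0.335] * 4,
    5: [0] + [0.081] * 10 + [0.249, 0.324, 0.334] + [0.335] * 4,
}


def select_scheme(ebno_db):
    """Scheme number chosen by the adaptive rule for a given Eb/No in dB."""
    if ebno_db > 11.0:
        return 1
    if ebno_db > 8.0:
        return 4
    if ebno_db > 1.5:
        return 2
    return 5


def adaptive_goodput(ebno_db):
    """Tabulated goodput of the selected scheme at an integer Eb/No in dB."""
    return ETA[select_scheme(ebno_db)][ebno_db - SNR_MIN_DB]
```

For example, `select_scheme(5)` returns 2 and `adaptive_goodput(5)` returns the corresponding Table 4 entry, 0.105. The thresholds are specific to this sequence and overhead; other settings would need their own optimization results.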



5.3. Performance of the proposed adaptive transmission scheme for test scalable video

We evaluate the performance of our proposed adaptive transmission scheme for the CIF Stefan video sequence. The effective goodput of our proposed adaptive transmission scheme is shown in Fig. 10. With the same method, we perform cross-layer optimization and obtain the adaptive schemes for $\varepsilon_t = 0.1$ and $\varepsilon_t = 0.2$, as illustrated in Fig. 11. The experimental results confirm that our proposed adaptive transmission scheme provides significant overall performance improvement in effective goodput, particularly at low SNR values.
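As a rough illustration of how the effective goodput is evaluated, the following Python sketch computes $NUM_{avg}$ and $\eta$ for scheme 1. It is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, scheme 1 is taken to send the $k$ source symbols with conventional QPSK and no AL coding, and the ELs beyond the fourth are lumped into one final layer because Table 2 does not list them individually.

```python
import math

# Cumulative symbol counts NUM(1..l) taken from Table 2.
# Assumption: every EL after the fourth is lumped into one final layer.
NUM = [400, 700, 875, 1155, 1550, 3800]


def expected_consecutive_symbols(layer_probs, cumulative):
    """NUM_avg: sum over i of P(i) * NUM(i); the i = 0 term is zero."""
    num_avg = 0.0
    prefix = 1.0  # probability that layers 1..i are all received
    for i, p in enumerate(layer_probs):
        prefix *= p
        # Probability that the next layer is received; 0 after the last one.
        p_next = layer_probs[i + 1] if i + 1 < len(layer_probs) else 0.0
        num_avg += prefix * (1.0 - p_next) * cumulative[i]
    return num_avg


def scheme1_goodput(ebno_db, packet_bytes=188):
    """Effective goodput of uncoded QPSK transmission (scheme 1)."""
    ebno = 10.0 ** (ebno_db / 10.0)
    p_b = 0.5 * math.erfc(math.sqrt(ebno))
    log_packet_ok = 8 * packet_bytes * math.log1p(-p_b)  # log(1 - P_e)
    sizes = [NUM[0]] + [b - a for a, b in zip(NUM, NUM[1:])]
    probs = [math.exp(s * log_packet_ok) for s in sizes]
    # No AL coding is assumed, so NUM[-1] symbols are transmitted in total.
    return expected_consecutive_symbols(probs, NUM) / NUM[-1]
```

Under these assumptions, the sketch gives about 0.012 at 10 dB and values above 0.99 from 13 dB, close to Table 3. At intermediate SNRs it comes out lower than Table 3, because lumping the higher ELs discards the partial credit that the individual layers would earn.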

Fig. 11. Effective goodput performance of adaptive transmission scheme for Stefan video with 𝜀𝑡 = {0.1, 0.2, 0.3}.

6. Conclusion

In this paper, we propose five cross-layer transmission schemes and evaluate their performance by performing cross-layer optimization that jointly tunes the parameters to maximize the effective goodput. Based on the cross-layer optimization results, we propose an adaptive transmission scheme for scalable video over DVB-H networks, i.e., we select the proper transmission scheme according to the current channel condition. The results demonstrate that the proposed adaptive transmission scheme provides much better overall performance, even in the low SNR region, than any single scheme used by itself.

Acknowledgments

This work has been supported by the Fundamental Research Funds for the Central Universities (lzujbky-2017-188), the National Natural Science Foundation of China under Grant 61601170, and the CERNET Innovation Project (NGII20170503).

Appendix A

Calculation of the average energy $E$ in the signal constellation. For the H-QPSK signal constellation,

$$E = \Pi_1 E_{sM} + \Pi_2 E_{sL}.$$

Using (3), this equation becomes

$$E = \Pi_1 \left(\frac{d_M}{\sqrt{2}}\right)^2 + \Pi_2 \left(\frac{d_L}{\sqrt{2}}\right)^2.$$

Since $\beta = d_L/d_M$ and $\Pi_2 = 1 - \Pi_1$, we have

$$E = \Pi_1 \left(\frac{d_M}{\sqrt{2}}\right)^2 + (1 - \Pi_1)\left(\frac{\beta d_M}{\sqrt{2}}\right)^2 = \frac{d_M^2}{2}\left(\Pi_1(1 - \beta^2) + \beta^2\right).$$

Appendix B

Derivation of $E_b/N_o$ corresponding to the MIS and LIS, respectively. For the H-QPSK signal constellation, $E_b/N_o$ corresponding to the MIS and LIS can be calculated as follows:

$$\left(\frac{E_b}{N_o}\right)_M = \frac{E_{sM}}{2N_o}, \qquad \left(\frac{E_b}{N_o}\right)_L = \frac{E_{sL}}{2N_o}.$$

$(E_b/N_o)_M$ and $(E_b/N_o)_L$ can be rewritten in terms of $\beta$ by using (3) and (5) as

$$\left(\frac{E_b}{N_o}\right)_M = \frac{1}{4\sigma_n^2\left(\Pi_1(1 - \beta^2) + \beta^2\right)}, \qquad \left(\frac{E_b}{N_o}\right)_L = \frac{\beta^2}{4\sigma_n^2\left(\Pi_1(1 - \beta^2) + \beta^2\right)}$$

where $\sigma_n^2 = N_o/2$.

References

[1] D. Taubman, M. Marcellin, JPEG 2000 Image Compression Fundamentals, Standards and Practice, vol. 642, Springer Science & Business Media, 2001.
[2] H. Schwarz, D. Marpe, T. Wiegand, Overview of the scalable video coding extension of the H.264/AVC standard, IEEE Trans. Circuits Syst. Video Technol. 17 (9) (2007) 1103–1120.
[3] T. Stockhammer, M.M. Hannuksela, T. Wiegand, H.264/AVC in wireless environments, IEEE Trans. Circuits Syst. Video Technol. 13 (7) (2003) 657–673.
[4] D.J. MacKay, Fountain codes, IEE Proc.-Commun. 152 (6) (2005) 1062–1068.
[5] M. Luby, LT codes, in: Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, IEEE, 2002, p. 271.
[6] A. Shokrollahi, Raptor codes, IEEE Trans. Inf. Theory 52 (6) (2006) 2551–2567.
[7] W. Xiang, C. Zhu, C.K. Siew, Y. Xu, M. Liu, Forward error correction-based 2-D layered multiple description coding for error-resilient H.264 SVC video transmission, IEEE Trans. Circuits Syst. Video Technol. 19 (12) (2009) 1730–1738.
[8] R. Hamzaoui, V. Stankovic, Z. Xiong, Optimized error protection of scalable image bit streams [advances in joint source-channel coding for images], IEEE Signal Process. Mag. 22 (6) (2005) 91–107.
[9] D. Sejdinovic, D. Vukobratovic, A. Doufexi, V. Senk, R.J. Piechocki, Expanding window fountain codes for unequal error protection, IEEE Trans. Commun. 57 (9) (2009) 2510–2516.
[10] M. Van Der Schaar, S. Krishnamachari, S. Choi, X. Xu, Adaptive cross-layer protection strategies for robust scalable video transmission over 802.11 WLANs, IEEE J. Sel. Areas Commun. 21 (10) (2003) 1752–1763.
[11] A. Talari, S. Kumar, N. Rahnavard, S. Paluri, J.D. Matyjas, Optimized cross-layer forward error correction coding for H.264 AVC video transmission over wireless channels, EURASIP J. Wirel. Commun. Netw. 2013 (1) (2013) 206.
[12] Y. Wu, S. Kumar, F. Hu, Y. Zhu, J.D. Matyjas, Cross-layer forward error correction scheme using Raptor and RCPC codes for prioritized video transmission over wireless channels, IEEE Trans. Circuits Syst. Video Technol. 24 (6) (2014) 1047–1060.
[13] Y. Pei, J.W. Modestino, Cross-layer design for video transmission over wireless Rician slow-fading channels using an adaptive multiresolution modulation and coding scheme, EURASIP J. Adv. Signal Process. (2007).
[14] A.A. Khalek, C. Caramanis, R.W. Heath, A cross-layer design for perceptual optimization of H.264/SVC with unequal error protection, IEEE J. Sel. Areas Commun. 30 (7) (2012) 1157–1171.
[15] B. Barmada, M.M. Ghandi, E.V. Jones, M. Ghanbari, Combined turbo coding and hierarchical QAM for unequal error protection of H.264 coded video, Signal Process., Image Commun. 21 (5) (2006) 390–395.
[16] D.S. Lun, M. Médard, R. Koetter, M. Effros, On coding for reliable communication over packet networks, Phys. Commun. 1 (1) (2008) 3–20.
[17] D. Vukobratovic, V. Stankovic, Unequal error protection random linear coding strategies for erasure channels, IEEE Trans. Commun. 60 (5) (2012) 1243–1252.
[18] S. Nazir, V. Stankovic, D. Vukobratovic, Scalable broadcasting of sliced H.264/AVC over DVB-H network, in: 2011 17th IEEE International Conference on Networks, 2011, pp. 36–40.
[19] D. Vukobratovic, V. Stankovic, Multi-user video streaming using unequal error protection network coding in wireless networks, EURASIP J. Wirel. Commun. Netw. 2012 (1) (2012) 1–13.
[20] M.A. Shokrollahi, S. Lassen, R. Karp, Systems and processes for decoding chain reaction codes through inactivation, US Patent 6,856,263 (Feb. 15 2005).
[21] D. Vukobratovic, V. Stankovic, Unequal error protection random linear coding for multimedia communications, in: IEEE International Workshop on Multimedia Signal Processing, 2010, pp. 280–285.
[22] D. Sejdinovic, D. Vukobratovic, A. Doufexi, V. Senk, Expanding window fountain codes for unequal error protection, in: Asilomar Conference on, 2007, pp. 1020–1024.
[23] P.K. Vitthaladevuni, M.S. Alouini, Exact BER computation of generalized hierarchical PSK constellations, IEEE Trans. Commun. 51 (12) (2003) 2030–2037.
[24] P.K. Vitthaladevuni, M.S. Alouini, A recursive algorithm for the exact BER computation of generalized hierarchical QAM constellations, IEEE Trans. Inf. Theory 49 (1) (2003) 297–307.
[25] H.X. Nguyen, H.H. Nguyen, T. Le-Ngoc, Signal transmission with unequal error protection in wireless relay networks, in: Global Communications Conference, 2009. GLOBECOM 2009, Honolulu, Hawaii, USA, 30 November - 4 December, 2009, pp. 1–6.
[26] D. Standards, Digital video broadcasting (DVB) - framing structure, channel coding and modulation for digital terrestrial television.
[27] M. Morimoto, M. Okada, S. Komaki, A hierarchical image transmission system for multimedia mobile communication, in: International Workshop on Wireless Image/Video Communications, 1996, pp. 80–84.
[28] S. O'Leary, Hierarchical transmission and COFDM systems, IEEE Trans. Broadcast. 43 (2) (1997) 166–174.
[29] D.K. Asano, R. Kohno, Serial unequal error-protection codes based on trellis-coded modulation, IEEE Trans. Commun. 45 (6) (1997) 633–636.
[30] S. Yamazaki, D.K. Asano, A serial unequal error protection code system using multilevel trellis coded modulation with two ring signal constellations for AWGN channels, in: International Symposium on Intelligent Signal Processing and Communication Systems, 2009, pp. 315–318.
[31] ETSI, EN 302 304 v1.1.1 (2004-11), Digital video broadcasting (DVB); transmission system for handheld terminals (DVB-H), European Telecommunication Standard.
[32] K. Nybom, S. Grönroos, J. Björkqvist, Expanding window fountain coded scalable video in broadcasting, in: 2010 IEEE International Conference on Multimedia and Expo, 2010, pp. 516–521.
[33] X. Wang, W. Chen, Z. Cao, Throughput-efficient rateless coding with packet length optimization for practical wireless communication systems, in: Global Telecommunications Conference, 2010, pp. 1–5.
[34] I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products, Academic Press, 1994.
[35] D. Vukobratovic, V. Stankovic, D. Sejdinovic, L. Stankovic, Z. Xiong, Scalable video multicast using expanding window fountain codes, IEEE Trans. Multimedia 11 (6) (2009) 1094–1104.
[36] G.J. Sullivan, J.R. Ohm, W.J. Han, T. Wiegand, Overview of the high efficiency video coding (HEVC) standard, IEEE Trans. Circuits Syst. Video Technol. 22 (12) (2012) 1649–1668.
[37] J.M. Batalla, Advanced multimedia service provisioning based on efficient interoperability of adaptive streaming protocol and high efficient video coding, J. Real-Time Image Process. 12 (2) (2016) 443–454.
[38] Y. Ye, P. Andrivon, The scalable extensions of HEVC for ultra-high-definition video delivery, IEEE Multimedia 21 (3) (2014) 58–64.
[39] J.M. Boyce, Y. Ye, J. Chen, A.K. Ramasubramonian, Overview of SHVC: Scalable extensions of the high efficiency video coding standard, IEEE Trans. Circuits Syst. Video Technol. 26 (1) (2016) 20–34.
[40] S. Ahmad, R. Hamzaoui, M.M. Al-Akaidi, Unequal error protection using fountain codes with applications to video communication, IEEE Trans. Multimedia 13 (1) (2011) 92–101.