PCCP: Proactive Video Chunks Caching and Processing in edge networks


Future Generation Computer Systems 105 (2020) 44–60


Emna Baccour a,∗, Aiman Erbad a, Kashif Bilal b, Amr Mohamed a, Mohsen Guizani a

a CSE Department, College of Engineering, Qatar University, Qatar
b COMSATS University Islamabad, Abbottabad, Pakistan

Article info

Article history: Received 20 December 2018; Received in revised form 27 June 2019; Accepted 2 November 2019; Available online 9 November 2019.

Keywords: Video chunks; Collaborative chunks caching; ABR; Edge network; Joint processing; Viewing pattern; Proactive caching

Abstract: Mobile Edge Computing (MEC) networks have been proposed to extend cloud services and bring cloud computing capabilities near the end-users at the Mobile Base Stations (MBSs). To improve the efficiency of pushing cloud features to the edge, different MEC servers assist each other to effectively select videos to cache and transcode. In this work, we adopt a joint caching and processing model for Video on Demand (VOD) in MEC networks. Our goal is to proactively cache only the video chunks likely to be watched; instead of caching the whole video content in one edge server (as performed in most previous works), neighboring MBSs collaborate to store different video chunks to optimize storage resource usage. Then, leveraging the Adaptive BitRate (ABR) streaming technology, different representations of each chunk can be generated on the fly and cached in multiple MEC servers. To maximize caching efficiency, we study the video viewing pattern and design a Proactive caching Policy (PcP) and a Caching replacement Policy (CrP) to cache only the chunks with the highest view probability. Servers performing caching and transcoding tasks should be carefully selected to optimize storage and computing resource usage. Hence, we formulate this collaborative problem as an NP-hard Integer Linear Program (ILP). In addition to the CrP and PcP policies, we also propose a sub-optimal relaxation and an online heuristic, which are adequate for real-time chunk fetching. The simulation results show that our model and policies perform more than 20% better than other edge caching approaches in terms of cost, average delay and cache hit ratio for different network configurations. © 2019 Published by Elsevier B.V.

1. Introduction

With the advancement of personal smart devices and the growth of the number of Content Providers (CPs) such as Youtube and Netflix, the world has witnessed an explosion of on-demand video streaming. In 2019, users uploaded around 300 hours of video to Youtube every minute [1]. Cisco has also conducted statistics on traffic loads and predicted that mobile video streaming will represent 72% of the overall data traffic by 2021 [2]. Additionally, people are expected to share 1 million minutes of video every second by 2020 [3]. Such real-time services, which require short transmission delays, are no longer feasible with far-located cloud data centers. To deal with the growth of cloud data traffic, small caching and computing servers are deployed in the MBSs or access points to bring cloud computing capability [4] to the edge and offer storage and computing to cache and transcode videos near the users. In this way, the users' requests are fetched only once from

∗ Corresponding author. E-mail address: [email protected] (E. Baccour). https://doi.org/10.1016/j.future.2019.11.006 0167-739X/© 2019 Published by Elsevier B.V.

the remote servers. Subsequent requests can be served from the edge cache without duplicating the transmission from the far-located networks. These distributed edge servers not only enhance the Quality of Experience (QoE), but also help CPs and Mobile Network Operators (MNOs) minimize cost and network traffic. However, implementing MEC servers in the MBSs is not enough to minimize the data transfer between viewers and the Content Delivery Network (CDN), due to multiple limitations: (a) the caches deployed in edge servers have scarce storage capacities that cannot serve all requests when they operate independently, without collaborating with each other; (b) because of the instability of network conditions and the heterogeneity of end-user capabilities, requests for the same video may have different bitrate requirements, which may differ from the original version. To cope with this heterogeneity, Dynamic Adaptive Streaming over HTTP (DASH), a popular ABR technique, is introduced to transcode a single video into multiple versions with different resolutions in order to serve viewers with heterogeneous bitrate preferences. Serving all clients while respecting the current bandwidth conditions requires multiple requests to the CDN and storing different versions of the


same content in the edge cache. These requirements present limitations in terms of storage and intensification of load in the MEC servers; (c) it is hard to have a global view to efficiently cache videos at the MEC servers based on the current users' requests so as to reduce data transfer from the CDN. To mitigate these limitations, a few research attempts propose to implement both caching and processing in edge servers to serve all viewers' requests for different video representations [5]. By owning computation capabilities, servers can transcode a high-bitrate version to different lower bitrates. In this way, a MEC server may cache a higher bitrate version, which can be transcoded on the fly to multiple lower bitrate versions as demanded, without requesting the content from remote servers. These contributions have been extended to adopt a collaborative caching approach [6,7]. In collaborative caching networks, MEC servers assist each other to share video contents and to provide the requested videos (either from their cache or by transcoding). Hence, collaborative caching and processing has been introduced as an approach that reduces the transit and backhaul load needed to fetch different representations of the same video and increases the likelihood of finding the requested content at the edge with a reduced latency.

Challenges and motivation: The related works in MEC caching and transcoding face multiple challenges and limitations. (1) Most of the related works proposed to fetch, cache, and transcode the whole requested video. However, for example, 20% of Youtube viewers abandon videos in the first 10 s if they do not attract them [8]. Furthermore, according to Ooyala's Q4 2013 report [9], the watch time of VODs on connected TVs is 5.1 min per play. Meanwhile, the average watch time on mobiles, per play, is only 3.5 min for live streaming and 2.8 min for VOD. For long contents, videos are only watched in 3-min chunks when using mobile devices [9].
The authors of [10] stated that the engagement of viewers depends on the length of videos and that more than 40% of viewers abandon a video near the beginning, and more than 60% stop streaming by the middle, if the video is longer than 20 min. Other studies [11,12] showed that, in addition to the length of videos, the popularity and the category of a video can impact the watching time. Hence, for several reasons, viewers do not watch the whole video; they prefer to skip parts and watch only highlights of the stream. To summarize, ending a video unexpectedly without fully watching it incurs (a) a higher cost related to fetching unneeded parts of the videos; (b) a higher bandwidth consumption on the transit and backhaul links; (c) a lower hit ratio, due to caching unwatched video parts that rapidly saturate the cache storage, and a lower caching efficiency, because a long video is stored in the same MBS; (d) a lower processing efficiency and a higher transcoding cost, caused by generating videos that will not be fully watched, in addition to rapid resource consumption because a long video is transcoded in one server. (2) Most of the works follow a reactive approach, where an uncached video is fetched from the remote server when requested [6,7]. In peak hours, such a reactive paradigm puts considerable load on an already congested network and is unable to deliver the required QoE, which results in high startup delays. (3) Some of the MEC caching related works proposed proactive caching, where popular videos are cached in advance. However, such approaches fetch and cache the whole video. By caching whole videos, the cache is rapidly filled and the number of cached contents remains low. Also, since viewers do not watch videos until the end, even popular videos can contain unpopular chunks that are abandoned by users. These unpopular chunks will be proactively cached, since they belong to a popular video.
Thus, these proactive approaches do not take into consideration the popularity of individual chunks within a video and consequently do not implement proactive caching effectively.


Our approach: In ABR-based streaming, the video content is divided into small chunks for delivery. Chunked transfer encoding is a streaming data transfer mechanism available in HTTP where a video is downloaded in chunks, saved in a buffer, and then played to the viewer chunk by chunk [13]. In this paper, we consider a model where the network does not fetch all chunks when a video is requested. Instead, various chunks of the video are fetched proactively based on a probabilistic model of chunk views. Such probability-based proactive caching minimizes the delay and increases the QoE. In addition, each chunk is fetched with the bitrate version chosen by the viewer. Then, since the video is delivered in chunks, parts of the same video may be dispersed in various caches based on viewers' access patterns. In this paper, chunks are stored at the edge in a collaborative caching and processing network. In this way, a single video can be cached, and potentially transcoded, in different MEC servers. The advantages of this strategy are: (a) the CDN servers do not need to deliver a large amount of unwatched content, which minimizes the fetching cost and the backhaul bandwidth utilization; (b) viewers can receive a video that matches their preferences and abides by the network conditions, while minimizing data waste and processing inefficiency; (c) caching a smaller amount of data can maximize the number of distinct videos present in the edge network, which has limited storage and processing capabilities; (d) transferring and processing smaller chunks of videos reduces the delays of serving clients; (e) if a server does not have enough space to cache one long video or to transcode it, different MBSs can collaborate to store this content and convert it to the needed bitrate version, improving the cache hit ratio. In [14], we derived a study of chunk popularity based on users' preferences and viewing patterns.
We analyzed a dataset [15] of nearly 5 million videos to understand users' video viewing behaviors and showed that this behavioral pattern can be exploited to improve video delivery. Using this derived probabilistic model, a proactive caching policy that pre-loads the cache with popular chunks and a reactive cache replacement policy are introduced. In this paper, our proactive policy is used not only at the installation of the system, but also when the video library changes and new videos are added. The contributions of this paper are summarized as follows:
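To make the idea concrete, the following is a minimal, illustrative sketch of probability-driven proactive chunk caching (not the paper's exact PcP algorithm): given per-chunk view probabilities, the cache is filled greedily with the chunks offering the highest probability per byte. The names `view_prob`, `chunk_size` and `proactive_cache_fill` are illustrative stand-ins for the probabilistic model of [14].

```python
# Illustrative sketch (not the paper's exact PcP policy): proactively fill
# an edge cache with the chunks most likely to be watched.
# `view_prob` maps (video_id, chunk_index) -> estimated view probability,
# a stand-in for the probabilistic model derived in [14].

def proactive_cache_fill(view_prob, chunk_size, capacity):
    """Greedily select chunks by view probability per byte until the
    cache budget is exhausted. Returns the set of selected chunks."""
    ranked = sorted(view_prob,
                    key=lambda c: view_prob[c] / chunk_size[c],
                    reverse=True)
    cached, used = set(), 0
    for chunk in ranked:
        if used + chunk_size[chunk] <= capacity:
            cached.add(chunk)
            used += chunk_size[chunk]
    return cached

# Early chunks of a video typically have higher view probability.
probs = {("v1", 0): 0.9, ("v1", 1): 0.6, ("v1", 2): 0.2, ("v2", 0): 0.8}
sizes = {c: 10 for c in probs}  # uniform chunk size (e.g., MB)
print(proactive_cache_fill(probs, sizes, capacity=30))
```

With a 30 MB budget, only the three most valuable chunks are pre-loaded; the low-probability tail chunk of `v1` is left to the CDN.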

• We present our collaborative chunks caching and processing framework. By proposing the possibility of transcoding and watching one video from different cooperative MEC servers, resource utilization can be significantly improved, specifically when the cache is full.
• We formulate the collaborative multi-bitrate chunks caching and processing as an optimization problem that aims to minimize the network cost using binary decision variables.
• We use our study of viewing patterns and our analysis of the popularity of different chunks to design a proactive PcP caching policy that pre-fetches video chunks based on users' preferences. Additionally, we design a CrP policy to update the cache with popular arrivals.
• Since the optimization problem is an ILP with a high solving complexity, we propose a relaxation of the constraints applied to the decision variables, in order to obtain an efficient solution. We also propose an online heuristic, namely the Proactive Chunks Caching and Processing (PCCP) algorithm, adequate for real-time use. This heuristic treats the problem in a greedy and distributed way, which eliminates the inter-dependence between different MEC servers. The relaxation and the heuristic use the PcP and CrP policies to maximize caching of popular videos.
• We evaluate the performance of our system thoroughly and we compare it to the performance of different edge caching

systems. Additionally, we justify the use of a greedy and distributed algorithm against the optimal and sub-optimal solutions. Our simulation is based on the dataset published in [15].

Our paper is organized as follows: Section 2 gives an overview of the existing efforts in the literature and the different challenges motivating our work. Section 3 presents our collaborative chunks system, where we formulate the system as an ILP optimization problem. Then, we present our probabilistic framework to model the viewing pattern of videos and we express the popularity of chunks based on users' preferences. Next, based on chunk probabilities, we develop two caching policies, CrP and PcP, to maximize the caching of popular chunks. Using these policies, a sub-optimal solution is proposed and a real-time heuristic, PCCP, is presented. A detailed experimental evaluation is provided in Section 4. Finally, Section 5 presents the conclusions.

2. Related work

2.1. Mobile edge computing networks

Edge caching is extensively studied in the context of MEC networks. At first, significant efforts were conducted to suggest CDNs that complement cloud networks and store Internet contents [16,17] and, later, to introduce caching and multi-location delivery for Internet Service Providers (ISPs) [18]. However, as explained previously, caching in the CDN presents high delays in delivering video content and an incapacity to handle high loads. Considering that video streaming demand is continually increasing, many efforts have been conducted to prove the benefits of the edge in minimizing the traffic incoming from the CDN, summarized in the recent surveys [19,20] and [21]. Indeed, edge networks were introduced to address CDN challenges and meet delay-sensitive application requirements. More specifically, MNOs integrated networking, caching and computation resources with the MBS to build a MEC platform.
This MEC platform does not replace the cloud but complements it, executing delay-sensitive tasks next to the users and using the computation capacity of the large number of connected devices to execute real-time compute-intensive applications [22].

2.2. Video caching, transcoding and traffic management at MEC servers

Several researchers have studied the concept of computation and caching in wireless networks in order to accomplish better data rates and minimize latency as a key step toward creating an infrastructure for 5G systems; among them we can cite [23–25]. In the context of multimedia, as videos are large-sized contents and different bitrate versions of the same video can be requested, the utilization of the limited MEC resources becomes more challenging. One solution to cater for this challenge is online video processing, which transcodes the video to the required representation on the fly. In this context, authors in [26,27] studied the implementation of Scalable Video Coding (SVC) in edge networks. However, this technique faced multiple problems, including the lack of hardware support and the increase of power consumption in mobile devices. The work in [28], proposing the CachePro approach, investigated the potential benefits of using the storage and computing capabilities of MEC servers. Differently from previous works, the authors considered the ABR technique for video processing and showed its efficiency for real-time transcoding. With the same goals of improving video quality and reducing buffering time in multimedia applications, authors in [29] adopted the DASH technique and proposed a radio network-aware cache to select the appropriate chunk bitrate to

offer to viewers. Authors in [30] studied traffic management to improve the QoE for video services. In this work, the MEC server monitors the bandwidth required to serve a video request and instructs the mobile terminal to combine the cellular and WiFi networks to compensate for insufficient bandwidth, in order to offer the best video quality. As the MEC servers have only a limited storage and computation capacity, insufficient to store the large-sized video library, the previous works fail to serve a scalable number of clients from the edge. Hence, collaboration between MEC servers to maximize edge resources, along with intelligent caching and eviction techniques, is introduced to enhance the cache hit ratio and minimize the backhaul load.

2.3. Collaborative video caching and transcoding

Many recent efforts studied the contribution of collaborative caching to enhancing the performance of the network. The work in [31] proposed joint caching in multi-cell systems. The authors formulated the problem as an ILP optimization to minimize the overall cost of sharing data either between neighboring MBSs or between each MBS and the CDN, and proved that this problem is NP-hard. Hence, they designed an online algorithm that relaxes the problem complexity while minimizing the network cost. Aiming to reduce the cost burden as in the previous work, authors in [6] proposed a framework where MEC servers collaborate to satisfy incoming video requests based on the payment and the demand pattern. Similarly, authors in [32] proposed the CoCache system, which uses collaborative caching and auction-based serving to minimize the cost of the network. Authors in [33] aimed to maximize the viewers' perceived QoE. For this goal, they formulated the multiple-bitrate video caching problem under a caching capacity constraint, and proved that it is NP-hard.
Therefore, they proposed an efficient caching algorithm by proving that any feasible caching solution can be transformed into another feasible solution where only videos with the maximum bitrate are deployed on MEC servers, without decreasing the corresponding QoE. This approximation demonstrated its efficiency via extensive simulation. Authors in [34] proposed a utility-based cache placement strategy to reasonably place contents in an edge computing system by jointly considering video transmission cost, caching value and cache replacement penalty. A weighted bipartite graph model is applied to describe the relationships between serving tasks and edge servers. The work in [35] presented the Hy-CoCa framework, consisting of distributed MEC servers and a Network Exposure Function (NEF), to support requests locally and maximize viewers' satisfaction. Considering realistic features such as the uneven distribution of users, MEC servers' proximity and the network size, the authors divided servers into disjoint logical groups using the fuzzy C-means (FCM) clustering algorithm to serve requests more efficiently. The work in [36] also aimed to improve the QoE and maximize the users' task offloading gain. This gain is measured by a weighted sum of reductions in completion time and energy consumption. The considered problem is formulated as a mixed integer nonlinear program (MINLP). Due to its combinatorial nature, the authors proposed to decompose the original problem into a resource allocation (RA) problem with a fixed task offloading decision and a task offloading (TO) problem that optimizes the optimal-value function corresponding to the RA problem. The works in [7] and [37] suggested collaborative caching and processing in the context of Radio Access Networks (RANs). The proposed system, namely JCCP, handled video sharing between different caches and the transcoding of a video content to different bitrate versions, in order to use edge resources more


Fig. 1. Illustration of research design and different steps of the work.

efficiently. Authors in [38] proposed an approach named CJCT, where they used the X2 interface, reserved for handover tasks, to enlarge the bandwidth capacity of the MBS. These two previous efforts are the closest to our work. However, they propose to fetch the whole requested video from the CDN and to store and transcode it in one edge server, which decreases the caching/processing efficiency. In our work, since we are dealing with chunks, multiple MBSs can collaborate to store and process one video in case of resource scarcity.

2.4. Proactive video caching at MEC servers

To enhance edge caching efficiency, some efforts proposed to study the popularity of videos and cache only the popular contents that have the highest probability of being requested later. For example, authors in [39] proposed a technique that pre-loads videos based on the social associations of active users. These videos are then cached on viewers' mobile devices to offer a smaller delivery delay. However, the proposed framework does not address the challenge of offering multiple bitrate versions of the same video. Authors in [5] also proposed a technique to prefetch videos and cache them proactively based on active users' preferences. However, in realistic scenarios, viewers do not watch videos to the end and prefer to skip parts, which makes requesting the whole video from the CDN a waste of cost and storage resources. In addition, even if a video is popular, some of its chunks may be unpopular. Hence, studying users' video preferences is not enough to maximize caching efficiency, since different parts of a single video do not have the same popularity. Instead, in our work, we study the popularity of chunks within a video based on viewing patterns. In summary, the novelty and contributions of this paper compared to the existing literature are: (a) presenting our study of the viewing pattern of chunks within a video and the popularity of chunks inside the video library.
This helps to design a probabilistic model that estimates the probability of selecting chunks, which is key to the development of proactive and reactive caching algorithms; (b) collaborative caching and transcoding of chunks to maximize edge caching and processing; (c) caching only parts of videos instead of wasting storage space, processing capacity and cost to fetch the whole video from the CDN;

(d) designing proactive and reactive caching policies based on the studied chunk popularities and demonstrating their effectiveness in producing a high cache hit ratio; (e) modeling the caching and processing of chunks as an ILP and designing a distributed fetching algorithm adequate for real-time chunk serving and allocation.

3. Proactive caching and processing of video chunks

In this section, we present collaborative MEC caching, transcoding and offloading of different chunks to serve a video request. Our goal is to minimize CDN transmissions, maximize edge caching, use bandwidth resources more efficiently, and reduce content delivery delays and network cost. We start by describing our architecture and illustrating the system model. The description of the architecture is followed by the formulation of the problem as an ILP that yields an optimal resource caching and transcoding for a minimum network cost and a maximum viewers' QoE. As this problem is NP-hard, we design a solution adequate for real-time implementation. Hence, we relax the storage constraints by designing proactive and reactive caching: we model the users' viewing pattern and design the caching policies (PcP and CrP). Using these policies, we propose a sub-optimal solution with a lower complexity. Yet, since it is still not suitable for on-the-fly serving, we propose an online greedy algorithm (PCCP). These steps are illustrated in Fig. 1. To evaluate the efficiency of our approach, we compare it, in the next section, to recent approaches in terms of average delay, hit ratio, CDN data, cost and rejected requests. Finally, to justify the design of PCCP, we compare it to the sub-optimal relaxation in terms of run-time complexity and performance.

3.1. System model

In our system, the network consists of multiple MBSs, each associated with a MEC server providing computation, storage and networking capabilities.
Multiple MBSs that collaborate in joint caching and processing constitute a cluster. In this work, we intend to use the servers for caching and computation and we assume that these servers can share resources via backhaul links. Then, the shared streams can be transmitted to mobile users


Fig. 2. Illustration of the proposed collaborative chunks caching and processing framework to serve a video to viewers.

if requested. In addition, the transcoder embedded within the server can transcode a higher bitrate version to the required bitrate version if needed. Considering the cache and transcoder as a single entity, such a server has computation power, storage, and a streaming server. So, if the same version is requested, the data is fetched from the cache and delivered; otherwise, the data is transcoded and delivered. Video transcoding is the lossy compression of a higher bitrate version to a lower one. This task can be accomplished using different techniques, including bitrate reduction and spatial resolution reduction, as described in [40]. Transcoding is a computation-intensive task with a high cost due to its high CPU usage and energy consumption. Therefore, optimal allocation of the limited computation resources is vital. The architecture of our system is described in Fig. 2. In our system, a cluster is composed of K MBSs associated with caching servers and is denoted by K = {1, 2, ..., K}. The video library shared in the cluster is indexed by V = {1, 2, ..., V}. We suppose that all videos in the library have M representations (bitrate versions). Each video is partitioned into chunks of equal length. The set of chunks of a video v is denoted as v = {v_1, v_2, ..., v_i, ..., v_c}, where c is the number of chunks in the video v. All video chunks with bitrate version l have the same size, proportional to the bitrate, denoted as s_l. The set of all chunks that can be requested by a viewer is denoted as V̆ = {v_i^l | v_i ∈ v, v ∈ V, l = 1, 2, ..., M}. We consider that a video chunk v_i^l can be obtained by transcoding the chunk v_i^h if l ≤ h, ∀ v_i ∈ v, v ∈ V and l, h ∈ {1, 2, ..., M}. We consider that viewers can only request and receive videos from the closest MBS, i.e., the MBS with the strongest signal strength, denoted as the home node.
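The notation above can be captured in a small, illustrative data model (the class and constant names below are our own, not part of the paper): a chunk is identified by its video, its index within the video, and its bitrate representation, and a chunk v_i^l is obtainable from v_i^h whenever l ≤ h.

```python
# A minimal data model for the notation in this subsection (names are
# illustrative): a library of videos split into chunks, each available
# in M bitrate representations, where a lower index means a lower bitrate.

from dataclasses import dataclass

M = 4                       # number of representations
CHUNK_SIZES = [1, 2, 4, 8]  # s_l: chunk size (e.g., MB) at bitrate l = 1..M

@dataclass(frozen=True)
class Chunk:
    video: int    # v in V
    index: int    # i: position of the chunk within the video
    bitrate: int  # l in {1..M}

    @property
    def size(self) -> int:
        return CHUNK_SIZES[self.bitrate - 1]

def can_transcode(source: Chunk, target_bitrate: int) -> bool:
    """v_i^l is obtainable from v_i^h iff l <= h (same video, same index)."""
    return target_bitrate <= source.bitrate

hi = Chunk(video=3, index=7, bitrate=4)
assert can_transcode(hi, 2)                   # lower bitrate from a higher one
assert not can_transcode(Chunk(3, 7, 1), 3)   # upscaling is not possible
```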
In addition, each server in our system is provisioned with a cache capacity equal to S_k bytes. We model the video content caching by introducing the variable C_k^{v_i^l} ∈ {0, 1}, ∀ v_i ∈ v, v ∈ V, ∀ l ∈ {1...M}, ∀ k ∈ K, where C_k^{v_i^l} = 1 if v_i^l is stored at the MBS k and C_k^{v_i^l} = 0 otherwise. The cache capacity of the cache server associated with a MBS k is expressed as follows:

    Σ_{v_i^l ∈ V̆} s_l · C_k^{v_i^l} ≤ S_k,   ∀ k ∈ K    (1)

Also, we consider that the backhaul links, connecting different MBSs and serving chunks from the CDN, have limited bandwidth. This means each MBS k has a B_k bandwidth capacity to transmit contents to other nodes or to receive chunks from the CDN. Since transcoding videos is a computationally intensive task, and because of the real-time requirements of converting videos to the requested bitrate, we consider separate instances for each transcoding task, as is common for real-time live video transcoding. However, since transcoding from high bitrate representations (e.g., 1080p or 720p) to low bitrate representations (e.g., 480p or 360p) has different computational requirements, we assign a single instance size for transcoding tasks that is large enough to transcode to the highest considered representation of the chunks. This approach is more robust and appropriate for real-time computation. Hence, let P_k represent the processing capacity (number of transcoding instances) of the kth caching server.
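As an illustration, the per-server storage constraint (1) amounts to a simple feasibility check over any candidate placement. The sketch below (function and variable names are ours, not the paper's) verifies that the sizes of the chunks flagged for each MBS fit within that server's cache budget S_k.

```python
# Sketch of the storage constraint (1): for every MBS k, the total size of
# the chunks cached there (C_k^{v_i^l} = 1) must not exceed S_k.
# `placement[k]` is the set of (chunk_id, bitrate) pairs cached at MBS k;
# `size[l]` gives s_l, the chunk size for representation l.

def respects_capacity(placement, size, S):
    """Return True iff sum of s_l over cached chunks <= S_k for every MBS k."""
    return all(
        sum(size[l] for (_, l) in placement[k]) <= S[k]
        for k in placement
    )

size = {1: 1.0, 2: 2.5}                                  # s_l in MB
placement = {0: {("v1c0", 2), ("v1c1", 1)}, 1: {("v2c0", 2)}}
S = {0: 4.0, 1: 2.0}
print(respects_capacity(placement, size, S))  # MBS 1 overflows: 2.5 > 2.0
```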

3.2. Problem formulation

To illustrate the different events that can occur when a user connected to a MBS j requests a chunk of video v_i^l ∈ R_j, we introduce the binary variables {H^j_{v_i^l}, N^{jk}_{v_i^l}, Ht^j_{v_i^l}, Nt^{jk}_{v_i^l}, Nh^{jk}_{v_i^l}} ∈ {0, 1}, which are defined as follows:

• H^j_{v_i^l} = 1 indicates that the chunk v_i^l requested by a user connected to the MBS j can be served directly by his home server j; H^j_{v_i^l} = 0 otherwise.
• Ht^j_{v_i^l} = 1 indicates that the chunk v_i^l requested by a user connected to the MBS j can be obtained from the home MBS server j, after transcoding it from a higher video representation v_i^h; Ht^j_{v_i^l} = 0 otherwise.
• N^{jk}_{v_i^l} = 1 indicates that the chunk v_i^l requested by a user connected to the MBS j can be served from the neighbor MBS server k (j ≠ k), k ∈ {1..K}; N^{jk}_{v_i^l} = 0 otherwise.
• Nt^{jk}_{v_i^l} = 1 indicates that the chunk v_i^l requested by a user connected to the MBS j can be served from the neighbor MBS server k, after being transcoded from a higher video representation v_i^h; the transcoding is performed in the MBS k (j ≠ k); Nt^{jk}_{v_i^l} = 0 otherwise.


Fig. 3. Possible scenarios that can occur when a user requests a video chunk: (a) The chunk of video can be served directly by the home server; (b) The chunk of video can be served from the home MBS server after being transcoded from a higher representation; (c) The chunk of video can be served from the neighbor MBS server; (d) The chunk of video can be served and transcoded in the neighbor MBS server; (e) The chunk of video can be served from the neighbor MBS server and transcoded in the home server; (f) The chunk of video can be served from the cloud either directly or after being transcoded.

• Nhjk = 1 indicates that the chunk of video vil requested by a vl i

user connected to the MBS j can be served from the neighbor MBS server k, after being transcoded from a higher video representation vih . The transcoding is performed in the home jk MBS j (j ̸ = k); Nh l = 0 otherwise. vi

• Nvj0l = 1 indicates that the chunk of video vil requested by a i

user connected to the MBS j can be served directly from the j0 cloud with the exact representation; N l = 0 otherwise. vi

• Nhj0 = 1 indicates that the chunk of video viM can be served vl i

directly from the cloud and, then, transcoded at home node; j0 Nh l = 0 otherwise. vi

When a viewer requests a video chunk v_i^l from the home MBS j, this chunk will be served following only one of the previously presented events. The different possible events are described in Fig. 3. We define the following constraint to guarantee that only one scenario is adopted (∀ v_i ∈ v, v ∈ V, ∀ l ∈ {1...M}, ∀ j ∈ K):

    H^j_{v_i^l} + Ht^j_{v_i^l} + Σ_{k∈K, k≠j} (N^{jk}_{v_i^l} + Nt^{jk}_{v_i^l} + Nh^{jk}_{v_i^l}) + N^{j0}_{v_i^l} + Nh^{j0}_{v_i^l} = 1    (2)
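The mutual exclusivity imposed by constraint (2) can be illustrated with a small executable sketch (Python is our choice here; the scenario names abbreviate the indicator variables and are not from the paper):

```python
from dataclasses import dataclass, field

# The serving scenarios of constraint (2): exactly one indicator is 1 per request.
SCENARIOS = (
    "H",    # served directly from the home cache
    "Ht",   # transcoded at the home MBS from a higher representation
    "N",    # fetched from a neighbor cache
    "Nt",   # transcoded at the neighbor, then fetched
    "Nh",   # higher representation fetched from a neighbor, transcoded at home
    "N0",   # exact representation fetched from the CDN
    "Nh0",  # highest representation fetched from the CDN, transcoded at home
)

@dataclass
class ServingDecision:
    """One-hot indicators for a single chunk request (constraint (2))."""
    indicators: dict = field(default_factory=lambda: {s: 0 for s in SCENARIOS})

    def choose(self, scenario: str) -> None:
        # Selecting one scenario clears all the others.
        for s in SCENARIOS:
            self.indicators[s] = 1 if s == scenario else 0

    def is_valid(self) -> bool:
        # Constraint (2): the indicators must sum to exactly 1.
        return sum(self.indicators.values()) == 1

d = ServingDecision()
d.choose("Nt")
assert d.is_valid()
```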

The next step is to define the cost (in terms of weight) of the network after serving all viewers' requests. Let Cb^l denote the cost of getting a video chunk with bitrate version l from neighboring MBSs (this cost implicitly includes the cost of utilizing networking resources, e.g., bandwidth), Ct^l denote the cost of transcoding a video chunk to a representation l, and Cc^l denote the cost of fetching a video chunk with bitrate representation l from the CDN. In practice, Cc^l is larger than Cb^l, since the backhaul link connecting the different MBSs is shorter than the links connecting the CDN cloud with the MBS. This fact makes getting the video chunk from the cache servers more cost-effective than getting the same video from the CDN. The network cost when delivering a chunk v_i of video v with a representation l to a user connected to the MBS j can be expressed as follows (∀ v_i ∈ v, v ∈ V, l ∈ {1...M}, j, k ∈ K):

    CsT_j(v_i^l) = [Cc^l · N^{j0}_{v_i^l} + Ct^l · Ht^j_{v_i^l} + (Cc^M + Ct^l) · Nh^{j0}_{v_i^l}
                   + Σ_{k∈K, k≠j} ((Cb^l + Ct^l) · (Nt^{jk}_{v_i^l} + Nh^{jk}_{v_i^l}) + Cb^l · N^{jk}_{v_i^l})] · s_l    (3)
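A minimal sketch of Eq. (3) in Python (our simplification: a single aggregated neighbor term instead of the explicit sum over k; serving a home cache hit (H) costs nothing, consistent with the equation; all names and the example cost values are ours):

```python
def delivery_cost(ind, s_l, Cb, Ct, Cc, M):
    """Network cost CsT_j for one chunk of size s_l at representation l,
    following Eq. (3).  `ind` holds the one-hot scenario indicators and the
    chosen level; `Cb`, `Ct`, `Cc` are per-level unit costs (dicts)."""
    l = ind["level"]
    cost = (
        Cc[l] * ind["N0"]                          # direct CDN fetch
        + Ct[l] * ind["Ht"]                        # transcode at home
        + (Cc[M] + Ct[l]) * ind["Nh0"]             # CDN fetch of level M + home transcode
        + (Cb[l] + Ct[l]) * (ind["Nt"] + ind["Nh"])  # neighbor fetch + transcoding
        + Cb[l] * ind["N"]                         # plain neighbor fetch
    )
    return cost * s_l

Cb = {1: 1.0, 2: 1.2}
Ct = {1: 0.5, 2: 0.6}
Cc = {1: 3.0, 2: 3.5}
ind = {"level": 1, "N0": 0, "Ht": 0, "Nh0": 0, "Nt": 1, "Nh": 0, "N": 0}
print(delivery_cost(ind, s_l=2.0, Cb=Cb, Ct=Ct, Cc=Cc, M=2))  # (1.0 + 0.5) * 2.0 = 3.0
```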

We next present a formulation of the proposed collaborative caching and processing problem. Our optimization problem aims to minimize the cost of delivering chunks of videos to viewers. In particular, subject to the constraints related to resource availability (processing capability and cache storage), we want to jointly determine the cache placement policy C_j^{v_i^l} and the video fetching scheduling (H^j_{v_i^l}, Ht^j_{v_i^l}, N^{jk}_{v_i^l}, Nt^{jk}_{v_i^l}, Nh^{jk}_{v_i^l}). The problem formulation is presented as follows:

    min_{(C_j^{v_i^l}, H^j_{v_i^l}, Ht^j_{v_i^l}, N^{jk}_{v_i^l}, Nt^{jk}_{v_i^l}, Nh^{jk}_{v_i^l})}  Σ_{j∈K} Σ_{v_i^l ∈ R_j} CsT_j(v_i^l)    (4a)

s.t.

    H^j_{v_i^l} ≤ C_j^{v_i^l},   ∀ v_i^l ∈ V̆, j ∈ K    (4b)

    Ht^j_{v_i^l} ≤ min(1, Σ_{h=l+1}^{M} C_j^{v_i^h}),   ∀ v_i^l ∈ V̆, j ∈ K    (4c)

    Σ_{v_i^l ∈ V̆} s_l · C_j^{v_i^l} ≤ S_j,   ∀ j ∈ K    (4d)

    N^{jk}_{v_i^l} ≤ C_k^{v_i^l},   ∀ v_i^l ∈ V̆, ∀ j, k ∈ K    (4e)

    Nt^{jk}_{v_i^l} ≤ min(1, Σ_{h=l+1}^{M} C_k^{v_i^h}),   ∀ v_i^l ∈ V̆, ∀ j, k ∈ K    (4f)

    Nh^{jk}_{v_i^l} ≤ min(1, Σ_{h=l+1}^{M} C_j^{v_i^h}),   ∀ v_i^l ∈ V̆, j, k ∈ K    (4g)

    H^j_{v_i^l} + Ht^j_{v_i^l} + Σ_{k∈K, k≠j} (N^{jk}_{v_i^l} + Nt^{jk}_{v_i^l} + Nh^{jk}_{v_i^l}) + N^{j0}_{v_i^l} + Nh^{j0}_{v_i^l} = 1,   ∀ v_i^l ∈ V̆, j ∈ K    (4h)

    Σ_{v_i^l ∈ R_j} (s_M · Nh^{j0}_{v_i^l} + s_l · N^{j0}_{v_i^l}) + Σ_{k∈K, k≠j} Σ_{v_i^l ∈ R_k} [s_l · (N^{kj}_{v_i^l} + Nt^{kj}_{v_i^l}) + min(s_{l+1} · C_j^{v_i^{l+1}}, ..., s_M · C_j^{v_i^M}) · Nh^{jk}_{v_i^l}] ≤ B_j,   ∀ j ∈ K    (4i)

    Σ_{v_i^l ∈ R_j} (Ht^j_{v_i^l} + Nh^{j0}_{v_i^l} + Σ_{k∈K, k≠j} Nh^{jk}_{v_i^l}) + Σ_{k∈K, k≠j} Σ_{v_i^l ∈ R_k} Nt^{kj}_{v_i^l} ≤ P_j,   ∀ j ∈ K    (4j)

    {C_j^{v_i^l}, H^j_{v_i^l}, Ht^j_{v_i^l}, N^{jk}_{v_i^l}, Nt^{jk}_{v_i^l}, Nh^{jk}_{v_i^l}} ∈ {0, 1},   ∀ v_i^l ∈ V̆, j, k ∈ K    (4k)

The constraints applied to the problem are described as follows. The constraints from (4b) to (4g) describe the caching allocation, and the constraints (4i) and (4j) are related to the bandwidth and processing allocation. In particular, constraints (4b) and (4e) guarantee that the chunk of video with the requested representation is available in the corresponding cache; constraints (4c), (4f) and (4g) ensure the availability of a higher representation of the requested chunk to be transcoded; (4d) guarantees that the server's storage capacity is respected; constraint (4h) ensures that each request for a chunk of video is satisfied by only one of the scenarios defined in (2); constraint (4i) guarantees that the bandwidth is available for the transmitted chunks; constraint (4j) guarantees that the processing resources are available for the transcoding tasks in the different servers; finally, (4k) shows that the decision variables are binary and can only take a value of 0 or 1.

The problem in (4) is NP-hard, which makes finding the optimal solution extremely challenging in terms of time. Another challenge is that this formulation assumes that the exact set of requests arriving in a large time interval is known in advance. To reduce the complexity of the optimal problem, a proactive sub-optimal relaxation is proposed. This relaxation allows us to solve the original problem more efficiently by decomposing it into small sub-problems that can be solved in a reduced time, while proactively loading chunks potentially requested later. This pre-loading of popular chunks should be preceded by a study of the viewing pattern.

3.3. Users viewing pattern

Recently, there have been several studies on the popularity of VoD. Using empirical analysis, the authors in [41] searched for the distribution that could represent the popularity p_v of a video v, and they proved that Zipf fits video popularity with a skew parameter α. This parameter characterizes the similarity of popularities inside a video library. The studies in [42,43] and [44] showed that videos have different popularities and that this popularity depends strongly on the category of the video and the location of viewers. Other efforts focused on studying the viewing pattern of VoD, the average watching time of videos and the factors affecting this viewership; among them we can cite [11,12,45]. These studies observed that viewers watch only small parts of a video before leaving the session. The authors in [46] also conducted an empirical study to find the relation between the watching time and the characteristics of videos, and they found that the popularity, the length and the category of a video can impact its watching time. Based on this literature review, we drew the following conclusions that helped us design our model:

• Videos popularity follows a Zipf distribution.
• A video popularity depends on its category and differs from a viewer to another.
• A video popularity is different from a MBS site to another.
• Viewers commonly watch small parts of a video before leaving the session, depending on the video popularity.
• Short videos have a higher probability to be fully watched compared to long videos.
• The lengths, the popularities and the types of videos are highly correlated with the watching time.

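The first conclusion, the Zipf shape of video popularity, can be checked on view-count data with a simple log-log least-squares fit. The following is a generic sketch (function name and synthetic data are ours, not from the paper):

```python
import math

def fit_zipf_alpha(view_counts):
    """Least-squares slope of log(views) vs. log(rank): a quick estimate of
    the Zipf skew parameter alpha from ranked view counts."""
    views = sorted(view_counts, reverse=True)
    xs = [math.log(r) for r in range(1, len(views) + 1)]
    ys = [math.log(v) for v in views]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return -slope  # Zipf: views proportional to rank^(-alpha)

# Synthetic library with alpha = 0.5: views proportional to rank^(-0.5)
views = [1000.0 * r ** -0.5 for r in range(1, 101)]
alpha = fit_zipf_alpha(views)
assert abs(alpha - 0.5) < 1e-6
```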
These conclusions motivated us to study the probability that a viewer requests a chunk of video based on his/her preference, the length, the popularity, the location and the category of the video. Hence, in our work [14], we proposed a probabilistic model where we studied the popularity of different chunks. We proved that this popularity P_{j,y,v,v_i} can be expressed as follows:

    P_{j,y,v,v_i} = p_j(cg_y) × p_j(cg_y, v) × p_v(v_i),    (5)

where p_j(cg_y) is the probability of requesting any video from a category cg_y. p_j(cg_y) depends on the preference p(cg_y | u_k^j) of the users U_j = {u_1^j ... u_{|U_j|}^j} and the location of the MBS j, from where the users are served:

    p_j(cg_y) = (1 / |U_j|) Σ_{k=1}^{|U_j|} p(cg_y | u_k^j).    (6)

p_j(cg_y, v) is the probability that a video v having a popularity p_v and belonging to a category cg_y is requested. This probability is presented as follows:

    p_j(cg_y, v) = p_v / (Σ_{i=1}^{V} p_i · I_i^{cg_y}),  if I_v^{cg_y} = 1;  0, otherwise,    (7)

where V is the total number of videos and I_v^{cg_y} is a binary variable indicating whether v belongs to cg_y or not. p_v(v_i) is the probability that a chunk v_i of a video v is watched. This probability is impacted by the category and the length of the content. p_v(v_i), presented in Eq. (8), is derived from an empirical analysis on a real-life video dataset. This dataset contains video logs extracted from one of the most popular video providers, YouTube, collected by the authors in [15] and named TweetedVideos. It contains more than 5 million videos published in July and August 2016 from more than one million channels. It also presents several metadata of the videos, including the view count, which gives an idea about the popularity of videos, the watch time, and the video duration. This data will be used later in our simulation. Since our framework is general, any other data can be used to identify the viewing behavior. In our previous analysis, we proved that the watching pattern of videos belonging to a category cg_y and collected in the dataset follows a Weibull distribution with parameters α_y, β_y and γ_y. The expression of p_v(v_i) can change depending on the dataset. Deriving this probability is explained in our work in [14]:

    p_v(v_i) = e^{−((v_{i−1} − γ_y)/β_y)^{α_y}} − e^{−((v_i − γ_y)/β_y)^{α_y}}.    (8)


Based on this probabilistic model, we define two proactive and reactive policies for cache population and cache replacement (PcP and CrP), which will be used in the problem relaxation and the online heuristic PCCP solutions.

3.4. Popularity-aware caching policies

In this section, we present three caching policies: one is Least Recently Used (LRU), which is conventionally used in CDNs [47]; the two others are proposed based on the probabilistic model in which we studied the popularity of chunks (see Section 3.3). These policies are named the Proactive caching Policy (PcP), which populates the cache in advance with the most popular chunks, and the Caching replacement Policy (CrP), which guarantees that less popular chunks are removed and that the most popular ones are stored in real time. Before introducing our policies, we need to present our distributed and synchronized video catalog. This catalog is associated with the different caching servers and stores the metadata of each cached video (video ID, chunk, level, category, related MBS, replica, popularity within the MBS and last video access). In fact, when a new event (e.g., accessing, caching or removing) occurs at a specific MBS, the related catalog is updated reactively. The updated catalog is shared in real time with the other nodes so they can update their catalogs accordingly. Such collaborative catalog management contributes to making video searching local and to minimizing the communication overhead. Since the number of collaborating MBSs is small, the catalog size will not be very large, and searching or updating videos will not be a complex task. A snapshot of a synchronized catalog at a random time t is illustrated in Fig. 4.

3.4.1. LRU

LRU [47] is a well-known reactive caching algorithm used by multiple joint caching algorithms such as [5] and [37]. The objective of this policy is to fetch the video from the CDN and to cache it at the edge MBS if storage is available. If the cache is full, the least recently used video is deleted and replaced by the newly requested video. The performance of this policy is influenced by the frequency of requesting the same video. However, the algorithm takes into consideration neither the viewers' preferences nor the bitrate of the video when caching or removing. Based on this observation, we designed our popularity-aware reactive and proactive caching and removal policies.

3.4.2. CrP

We propose, in this section, a reactive caching replacement policy with minimum replication, based on chunk popularities. Two new features are proposed in our replacement policy. The first one is the minimum replication strategy. In the literature, when a video is requested, it is cached at the home MBS whether or not it exists in the neighbor nodes. This means that a video can be cached multiple times in the same cluster. Consequently, when the cache is full, other videos that might be requested later have to be evicted to provide space for the duplicated video. In our network, a second copy of a chunk is marked as a Replica and is stored only if cache space is available. When the caching capacity is unavailable, the duplicated chunk is not stored and only a single copy is cached in the cluster. A chunk is called a Replica if a similar or higher bitrate representation exists in the cluster, since it is always possible to create a lower video quality from a higher one. Based on this definition, if the cache is full, a lower representation of a video is not stored if a higher one exists in the cluster. The second feature of our caching policy is popularity-aware caching and removal. In fact, unpopular contents are cached only when cache space is available.


In case of storage unavailability, all chunks in the catalog are ranked by their popularity P_{j,y,v,v_i}. Then, the Least Popular Chunk (LPC) is selected. If the requested chunk has a lower probability than the LPC and is marked as a Replica, it will not be cached in the cluster. If its popularity is higher than that of the LPC, the chunks with lower popularities are removed. If the newly requested chunk is not a replica and has the lowest popularity, more popular replicas are removed to provide space. In fact, a threshold TH is set: if the difference between the popularity of a more popular replica and that of the newly requested chunk (which is itself the LPC) is less than TH, these replicas are deleted to free space for the non-replica content. When removing chunks from the cache/catalog, if two contents have the same popularity, the LRU algorithm is used and the least recently accessed one is deleted. This approach guarantees that the most popular chunks are cached and that the number of stored videos is maximized, thanks to the minimum replication policy. The proposed CrP policy is presented in Algorithm 1.

Algorithm 1 Caching replacement Policy (CrP)
 1: Input: S_j, v_i^l, ctg, C_j
 2: Output: updated S_j, ctg, C_j
 3: isReplica = 0
 4: Add = 0
 5: if v_i^l ∈ ctg then
 6:     isReplica = 1
 7: if S_j < s_l then
 8:     LPC = min(P_{j,.,.,.})*
 9:     if isReplica = 1 then
10:         if LPC < P_{j,y,v,v_i} then
11:             * CrP-Removing part:
12:             while S_j < s_l do
13:                 - Prioritize removing unpopular replicas v_{i1}^{l1} over unique unpopular chunks
14:                 S_j = S_j + s_{l1}
15:                 C_j^{v_{i1}^{l1}} = 0
16:             Add = 1
17:     else
18:         if LPC > P_{j,y,v,v_i} then
19:             while S_j < s_l do
20:                 - Remove higher popular chunks v_{i2}^{l2} with P_{j,y,v,v_{i2}} − LPC < TH
21:                 S_j = S_j + s_{l2}
22:                 C_j^{v_{i2}^{l2}} = 0
23:             Add = 1
24:         else
25:             * CrP-Removing part
26:             Add = 1
27: else
28:     Add = 1
29: if Add = 1 then
30:     S_j = S_j − s_l
31:     C_j^{v_i^l} = 1
32:     - Add v_i^l to ctg
33:     - Update v_i^l to recent time in ctg
34:     - Update other catalogs
35: else
36:     - Relay to the viewer without caching
37: * P_{j,.,.,.} is the set of popularities of the different chunks stored in the MBS j

Fig. 4. Snapshot of a synchronized catalog: the catalog is associated and shared with each cache to store the metadata of cached videos (video ID, chunk number, level, category, related MBS, replica, popularity within the MBS and last video access).

3.4.3. PcP

PcP is a proactive caching policy where the cache is populated with the most popular chunks, i.e., those most likely to be requested. The popularities of the chunks, related to the different MBSs, are calculated based on the preferences of the users, as described in Section 3.3. More specifically, highly popular chunks are stored with the highest bitrate, one by one, until the cache is filled. If cache space is still available, popular chunks are preloaded again with a lower representation. This task is done at the initialization of the network and has an initial cost. In our work, we consider that the library is not static and that new incoming videos can be added. We assume that these videos have been viewed by a population with the same preference. In this case, when the library changes, the popularities of all chunks are re-calculated and the CrP policy is applied to decide whether or not to proactively add the new chunks. These chunks are added with the highest bitrate version. Meanwhile, the catalog is updated with the Replica and popularity status. The proposed PcP is illustrated in Algorithm 2. The two policies CrP and PcP are expected to give a higher hit ratio and a lower latency to access videos compared to LRU, since popular chunks are already cached. Still, all chunks not found in the cluster need to be fetched from the CDN. In this case, these newly requested chunks with lower preference will replace replicas, and the hit percentage will be maximized.

Algorithm 2 Proactive caching Policy (PcP)
 1: Input: S_1, ..., S_K
 2: Output: C_1, ..., C_K, ctg, S_1, ..., S_K, P_{1,.,.,.}, ..., P_{K,.,.,.}
 3: Cache initialization: C_1 = 0, ..., C_K = 0
 4: Catalog initialization: ctg = ∅
 5: for j ∈ {1, .., K} do
 6:     calculate P_{j,.,.,.} based on Eq. (5)
 7: At the initialization of the network:
 8: level = M
 9: for j ∈ {1, .., K} do
10:     while S_j > 0 do
11:         Select the most popular chunk max(P_{j,.,.,.})
12:         Update ctg
13:         S_j = S_j − s_{level}
14:         C_j^{v_i^{level}} = 1
15:         if all chunks are cached then
16:             level = level − 1
17: When the library changes and chunks of v are added:
18: for j ∈ {1, .., K} do
19:     CrP(S_j, v_i^M, ctg, C_j)

3.5. Sub-optimal relaxation

As we stated previously, the optimization problem is characterized by its complexity. Hence, we propose a sub-optimal relaxation. The idea is to compute the optimal solution after receiving a small set of requests in short intervals; this interval can range from a few milliseconds to a few minutes. In this way, the computation will be fast and efficient, and the solution of the one-shot optimization problem will be built gradually, while receiving new requests. The sub-optimal relaxation is described in (9). R* = (R*_1, ..., R*_K) denotes the received sets of requests in the different MBSs. Each R*_j is decomposed into smaller sets received in the small intervals of time t_1, ..., t_p, where p denotes the studied period; R*_j is defined as (R*_j^{t_1}, ..., R*_j^{t_p}). In our system, we do not fetch the whole video content when it is requested. Instead, we fetch small chunks as the viewer intends to continue watching the content. However, by reactively serving chunks when requested, stalls may occur, especially if the chunk is not cached in the cluster. Hence, we propose that, in every small interval t_i and for each requested chunk, we proactively fetch and serve the w_d next chunks. In this way, in the next interval t_{i+1}, when the user requests to watch the next part, the video will be streamed without stalls. Pre-serving the next requests also gives the optimizer a better vision of the system and a more optimal allocation. To make the optimization problem less complex, we propose to remove the storage capacity constraint and the cache decision variable C_j^{v_i^l}. Then, between the different sub-optimizations, our proposed reactive and proactive caching and removal policies are applied to update the cache and to decide which chunks to remove or store based on their popularities P_{j,y,v,v_i} (defined in Section 3.3). Additionally, the processing capacity P_j and the bandwidth capacity B_j in the different MBSs are updated based on the previous resource utilization.

    min_{(H^j_{v_{i+w}^l}, Ht^j_{v_{i+w}^l}, N^{jk}_{v_{i+w}^l}, Nt^{jk}_{v_{i+w}^l}, Nh^{jk}_{v_{i+w}^l})}  Σ_{j∈K} Σ_{v_i^l ∈ R*_j^{t_i}} Σ_{w=0}^{w_d} CsT_j(v_{i+w}^l)    (9a)

s.t.

    H^j_{v_{i+w}^l} ≤ C_j^{v_{i+w}^l},   ∀ v_{i+w}^l ∈ V̆, j ∈ K, w ∈ {0..w_d}    (9b)

    Ht^j_{v_{i+w}^l} ≤ min(1, Σ_{h=l+1}^{M} C_j^{v_{i+w}^h}),   ∀ v_{i+w}^l ∈ V̆, j ∈ K, w ∈ {0..w_d}    (9c)

    N^{jk}_{v_{i+w}^l} ≤ C_k^{v_{i+w}^l},   ∀ v_{i+w}^l ∈ V̆, j, k ∈ K, w ∈ {0..w_d}    (9d)

    Nt^{jk}_{v_{i+w}^l} ≤ min(1, Σ_{h=l+1}^{M} C_k^{v_{i+w}^h}),   ∀ v_{i+w}^l ∈ V̆, j, k ∈ K, w ∈ {0..w_d}    (9e)

    Nh^{jk}_{v_{i+w}^l} ≤ min(1, Σ_{h=l+1}^{M} C_j^{v_{i+w}^h}),   ∀ v_{i+w}^l ∈ V̆, j, k ∈ K, w ∈ {0..w_d}    (9f)

    H^j_{v_{i+w}^l} + Ht^j_{v_{i+w}^l} + Σ_{k∈K, k≠j} (N^{jk}_{v_{i+w}^l} + Nt^{jk}_{v_{i+w}^l} + Nh^{jk}_{v_{i+w}^l}) + N^{j0}_{v_{i+w}^l} + Nh^{j0}_{v_{i+w}^l} = 1,   ∀ j ∈ K, w ∈ {0..w_d}    (9g)

    Σ_{v_i^l ∈ R*_j^{t_i}} (s_M · Nh^{j0}_{v_{i+w}^l} + s_l · N^{j0}_{v_{i+w}^l}) + Σ_{k∈K, k≠j} Σ_{v_i^l ∈ R*_k^{t_i}} [s_l · (N^{kj}_{v_{i+w}^l} + Nt^{kj}_{v_{i+w}^l}) + min(s_{l+1} · C_j^{v_{i+w}^{l+1}}, ..., s_M · C_j^{v_{i+w}^M}) · Nh^{jk}_{v_{i+w}^l}] ≤ B_j,   ∀ j ∈ K, ∀ w ∈ {0..w_d}    (9h)

    Σ_{v_i^l ∈ R*_j^{t_i}} (Ht^j_{v_{i+w}^l} + Nh^{j0}_{v_{i+w}^l} + Σ_{k∈K, k≠j} Nh^{jk}_{v_{i+w}^l}) + Σ_{k∈K, k≠j} Σ_{v_i^l ∈ R*_k^{t_i}} Nt^{kj}_{v_{i+w}^l} ≤ P_j,   ∀ j ∈ K, ∀ w ∈ {0..w_d}    (9i)

    {H^j_{v_{i+w}^l}, Ht^j_{v_{i+w}^l}, N^{jk}_{v_{i+w}^l}, Nt^{jk}_{v_{i+w}^l}, Nh^{jk}_{v_{i+w}^l}} ∈ {0, 1},   ∀ v_{i+w}^l ∈ V̆, j, k ∈ K, w ∈ {0..w_d}    (9j)

The sub-optimal relaxation process is presented in Algorithm 3, where we include the PcP and CrP policies for pre-loading and cache update. The algorithm is initialized by populating the different caches using the PcP policy. The cache is updated after each iteration using the reactive CrP policy and, in case of a library change, the proactive PcP policy. Note that the relaxation solution can give better results when treating larger intervals of time. In fact, over a larger period of time, the system receives a higher number of requests, which results in better fetching and placement decisions. However, enlarging the inter-time between different iterations adds a higher complexity because of the larger number of incoming requests, which makes solving the problem during the different time intervals impractical. Hence, an online algorithm that is appropriate for real-time use is proposed in the next section. This algorithm is characterized by its distributed solution.

Algorithm 3 Sub-optimal relaxation
 1: Initialize P_{1,.,.,.}, ..., P_{K,.,.,.}, S_1, ..., S_K
 2: System initialization:
 3: (C_1, ..., C_K, ctg, S_1, ..., S_K, P_{1,.,.,.}, ..., P_{K,.,.,.}) = PcP(S_1, ..., S_K)
 4: for t_i = t_1 : t_p do
 5:     - PcP(S_1, ..., S_K) // In case the library changes.
 6:     - Run the sub-optimal relaxation of problem (9) for the sets of requests R*_1^{t_i}, ..., R*_K^{t_i}
 7:     for j = 1 : K do
 8:         - Get the transcoding allocation decisions and update P_j and B_j.
 9:         - Update S_j, C_j and ctg depending on the fetching decisions and the CrP policy.

3.6. Proposed online solution (PCCP)

Because of the inherent combinatorial complexity of the optimization problem and the real-time requirements of the fetching/placement tasks, we propose a greedy algorithm, namely PCCP, illustrated in Algorithm 4. At the initialization of the system, the PcP policy is applied to populate the different caches and catalogs with the most popular chunks. Additionally, when the video library changes, PcP is called again to re-calculate the popularities and re-rank the videos. If the new incoming videos are more popular, CrP will add them to the catalog. Next, for each request incoming to the MBS j, the chunk with the exact bitrate is searched at the home MBS. If available, the access time of the chunk is updated in the catalog. In addition to the requested chunk, w_d potentially requested chunks from the same video are fetched. If the requested bitrate is not available at the home node, it is searched in the neighboring nodes. The possibility of finding a higher representation locally is also considered. Based on the processing and bandwidth availability, the best option is chosen. If neither local transcoding nor the exact bitrate of the requested chunk is possible, a higher bitrate is searched within the neighbor MBSs. If the cluster cannot serve the chunk of video, the bandwidth and transcoding resources are checked to decide whether to bring the exact chunk representation or a higher one from the CDN servers.

Algorithm 4 Proactive Chunks Caching and Processing (PCCP)
 1: Initialize P_{1,.,.,.}, ..., P_{K,.,.,.}, S_1, ..., S_K
 2: System initialization:
 3: (C_1, ..., C_K, ctg, S_1, ..., S_K, P_{1,.,.,.}, ..., P_{K,.,.,.}) = PcP(S_1, ..., S_K)
 4: for j ∈ {1, .., K} do
 5:     - PcP(S_1, ..., S_K) // If the library changes.
 6: for each video request v_i^l incoming to MBS j do
 7:     for w ∈ {0, .., w_d} do
 8:         if C_j^{v_{i+w}^l} = 1 then
 9:             - Stream from home MBS j
10:             - CrP(S_j, v_{i+w}^l, ctg, C_j)
11:         else if Σ_{h=l+1}^{M} C_j^{v_{i+w}^h} ≥ 1 and P_j > 0 then
12:             if C_k^{v_{i+w}^l} = 1, k ∈ K and B_k > s_l then
13:                 if Cb^l > Ct^l then
14:                     - Transcode and stream from home MBS j
15:                     - CrP(S_j, v_{i+w}^l, ctg, C_j)
16:                 else
17:                     - Fetch from neighboring MBS k
18:                     - CrP(S_j, v_{i+w}^l, ctg, C_j)
19:             else
20:                 - Transcode and stream from home MBS j
21:                 - CrP(S_j, v_{i+w}^l, ctg, C_j)
22:         else
23:             if C_k^{v_{i+w}^l} = 1, k ∈ K and B_k > s_l then
24:                 - Fetch from neighboring MBS k
25:                 - CrP(S_j, v_{i+w}^l, ctg, C_j)
26:             else if Σ_{h=l+1}^{M} C_k^{v_{i+w}^h} ≥ 1, k ∈ K and P_k > 0 and B_k > s_h then
27:                 if P_j > P_k then
28:                     - Fetch from k and transcode at home MBS j
29:                     - CrP(S_j, v_{i+w}^l, ctg, C_j)
30:                     - CrP(S_j, v_{i+w}^h, ctg, C_j)
31:                 else
32:                     - Transcode at and fetch from neighboring node k
33:                     - CrP(S_j, v_{i+w}^l, ctg, C_j)
34:             else if P_j > 0 and B_j > s_M then
35:                 - Fetch v_{i+w}^M from CDN and transcode at home MBS j
36:                 - CrP(S_j, v_{i+w}^M, ctg, C_j)
37:                 - CrP(S_j, v_{i+w}^l, ctg, C_j)
38:             else if B_j > s_l then
39:                 - Fetch v_{i+w}^l from CDN
40:                 - CrP(S_j, v_{i+w}^l, ctg, C_j)
41:             else
42:                 - Reject request

After each serving task,

Fig. 5. Fraction of viewers related to ranked videos: the Zipf fit presents an α parameter equal to 0.5. The larger α is, the more heterogeneous the popularity.

Table 1
Weibull parameters related to each category to model the popularity of different chunks as described in Section 3.3.

Category (cg_y)   α_y     β_y      γ_y
People            2.39    0.56      0.0023
Gaming            1.98    0.45      0.0146
Leisure           2.41    0.56     −0.0064
News              4.70    0.95     −0.298
Music             2.45    0.51      0.0178
Sports            4.34    0.92     −0.267
Film              2.32    0.62      0.0205
Howto             2.74    0.52      0.0153
Comedy            2.89    0.65     −0.0250
Education         2.40    0.54     −0.0104
Science           2.53    0.53      0.013
Autos             2.68    0.58      0.0016
Activism          2.50    0.59     −0.0228
Pets              3.089   0.69     −0.066

Table 2
Simulation parameters of different components of the network.

Variable                                   Distribution/parameters value
Number of MEC servers                      const, K = 3
Total number of video requests per MBS     const, R1 = R2 = R3 = 10,000
Total number of videos                     const, V = 1000, randomly chosen from [15]
Video popularity                           Zipf, α = 0.5
Number of video categories                 const, G = 14 (see Table 1)
New videos added to the library            Poisson, λ = 10, inter-add time = 5 min
Category preference UP_j                   random
Video sizes                                ≤ 1500 s (50 chunks)
Video bitrate                              Uniform, M = 4, from 200 kbps to 2 Mbps
Watching time                              Exponential, mean watch time from [15]
Chunk size                                 30 s
Number of viewers                          const, |U1| = |U2| = |U3| = 500
Activity session size                      Exponential, mean 300 s
Video request arrival                      Poisson, λ = 5, inter-arrival time = 30 s
Max cache size                             Library size
Popularity threshold                       TH = 0.001
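Under the settings of Table 2, a synthetic request workload can be generated along the following lines (a hedged sketch; the 300 s mean stands in here for the dataset-provided watch times, and all names are ours):

```python
import random

random.seed(7)

def zipf_pick(n, alpha=0.5):
    """Pick a video index in [1, n] with Zipf(alpha) popularity (Table 2)."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n + 1)]
    return random.choices(range(1, n + 1), weights=weights)[0]

def generate_session(n_videos=1000, m_levels=4, mean_watch=300.0, chunk_s=30):
    """One viewer session sketched from the Table 2 settings: a Zipf-ranked
    video, a uniformly chosen representation, and an exponentially
    distributed watch time truncated to whole 30 s chunks."""
    video = zipf_pick(n_videos)
    level = random.randint(1, m_levels)          # uniform over representations
    watch = random.expovariate(1.0 / mean_watch)
    n_chunks = max(1, int(watch // chunk_s))     # at least the first chunk
    return {"video": video, "level": level, "chunks": n_chunks}

sessions = [generate_session() for _ in range(1000)]
assert all(1 <= s["level"] <= 4 and s["chunks"] >= 1 for s in sessions)
```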

the CrP policy is applied to update the cache j and the catalog; that is, depending on the popularity of the requested chunk, the caching and removal are carried out.

4. Performance evaluation

4.1. Simulation settings

In this section, we evaluate the performance of our system under different network configurations, including the storage capacity, the processing capacity, the bandwidth capacity and the popularity of videos. First, we describe the simulation parameters. Then, we study the impact of the different cache, processing and bandwidth capacities on the cache hit ratio, delay, cost and CDN traffic load. We consider that our network consists of 3 neighboring MBSs attached to 3 MEC servers (K = 3). The video library V, shared within the cluster, consists of 1000 different videos chosen randomly from the dataset in [15]. The videos of the library are divided into chunks of 30 s each. Videos are selected with a duration lower than 1500 s (50 chunks). The length of the chunks and the maximum duration of the videos are chosen to limit the size of the set P_{j,.,.,.}. The popularity (number of views) of the chosen videos follows a Zipf distribution with a skew parameter α = 0.5, as shown in Fig. 5. The total number of views of each video is divided randomly between the different MBSs. In this way, a video can have different popularities in each MEC server. Each video has 4 different representations (M = 4). As configured in [37], we set the bitrate variants related to each representation to 0.45, 0.55, 0.67 and 0.82 of the original video bitrate. In this paper, we consider that all videos have the same original bitrate, which is 2 Mbps. The videos belong to G = 14 categories, described in Table 1. New videos are added to the library following a Poisson process with a rate equal to 10 and an inter-arrival time equal to 5 min; i.e., the library may change every 5 min. In our simulation, we assume a pool of 500 users connected to each MBS. These users follow a Poisson arrival model with a rate λ = 5. The average user active time is equal to 300 s and the inter-arrival time between requests is equal to 30 s. The watching duration per video follows an exponential distribution with a mean equal to the average watch time provided in the dataset. The bitrate version of each video is selected following a uniform distribution (different versions have equal probabilities). Table 2 summarizes the simulation parameters. We generate 10,000 video requests in each MBS (around 45,000 chunk requests per MBS). The access delay to get a chunk of video for the various scenarios follows a uniform distribution in a range of (a) [5–10] ms from the CDN servers, (b) [1–2.5] ms when served from neighboring MBSs, and (c) [0.25–0.5] ms when served from the home server. Different processing capacities (0 to 20 Mbps) and bandwidth capacities (10 to 100 Mbps) are tested. In addition, different storage capacities (from 10% to 90% of the video library) are used. Several metrics are evaluated to prove the performance of our system: (a) cache hit ratio: the number of requests that can be fetched or transcoded in the edge network (home or neighbor servers); (b) average access delay: the average latency to receive videos from the different caches or from the CDN; (c) data removal: the volume of content removed from the caches to provide space for newly requested videos; (d) total CDN data: the volume of traffic received from the remote cloud servers; (e) CDN cost: the cost of fetching the chunks of videos from the CDN, calculated as $0.03 per GB; (f) number of rejected requests: the number of requests that cannot be served because of bandwidth unavailability. Our proposed system is compared to different recently proposed systems: CachePro [5], CoCache [32], JCCP [7,37] and CJCT [38] (with X2 bandwidth equal to 15 Mbps). These systems are described in the related work section.

4.2. Simulation results

4.2.1. Impact of cache size

The cache size is an important parameter to evaluate the efficiency of a caching system. Fig. 6 shows the performance of the different cache systems for the described cluster. The cache size varies from 10% to 90% of the video library size, the processing capacity is equal to 15 Mbps, and the bandwidth capacity is equal to 30 Mbps. Fig. 6(a) shows the performance of our system in terms of cache hit ratio. It can be seen that our PCCP heuristic with its CrP and PcP policies performs significantly better than the other caching systems. When the cache size is very low (equal to 10% of the library size), the PCCP heuristic presents a hit ratio equal to 0.79 when w_d = 0 and 0.96 when w_d = 1.


Fig. 6. Performance comparison based on varying cache capacity with processing capacity = 15 Mbps and bandwidth capacity = 30 Mbps: (a) Cache Hit ratio incurred for 30,000 request, (b) Average access delay per request, (c) Removed data, (d) Total CDN Data loaded from CDN, (e) CDN cost, (f) Number of rejected requests.

The achieved ratio is significantly higher compared to CJCT, JCCP, CoCache and CachePro, which achieve cache hit ratios equal to 0.6, 0.56, 0.45 and 0.53, respectively. This can be explained by: (a) the PcP policy, which stores the most popular chunks and thus improves the probability of finding the videos inside the cluster; (b) the collaboration between MBSs to store different chunks of the same videos, which makes caching a video possible even when one cache is full; (c) storing only the chunks likely to be watched, since it is proved that viewers rarely watch the content to the end, which provides more space to cache a higher number of chunks; (d) the CrP policy, which caches only highly popular chunks, removes only less popular chunks and avoids the replication of videos to maximize the number of cached videos; (e) proactively fetching the highest video bitrate and transcoding it to the required representation locally, while avoiding lower-bitrate replicas. In this way, the number of fetches from the CDN becomes lower, as shown in Fig. 6(d). In fact, in the case of the other systems, when the exact bitrate is fetched from the remote servers, any other higher-bitrate request results in an additional CDN transmission. Also, when removing the least recently accessed content, the bitrate version is not considered; in our system, however, a higher or exact representation must exist in the cache before removal. When w_d = 1, the hit ratio becomes very high, which is explained by the fact that the next potentially requested chunks are fetched and served beforehand; these predicted requests contribute to increasing the hit ratio. In our simulation, we limited w_d to 1 because we are working with a relatively large chunk size (30 s), which is enough to fetch the next chunk without stalls. In the case of very small chunks, w_d should be larger to serve the next requests without an initial delay.

Fig. 6(b) shows the average access delay per request among the different systems. We can see clearly that our system presents a much better access delay to deliver videos to viewers, which can be explained by: (a) delivering chunk by chunk with a low delay, compared to serving a large video, contributes to minimizing the delivery time; (b) collaborating to cache a single video increases the chance of finding the requested content locally; (c) the proactive and reactive caching maximizes the cache hit ratio, which leads to less delay. Fig. 6(c) shows the data removed from the different caches to provide space for newly cached videos. It can be observed that CJCT, JCCP, CoCache and CachePro show a high dependence on the cache size in terms of data removal, since a larger cache offers more space to cache newly requested videos. Also, a large difference can be seen between our proposed system and the other caching systems. This difference shows the performance of the CrP policy, which adds only popular chunks and removes the unpopular ones, compared to LRU, which removes the least recently accessed videos. Hence, videos with a high probability of being requested are always stored in the cluster. In addition, when caching a large video in the same MBS, a big amount of data has to be deleted to free space for potentially large videos that will not be fully watched; whereas, in our case, the watched pieces can be cached in different servers if the home MBS does not have enough space to cache them all. Figs. 6(d), 6(e) and 6(f) show that PCCP outperforms the other systems in terms of fetched data, cost and rejected requests, which matches the previous results. When the prediction window w_d is equal to 1, our system incurs more cost and data fetching compared to the network without pre-fetching. This can be explained by the additional cost of missed predictions. The missed predictions amount to one pre-loaded chunk for each requested video, i.e., a chunk that is served but not watched by the user abandoning the video. This waste is acceptable, since the additionally loaded content consists only of small chunks with a minimal cost. Our system also has an initial cost of pre-loading the caches using PcP; however, this cost is negligible when installing the system.

4.2.2.
minimizing the delivery time; (b) collaborating to cache a single video increases the chance of finding the requested content locally; (c) the proactive and reactive caching maximizes the cache hit ratio, which leads to less delay. Fig. 6(c) shows the data removed from the different caches to provide space for newly cached videos. It can be observed that CJCT, JCCP, CoCache and CachePro show a high dependence on cache size in terms of data removal, since a larger cache offers more space for newly requested videos. There is also a large gap between our proposed system and the other caching systems. This gap reflects the performance of the CrP policy, which adds only popular chunks and removes the unpopular ones, compared to LRU, which removes the least recently accessed videos. Hence, videos with a high probability of being requested are always stored in the cluster. In addition, when caching a large video in a single MBS, a large amount of data must be deleted to free space for a potentially large video that will not be fully watched; in our case, the watched pieces can be cached in different servers if the home MBS does not have enough space for them all. Figs. 6(d), 6(e) and 6(f) show that PCCP outperforms the other systems in terms of fetched data, cost and rejected requests, which matches the previous results. When the prediction window wd is equal to 1, our system incurs more cost and data fetching compared to the network without pre-fetching, which can be explained by the additional cost of missed predictions. A missed prediction amounts to one pre-loaded chunk per requested video, i.e., a chunk that is served but not watched by a user abandoning the video. This waste is acceptable, since the additional loaded content consists only of small chunks with a minimal cost. Our system also has an initial cost of pre-loading the caches using PcP; however, this cost is negligible and incurred only when installing the system.

4.2.2. Impact of processing capacity

Fig. 7 shows the impact of varying the processing capacity on different performance parameters, while fixing the cache size to 20% of the video library and the bandwidth capacity to 30 Mbps. We


Fig. 7. Performance comparison based on varying processing capacity with a cache size = 20% of the library size and bandwidth capacity = 30 Mbps: (a) Cache hit ratio incurred for 30,000 requests, (b) Average access delay per request, (c) Removed data, (d) Total data loaded from the CDN, (e) CDN cost, (f) Number of rejected requests.

Fig. 8. Performance comparison based on varying bandwidth capacity with a cache size = 20% of the library size and a processing capacity = 15 Mbps: (a) Cache hit ratio incurred for 30,000 requests, (b) Average access delay per request, (c) Removed data, (d) Total data loaded from the CDN, (e) CDN cost, and (f) Number of rejected requests.

can see that the CoCache system is not affected by enlarging the processing capacity, since it has no transcoding capability. Our system, along with the other caching systems, performs better with larger processing capacities. In fact, the PcP policy stores the popular chunks at the highest bitrate level, and the proactive fetching from the CDN (prioritizing the highest level) requires large processing capability to transcode the chunks to the requested bitrates. This explains the performance increase in all graphs as the processing capacity grows. The performance of our system becomes constant for large transcoding capacities; in fact, small chunks need fewer resources to transcode than long videos


Fig. 9. Performance comparison based on the Zipf parameter with a cache size = 10% of the library size, processing capacity = 15 Mbps and bandwidth capacity = 30 Mbps: (a) Cache hit ratio incurred for 30,000 requests, (b) Average access delay per request, (c) CDN cost.

that are resource-hungry content. Hence, 15 Mbps was the capacity required to handle the incoming chunk requests for PCCP. Additionally, collaboration between different MBSs to transcode one video content helps alleviate the processing load.

4.2.3. Impact of bandwidth capacity

The bandwidth capacity is also an important parameter for serving viewers efficiently. Fig. 8 shows the impact of varying the bandwidth capacity on the system performance, while fixing the cache size to 20% and the processing capacity to 15 Mbps. PCCP exhibits a near-constant behavior compared to the other edge caching systems, because only small chunks are transmitted through the backhaul links and chunks of the same video can be served over different links. Additionally, thanks to the PcP policy, most of the popular chunks are cached locally (at the home node), and a video is never fetched in its entirety when requested. We can also see that the total CDN data in Fig. 8(d) and the CDN cost in Fig. 8(e) increase with the bandwidth. This is because small bandwidth capacities are not enough to serve all the requests, as shown in Fig. 8(f); enlarging the bandwidth capacity means more videos are transmitted and, consequently, more cost and data are incurred. Finally, Fig. 8(f) shows that the different caching systems are highly impacted by the bandwidth capacity in terms of request rejection; as the system bandwidth improves, PCCP presents fewer rejections than the other systems.

4.2.4. Impact of video popularities

Varying the popularity of videos has an important impact on our system, since it is based on viewer preferences. Fig. 9 presents the impact of changing the Zipf parameter α on the cache hit ratio, delay and CDN cost. Specifically, a large α (highly skewed popularity) represents a library in which a small number of videos attract most of the requests: the library contains a few videos with very high popularity and many videos with very low popularity. Conversely, a lower α yields a library of videos with similar popularities (lower diversity between content objects). Caching and requesting a smaller number of popular chunks produces a higher cache hit ratio and reduces the CDN hit rate. In our simulation, we created another video library from the same dataset [15] with a larger α = 0.7, i.e., a library with a greater disparity in popularity: highly popular videos are requested very often, while unpopular videos are rarely or never requested. We assume, in this simulation, that no new videos arrive, so that the popularity distribution and skew remain the same. Fig. 9(a) shows that a more skewed library gives better results in terms of cache hit ratio. This is explained by the fact that only a part of the videos

Fig. 10. Run-time complexity: the optimal solution is highly complex compared to the relaxed and online solutions, which justifies the design of the PCCP algorithm.

are frequently requested, and these are stored in the cluster thanks to the PcP policy. Also, even when the cache is small (10% of the library size), it is large enough to hold the highly popular chunks, which enhances the cache hit ratio. Similarly, the access delay and the cost are lower for a larger α, as shown in Figs. 9(b) and 9(c).

4.2.5. Performance of the PCCP heuristic

To prove the efficiency of our online greedy heuristic, we compared it to the sub-optimal results. First, however, we justify the design of an online heuristic. Fig. 10 shows the time complexity of the optimization problem, the relaxation and the online PCCP. The complexity of the optimal solution is very high, reaching 21 h for 15,000 requests per cluster. For the same number of requests, the relaxation solves the problem in 21 min, while PCCP serves all viewers in 8 min, which validates the use of an online heuristic. We stopped the simulation at 15,000 requests because of the computation time required by the optimal solution. Note also that we conducted our experiments on a computer running Windows 7 with an Intel Core i7 processor and 8 GB of RAM. Next, we evaluate the performance of PCCP compared to the relaxation. In this simulation, we generated only 2000 video requests (around 27,000 chunk requests) to limit the computation time. Fig. 11 shows the performance of the sub-optimal system compared to the online PCCP heuristic in terms of hit ratio and delay for different cache sizes and different processing and bandwidth capacities. The sub-optimal solution is very close to the heuristic results, which proves the high performance of our greedy algorithm. The minor difference between the sub-optimal solution (decisions based on a set of requests) and the online solution (decisions taken on each request arrival) is explained by


Fig. 11. Performance comparison between the online heuristic and the sub-optimal solution for 2000 requests: (a) Cache hit ratio while varying caching capacity, (b) Cache hit ratio while varying processing capacity, (c) Cache hit ratio while varying bandwidth capacity, (d) Average access delay while varying caching capacity, (e) Average access delay while varying processing capacity, (f) Average access delay while varying bandwidth capacity.

the fact that a small difference in the fetching decisions does not add much to the system, since we are dealing with chunks that have a small cost and delivery delay. In the other systems, by contrast, a non-optimal fetching decision taken for a large video can charge the network the cost of multiple chunks. The gap is larger for small processing and bandwidth capacities, because with scarce resources it is better to have visibility over the whole set of requests before assigning them; the online heuristic instead assigns resources to the first-arriving requests. When the processing and bandwidth capacities are large, the resources suffice for all transcoding and transmission tasks.
In this paper, we aimed to minimize the CDN transmissions and network cost, relieve backhaul congestion, maximize edge caching, and reduce the perceived content delivery delays. We were motivated by the fact that most viewers do not fully watch videos, which leads to a large waste in the cost of serving unwatched chunks. Hence, since the viewing pattern can be derived, we proposed to study the popularity of the different chunks and pre-load the MEC caches with the most popular ones. Furthermore, rather than having one MEC server cache, process or offload a whole video, as done in previous works, neighboring MBSs collaborate to cache and transmit different video chunks. As shown by our extensive simulations, this approach achieves the aforementioned goals. Specifically, by cooperating to cache only the chunks likely to be watched, more content can be found at the edge (10% to 20% improvement), which minimizes CDN fetching and reduces the cost and access delay by 50%. In addition, collaborating to serve parts of each request uses the limited bandwidth resources more efficiently, which relieves backhaul congestion.
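To make the chunk-level popularity dynamics discussed above concrete, the sketch below simulates Zipf-distributed chunk requests against a single popularity-aware cache in the spirit of the CrP policy (admit a chunk only if it is more popular than the least popular cached one). This is an illustrative toy model, not the paper's formulation: the names (`zipf_weights`, `PopularityCache`) and parameter values are ours, and the single-node setup ignores the MBS collaboration, transcoding and bandwidth dimensions of PCCP.

```python
import random
from collections import Counter

def zipf_weights(n_videos, alpha):
    # Zipf popularity: rank r gets weight 1 / r^alpha (larger alpha = more skew).
    weights = [1.0 / (r ** alpha) for r in range(1, n_videos + 1)]
    total = sum(weights)
    return [w / total for w in weights]

class PopularityCache:
    """Popularity-aware replacement (CrP-like toy): evict the least popular
    cached chunk, and admit a new chunk only if it is more popular than
    that victim, based on observed request counts."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.cached = set()
        self.hits = Counter()  # observed request count per chunk

    def request(self, chunk):
        self.hits[chunk] += 1
        if chunk in self.cached:
            return True  # cache hit
        if len(self.cached) < self.capacity:
            self.cached.add(chunk)
        else:
            victim = min(self.cached, key=lambda c: self.hits[c])
            if self.hits[chunk] > self.hits[victim]:
                self.cached.remove(victim)
                self.cached.add(chunk)
        return False  # cache miss (would be fetched from the CDN)

random.seed(0)
weights = zipf_weights(n_videos=1000, alpha=0.7)      # skew as in Section 4.2.4
cache = PopularityCache(capacity=100)                 # i.e. 10% of the library
requests = random.choices(range(1000), weights=weights, k=30000)
hit_ratio = sum(cache.request(v) for v in requests) / len(requests)
print(f"hit ratio: {hit_ratio:.2f}")
```

Even this toy should reproduce the qualitative trend of Fig. 9: raising alpha (more skew) or the cache capacity increases the hit ratio, whereas an LRU baseline can evict popular chunks during a burst of unpopular requests.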

5. Conclusion

In this paper, we proposed that different MEC servers collaborate to cache and transcode different chunks of one video content, in order to use the backhaul and storage resources more efficiently, reduce the content delivery delay and minimize the network cost. Our joint caching and transcoding approach is formulated as an optimization problem. Due to the complexity of the problem, we proposed a sub-optimal relaxation and a PCCP framework to share and transcode multi-bitrate chunks in real time. We also studied the video viewing pattern and proposed the CrP and PcP policies to manage the addition and removal of chunks based on their popularity. The extensive simulation of our heuristic approach proves the performance of PCCP compared to other caching approaches in terms of cost, cache hit ratio, average delay, cache removals and CDN hits. Still, in addition to modeling the popularity of different chunks and proactively serving the next request, the popularity of different bitrate versions needs to be studied and the representation to be requested must be predicted. As future work, we intend to address these challenges and extend the work to include device-to-device offloading, as collaborative chunk caching and transmission can help use the limited resources of users efficiently.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This publication was made possible by NPRP grant 8-5191-108 from the Qatar National Research Fund (a member of


Qatar Foundation). The findings achieved herein are solely the responsibility of the author(s).

References

[1] Youtube statistics – 2019, 2019, URL https://merchdope.com/youtubestats/ (Accessed: 2019-05-08).
[2] Cisco visual networking index: Global mobile data traffic forecast update, 2016–2021, 2017.
[3] Cisco visual networking index: Forecast and trends, white paper 2017–2022, 2018.
[4] F. Haouari, E. Baccour, A. Erbad, A. Mohamed, M. Guizani, QoE-aware resource allocation for crowdsourced live streaming: a machine learning approach, CoRR abs/1906.09086 (2019) arXiv:1906.09086.
[5] H.A. Pedersen, S. Dey, Enhancing mobile video capacity and quality using rate adaptation, RAN caching and processing, IEEE/ACM Trans. Netw. 24 (2) (2016) 996–1010.
[6] A. Ndikumana, S. Ullah, T. LeAnh, N.H. Tran, C.S. Hong, Collaborative cache allocation and computation offloading in mobile edge computing, in: 2017 19th Asia-Pacific Network Operations and Management Symposium (APNOMS), 2017, pp. 366–369.
[7] T. Tran, D. Pompili, Adaptive bitrate video caching and processing in mobile-edge computing networks, IEEE Trans. Mob. Comput. (2018) 1.
[8] Youtube by the numbers: Stats, demographics & fun facts, 2018, URL https://www.omnicoreagency.com/youtube-statistics/ (Accessed: 2019-05-08, updated: 2019-01-06).
[9] C. Rick, Ooyala's q4 2013 report, 2014, URL http://tubularinsights.com/livevideo-vod-per-play/#ixzz4JTvPtK7v (Accessed: 2019-05-08).
[10] E. Fishman, How long should your next video be?, 2016, URL https://wistia.com/learn/marketing/optimal-video-length (Accessed: 2019-05-08).
[11] Y. Chen, Y. Liu, B. Zhang, W. Zhu, On distribution of user movie watching time in a large-scale video streaming system, in: 2014 IEEE International Conference on Communications (ICC), 2014, pp. 1825–1830.
[12] F. Jiang, Z. Liu, K. Thilakarathna, Z. Li, Y. Ji, A. Seneviratne, Transfetch: A viewing behavior driven video distribution framework in public transport, in: 2016 IEEE 41st Conference on Local Computer Networks (LCN), 2016, pp. 147–155.
[13] Chunked transfer encoding, 2019, URL https://en.wikipedia.org/wiki/Chunked_transfer_encoding (Accessed: 2019-05-08).
[14] E. Baccour, A. Erbad, A. Mohamed, K. Bilal, M. Guizani, Proactive video chunks caching and processing for latency and cost minimization in edge networks, in: 2019 IEEE Wireless Communications and Networking Conference (WCNC), 2019, pp. 1–7.
[15] S. Wu, M.-A. Rizoiu, L. Xie, Beyond views: Measuring and predicting engagement on YouTube videos, CoRR abs/1709.02541 (2017).
[16] N. Anjum, D. Karamshuk, M. Shikh-Bahaei, N. Sastry, Survey on peer-assisted content delivery networks, Comput. Netw. 116 (2017) 79–95.
[17] M.A. Salahuddin, J. Sahoo, R. Glitho, H. Elbiaze, W. Ajib, A survey on content placement algorithms for cloud-based content delivery networks, IEEE Access 6 (2018) 91–114.
[18] Q. Jia, R. Xie, T. Huang, J. Liu, Y. Liu, The collaboration for content delivery and network infrastructures: A survey, IEEE Access 5 (2017) 18088–18106.
[19] A. Ahmed, E. Ahmed, A survey on mobile edge computing, in: 2016 10th International Conference on Intelligent Systems and Control (ISCO), 2016, pp. 1–8.
[20] S. Wang, J. Xu, N. Zhang, Y. Liu, A survey on service migration in mobile edge computing, IEEE Access 6 (2018) 23511–23528.
[21] W.Z. Khan, E. Ahmed, S. Hakak, I. Yaqoob, A. Ahmed, Edge computing: A survey, Future Gener. Comput. Syst. 97 (2019) 219–235.
[22] E. Ahmed, M.H. Rehmani, Mobile edge computing: Opportunities, solutions, and challenges, Future Gener. Comput. Syst. 70 (2017) 59–63.
[23] E. Ahmed, A. Ahmed, I. Yaqoob, J. Shuja, A. Gani, M. Imran, M. Shoaib, Bringing computation closer toward the user network: Is edge computing the solution?, IEEE Commun. Mag. 55 (11) (2017) 138–144.
[24] J.A. Cabrera, R. Schmoll, G.T. Nguyen, S. Pandi, F.H.P. Fitzek, Softwarization and network coding in the mobile edge cloud for the tactile internet, Proc. IEEE 107 (2) (2019) 350–363.
[25] S. Misra, R. Tourani, F. Natividad, T. Mick, N.E. Majd, H. Huang, AccConf: An access control framework for leveraging in-network cached data in the ICN-enabled wireless edge, IEEE Trans. Dependable Secure Comput. 16 (1) (2019) 5–17.
[26] K. Poularakis, G. Iosifidis, A. Argyriou, I. Koutsopoulos, L. Tassiulas, Caching and operator cooperation policies for layered video content delivery, in: IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, 2016, pp. 1–9.
[27] R. Yu, S. Qin, M. Bennis, X. Chen, G. Feng, Z. Han, G. Xue, Enhancing software-defined RAN with collaborative caching and scalable video coding, in: 2016 IEEE International Conference on Communications (ICC), 2016, pp. 1–6.


[28] H. Ahlehagh, S. Dey, Video-aware scheduling and caching in the radio access network, IEEE/ACM Trans. Netw. 22 (5) (2014) 1444–1462.
[29] Y. Tan, C. Han, M. Luo, X. Zhou, X. Zhang, Radio network-aware edge caching for video delivery in MEC-enabled cellular networks, in: 2018 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), 2018, pp. 179–184.
[30] S. Kim, D.-Y. Kim, J.H. Park, Traffic management in the mobile edge cloud to improve the quality of experience of mobile video, Comput. Commun. 118 (2018) 40–49.
[31] A. Gharaibeh, A. Khreishah, B. Ji, M. Ayyash, A provably efficient online collaborative caching algorithm for multicell-coordinated systems, IEEE Trans. Mob. Comput. 15 (8) (2016) 1863–1876.
[32] J. Dai, F. Liu, B. Li, B. Li, J. Liu, Collaborative caching in wireless video streaming through resource auctions, IEEE J. Sel. Areas Commun. 30 (2) (2012) 458–466.
[33] Z. Qu, B. Ye, B. Tang, S. Guo, S. Lu, W. Zhuang, Cooperative caching for multiple bitrate videos in small cell edges, IEEE Trans. Mob. Comput. (2019) 1.
[34] C. Li, J. Tang, H. Tang, Y. Luo, Collaborative cache allocation and task scheduling for data-intensive applications in edge computing environment, Future Gener. Comput. Syst. 95 (2019) 249–264.
[35] D. Ren, X. Gui, K. Zhang, J. Wu, Hybrid collaborative caching in mobile edge networks: An analytical approach, Comput. Netw. 158 (2019) 1–16.
[36] T.X. Tran, D. Pompili, Joint task offloading and resource allocation for multi-server mobile-edge computing networks, IEEE Trans. Veh. Technol. 68 (1) (2019) 856–868.
[37] T.X. Tran, P. Pandey, A. Hajisami, D. Pompili, Collaborative multi-bitrate video caching and processing in mobile-edge computing networks, CoRR abs/1612.01436 (2016).
[38] K. Bilal, E. Baccour, A. Erbad, A. Mohamed, M. Guizani, Collaborative joint caching and transcoding in mobile edge networks, J. Netw. Comput. Appl. 136 (2019) 86–99.
[39] C. Yi, S. Huang, J. Cai, An incentive mechanism integrating joint power, channel and link management for social-aware D2D content sharing and proactive caching, IEEE Trans. Mob. Comput. 17 (4) (2018) 789–802.
[40] A. Vetro, C. Christopoulos, H. Sun, Video transcoding architectures and techniques: an overview, IEEE Signal Process. Mag. 20 (2) (2003) 18–29.
[41] K. Pires, G. Simon, Youtube live and twitch: A tour of user-generated live streaming systems, in: Proceedings of the 6th ACM Multimedia Systems Conference, ACM, 2015, pp. 225–230.
[42] G.S. Njoo, K.-W. Hsu, W.-C. Peng, Distinguishing friends from strangers in location-based social networks using co-location, Pervasive Mob. Comput. 50 (2018) 114–123.
[43] Most popular online video categories in the United States as of February 2017, 2018, URL https://www.statista.com/statistics/277777/most-popularus-online-video-catogories/ (Accessed: 2019-05-08).
[44] L. Tao, C. Zhumin, L. Yujie, M. Jun, Temporal patterns of the online video viewing behavior of smart tv viewers, J. Assoc. Inf. Sci. Technol. 69 (5) (2018) 647–659.
[45] A.L. Jia, S. Shen, D. Li, S. Chen, Predicting the implicit and the explicit video popularity in a user generated content site with enhanced social features, Comput. Netw. 140 (2018) 112–125.
[46] Y. Chen, B. Zhang, Y. Liu, W. Zhu, Measurement and modeling of video watching time in a large-scale internet video-on-demand system, IEEE Trans. Multimed. 15 (8) (2013) 2087–2098.
[47] Cache replacement policies, 2019, URL https://en.wikipedia.org/wiki/Cache_replacement_policies (Accessed: 2019-05-09).

Emna Baccour received the Ph.D. degree in Computer Science from the University of Burgundy, France, in 2017. She was a Research Assistant at Qatar University on a project covering interconnection networks for massive data centers. She currently holds a postdoctoral position at Qatar University. Her research interests include data center networks, cloud computing, green computing and software defined networks, as well as distributed systems. She is also interested in edge networks and mobile edge caching and computing.

Aiman Erbad is an associate professor in the Computer Science and Engineering department at Qatar University, Qatar. He received his Ph.D. from the University of British Columbia, Canada. He regularly serves as a technical program committee member in international conferences related to multimedia systems and networking (ACM Multimedia, ACM Multimedia Systems, NOSSDAV, MoVid).


Kashif Bilal received his Ph.D. from North Dakota State University, USA. He is currently a post-doctoral researcher at Qatar University, Qatar, and an Assistant Professor at COMSATS Institute of Information Technology, Pakistan. His research interests include cloud computing, energy efficient high speed networks, and crowdsourced multimedia. Kashif was awarded CoE Student Researcher of the Year 2014 for his research contributions during his doctoral studies at North Dakota State University.

Amr Mohamed (S'00, M'06, SM'14) received his M.S. and Ph.D. in electrical and computer engineering from the University of British Columbia, Vancouver, Canada, in 2001 and 2006, respectively. He worked as an advisory IT specialist at the IBM Innovation Centre in Vancouver from 1998 to 2007, taking a leadership role in systems development for vertical industries. He is currently a professor in the College of Engineering at Qatar University and the director of the Cisco Regional Academy. He has over 25 years of experience in wireless networking research and industrial systems development. He holds 3 awards from IBM Canada for his achievements and leadership, and 4 best paper awards from IEEE conferences. His research interests include wireless networking and edge computing for IoT applications. Dr. Amr Mohamed has authored or co-authored over 160 refereed journal and conference papers, textbooks, and book chapters in reputable international journals and conferences. He serves as a technical editor for the Journal of Internet Technology and the International Journal of Sensor Networks. He served as a technical program committee (TPC) co-chair for workshops in IEEE WCNC'16. He has served as a co-chair for technical symposia of international conferences, including Globecom'16, Crowncom'15, AICCSA'14, IEEE WLN'11,

and IEEE ICT'10. He has served on the organization committee of many other international conferences as a TPC member, including IEEE ICC, GLOBECOM, WCNC, LCN and PIMRC, and as a technical reviewer for many international IEEE, ACM, Elsevier, Springer, and Wiley journals.

Mohsen Guizani (S'85-M'89-SM'99-F'09) received the B.S. (with distinction) and M.S. degrees in electrical engineering, and the M.S. and Ph.D. degrees in computer engineering, from Syracuse University, Syracuse, NY, USA, in 1984, 1986, 1987, and 1990, respectively. He is currently a Professor in the CSE Department at Qatar University, Qatar. Previously, he served as the Associate Vice President of Graduate Studies at Qatar University, and held positions at the University of Idaho, Western Michigan University, and the University of West Florida. He also served in academic positions at the University of Missouri-Kansas City, University of Colorado-Boulder, and Syracuse University. His research interests include wireless communications and mobile computing, computer networks, mobile cloud computing, security, and smart grid. He is currently the Editor-in-Chief of the IEEE Network Magazine, serves on the editorial boards of several international technical journals, and is the Founder and Editor-in-Chief of the Wireless Communications and Mobile Computing journal (Wiley). He is the author of nine books and more than 500 publications in refereed journals and conferences. He has guest edited a number of special issues in IEEE journals and magazines. He has also served as a member, Chair, and General Chair of a number of international conferences. He received three teaching awards and four research awards throughout his career. He received the 2017 IEEE Communications Society Recognition Award for his contribution to outstanding research in Wireless Communications. He was the Chair of the IEEE Communications Society Wireless Technical Committee and the Chair of the TAOS Technical Committee.
He served as the IEEE Computer Society Distinguished Speaker from 2003 to 2005. He is a Fellow of IEEE and a Senior Member of ACM.