A novel GPU-aware Histogram-based algorithm for supporting moving object segmentation in big-data-based IoT application scenarios


Information Sciences 496 (2019) 592–612

Contents lists available at ScienceDirect

Information Sciences journal homepage: www.elsevier.com/locate/ins

A novel GPU-aware Histogram-based algorithm for supporting moving object segmentation in big-data-based IoT application scenarios

Alfredo Cuzzocrea∗, Enzo Mumolo
DIA Department, University of Trieste, Italy

Article info

Article history: Received 18 February 2018; Revised 8 March 2019; Accepted 13 March 2019; Available online 14 March 2019.

Keywords: Big data management; IoT application scenarios; Moving object segmentation; GPU-aware algorithms; Intelligent applications

Abstract

Multimedia data are a popular case of Big Data that expose the classical 3V characteristics (i.e., volume, velocity and variety). Such data are likely to be processed within the core layer of Internet of Things (IoT) platforms, where a large number of “things” (e.g., sensors, devices, actuators, and so forth) collaborate to massively process big data for supporting intelligent algorithms running over them. In such platforms, the computational bottleneck is very often represented by the component running the main algorithm, while communication and cooperation costs still remain relevant. Inspired by this emerging trend of big-data-based IoT applications, in this paper we focus on the specific application context represented by the problem of supporting moving object segmentation over images originated in the context of big multimedia data, and we propose an innovative background maintenance approach to this end. In particular, we provide a novel GPU-aware Histogram-based Moving Object Segmentation algorithm, called pixHMOS_gpu, which adopts a pixel-oriented approach and is based on Graphics Processing Units (GPUs). pixHMOS_gpu allows us to achieve higher performance, hence making the computational gap of big-data-based IoT applications decisively smaller. Experimental results clearly confirm our arguments. © 2019 Published by Elsevier Inc.

1. Introduction Multimedia data are a popular case of Big Data (e.g., [11,30]) that expose the classical 3V characteristics (i.e., volume, velocity and variety). Such data are likely to be processed within the core layer of Internet of Things (IoT) platforms, where a large number of “things” (e.g., sensors, devices, actuators, and so forth) collaborate to massively process big data for supporting intelligent algorithms running over them. The synergy between big data and IoT platforms has already been highlighted in numerous studies to date. Fig. 1 provides an example of such platforms in the context of situational awareness and intelligence over multimedia big data. This refers to government organizations that, in connection with safe city solutions, border control institutions, critical infrastructures, transportation security agencies, and so forth, focus on improving security via enhanced situational awareness obtained by means of fusing multiple big data from a large



Corresponding author. E-mail addresses: [email protected] (A. Cuzzocrea), [email protected] (E. Mumolo).

https://doi.org/10.1016/j.ins.2019.03.029 0020-0255/© 2019 Published by Elsevier Inc.


Fig. 1. IoT-Big multimedia data surveillance application scenario.

collection of security systems, also including image-processing-oriented infrastructures. Such (big) data are later aggregated from multiple sensors, and correlation analysis over them is finally capable of magnifying the effectiveness of situational awareness methodologies, with the goal of enhancing timely response and investigation, in the context of security systems as regards the specific case. On the other hand, the more general problem of big data fusion has been investigated recently, with several interesting research outcomes. As shown in Fig. 1, a Target Environment is continuously monitored by an IoT-Big Multimedia Data Surveillance Platform. Big multimedia data originated by such platform are (i) first fused within the Big Multimedia Data Fusion module, according to suitable big data fusion algorithms; (ii) then analyzed by means of the Big Multimedia Data Correlation Analysis module; (iii) finally delivered to Cybersecurity Analysts who, by interacting with ad-hoc Vulnerability DBs, try to detect possible cybersecurity breaches and security risks. A few remarks should be made on the application shown in Fig. 1. Several computer-vision-based algorithms have so far been tackled with single-view approaches. Examples of such algorithms include vision monitoring or surveillance for controlling particular sites, or automatic moving object detection for estimating vehicle or people flows. However, it is clear that multiple views would bring many benefits compared to single views, essentially because some views can reveal information that is hidden from other views. Indeed, it has been shown that multiple views can greatly help computer vision algorithms, such as proposals that focus on moving object detection and tracking (e.g., [32,41]). For this reason, there is a continuous trend towards multi-camera approaches.
Multi-vision allows the recovery of 3D information of the visual environment, but the related algorithms are quite demanding in terms of computational complexity. Recently, several approaches simpler than 3D recovery have been proposed. Most of such approaches are based on homography constraints. Following this major trend, as shown in Fig. 1, we propose to gather video information from some high-definition cameras and to develop a high-performance moving object detection algorithm. On the other hand, managing multiple, high-definition cameras involves a huge quantity of data to be managed, hence the great big data challenge arises. In such platforms, the computational bottleneck is very often represented by the component running the main algorithm, while communication and cooperation costs still remain relevant. For instance, in the running example, the computational bottleneck is represented by the need for effectively and efficiently detecting moving objects over massive big multimedia data. Inspired by this emerging trend of so-called big-data-based IoT applications (e.g., [39,50]), in this paper we focus on the specific application context represented by the problem of supporting moving object segmentation over images originated in the context of big multimedia data, and we propose an innovative background maintenance algorithm to this end, called GPU-aware Histogram-based Moving Object Segmentation (pixHMOS_gpu). pixHMOS_gpu adopts a pixel-oriented approach and is based on Graphics Processing Units (GPUs), which have already proved to be extremely suitable for supporting data-intensive computing tasks. pixHMOS_gpu allows us to achieve high performance at a low computational cost, hence making the computational gap of big-data-based IoT applications smaller, by limiting the complexity of the main algorithm. Experimental results clearly confirm our arguments.


1.1. Paper contributions

The paper makes the following contributions:

• we focus on the context of big-data-based IoT environments;
• we provide an algorithm for effectively and efficiently supporting moving object segmentation over images originated in the context of big multimedia data;
• we provide formal properties of the proposed framework, according to the commonly-known pixel-wise model for computer vision;
• we face the problem of maintaining the background over multiple images, with information-fusion-aware problems;
• we provide an effective GPU-based implementation of the proposed algorithm;
• we analyze related work and highlight pros and cons of our proposed approach;
• we introduce several case studies;
• we provide a comprehensive experimental evaluation of the proposed algorithm, against several data sets and applicative settings; our experimental work confirms that the proposed algorithm solves the challenging moving-object-segmentation problem in the innovative and emerging big-data-based IoT context.

1.2. Paper organization

The remaining part of the paper is organized as follows. Section 2 provides foundations and motivations of our research, by focusing on the benefits deriving from GPU-aware algorithms for supporting big-data-based IoT applications. In Section 3, we provide an overview of the pixHMOS_gpu algorithm, which is the main result of our research. Section 4 contains preliminary considerations as well as fundamental concepts of our research. In Section 5, we describe background knowledge for our research, namely pixel-wise algorithms and the GPU-based computational model. Section 6 deals with previously-published work on GPU-based implementations of Computer Vision algorithms that are related to our research. In Section 7, the pixHMOS_gpu algorithm is described in detail, along with its GPU-based implementation. Section 8 contains the experimental assessment and analysis of the pixHMOS_gpu algorithm in terms of computational efficiency, speedup and quality, along with some interesting case studies. Finally, in Section 9, final remarks and future work are proposed.

2. Moving object segmentation: towards GPU-aware algorithms for supporting effective and efficient big-data-based IoT applications – foundations and motivations

In this Section, we provide foundations and motivations of our research, by illustrating how GPU-aware algorithms can be successfully exploited to support effective and efficient big-data-based IoT applications. By focusing on next-generation big-data-based IoT applications, with particular emphasis on big multimedia data, this paper provides an effective and efficient algorithm that solves the moving object segmentation problem for a wide range of critical scenarios (e.g., surveillance, elderly care management, monitoring and situational awareness systems, and so forth).
The proposed pixHMOS_gpu algorithm exploits the computational power offered by novel GPU computational paradigms, and ensures the feasibility of big-data-based IoT applications by limiting the complexity of the most computationally expensive task inside such systems (i.e., just the moving object segmentation phase). The pixHMOS_gpu algorithm deals with the background maintenance problem and proposes an innovative GPU-aware, histogram-based, pixel-wise solution. At the conceptual level, pixHMOS_gpu introduces the following main features: fast background initialization, high accuracy in describing the effective background, and fast reaction to sudden changes. The basic idea behind our proposed algorithm is that pixels are updated only if a statistic measure on the intensity variation of each pixel is greater than an adaptive threshold, thus reducing the I/O channel occupation significantly. In computer vision systems, a background model is a representation of the background image based on its associated statistics. Background models are widely used for foreground object segmentation (e.g., [14]), which is a fundamental task in many computer vision problems including moving object detection (e.g., [2]), shadow detection and removal (e.g., [40]), image classification (e.g., [33]), and other image-processing-targeted tasks. As the visual scene changes over time, these models are continuously updated so as to include the required background modifications. Designing effective and efficient model-updating algorithms constitutes, as mentioned, the background maintenance problem (e.g., [38]). As described in [7], there are many problems that background maintenance algorithms should solve, mainly related to the reaction of the background to both sudden and gradual changes in the visual scene. Noticeable instances are sudden or gradual environmental light changes.
Moreover, the moving object detection process can generate ghost images if the background image reconstruction is not fast enough. Consider, for instance, the case of an object in the background that starts moving. Since it is moving, the object should not be considered part of the background anymore. However, since the background image is reconstructed from the background model, the object cannot be immediately removed from the background. This can generate ghost images in the moving object detection process. A similar problem happens when a foreground object suddenly becomes motionless. Other typical background maintenance problems are due to the sudden or gradual environmental light changes, as mentioned above. Further, other problems may be caused by shadows, because foreground objects often generate cast shadows


that appear different from the modeled background. Hence, high-quality background management algorithms are generally quite complex to design. In fact, there is a trade-off to be considered between the accuracy of the background image and the computing time the algorithm requires. Of course, the complexity of these methods is a major obstacle to real-life applications. State-of-the-art techniques can be classified depending on the features they use. Specifically, we recognize the following two main classes: (1) temporal level approaches [12]: techniques adhering to such class make use of the temporal distribution of intensity only; (2) pixel-wise level (spatial level) approaches [22]: techniques adhering to such class fragment the frame sequence into independent pixel-oriented processes. In temporal-level techniques (e.g., [4]), temporal averaging and temporal median are two common methods exploited to compute an adaptive background model. The background model is estimated by processing each pixel in frames without prior knowledge. In pixel-wise-level techniques, each pixel is modeled independently; e.g., [16] models each pixel via a single Gaussian process. Further, in [42], a Kalman filter is used to model each pixel to overcome gradual light changes. In [45] other statistical approaches for estimating a background model are proposed. These methods show their effectiveness in modeling dynamic backgrounds via updating them over time. They can deal with gradual light changes as well. However, the complexity of these approaches is still an open problem to deal with. A typical limitation of current background maintenance algorithms is related to the following aspects: (i) tracking dynamic changes in real time, and (ii) processing high-definition frames at high frame rates while still maintaining high-quality background images. Besides this, a few other considerations must be taken into account.
First, in some cases, initialization of background maintenance algorithms must be performed with foreground objects present in the visual scene. Second, since the resolutions of video cameras are continuously increasing, background maintenance algorithms must process ever-increasing amounts of data. Third, since the frame rate of video cameras is increasing as well, background maintenance computations must be increasingly fast. For the latter reason, several initiatives propose background management algorithms developed on top of GPUs, so as to exploit the computational power offered by these computational paradigms. This is also the goal of the proposed pixHMOS_gpu algorithm.

3. The pixHMOS_gpu algorithm: overview

In this Section, we focus on a detailed overview of the main contribution of our research, i.e., the pixHMOS_gpu algorithm. The pixHMOS_gpu algorithm is a GPU-aware, histogram-based, pixel-wise background maintenance algorithm, which uses video streams acquired from a fixed-position camera. pixHMOS_gpu introduces the following features: fast background initialization, high accuracy in describing the effective background, and fast reaction to sudden changes. pixHMOS_gpu makes use of an adaptive threshold that determines when pixels must be updated via a suitable statistic measure on the intensity variation of each pixel. It is suited to real-time processing and high-definition images. In particular, the algorithm is well-suited to describing dynamic visual scenes with light intensity changes. It is worth noting that the proposed solution optimizes the thresholds independently for each pixel, leading to a better description of local dynamic changes in the image with respect to alternative fixed-threshold approaches. Moreover, pixHMOS_gpu does not require a background model training phase.
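The per-pixel adaptive gating idea described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the variance-based statistic, the factor k, and all names are illustrative assumptions.

```python
import numpy as np

def pixels_to_update(frame, background, running_var, k=2.5):
    """Per-pixel adaptive gating (illustrative sketch): a pixel is
    marked for update only when its deviation from the current
    background exceeds an adaptive, per-pixel threshold derived from
    a running variance estimate."""
    diff = np.abs(frame.astype(np.float64) - background)
    threshold = k * np.sqrt(running_var)  # one threshold per pixel
    return diff > threshold
```

Because each threshold is maintained independently, a pixel in a flickering region (high running variance) tolerates larger deviations before triggering an update than a pixel in a stable region, which is the advantage over a single fixed threshold.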
Summarizing, the main contribution of pixHMOS_gpu consists in improving the quality of the resulting background images with respect to well-known algorithms available in the active literature, as we demonstrate via experimental analysis throughout the paper. pixHMOS_gpu achieves high performance thanks to the fact that it adopts a GPU-based computational paradigm. GPUs are characterized by a high number of parallel processors and implement the stream processor paradigm, a form of Single Instruction Multiple Data (SIMD) parallel processing. Under this paradigm, the same series of operations (a.k.a. kernel functions) is independently applied to each element of a data stream, in an unspecified order and in batches of an undetermined number of elements. In the context of image processing, this paradigm is particularly suited to cases where each input pixel is processed independently (as in our proposed pixHMOS_gpu algorithm). Given the computational requirements posed by high-quality and high-performance background maintenance methods, it appears evident that the best solution for our investigated problem consists in developing a parallel background maintenance algorithm, as occurs in a GPU-based solution. Generally speaking, all pixel-wise algorithms are suitable for a parallel implementation. However, there are some constraints that should be carefully considered, mostly regarding the time needed for I/O transfers and the data transfer between parallel elements, which must both be minimized. The pixHMOS_gpu algorithm is indeed pixel-wise, as almost all processing is performed independently on each pixel and, as such, it is well-suited to a GPU-based implementation, and high speed-ups can be expected over the classical sequential implementation. It is worth remarking that an important share of the real-time computation of background maintenance algorithms is devoted to the detection of moving objects from high-definition images.
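The stream-processing (SIMD) paradigm described above can be illustrated with a minimal sketch, where a plain loop stands in for the parallel kernel launch; all names are illustrative.

```python
import numpy as np

def run_kernel(kernel, frame):
    """Stream-processing sketch: the same kernel function is applied
    independently to each pixel, in no guaranteed order. On a GPU,
    every application would run in its own hardware thread; here a
    sequential loop stands in for the parallel launch."""
    out = np.empty_like(frame)
    for idx in np.ndindex(frame.shape):
        out[idx] = kernel(frame[idx])
    return out
```

Since no pixel's result depends on any other pixel, the iteration order is irrelevant, which is exactly the property a SIMD execution model exploits.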
On the other hand, digital camera trends show increasing image resolutions and increasing frame rates. In fact, with high-resolution images, it is possible to perform zooms of a particular


region of the target image, such as a door or an entrance in an ambient scene. Moreover, high frame rates allow us to capture elements that are moving at high speed, such as fast-running persons or high-speed cars. All these features are very important in video surveillance, for instance. Another important application scenario where the proposed pixHMOS_gpu algorithm turns out to be extremely useful is when the scene is crowded. In this case, the background is almost completely hidden for a long period of time and needs to be reconstructed in real time. The proposed pixHMOS_gpu algorithm is capable of matching all the described requirements, and of efficiently supporting next-generation big-data-based IoT applications.

4. Preliminary considerations

In this Section, we provide some preliminary considerations whose fundamental concepts will be used throughout the paper. Let us denote as I the current acquired frame. In I, the pixel $p_{x,y}$ at time t is characterized by the intensities of the basic colors Red, Green and Blue (RGB), respectively, via an RGB vector, denoted by $I_{x,y}(t)$, which is defined as follows:



$$I_{x,y}(t) = \left[\, I^{R}_{x,y}(t),\; I^{G}_{x,y}(t),\; I^{B}_{x,y}(t) \,\right]$$   (1)

such that: (i) $I^{R}_{x,y}(t)$ denotes the intensity of the red component of the pixel $p_{x,y}$ at time t; (ii) $I^{G}_{x,y}(t)$ denotes the intensity of the green component of the pixel $p_{x,y}$ at time t; (iii) $I^{B}_{x,y}(t)$ denotes the intensity of the blue component of the pixel $p_{x,y}$ at time t. Similarly, B denotes the background image produced by a background model, as follows:



$$B_{x,y}(t) = \left[\, B^{R}_{x,y}(t),\; B^{G}_{x,y}(t),\; B^{B}_{x,y}(t) \,\right]$$   (2)

The goal of the background maintenance task is to estimate, at each time t, the background model which produces the background image B. Furthermore, we denote as MO the set of pixels of the current frame I which report the moving objects (MO), and as G the set of pixels which appear in motion but do not correspond to any moving object. G identifies a ghost image. As mentioned in Section 2, ghost images are due to the delay introduced by the background model in reconstructing the background image. Similarly to $I_{x,y}(t)$, in the images $B_{x,y}(t)$, $MO_{x,y}(t)$ and $G_{x,y}(t)$ the pixel $p_{x,y}$ at time t is modeled in terms of an RGB vector. Once the background image is computed, moving objects can be detected by subtracting, at time t, the background B from the current frame I, resulting in the difference image D, as follows:



$$D_{x,y}(t) = I_{x,y}(t) - B_{x,y}(t) = \begin{cases} MO_{x,y}(t) - B_{x,y}(t) & \text{if } p_{x,y} \in MO \\ B_{x,y}(t) - MO_{x,y}(t) & \text{if } p_{x,y} \in G \\ 0 & \text{otherwise} \end{cases}$$   (3)

B must be reconstructed from a background model because MO hides the effective background. B is reconstructed by computing, at time t, its pixels $p_{x,y}$ according to the following equation:



$$B_{x,y}(t) = \begin{cases} B_{x,y}(t - \Delta t_{x,y}) & \text{if } p_{x,y} \in MO \\ MO_{x,y}(t) & \text{if } p_{x,y} \in G \\ \text{unchanged} & \text{otherwise} \end{cases}$$   (4)

where the time interval $\Delta t_{x,y}$ is estimated such that no moving objects correspond to the pixel $p_{x,y}$ during $\Delta t_{x,y}$. In order to measure the quality of a background model, several indexes have been proposed in the literature (e.g., [4]). Generally speaking, when using background subtraction approaches (like ours), the quality of the background is measured through the quality of the foreground objects extracted from the current image. In our research, we consider the following quality measures.

4.1. Similarity

Let A denote an extracted foreground region and B the corresponding ground-truth region; the similarity between A and B, denoted by S(A, B), is defined as follows [29]:

$$S(A, B) = \frac{|A \cap B|}{|A \cup B|}$$   (5)

From Eq. (5), it should be noted that S(A, B) is a nonlinear measure. It tends to 1 (i.e., the maximum value) if A and B are equal, and to 0 (i.e., the minimum value) when A and B are disjoint. In other words, S(A, B) integrates the false positive and false negative errors in one measure.

4.2. Recall-Precision

Recall, denoted by R, and Precision, denoted by P, are two widely-used metrics for evaluating the correctness of pattern recognition algorithms. P and R can be seen as extended versions of accuracy, a simple metric that computes the fraction of


instances for which the correct result is returned. In our investigated context, recall R is defined as follows [26]:

$$R = \frac{N_{FP_C}}{N_{FP_G}}$$   (6)

such that: (i) $N_{FP_C}$ denotes the number of foreground pixels that are correctly identified (by the target moving object segmentation algorithm A); (ii) $N_{FP_G}$ denotes the number of foreground pixels in the ground truth. Precision P is instead defined as follows [26]:

$$P = \frac{N_{FP_C}}{N_{FP_A}}$$   (7)

such that: (i) $N_{FP_C}$ denotes the number of foreground pixels that are correctly identified (by the target moving object segmentation algorithm A); (ii) $N_{FP_A}$ denotes the number of foreground pixels that are identified by the target moving object segmentation algorithm A (including pixels that are erroneously identified as pixels of moving objects). We can qualify how well a background model works by matching its results to the ground truth. When using precision and recall, the set of possible labels for a given instance is divided into two subsets, one of which is considered “relevant” for the purposes of the metric. Recall is then computed as the fraction of correct instances among all instances that actually belong to the relevant subset, while precision is the fraction of correct instances among those that the algorithm believes to belong to the relevant subset. Precision can be seen as a measure of fidelity, whereas recall is a measure of completeness. Clearly, a background model exposes a high quality when its values of precision and recall (on the target image set) are both high.

5. Background knowledge

In this Section, we provide the background knowledge of our research. In particular, in Section 5.1 we report on pixel-wise algorithms, by detailing their definitions and properties; in Section 5.2 we focus on the GPU-based computational model.

5.1. Pixel-wise algorithms: definitions and properties

In pixel-wise algorithms (e.g., [22]), each pixel of a frame is considered to be independent from the others. The temporal evolution of each pixel’s value is used to update each single pixel. As a consequence, the background image is obtained by putting the updated pixels side by side. Different types of pixel-wise algorithms are characterized by different parallel implementation solutions.
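The pixel-wise idea can be sketched with a simple per-pixel statistic. The running average used here is one illustrative choice, not the specific model adopted by pixHMOS_gpu; what matters is that every pixel is refreshed from its own temporal history only, so each update could run in a separate thread with no inter-thread communication.

```python
import numpy as np

def pixelwise_update(background, frame, alpha=0.02):
    """Pixel-wise background update sketch (running average): each
    output pixel depends only on that pixel's previous background
    value and its new observation, so all updates are independent."""
    return (1.0 - alpha) * background + alpha * frame.astype(np.float64)
```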
Region-wise algorithms, such as Local Binary Patterns (LBP) [21], take into account the neighborhood of each pixel in order to update the background image by considering local spatiality. Such algorithms require updating the neighborhood data together with the current pixel value, and this can cause high inter-thread communication costs. In our research, we consider pixel-wise algorithms due to the fact that they are characterized by a natural parallel implementation. Indeed, in such algorithms, the updating process of each pixel can be performed by a different thread of execution. The data of each pixel is in fact independent of the data of the other pixels. Several statistical techniques have been used to manage the temporal evolution of the color value of each pixel. Popular techniques are those based on Mixtures of Gaussian Models and those based on histograms (e.g., [25,26]). Consider, for instance, the method described in [44]. Here, a given sequence of recent-history frames $\{I_1, \ldots, I_t, \ldots, I_N\}$ is modeled by means of a mixture of N Gaussian distributions. On the basis of this theoretical framework, the probability of observing the current pixel value, denoted by $P(I_{x,y}(t))$, is computed as follows:

$$P(I_{x,y}(t)) = \sum_{i=1}^{N} \omega_{i,t} \cdot \eta(I_t, \mu_{i,t}, \Sigma_{i,t})$$   (8)

wherein: (i) N denotes the number of Gaussian distributions; (ii) $\omega_{i,t}$ denotes an estimate of the weight of the i-th Gaussian of the mixture at time t (i.e., the portion of data that is accounted for by the i-th Gaussian); (iii) $\mu_{i,t}$ is the mean value of the i-th Gaussian of the mixture at time t; (iv) $\Sigma_{i,t}$ is the covariance matrix of the i-th Gaussian of the mixture at time t; (v) $\eta$ denotes a Gaussian probability density function. In [44], Gaussian distributions are ordered so as to give more evidence to distributions with high peaks and low variances. Then, the first $M_D$ Gaussian distributions are chosen to finally represent the background model. $M_D$ is computed as follows:



$$M_D = \operatorname{argmin}_m \left( \sum_{k=1}^{m} \omega_k > T \right)$$   (9)

such that T is a measure of the minimum portion of data that should be accounted for by the background (still modeled in terms of the selected Gaussian distributions). As regards histogram-based approaches, we first provide some foundations on how histograms are exploited in order to represent background images. First, consider that, without loss of generality, pixels expose different intensity values across a


Fig. 2. Histogram-based Color Intensity Variation Across Time from (a) to (b).

certain number of frames. Therefore, the color distribution of a background pixel $p_{x,y}$ having intensity a, such that 0 ≤ a ≤ 255 (i.e., we assume a color depth of 8 bits for each color channel), at time t can be described by means of a histogram $H^{c}_{x,y}(t, a)$ that represents the distribution of the intensity values of each color c, such that c ∈ {Red, Green, Blue}. Given a pixel $p_{x,y}$ with intensity a, its associated histogram $H^{c}_{x,y}(t+1, a)$ at time t + 1 is updated from the associated histogram $H^{c}_{x,y}(t, a)$ at time t as follows:

$$H^{c}_{x,y}(t+1, a) = H^{c}_{x,y}(t, a) + \delta(I^{c}_{x,y}(t) - a)$$   (10)

where δ ( · ) denotes the Dirac delta function, defined as follows:



$$\delta(p - q) = \begin{cases} 1 & \text{if } p = q \\ 0 & \text{if } p \neq q \end{cases}$$   (11)

In histogram-based algorithms, the background color is assumed to correspond to the first peak value of the histogram, for all the histograms associated to the three color channels. In the case of two or more equal peaks, only the first one is selected. The idea behind histogram-based approaches is as follows. If the pixel $p_{x,y}$ of the current frame always represents the same, fixed point of an object, the corresponding histogram bin continuously increases at each frame (see Eq. (10)). Therefore, we can argue that the pixel is a background pixel. For instance, Fig. 2(a) represents the histogram associated to a pixel pointing to an object with color intensity a = 100 after a certain number of frames, say n, the intensity being initially a = 0. If, after more frame arrivals, say m with m > n, the histogram peak moves to color intensity a = 180, as shown in Fig. 2(b), we derive that this peak represents the color intensity of the object that is seen by the same pixel for the longest time, hence the latter is a background object. In this sense, the histogram-based model $H^{c}_{x,y}(t, a)$ is a background model. From this model, the background image can be reconstructed by choosing the color with the highest value in the histogram. The histogram-based model still suffers from the ghost image problem mentioned in Section 2. In fact, if the target pixel does not belong to the moving object anymore, its color still continues to remain the highest peak for some time, until other peaks become higher due to the histogram updating. This is because the background reconstruction is not fast enough. In fact, when reconstructing the background image, the target pixel still appears as belonging to the moving object although this is no longer true. The resulting detected moving object is a ghost image. Lai et al. [26] propose a histogram-based background model that builds on these considerations. Hereinafter, we name this baseline approach the Histogram-Based algorithm (HB).
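The histogram update of Eq. (10) and the HB-style peak-based reconstruction can be sketched as follows (a single color channel of a single pixel; names are illustrative):

```python
import numpy as np

def update_histogram(hist, intensity):
    """Eq. (10): the Dirac delta selects exactly one bin, so the
    update is a single increment of the observed intensity's bin."""
    hist[intensity] += 1

def background_color(hist):
    """HB-style reconstruction: the background intensity is the first
    (highest) peak of the histogram; ties resolve to the lowest bin,
    matching the 'only the first peak is selected' rule."""
    return int(np.argmax(hist))
```

Note how the ghost-image problem shows up in this sketch: after an object leaves, its old bin stays the maximum until newer observations accumulate a higher peak, so `background_color` keeps returning the stale intensity for a while.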
According to HB, the background image is obtained by extracting the peak value of the histograms: a pixel is considered foreground if it is significantly different from the current background estimation [26]. Kuo et al. [25] introduce an improved version of HB, called the Efficient Histogram-Based algorithm (EHB). According to this novel strategy, the background is not updated periodically, but only when changes occur after a certain period of time. In order to adapt to changes in the scene, the number of detected changes of each pixel at time t is introduced. If a color intensity variation is frequently detected in a period of time and the associated number of detected changes is above a given threshold, the background image is then updated.

5.2. The GPU-based computational model

The GPU-based computational model builds on the massive floating-point computational power of modern graphic accelerators’ shader pipelines. GPUs have been extensively used in supporting data-intensive computing tasks for a wide family of goals, ranging from data management techniques to OLAP data processing, from Data Mining algorithms to artificial intelligence methods, and so forth. On the other hand, in order to achieve a further performance gain, GPU computing has also been integrated with data compression paradigms, similarly to what happens in different but related applicative settings. In our research, we exploit the GPU-based computational model for effectively and efficiently supporting big-data-based IoT applications, according to the guidelines discussed in Sections 1 and 2, respectively. When utilizing GPUs, there are several things that must be taken into consideration, as the internal structure of a GPU is completely different from the internal structure of a traditional CPU. Fig. 3 provides, to this end, the typical interaction

A. Cuzzocrea and E. Mumolo / Information Sciences 496 (2019) 592–612


Fig. 3. CPU-GPU high-performance computational architecture.

Fig. 4. CUDA-based multi-threading: shared memory within each block and global memory shared by all threads.

between the GPU and the CPU, where the hierarchical CPU-GPU model can be appreciated. Indeed, as shown in the Figure, CPU computing is mapped on multiple GPU computational blocks, thus improving performance. Let us move in greater detail on the GPU-based computational model and on the design considerations that concern algorithms demanding to efficiently exploit the GPU computational power. First, the execution model of GPUs is really different from that of CPUs. GPUs employ massive parallelism and wide vector instructions for supporting computation tasks, by executing the same instruction for a large number of elements at a time. Indeed, if the design phase of the target algorithms does not take this special feature into consideration, the performance of GPU-based computational frameworks will only reach a small fraction of what the hardware is really capable of. The computation in blocks is parallelized through parallel threads. Threads are organized by CUDA (Compute Unified Device Architecture) in grids of blocks and scheduled in hardware. Each thread is assigned a unique ID, which is accessible within the kernel code. The blocks, similarly to threads, are arranged in one- or two-dimensional grids. Also, threads within a block can cooperate by sharing data through a shared memory and by synchronizing their execution to coordinate memory accesses. Threads can access data from different memory locations during their execution. Each thread has its own private local memory. Each block has its own shared memory, visible to all the threads of the block and with the same lifetime as the block. All threads of all blocks have access to a global memory. Fig. 4 shows the shared memory within each block and the global memory shared by all the threads. There are also two additional read-only memory spaces accessible by all threads: the constant memory and the texture memory.
The global, constant, and texture memory persist across different kernel executions of the same application. As an example of GPU-based architectures, consider the case of multi-thread programs, which are very often used in data-intensive computing platforms (e.g., [19]). Here, the computational flow can be easily spread out over multiple multi-core GPU processors, like in Fig. 4, where a CUDA-based multi-thread program is executed in parallel on top of two multi-core GPU processors. It is worth noticing that the architecture shown in Fig. 4 can be easily scaled up to a larger number of parallel GPU processors as the input computational flow grows in size, hence achieving higher performance.
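The per-thread unique ID mentioned above follows the usual 1-D CUDA convention, blockIdx.x * blockDim.x + threadIdx.x. The following plain-Python sketch only emulates that enumeration sequentially on the CPU; in a real kernel each thread computes its own ID in parallel.

```python
def global_thread_ids(grid_dim, block_dim):
    """Enumerate the unique global IDs a 1-D CUDA launch would produce:
    each thread computes blockIdx.x * blockDim.x + threadIdx.x."""
    ids = []
    for block_idx in range(grid_dim):        # one iteration per block in the grid
        for thread_idx in range(block_dim):  # one iteration per thread in the block
            ids.append(block_idx * block_dim + thread_idx)
    return ids
```

For instance, a launch with 2 blocks of 4 threads covers IDs 0 through 7, one per processed element.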


A critical aspect to be considered when designing algorithms running on GPU-based architectures is that, usually, data are transferred from the CPU to the GPU, the GPU being connected to the target workstation via a conventional PCI connector. In such a hybrid computational architecture, it is clear enough that the most time-consuming task is just the data transfer, which is usually relatively slow. Indeed, as widely known, data transfer usually adds significant overhead to any algorithm and, as a consequence, when designing GPU-aware algorithms, the most critical aspect to be considered concerns how to optimize the data transfer phase, in order to reduce that overhead.

6. Related work

In this Section, we provide an analysis of research efforts known in the literature that are related to our research. Indeed, image processing and computer graphics are the fields where the GPU can give the greatest benefits, as long as the algorithms are mostly pixel-wise (see Section 2). One direct advantage is that computations executed by the GPU clearly free the (interconnected) CPU from those tasks (see Fig. 3). On the basis of this paradigm, several GPU-aware algorithms for background management problems have been proposed in recent years. Griesser et al. [20] present a foreground-background segmentation method that is based on a novel color similarity test in a small pixel neighborhood, integrated within a Bayesian estimation framework based on an iterative Markov random field. The approach processes images of size 640×480 pixels on top of an NVIDIA GeForce 6800GT graphic board. In [34], a background model with low latency is introduced, in order to detect moving objects in video sequences. Subsequent frames are stacked on top of each other using ad-hoc associated registration information, which must be obtained in a suitable pre-processing step.
Their GPU-based architecture runs on a test-bed system equipped with an NVIDIA GeForce 8800 GTX model, having 768 MB RAM dedicated to video storage. Yu and Medioni [49] describe a GPU-based architecture for supporting motion detection from a moving platform. A step compensating for camera motion is required prior to estimating the background model, contrary to conventional settings where the cameras are fixed. Due to unavoidable registration errors, the background model is estimated on the basis of a sliding window of frames. In more detail, the proposed background model is based on an RGB texture histogram and a searching routine looking for the bin with the largest number of samples. Gong and Cheng [18] propose an approach that incorporates a pixel-based online learning method that adapts to temporal background changes, together with a graph-cuts method allowing to propagate per-pixel evaluation results over nearby pixels. The architecture is based on an Intel Centrino T2500 CPU and an ATI Mobility Radeon X1400 GPU. Pham et al. [35] describe a GPU-aware implementation of an improved version of the extended Gaussian mixture background model, based on an Intel Core2 Duo 2.6 GHz CPU and an NVIDIA GeForce 9600GT GPU. They achieve a speed-up of at least a factor of 10 over the corresponding CPU-aware implementation. In [36], Poremba et al. describe a background maintenance algorithm based on an Adaptive Mixture of Gaussians (AGMM) running on a hybrid architecture comprising an NVIDIA GeForce 9800 GPU and an IBM Cell processor. Compared with a CPU-aware implementation running on an Intel Core2 Duo 2.6 GHz using multi-threading, they achieve high orders of acceleration for both the GPU and the IBM Cell processor, respectively.
Momcilovic and Sousa [31] describe a new parallel motion estimation approach for multi-core architectures that exploits the capacity of multi-core processors to efficiently provide the real-time motion estimation required by the recent Advanced Video Coding standards. In particular, motion estimation can be efficiently performed while the main processor executes in parallel the other parts of the video coding system. Experimental results show that motion estimation can be performed in real-time for the most demanding configurations and search algorithms by programming the proposed parallel algorithm on a current multi-core processor. To evaluate the efficiency of the proposed model, the authors use an H.264/MPEG-4 video coding motion estimation configuration. The same authors later report an enhanced version of their parallel motion estimators developed on top of two GPUs with the Tesla and CUDA architectures, respectively. The authors show that real-time motion estimation is achieved even for 720×576 resolution at 25 frames per second on top of a GeForce 285GTX GPU. Fukui et al. [15] propose a GPU-based algorithm for extracting moving objects in real-time. Differently from Momcilovic and Sousa [31], they propose to extract the moving regions without shadows in a high-definition video sequence under intensity changes, and their method makes it possible to use the color space for handling shadow areas produced by objects. Therefore, the proposed method allows extracting moving objects without shadow areas. Moreover, the proposed method is robust to intensity changes, as the authors prove in their experimental evaluation and analysis. In particular, real-time performance is obtained on top of a GeForce 8800GTX GPU. Berjón et al. [3] describe a real-time implementation of a Gaussian-based optimized spatio-temporal nonparametric moving object detection strategy.
The proposed approach dynamically estimates the bandwidths of the kernels required to model the background, and the model itself is also selectively updated. The solution is implemented on a consumer-grade GPU with 16 stream multiprocessors and 1.5 GB RAM, coupled with a 4-core CPU clocked at 3.4 GHz with 16 GB RAM. They report smart cooperation between the computer/device's CPU and GPU, extensive usage of the texture mapping and filtering units, and high-quality detection rates. In [17], a GPU-aware implementation of an optical-flow-based moving object detection algorithm is proposed. Novel GPU-oriented computational approaches to widely-used techniques such as RANdom SAmple Consensus (RANSAC – e.g., [13]) and Region Growing (e.g., [37]) are described. The solution also solves image processing parallelization problems, due to divergent execution paths, by using compaction and sorting primitives, with a significant impact on performance. The authors


finally show that the GPU-based implementation of the target algorithm outperforms the FPGA-based implementation of the same algorithm. Recently, Kumar et al. [24] describe another moving object detection algorithm for high-resolution videos running in real-time on GPU. In the proposed research, several algorithms, namely video object detection, morphological operators and connected component labeling, are implemented on GPU, achieving a speed of 22.3 frames per second for high-definition videos. In particular, background modeling is addressed by means of a Gaussian Mixture Model (GMM).

7. pixHMOS_gpu: a novel GPU-aware histogram-based algorithm for moving object segmentation

In this Section, the pixHMOS_gpu algorithm is described in detail, along with its GPU-based implementation. The histogram-based algorithms described in Section 5.1, namely HB [26] and EHB [25], expose suitable characteristics for parallelization and good quality of the resulting background, as experimentally proved by their authors. However, both algorithms use the same threshold for all the pixels of the frame. Contrary to this approach, our proposed pixHMOS_gpu algorithm introduces the innovative idea of making use of a different adaptive threshold for each pixel. This leads to an improvement of the background image quality at the cost of a slight increase in the computational load, as we demonstrate in our experimental analysis provided in Section 8. We now describe how the pixHMOS_gpu algorithm works. First, let us focus on how the background color is determined. In classical approaches, the background color is estimated via the average of the pixel intensity values over time. In more detail, it should be noted that basic background estimation approaches use a simple statistical combination, like the average or the median, of the previous n frames. Such basic background estimation approaches raise many problems, partially solved by the subsequent algorithms.
Other basic background estimation algorithms are based on pixel histograms. This is a simple yet effective approach for modeling the background image. However, the main problem of such an approach is to define the global threshold by which the algorithm decides when to update the background image. Our pixHMOS_gpu algorithm dramatically improves the background updating by computing a threshold for each pixel of the input image according to an adaptive criterion. Thus, the background is updated pixel by pixel, only when it is necessary. Like in EHB [25], in the pixHMOS_gpu algorithm pixel changes are counted but, differently from EHB, local variance is taken into consideration in order to establish when the background image has to be updated. In particular, in this phase, we apply an adaptive approach: we assign lower weights to pixels that frequently change their intensity value (with respect to the background) and higher weights to pixels that do not change at a relevant rate, hence being detected as candidate background pixels. Indeed, the first class of pixels is likely to belong to moving objects rather than to the background, because their intensity value is changing, whereas the contrary holds for the second class of pixels. In addition to this, the pixel update threshold, as mentioned before, is modified dynamically. In the pixHMOS_gpu algorithm, for each pixel px,y of the current frame I, the following parameters are introduced:

• three histograms Hcx,y, such that c ∈ {Red, Green, Blue}, for the three RGB components – see Section 4;
• the number of Found Changes (FC), denoted by FCx,y, which is the same concept exploited by EHB [25] – see Section 5.1, i.e. the number of detected changes in the intensity of px,y;
• the number of Not Found Changes (NFC), denoted by NFCx,y, which is the number of times that the intensity of px,y does not change;
• a pair of thresholds, namely φx,y and ξx,y, applied to FCx,y and NFCx,y, respectively.
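The per-pixel parameters listed above can be grouped into a small record. The dataclass below is our own illustrative layout, not the paper's actual GPU memory layout (which is detailed in Fig. 5):

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class PixelState:
    """Per-pixel state of pixHMOS_gpu: three 256-bin RGB histograms,
    the FC and NFC counters, and the two per-pixel thresholds."""
    hist: np.ndarray = field(default_factory=lambda: np.zeros((3, 256), dtype=np.int32))
    fc: int = 0        # Found Changes counter, FCx,y
    nfc: int = 0       # Not Found Changes counter, NFCx,y
    phi: float = 0.0   # adaptive threshold on FC, phi_x,y
    xi: int = 0        # threshold on NFC, xi_x,y
```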

In particular, the meaning of thresholds φx,y and ξx,y is as follows. The background image is updated on the basis of the following criterion, applied to each pixel px,y. If FCx,y is greater than φx,y, then the background pixel is reconstructed from its histograms. More specifically, FCx,y is increased when a background pixel is considered changed, and thus it should be updated. On the other hand, when a pixel is part of a relevant object and it can be considered changed, the value of FCx,y is decreased in order to increase the time before the following background update. The background pixel is also reconstructed when NFCx,y is greater than ξx,y. Indeed, NFCx,y is used to correct small pixel color changes over a long period of observation: sometimes, the background image is reconstructed even for unchanged pixels, for instance for tracking slow and small light intensity modifications. To detect pixel intensity changes, for each pixel px,y in the current frame the difference vector between the current image and the background, denoted by Δx,y, is computed as follows:

Δx,y = [IRx,y − BRx,y, IGx,y − BGx,y, IBx,y − BBx,y]T    (12)

such that Icx,y denotes the intensity of pixel px,y for the channel c, with c ∈ {Red, Green, Blue}, and Bcx,y denotes the intensity of the corresponding background pixel for the channel c – see Section 4. We detect whether the intensity of pixel px,y of the current frame has changed with respect to that of the pixel pBx,y of the background image by comparing each component of Δx,y with a proper threshold that is different for each channel (i.e., for each vectorial component), denoted by τ, which is defined as follows:

τ = [τR, τG, τB]T    (13)

602

A. Cuzzocrea and E. Mumolo / Information Sciences 496 (2019) 592–612

such that: (i) τR is the threshold on the Red channel; (ii) τG is the threshold on the Green channel; (iii) τB is the threshold on the Blue channel. Therefore, we determine that the intensity of pixel px,y has changed through Eq. (14):

Δx,y > τ    (14)
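A minimal sketch of the change test of Eqs. (12)-(14) follows. Two points are our assumptions: we compare absolute per-channel differences (so that both darkening and brightening count as a change), and we flag a change when any channel exceeds its threshold, since the text leaves the any/all semantics of the vector comparison implicit.

```python
import numpy as np

def change_detected(pixel_rgb, background_rgb, tau):
    """Eqs. (12)-(14): build the difference vector Delta between the current
    pixel and the corresponding background pixel, then compare it
    channel-wise with the threshold vector tau = (tauR, tauG, tauB)."""
    delta = np.abs(np.asarray(pixel_rgb, dtype=np.int32)
                   - np.asarray(background_rgb, dtype=np.int32))   # Eq. (12)
    return bool(np.any(delta > np.asarray(tau)))                   # Eq. (14)
```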

The pixHMOS_gpu algorithm is reported in Algorithm 1. It makes use of several algorithms as sub-routines. Among these, two of them, namely updateFC and updateNFC, play a more relevant role (as they express the inherent nature of the pixHMOS_gpu approach), hence they are reported in Algorithms 2 and 3, respectively.

Algorithm 1 Algorithm pixHMOS_gpu.
Input: I, B, τ, ζ, α, ξ
Output: D, B
for (each px,y ∈ I) do
    FC′x,y ← 0;
    NFC′x,y ← 0;
    HRx,y, HGx,y, HBx,y ← updateHistograms(I);
    Dx,y ← Ix,y − Bx,y;
    Δx,y ← computeDelta(Dx,y);
    if (changeDetected(Δx,y, τ)) then
        FCx,y ← updateFC(τ, Dx,y, FC′x,y);
        βx,y ← computeBeta(px,y);
        φx,y ← computePhi(ζ, α, βx,y);
        if (FCx,y > φx,y) then
            Bx,y ← updateBackground(HRx,y, HGx,y, HBx,y);
        end if
    else
        NFCx,y ← updateNFC(τ, Dx,y, NFC′x,y);
        if (NFCx,y > ξx,y) then
            Bx,y ← updateBackground(HRx,y, HGx,y, HBx,y);
        end if
    end if
end for
return D, B;

Algorithm 2 Algorithm updateFC.
Input: τ, D, FC′x,y
Output: FCx,y
if (isForeground(px,y)) then
    if (Dx,y > τ) then
        FCx,y ← FC′x,y − 1;
    end if
else
    if (Dx,y > τ) then
        FCx,y ← FC′x,y + 1;
    end if
end if
return FCx,y;

In the following, we focus in greater detail on the pixHMOS_gpu algorithm's parameters. The parameter FCx,y models the number of recent changes of the pixel px,y. This counter is then used to evaluate whether a pixel value in the background needs to be updated or not. The rules used to update this counter are the core part of the algorithm. In more detail, FCx,y is increased when the difference Dx,y is over the threshold τ and the pixel is labeled as not belonging to a relevant object of the foreground. On the other hand, when a pixel is part of a relevant object and the difference is greater than the threshold, the value of FCx,y is decreased in order to increase the time before the following background update (see Algorithm 2). The threshold τ is typically around 100, according to several empirical evaluations.
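Algorithms 2 and 3 translate directly into Python. The sketch below keeps their exact update rules, with d standing for the per-pixel difference Dx,y and tau for the detection threshold τ:

```python
def update_fc(fc, d, tau, is_foreground):
    """Port of Algorithm 2 (updateFC): a change on a background pixel
    increments FC, while a change on a foreground (relevant-object) pixel
    decrements it, postponing the next background update."""
    if d > tau:
        return fc - 1 if is_foreground else fc + 1
    return fc

def update_nfc(nfc, d, tau):
    """Port of Algorithm 3 (updateNFC): count consecutive unchanged
    frames; any detected change resets the counter to zero."""
    return 0 if d > tau else nfc + 1
```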


Algorithm 3 Algorithm updateNFC.
Input: τ, D, NFC′x,y
Output: NFCx,y
if (Dx,y > τ) then
    NFCx,y ← 0;
else
    NFCx,y ← NFC′x,y + 1;
end if
return NFCx,y;

Fig. 5. Data structures introduced by the pixHMOS_gpu algorithm.

Fig. 6. Data management tasks in pixHMOS_gpu algorithm.


Fig. 7. Big multimedia data fusion method.

Fig. 8. Similarity analysis results.

Fig. 9. Computational time analysis results.


Fig. 10. Computational time analysis for pixHMOS_gpu algorithm.

Fig. 11. Quality analysis for pixHMOS_gpu algorithm.

As specifically regards the basic background image updating, the pixHMOS_gpu algorithm introduces the following rule:



reconstruct(px,y)    if FCx,y > φx,y
NULL                 if FCx,y ≤ φx,y    (15)

wherein: (i) reconstruct(◦) is a function that performs the reconstruction of pixels; (ii) φx,y is a threshold that is computed for each pixel using an adaptive criterion, as follows:

φx,y = bx,y − a,    (16)

where (i) a is a parameter that considers global image properties, and it is defined as follows:

a = NPC / NPT    (17)

such that: (i) NPC denotes the number of pixels in the image that have changed, and (ii) NPT denotes the total number of pixels in the image; and (ii) bx,y is a parameter that considers local image properties, i.e. properties local to the neighborhood of the current pixel,


Fig. 12. Quality analysis for HB algorithm.

Fig. 13. Quality analysis for EHB algorithm.

and it is defined as follows:

bx,y = NPx,y,C / NPx,y,T    (18)
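Taken together, Eqs. (16)-(18) can be sketched as follows. The changed_mask argument, the square neighborhood of fixed radius, and the omission of the ζ/α scaling used by computePhi in Algorithm 1 (which would map φ onto the range of the FC counter) are our simplifications:

```python
import numpy as np

def adaptive_phi(changed_mask, x, y, radius=1):
    """Adaptive per-pixel threshold of Eqs. (16)-(18).
    changed_mask : boolean (H, W) array marking pixels detected as changed
    a   = NPC / NPT   : fraction of changed pixels over the whole image
    b   = local ratio : same fraction over the neighborhood of (x, y)
    phi = b - a       : local changes raise the threshold, while global
                        changes lower it, forcing faster background updates
    """
    a = changed_mask.mean()                                   # Eq. (17)
    window = changed_mask[max(0, x - radius):x + radius + 1,
                          max(0, y - radius):y + radius + 1]
    b = window.mean()                                         # Eq. (18)
    return b - a                                              # Eq. (16)
```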

such that: (i) NPx,y,C denotes the number of pixels in the neighborhood of px,y that have changed, and (ii) NPx,y,T denotes the total number of pixels in the neighborhood of px,y. From Eq. (16), it should be noted that the contribution of local changes, which is modeled by bx,y, increments the value of the threshold φx,y, while the contribution of global changes, which is modeled by a, tends to decrease the value of φx,y, thus forcing a faster update of the background image. Therefore, φx,y is an adaptive threshold used to decide when to reconstruct the background image, thus allowing us to more accurately model regions inside the current scene that have different change rates. As regards the comparison with conceptually-related research efforts, the pixHMOS_gpu algorithm offers some improved features with respect to EHB [25]. First of all, the algorithm is capable of adapting the background to the gradual changes of light that happen at different hours and weather conditions during the day, as the histograms are continuously updated. Also, it is capable of adapting single parts of the background image taking into account the different dynamics of the changes


Fig. 14. Quality analysis for MoG algorithm.

Fig. 15. Quality analysis for LBP algorithm.

in different regions of the grabbed image. In addition to this, it is capable of adapting the background to sudden light changes, as when a light is turned on or when the sun appears from behind the clouds, choosing accordingly suitable model parameters. Moreover, one can expect a reduced number of I/O operations due to the reduced updates of the background image. Some other features are in common with EHB, such as the absence of a training phase and the fact that the algorithm works properly even when the first grabbed image already contains foreground elements. Finally, as mentioned, one critical aspect of the pixHMOS_gpu algorithm lies in the fact that it builds on the GPU computational framework. Here, we provide further details on the GPU-based implementation of our proposed algorithm. In our implementation, each acquired image is divided into 8×8 pixel blocks and, for each block, a pool of independent threads is instantiated. For each concurrent thread, several data structures are instantiated in the GPU's memory for each pixel px,y, namely: (i) the three histograms Hcx,y; (ii) the parameter FCx,y; (iii) the parameter NFCx,y. The details of these data structures are reported in Fig. 5. Each thread updates the model of a single pixel of the background. As the pixels are updated by independent threads, this approach does not require inter-thread communication to synchronize the thread operations. A schematic representation of the pixHMOS_gpu algorithm, with details on the data management tasks, is reported in Fig. 6.
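The 8×8 work decomposition can be mimicked on the CPU as follows. In the actual implementation each tile maps to a CUDA block and each pixel to a thread; the sequential generator below is only a structural sketch:

```python
import numpy as np

def iter_blocks(frame, block=8):
    """Yield (row, col, tile) for each 8x8 tile of the frame, mirroring the
    pixHMOS_gpu decomposition: one GPU block per tile, one thread per pixel,
    with no inter-thread synchronization since each thread owns its pixel."""
    h, w = frame.shape[:2]
    for r in range(0, h, block):
        for c in range(0, w, block):
            yield r, c, frame[r:r + block, c:c + block]
```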


Fig. 16. F-Measures for all the comparison techniques.

Fig. 17. Real-life case studies on top of classical CPU-based algorithm [49].


Fig. 18. Real-life case studies on top of pixHMOS_gpu algorithm.

8. Experimental assessment and analysis

In this Section, we provide the experimental assessment and analysis of the pixHMOS_gpu algorithm in terms of computational efficiency, speed-up and quality, along with some interesting case studies. In our experimental assessment, we acquired an in-lab video data set using ultra-high-definition cameras from eLine Technology. The target data set is composed of four videos. The videos have been recorded in some city streets and at the entrance of a train station, under different conditions. The data set was acquired at 2,560 × 1,440 resolution with a rate of 15 frames per second. After that, the video data set was re-sized at different resolutions using bi-cubic interpolation for performing experiments. Videos taken from the cameras were also time-synchronized. The cameras looked at a car parking according to each camera's point of view. 10 sets of 5 videos have been acquired, each at different time instants. On the basis of what stated previously, the scheme of the multi-view big multimedia data fusion method is depicted in Fig. 7. Here, the different camera views are projected onto a common plane using a homography matrix. By assuming that the cameras are fixed and with overlapping views, for each view a homography matrix is computed once, in the initial camera calibration phase. It is worth estimating the amount of data flow in different points of the scheme. At the input of the fusion module we have L high-definition cameras. If we assume a resolution of 2,560 × 1,440 sampled at a 15 f/s rate, we thus have a total flow of about 210 GB/s. After the fusion, we will have a video signal at a 52 GB/s rate. The video signal is then processed by the GTX1080 GPU. Implementation-wise, we adopted a server host equipped with an Intel Core 2 Quad Q9550 CPU running at 2.83 GHz, which interfaces to the cameras, on one side, and to the GPU, on the other side.
In all the experiments, we adopted the same hardware for all the comparison techniques, in order to obtain a fair experimental comparison. In the experimental analysis, we compared the pixHMOS_gpu algorithm with the following well-known moving object detection algorithms: (i) HB [27]; (ii) EHB [25]; (iii) Mixture of Gaussians (MoG) [23]; (iv) Linear Binary Pattern (LBP) [1]. Since these algorithms do not originally perform big multimedia data fusion, in order to compare them with our solution,


Fig. 19. Computational Time Ratio Between the pixHMOS_gpu Algorithm and the GPU-based EHB.

moving objects detected from each camera are fused by overlapping, according to the scheme reported in Fig. 7. Therefore, we are able to obtain one video from the multiple cameras, hence all the algorithms can be compared. As regards experimental metrics, we considered similarity measures between the background computed by the target algorithm and the real background, as well as computational time. As described in Section 4, similarity is defined according to Eq. (5). Fig. 8 reports the similarity analysis results for the comparison algorithms, averaged over all the acquired videos. It shows that the proposed pixHMOS_gpu algorithm gives better similarity results than the efficient histogram-based versions, and provides the best results among the considered algorithms. Fig. 9 reports the computational time analysis results for the comparison algorithms. It shows that the fastest algorithm is EHB, which requires about 80 ms for computing the background. After the comparison phase, we further investigated the performance of the pixHMOS_gpu algorithm, by stressing other experimental parameters. First, we note that the computational speed-up of the GTX1080 GPU with respect to an Intel Xeon 6-core running at 2.66 GHz is about 28 for all the considered image resolutions. As regards the absolute computational time, Fig. 10 reports the computational time analysis of the pixHMOS_gpu algorithm for different frame sizes, on different hardware platforms, namely: (i) Intel Core 2 Duo running at 2.66 GHz; (ii) Intel Xeon 6 cores running at 2.66 GHz. As shown in Fig. 10, our GPU-based solution outperforms traditional CPU-based implementations. In order to assess the quality of the algorithm, the values of Recall, Precision and F-measure versus the number of frames have been considered. Results are reported in Fig. 11. As shown in Fig. 11, our proposed algorithm exposes a good behavior with respect to quality measures, beyond the performance measures.
The analysis above has been further integrated by computing the same experimental pattern reported in Fig. 11 for the pixHMOS_gpu algorithm as related to the other comparison algorithms, namely: (i) HB (see Fig. 12); (ii) EHB (see Fig. 13); (iii) MoG (see Fig. 14); (iv) LBP (see Fig. 15), respectively. Further, we compared the F-measure metrics for all the comparison techniques, still ranging over the number of frames (see Fig. 16). From the analysis of Figs. 11–16, it follows that our proposed algorithm outperforms the comparison techniques for both metrics. In order to complement our experimental assessment, we devised some simple yet effective case studies showing the real-life results of the proposed pixHMOS_gpu algorithm against a classical CPU-based implementation. Fig. 17 shows, from the top, the grabbed image (top), the reconstructed background (middle) and the difference image (bottom), for two real-life scenes recorded by the cameras (one for each column: people outside, and people at the train station). In this case, a


CPU-based implementation has been performed, by using the classical approach [49], which estimates the background from a simple average of the previous frames. Fig. 18, instead, shows the same experimental pattern when the proposed pixHMOS_gpu algorithm has been used as the base algorithm. From the analysis of Figs. 17 and 18, it is evident that the pixHMOS_gpu algorithm not only provides the same quality results as the classical CPU-based algorithm [49], but, in addition to this, it brings further benefits, because the reconstructed backgrounds are cleaner (middle) and the difference images (bottom) allow a much more precise determination of moving objects. Finally, we evaluated the speed-up of the pixHMOS_gpu algorithm by studying the ratio between the computational time required by our algorithm and the computational time required by the GPU-based implementation of the EHB algorithm [25], at different image resolutions. Fig. 19 reports the ratio between the time of pixHMOS_gpu and the time of the GPU-based EHB. As shown in Fig. 19, the pixHMOS_gpu algorithm requires only from 1.4 to about 1.8 times the time of EHB for increasing image resolutions. Moreover, it scales very well as the number of pixels increases.

9. Concluding remarks and future work

Inspired by the emerging big-data-based IoT applications trend, in this paper we have focused on the specific application context represented by the problem of supporting moving object segmentation over images originated in the context of big multimedia data, and we proposed the innovative background maintenance algorithm pixHMOS_gpu. pixHMOS_gpu adopts a pixel-oriented approach and is based on powerful GPU platforms. As we demonstrated via extensive experimental assessment and analysis, pixHMOS_gpu allows us to achieve high performance, hence making the computational gap of big-data-based IoT applications smaller, by limiting the complexity of the main algorithm.
As video camera technology evolves and provides more powerful devices, the resolution of acquired images keeps increasing, so that image details can be captured with better definition. Moreover, higher resolutions allow zooming into a region of an image without sacrificing spatial resolution. It is therefore worth noting that the proposed pixHMOS_gpu algorithm is well-suited to high-resolution images, as it presents a linear speedup as the number of pixels increases. In addition, the pixHMOS_gpu algorithm can manage high frame rates in real time, and it is suited for video tracking of rapidly moving objects. At the current state of the art, full HD videos can be managed in real time using the current generation of GPUs. Since the pixHMOS_gpu algorithm depends only slightly on the particular GPU architecture adopted, it should operate properly on future-generation GPUs, thus achieving better performance in both quality and efficiency. Other interesting extensions of the overall framework concern: (i) studying how fragmentation techniques can be integrated so as to improve the efficiency of our framework; (ii) moving towards the Big Data philosophy (e.g., [5,10,28,46–48]), as moving objects naturally generate big data sets; (iii) exploring privacy-preservation issues (e.g., [8,9]), which are now becoming more and more critical for image processing research (e.g., [43]); (iv) exploring adaptiveness paradigms, already proposed in different contexts (e.g., [6]), as these may improve the big multimedia data management phase.

References

[1] A. Athira, M. Vijayan, R. Mohan, Moving object detection using local binary pattern and Gaussian background model, Lect. Notes Netw. Syst. 11 (2017) 367–376.
[2] S.R. Balaji, S. Karthikeyan, A survey on moving object tracking using image processing, in: Proceedings of the 11th International Conference on Intelligent Systems and Control (ISCO), 2017, pp. 469–474.
[3] D. Berjón, C. Cuevas, F. Morán, N. García, GPU-based implementation of an optimized nonparametric background modeling for real-time moving object detection, IEEE Trans. Consumer Electron. 59 (2) (2013).
[4] T. Bouwmans, F. Porikli, B. Hoferlin, A. Vacavant, Background Modeling and Foreground Detection for Video Surveillance, Chapman and Hall/CRC, 2014.
[5] P. Braun, J.J. Cameron, A. Cuzzocrea, F. Jiang, C.K. Leung, Effectively and efficiently mining frequent patterns from dense graph streams on disk, in: Proceedings of KES, Procedia Computer Science 35, Elsevier, 2014, pp. 338–347.
[6] M. Cannataro, A. Cuzzocrea, C. Mastroianni, R. Ortale, A. Pugliese, Modeling adaptive hypermedia with an object-oriented approach and XML, in: Proceedings of the Second International Workshop on Web Dynamics, 2002.
[7] R. Cucchiara, C. Grana, M. Piccardi, A. Prati, Detecting moving objects, ghosts, and shadows in video streams, IEEE Trans. Pattern Anal. Mach. Intell. (2003) 1337–1342.
[8] A. Cuzzocrea, V. Russo, D. Saccà, A robust sampling-based framework for privacy preserving OLAP, in: Proceedings of the 10th International Conference on Data Warehousing and Knowledge Discovery, DaWaK, Turin, Italy, 2008, pp. 97–114.
[9] A. Cuzzocrea, D. Saccà, Balancing accuracy and privacy of OLAP aggregations on data cubes, in: Proceedings of the 13th International Workshop on Data Warehousing and OLAP, DOLAP, Toronto, Ontario, Canada, 2010, pp. 93–98.
[10] A. Cuzzocrea, I. Song, Big graph analytics: the state of the art and future research agenda, in: Proceedings of the 17th International Workshop on Data Warehousing and OLAP, DOLAP 2014, Shanghai, China, 2014, pp. 99–101.
[11] C. Eaton, D. DeRoos, T. Deutsch, G. Lapis, P. Zikopoulos, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill, New York, NY, USA, 2012.
[12] A. Elgammal, R. Duraiswami, D. Harwood, L.S. Davis, Background and foreground modeling using nonparametric kernel density estimation for visual surveillance, Proc. IEEE (2002) 1151–1163.
[13] M.A. Fischler, R.C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM 24 (6) (1981) 381–395.
[14] K. Fu, J.K. Mui, A survey on image segmentation, Pattern Recognit. 13 (1) (1981) 3–16.
[15] S. Fukui, Y. Iwahori, R.J. Woodham, GPU based extraction of moving objects without shadows under intensity changes, in: Proceedings of the IEEE Congress on Evolutionary Computation, CEC, Hong Kong, China, 2008, pp. 4165–4172.


[16] J. Gallego, M. Pardás, G. Haro, Bayesian foreground segmentation and tracking using pixel-wise background model and region based foreground model, in: Proceedings of the 16th IEEE International Conference on Image Processing, 2009, pp. 3205–3208.
[17] J. Gómez-Luna, H. Endt, W. Stechele, J.M. González-Linares, J.I. Benavides, N. Guil, Egomotion compensation and moving objects detection algorithm on GPU, in: Proceedings of the Conference Applications, Tools and Techniques on the Road to Exascale Computing, ParCo, Ghent, Belgium, 2011, pp. 183–190.
[18] M. Gong, L. Cheng, Real-time foreground segmentation on GPUs using local online learning and global graph cut optimization, in: Proceedings of the 19th International Conference on Pattern Recognition (ICPR), Tampa, Florida, USA, 2008, pp. 1–4.
[19] I. Gorton, D.K. Gracio, Data-Intensive Computing: Architectures, Algorithms, and Applications, Cambridge University Press, New York, NY, USA, 2012.
[20] A. Griesser, S.D. Roeck, A. Neubeck, L.V. Gool, GPU-based foreground-background segmentation using an extended collinearity criterion, in: Proceedings of Vision, Modeling, and Visualization (VMV), 2005, pp. 319–326.
[21] M. Heikkilä, M. Pietikäinen, A texture-based method for modeling the background and detecting moving objects, IEEE Trans. Pattern Anal. Mach. Intell. (2006) 657–662.
[22] S. Huwer, H. Niemann, Adaptive change detection for real-time surveillance applications, in: Proceedings of the Third IEEE International Workshop on Visual Surveillance, 2000, pp. 37–46.
[23] P. KaewTraKulPong, R. Bowden, An improved adaptive background mixture model for real-time tracking with shadow detection, in: Proceedings of the 2nd European Workshop on Advanced Video-Based Surveillance Systems, 2001, pp. 1–5.
[24] P. Kumar, A. Singhal, S. Mehta, A. Mittal, Real-time moving object detection algorithm on high-resolution videos using GPUs, J. Real Time Image Process. 11 (1) (2016) 93–109.
[25] C.-M. Kuo, W.-H. Chang, S.-B. Wang, C.-S. Liu, An efficient histogram-based method for background modeling, in: Proceedings of the International Conference on Innovative Computing, Information and Control, 2009, pp. 480–483.
[26] A. Lai, H.S. Yoon, G. Lee, Robust background extraction scheme using histogram-wise for real-time tracking in urban traffic video, in: Proceedings of the 8th IEEE International Conference on Computer and Information Technology, CIT, Sydney, Australia, 2008, pp. 845–850.
[27] A.-N. Lai, H. Yoon, G. Lee, Robust background extraction scheme using histogram-wise for real-time tracking in urban traffic video, in: Proceedings of the 8th IEEE International Conference on Computer and Information Technology, CIT, 2008, pp. 845–850.
[28] K. Li, H. Jiang, L.T. Yang, A. Cuzzocrea (Eds.), Big Data: Algorithms, Analytics, and Applications, Chapman and Hall/CRC, 2015.
[29] L. Li, W. Huang, I.Y.H. Gu, Q. Tian, Statistical modeling of complex backgrounds for foreground object detection, IEEE Trans. Image Process. (2004) 1459–1472.
[30] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, A.H. Byers, Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.
[31] S. Momcilovic, L. Sousa, Parallel advanced video coding: motion estimation on multi-cores, Scalable Comput. Pract. Exper. 9 (3) (2008).
[32] D. Moschini, A. Fusiello, Tracking human motion with multiple cameras using an articulated model, in: Proceedings of the 4th International Conference on Computer Vision/Computer Graphics Collaboration Techniques, MIRAGE, Rocquencourt, France, 2009, pp. 1–12.
[33] S.S. Nath, G. Mishra, J. Kar, S. Chakraborty, N. Dey, A survey of image classification methods and techniques, in: Proceedings of the International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT), 2014, pp. 554–557.
[34] J.F. Ohmer, P.G. Perry, N.J. Redding, GPU-accelerated background generation algorithm with low latency, in: Proceedings of the International Conference on Digital Image Computing: Techniques and Applications, DICTA, Adelaide, Australia, 2007, pp. 547–554.
[35] V. Pham, P. Vo, H.T. Vu, H.B. Le, GPU implementation of extended Gaussian mixture model for background subtraction, in: Proceedings of the IEEE RIVF International Conference on Computing & Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), Hanoi, Vietnam, 1–4 November 2010, pp. 1–4.
[36] M. Poremba, Y. Xie, M. Wolf, Accelerating adaptive background subtraction with GPU and CBEA architecture, in: Proceedings of the IEEE Workshop on Signal Processing Systems, 2010, pp. 305–310.
[37] W.K. Pratt, Digital Image Processing: PIKS Inside, 3rd ed., John Wiley & Sons, Inc., New York, NY, USA, 2001.
[38] R.J. Radke, S. Andra, O. Al-Kofahi, B. Roysam, Image change detection algorithms: a systematic survey, IEEE Trans. Image Process. 14 (3) (2005) 294–307.
[39] R. Ranjan, D. Thakker, A. Haller, R. Buyya, A note on exploration of IoT generated big data using semantics, Future Gener. Comput. Syst. 76 (2017) 495–498.
[40] A. Sanin, C. Sanderson, B.C. Lovell, Shadow detection: a survey and comparative evaluation of recent methods, Pattern Recognit. 45 (4) (2012) 1684–1695.
[41] T.T. Santos, C.H. Morimoto, Multiple camera people detection and tracking using support integration, Pattern Recognit. Lett. 32 (1) (2011) 47–55.
[42] J. Scott, M.A. Pusateri, D. Cornish, Kalman filter based video background estimation, in: Proceedings of the IEEE Applied Imagery Pattern Recognition Workshop, AIPR, Washington, DC, USA, 2009, pp. 1–7.
[43] A.C. Squicciarini, D. Lin, S. Sundareswaran, J. Wede, Privacy policy inference of user-uploaded images on content sharing sites, IEEE Trans. Knowl. Data Eng. 27 (1) (2015) 193–206.
[44] C. Stauffer, W.E.L. Grimson, Learning patterns of activity using real-time tracking, IEEE Trans. Pattern Anal. Mach. Intell. 22 (8) (2000) 747–757.
[45] L. Wang, N.H.C. Yung, Extraction of moving objects from their background based on multiple adaptive thresholds and boundary evaluation, IEEE Trans. Intell. Transp. Syst. (2010) 40–51.
[46] Z. Wu, W. Yin, J. Cao, G. Xu, A. Cuzzocrea, Community detection in multi-relational social networks, in: Proceedings of the 14th International Conference on Web Information Systems Engineering, WISE, Nanjing, China, Part II, 2013, pp. 43–56.
[47] C. Yang, J. Liu, C. Hsu, W. Chou, On improvement of cloud virtual machine availability with virtualization fault tolerance mechanism, J. Supercomput. 69 (3) (2014) 1103–1122.
[48] B. Yu, A. Cuzzocrea, D.H. Jeong, S. Maydebura, On managing very large sensor-network data using Bigtable, in: Proceedings of the 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid, Ottawa, Canada, 2012, pp. 918–922.
[49] Q. Yu, G.G. Medioni, A GPU-based implementation of motion detection from a moving platform, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR Workshops, Anchorage, AK, USA, 2008, pp. 1–6.
[50] Q. Zhang, L.T. Yang, Z. Chen, P. Li, High-order possibilistic c-means algorithms based on tensor decompositions for big data in IoT, Inf. Fusion 39 (2018) 72–80.