PhD Forum: WiseNET - smart camera network interacting with a semantic model

Roberto Marroquin, Julien Dubois, Christophe Nicolle
Laboratory Le2i - UMR 6306, CNRS, University of Burgundy Franche-Comté, France
[email protected], [email protected], [email protected]

ABSTRACT
This paper presents an innovative concept for a distributed system that combines a smart camera network with semantic reasoning. The proposed system is context sensitive: it combines the information extracted by the smart cameras with logic rules and knowledge about what the cameras observe, the building, and the events that may occur. The proposed system is a justification for the use of smart cameras, and it can improve classical visual sensor networks (VSN) and enhance the standard computer vision approach. The main application of our system is smart building management, where we specifically focus on increasing the services offered to building users.

CCS Concepts
•Computer systems organization → Sensor networks; •Information systems → Information integration;

Keywords
Building information modeling, event detection, knowledge engineering, semantic gap, visual sensor network

1. INTRODUCTION
Nowadays, visual sensor networks (VSN) have become part of our daily life [9]. Many efforts have been devoted to dealing with the huge amount of data produced by VSN. Based on our experience, we have identified three main problems of classical VSN. First, the problem of selecting/filtering relevant information from the mass of data. Second, the problem of integrating the information coming from the different nodes of the network, i.e., linking the different pieces of information to take a decision. Finally, the problem of protecting the privacy of individuals while extracting useful information from an image/video.

This paper presents an innovative approach to overcoming these problems by exploiting the data with smart cameras, in charge of extracting significant information from the scene, and by adding knowledge about the context, i.e., semantic information about what the camera observes, the building, and the events that may occur. Semantic information coming from different nodes can easily be integrated in an ontology. Our approach differs from standard computer vision, which deals with algorithm improvements [4, 10] and signal processing problems [6], by dealing with a meaning problem in computer vision [7]: we try to improve and understand what the camera "observes" by adding contextual semantic information.

2. WISENET SYSTEM
The WiseNET (Wise Network) system consists of a smart camera network connected to a semantic model. The communication between the smart camera network and the semantic model is bidirectional, i.e., the cameras can send information either when the model asks for it or whenever new data becomes available. The semantic model is described and defined by means of an ontology, which allows us to express the information in our system and to take decisions according to combinations of the different pieces of information [3, 5].
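To make the integration of semantic information from different camera nodes concrete, the following minimal sketch stores camera messages as subject-predicate-object triples and combines them with a query, mimicking on a very small scale what an ontology and SPARQL would do in the actual system. All identifiers (`obs1`, `camera3`, `hall_A`, etc.) are hypothetical, and a real deployment would use an OWL ontology rather than a plain Python set:

```python
# Sketch only: each camera message is stored as (subject, predicate, object)
# triples, mimicking how observations from different nodes are merged into
# a single semantic model. Names are illustrative, not from the paper.
triples = set()

def add_observation(obs_id, camera, feature, room, time):
    triples.add((obs_id, "type", feature))
    triples.add((obs_id, "detectedBy", camera))
    triples.add((obs_id, "inRoom", room))
    triples.add((obs_id, "atTime", time))

add_observation("obs1", "camera3", "PersonDetection", "hall_A", "10:15:00")
add_observation("obs2", "camera7", "PersonDetection", "hall_A", "10:15:02")

def cameras_seeing(feature, room):
    """Combine information from different nodes: which cameras reported
    the given feature in the same room?"""
    obs = {s for (s, p, o) in triples if p == "type" and o == feature}
    obs &= {s for (s, p, o) in triples if p == "inRoom" and o == room}
    return sorted(o for (s, p, o) in triples
                  if s in obs and p == "detectedBy")

print(cameras_seeing("PersonDetection", "hall_A"))  # ['camera3', 'camera7']
```

The point of the sketch is that, once messages share a common vocabulary, cross-node queries become simple set operations over the merged knowledge.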
2.1 Smart Camera Network
For this project, the Raspberry Pi 3 was chosen for its low cost and high processing performance, which allow us to implement, and execute in real time, complex algorithms such as fall detection [6], person detection [10], face detection, color histograms, etc. Some specifications of the Raspberry Pi 3: CPU: 1.2 GHz quad-core ARM Cortex-A53; GPU: Broadcom VideoCore IV; RAM: 1 GB at 900 MHz. The network topology and the interaction between camera nodes are being studied.
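As an illustration of the kind of lightweight feature extraction such an embedded node could run, the following sketch flags motion by frame differencing between consecutive grayscale frames. The function name and thresholds are hypothetical; the actual system relies on more elaborate detectors such as those cited above ([6], [10]):

```python
import numpy as np

def detect_motion(prev_frame, frame, threshold=25, min_changed=0.01):
    """Flag motion when a sufficient fraction of pixels changes between
    two consecutive 8-bit grayscale frames (simple frame differencing)."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed_fraction = (diff > threshold).mean()
    return changed_fraction > min_changed

# Two synthetic 160x120 grayscale frames: a bright block "appears".
a = np.zeros((120, 160), dtype=np.uint8)
b = a.copy()
b[40:80, 60:100] = 200

print(detect_motion(a, b))  # motion detected
print(detect_motion(a, a))  # no motion
```

Casting to a signed type before subtraction avoids the unsigned wrap-around that would otherwise corrupt the difference image.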
ICDSC '16, September 12-15, 2016, Paris, France
© 2016 ACM. ISBN 978-1-4503-4786-0/16/09. $15.00
DOI: http://dx.doi.org/10.1145/2967413.2974036

2.2 Semantic Model
The semantic model is articulated in three sections: sensor, environment and application. All the sections are bilaterally connected by properties and relations. The sensor section consists of a set of concepts and relationships concerning the smart camera, the image processing algorithms and their results [1]. It is in charge of describing the sensor output and giving a semantic meaning to what the smart cameras observe, a problem known as the semantic gap [7]. The environment section is composed of a semantic vocabulary regarding the building information model (BIM) [2], i.e., the structure of the building (number of floors, rooms, halls, etc.) and the different elements that can be found in a space (doors, windows, walls, furniture, etc.). Finally, the application section comprises a set of concepts and relationships concerning the different events that may occur in a building, and a set of logic rules defining the actions and decisions to take when these events occur [5].

Figure 1: Interaction between the smart camera and the semantic model.

2.3 WiseNET Behavior
The system behavior (Fig. 1) is articulated in four steps:

Feature Extraction: the smart camera detects features of interest such as motion, faces, persons [10], etc.

Send to Model: the camera sends a message (no images) stating the presence of a specific feature, the time, and the position of the feature in the field of view (FOV). The message is shaped according to the vocabulary defined in the semantic model. This step populates the semantic model with the data sent by the camera and allows the ontology to classify the data.

Ontological Reasoning: the semantic model combines all the messages received from the cameras and, based on the contextual knowledge and logic rules, it can quickly infer new knowledge and/or take decisions. The decisions fall into three categories. First, rules for event identification, such as profiling the behavior of a person, incorrect use of building elements, the presence of somebody in a restricted area, an abandoned object [5], etc. Second, rules to alert users about the occurrence of a certain event, such as alerting the closest person to help somebody who fell, calling security if somebody is in a restricted area or if an object has been abandoned, alerting that somebody is stuck in the elevator, etc. Finally, rules to improve the visualization of features, i.e., changing the image processing algorithm according to the features to be detected, or to take into account environmental changes such as light, events, whether a door is closed, the position of a piece of furniture, etc.

Send to Camera: according to the inference process, the semantic model can send two types of messages. First, it can ask the camera to change the image processing algorithm to adapt to its context or to detect a specific feature of interest. Second, it can ask the camera to search, in a specific part of the FOV, for a certain feature of interest. This can be used to complete missing knowledge or to confirm inferred knowledge.
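The event-identification and alerting rules of the reasoning step can be illustrated with a minimal rule-engine sketch. The rule conditions, room names and action labels below are hypothetical; in the actual system these are expressed as ontology axioms and logic rules rather than Python conditionals:

```python
# Hypothetical sketch of decision rules applied by the semantic model:
# each rule maps combined observations and context to an action.

RESTRICTED = {"server_room"}  # assumed restricted areas (illustrative)

def infer_actions(observations):
    """observations: list of dicts like
    {"feature": "PersonDetection", "room": "server_room", "time": "22:40"}"""
    actions = []
    for obs in observations:
        # Rule: person detected in a restricted area -> call security.
        if obs["feature"] == "PersonDetection" and obs["room"] in RESTRICTED:
            actions.append(("call_security", obs["room"]))
        # Rule: fall detected -> alert the closest person.
        if obs["feature"] == "FallDetection":
            actions.append(("alert_closest_person", obs["room"]))
    return actions

observations = [
    {"feature": "PersonDetection", "room": "hall_A", "time": "10:15"},
    {"feature": "PersonDetection", "room": "server_room", "time": "22:40"},
    {"feature": "FallDetection", "room": "hall_A", "time": "22:41"},
]
print(infer_actions(observations))
# [('call_security', 'server_room'), ('alert_closest_person', 'hall_A')]
```

Expressing such rules in the ontology instead of in code is what lets the model combine them with building knowledge (BIM) and with observations from several cameras at once.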
3. CONCLUSIONS AND FUTURE WORK
The WiseNET system may improve classical VSN by combining pertinent information, extracted by the cameras, with logic rules and knowledge about the context. Furthermore, the semantic model can be used to enhance standard computer vision by adding rules that adapt to the environment in order to improve the visualization of features. In addition, the WiseNET system overcomes the privacy/utility trade-off present in classical VSN [8, 9] by sending not images but only pertinent information from the scene. The main goal of our system is to increase the number of services for building users in order to improve their daily activities, their welfare and their safety. As future work, a database is being constructed to validate our system, focused on two design cases: person tracking and surveillance of restricted areas.
4. REFERENCES
[1] M. Compton et al. The SSN ontology of the W3C semantic sensor network incubator group. Web Semantics: Science, Services and Agents on the World Wide Web, 17:25–32, 2012.
[2] M. Farias et al. COBieOWL, an OWL ontology based on COBie standard. In OTM Confederated International Conferences "On the Move to Meaningful Internet Systems", pages 361–377. Springer, 2015.
[3] S. R. Fiorini and M. Abel. A review on knowledge-based computer vision. 2010.
[4] R. Mosqueron et al. Smart camera based on embedded HW/SW coprocessor. EURASIP Journal on Embedded Systems, 2008:3, 2008.
[5] J. C. SanMiguel et al. An ontology for event detection and its application in surveillance video. In Advanced Video and Signal Based Surveillance (AVSS), 2009, pages 220–225. IEEE, 2009.
[6] B. Senouci et al. Fast prototyping of a SoC-based smart-camera: a real-time fall detection case study. Journal of Real-Time Image Processing, pages 1–14, 2014.
[7] C. Town. Ontological inference for image and video analysis. Machine Vision and Applications, 17(2):94–115, 2006.
[8] H. Vagts and A. Bauer. Privacy-aware object representation for surveillance systems. In Advanced Video and Signal Based Surveillance (AVSS), 2010, pages 601–608. IEEE, 2010.
[9] T. Winkler and B. Rinner. Security and privacy protection in visual sensor networks: A survey. ACM Computing Surveys (CSUR), 47(1):2, 2014.
[10] S.-I. Yu et al. Harry Potter's Marauder's Map: Localizing and tracking multiple persons-of-interest by nonnegative discretization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3714–3720, 2013.