Neurocomputing 126 (2014) 116–117
Guest editorial
Online data processing
In contrast to batch learning, where data are assumed to be drawn from a stationary distribution and are available ahead of the training phase, in online learning the data arrive over time and may be non-stationary. The data may change so drastically that the model learned so far is hardly applicable to data arriving later. Online processing of data embraces two issues: (1) online and adaptive adjustment of the system's parameters and (2) continuous evolution of the system's structure. These issues are important for dealing with real-world data streams because of the following scenarios: (1) the dynamics of the system change; (2) the operating conditions of the system change and new ones may emerge; and (3) the states of the system change and their number may increase or decrease dynamically over time. Such problems challenge the task of building reliable intelligent systems that operate in dynamic environments where input needs to be processed in real time and on the fly, ideally with minimal use of past data. Therefore, online processing of data is relevant when memory does not suffice to handle all data in a one-shot experiment. It is also relevant when the process generating the data is non-stationary and characterized by various types of drift (gradual, sudden, cyclic).

In online processing, data are segmented into batches and processed sequentially in one pass. In the extreme case, data are processed sample by sample as they arrive over time. The aim in both cases is to build models incrementally such that not only are their parameters tuned but their structure is also adapted. A key issue during adaptation is to find an appropriate strategy for updating models so as to achieve a feasible tradeoff between plasticity and stability and to prevent catastrophic forgetting. This tradeoff is crucial in order to achieve convergence of the models on the one hand and sufficient flexibility (e.g.
to properly resolve drifts in data streams) on the other hand.

Online processing of high-speed, non-stationary data streams is highly relevant in various fields such as finance, the internet, security, smart environments, industrial processes and robotics. Its applications encompass tasks such as monitoring, classification, diagnosis, prediction, forecasting and clustering. In recent years, there has been an ever-growing interest in, and demand for, self-adaptive autonomous systems operating online, capable of sequentially processing massively large, continuous streams of data in an evolving setting.

0925-2312/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.neucom.2013.05.008

The present special issue of “Neurocomputing” aims at shedding light on new advances and future avenues in online processing of data, with a particular focus on the design of online evolving systems, algorithms and methods dedicated to online processing of data. Reflecting recent developments in the area of online learning, the special issue offers a broad spectrum of contributions covering various aspects. In all, seven contributions were retained after a thorough review process. A short presentation of each paper follows.

In Paper 1, “Dynamic supervised classification method for online monitoring in non-stationary environments” by L. Hartert and M. M. Sayed-Mouchaweh, a classification method for online monitoring of non-stationary processes is proposed. This method involves an incremental algorithm to track the gradual change of data from different classes. The change is detected using a double-check criterion consisting of two thresholds. If the first criterion is satisfied, the data patterns are placed in an evolution block; if the second criterion is also fulfilled, the adaptation of the model (classes) is triggered.

Paper 2, “Video surveillance; Human recognition; Incremental Support Vector Machine; On-line multiclass classification” by Y. Lu et al., introduces an online human recognition system that is able to classify persons adaptively using an incremental support vector machine (ISVM) classifier. The ISVM algorithm is trained periodically on batches of images so as to update the parameters of the classifier incrementally. The support vectors computed in each step are reused in the following training step in a recursive manner.

Paper 3, “Online Fuzzy Medoid Based Clustering Algorithms” by N. Labroche, presents two versions of an online fuzzy clustering algorithm based on medoids.
These algorithms are equipped, on the one hand, with mechanisms to deal with outliers and overlapping clusters and, on the other hand, with a decay mechanism to adapt more effectively to changes over time in data streams.

Paper 4, “Online Fault Detection of a Mobile Robot with a Parallelized Particle Filter” by M. Zając, presents a particle filtering-based approach combined with the negative log-likelihood test to detect faulty behavior of mobile robots. For efficiency, parallelism is implemented. The goal is not only to reduce the execution time but also to improve performance in applications where bounded execution time is a key constraint for online processing. Various options related to particle sampling are investigated in the paper.

Paper 5, “Online Variational Learning of Generalized Dirichlet Mixture Models with Feature Selection” by W. Fan and N. Bouguila, proposes a statistical framework for simultaneous online
clustering and feature selection using a finite generalized Dirichlet mixture model. The proposed framework makes it possible to control overfitting by dynamically adjusting the mixture model's parameters, the number of components and the feature weights.

Paper 6, “Border Pairs Method—constructive MLP learning classification algorithm” by B. Ploj et al., presents a constructive learning algorithm for the multilayer perceptron (MLP) that implements online learning. In this approach, border pairs of data are used to tune the boundaries between the classes by training the network offline. This, however, is done only when patterns are misclassified during training. Mechanisms of forgetting (unlearning) are used to adapt to the online learning setting.

Paper 7, “Adaptive Brain Emotional Decayed Learning for Online Prediction of Geomagnetic Activity Indices” by E. Lotfi and M.-R. Akbarzadeh-T., proposes an algorithm for adaptive brain-inspired emotional decayed learning. Relying on the standard Amygdala-OFC model, the paper derives a classification method by recasting an originally reinforcement-learning-based method as a classification problem. The approach does not operate online but can be adapted to fit the context of online processing.
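As a purely illustrative aside, the one-pass, sample-by-sample processing and the decay (forgetting) mechanisms mentioned above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not taken from any paper in this issue: the class `DecayedMean`, the parameter `decay`, the helper `run` and the synthetic stream are all hypothetical. It estimates the mean of a stream with exponential forgetting, which makes the plasticity-stability tradeoff visible on a sudden drift.

```python
from dataclasses import dataclass


@dataclass
class DecayedMean:
    """One-pass estimate of a stream's mean with exponential forgetting.

    `decay` is a hypothetical parameter: values near 1 favour stability,
    smaller values favour plasticity (faster adaptation to drift).
    """
    decay: float = 0.9
    weight: float = 0.0  # decayed count of the samples seen so far
    mean: float = 0.0    # current estimate

    def update(self, x: float) -> float:
        # Down-weight the past, then fold in the new sample.
        # The sample itself is never stored (one-pass processing).
        self.weight = self.decay * self.weight + 1.0
        self.mean += (x - self.mean) / self.weight
        return self.mean


def run(stream, decay):
    """Process the stream sample by sample and return the final estimate."""
    model = DecayedMean(decay=decay)
    for x in stream:
        model.update(x)
    return model.mean


# Synthetic stream with a sudden drift from level 0 to level 10.
stream = [0.0] * 50 + [10.0] * 50
plastic = run(stream, decay=0.5)  # strong forgetting: tracks the new level
stable = run(stream, decay=1.0)   # no forgetting: plain running mean
```

With strong forgetting the estimate ends close to the new level, whereas without forgetting it remains the average over the whole stream. Choosing `decay` is one simple way of trading plasticity against stability; the methods surveyed above use considerably richer mechanisms.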
We hope that this special issue sheds light on some novel ideas in online data processing, particularly its relevance to the application areas covered by the papers. The intention was to bring new insight into this young but active research area. Finally, we would like to gratefully acknowledge and sincerely thank all the reviewers for their insightful comments and criticism of the manuscripts. Our thanks also go to the authors for their contributions and collaboration. We are also grateful to the Editor-in-Chief of Neurocomputing, Prof. Tom Heskes, for his insightful editorial suggestions and support during the review process, and to Vera Kamphuis, the editorial assistant, for her valuable help throughout the review and production process of this special issue.

Abdelhamid Bouchachia*
School of Design, Engineering and Computing, University of Bournemouth, United Kingdom
E-mail address: [email protected]

Received 26 May 2013; accepted 26 May 2013

* Tel.: +44 1202 962401.