A novel recursive algorithm used to model hardware programmable neighborhood mechanism of self-organizing neural networks

Marta Kolasa, Tomasz Talaska, Rafał Długosz

UTP University of Science and Technology, Faculty of Telecommunications, Computer Science and Electrical Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland
Institute of Microtechnology of Swiss Federal Institute of Technology in Lausanne, Rue A.-L. Breguet 2, CH-2000 Neuchâtel, Switzerland


Keywords: Unsupervised learning algorithms; Recursive neighborhood mechanism; Topology of neural network; Software and hardware implementations

Abstract

In this paper we propose a novel recursive algorithm that models the neighborhood mechanism, which is commonly used in self-organizing neural networks (NNs). The neighborhood can be viewed as a map of connections between particular neurons in the NN. Its relevance relies on a strong reduction of the number of neurons that remain inactive during the learning process. Thus it substantially reduces the quantization error that occurs during the learning process. This mechanism is usually difficult to implement, especially if the NN is realized as a specialized chip or in Field Programmable Gate Arrays (FPGAs). The main challenge in this case is how to realize a proper, collision-free, multi-path flow of the activation signals, especially if the neighborhood range is large. The proposed recursive algorithm allows for a very efficient realization of such a mechanism. One of the major advantages is that different learning algorithms and topologies of the NN are easily realized in one simple function. An additional feature is that the proposed solution accurately models hardware implementations of the neighborhood mechanism.

© 2015 Elsevier Inc. All rights reserved.

1. Introduction

Artificial neural networks (ANNs) can be seen as universal expert systems that are more and more frequently used in various areas, including engineering and medical ones [1,2]. Usually they are considered from the point of view of the development and/or optimization of learning algorithms specific for particular architectures of the NNs. A substantially different problem is how to translate a mathematical description of a given learning algorithm, sometimes very complex, into its efficient realization. From the latter point of view, the most popular are software realizations, as they are very flexible. In practice any algorithm can be easily implemented in software. Unfortunately, pure software systems are not suitable for some kinds of applications, which include portable miniaturized devices that have to consume ultra-low power. An interesting example of such systems are Wireless Body Area Networks (WBANs), which have recently become more and more frequently used in medical diagnostics [3–6]. Hardware realizations, which are an alternative, usually mean Field Programmable Gate Arrays (FPGAs) [7–10] but also, more rarely, Application Specific Integrated Circuits (ASICs) realized in the 'full custom' style [8,11–14].

A challenge in this case lies in the necessity of solving specific problems not known in software engineering. In digital ASICs the main problem we have to face is the minimization of the number of transistors used to implement particular functions and modules, as this factor has a direct influence on the energy consumption, the computation power and the area occupied by the overall system. Another important problem is a proper synchronization of particular building blocks to avoid glitches in the signals, which raise the power dissipation. In analog realizations we additionally have to face various physical phenomena, such as the charge injection effect, leakage in analog memory cells, parasitic capacitances and transistor mismatch. These phenomena are the source of many problems which significantly disturb the outcomes of the learning process. The disadvantages of hardware realizations are offset by several paramount features. One of them is a strong capability for parallel data processing, which allows for a ratio of computation power to consumed energy that is not available in software systems [11,15]. However, to make the realization of such systems possible, a careful design is required, which in practice means a proper modeling of the system.

In the literature one can find many examples of modeling of ANNs. In fact, each software implementation can to some extent be seen as a model. Our new investigations in this area result from the fact that the existing models are insufficient when we deal with ANNs realized at the transistor level, which is our objective. The problem is not trivial. We have to deal not only with the development and optimization of specific algorithms, which means determination of the architecture of the NN, but also with such an optimization of this algorithm as to reduce the power dissipation and the area of the resultant chip.

The presented work is an important step of a bigger project that aims at the development of an ultra-low-power parallel neuroprocessor to be used in new-generation sensing nodes in WBANs for the analysis of electrocardiography (ECG) signals. A typical WBAN consists of a set of small devices, each equipped with a biosensor or a set of biosensors, placed on or implanted in the human body close to the place of signal acquisition, that are capable of communicating wirelessly with a base station (Master Processing Unit, MPU). In a typical WBAN the sensors (nodes) usually perform only basic tasks, such as data collection, analog-to-digital conversion, and simple data preprocessing and conditioning (filtering). Finally, the processed data are transmitted through the radio-frequency (RF) communication block to the MPU for further detailed processing and analysis. One of the main problems encountered in such systems today is the very large amount of energy lost (even 95% of the total energy) during the RF transmission of data, which significantly reduces the battery lifespan [16]. This has become one of the barriers in the development of truly wearable systems. The development of miniaturized, low-power ANNs, which is our objective, will enable the realization of new intelligent sensors that will themselves be able to perform data analysis and classification, and thus will reduce the energy consumed during data transmission. The resultant WBAN composed of such sensors will be much more convenient to use and cheaper.

One of the main challenges, both in hardware and in software approaches, is an efficient realization of the neighborhood mechanism [11,14].
This mechanism is closely related to the NN topology and determines the map of connections between particular neurons in the NN. The most popular topologies are the rectangular one with four or eight neighboring neurons (Rect4 and Rect8) and the hexagonal one with six neighbors (Hex) [17]. They are schematically shown in Fig. 1. Our former investigations showed that particular topologies are optimal in different scenarios [11,15], so it is beneficial if a NN realized either in software or in hardware leaves the user the choice of topology. The structure of the neighborhood mechanism also depends on the type of the learning algorithm. All these issues are discussed in detail in the next section. In our investigations we focus mostly on self-organizing learning algorithms, as they offer relatively simple structures with usually only simple arithmetic operations such as additions, subtractions, multiplications, bit shifts and the abs() function. Nevertheless, such NNs have proven to be suitable for the analysis of various biomedical signals, including ECG signals [2,18]. In the next section we briefly discuss example algorithms of this type. In contrast to software realizations, in which any part of the system can be relatively easily reprogrammed, in the case of a transistor-level implementation, once the system is fabricated, no further changes are possible. For this reason, we developed a software system that enables massive computations for many parameters usually not seen in other models. The system has been written in the ANSI C++ language under the Linux OS. It does not need a graphical shell, so it can be compiled and run under most Unix-type systems. The realized model is strongly oriented toward hardware realizations, so the outcomes of the simulations can easily be ported to any low-level hardware, including full-custom ASIC designs.

Fig. 1. Typical SOM topologies: (a) and (b) rectangular with 4 (Rect4) and 8 (Rect8) neighbors, (c) hexagonal (Hex).

The proposed software system models various self-organizing NNs, such as the classical Kohonen Self-Organizing Map (SOM), the Winner Takes All (WTA) NN with various scenarios of the conscience mechanism [11,19], the neural gas NN, and several versions of the so-called fisherman algorithm [20]. We have introduced various modifications to these algorithms that are important from the point of view of their hardware implementation. Any other algorithm can easily be added to the system in case of need. Various topologies of the SOM and various neighborhood functions can easily be investigated as well. In general, the system enables adjustment of more than sixty parameters, many of them related to the hardware issues mentioned above. The system enables simulation of a list of tasks, where each task can be a quite different scenario. Additionally, in each task the majority of the parameters can be swept in loops with a provided step. All these features enable generating a huge amount of results in a short time. The purpose of this is to investigate as many scenarios as possible before the final chip is designed and fabricated. The system is briefly presented in Section 4.2.

In our system we use a dynamically created table of neurons, which can easily be fitted to any required size of the NN. Particular neurons are complex objects of class NEURON, linked with their neighbors through pointers. This is a flexible approach, as any topology of the NN, i.e., the map of connections between the neurons, can easily be established. To obtain a proper data flow in the realized neighborhood mechanism, which is a problem in itself, as collisions can occur, a fast and efficient recursive approach has been proposed. This approach enables the function that realizes the neighborhood to operate with different learning algorithms. The proposed approach is described in Section 4.

2. Self-organizing algorithms in the perspective of the hardware implementation

Various self-organizing NNs have been proposed in the literature, but not all of them are suitable for hardware realization. For this reason, in our work we follow a different path than typical ANN designers. First we have to look for relatively simple algorithms, parametrize them, and optimize them to obtain the required performance. This stage requires a very large number of simulations (even more than 10,000 per design [11]) for different combinations of particular parameters. The aim of these investigations is to find such values of these parameters that will cover many scenarios (various input data sets). A desired purpose is to obtain only a small set of parameters that need to be reconfigurable, as each non-constant parameter increases the hardware complexity of the overall system. At the next stage the algorithms are optimized in such a way as to make them realizable in fixed-point hardware. Such models can easily be autocoded even for simple microcontrollers [21,22] or relatively quickly ported to a full-custom designed chip. Parameters that are carefully verified at this stage are the resolutions (in bits) of particular internal signals. In this section we briefly present two self-organizing learning algorithms that can be relatively easily implemented at the transistor level as parallel systems.

2.1. Classical learning algorithm of the SOM

A very simple algorithm is the classical one proposed by Kohonen [17].
In the competitive learning of the SOM, training patterns X(l), being vectors in an n-dimensional space R^n coming from a given learning data set, are presented to the ANN in a random fashion. At each learning cycle, l, the network computes the distance between a given pattern and the weight vectors W_j(l) of all neurons in the SOM. The neuron whose weights resemble the given input pattern to the highest extent becomes the winner. This neuron, as well as its neighbors, is allowed to adjust its weights according to the following formula:

$$ W_j(l+1) = W_j(l) + \eta(k)\,G(R, d(i,j))\,[X(l) - W_j(l)] \qquad (1) $$

In this formula η(k) is the learning rate constant in a given k-th training epoch, W_j is the weights vector of a given neuron, while X is an input training pattern presented to the network in the l-th learning cycle. One thing has to be explained at this stage. An epoch means the presentation to the NN of all learning patterns X(l) from a given data set, with particular patterns presented in a random sequence. The neurons that belong to the winner's neighborhood are trained at different intensities, determined by their neighborhood functions G(·). One of the input parameters of the G(·) function is the distance, d(i,j), between the winning neuron, i, and a given neuron, j, belonging to the neighborhood of the winner. The second parameter, R, is the so-called neighborhood range, which determines if a given neuron belongs to the neighborhood of the winner in the l-th cycle. The value of R is the largest at the beginning of the learning process and then diminishes to zero, thus shrinking the neighborhood as the learning process goes from the rough learning phase towards the tuning phase. Details concerning neighborhood functions and the NN topology are discussed further in this section.
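To make the adaptation rule (1) more tangible, the following C++ fragment is a minimal sketch of a single adaptation step for one neuron; the names (adaptNeuron, learnRate, neighborhoodG) are illustrative assumptions and not taken from the original model:

#include <vector>

// One adaptation step of the classical SOM (Eq. (1)) for a single neuron j.
// 'weights' is W_j(l), 'pattern' is X(l), 'learnRate' is eta(k),
// 'neighborhoodG' is the value G(R, d(i,j)) returned by the selected neighborhood function.
void adaptNeuron(std::vector<float>& weights, const std::vector<float>& pattern,
                 float learnRate, float neighborhoodG)
{
    for (std::size_t l = 0; l < weights.size(); ++l)
        weights[l] += learnRate * neighborhoodG * (pattern[l] - weights[l]);
}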

2.2. "Fisherman" learning algorithms

Recently we started investigating the possibility of hardware realization of another algorithm, proposed by Lee and Verleysen in [20], called the "fisherman" learning rule. It has been demonstrated in [20] that this algorithm allows one to obtain better results according to several assessment criteria, described in Section 3.

Let us recall that in the classical Kohonen's algorithm all neurons that belong to the winner's neighborhood are adapted so that their weight vectors move towards the input pattern X, as shown in Eq. (1). In the algorithm proposed in [20], in the first iteration the winning neuron, i, is adapted in the same way as in the classic update rule:

$$ W_i(l+1) = W_i(l) + \alpha_0\,(X(l) - W_i(l)) \qquad (2) $$

For a distance d = 0 the α_0 parameter is equal to the η_0(k)·G(·) term present in (1). In the fisherman rule, on the other hand, the neighboring neurons (d = 1, ..., R) are trained in an iterative fashion according to the formula:

$$ W_d(l+1) = W_d(l) + \alpha_d\,(W_{d-1}(l+1) - W_d(l)) \qquad (3) $$

In the second iteration, for d = 1, all neurons from the first ring surrounding the winner are adapted in such a way that their weights move toward the weights of the winning neuron calculated in the first iteration. The neurons from the second ring, i.e., for d = 2, are in the next iteration adapted towards the updated weights of the neurons of the first ring, and so forth. In the case of a software realization, in which the weights of particular neurons are calculated sequentially, both algorithms have a comparable computational complexity, so there is a sound rationale behind using the fisherman rule in many cases. On the other hand, the fisherman concept is significantly more complex in hardware realization, as the described iterative adaptation sequence has to be controlled by an additional multiphase clock. Furthermore, the control clock signals have to be distributed on the chip using additional paths. The iterative nature of the second algorithm is the source of another disadvantage. The adaptation of neurons in each following ring can be undertaken only after the adaptation in the preceding ring has been completed. This is the source of a delay that significantly slows down the adaptation phase. For comparison, in the classic algorithm realized at the transistor level the adaptation in all neurons can be performed fully in parallel [11,15]. To overcome the described problems and to make the realization of the fisherman algorithm more efficient, we modified the update formula as follows:

$$ W_d(l+1) = W_d(l) + \alpha_d\,[W_{d-1}(l) - W_d(l)] \qquad (4) $$

In comparison with the original algorithm, in which each successive ring of neighbors uses the weights W_{d-1}(l+1), i.e., weights already adapted in a given learning cycle, in the modified algorithm we use the weights W_{d-1}(l) from the previous cycle. As a result, the neurons in each following ring do not need to wait until the adaptation in the preceding ring has been completed. This makes the overall adaptation process approximately R times faster, where R is the neighborhood range. Our former investigations show that for small values of the η_0(k)·G(·) term, which is typical for larger values of l, this modification does not have a negative impact on the learning results.
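The difference between the original rule (3) and the modified rule (4) can be summarized in the following C++ sketch; it assumes that the weights of ring d−1 are available both in their previous-cycle form and, for the original rule, in their already-updated form (the function and variable names are illustrative):

#include <vector>

// Original fisherman rule (Eq. (3)): a neuron in ring d moves toward the weights of
// ring d-1 that have ALREADY been updated in the current cycle (inherently sequential).
void fishermanOriginal(std::vector<float>& Wd, const std::vector<float>& Wd1Updated, float alphaD)
{
    for (std::size_t l = 0; l < Wd.size(); ++l)
        Wd[l] += alphaD * (Wd1Updated[l] - Wd[l]);
}

// Modified rule (Eq. (4)): the previous-cycle weights of ring d-1 are used instead,
// so all rings can be updated in parallel without waiting for the preceding ring.
void fishermanModified(std::vector<float>& Wd, const std::vector<float>& Wd1Previous, float alphaD)
{
    for (std::size_t l = 0; l < Wd.size(); ++l)
        Wd[l] += alphaD * (Wd1Previous[l] - Wd[l]);
}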

2.3. Neighborhood function

The neighborhood function G(R, d(i,j)) is an important feature associated with SOMs that determines the intensity with which a given neighbor at a given distance d from the winner is adapted. This function, used in Eqs. (1) and (3), returns the value 1 (for the winner) or less than 1 for particular neighbors. In the literature one can find several typical functions of this type. In the classical approach proposed by Kohonen, a simple rectangular neighborhood function (RNF) has been used [17,23]:

$$ G(R, d(i,j)) = \begin{cases} K & \text{if } d(i,j) \le R \\ 0 & \text{if } d(i,j) > R \end{cases} \qquad (5) $$

where K is a constant gain, usually equal to 0.5. As stressed in the literature, better results are achieved by using the Gaussian neighborhood function (GNF) [24], defined as follows:

$$ G(R, d(i,j)) = \exp\left(-\frac{d^{2}(i,j)}{2R^{2}}\right) \qquad (6) $$

The problem with this function is its high complexity, which makes its hardware implementation very difficult, especially in the case of parallel systems, in which each neuron has to be represented by a separate circuit. For this reason, we proposed [15] an efficient transistor-level implementation of a triangular neighborhood function (TNF), which requires only a single multiplication operation followed by a bit shift. Our former investigations show that this function can be used as a substitute for the GNF [15] in all cases. The TNF is defined as follows:

$$ G(R, d(i,j)) = \begin{cases} a(\eta_0)\,(R - d(i,j)) + c & \text{if } d(i,j) \le R \\ 0 & \text{if } d(i,j) > R \end{cases} \qquad (7) $$

where a(·) is the assumed steepness of this function, η_0 is the winning neuron's learning rate, while c is the bias value. All these parameters decrease toward zero after each training epoch.
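For reference, the three neighborhood functions (5)–(7) can be written directly in C++ as below; this is only an illustrative sketch following the formulas (the parameter names K, a and c come from the equations, the function names are assumptions), not the code used in the model:

#include <cmath>

// Rectangular neighborhood function (Eq. (5)).
float rnf(int R, int d, float K) {
    return (d <= R) ? K : 0.0f;
}

// Gaussian neighborhood function (Eq. (6)); assumes R > 0.
float gnf(int R, int d) {
    return std::exp(-(float)(d * d) / (2.0f * R * R));
}

// Triangular neighborhood function (Eq. (7)); a is the steepness a(eta0), c is the bias.
float tnf(int R, int d, float a, float c) {
    return (d <= R) ? a * (R - d) + c : 0.0f;
}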

2.4. Topology of the SOM

In the literature one can find three topologies that are typically used in SOMs (as shown in Fig. 1). In this paper we refer to them as Rect4, Rect8 and Hex. In Rect4 and Rect8 each neuron has four or eight neighbors, respectively, unless it is located on the border of the map. In the case of the Hex topology particular neurons can have up to six neighbors. Topology is one of the most important features of the SOM. It is closely related to the structure of the neighborhood mechanism, which can be seen as a map of connections between particular neurons, as well as to the rules determining the flow scheme of the activation signals that are propagated throughout this map. The neighborhood mechanism is always the subject of careful optimization, as its structure has a strong influence on the computation power of the NN, as well as on the complexity of the overall system. In the case of transistor-level designs it also has a strong influence on the energy consumption, as well as on the area occupied by the chip [11,13].

Two parameters characterize the connections between each pair of neurons in the SOM. One of them is 1-bit information on whether a connection between a given pair "physically" exists. In general, SOMs can be divided into two groups. In the first of them the "physical" layer is fixed, as shown in Fig. 1, i.e., it does not change during the learning process. In the second group the physical layer can be dynamically modified, which means that the map of connections is determined in each learning cycle on the basis of mutual distances (in the input data space) between particular neurons. Such a scenario can be found, for instance, in the Neural Gas algorithm [25]. In transistor-level implementations the information on whether a given connection exists can be represented by the existence of a physical signal path connecting two neurons through proper hardware interfaces [11]. This approach is especially convenient in the case of parallel NNs, in which each neuron is represented by a separate circuit block. In this approach the connections exist only between those neurons that are located in the closest proximity in the layout. If a given neuron has to communicate with a neuron located at a larger distance, the neurons between this pair play the role of relays of the activation signals. In our former work we proposed an asynchronous solution of this type [11]. This approach is very fast, as a winning neuron can communicate in parallel with all neighboring neurons, independently of the size of the neighborhood.

In software implementations various approaches can be applied, but in general such connections are always "virtual", realized by indexing. The most popular solution is the one offered by the Neural Networks toolbox in the Matlab system. It is based on a lookup table, in which each neuron is represented by a row with indexes that link particular neighboring neurons together. The main challenge in this approach relies on realizing a collision-free data flow, necessary to avoid ambiguous situations in which a given neuron is activated more than once or, in other words, in which more than one neuron attempts to activate it. This solution is somewhat inconvenient in the case of large SOMs, in which the sizes of the map have to be frequently modified. Another problem is that the structure of the look-up table has to be modified every time the SOM topology is changed. Additionally, in the case of more sophisticated algorithms, like the fisherman ones, simple indexing is not sufficient.
In our investigations we initially realized the model of the classical SOM in the Matlab system, but as it was too slow to perform the required number of simulations in a short time, we ported this solution to the C/C++ language. Due to the disadvantages of the look-up table described above, in the next step we realized the neighborhood in another way, shown in Fig. 2. The overall NN is in this case realized as a 2-dimensional table of objects of class NEURON. Thus, to some extent it resembles the structure shown in Fig. 1. The distance between the winning neuron and any other neuron in the map can be quickly determined solely on the basis of the indexes i, j of particular neurons in the table. For particular topologies the distances are calculated as follows:

$$ d_{Rect8} = \max(|\Delta(i)|, |\Delta(j)|) \qquad (8) $$

$$ d_{Rect4} = |\Delta(i)| + |\Delta(j)| \qquad (9) $$

$$ d_{Hex} = \begin{cases} d_{Rect8}, & \text{if } \operatorname{sgn}(\Delta(i)) = \operatorname{sgn}(\Delta(j)) \\ d_{Rect4}, & \text{if } \operatorname{sgn}(\Delta(i)) \neq \operatorname{sgn}(\Delta(j)) \end{cases} \qquad (10) $$

The adaptation of the weights of particular neighbors is performed row by row in this table, from top to bottom. For each row the algorithm determines which neurons belong to the neighborhood of the winning neuron for a given neighborhood radius R, taking into account the SOM topology. The ranges for each row are shown as (start, stop) values on the right side of each row. Fig. 2(d) has been added to illustrate how the Hex topology is realized in the 2-D table. In our investigations we found this solution much faster than the one based on the look-up table. Unfortunately, it becomes inconvenient if various algorithms and various topologies have to be easily accessible in a single system. Note that the distances for each topology are calculated in a different way (Eqs. (8)–(10)). In the classical SOM, in which the updates of the weights are calculated in relation to a given pattern X, this solution is very convenient. Each neuron "sees" the same vector X, so as long as we are able to determine the distances between the neurons (which is done independently for each pair of neurons), the adaptation sequence is not important. A larger problem appears in the fisherman algorithms, in which the weights of particular neurons are modified toward the weights of other neurons, calculated earlier. This approach requires determining the direction from which a given neuron receives the weights, as described in Section 2.2. Due to these problems we recently proposed a solution based on a recursive approach, which depends neither on the algorithm nor on the topology of the SOM. Additionally, this solution very accurately models our hardware approach briefly presented in Section 4.1 [11].
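A direct C++ transcription of the distance formulas (8)–(10) may look as follows; the function and variable names are illustrative and chosen only to mirror the equations:

#include <cstdlib>
#include <algorithm>

static int sgn(int v) { return (v > 0) - (v < 0); }

// Distance between the winner at (wi, wj) and the neuron at (i, j) in the 2-D table,
// according to Eqs. (8)-(10); mapTopol: 0 = Rect4, 1 = Rect8, 2 = Hex.
int gridDistance(int wi, int wj, int i, int j, int mapTopol)
{
    int di = i - wi, dj = j - wj;                       // Delta(i), Delta(j)
    int dRect8 = std::max(std::abs(di), std::abs(dj));  // Eq. (8)
    int dRect4 = std::abs(di) + std::abs(dj);           // Eq. (9)
    if (mapTopol == 0) return dRect4;
    if (mapTopol == 1) return dRect8;
    return (sgn(di) == sgn(dj)) ? dRect8 : dRect4;      // Hex, Eq. (10)
}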

Fig. 2. Realization of the neighborhood in a 2-D table for the SOM topologies shown in Fig. 1. Illustration of the sequence in which particular neurons are activated: (a) Rect4 topology, (b) Rect8 topology, (c) Hex topology, (d) resultant Hex topology shown for better illustration.

The second parameter associated with each connection determines if a given neuron (say A) is allowed to receive signals from its neighbor (say B). The value of this parameter depends on the mutual locations of the winning neuron (say C) and neurons A and B. If, for example, neuron A is located between B and C, neuron A will not receive the activation signal from B, even though A and B are physically connected. The winning neuron will activate neuron A, then A will activate neuron B, but neuron B will not return an activation signal to A. This is a very important feature that enables avoiding collisions in the neighborhood mechanism. This issue will be discussed in more detail in Section 4.1.

3. Comparison of unsupervised learning algorithms

Optimization of the learning process is one of the main objectives of the investigations carried out by researchers around the world [20,26–28]. To say that a learning process is optimal, it is necessary to define appropriate criteria that can serve as a point of reference for such an estimation. In this paper we define five criteria, described below, but first it is necessary to explain the nature of the problems encountered in the ANN area. At this point it must be clarified that in the case of ANNs the optimization problems are typically substantially different from those in the conventional optimization sense. Even in relatively small NNs it is not possible to define a mathematical dependency (a cost function) between particular parameters of the NN and the optimization criteria. This is due to several reasons. One of the main problems concerns the input signals acquired by NNs. Such signals are often heuristic and as such cannot be used as inputs to a potential cost function. In many cases the learning patterns are composed of different signals, which are not correlated with each other. Such data can contain, for example, the percentage of patients that died within n years after a cancer was diagnosed, while other components can be other medical factors or even the conditions in which a given patient lived [29]. Another problem is the typically high complexity of NNs. In our investigations we considered NNs with even more than 1500 neurons and 10 inputs (10,000+ calculation channels). Many internal functions in particular neurons are nonlinear. For example, the winning neuron is the one that is located in the closest proximity to a given learning pattern, independently of the value of this distance, while this distance can be measured in accordance with a nonlinear (non-Euclidean) measure. Additionally, the outcome of the learning process usually depends on several dozen parameters, as in the case of the self-organizing NNs developed by us. For this reason, optimization of the values of particular parameters relies on repeating simulations with different combinations of these parameters, looking for such combinations that lead to an optimal learning process.

To compare, in a quantitative way, the algorithms included in the proposed model we have used several criteria described in [20]. They allow one to assess the quality of the vector quantization, as well as of the topographic mapping [20,30–34]. The quantization quality is assessed using two measures. One of them is the quantization error, defined as follows:

$$ Q_{err} = \frac{1}{m}\sum_{j=1}^{m}\sqrt{\sum_{l=1}^{n}\left(x_{j,l} - w_{i,l}\right)^{2}} \qquad (11) $$

In this formula, m is the number of patterns X in the input data set. Q_err is the error that the SOM makes while approximating the input vectors (the input data set). After a successfully completed learning process, particular neurons should be representatives of particular data classes, while every class should have at least one representing neuron. Each neuron is activated only by those patterns that belong to its class. If the NN is properly trained, the distances between particular pattern-neuron pairs are small, so the resultant error, which is an average of the distances for all patterns in a given data set, is small as well. Note that the value of Q_err strongly depends on the data distribution in particular classes. It can vary between zero, if each class contains only one distinct pattern, and large values, if the patterns in the classes are distributed over a large area (low-density classes). This issue is important while assessing the efficiency of the learning process for different numbers of neurons in the SOM. To make it possible we have used data sets with regularly distributed patterns, with equal distances between the centers of the classes [11]. A second measure used to assess the quantization quality is the percentage of dead neurons (PDN), i.e., the ratio of inactive (dead) neurons to the total number of neurons in the SOM. Let us explain that dead neurons are those neurons that never won the competition and did not become representatives of any data class. Neither of the errors described above is useful in the assessment of the topological order of the map. The quality of the topographic mapping can be evaluated using three other measures [20]. The first of them is the Topographic Error ET1, defined as follows:

$$ E_{T1} = 1 - \frac{1}{m}\sum_{h=1}^{m} k(X_h) \qquad (12) $$

This is one of the measures proposed by Kohonen [17,34]. The value of k(X_h) equals 1 when, for a given pattern X_h, the two neurons whose weight vectors resemble this pattern to the highest extent are also direct neighbors in the map. Otherwise the value of k(X_h) equals 0. The lower the value of ET1 is, the better the SOM preserves a given map topology [30,34]. The remaining two measures do not require the knowledge of the input data. In the second criterion we first calculate the Euclidean distances between the weights of a q-th neuron and the weights of all other neurons in the NN. Next we check if all p direct neighbors of neuron q are also the nearest ones to this neuron in the sense of the Euclidean distance measured in the feature space. To express this requirement in a formal manner, let us assume that neuron q has p = |N(q)| direct neighbors, where p depends on the type of the map topology. Let us also assume that the function g(q) returns a value equal to the number of direct neighbors that are also the closest to neuron q in the feature space. As a result, the ET2 criterion for P neurons in the NN is defined as follows:

$$ E_{T2} = \frac{1}{P}\sum_{q=1}^{P}\frac{g(q)}{|N(q)|} \qquad (13) $$

The optimal value of ET2 is 1. In the third criterion, we build around each neuron, q, the Euclidean neighborhood in the feature space defined as a sphere with the radius:

$$ R(q) = \max_{s \in N(q)} \lVert W_q - W_s \rVert \qquad (14) $$

where W_q are the weights of a given neuron q, while W_s are the weights of its particular direct neighbors. Then we count the neurons which are not the closest neighbors of neuron q but are located inside R(q). The ET3 criterion, with the optimal value equal to 0, is defined as follows:

$$ E_{T3} = \frac{1}{P}\sum_{q=1}^{P}\left|\{\, s : s \neq q,\; s \notin N(q),\; \lVert W_q - W_s \rVert < R(q) \,\}\right| \qquad (15) $$
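As an illustration of the quantization measures, the following C++ sketch computes Q_err (Eq. (11)) and the percentage of dead neurons; it assumes that the winner of the competition for each pattern is already known (the function and variable names are assumptions, not the model's code):

#include <vector>
#include <cmath>

// Quantization error (Eq. (11)): average distance between each pattern X_j and the
// weights of the neuron that wins for this pattern; 'winnerW[j]' holds the weights
// of the winner for pattern j (obtained from the competition step, not shown here).
float quantizationError(const std::vector<std::vector<float>>& X,
                        const std::vector<std::vector<float>>& winnerW)
{
    float sum = 0.0f;
    for (std::size_t j = 0; j < X.size(); ++j) {
        float dist2 = 0.0f;
        for (std::size_t l = 0; l < X[j].size(); ++l) {
            float diff = X[j][l] - winnerW[j][l];
            dist2 += diff * diff;
        }
        sum += std::sqrt(dist2);
    }
    return sum / X.size();
}

// Percentage of dead neurons (PDN): neurons that never won the competition.
float percentageOfDeadNeurons(const std::vector<int>& winCount)
{
    int dead = 0;
    for (int c : winCount) if (c == 0) ++dead;
    return 100.0f * dead / winCount.size();
}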

Selected comparative simulation results for the algorithms described in the previous section are shown in Fig. 3. The proposed model enables simulations for particular parameters varying in loops. We investigate the maximal neighborhood range vs. the map topology scenario for different initial values of the neighborhood range R_max. As can be seen, the classical SOM provides better results for the Rect8 topology, while the fisherman algorithm does for the Rect4 one. Both algorithms provide proper results for small values of R_max. In this situation, the question is which algorithm is more efficient in the case of the hardware implementation.

Fig. 3. Selected simulation results illustrating the outcomes of the learning process for particular algorithms described in Sections 2.1 and 2.2 and two example topologies of the SOM.

The Rect4 topology is less dense, so for a given value of the neighborhood range R fewer neurons have to be activated. As can be seen in Fig. 2, for R = 3, 49 neurons are activated in the case of the Rect8 topology, while only 25 in Rect4 (these counts follow from the sizes of the neighborhoods: (2R+1)^2 neurons for Rect8 and 2R^2+2R+1 for Rect4, counting the winner), so the power dissipation in the latter case is reduced almost by half. The Rect4 topology also requires fewer connections between particular neurons (only vertical and horizontal) and a simpler hardware structure, as shown in Fig. 5. This is only one example illustrating why such detailed simulations are required.

4. The proposed recursive algorithm of the programmable neighborhood mechanism

The proposed recursive algorithm very accurately models the programmable hardware neighborhood mechanism proposed in [11]. For a better illustration, in the next section we briefly present some important aspects of this solution, to prepare the background for the explanation of how the new algorithm operates. As the proposed algorithm is part of a bigger software model, in Section 4.2 we also provide some details concerning this system.

4.1. Parallel neighborhood mechanism implemented in the CMOS technology

In one of our former works we proposed a parallel neighborhood mechanism that operates in an asynchronous fashion [11]. The placement of neurons in this solution is schematically shown in Fig. 4(a), with the internal structure of a single neuron presented in Fig. 4(b) [11]. Such an arrangement of neurons resembles the arrangement of neurons in the proposed software approach (a 2-dimensional table). Each neuron is connected only with the closest p neighbors, where the value of p depends on the map topology (Rect8, Rect4, Hex), as described in Section 2.4. This allows for a very efficient routing of particular neurons. Fig. 4(b) presents only those circuit components which are used by the neighborhood mechanism itself. As will be shown, both these components are directly modeled in the proposed software neighborhood algorithm. The role of the EN_PROP circuit is to propagate a 1-bit neighborhood activation signal (EN) in the directions allowed in a given topology. For this reason, the structure of this circuit depends on the SOM topology, as shown in Fig. 5(a) and (b) for the two example cases of the Rect8 and Rect4 topologies, respectively. The role of the R_PROP circuit is to propagate a multi-bit signal, r, that controls the neighborhood range. This circuit takes the r signal from its input data bus, decrements it by 1 and finally puts the new value on its output data bus. The structure of the R_PROP circuit does not depend on the map topology, as each neuron resends the r signal in all directions, independently of the number of allowed output directions. In other words, the output data bus of the R_PROP circuit of a given neuron is connected with all its direct neighbors.

Fig. 4. Hardware realization: (a) placement of neurons in the SOM, (b) structure of a single neuron.

Fig. 5. The EN_PROP block for the (a) Rect8 and (b) Rect4 topologies of the SOM.

Theoretically this could cause a confusing situation, as a single neuron could obtain r signals, usually with different values, from all directions simultaneously. To avoid this situation, the input of the R_PROP circuit in a given neuron and the outputs of the R_PROP circuits in its neighbors are separated by a multi-input switch (a kind of multiplexer). Particular branches of this switch are controlled by the EN signals coming from particular neighbors. The EN_PROP circuits in these neighboring neurons ensure that the particular paths by which the EN signals are propagated do not cross each other. As a result, a given neuron can obtain the activation signal from only one direction, and this signal simultaneously opens only one branch in the switch. The other branches are inactive at this time. Note that each EN_PROP circuit has a WSC input. This privileged signal allows the winning neuron to activate all ENout_i signals in a given EN_PROP block. Particular ENout_i signals of this neuron become the ENin signals in its closest neighbors. In contrast to the WSC signal, the ENin_i signals activate only selected ENout_i signals on the opposite side of the EN_PROP block. As a result, the neighbors always receive the ENin signals from only one direction, which prevents the confusing situation described above. In the case of the Rect8 topology the diagonal ENin signals activate three ENout signals, while the horizontal and vertical signals activate only one ENout signal. Such a distinction is necessary, as the number of neurons in each following ring of neighbors increases (by 8 for the Rect8 configuration). The propagation of the EN signal resembles a wave that spreads asynchronously in all directions concentrically from the winner. The only delay in this process results from the delay of a few logic gates located in particular EN_PROP blocks. The EN_PROP block itself does not contain any mechanism that could terminate the propagation of the EN signal at a desired neighborhood range R. This problem has been solved by the use of the R_PROP circuit. As mentioned above, this circuit decreases the value of r by 1 at each following ring of neighbors. The propagation of the EN signal terminates at the ring for which r = 0. Note that the winning neuron receives an r_PROG signal that equals R in a given epoch, through the switch operated by the WSC signal, as shown in Fig. 4(b). Propagation of the r signal is also very fast. For 15 rings of neurons it requires only 11 ns, when implemented in the CMOS 0.18 µm technology, while the propagation is performed fully in parallel in all directions in the map [11]. The mechanism described above by itself enables the SOM to operate with the rectangular neighborhood function. In this particular case, if ENin = 1 for a given neuron, it simply triggers the adaptation process in this neuron. On the other hand, the r signal can be used to calculate the value of the TNF in a given neuron [15].
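In software, the behaviour of the EN_PROP/R_PROP pair can be mimicked, for example, by a breadth-first sweep over the map that starts at the winner with r = R and decrements r at every ring. The sketch below is only a behavioural illustration of the hardware mechanism described above (a Rect4 map is assumed, and all names are ours); it is not the proposed recursive algorithm itself, which is given in Section 4.3:

#include <queue>
#include <vector>
#include <utility>

// Behavioural model of the EN/r propagation: the winner at (wx, wy) is activated with
// r = R; every activated neuron activates its not-yet-visited direct neighbours with
// r - 1, and the wave stops at the ring where r reaches 0.
std::vector<std::vector<int>> propagate(int Xn, int Yn, int wx, int wy, int R)
{
    std::vector<std::vector<int>> r(Xn, std::vector<int>(Yn, -1)); // -1: not activated
    std::queue<std::pair<int,int>> q;
    r[wx][wy] = R;
    q.push({wx, wy});
    const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};        // Rect4 directions
    while (!q.empty()) {
        auto [x, y] = q.front(); q.pop();
        if (r[x][y] == 0) continue;                 // range exhausted, stop the wave here
        for (int k = 0; k < 4; ++k) {
            int nx = x + dx[k], ny = y + dy[k];
            if (nx >= 0 && nx < Xn && ny >= 0 && ny < Yn && r[nx][ny] == -1) {
                r[nx][ny] = r[x][y] - 1;            // R_PROP: decrement r at each ring
                q.push({nx, ny});
            }
        }
    }
    return r;                                       // r[i][j] >= 0 marks activated neurons
}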

4.2. An overview of the proposed software system

The proposed neighborhood algorithm is an important part of a bigger system, already briefly described in Section 1. It enables comprehensive investigations of various learning scenarios for several algorithms of NNs trained in an unsupervised manner. The system, realized in the C++ language, enables simulations of the learning process for any combination of the parameters described earlier. The user can select any size of the SOM, as well as any length of the input learning patterns, X. The NN can operate with the three map topologies shown in Fig. 1 and different neighborhood functions (5)–(7). The model also handles three distance measures, namely the Euclidean, the Cityblock and the Manhattan one [35]. The ability to compare all these measures is an important feature, as the results have a strong influence on the hardware complexity of the system. The Manhattan measure requires only summing, subtracting and the abs() operation, which makes it a relatively simple algorithm. The Euclidean measure, on the other hand, also requires squaring and square-rooting operations and thus is much more complex.

An important step is a proper definition of the learning data, as it has a large influence on the overall learning process. For this reason, we have included in the system an advanced generator of learning data sets, which allows for placing input data in selected areas of the input data space. The boundaries of these areas are determined by polynomial functions, ellipses, trigonometric functions, etc. The input data can be regularly or randomly distributed over the input data space. Of particular importance are data sets with a regular distribution of the centers of particular data classes, as well as of the patterns X around these centers. Only for such data can the learning results be directly compared for different SOM parameters, as shown in Section 3. The quality of the learning process also depends on the number of learning epochs, as well as on the learning rate η(k). The investigations show that there exists a correlation between these parameters, and therefore the user can select an initial and a final value of η(k), thus determining how fast η(k) will decrease during the learning process. An important parameter that exhibits a strong influence on both the quality of the learning process and the hardware complexity is the initial value of the neighborhood size, R_max [11,15]. The system enables comprehensive investigations of the influence of this parameter on the learning process. The main objective is to find such values of R_max that are as small as possible and simultaneously assure an optimal learning process for different data sets. The value of R_max has a direct influence on the number of bits in the r signal, i.e., on the number of connecting paths. The model is able to determine the influence of various physical phenomena and constraints common in hardware design, such as leakage in analog memory cells, the charge injection effect, the mismatch effect, and offsets in comparators used in some blocks in the case of a hardware implementation. A very important issue is the influence of the resolution of particular signals on the quality of the learning process. Since in parallel hardware architectures all neurons in the map are composed of identical blocks, any reduction of the complexity of any block in a single neuron has a strong effect on the complexity of the entire NN [15].
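As a simple illustration of why the choice of the distance measure matters for the hardware complexity, the two measures can be written as follows; integer (fixed-point) data are assumed, as in the hardware-oriented model, and the function names are ours:

#include <cstdlib>
#include <cmath>

// Manhattan (L1) distance: only subtraction, abs() and summing are required.
int distL1(const int* x, const int* w, int Lx)
{
    int d = 0;
    for (int l = 0; l < Lx; ++l) d += std::abs(x[l] - w[l]);
    return d;
}

// Euclidean (L2) distance: additionally requires squaring and a square root,
// which is much more expensive in hardware.
double distL2(const int* x, const int* w, int Lx)
{
    long long d = 0;
    for (int l = 0; l < Lx; ++l) {
        long long diff = x[l] - w[l];
        d += diff * diff;
    }
    return std::sqrt((double)d);
}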
4.3. The proposed recursive algorithm of the neighborhood mechanism

The proposed neighborhood algorithm is based on a recursive approach. Before we explain how the proposed model of the neighborhood mechanism works, we have to briefly present how data in the overall NN model are organized. The NN is represented by a dynamically created 2-dimensional table of neurons, named SOM. Neurons are objects of class NEURON, which specifies all parameters and rules necessary to perform the learning process for different learning algorithms. Presenting the structure of the overall class in detail is not possible, as it exceeds a thousand lines of code. The class contains variables and functions that process these variables. Two general groups of variables can be distinguished in this class. The first of them includes static variables that contain the parameters of the overall neural network. Using static variables is convenient, as each neuron has in this way direct access to these data, independently of the number of neurons in the NN. This also facilitates modification of the parameters. Features of particular neurons, on the other hand, are stored in private variables of each neuron. The most important parameters that define the overall NN include:

//- - - - - - - - - - - - Parameters associated with the structure of the overall NN
static int Lx;         // number of inputs of the map (number of weights)
static int Xn, Yn;     // sizes of the map of neurons
static float eta;      // learning rate
static int MapTopol;   // selected map topology (Rect4/Rect8/Hex)
static int GRID[8];    // configuration of the map topology
static int NeighFunct; // selected neighborhood function (RNF/TNF/GNF)
static int DistMes;    // selected distance measure (L1/L2)
static int WinX, WinY; // position of the winning neuron
//- - - - - - - - - - - - Parameters specific for the learning process
static int Rmax;       // initial neighborhood radius
static int Algorithm;  // selected algorithm (SOM/fisherman/WTA)
static int NoOfCycles; // input dataset size (No. of cycles in learning epochs)
//- - - - - - - - - - - - Parameters specific for prospective hardware implementation
static int NoBXW;      // No. of bits in X and W signals

The most important parameters specific for particular neurons include:

class NEURON *NEIGHBORS[8]; // addresses of direct neighbors of a given neuron
float *W;                   // neuron weights (pointer to a dynamic table)
float Gfunc;                // output value of the neighborhood function
int WinCount;               // counter of the wins (used in the conscience mechanism)

The vector of the weights is dynamically created every time we change the sizes of the map or the number of inputs of the NN. This operation is performed by the constructor, by the use of the following instruction:

if (W == NULL) W = allocMatrix1D(Lx); // function allocating memory for 1-D tables

The constructor does not need any explicit parameters, as the size of vector W is available through the static variable Lx. The destructor is used to release memory when the map of neurons has to be destroyed or resized. It is performed as follows:

if (W != NULL) deleteMatrix1D(W); W = NULL; // function deleting dynamic tables

The class NEURON contains various functions. Here we present declarations of only those functions that participate in the adaptation process:

// initialization of the learning process - zeroing selected variables
void initSimulation();
// calculation of the distance between learning pattern X and neuron weights vector W
float DCC(float X[], int NoBXW); // number of bits as an additional parameter
// check if the distance of the neuron at position px,py in the SOM table is the smallest
float findDmin(int px, int py);
// adaptation of the winning neuron - the same function for all algorithms
void AdaptWinner(float X[], int NoBXW);
// recursive adaptation function
int Adapt(int directions, int R, float Win[], float X[], int d);

Fig. 6. A function that realizes the proposed recursive neighborhood mechanism for three different learning algorithms and three different topologies of the SOM.

One of the important tables of class NEURON is the vector NEIGHBORS of pointers of type NEURON, which contains the addresses of particular neighbors of a given neuron. In the case of the classical SOM and the fisherman algorithms, the neurons that in the SOM table are located in the closest proximity of a particular neuron become its neighbors (at least one index must be different, but the maximum difference is less than or equal to 1). In the case of these algorithms the NEIGHBORS vectors remain unchanged during the learning process.

The values stored in the GRID vector directly define the map topology. This vector is modified if we change the neighborhood topology (MapTopol). It is a quick process, which is one of the advantages over the look-up table approach. The length of this vector results from the eight possible directions in the SOM table. Let us denote them as 'NW', 'N', 'NE', 'E', 'SE', 'S', 'SW' and 'W', where 'N', 'S', 'W', 'E' mean north, south, west and east, respectively. The directions are counted in a clockwise manner, starting from the 'NW' direction. For particular topologies the appropriate GRID vectors are defined as follows:

GRID8 = 0xc1, 0x40, 0x70, 0x10, 0x1c, 0x04, 0x07, 0x01 // for Rect8 topology
GRID4 = 0x00, 0x51, 0x00, 0x10, 0x00, 0x15, 0x00, 0x01 // for Rect4 topology
GRID6 = 0xc1, 0x40, 0x00, 0x58, 0x08, 0x0d, 0x00, 0x01 // for Hex topology

Two additional parameters are associated with the activation signal. One of them is the index that indicates which element of the GRID vector should be considered for a given input direction. The first element is selected if a given neuron receives the activation signal from the 'SE' direction. The following elements are selected for the activation signal coming, in turn, from the 'S', 'SW', 'W', 'NW', 'N', 'NE' and 'E' directions. Particular elements of the GRID vectors contain the information about the allowed output directions for a further propagation of the activation signal. One can notice that this mechanism is thus a direct counterpart of the EN_PROP circuit described in Section 4.1. The role of both the EN_PROP circuit and the GRID vector is to assure that the flow of the activation signals will be collision free. To explain the meaning of particular values stored in the GRID vector, let us consider an example case in which a neuron in a map working in the Rect8 mode receives the activation signal from the 'SW' direction. In this case the value of the corresponding index equals 2, which means that the information on which output directions are allowed is taken from the 3rd element of the vector GRID (in C/C++ tables are indexed starting from 0). In the Rect8 topology, if a neuron receives the activation signal from any diagonal direction, it has to activate three neurons on its opposite side, as in the EN_PROP circuit described above. As a result, in our example case the given neuron has to activate the neurons located in its 'N', 'NE' and 'E' directions. The value stored at the 3rd position in the GRID vector equals 0x70 (hexadecimal format in C/C++), i.e., 01110000 (binary format). Particular bits indicate if a given output direction is allowed or not for a given input direction. Taking into account the counting sequence described above, one can notice that the 'NW', 'SE', 'S', 'SW', 'W' directions are not allowed in this case, as their corresponding bits are 0.
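The way a single GRID entry is decoded can be sketched in C++ as follows; the bit ordering (bit 7 for 'NW' down to bit 0 for 'W', following the clockwise counting and the 0x70 example above) is our reading of the description, so the snippet should be treated as an illustration rather than the exact code of the model:

#include <cstdio>

// Illustration: print which output directions are enabled by one GRID entry.
// Bit 7 corresponds to 'NW', then 'N', 'NE', 'E', 'SE', 'S', 'SW', and bit 0 to 'W'.
void printAllowedDirections(int gridEntry)
{
    const char* dir[8] = {"NW", "N", "NE", "E", "SE", "S", "SW", "W"};
    for (int i = 0, k = 128; i < 8; ++i, k >>= 1)  // mask k: 10000000 -> 00000001
        if (gridEntry & k)
            std::printf("%s ", dir[i]);
    std::printf("\n");
}

// Example from the text: printAllowedDirections(0x70) prints "N NE E",
// i.e., the directions activated when the signal arrives from 'SW' in the Rect8 mode.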


Fig. 7. Illustration of the sequence in which particular neurons are activated in the proposed recursive algorithm: (a) Rect4 topology, (b) Rect8 topology, (c) Hex topology, (d) resultant Hex topology shown for better illustration.

For this reason, the corresponding values in the GRID vectors equal 0. In the Rect4 topology, for example, only the vertical and horizontal directions are allowed, so the positions (0, 2, 4, 6) in the GRID vector, which represent the diagonal directions, are 0.

Fig. 6 shows the function that is a direct realization of the proposed recursive algorithm. Most of the parameters used in this code have been described earlier. The meaning of the remaining ones is as follows:

float X[]; // vector containing the learning pattern (of length Lx)
int directions; // element of the GRID table passed to the Adapt function as a parameter (line 18)

The 'algorithm' variable indicates which algorithm is selected. Note that the Adapt function handles all three algorithms described in Section 2. The differences between these algorithms lie in the sequence in which particular parts of the code are executed, as well as in the variable that is used in the calculation of the updated weights vector of a given neuron. In the classic SOM (algorithm = 1) the new values of the weights are, in all neurons, calculated toward the X pattern. Note that the learning pattern X[] is passed unchanged to the next neuron in the chain (see line 18). A different situation occurs in the case of both fisherman algorithms (algorithm = 2 or 3). In these cases a given neuron receives the weights vector of the preceding neuron (the Win[] table), while it passes its own weights vector (W) to the next neuron in the chain. The difference between these two cases lies only in the sequence in which the weights are updated. In case 2 (the original fisherman algorithm) the weights are calculated before they are passed to the Adapt function of the following neuron. In case 3, on the other hand, the weights are passed before their modification. In both fisherman algorithms the Adapt function of the winning neuron receives the X learning pattern as the 3rd parameter, to be consistent with Eq. (2). After the adaptation of the weights of the winning neuron, this neuron invokes its own Adapt function:

SOM[WinX][WinY].Adapt(NT.full, NT.R, SOM[WinX][WinY].W, XDATA[NoX], 0);

with the following parameters:
- NT.full: a static variable in class NEURON defining all directions allowed in a given topology; NT.full = 0xFF, 0x55 or 0xdd for the Rect8, Rect4 and Hex topologies, respectively,
- NT.R: a static variable in class NEURON containing the information about the neighborhood size in a given learning epoch,
- SOM[WinX][WinY].W: the weights vector of the winning neuron,
- XDATA[NoX]: the table with the learning patterns X; the NoX index indicates a given pattern from the input dataset in a given learning cycle,
- 0: a constant indicating that the distance to the winner is assumed to be 0 in this case.

In line 4 the function checks whether the neighborhood range has been reached (for R = 0). Then, in line 5, it calculates the value of the neighborhood function, which depends on the distance 'd' to the winning neuron. The core part of the function starts in line 14. In the loop indexed by the 'i' variable all eight directions are tested for whether they are allowed or not (line 15). Initially k = 128 (binary 10000000), so the 'NW' direction is tested first. In the following iterations the '1' in the 'k' variable is shifted to the right, which allows the function to test the 'N', 'NE', ..., 'W' directions consecutively.
In line 16 the Adapt function checks whether a neighbor in a given direction exists. If so, it recursively starts the Adapt function of that neuron, with a decreased value of the 'R' variable and an increased value of the 'd' variable. This procedure is repeated for the consecutive neurons invoked in a chain. The sequence in which particular neurons are activated is illustrated in Fig. 7 for all topologies described earlier, for an example case of R = 3. Activations of the Adapt functions of particular neurons and returns from these functions are numbered, while the two directions are distinguished by arrows. Depending on the algorithm, the adaptation is performed either before the invocation of the following neuron (lines 9 and 10) in a given chain or after the return from it (line 24).
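For readers who prefer code to prose, the following sketch reconstructs the propagation pattern just described in a compact, self-contained form. It is our own simplified illustration, restricted to the Rect8 topology and the classic SOM update (algorithm = 1): the map size, learning rate, the placeholder neighborhood function, the helper names in main() and the exact placement of the range check relative to the weight update are assumptions made only for this example; only the GRID8 values, the direction ordering and the recursive call with 'R − 1' and 'd + 1' follow the description above.

// Simplified reconstruction of the recursive neighborhood propagation (Rect8, classic SOM).
#include <cstdio>
#include <vector>

const int ROWS = 8, COLS = 8, LX = 2;   // map size and input length (assumed for the example)
const float ETA = 0.1f;                 // learning rate (assumed)

// Allowed-output masks for Rect8, indexed by the input direction
// (element 0 = activation received from 'SE', then 'S', 'SW', 'W', 'NW', 'N', 'NE', 'E').
const unsigned char GRID8[8] = {0xc1, 0x40, 0x70, 0x10, 0x1c, 0x04, 0x07, 0x01};

struct NEURON {
    std::vector<float> W = std::vector<float>(LX, 0.5f); // weight vector
    NEURON *NEIGHBORHOOD[8] = {nullptr};                 // NW, N, NE, E, SE, S, SW, W

    // 'directions' lists the allowed outputs, 'R' is the remaining range, 'd' the distance
    // from the winner. We assume a neuron still adapts when R reaches 0 and only stops
    // propagating; the neighborhood function G is a crude placeholder.
    void Adapt(unsigned char directions, int R, const std::vector<float> &X, int d) {
        float G = (d == 0) ? 1.0f : 0.5f;                // placeholder neighborhood function
        for (int i = 0; i < LX; ++i)                     // move weights toward the pattern X
            W[i] += ETA * G * (X[i] - W[i]);
        if (R == 0) return;                              // neighborhood range reached
        for (int i = 0, k = 128; i < 8; ++i, k >>= 1)    // test NW first, then N, NE, ..., W
            if ((directions & k) && NEIGHBORHOOD[i])
                // The neighbor reached through output direction i receives the signal from the
                // opposite side; in the GRID indexing convention above that is also element i.
                NEIGHBORHOOD[i]->Adapt(GRID8[i], R - 1, X, d + 1);
    }
};

int main() {
    std::vector<NEURON> som(ROWS * COLS);
    auto at = [&](int r, int c) -> NEURON* {
        return (r >= 0 && r < ROWS && c >= 0 && c < COLS) ? &som[r * COLS + c] : nullptr;
    };
    // Row/column offsets for NW, N, NE, E, SE, S, SW, W (row index grows toward 'S').
    const int dr[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
    const int dc[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            for (int i = 0; i < 8; ++i)
                at(r, c)->NEIGHBORHOOD[i] = at(r + dr[i], c + dc[i]);

    std::vector<float> X = {0.9f, 0.1f};      // an example learning pattern
    som[3 * COLS + 4].Adapt(0xFF, 3, X, 0);   // winner at (3,4), full Rect8 mask, range R = 3
    printf("winner weights: %.3f %.3f\n", som[3 * COLS + 4].W[0], som[3 * COLS + 4].W[1]);
    return 0;
}

In this sketch the winner is started with the full mask (0xFF), so the activation spreads to all eight neighbors, while every subsequent neuron forwards the signal only to its opposite side, which keeps the flow collision free.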

Table 1
Arbitrary calculation times for particular algorithms [in minutes].

Learning algorithm (Neighborh. func.) | Rect4, 1200 cycles | Rect8, 1200 cycles | Hex, 1200 cycles | Rect4, 3200 cycles | Rect8, 3200 cycles | Hex, 3200 cycles
WTA                 | 10.65       | 10.71       | 10.69       | 27.36       | 27.33       | 27.23
SOM1/SOM2 (RNF)     | 10.04/10.07 | 9.40/9.80   | 9.56/9.86   | 24.99/23.73 | 23.65/24.01 | 23.09/23.56
SOM1/SOM2 (TNF)     | 10.83/10.99 | 11.04/11.03 | 10.86/11.22 | 28.71/28.84 | 28.69/29.34 | 28.63/29.31
SOM1/SOM2 (GNF)     | 10.77/11.24 | 10.54/10.76 | 11.06/11.16 | 27.06/28.31 | 27.46/27.68 | 26.72/27.35
fish1/fish2 (RNF)   | 10.97/10.98 | 11.03/12.10 | 11.02/11.73 | 28.57/28.56 | 28.41/29.90 | 28.34/28.43
fish1/fish2 (TNF)   | 10.95/10.97 | 11.06/10.99 | 11.03/11.63 | 28.60/31.94 | 28.51/28.43 | 28.36/28.53
fish1/fish2 (GNF)   | 11.54/11.71 | 12.00/11.87 | 11.58/11.59 | 30.11/29.54 | 29.28/29.38 | 28.61/28.60

SOM1/SOM2 – SOM with the iterative/recursive neighborhood mechanism; fish1/fish2 – recursive fisherman original/modified algorithm.


Table 2
Comparison of the calculation time between the two realizations of the neighborhood mechanism for the SOM.

Neighborh. func. | Rect4, 1200 cycles | Rect8, 1200 cycles | Hex, 1200 cycles | Rect4, 3200 cycles | Rect8, 3200 cycles | Hex, 3200 cycles
RNF              | 0.22*              | 4.31               | 3.12             | 5.00               | 1.53               | 2.02
TNF              | 1.48               | 0.04               | 3.33             | 0.44               | 2.26               | 2.37
GNF              | 4.37               | 2.07               | 0.90             | 4.61               | 0.81               | 2.35

* All values calculated according to the formula: 100 · (t2 − t1)/t1 [%], where t1 is the execution time in the case of the iterative neighborhood mechanism shown in Fig. 2 and t2 is the execution time in the case of the recursive neighborhood mechanism shown in Fig. 7.

As has been demonstrated, the proposed algorithm and its software implementation are very simple. A single function handles all implemented topologies and learning algorithms, which is not possible in the alternative solutions described above. It accurately models the hardware approach described in Section 4.1. The GRID tables that link particular input directions with the allowed output directions are direct counterparts of the EN_PROP circuits for particular topologies. The 'if (directions & k)' instruction in line 15 models particular AND gates in the EN_PROP circuits. The behavior of the R_PROP circuit is simply modeled by the 'R − 1' parameter passed to the Adapt function in line 18.

Recursive algorithms can be a source of some problems. Since they are commonly believed to be slower than their non-recursive counterparts, we investigated the simulation times of the learning process to enable a comparison. A full comparison is not possible, as only the SOM algorithm was implemented with both the non-recursive and the recursive approach. The fisherman algorithms were implemented only with the recursive method, because a non-recursive implementation would be inefficient: it would require formulating relatively complex conditions for the adaptation of particular neurons. We suppose that in the case of the fisherman algorithms the recursive approach will always be faster, as the indexing necessary in the iterative approach would be very complex. In the case of the WTA approach the neighborhood mechanism is switched off (only the winning neuron is adapted).

Table 1 presents arbitrary calculation times for particular algorithms implemented in the proposed software system, for two selected numbers of learning cycles (1200 and 3200). The tests were performed using the 'time' Unix command. The results collected in Table 1 can be seen as representative of the investigations we carried out in general. Simulations were performed for three network topologies (Rect4, Rect8 and Hex) and three neighborhood functions (RNF, TNF, GNF). Particular cases were repeated several times and the obtained values were averaged.

In Table 2 we present a direct comparison of both versions of the SOM algorithm. The differences are always below 5.0%, while in some cases the recursive algorithm was even a little faster. Such differences are acceptable taking into account the simplicity of the proposed solution. Another potential problem is stack overflow, but in this case it will not occur, as the recursion depth is not very large. Note that the optimal values of the neighborhood size, which equals the recursion depth, are very small (see Fig. 3).

5. Conclusions

In this paper we presented a novel recursive algorithm that models the neighborhood mechanism commonly used in self-organizing neural networks. In comparison with other approaches, the proposed solution allows for a very efficient realization of several algorithms and several topologies in a single function, while the structure of this function remains very simple. The proposed solution can be viewed as a direct counterpart of the hardware neighborhood mechanism proposed by us earlier. It very accurately models particular features of the hardware approach, although some differences also exist.
In hardware all directions can be tested in parallel, while in the software approach particular directions have to be tested sequentially. The EN_PROP circuit operates in an asynchronous fashion, while in the software approach particular tests are performed in consecutive clock cycles. Nevertheless, despite these differences, the flow of the activation signals in the Adapt function resembles the flow of such signals in the hardware approach more closely than in any existing software solution. This is important, as the intention of the realized model is to indicate how the final specialized chip with the SOM should be realized.

Acknowledgments

The "Development of Novel Ultra Low Power, Parallel Artificial Intelligence Circuits for the Application in Wireless Body Area Network Used in Medical Diagnostics" project is realized within the POMOST programme of the Foundation for Polish Science, co-financed by the European Union from the Regional Development Fund.


