Robotics and Autonomous Systems 119 (2019) 135–150

Synthesis of a hybrid brain for a humanoid robot

Omar Zahra, Mohamed Fanni, Abdelfatah M. Mohamed
Mechatronics and Robotics Engineering Department, School of Innovative Design Engineering, Egypt–Japan University of Science and Technology (E-JUST), New Borg El Arab, Alexandria, PO Box 21934, Egypt

Article history: Available online 2 July 2019

Keywords: Computer vision; Invariant object-recognition; Brain-Based Device; Perceptual categorization; Neural simulation

Abstract

This article presents the design of a Brain-Based Robot (BBR) using hybrid techniques that incorporate both a Brain-Based Device (BBD) and computational algorithms. BBDs are biologically inspired machines whose behavior is guided by a simulated nervous system that follows the detailed neuroanatomy of different brain areas. BBDs tend to have a nervous system with a large number of neurons and synapses; thus, huge computational power is required to simulate the nervous system of a BBD. Nevertheless, some of the tasks carried out by the simulated nervous system can be accomplished using computational algorithms, which can help reduce the required computational power greatly. In this article, a BBR is built that combines some subsystems of a BBD with computer vision algorithms. Computer vision algorithms are applied using OpenCV to extract features from images, while neuronal areas are connected together based on a detailed neuroanatomical structure to mimic the human learning process. The Nengo Python package is used for simulating the neuronal areas in the system and monitoring the activities of the neuronal units. The successful integration of the BBD's subsystems with computer vision leads to perceptual categorization based on invariant object recognition of various visual cues. To make a fair comparison with the BBD, the nervous system of a BBD is built on the same computer used to build the hybrid brain for the proposed BBR. The proposed hybrid brain is then applied to a Nao humanoid robot in the V-REP simulation environment to test it. The results obtained in this article show that the proposed hybrid brain possesses the same intelligence as the BBD while requiring so much less computational power that it can run on the on-board computer of a robot, which makes it plausible for engineering applications.

1. Introduction

Nature has always been our endless source of inspiration. Living organisms never fail to impress us with their ability to perform complex tasks and adapt to different environments. Although much research has targeted the development of fully autonomous robots, the behavior of these robots pales in comparison with that of insects and animals guided by nervous systems. A problem that faces autonomous robots is dealing with novel situations, whereas biological organisms can survive in dynamic environments and display flexibility and adaptability that far exceed those of any artificial system. Thus, brain-based robotics is an interesting research field that investigates the roots of

intelligence of biological organisms by embodying computational models of the brain on robotic platforms. Although the principal focus of developing BBDs has been to test theories of brain function, this type of modeling may also provide a basis for robotic design and practical applications. Moreover, because brain-based robotics follows a working model (i.e. the biological brain), it is evident that this field will lead to the development of autonomous machines with intelligent behavior comparable to that of animals [1]. BBDs were introduced by the Nobel laureate Gerald M. Edelman. These machines were built to help discover new ways to understand the fundamental workings of an animal brain. To gain a complete understanding of how the brain works, it is important to monitor the activity of neurons in different parts of the brain at the same time while an animal performs a behavioral task. However, the latest available electrophysiology technology allows recording from at most hundreds of neurons. In recent decades, vast progress has been made in describing how the separate systems of the brain function, while only small progress has been made in obtaining a global picture of higher brain functions such as learning and memory [2]. Considering the current limitations and the complexity of the nervous system, computational modeling of the nervous system has helped in gaining a better understanding of different brain functions.


BBDs use a simulated nervous system to guide the behavior of a physical device. This method was successful in verifying some theories about different parts of the brain, such as the cerebellum and the hippocampus [3,4]. In contrast to robots controlled by an artificial neural network (ANN), BBDs have a much more complex neural system following real biological models of different brain areas. The simulated nervous system of a BBD typically has simulated neuronal units on the order of thousands or even millions, with much more biologically plausible dynamics, connected by synapses on the order of millions or even billions. Consequently, BBDs are not programmed by instructions like computers; instead, they operate like biological systems, which allows them to adapt to the environment. Thus, BBDs provide a basis for the design of intelligent robots that behave without instructions, in a way similar to an animal or a child that can learn by reward and punishment. AI-based robots mimic some cognitive functions to solve problems by implementing complex algorithms. However, these algorithms are series of instructions based on conditionals, and these conditionals cannot cover all possible scenarios that a robot may face in the real world. While the ANN is a simplistic model of the human nervous system, the network itself does not resemble the structure of the nervous system; consequently, it is not even clear to what extent an ANN reflects human brain functions. However, one of the problems of BBDs is that they require large computational power to operate in the real world without significant delay in response. Most BBDs receive commands from a remote cluster of computers through wireless communication; these clusters run the neural simulation based on the sensory data sent from the BBDs [5]. In this paper, a hybrid system is constructed by coupling computer vision algorithms with a BBD to achieve visual perceptual categorization and to condition behavior based on perception [6]. Vision is chosen because it is the main sense upon which humans depend to perceive the surrounding environment. The latest advances in computer vision have made it possible for robots to achieve robust and invariant visual recognition of objects and patterns [7,8]. These advances have numerous applications in robotics, including navigation, manipulation, and human–robot interaction. While Darwin VII consumes large computational power to achieve visual object recognition, the proposed system makes use of recent computer vision advancements to reduce the computational cost. This makes it possible for such a system to run on a PC with average specifications without noticeable delay during operation, which makes it plausible for engineering applications. This paper is structured as follows. Section 2 describes the BBD (Darwin VII). Section 3 discusses the reward system. Section 4 introduces the proposed hybrid brain. Section 5 details the neural activity and synaptic connections. Section 6 compares the biologically inspired method with computer vision algorithms for achieving invariant object recognition. Section 7 presents the experimental setup used to compare the performance of Darwin VII and the proposed system. Section 8 presents and discusses the results, and Section 9 draws the conclusions.
The main objective of this paper is thus to go through the structure of the neural system of Darwin VII introduced in [9] and to show how this work can be reimplemented using algorithms that reduce the number of neuronal units. The purpose of this work is to maintain the same level of intelligence as Darwin VII while reducing the computational power needed, so that applicability in engineering applications becomes more feasible.

Fig. 1. Darwin VII while exploring the surrounding environment [9].

2. BBD (Darwin VII)

The proposed system is designed to achieve perceptual categorization. This ability is essential for cognition, as it enables living organisms to form categories in order to distinguish food from non-food, peer from non-peer, etc. Categorization is thus usually an initial step in perceiving the surrounding environment, which consequently affects the way of interacting with the physical world [10] and even affects our cognitive development [11]. To allow a clear comparison with BBDs from the literature, the system is built to carry out the same function as Darwin VII, one of the BBDs developed by Gerald Edelman [9]. The task carried out by Darwin VII is analogous to the foraging task performed by living beings: searching for food and differentiating appetitive from aversive food just by picking up the visual and auditory cues associated with different objects. As shown in Fig. 1, Darwin VII consists of a mobile base equipped with several sensors and a gripper, called NOMAD (Neurally Organized Mobile Adaptive Device), and a neural simulation running on a remote computer workstation. Video output from a CCD camera mounted on Darwin VII, as well as status and auditory information, is sent via RF transmission to the workstation carrying out the neural simulation; RF transmission is also used by NOMAD to receive move commands from the simulation. The wheels of NOMAD allow independent translational and rotational motion. Two motors provide pan and tilt movement of the camera and microphones, and NOMAD is also equipped with a one-degree-of-freedom gripper. The CCD camera, two microphones on either side of the camera, and sensors embedded in the gripper that measure the surface conductivity of stimuli (the blocks) provide the sensory input to the neural simulation.


Fig. 2. Darwin VII nervous system without auditory pathway.

In the environment to be explored, the blocks are covered with either blobs or stripes, representing different features that can be used to differentiate between appetitive and aversive food. These blocks also produce tones with different frequencies, an extra feature that can be used to identify the type of food. Additionally, the electrical conductivity of the blocks represents their taste, which is the one feature of an object that is known innately, while the associated visual and auditory cues need to be learned as the BBD explores the surrounding environment and interacts with the blocks. In the beginning, the robot interacts with all the blocks and touches them to sense their electrical conductivity and hence their taste. After some time, the robot starts to associate the stripe pattern and the high-frequency tone with the appetitive taste, which is represented by high conductivity; conversely, the blob pattern and the low-frequency tone become associated with the aversive taste, represented by low conductivity. Once these relations have been learned, Darwin VII moves towards the blocks with stripes and grips them, while it avoids blocks with blobs without even touching them. This mirrors what happens in the animal world, where an animal first tastes whatever it finds with its mouth while associating the appetitive or aversive taste with some visual cues or with the smell of the food; later, the animal goes directly to the appetitive food and neglects the aversive one, relying on the visual cues or the smell alone. Fig. 3 shows the schematic of the nervous system of Darwin VII, which is discussed thoroughly in [9]. In the current work, only visual cues are used to achieve the categorization; the auditory cues are disregarded to simplify the addressed problem and give clearer results. Fig. 2 thus shows the part of the nervous system of Darwin VII in which the auditory pathway is discarded, by removing the layers RCoch, LCoch and A1, which form the auditory system of Darwin VII; only the visual pathway is considered. Accordingly, the simplified Darwin VII of Fig. 2 exhibits four subsystems: the visual system, taste system, reward system, and motor system. Connections among these systems can be either excitatory or inhibitory. Excitatory connections mean that if the activity of the presynaptic neuron increases, the activity of the postsynaptic neuron also increases; for inhibitory connections, if the activity of the presynaptic neuron increases, the activity of the postsynaptic neuron decreases.

Connections can also be either plastic or non-plastic, where only plastic connections can have their strength modulated. As shown in Fig. 2, the connections can be excitatory non-plastic (solid lines with arrowheads), inhibitory non-plastic (solid lines with ball heads), excitatory plastic (dotted lines), or excitatory reward-dependent plastic (dotted lines with solid black squares). The strength modulation of an excitatory reward-dependent plastic connection is guided by the reward system.

The visual system consists of five layers: R, VV, VH, VB and IT. The image from the CCD camera is fed to the retina layer R. The output from layer R is then input to the three layers VV, VH and VB through excitatory connections with certain patterns of connections that will be explained in Section 6. These three layers extract the vertical lines, horizontal lines and blobs from the images. The output of these layers is then fed to the inferotemporal cortex layer (IT) in order to achieve invariant object recognition by identifying blobs and stripes. Invariant object recognition means the ability to recognize an object even when it is subjected to scaling, rotation and translation. In the case of the vertical and horizontal lines, Darwin VII is able to detect lines inclined to the vertical or horizontal axes by up to ±30°. The connections between the three layers and the IT layer are excitatory plastic connections, which means that their strength can be modulated; these connections are modulated such that invariant object recognition is achieved.

The taste system consists of the two layers Tapp and Tave. If Darwin VII touches a block with high conductivity, the Tapp layer becomes active with increasing neuronal activity; an active Tapp evokes a reflex response to grip the block for a relatively long period of time. On the other hand, if Darwin VII touches a block with low conductivity, the Tave layer becomes active and evokes a response to move away from the block. These two layers are connected through an inhibitory connection, so if Tapp is active, Tave becomes inactive, and vice versa. At the same time, the taste system layers are connected to the reward system through excitatory connections. Activity in the reward system affects only the plastic reward-dependent connections that connect the IT layer to both the motor system and the reward system itself; therefore, these connections can be reinforced only if the reward system is active. Accordingly, when Darwin VII is subjected to some visual feature while the taste system evokes the reward system at the same time, only then does the neural system learn to correlate this visual feature with the corresponding taste.


Fig. 5. System building blocks diagram.

Fig. 3. Schematic diagram of Darwin VII’s nervous system [9].

3. The reward system

The visual system will be explained later in Section 6. The reward system, shown in Fig. 4, consists of the layers So, Si and S. So represents an integrator unit [12]: neurons in So constantly add up their synaptic inputs in time (temporal summation) and in space (spatial summation), since these inputs come from various neurons. If the summation is at or above a certain threshold, an action potential is triggered (an output is released); if the summation is below the threshold, no action potential is initiated. This process is called synaptic integration [13]. Si represents a feedforward inhibitory unit. Feedforward inhibition is a way of shutting down or limiting excitation in a neuron in a neural circuit: a presynaptic neuron excites an inhibitory interneuron (a neuron interposed between two neurons), and that interneuron then inhibits the next follower neuron. S represents a response unit, where the average activity in area S (denoted by S̄) controls the reward-dependent plastic connections.

Fig. 4. The reward system.

When the robot encounters the blocks for the first time, only taste triggers activity in So, which consequently triggers activity in S. This allows the strength of the reward-dependent plastic connections between the active neurons in IT and the active neurons in both the motor area and So to increase. Once these connections are reinforced, activity in IT is sufficient to trigger activity in the motor areas, and even to achieve further reinforcement of the connections. After the strength of the connections has reached a certain value, the reward-dependent connections cannot be strengthened anymore, due to the action of the feedforward inhibitory connections [12]. The taste system is connected to the reward system and the motor system through excitatory non-plastic connections, since they represent an unconditioned stimulus, while the visual system is connected to the reward system and the motor system through excitatory plastic reward-dependent connections, representing a conditioned stimulus. The two motor areas (appetitive and aversive) are connected through reciprocal inhibitory connections, which means that activity in either area inhibits the other. Any behavioral response is triggered based on the difference between the activities of the two motor areas.
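For readers who want to experiment with the circuit just described, the following is a minimal numpy sketch of the So/Si/S interplay. It is an illustration, not the authors' implementation: the layer sizes follow Table 1, the weight ranges follow Table 3 but without the connection-probability masks, and the taste drive is a constant placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x, sigma):
    # Thresholding function: activity below sigma is neglected (Eq. (3) in Section 5).
    return np.where(x < sigma, 0.0, x)

so = np.zeros(16)  # So, 4x4 integrator units: sum synaptic input over space and time
si = np.zeros(4)   # Si, 2x2 feedforward inhibitory interneurons
s = np.zeros(4)    # S,  2x2 response units; mean(s) gates reward-dependent plasticity

w_so_si = rng.uniform(0.05, 0.06, (4, 16))   # So -> Si, excitatory
w_so_s = rng.uniform(0.09, 0.11, (4, 16))    # So -> S,  excitatory
w_si_s = -rng.uniform(0.27, 0.30, (4, 4))    # Si -> S,  inhibitory (feedforward)

def reward_step(taste_drive):
    """One cycle: taste excites So; Si limits how strongly So can excite S."""
    global so, si, s
    so = phi(np.tanh(3.0 * (taste_drive + 0.22 * so)), 0.15)  # g, omega, sigma per Table 1
    si = phi(np.tanh(2.0 * (w_so_si @ so + 0.22 * si)), 0.10)
    s = phi(np.tanh(2.0 * (w_so_s @ so + w_si_s @ si + 0.15 * s)), 0.05)
    return s.mean()  # the average activity S-bar that modulates plastic connections

for t in range(5):
    print(t, round(reward_step(np.full(16, 0.8)), 3))  # sustained appetitive taste
```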

4. The proposed hybrid brain

The proposed hybrid brain is built to carry out the same task as Darwin VII but with less computational power, to enable realistic engineering applications. Therefore, the number of neurons and synaptic connections must be reduced. Fig. 5 shows a schematic diagram of the construction of the proposed hybrid brain. It is clear from the previous sections, and will be clarified further in Section 6, that most of the computational power needed in the case of Darwin VII was used to simulate the large number of neurons in the visual system alone. Thus, by using computer vision algorithms to replace the visual system of Darwin VII, as discussed in Section 6, the proposed system is made up of only 880 neurons and about 7,500 synaptic connections, compared to 17,460 neurons and about 400,000 synaptic connections for the simplified Darwin VII. The proposed hybrid brain adopts the same reward system as Darwin VII, shown in Fig. 4. The reward system is crucial for modulating the strength of the synaptic connections based on the saliency of the received cues. Recent advances in computer vision have given rise to robust, invariant visual pattern recognition based on extracting a set of characteristic features from an image, as explained in Section 6. After extracting the required features, it is necessary to translate the output of the visual system into a pattern that distinguishes one object from another. The inferotemporal cortex (IT) is known to perform that task in primates [14]: the IT generates a certain pattern for each object. Thus, each feature to be extracted is given a certain number, and each number is translated into a certain pattern. The pattern selection for the IT of the proposed system was done such that the average firing rate is equivalent to that of Darwin VII in [15]. Next, the output of the IT is fed to the motor and reward systems as described in Section 2.
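The feature-to-pattern translation just described can be sketched as follows. This is an assumed encoding for illustration only: the paper specifies just that each feature maps to a distinct IT pattern whose average firing rate matches Darwin VII's, so the feature list, sparsity, and target rate below are placeholders.

```python
import numpy as np

IT_SIZE = 28 * 28          # IT layer size, per Table 2
TARGET_MEAN_RATE = 0.04    # assumed average IT activity; placeholder value

FEATURES = {0: "blobs", 1: "stripes", 2: "green", 3: "purple"}

def make_pattern(feature_id, active_fraction=0.08):
    """Deterministic sparse pattern per feature, scaled toward the target mean rate."""
    local = np.random.default_rng(feature_id)  # seeded per feature: reproducible pattern
    pattern = np.zeros(IT_SIZE)
    idx = local.choice(IT_SIZE, int(active_fraction * IT_SIZE), replace=False)
    pattern[idx] = TARGET_MEAN_RATE / active_fraction  # mean(pattern) ~ target rate
    return pattern

it_patterns = {fid: make_pattern(fid) for fid in FEATURES}
print({FEATURES[f]: round(p.mean(), 3) for f, p in it_patterns.items()})
```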


Table 1
Values of the parameters of the neuronal units in Darwin VII.

Area          Size      g     σ     ω
R             64 × 64   1.50  0.20  0.00
VB            64 × 64   1.33  0.65  0.50
VH, VV        64 × 64   1.50  0.50  0.50
IT            28 × 28   1.10  0.04  0.03
ITi           14 × 14   1.35  0.02  0.15
Mapp, Mave    3 × 6     2.00  0.10  0.30
Tapp, Tave    3 × 6     2.00  0.10  0.30
S             2 × 2     2.00  0.05  0.15
So            4 × 4     3.00  0.15  0.22
Si            2 × 2     2.00  0.10  0.22

Fig. 7. Modified BCM learning rule.

Fig. 6. Thresholding function Φ (x).

5. Neural activity and synaptic connections

In the simplified Darwin VII shown in Fig. 2, the whole nervous system (with only the visual pathway, disregarding the auditory pathway) consists of 17,460 neurons and about 400,000 synaptic connections, while the proposed system is made up of only 880 neurons and about 7,500 synaptic connections. A neuronal unit in the system is simulated by an average firing rate model [9]. Neuronal activity is computed by first calculating the total contribution of the other neuronal units to the unit's activity:

A_i(t) = \sum_{l=1}^{M} \sum_{j=1}^{N_l} c_{ij} s_j(t)    (1)

where M is the number of different connection types (as in Table 3), N_l is the number of connections of type l projecting to unit i, c_{ij} is the normalized strength of the connection between pre-synaptic neuron j and post-synaptic neuron i (negative values of c_{ij} denote inhibitory connections), and s_j(t) is the activity of the pre-synaptic neuron. The activity of unit i is then given by:

s_i(t+1) = \Phi(\tanh(g_i(A_i(t) + \omega s_i(t))))    (2)

where \Phi is a thresholding function:

\Phi(x) = \begin{cases} 0, & x < \sigma_i \\ x, & x \ge \sigma_i \end{cases}    (3)

as shown in Fig. 6. Here \omega governs the persistence of the activity of a neuronal unit, \sigma_i is the threshold activity of a neuronal unit (if the activity of a unit falls below this threshold, it is neglected), and g_i is a scaling factor for the neuronal activity [9]. The values of the parameters of the neuronal units of Darwin VII and the proposed hybrid brain are listed in Tables 1 and 2, respectively; the size parameter is the number of neurons in a neuronal layer.

Synaptic connections have two classifications: excitatory or inhibitory, and plastic or non-plastic, as listed in Table 3. For excitatory connections, the activity of the post-synaptic neuron increases as the activity of the pre-synaptic neurons increases; for inhibitory connections, it decreases. Non-plastic synaptic connections have fixed weights that do not change, while the strength of a plastic connection is updated every cycle depending on the neuronal activities of both the pre- and post-synaptic neurons. The synaptic strength is modulated based on a modified BCM (Bienenstock, Cooper and Munro) learning rule [16]. This rule models long-term potentiation (LTP) and long-term depression (LTD) in the nervous system: a synapse is potentiated if the average activities of both the pre- and post-synaptic neurons are high at the same time and exceed a certain threshold value; otherwise the synaptic strength is depressed. It suits the system because inputs with weak correlations are depressed, while those with strong correlations are potentiated. The change in strength of plastic synaptic connections, as introduced in [9], is given by:

\Delta c_{ij}(t+1) = \varepsilon (c_{ij}(0) - c_{ij}(t)) + \eta s_j(t) F(s_i(t))    (4)

The change in strength of reward-dependent plastic synaptic connections is given by:

\Delta c_{ij}(t+1) = \varepsilon (c_{ij}(0) - c_{ij}(t)) + \eta s_j(t) F(s_i(t)) V(d)    (5)

where \eta determines the learning rate depending on the correlation between the pre- and post-synaptic units, \varepsilon determines the decay rate (resembling forgetting of the learned relations, with the connection weight decaying back to its initial value), and c_{ij}(0) is the initial connection strength (as shown in Tables 3 and 4). F(s) is the modified version of the BCM learning rule [9]; it governs the synaptic change as a piecewise function, indicated in Eq. (6) and shown in Fig. 7:

F(s) = \begin{cases} 0, & s < \theta_1 \\ k_1(\theta_1 - s), & \theta_1 \le s < (\theta_1 + \theta_2)/2 \\ k_1(s - \theta_2), & (\theta_1 + \theta_2)/2 \le s < \theta_2 \\ k_2 \tanh(\rho(s - \theta_2))/\rho, & s \ge \theta_2 \end{cases}    (6)

F(s) is defined by two thresholds (\theta_1 and \theta_2), two inclinations (k_1 and k_2) and a saturation parameter \rho (\rho has a fixed value of 6 for all synaptic connections in this work). A value term V is computed as:

V(d) = 1 + f(d) \frac{\bar{S} + V(d-1)(d-1)}{d}    (7)

where the delay d of the value term is incremented by one every cycle such that it ranges between 1 and 9, \bar{S} is the average activity of the neuronal units in area S, and V(d-1)(d-1) is the value term at time (d-1) multiplied by (d-1). The convolution function f spreads the activity over time; each delay from 1 to 9 has a corresponding value of f taken from the set [0.1, 0.1, 0.3, 0.7, 1.0, 1.0, 0.7, 0.3, 0.1].
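The update equations above translate directly into code. The sketch below implements Eqs. (1)-(7) for a toy VH → IT projection; the parameter values are taken from the VH, VV → IT row of Table 3 and the IT row of Table 1, while the layer sizes are shrunk for readability, so this is an illustration rather than the authors' simulator.

```python
import numpy as np

def phi(x, sigma):
    return np.where(x < sigma, 0.0, x)                      # Eq. (3)

def unit_step(s_post, C, s_pre, g=1.10, omega=0.03, sigma=0.04):
    A = C @ s_pre                                           # Eq. (1): summed synaptic input
    return phi(np.tanh(g * (A + omega * s_post)), sigma)    # Eq. (2)

def F(s, th1=0.10, th2=0.25, k1=0.45, k2=0.45, rho=6.0):
    mid = 0.5 * (th1 + th2)                                 # Eq. (6): piecewise BCM curve
    return np.select([s < th1, s < mid, s < th2],
                     [0.0, k1 * (th1 - s), k1 * (s - th2)],
                     default=k2 * np.tanh(rho * (s - th2)) / rho)

def plastic_update(C, C0, s_pre, s_post, eta=0.125, eps=0.00125, V=1.0):
    # Eq. (4): decay toward C0 plus a correlation-driven BCM term; pass the
    # value term V for reward-dependent connections (Eq. (5)).
    return C + eps * (C0 - C) + eta * V * np.outer(F(s_post), s_pre)

def value_term(d, S_bar, V_prev,
               f=(0.1, 0.1, 0.3, 0.7, 1.0, 1.0, 0.7, 0.3, 0.1)):
    """Eq. (7): delayed value signal spread over time; d ranges from 1 to 9."""
    return 1.0 + f[d - 1] * (S_bar + V_prev * (d - 1)) / d

# One simulation cycle of a toy 64-unit VH projecting to a 16-unit IT patch.
rng = np.random.default_rng(0)
s_vh = rng.uniform(0.0, 1.0, 64)
mask = rng.random((16, 64)) < 0.0175                 # non-topographic arborization
C0 = rng.uniform(0.04, 0.08, (16, 64)) * mask        # initial strengths, per Table 3
C = C0.copy()
s_it = unit_step(np.zeros(16), C, s_vh)
C = plastic_update(C, C0, s_vh, s_it)
```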


Table 2
Values of the parameters of the neuronal units in the proposed system.

Area          Size      g     σ     ω
IT            28 × 28   1.10  0.04  0.03
Mapp, Mave    3 × 6     2.00  0.10  0.30
Tapp, Tave    3 × 6     2.00  0.10  0.30
S             2 × 2     2.00  0.05  0.15
So            4 × 4     3.00  0.15  0.22
Si            2 × 2     2.00  0.10  0.22

The values of the parameters of the synaptic connections of Darwin VII and the proposed hybrid brain are listed in Tables 3 and 4, respectively. The connection between R and VH will be explained in the next section to help in understanding the patterns of connections between the various layers shown in Table 3; both R and VH consist of 64 × 64 neurons in a grid formation, as indicated in Table 1. A further example is the connection between VH and IT. IT consists of 28 × 28 neurons, i.e. a total of 784 neurons. The pattern of connection (arborization) between VH and IT is non-topographic, meaning the connections are random: each neuron in VH has a probability of 1.75% of being connected to any given neuron in IT, so each neuron in VH is connected to about 14 neurons (784 × 0.0175) in IT. Since η, ε, k1, k2, θ1 and θ2 have nonzero values, this is a plastic connection.

Initially, the weights of the connections between the visual system and the motor system are not strong enough; the activity of the visual system cannot invoke a behavioral response until the reward system modulates the strength of the connections. Consequently, the visual stimuli are conditioned stimuli that trigger conditioned behavior, while the cues perceived by the taste system are unconditioned stimuli that trigger unconditioned, innate behavior. Before any conditioning occurs, only the taste system can trigger the motor system. Thus, the taste system triggers the reward system, which allows the learning rule to increase the weights of the connections between the active visual area and both the corresponding motor area and the reward system. By adjusting the parameters of the neuronal unit activity and the synaptic learning rule, the time needed for the robot to learn and forget the relations between the inputs can be tuned. Thus, after some conditioning, both the taste system and the visual system can trigger neural activity in the motor system.

6. Invariant object recognition

Robots rely on different sensors to explore the surrounding environment. Among these sensors are cameras, which provide a large amount of information in each image and are usually cheaper than other navigational sensors. However, extracting the correct data from the collected information is still a challenge, and various methods have been developed to process images appropriately. Some of these methods were inspired by the way the brain processes and analyzes the inputs to our visual system. In this section, two features that are important for distinguishing between objects are discussed: shape and color.

6.1. Biologically inspired method

In living organisms, most neurons of the visual system respond only to stimulation from a narrow region within the visual field. This region is called the receptive field of that neuron.

The firing rate of a neuron changes depending on the position of a bright spot within its receptive field. The receptive field can accordingly be subdivided into ''ON'' and ''OFF'' regions, where neuronal activity increases if a bright spot falls in an ''ON'' region and decreases when it falls in an ''OFF'' region, as shown in Fig. 8(a). Different neurons have different receptive fields; however, neighboring neurons tend to have receptive fields that project from the same region of the visual field. This is what is typically called the retinotopic organization of the neuronal projections, where neighboring points in the visual field are mapped to neighboring neurons of the visual system. Moreover, complex receptive field shapes are formed as combinations of simpler fields, as shown in Fig. 8(b). A neuron with an ON-center simple receptive field can be depicted as a circle with a positive sign at its center, representing an increase in firing rate when subjected to a bright spot, and negative signs on the outer circumference, representing a decrease in firing rate when subjected to a bright spot. Thus, by aligning many neurons with ON-center simple receptive fields at a certain inclination, a complex receptive field is formed, expressed by positive signs along the line on which the simple receptive fields were aligned and negative signs on the periphery. Therefore, the firing rate is maximal when this complex receptive field is subjected to a bar of light along its central positive line, and it decreases as the inclination of the bar changes, as shown in Fig. 8(b).

For example, the projections between the retina and the different neuronal layers were proposed in [9] to have the patterns shown in Fig. 9. Neuronal layers are formed by a grid-like distribution of neurons; each neuron in the retina layer, to which the camera input is fed, corresponds to a pixel of the original image. The pattern of connection between the retina layer (R) and the three following layers, VH, VV and VB, decides the features extracted from the retina layer; these connection patterns are listed in Table 3. For example, the connections between layer R and layer VH are formed such that every 4 pixels in one row of R are connected to one corresponding neuron in VH. In this case, the activity of that neuron in VH depends on the total activity of the 4 neurons in R, as described by the equations in Section 5: the activity of each neuron in VH is triggered only when the total activity of the corresponding neurons in R reaches a certain threshold value, which is achieved by adjusting the parameters that control the activity of these neurons. With these connections, activity in layer VH is noticed only if many neurons in the same row of R have a high firing rate at the same time. Since the connections are formed to pick up only horizontal lines, only lines with an orientation within ±30° of the horizontal can trigger activity of neurons in VH [9]. The same method is applied to extract vertical lines and blobs in VV and VB, respectively. Consequently, the different patterns of projections between the layers in the visual area lead to the detection and recognition of various features of the surrounding environment. However, the limited range of inclinations that can be detected by each layer (60° in each of VH and VV) leaves a dead band in which stripes can no longer be detected.
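As a concrete illustration of the R → VH pooling just described, the numpy sketch below sums 1 × 4 horizontal strips of a 64 × 64 retina and thresholds the result, so that a unit fires only when several pixels along a row are bright together. The threshold value is illustrative, not the paper's.

```python
import numpy as np

def vh_layer(retina, threshold=3.0):
    """Horizontal-line layer: each unit sums a 1x4 strip of its retina row."""
    kernel = np.ones(4)
    pooled = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, retina)
    return np.where(pooled < threshold, 0.0, 1.0)  # Phi-style thresholding

retina = np.zeros((64, 64))
retina[32, 8:40] = 1.0   # a horizontal bar of light
retina[8:31, 10] = 1.0   # a vertical bar: one bright pixel per row
vh = vh_layer(retina)
print(vh[32].sum() > 0)         # True: the horizontal bar drives VH
print(vh[8:31, 10].sum() == 0)  # True: the vertical bar does not
```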
6.2. Computer vision method

Recent advances in computer vision have given rise to robust, invariant visual pattern recognition based on extracting a set of characteristic features from an image. In the proposed system, invariant recognition of blobs and stripes is carried out by applying several algorithms, as shown in the flowcharts in Fig. 10. For blob detection, the function SimpleBlobDetector from the OpenCV library is used [17]. As indicated in Fig. 10(a), the image is first converted to grayscale.


Table 3
Properties and types of the synaptic connections in Darwin VII.

Projection                    P       Arbor      cij(0)           η      ε        θ1     θ2    k1    k2
R → VB                        1.00    O 3×3      0.03, 0.03       0.0    0.0      0.0    0.0   0.0   0.0
R → VH                        1.00    [] 0×4     0.04             0.0    0.0      0.0    0.0   0.0   0.0
R → VV                        1.00    [] 4×0     0.04             0.0    0.0      0.0    0.0   0.0   0.0
VB → IT                       0.01    non-topo   0.04, 0.08       0.125  0.00125  0.10   0.25  0.45  0.45
VH, VV → IT                   0.0175  non-topo   0.04, 0.08       0.125  0.00125  0.10   0.25  0.45  0.45
IT → ITi                      1.00    Θ 2,3      0.06, −0.06      0.0    0.0      0.0    0.0   0.0   0.0
ITi → IT                      1.00    [] 1×1     −0.36, −0.50     0.0    0.0      0.0    0.0   0.0   0.0
IT → IT                       1.00    O 1        0.0006, 0.0010   0.0    0.0      0.0    0.0   0.0   0.0
IT → Mapp, Mave               0.15    non-topo   0.0006, 0.0010   0.006  0.00006  0.01   0.16  0.1   0.16
IT → Mave                     0.15    non-topo   0.0006, 0.0010   0.006  0.00006  0.005  0.08  0.1   0.48
IT → So                       0.05    non-topo   0.0006, 0.0010   0.02   0.002    0.01   0.18  0.05  0.05
Tapp, Tave → So, Mapp, Mave   1.00    O 1        0.12, 0.12       0.0    0.0      0.0    0.0   0.0   0.0
Mapp ↔ Mave                   1.00    non-topo   −0.05, −0.12     0.0    0.0      0.0    0.0   0.0   0.0
Si → S                        1.00    [] 2×2     −0.27, −0.30     0.0    0.0      0.0    0.0   0.0   0.0
So → S                        1.00    [] 2×2     0.09, 0.11       0.0    0.0      0.0    0.0   0.0   0.0
So → Si                       0.80    [] 2×2     0.05, 0.06       0.0    0.0      0.0    0.0   0.0   0.0

A connection between a presynaptic neuronal unit and a postsynaptic one is described by a probability (P) and a projection shape (Arbor). The arborization can be ''[]'', ''O'', ''Θ'', or ''non-topo'': ''[]'' denotes a rectangular shape of height and width (h × w), ''O'' denotes a circular shape of radius r, ''Θ'' denotes a donut shape with inner and outer radii r1 and r2, and ''non-topo'' denotes non-topographic arborization, where any pair of pre- and post-synaptic neuronal units has a certain probability of being connected. Connection strengths initially have random values ranging between the minimum and maximum values given for cij(0). Inhibitory connections are indicated by negative values of cij. η, ε, k1, k2, θ1 and θ2 have nonzero values for plastic connections. Plastic connections going from the layer IT to other layers are reward-dependent, while the other plastic connections (going to the layer IT) are not.

Table 4
Properties and types of the synaptic connections in the proposed system.

Projection                    P      Arbor      cij(0)           η      ε        θ1     θ2    k1    k2
IT → Mapp, Mave               0.15   non-topo   0.0006, 0.0010   0.006  0.00006  0.01   0.16  0.1   0.16
IT → Mave                     0.15   non-topo   0.0006, 0.0010   0.006  0.00006  0.005  0.08  0.1   0.48
IT → So                       0.05   non-topo   0.0006, 0.0010   0.02   0.002    0.01   0.18  0.05  0.05
Tapp, Tave → So, Mapp, Mave   1.00   O 1        0.12, 0.12       0.0    0.0      0.0    0.0   0.0   0.0
Mapp ↔ Mave                   1.00   non-topo   −0.05, −0.12     0.0    0.0      0.0    0.0   0.0   0.0
Si → S                        1.00   [] 2×2     −0.27, −0.30     0.0    0.0      0.0    0.0   0.0   0.0
So → S                        1.00   [] 2×2     0.09, 0.11       0.0    0.0      0.0    0.0   0.0   0.0
So → Si                      0.80   [] 2×2     0.05, 0.06       0.0    0.0      0.0    0.0   0.0   0.0

The notation is the same as in Table 3. Connection strengths initially have random values ranging between the minimum and maximum values given for cij(0), and inhibitory connections are indicated by negative values of cij. η, ε, k1, k2, θ1 and θ2 have nonzero values for plastic connections. All plastic connections in this system are reward-dependent.

Each pixel of the grayscale image has a value between 0 and 255: a pixel with value 0 is black, a pixel with value 255 is white, and gray levels lie in between. This grayscale image is then converted to several binary images, containing only black and white, by thresholding: pixels with a value greater than a threshold are assigned the value 1 (white), while the rest are assigned the value 0 (black). These images are formed by incrementing the threshold from a minimum to a maximum threshold value by a certain step. After that, the connected white pixels in each binary image are grouped together and the center of each blob is computed. Next, blobs in the binary images whose centers are close to each other are merged. Finally, the centers and radii of the merged blobs are computed [18,19]. Blobs of different sizes and shapes can be extracted by setting the parameters of the SimpleBlobDetector function; fine tuning of these parameters may be needed depending on the required task. In the proposed hybrid brain, the information about the centers and radii of the extracted blobs is only used for rejecting some of the extracted blobs based on their position or size.

Regarding stripe detection, as shown in Fig. 10(b), thresholding is applied to the grayscale image. Then, the function findContours is applied to identify the contours of the objects in the image [20]. After that, the function minAreaRect returns the rectangles of minimal area, including inclined ones. Lastly, the function boxPoints is used to get the corner points of the rectangles. Unlike the receptive fields in the case of Darwin VII, this method enables the detection of stripes at any inclination, which adds another advantage to the proposed system.
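The two pipelines of Fig. 10 can be sketched with the OpenCV calls named above. The snippet below follows the OpenCV 4.x Python API; the input file name and the parameter values are illustrative and would need the fine tuning mentioned above.

```python
import cv2
import numpy as np

img = cv2.imread("blocks.png")                      # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# (a) Blob extraction with SimpleBlobDetector.
params = cv2.SimpleBlobDetector_Params()
params.minThreshold, params.maxThreshold = 10, 200  # range of the binary-image sweep
params.filterByArea, params.minArea = True, 50      # reject tiny speckles
detector = cv2.SimpleBlobDetector_create(params)
keypoints = detector.detect(gray)                   # centers (kp.pt) and sizes (kp.size)

# (b) Stripe extraction: threshold, find contours, fit minimal-area rectangles.
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
stripes = []
for cnt in contours:
    rect = cv2.minAreaRect(cnt)   # minimal-area rectangle, inclined ones included
    box = cv2.boxPoints(rect)     # the four corner points
    stripes.append(np.int32(box))
```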

Another feature used to test the proposed system is the color of the blocks. Extracting this feature requires color filtering. Originally, images from the webcam are in RGB format. RGB stands for Red, Green and Blue; any color in this color space can be formed by adding these three primaries with different values. However, the RGB color model is not the ideal choice for this task, because the values of all three color channels change concurrently with changes in brightness and illumination conditions. This makes it less effective in real situations where the illumination varies continuously. A better choice is the HSV color model, shown in Fig. 11. H stands for Hue, the main component for color discrimination, as it represents the dominant wavelength of the color; hue is the most obvious characteristic of a color. There is an infinite number of possible hues: a full range of hues exists, for example, between red and yellow, with all the orange hues in the middle of that range, and similarly between any other two hues. S stands for Saturation, which represents the purity of a color: high-saturation colors look rich and full, while low-saturation colors look dull and grayish. V stands for Value, which represents the lightness or darkness of a color; it is the brightness of the color and varies with the color's saturation, ranging from 0 to 100%. When the value is 0, the color is totally black; as the value increases, the brightness increases and various colors appear. All high-saturation colors have medium values (because light and dark colors are obtained by mixing with white or black). With such a representation, a change in illumination affects only the value and saturation channels while the hue remains the same, which makes better segmentation possible.
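A minimal OpenCV sketch of this color filtering is shown below, using the green band given as an example in the text (OpenCV stores hue on a 0–179 scale, i.e. degrees halved). The input frame is a placeholder.

```python
import cv2
import numpy as np

frame = cv2.imread("webcam_frame.png")             # placeholder for a webcam frame
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

lower_green = np.array([50, 100, 100])             # H, S, V lower bounds
upper_green = np.array([80, 255, 255])             # wide S/V band tolerates illumination changes
mask = cv2.inRange(hsv, lower_green, upper_green)  # binary image: green regions only
green_only = cv2.bitwise_and(frame, frame, mask=mask)
```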


Fig. 8. Simple and complex receptive fields. (a) A neuron with an ON-center simple receptive field: the activity of this neuron increases when its center region is subjected to a bright spot, and decreases when the outer circumference is subjected to a bright spot. (b) A complex receptive field formed by combining several neurons with simple receptive fields: the formed receptive field has the highest firing rate when it is subjected to a bar of light at a certain orientation, and the rate decreases as this orientation changes.

To filter out a certain color from an image, the function inRange is used, such that only hues between lower and upper boundary values pass the filter. For example, to pick up the green color in an image, which has a hue of around 130 (out of 360), H should be set to 65° ± 15° (i.e. H is 50 for the lower boundary and 80 for the upper boundary on OpenCV's halved hue scale), while S and V can be set to range between 100 and 255 in order to accommodate changes in illumination. Thus, a binary image is obtained in which only the colors in the desired band are left. After extracting the required features, it is necessary to translate the output of the visual system into a pattern that distinguishes one object from another. The inferotemporal cortex (IT) is known to perform that task in primates [14]: the IT generates a certain pattern for each object. Thus, each feature to be extracted is given a certain number, and each number is translated into a certain pattern. The pattern selection for the IT of the proposed system was done such that the average firing rate is equivalent to that of Darwin VII in [9]. Next, the output of the IT is fed to the motor and reward systems as described in the previous sections, and hence the interface between the computer vision algorithms and the simulated nervous system is achieved.

7. Experimental verification

7.1. Setting up the neuronal system

The nervous system is built using Nengo 2.1 (Neural Engineering Objects) [21]. Nengo is a Python package for neural simulation developed at the Centre for Theoretical Neuroscience at the University of Waterloo.

Fig. 9. Receptive field formation to extract different features. (a) Extracting horizontal lines. (b) Extracting vertical lines. (c) Extracting blobs.

Nengo also allows the implementation of different algorithms from the Neural Engineering Framework (NEF). The NEF is a computational framework used for modeling the function of complex neural networks; it allows for flexible adjustment of the different parameters of neurons and synapses [22]. The core components of Nengo are networks, nodes, ensembles, and connections. A network is a container for all of the components; it can contain any number of nodes, ensembles, connections, and even other networks. An ensemble represents a group of neurons with the same parameters, while nodes can be used for any type of computation and for performing custom functions. Connections create one-way links between ensembles and nodes, or between specific neurons in different ensembles, and can even be given different learning rules to modulate the strength of these connections. Nengo is mainly intended for spiking neural network simulation, but it is possible to add new objects to the Nengo simulator, such as neurons with custom functions and synapses with custom learning rules. This allows the creation of average firing rate neuron models with custom learning rules. With all this taken into account, Nengo is an ideal package for building a robot with a hybrid brain. Using these custom neurons and synapses, the whole network for the hybrid brain was constructed.
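A minimal Nengo sketch of these core objects is given below. It uses stock rate neurons and default parameters purely for illustration; the hybrid brain replaces them with the custom average-firing-rate neurons and learning rules described above, and the layer sizes here are placeholders.

```python
import numpy as np
import nengo

model = nengo.Network(label="hybrid-brain sketch")
with model:
    visual_in = nengo.Node(lambda t: np.sin(2 * np.pi * t))  # stand-in for the IT output
    taste = nengo.Ensemble(n_neurons=36, dimensions=1,
                           neuron_type=nengo.RectifiedLinear())  # rate-based units
    motor = nengo.Ensemble(n_neurons=36, dimensions=1,
                           neuron_type=nengo.RectifiedLinear())
    nengo.Connection(visual_in, taste)        # one-way links between objects
    nengo.Connection(taste, motor)
    probe = nengo.Probe(motor, synapse=0.01)  # monitor the decoded motor activity

with nengo.Simulator(model) as sim:
    sim.run(1.0)
print(sim.data[probe][-5:])
```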


Fig. 10. Flowchart of computer vision algorithms used to extract blobs and stripes. (a) Extracting blobs. (b) Extracting stripes.

7.2. Setting up the simulation environment

The Nao robot was chosen for the experimental verification because it provides a wide set of sensors, actuators and other devices that help verify the behavior of the hybrid system, and it leaves room for future improvements. Additionally, Nao has a big community, which makes it available in various robotics simulation packages. Nao is a programmable humanoid robot from the Aldebaran Robotics company, with 25 degrees of freedom (DOF) and position sensors on each joint. Nao is equipped with various sensors that help it interact with its surrounding environment efficiently, as shown in Fig. 12. It has an inertial unit with two gyrometers and three accelerometers. It is also equipped with four force-sensitive resistors in the sole of each foot, giving Nao the ability to estimate its current state. Moreover, sonars placed on the chest measure the distance between the robot and the nearest obstacles in its environment, and bumpers on the feet detect collisions with obstacles on the ground. Additionally, Nao has two VGA CMOS cameras and four microphones. Concerning output devices, Nao offers two loudspeakers and programmable LEDs around the eyes. Nao can be controlled through the NaoQi framework, which makes it possible to use all the features of the robot: sending commands to the actuators, taking readings from the sensors, handling the Wi-Fi connection, etc. Moreover, NaoQi can execute functions in parallel, in sequence, or even driven by events.

Fig. 11. HSV color space 3D model. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Functions in NaoQi can be called from C++ or Python. Another tool for working with Nao is the Choregraphe suite. Running on a PC, it gives access to all the functions provided by NaoQi through a graphical user interface; in this research, however, it was used mainly to create a virtual robot so that the NaoQi framework could be used to control Nao in a simulated environment (see Fig. 13).

7.3. Manual reward system

To test the introduced system, a manual reward system was used first to ensure that the dynamics of the learning process work as desired. The system was first tested using a webcam and some patterns printed on A4 paper. The webcam sends 640 × 360 pixel RGB video images to the computer used for the simulation of the hybrid brain. The image is clipped before being fed to the visual system. Then, the visual system applies the algorithms described in Section 6.2.


Fig. 14. Nao robot in V-REP simulation environment.

Fig. 12. Nao robot sensors.

Fig. 15. Autonomous reward system setup.

Fig. 13. Manual reward system setup.

The input to the taste system is simulated by push buttons that trigger activity in the corresponding neuronal area: one for the appetitive taste and the other for the aversive taste. Similarly, the motor system consists of two areas: Motor Appetitive (Mapp) and Motor Aversive (Mave). (See Fig. 14.)

7.4. Autonomous reward system

After verifying the correct operation of the nervous system, an experiment was conducted to test its operation in a simulated environment. The Virtual Robotics Experimentation Platform (V-REP) is software used to simulate the physical environment, and it contains a Nao humanoid robot model. Using the Choregraphe suite, a multi-platform desktop application provided by the Aldebaran company, a virtual Nao robot can be launched on the computer. The robot can then be controlled using the PyNaoqi SDK (Software Development Kit), a set of tools and classes, with examples, that makes it easy to program almost any action that can be executed by Nao. NAOqi, as part of the PyNaoqi SDK, is the main software that runs on the robot and controls it; the NAOqi framework is the programming framework used to program NAO, and it allows for high-level control of the robot.
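The wiring of this setup can be sketched as follows, assuming a virtual Nao launched from Choregraphe on the local machine and a V-REP scene with its remote API server on the default port. The IP addresses, ports, and the scene object name "NAO" are assumptions that depend on the particular installation.

```python
from naoqi import ALProxy  # PyNaoqi SDK (Python 2)
import vrep                # legacy V-REP remote API bindings (vrep.py)

NAO_IP, NAO_PORT = "127.0.0.1", 9559  # virtual robot launched from Choregraphe

motion = ALProxy("ALMotion", NAO_IP, NAO_PORT)
posture = ALProxy("ALRobotPosture", NAO_IP, NAO_PORT)
posture.goToPosture("StandInit", 0.5)  # stand up before walking

# Connect to the V-REP scene through the remote API.
vrep.simxFinish(-1)
client_id = vrep.simxStart("127.0.0.1", 19997, True, True, 5000, 5)
err, nao_handle = vrep.simxGetObjectHandle(
    client_id, "NAO", vrep.simx_opmode_blocking)  # handle for the robot model

# Velocity-style walk command: fractions of maximum forward/lateral/turn speed.
motion.moveToward(0.5, 0.0, 0.0)
```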

PyNaoqi provides the option of programming Nao in Python, which suits the interface with the neural system built using Nengo, which is also a Python package (see Fig. 15). Basically, a handle for each link and joint of the robot is created in V-REP through the remote API. The remote API is the part of the V-REP API framework that allows a simulation (or the simulator itself) to be controlled from an external application or from remote hardware. The handles are then linked to the corresponding links and joints in the Choregraphe suite, such that commands from PyNaoqi control the robot. This allows high-level control of the robot without the need to send commands manually to each joint. Various options are available for controlling the motion of Nao using NaoQi: controlling the speed of the robot, controlling the position based on the estimated pose, controlling the speed with a joystick (or any external controller), and controlling the velocity indirectly by setting NAO's instantaneous step length and frequency [23]. Additionally, it is even possible to alter the parameters of the default walk settings. The robot approaches the blocks based on their color, where blocks of different colors have different tastes. For simplicity, activity in the taste areas is triggered when the robot is close enough to a block. The activity of the simulated neuronal areas is displayed to ensure that the learning process proceeds in the right manner.

8. Results and discussion

8.1. Visual system simulation

The visual system described in [9] was built and simulated on an Intel Core i7 device with 8 GB of RAM. The visual system contained the largest number of neurons and synapses in Darwin VII, as indicated in Tables 1 and 3, and consequently consumed most of the computational power needed to simulate the system. The time needed for each simulation cycle is ∼6500 ms.


Fig. 16. Filtering vertical and horizontal lines from images: (a) Input Image to layer R, (b) Output from layer VV , (c) Output from layer VH .

Fig. 17. Filtering blobs from images: (a) Input Image to layer R, (b) Output from layer VB .

Fig. 18. Response of VB to vertical and horizontal lines: (a) Input Image to layer R, (b) Output from layer VB .

The output of the three layers VH, VV and VB in Darwin VII was monitored to ensure that the system behaved as expected. Fig. 16 shows that when an image of vertical and horizontal lines is input to layer R, the output of layer VV contains only the vertical lines, while VH contains only the horizontal lines. Likewise, Fig. 17 shows that layer VB manages to extract blobs from an image, while it rejects both vertical and horizontal lines, as shown in Fig. 18. The output of these layers is then fed to the layer IT to achieve invariant object recognition. Using the computer vision algorithms of the proposed system, on the other hand, invariant object recognition was achieved with a simulation cycle taking only ∼150 ms. Fig. 19 shows the results of applying the algorithms to extract blobs and stripes from images at different scales and orientations, using a webcam while moving a paper with blobs and lines printed on it to different poses inside the field of view of the camera.

8.2. Manual reward system simulation

When simulating the manual reward system, the user has to input signals to trigger the taste system while simultaneously presenting an input to the webcam. The visual system successfully picks up the predefined features and generates a unique pattern corresponding to each feature. Then, the hybrid brain, with neurons and synapses connected as indicated in Tables 2 and 4, learns to link the corresponding motor action to this visual input. The panels of Fig. 20 show the activity of the simulated neural areas. Neuronal activity is represented by a heat map, where dark blue represents no activity, dark red represents the maximum activity, and intermediate values vary in between, as shown in Fig. 20(c).


The left panel shows the output of the visual system; the middle panel shows the activity of the taste system, with its left side showing the activity in Tapp and its right side the activity in Tave; and the right panel shows the activity of the motor system, with its left side showing the activity in Mapp and its right side the activity in Mave. In Fig. 20(a), the upper panel shows the activity when the green color is introduced to the system before learning occurs, the middle panel shows the activity as the appetitive-taste push button is pressed while the green color is presented to the webcam, and the lower panel shows the activity after learning has occurred for a short period of time with no push button pressed. Similarly, the panels of Fig. 20(b) show the same sequence of activity, but this time the purple color is introduced instead of the green one, and the aversive-taste push button is pressed instead of the appetitive one. It is clear from these figures that, at the beginning, activity in the two motor areas was not correlated with the activity in the visual system; after learning occurs for a sufficient amount of time (∼40 s in this system), the system exhibits a motor action in response to the visual cues alone, without any input to the taste system. Although the activity in the motor system did not reach the maximum value (dark red) in either case, the difference in activity between the two motor areas is enough to evoke a motor response. Note that a motor action is triggered when the difference in activity between the two motor areas exceeds a threshold value; this threshold was chosen to be 0.3, as in Darwin VII [9]. The time needed to achieve this task depends on the values of the parameters listed in Tables 2 and 4; these parameters are the same as those used by Darwin VII. The introduced system achieves perceptual categorization after encountering only three exemplars, whereas Darwin VII needs to encounter at least ten exemplars before the connections between the primary visual area and the inferotemporal layer are reinforced enough to achieve perceptual categorization, as indicated in [9].

8.3. Autonomous reward system simulation

After setting up the V-REP environment for testing the proposed system, Nao starts to explore the environment. In the beginning, it does not exhibit any response to visual cues: the patterns generated in the visual system after recognizing either the green or the purple color do not trigger any response in the motor area, as shown in Fig. 21, where the upper panel on the left shows the activity in the IT, the middle panel shows the activity in the taste system, and the lower panel shows the activity in the motor system. A programmed action leads the robot to approach a block to taste it. When it is close enough, the taste system becomes active and triggers either Tapp or Tave depending on the type of the block: the green blocks are programmed to trigger Tapp, while the purple blocks are programmed to trigger Tave, as shown in Figs. 22 and 24. The taste action is predefined, as it represents an innate response. The taste system then triggers the corresponding motor action by triggering either Mapp or Mave.
After the robot encounters blocks for a certain amount of time, the connections will have been modulated such that visual cues alone are enough to trigger a motor action while the robot is still far from the blocks: the green color triggers an appetitive response and the purple color triggers an aversive response, as shown in Figs. 23 and 25. These responses are triggered only when the average activity in one motor area exceeds that in the other by a threshold value of 0.3, the same value used in Darwin VII [9]. The motor activity is then input to a proportional controller that makes Nao turn towards an appetitive target, bringing it into the central field of vision and moving towards it, or turn away from the target in case it is aversive.
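This decision stage admits a compact sketch; only the 0.3 threshold comes from the text, while the proportional gain and the bearing signal are hypothetical assumptions.

```python
import numpy as np

# Sketch of the motor decision stage, assuming the activities of the
# appetitive (m_app) and aversive (m_ave) motor areas are available as
# arrays of neuronal-unit activities in [0, 1].
THRESHOLD = 0.3  # same value as in Darwin VII [9]
K_P = 0.5        # hypothetical proportional gain for turning

def motor_command(m_app, m_ave, target_bearing):
    """Return a turn rate for Nao, or None when neither motor area
    dominates the other by more than the threshold.

    target_bearing: angle (rad) of the target relative to the camera's
    optical axis, positive to the left (assumed convention).
    """
    diff = np.mean(m_app) - np.mean(m_ave)
    if abs(diff) < THRESHOLD:
        return None              # no motor action is evoked
    if diff > 0:
        # Appetitive: steer so the target moves to the center of view.
        return K_P * target_bearing
    # Aversive: steer in the opposite sense, away from the target.
    return -K_P * target_bearing
```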

Fig. 19. Invariant object-recognition of blobs and stripes from different perspectives. (a) The images on the left show the camera images after conversion into grayscale, while the images on the right show the blobs detected after applying the blob detection algorithm. (b) The images on the left are the camera images after thresholding, the images in the middle show the detected stripes with highlighted contours, and the images on the right show the stripe contours drawn on a black background.
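For reference, the two extraction pipelines of Fig. 19 can be approximated with standard OpenCV calls [18–20]; the input file name, threshold level, and detector parameters below are illustrative placeholders, not the values used in the experiments.

```python
import cv2
import numpy as np

# Rough sketch of the two feature extractors of Fig. 19. The file
# name, minArea, and threshold value are illustrative assumptions.
img = cv2.imread("cue.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# (a) Blob detection on the grayscale image.
params = cv2.SimpleBlobDetector_Params()
params.filterByArea = True
params.minArea = 50          # hypothetical minimum blob size
detector = cv2.SimpleBlobDetector_create(params)
keypoints = detector.detect(gray)
blobs = cv2.drawKeypoints(img, keypoints, np.array([]), (0, 0, 255),
                          cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# (b) Thresholding followed by contour detection for the stripes,
# using the border-following algorithm of Suzuki et al. [20]
# (OpenCV >= 4 return signature assumed for findContours).
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
stripes = np.zeros_like(gray)  # draw contours on a black background
cv2.drawContours(stripes, contours, -1, 255, thickness=2)
```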

9. Conclusions

• A new concept is investigated to maximize the benefit of both BBDs and conventional robot algorithms in order to achieve a hybrid brain for real-time operation of robots with limited computational power, suiting engineering applications.
• Each simulation cycle of Darwin VII takes about 6500 ms, compared to about 150 ms for the proposed hybrid brain on the same device to achieve the same goal. This realizes the main aim of this study by introducing a system that possesses the same intelligence as BBDs but needs much less computational time, which makes the proposed system plausible for engineering applications.


Fig. 20. The system response (a) before and (b) after the conditioning process. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 21. Nao robot encounters a block before any learning occurs. Initially, there is no motor action in response to visual cues alone. At the same time, Nao recognizes the color of the blocks and triggers a corresponding pattern for this color. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 22. Nao robot encounters a block close enough to trigger a response in taste area Tapp. Consequently, the taste system evokes a response in motor area Mapp. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


Fig. 23. Nao after learning to associate green blocks with the appetitive taste. Even without activity in Tapp, a motor response is triggered by visual cues alone. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 24. Nao robot encounters a block close enough to trigger a response in taste area Tave. Consequently, the taste system evokes a response in motor area Mave. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

• Computer vision algorithms are used to achieve invariant object recognition comparable to that achieved by the inferotemporal cortex in primates. Each extracted feature has a corresponding unique pattern. These patterns are then input to the neuronal units of the simulated nervous system.
• The introduced system achieves perceptual categorization after encountering only three exemplars, whereas Darwin VII needs to encounter at least ten exemplars before the connections between the primary visual area and the inferotemporal layer are reinforced enough to achieve perceptual categorization. Thus, the proposed hybrid brain takes much less time to achieve its task.
• Neuronal areas were built, using the Nengo python package, based on the neuroanatomical structure of the brain. These areas simulate brain regions that are crucial for the human reinforcement learning process. The recognized objects are then categorized based on the perceived cues.
• Although the introduced system adopted the hybridization of computer vision algorithms with a BBD, the concept itself can be applied with different computational algorithms to build a hybrid brain.
• The simulation of both Darwin VII (as built in this study) and the proposed hybrid brain relies on sequential processing, which is the same approach as the original Darwin VII in [9]. To conduct the simulation with a parallel processing approach, a Graphical Processing Unit (GPU) [15] or a neuromorphic device [24] can be used to suit the parallel nature of neural systems and achieve better performance; the latter even offers better biological plausibility and lower power consumption [25]. The robustness, quality, and speed of the results obtained from both approaches depend highly on the architecture of the system and the task the robot is intended to carry out. A comparison of both approaches is out of the scope of this study.


Fig. 25. Nao after learning to associate purple blocks with the aversive taste. Even without activity in Tave, a motor response corresponding to the observed color is still triggered. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

10. Future work

In this paper, only a choice between two categories is explored. However, in real-life situations many more choices are available, and the robot has to decide which one is the right one to pick. Also, a practical experiment using a real Nao robot can be carried out to verify the robustness of the designed system and its behavior in a cluttered environment.

Acknowledgment

The first author is supported by a scholarship from the Mitsubishi Co. at Egypt–Japan University of Science and Technology, which is gratefully acknowledged.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] J.L. Krichmar, H. Wagatsuma, Neuromorphic and Brain-Based Robots, Cambridge University Press, 2011.
[2] J.G. Fleischer, G.M. Edelman, Brain-based devices, IEEE Robot. Autom. Mag. 16 (3) (2009) 33–41, http://dx.doi.org/10.1109/MRA.2009.933621.
[3] J.L. McKinstry, G.M. Edelman, J.L. Krichmar, A cerebellar model for predictive motor control tested in a brain-based device, Proc. Natl. Acad. Sci. USA 103 (9) (2006) 3387–3392.
[4] J.G. Fleischer, J.A. Gally, G.M. Edelman, J.L. Krichmar, Retrospective and prospective responses arising in a modeled hippocampus during maze navigation by a brain-based device, Proc. Natl. Acad. Sci. 104 (9) (2007) 3556–3561.
[5] J.L. Krichmar, G.M. Edelman, Brain-based devices for the study of nervous systems and the development of intelligent machines, Artif. Life 11 (1–2) (2005) 63–77.
[6] G.M. Edelman, Learning in and from brain-based devices, Science 318 (5853) (2007) 1103–1105.
[7] M. Munich, P. Pirjanian, E. Di Bernardo, L. Goncalves, N. Karlsson, D. Lowe, Application of visual pattern recognition to robotics and automation, IEEE Robot. Autom. Mag. (2006) 72–77.
[8] A.K. Seth, J.L. McKinstry, G.M. Edelman, J.L. Krichmar, Visual binding through reentrant connectivity and dynamic synchronization in a brain-based device, Cerebral Cortex 14 (11) (2004) 1185–1199.

[9] J.L. Krichmar, G.M. Edelman, Machine psychology: Autonomous behavior, perceptual categorization and conditioning in a brain-based device, Cerebral Cortex 12 (8) (2002) 818–830.
[10] G. Lakoff, Cognitive models and prototype theory, Concepts: Core Read. (1999) 391–421.
[11] E. Thelen, L. Smith, A Dynamic Systems Approach to the Development of Cognition and Action, in: Bradford Book Series in Cognitive Psychology, MIT Press, 1994.
[12] O. Sporns, W.H. Alexander, Neuromodulation and plasticity in an autonomous robot, Neural Netw. 15 (4) (2002) 761–774.
[13] J. Byrne, N. Dafny, Neuroscience Online: An Electronic Textbook for the Neurosciences, Department of Neurobiology and Anatomy, The University of Texas Medical School at Houston.
[14] K. Tanaka, H.-a. Saito, Y. Fukada, M. Moriya, Coding visual images of objects in the inferotemporal cortex of the macaque monkey, J. Neurophysiol. 66 (1) (1991) 170–189.
[15] K.-S. Oh, K. Jung, GPU implementation of neural networks, Pattern Recognit. 37 (6) (2004) 1311–1314.
[16] E.L. Bienenstock, L.N. Cooper, P.W. Munro, Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex, J. Neurosci. 2 (1982) 32–48.
[17] G. Bradski, The OpenCV library, Dr. Dobb's Journal of Software Tools (2000).
[18] S. Mallick, Blob detection using OpenCV, 2015, https://www.learnopencv.com/blob-detection-using-opencv-python-c/.
[19] A. Kaehler, G. Bradski, Learning OpenCV 3: Computer Vision in C++ with the OpenCV Library, O'Reilly Media, 2016, https://books.google.com.eg/books?id=SKy3DQAAQBAJ.
[20] S. Suzuki, et al., Topological structural analysis of digitized binary images by border following, Comput. Vis. Graph. Image Process. 30 (1) (1985) 32–46.
[21] T. Bekolay, J. Bergstra, E. Hunsberger, T. DeWolf, T.C. Stewart, D. Rasmussen, X. Choo, A. Voelker, C. Eliasmith, Nengo: A Python tool for building large-scale functional brain models, Front. Neuroinformatics 7 (2014) 48.
[22] C. Eliasmith, How to Build a Brain: A Neural Architecture for Biological Cognition, Oxford University Press, 2013.
[23] D. Gouaillier, C. Collette, C. Kilner, Omni-directional closed-loop walk for NAO, in: 2010 10th IEEE-RAS International Conference on Humanoid Robots (Humanoids), IEEE, 2010, pp. 448–454.
[24] G. Indiveri, B. Linares-Barranco, T.J. Hamilton, A. Van Schaik, R. Etienne-Cummings, T. Delbruck, S.-C. Liu, P. Dudek, P. Häfliger, S. Renaud, et al., Neuromorphic silicon neuron circuits, Front. Neurosci. 5 (2011) 73.
[25] I. Sugiarto, G. Liu, S. Davidson, L.A. Plana, S.B. Furber, High performance computing on SpiNNaker neuromorphic platform: A case study for energy efficient image processing, in: 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC), IEEE, 2016, pp. 1–8.


Omar Zahra received the B.Sc. degree in Electromechanical Engineering from Alexandria University in 2014 and the M.Sc. degree in Mechatronics and Robotics Engineering from Egypt–Japan University of Science and Technology, under a grant from Mitsubishi Co., in 2017. He is currently enrolled in the Ph.D. program at The Hong Kong Polytechnic University. His current research interests include robotic manipulation and embodied cognition.

Mohamed Fanni received the B.E. and M.Sc. degrees in mechanical engineering from the Faculties of Engineering of Cairo University and Mansoura University, Egypt, in 1981 and 1986, respectively, and the Ph.D. degree in engineering from Karlsruhe University, Germany, in 1993. He is a Professor with the Dept. of Mechatronics and Robotics Engineering, Egypt–Japan University of Science and Technology (E-JUST), Alexandria, on leave from the Production Engineering and Mechanical Design Department, Faculty of Engineering, Mansoura University, Egypt. His major research interests include robotics

engineering, automatic control, and mechanical design. His current research focuses on the design and control of mechatronic systems, surgical manipulators, industrial robots, and flying/walking robots.

Abdelfatah Mohamed received the Ph.D. degree from the University of Maryland, College Park, USA, in 1990. Since 1990 he has been with the Dept. of Electrical Engineering, Assiut University, Egypt, first as an Assistant Professor; he became an Associate Professor in 1995 and a Professor in 2000. From September 1990 to August 1993, he was a Postdoctoral Fellow at the Dept. of Mechanical Engineering, University of Texas, Austin, USA, and from April 1996 to April 1997 he was a visiting Professor at the Dept. of Electrical Engineering, Kanazawa University, Japan. From September 2010 to March 2012 he was the Head of the Dept. of Electrical Engineering, Assiut University, and he became the Dean of the Faculty of Engineering, Assiut University, in March 2012. Currently he is the Head of the Dept. of Mechatronics and Robotics Engineering, Egypt–Japan University of Science & Technology. His research interests lie in robust and intelligent control, magnetic bearing systems, robotics, and industrial drives. Dr. Mohamed is a senior IEEE member.