Biologically Inspired Cognitive Architectures (2013) 6, 109–125. http://dx.doi.org/10.1016/j.bica.2013.07.009
RESEARCH ARTICLE
Emotional biologically inspired cognitive architecture

Alexei V. Samsonovich

Krasnow Institute for Advanced Study, George Mason University, 4400 University Drive, MS 2A1, Fairfax, VA 22030-4444, USA

Received 4 June 2013; received in revised form 29 July 2013; accepted 29 July 2013
KEYWORDS: Emotional cognition; Human-level artificial intelligence; Cognitive architectures; Social emotions; Affective computing; Human-computer interface
Abstract

Human-like artificial emotional intelligence is vital for the integration of future robots into human society. This work introduces a general framework for the representation and processing of emotional content in a cognitive architecture, called "emotional biologically inspired cognitive architecture" (eBICA). Unlike previous attempts, this framework adds emotional elements to virtually all cognitive representations and processes by modifying the main building blocks of the prototype architectures. The key elements are appraisals, associated as attributes with schemas and mental states; moral schemas, which control patterns of appraisals and represent social emotions; and semantic spaces, which give values to appraisals. The proposed principles are tested in an experiment involving human subjects and virtual agents, based on a simple paradigm set in an imaginary virtual world. It is shown that with moral schemas, but probably not without them, eBICA can account for human behavior in the selected paradigm. The model sheds light on the clustering of social emotions and allows for their elegant mathematical description. The new framework will be suitable for the implementation of believable emotional intelligence in artifacts, necessary for emotionally informed behavior, collaboration of virtual partners with humans, and self-regulated learning of virtual agents.
© 2013 Elsevier B.V. All rights reserved.
Introduction

It was pointed out recently (Samsonovich, 2012c) that the bottleneck of the long-awaited big breakthrough in artificial intelligence could be in human acceptance of virtual partners as potentially equal minds.
This step would require future intelligent agents and co-robots to be able to develop mutual understanding and personal relationships with human partners at the human level (Buchsbaum, Blumberg, & Breazeal, 2004; Parisi & Petrosino, 2010). From this perspective, believable, human-level, emotionally intelligent virtual agent-partners can be expected to become a goal of artificial intelligence research in the near future.
It is argued here that this practical goal, although very ambitious, is achievable in the near future, at least in a limited set of domains and paradigms, and that it can be achieved based on a cognitive architecture inspired by the human mind: a simple computational model capturing the essence of human emotional intelligence.
The scientific challenge motivating this work is to reconcile and unify disparate, mostly controversial theoretical views of the cognitive aspect of emotions within one universal, parsimonious framework. Many limited attacks on this challenge were undertaken in the past, most recently in works published in this journal (e.g., Sellers, 2013; Treur, 2013). Guided by the ambitious overarching goal, the present work is likewise limited to one small step: the development and preliminary assessment of selected aspects of a parsimonious general framework of the above kind, formulated in the form of a biologically inspired cognitive architecture, or BICA. The notion of a cognitive architecture (e.g., Gray, 2007; Langley, Laird, & Rogers, 2009) is understood here as a framework specifying a set of principles, templates and constraints for designing intelligent agents, rather than as an implementation of a big system or a programming language, as in the case of Soar (Laird, 2012; Laird, Newell, & Rosenbloom, 1987) and ACT-R (Anderson & Lebiere, 1998). This framework is supposed to be general and not limited to any specific kind of environment, embedding, cognitive task or paradigm, while its implementations may be relatively small, case-specific, and limited in many ways. A motivation for the approach taken here is the desire to learn, step by step, the main principles that will eventually allow us to create a "cognitive embryo": a "critical mass" of a human-level learner and an equivalent of the human mind that will initially be demonstrated in limited settings. If this happens, emotionally intelligent capabilities can arguably be expected to be vital for success. The focus of this work is therefore on principles enabling human-like emotional cognition in an agent, and on their assessment in a simplistic paradigm. In the literature, "emotional cognition" may refer to cognitive processes affected by emotions, or to processes involving understanding and awareness of emotions in self and others.
Unfortunately, modern attempts to add emotionally intelligent capabilities to cognitive architectures are limited both functionally (e.g., in Soar the role of emotion is limited to reward generation in reinforcement learning: Laird, 2012) and architecturally. In many cases, e.g., in the aforementioned example, in recent extensions of ACT-R (Dancy, Ritter, & Berry, 2012), and in related architectures (Marsella & Gratch, 2009), an emotion module is added as an auxiliary appendix to the existing architecture, and the simulated phenomena are limited to global affective or physiological biases of the system as a whole. Previous modeling attempts to describe the development of social emotional relationships among agents are limited to non-BICA approaches. It appears that a new approach in the field of BICA is required to overcome existing limitations.
The framework presented here is free of the above limitations and offers an elegant way to introduce emotional elements at the core of virtually all basic cognitive processes in BICA. Specifically, this framework (i) makes the description and processing of emotions in a cognitive system local, (ii) makes certain known clusterings of emotions natural, and (iii) adds emotional elements as intrinsic components to virtually every cognitive representation. This latter feature is arguably inspired by human cognition. Indeed, human cognition and emotions are intimately mixed from birth and develop inseparably (Phelps, 2006).
Background on the state of the art and the problem

Recent years have seen an explosion of research on computational models of emotions and affective computing (Hudlicka, 2011; Picard, 1997). Yet, the only existing consensus is limited to the basic affective space model that has been around for a while (Heise, 2007; Osgood, Suci, & Tannenbaum, 1957; Plutchik, 1980; Russell, 1980) and is known by different names, as briefly overviewed below. At the same time, a complete theoretical and computational understanding of the cognitive aspect of emotions is missing. In general, the notion of subjective emotional feeling is problematic in modern science: the problems trace back to the general problems associated with the notion of consciousness (Chalmers, 1996). At the same time, there is no need to address the "hard problem" before describing the cognitive-psychological and functional aspects of emotions mathematically and using this description as a basis to replicate the same observable features in artifacts.
While most modern studies of emotions do not extend beyond phenomenology, a number of views and theories have been proposed that attempt to relate emotions to first principles and/or to experimental data in neurophysiology, psychology, psychiatry, sociology, the theory of evolution, information theory, control theory, and beyond. Neurophysiologically, emotional reactions and values are supported by a distributed network of brain structures, including the amygdala, nucleus accumbens, anterior cingulate, paracingulate and orbitofrontal cortices, the striatum, hypothalamus, ventral tegmental area, and the insula, and are mediated by major neurotransmitters, including dopamine, serotonin, acetylcholine and norepinephrine. For example, dopamine release in the nucleus accumbens results in a feeling of pleasure and is responsible for the development of emotional memories, e.g., those leading to drug addiction (Kringelbach, 2009; Olds, 1956; Wise, 1978). Neurophysiological constraints like these cannot be ignored in the construction of models of emotional intelligence. They help us to understand the basis of emotion; yet, they alone are not sufficient for understanding emotional intelligence. Here we focus on its cognitive-psychological rather than physiological aspect.
Psychological and computational models of emotions attempt to reduce a large variety of emotion-related concepts, such as affects, appraisals, feelings, moods, traits, attitudes, intentions, motivations, and preferences, to a few universal constructs (the term "emotion" is used here as a generalizing reference to the above factors and phenomena). The main kinds of approaches to the phenomenological description of emotions (Hudlicka, 2011) include (i) taxonomies, or discrete models; (ii) dimensional models, examples of which are the semantic differential model (Osgood et al., 1957) with its variations known by different names, e.g., EPA: evaluation-potency-arousal, PAD: pleasure-arousal-dominance, Circumplex (Russell, 1980), etc.;
and (iii) cognitive component models, or appraisal models, the most popular of which remains the OCC model (Ortony, Clore, & Collins, 1988), partially due to its suitability for computer implementations (e.g., Steunebrink, Dastani, & Meyer, 2007). The idea of OCC is to describe phenomenologically all specific cases and circumstances under which emotions occur, and also how emotion intensities change over time in those cases. However, this phenomenological description, given in a form compatible with Markov decision trees, does not provide an understanding of the underlying cognitive mechanisms. Many alternative computational models of emotions are conceptually similar to OCC (e.g., Castelfranchi & Miceli, 2009) and were used for modeling relationship development in social networks (e.g., trust relations: Sabater, Paolucci, & Conte, 2006). The topic of recent emotional extensions of mainstream cognitive architectures is continued in the Discussion.
The semantics of basic emotions can be efficiently characterized by a small number of universal semantic dimensions, such as valence, dominance, arousal, and surprise. The number and specific set of these features depend on the choice of a dimensional theory, such as EPA or PAD. Across most dimensional models, there is a consensus on the main three dimensions, which are frequently labeled valence, arousal and dominance. Together they form the core of the "semantic map" of emotions. However, the precise definitions and mutual independence of these factors remain questionable (Samsonovich & Ascoli, 2013). Adding more dimensions allows one to build more accurate representations, sliding toward the level of OCC-like models, while merging dimensions with each other allows one to build simpler and more general models.
General problems with this state of the art are easy to point out; here are only a few examples. (i) Typically in affective modeling, emotional and cognitive components are added to a cognitive system separately. Moreover, emotion is contrasted with cognition. Frequently, their integration appears to be challenging. (ii) At the level of dimensional models, it may be hard to define a boundary between basic emotions like fear or anger and complex social emotions like jealousy or a sense of humor. Adding new dimensions to the map may not help with discriminating them, or with understanding the nature of the difference. It seems that complex, or social, emotions must be distinct in principle from basic emotions: not in their EPA values, but because they involve an element that is not present in basic emotions, an element which makes them "complex" or "social". For instance, this could be a social cognition (Theory-of-Mind) element. (iii) Finally, the science of complex social emotions in general, including their computational models in particular, is still in its infancy and does not extend beyond phenomenology. To build a unifying cognitive theory of emotions, we need a new framework.
Theory: the architecture

A candidate for a basic framework for understanding emotional cognition is introduced here and is called "emotional biologically inspired cognitive architecture" (eBICA). The architecture eBICA builds essentially on two prototypes: GMU BICA (Samsonovich & De Jong, 2005) and its variation called Constructor (Samsonovich, 2009).
The original GMU BICA is now eight years old. It has had a number of implementations in Python, Lisp, Matlab and Java, and was used in numerous computational experiments, with paradigms ranging from spatial navigation to problem solving and metacognition. Its applications were proposed, e.g., for tutoring systems (Samsonovich, De Jong, & Kitsantas, 2009).
The architecture eBICA is decomposed below in a top-down order. At a bird's-eye-view level, the architecture consists of seven components (Fig. 1): the interface buffer; the working, procedural, semantic and episodic memory systems; the value system; and the cognitive map system. The three main building blocks for these components are mental states, schemas and semantic maps.

Fig. 1 A bird's eye view of eBICA. The architecture includes seven components. Semantic memory is populated by schemas, the universal units of symbolic representation in eBICA. Bound instances of schemas enter from the interface buffer into mental states (I-Now, I-Next, etc.) in working and episodic memories. The cognitive map system may include spatial maps and semantic maps (semantic spaces) that give values to appraisals. The value system includes drives (red cones) that are linked to value measures and to moral schemas in semantic memory. Procedural memory includes specialized primitives. The interface buffer mediates interactions with the external world.

Semantic memory is a collection of schema definitions. The interface buffer is populated by schemas. Working memory includes active mental states populated by schemas. Episodic memory stores inactive mental states clustered into episodes, i.e., previous contents of working memory. Therefore, episodic memory consists of structures similar to those found in working memory that are "frozen" in long-term memory. Procedural memory includes primitives. The value system includes drives and scales representing primary values. The cognitive map system includes, in particular, semantic maps of emotional values. A semantic map uses an abstract metric space (a semantic space) to represent semantic relations among mental states, schemas and their instances, and to assign values to their appraisals.
The core framework of eBICA can be summarized by the following set of tuples (see also Fig. 4; some elements in the tuples can be empty):

semantic map = (semantic space, mental states and schemas, their appraisals)
mental state = (attributes, schemas)
schema = (attributes, nodes, links)
node = (attributes, reference to schema)
attributes = (category, perspective, attitude, bindings, projections, ...)

This top-level description is unfolded below at a more detailed level.
Components of eBICA can be roughly mapped onto human brain structures as follows. Working memory: activity in ventrolateral prefrontal cortex (VLPFC, BA 45), dorsolateral prefrontal cortex (DLPFC, BA 46), dorsal parietal (BA 9, 46), dorsal frontal (BA 10). Working memory, figural reasoning: dorsal parietal cortex (BA 7), ventral frontal parietal cortex (BA 40). Cognitive map for episodic and spatial reference memory: hippocampus, parahippocampal gyrus (entorhinal, perirhinal and retrosplenial cortices), lateral parietal cortex, nuclei of the diencephalon. Cognitive map for the utility matrix in decision making: lateral intraparietal cortex (LIP). Value system, absolute value: orbitofrontal cortex; relative reward value: striatum. Value system, emotional network: anterior cingulate cortex, amygdala, orbitofrontal cortex, hypothalamus, insula, and more. Episodic memory: synaptic weights in/between the hippocampus and extrastriate neocortices. Semantic memory: medial temporal lobe; parahippocampal, prefrontal and parietal cortices. Interface buffer, output and imagery: premotor and motor cortices, cerebellum. Interface buffer, input and imagery: primary sensory cortices and related structures of their thalamocortical loops. Procedural memory: specialized neocortical areas, including visual, auditory, language (Broca, Wernicke), motor and premotor cortices and related structures of their thalamocortical loops, plus the cerebellum.
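To make the tuple notation above concrete, the following is a minimal sketch of the core data structures in Python. All class and field names here are illustrative choices, not code from any eBICA implementation:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# An appraisal value is a point on the 2-D weak semantic map
# (valence, dominance-arousal), conveniently stored as a complex number.
Appraisal = complex

@dataclass
class Attributes:
    """The standard attribute set shared by nodes, schemas and mental
    states; most values typically remain empty (None)."""
    category: Optional[str] = None
    perspective: Optional[str] = None      # e.g. 'I-Now', 'She-Imagined'
    attitude: Optional[str] = None
    appraisal: Optional[Appraisal] = None
    bindings: list = field(default_factory=list)
    projections: list = field(default_factory=list)

@dataclass
class Node:
    attributes: Attributes
    schema: Optional['Schema'] = None      # each node references a schema

@dataclass
class Schema:
    attributes: Attributes
    nodes: List[Node] = field(default_factory=list)
    links: List[Tuple[int, int]] = field(default_factory=list)  # via bindings

@dataclass
class MentalState:
    attributes: Attributes                 # includes the perspective label
    schemas: List[Schema] = field(default_factory=list)  # bound instances
```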
Mental states

The mental state formalism (Samsonovich & De Jong, 2005; Samsonovich et al., 2009) largely separates eBICA and its prototypes from other cognitive architectures. It extends and formalizes a simulationist view of Theory of Mind (Goldman, 2006; Nichols & Stich, 2003; Samsonovich & Nadel, 2005). Related yet different frameworks were independently described in the literature (e.g., McCarthy, 1993; Scally, Cassimatis, & Uchida, 2012).
The idea comes from the observation that cognitive representations in human working memory are usually attributed to some instance of a self of some agent. Therefore, the minimal self (Strawson, 1999, 2011) in this framework is associated with a viewpoint specifying the identity of the subject, the time, the place, etc. According to this attribution, all higher-level symbolic representations in working memory can be partitioned into "boxes" labeled in an intuitive, self-explanatory manner: "I-Now", "I-Previous", "He-Now", and so on, according to the represented mental perspectives. These "boxes" represent mental states.
A mental state in eBICA has two parts: (i) content and (ii) a standard set of attributes, the most relevant of which here is called "perspective": it defines the precise mental perspective of the subject to whom the content of awareness is attributed, and is partially captured by the mental state label, such as "I-Now", "I-Next", and "She-Imagined". Mental state labels are used to refer to specific mental states. The content of a mental state is the set of bound instances of schemas representing the immediate awareness of a subject. Only one mental state (I-Now) represents the immediate awareness of the actual agent; any other mental state represents the immediate awareness of a virtual subject. Only I-Now has control of the actions of the agent, can make decisions that determine the agent's behavior, and can generate emotional and physiological states of the agent. Other mental states, while conceptually not different from I-Now, are merely simulations, of which the agent is aware if they are referenced in I-Now.
Mental states switch their perspectives with time: I-Next becomes I-Now, I-Now becomes I-Previous, and I-Previous becomes one of I-Past, is deactivated, and is stored in episodic memory. Mental state dynamics are constrained by self-axioms (Samsonovich et al., 2009). Episodic memory is a collection of inactive mental states clustered into episodes, i.e., previous states of working memory. These mental states, when they become relevant, can be activated and retrieved back into working memory with an "I-Past" label.
Mental states inherit content from each other. Contents of co-active mental states in working memory therefore frequently represent identical or related information and need to be synchronized with each other. This is done using projections: links that connect schemas inside the mental state to other mental states, schemas in other mental states, or schemas elsewhere, and specify the rules of synchronization.
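A minimal sketch of this perspective rotation, using the illustrative classes introduced above (the labels follow the text; the update logic is a paraphrase, not published eBICA code):

```python
def advance_perspectives(working_memory: list, episodic_memory: list) -> None:
    """One time step of mental state dynamics:
    I-Next -> I-Now -> I-Previous -> I-Past (deactivated and archived)."""
    for state in list(working_memory):          # iterate over a snapshot
        p = state.attributes.perspective
        if p == 'I-Previous':
            state.attributes.perspective = 'I-Past'
            working_memory.remove(state)        # deactivate...
            episodic_memory.append(state)       # ...and store as an episode
        elif p == 'I-Now':
            state.attributes.perspective = 'I-Previous'
        elif p == 'I-Next':
            state.attributes.perspective = 'I-Now'
```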
Schemas

The term "schema" (plural "schemas" or "schemata") is highly overloaded in the literature, and so is the term "mental state". In the context of eBICA, both terms have specific meanings. A schema (Samsonovich, 2006; Samsonovich, Ascoli, De Jong, & Coletti, 2006; Samsonovich & De Jong, 2005) is understood here as the universal form of all symbolic representations in eBICA. E.g., it is used to represent any concept, percept or category, including entities, properties, events, relations, etc., plus possibly non-conceptual knowledge or experiences, e.g., feelings. The notion of a schema generalizes and replaces the notions of a production, a chunk, an operator, a frame, etc. used in other cognitive architectures, and is analogous to the notion of an object in programming. Another analogy is a Lisp function. Formally speaking, mental states can also be viewed as a special kind of schema, but are treated separately. Instead, there is an agent schema including a token representing an instance of an agent (e.g., "me-now") linked to the associated mental state. This token is used to reference one mental state from within another. These references and interactions of mental states naturally enable metacognition and social reasoning.
A schema is defined by a template stored in semantic memory that can be instantiated multiple times in working and episodic memory and in the interface buffer. A schema is representable as a (hyper)graph consisting of nodes and links connecting them. It can also be represented as a table, in which the first row (the header) is a string of terminal nodes, including the head and terminals. The rest (the body) consists of internal nodes organized into rows of the table, each of which corresponds to a header of some other schema or a primitive.
Primitives are stored in procedural memory. A primitive is a specialized object, like an external function, that has a standardized schema-compatible interface and an unrestricted implementation (a "black box"). It can bind to nodes in a schema and perform a certain function: e.g., compute an arithmetic operation, cause a certain behavior of the agent, convert an analog signal into symbolic form, etc.
Schemas can activate, instantiate, bind, execute, terminate, create or modify other schemas, etc.
Regarding the question of the mapping of schemas to the brain: arguably, active schemas can be identified with specific, distributed spatio-temporal patterns of neuronal activity. Stored schemas correspond to the same patterns memorized in the network by modification of synaptic connections. In general, the process of identifying the semantics of neuronal activity is not straightforward. E.g., one may not expect to find a direct correspondence between nodes of a schema and groups of neurons in the brain.
Schema nodes are stored in semantic memory as parts of schemas. A minimal schema is just one head node. Not only in this special case, but in general, each node represents a schema in which it is the head, so there is a one-to-one relation between nodes and schemas. Each node (and therefore each schema) represents a category (if this notion is understood broadly); this category gives a name to the schema. Nodes and schemas in semantic memory are organized into a global semantic net that is useful, e.g., for the search and activation of relevant schemas. Links connecting nodes within a schema are defined by bindings (bindings are one of the attributes of a node).
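As an illustration of the tabular form of a schema and of primitives (the 'sum' schema and 'add' primitive below are invented for this example; they are not taken from eBICA):

```python
from typing import Callable

class Primitive:
    """A 'black box' stored in procedural memory: a standardized,
    schema-compatible interface around an unrestricted implementation."""
    def __init__(self, name: str, fn: Callable):
        self.name = name
        self.fn = fn

    def execute(self, *bound_values):
        return self.fn(*bound_values)

# A primitive computing an arithmetic operation:
add = Primitive('add', lambda x, y: x + y)

# A schema as a table: the header row holds the head and terminal nodes;
# each body row corresponds to the header of another schema or a primitive.
sum_schema = {
    'header': ('sum', 'x', 'y'),        # head node + terminals
    'body': [('add', 'x', 'y')],        # internal row bound to a primitive
}

print(add.execute(2, 3))                # -> 5
```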
Attributes

In many cognitive architectures, the notion of an attribute has a usage similar to that in object-oriented programming in general: objects have attributes that have values, and the set of attributes and their possible values is object-specific. E.g., a water jug has a certain capacity and the amount of water inside as its attributes; an apple has color, weight, taste, etc. In the framework of eBICA and its prototypes, the situation is different (Samsonovich, 2009): traditional attributes, like color, taste, capacity, etc., that are object-specific are represented by nodes of the schema and by other schemas. At the same time, each node (and each schema and each mental state) has a standard set of approximately 20 attributes, which is one and the same set for all categories, concepts and percepts. From a traditional object-oriented programming point of view, they would be "meta-attributes". Their list includes the following: category, name, tag, perspective, attitude, appraisal, quantifier, mode of binding, method of binding, bindings, projections, references, status, stage of processing, value, activation, attention, fitness, etc. Not all attributes are used in each case; typically most of their values remain empty. Not all attribute names and roles can be discussed here in detail.
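Collected into a single template, the standard attribute set listed above could look as follows (the dict representation and the use of None for empty values are illustrative):

```python
# One and the same attribute template for every node, schema and mental
# state; object-specific properties (color, capacity, ...) are instead
# represented by other schemas.
STANDARD_ATTRIBUTES = {
    'category': None, 'name': None, 'tag': None, 'perspective': None,
    'attitude': None, 'appraisal': None, 'quantifier': None,
    'mode_of_binding': None, 'method_of_binding': None, 'bindings': None,
    'projections': None, 'references': None, 'status': None,
    'stage_of_processing': None, 'value': None, 'activation': None,
    'attention': None, 'fitness': None,
}
```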
Dynamics

The dynamics of eBICA in general develop as briefly outlined below. At the core is the standard cognitive cycle: perceive, understand, generate ideas of possible actions, select an intention consistent with the working scenario, commit the intended action, check the outcome against expectation, and resolve surprises, if any.
These steps are realized via propagation of input schemas from the interface buffer into active mental states, associative activation of other schemas, their mapping onto the content of I-Now and other mental states, binding and execution of schemas, and so on. Thus, new representations in working memory are generated by matching, binding and processing of schemas. Schemas are invoked and processed in parallel; in general this process is not deterministic. The working scenario is understood as the main sequence of mental states of the agent extended into the future: e.g., a sequence connecting I-Now and I-Goal in the case of goal-directed behavior.
For the present purposes we shall stay at a high level of abstraction, ignoring details of the contents of mental states as much as possible, and will focus on the relations of mental states to each other. Of interest here are appraisals, which represent the subject's feelings about self and others, as well as about their actions, related objects, relations, and others' feelings. The term "appraisal" here stands for an attribute of a schema, similar to "attitude" (Samsonovich, 2006; Samsonovich et al., 2006). Mathematically, the value of an appraisal is given by a vector on a semantic map (also called a semantic space, or semantic cognitive map).
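The cognitive cycle can be summarized in code as follows. This is a schematic sketch; the method names stand in for the schema-processing machinery described above and are not an actual API:

```python
def cognitive_cycle(agent):
    """One pass of the standard cycle: perceive, understand, generate
    options, select an intention, act, check the outcome, resolve surprises."""
    percepts = agent.interface_buffer.read()             # perceive
    agent.i_now.bind(percepts)                           # understand
    options = agent.activate_relevant_schemas()          # generate ideas
    intention = agent.select_intention(options)          # match to scenario
    outcome = agent.commit(intention)                    # act
    if outcome != agent.expected_outcome(intention):     # check expectation
        agent.resolve_surprise(outcome)                  # resolve surprises
```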
Weak semantic cognitive mapping

At the core of eBICA is the cognitive map component, which is used to organize memories in an abstract space: this could be a map of physical space or an abstract semantic space model. The model relevant here is emotional space (also called affective space), representing the values and flavors of feelings. Many models of this sort, developed over centuries, nicely converge to one generalizing framework that we call here the weak semantic map. The term was coined by Samsonovich, Goldin, and Ascoli (2010), while the process of "weak semantic cognitive mapping" was described earlier (Samsonovich, 2006; Samsonovich & Ascoli, 2007, 2010).
In general, the idea of semantic cognitive mapping is to allocate representations (e.g., words) in an abstract space based on their semantics.
This paradigm is common to a large number of techniques, from latent semantic analysis (LSA: Landauer, McNamara, Dennis, & Kintsch, 2007) to Circumplex models (Russell, 1980). Traditionally, the metric that determines the allocation of symbols in space is a function of their semantic dissimilarity (called the dissimilarity metric). In contrast, the idea of weak semantic cognitive mapping is not to separate all different meanings from each other, but to arrange them based on a very few principal semantic dimensions. These dimensions can emerge automatically, if the strategy is to pull synonyms together and antonyms apart (Samsonovich, 2006; Samsonovich & Ascoli, 2010). The map, a part of which is shown in Fig. 2, is a result of this process. It includes 15,783 words and was constructed based on the dictionary of English synonyms and antonyms available as a part of Microsoft Word (Samsonovich & Ascoli, 2010); a similar map was also constructed using WordNet in the same work. This map does not separate different meanings from each other well: e.g., basic and complex feelings. However, it classifies meanings consistently with their semantics.
Fig. 2 represents the maximal projection defined by the first two principal components (PCs) of the distribution. The axes of the map are defined by the PCs. In a very approximate sense, PC1 (the horizontal dimension in Fig. 2, coded by the green-magenta gradient) can be associated with valence, positivity, pleasure, attractiveness, rationality, and similar notions, while PC2 combines the notions of dominance, arousal, potency, strength, speed, etc. (the vertical dimension in Fig. 2, coded by red-cyan). The standard affective space, such as Osgood's semantic differential, PAD, and EPA, is three-dimensional, where the semantics of the main dimensions are typically interpreted as pleasure or valence, arousal, and dominance. In our case, however, arousal and dominance are mixed. This is not a unique example: e.g., the ANEW database upon analysis shows a strong correlation between two out of its three dimensions (Samsonovich & Ascoli, 2010, 2013).
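The "synonyms together, antonyms apart" idea can be sketched as a toy optimization (for illustration only; the published procedure of Samsonovich and Ascoli (2010) uses a different objective and the full thesaurus):

```python
import numpy as np

def weak_semantic_map(n_words, synonyms, antonyms, dim=2,
                      steps=2000, lr=0.05, seed=0):
    """Toy 'weak semantic mapping': pull synonym pairs together, push
    antonym pairs apart, then rotate the result to principal components."""
    rng = np.random.default_rng(seed)
    x = rng.normal(scale=0.01, size=(n_words, dim))
    for _ in range(steps):
        dx = np.zeros_like(x)
        for i, j in synonyms:                  # linear attraction
            dx[i] += x[j] - x[i]
            dx[j] += x[i] - x[j]
        for i, j in antonyms:                  # bounded (unit) repulsion
            d = x[i] - x[j]
            u = d / (np.linalg.norm(d) + 1e-9)
            dx[i] += u
            dx[j] -= u
        x += lr * dx
    x -= x.mean(axis=0)                        # center, then rotate to PCs
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt.T                            # column 0: PC1, column 1: PC2

# Tiny example: two synonym clusters that are antonyms of each other.
coords = weak_semantic_map(4, synonyms=[(0, 1), (2, 3)],
                           antonyms=[(0, 2), (1, 3)])
print(coords)
```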
Fig. 2 A sample from the weak semantic cognitive map described by Samsonovich and Ascoli (2010). The "synesthetic" color enhancement follows the scheme of Plutchik (1980). Colors represent: green, PC1 (pleasure, valence); red, PC2 (dominance, arousal). The sum of the RGB values is fixed.

Emotional elements in eBICA

In addition to the weak semantic map of affective space, emotional cognitive elements are included in the eBICA framework in the form of appraisals: i.e., cognitive evaluations of the emotional value of a mental state, an agent, an action, a relationship, etc.
Fig. 3 Examples of simplest emotional elements in the framework of eBICA. (A) A schema (could be a schema of an action, an agent, or virtually any schema) has an appraisal as its attribute. It is also an attribute of the head node. The value of this attribute is ‘‘dominant’’, meaning that the action is perceived as a manifestation of dominance, or the agent is perceived as ‘‘dominant with respect to me’’, etc. (B) A mental state has the appraisal attribute, which is an emotional state and a self-appraisal of the agent at the given moment, in the given mental perspective. The shown value of this appraisal is ‘‘excited’’, meaning that the agent is in an excited emotional state. The moral schema shown in B binds to a part of the content of the mental state (including a certain pattern of appraisals) and represents an appraisal of the selected pattern, e.g., a pattern of interactions and mutual appraisals of two agents referenced in the mental state.
Thus, there are three categories of emotional elements in eBICA that are new with respect to the prototype architectures:

(1) The semantic map of affective space.
(2) An appraisal, one of the standard set of attributes of mental states and schemas (when an attribute of a mental state, an emotional state or self-appraisal; when an attribute of a schema, an appraisal; when an attribute of a moral schema, an intended appraisal).
(3) A moral schema, which represents a higher-order appraisal (i.e., an appraisal of appraisals) associated with a certain pattern of appraisal values specified as "normal" for this schema.

Fig. 4 Essential core UML class diagram of eBICA. Emotional elements (shown in red) include a new standard attribute (appraisal) and moral schemas representing higher-order appraisals. Appraisal values are shown in Fig. 2.
These classes of elements are shown in Figs. 2–4. Thus, an emotional state, or self-appraisal, of a mental state is an emotional characteristic assigned to a mental state as a whole. An appraisal of a schema is an emotional characteristic assigned to an instance of a schema. This schema may represent another agent with its mental state: then it is an appraisal of the current state of one agent by another agent. Finally, a moral schema is a special kind of schema that binds to patterns of appraisals and produces effects (e.g., biases on decisions of the agent) that are intended to change the values of those appraisals to which it is bound.
Consider an example: I can be aware of another agent A represented by a schema of A in my I-Now. This schema also serves as a link to A-Now: my mental simulation of the mind of A. My appraisal of A is attached as an attribute to the schema of A in I-Now. It tells me (the subject) how I emotionally evaluate that agent A. For example, when I honor A, or when I submit to A, my appraisal of A has a positive value of dominance. At the same time, if I take an action to yield to A, then my appraisal of the yield action in I-Now has a negative dominance (Table 1), while this is consistent with my appraisal of A as "dominant".
Moral schemas, which represent appraisals of appraisals, are the next level of complexity in this framework. These elements correspond to higher-order feelings, known as complex, or social, emotions (Parrott, 2001; cf. Castelfranchi & Miceli, 2009). Examples are analyzed below.
Table 1 Weak semantic cognitive map coordinates for the appraisals of actions (values taken from the materials of Samsonovich and Ascoli (2010); signs restored from context).

Action | PC1 (valence) | PC2 (dominance-arousal)
Hit    | −0.26 |  1.07
Yield  |  0.43 | −1.03
Greet  |  0.93 |  0.15
Ignore | −1.83 | −0.37
In summary, the new, emotional part of the eBICA framework (excluding semantic space) is captured in a UML class diagram (Fig. 4). The diagram shows ontological relations among the basic elements of eBICA. Emotional elements are shown in red. All three kinds of appraisals, namely emotional states or self-appraisals of mental states; first-order appraisals (appraisals of objects, facts, events, actions, relations, etc., as well as appraisals of agent minds and personalities); and moral schemas representing appraisals of appraisals, take values on the weak semantic cognitive map (e.g., the EPA or PAD space).
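A toy rendition of these emotional elements in code (illustrative only; the "normal" pattern here is reduced to a single prescribed dominance value, anticipating the moral schema used in the Discussion):

```python
from dataclasses import dataclass

Appraisal = complex     # valence = real part, dominance-arousal = imaginary

@dataclass
class MoralSchema:
    """A higher-order appraisal: binds to a pattern of appraisals and
    produces biases intended to drive them toward their 'normal' values."""
    normal_dominance: float = 0.0   # e.g. 'be in the middle of the hierarchy'

    def bias(self, self_appraisal: Appraisal) -> float:
        # positive -> act less dominant; negative -> act more dominant
        return self_appraisal.imag - self.normal_dominance

# (1) a first-order appraisal attached to a schema instance of agent A:
appraisal_of_A = complex(0.3, 0.5)          # liked and perceived as dominant

# (2) an emotional state (self-appraisal) of the I-Now mental state:
i_now_self_appraisal = complex(0.5, 0.2)    # positive, mildly dominant

# (3) a moral schema evaluating the self-appraisal pattern:
schema = MoralSchema()
print(schema.bias(i_now_self_appraisal))    # 0.2: dominance should decrease
```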
Methods: paradigm of random social interactions

To illustrate the general principles outlined above and to map them to human behavior, a simple study was conducted based on the settings described below. The paradigm for this study was selected as one of the simplest paradigms that allows for an assessment of the eBICA framework, implemented in virtual agents, as a model of human emotional cognition and emotionally-driven behavior. Alternative paradigms are entertained in the Discussion.
The paradigm consists of random social interactions in a small group. Groups were either homogeneous (including virtual agents only) or heterogeneous (including virtual agents and a human participant). Actors, i.e., members of the group, had a limited repertoire of actions that they could perform with respect to each other. At each step of discrete time, an actor is prompted to perform one action of her/his/its choice with respect to a randomly selected target actor. Performed actions affect the mutual appraisals of agents, and the probabilities of action selection in turn depend on the balance of appraisals. With time, a stable pattern of relationships emerges in the group.
This general paradigm involves N actors interacting by means of M possible actions performed in discrete time t. Each action has a unique, fixed appraisal value. The parameter values used in most cases in the present study were N = 3, M = 4. Any environmental (including spatial) factors were excluded.
Virtual agents

Virtual agents were implemented in Matlab R2011a following the principles of eBICA outlined above. Only the minimal elements of the eBICA architecture necessary for the presented study were implemented. While the implementations were succinctly simplistic and tailored specifically to the selected paradigm, elements of the eBICA components can be identified in them. E.g., schemas of moves (elements of semantic memory) and their activation in working memory were implemented algorithmically. A set of primitives (procedural memory) was added in order to implement constraints on move selection. Goals, assumed to be represented by drives in the value system, determined the implemented rules of action selection.
Implicitly implemented mental states in working memory represented the agent and the partner(s) at the present moment and the agent at the next moment (these mental states should be labeled "I-Now", "Partner-Now", and "I-Next", respectively). The essential attributes used in this study were the appraisals of all actors, including self, and of all actions. Thus, each action schema was assigned a certain fixed appraisal value a. In some sessions, a moral schema was used that specified a "normal" value of the self-appraisal of the agent and affected the probabilities of action selection.
Appraisals in general are understood here as vectors that take values on the 2-D weak semantic map (a part of the cognitive map system: Fig. 2). The map data were taken from previous work (Samsonovich & Ascoli, 2010): specifically, the map constructed based on the Microsoft Word Thesaurus was used here. The main two dimensions of the map represent valence ("like-dislike") and dominance-arousal (dominant vs. subordinate). In contrast with a previously proposed interpretation (Samsonovich & Ascoli, 2010), it is assumed here, based on a word-by-word analysis of the map, that dominance was merged with arousal in the second dimension rather than captured by the third dimension. At first, there seems to be a discrepancy between this interpretation and the finding that the map of Samsonovich and Ascoli (2010) agrees with other 3-dimensional affective space models, e.g., ANEW (Bradley & Lang, 1999), where valence, dominance and arousal are treated as three independent semantic dimensions. However, principal component analysis performed in Matlab on the ANEW data shows that some of the three ANEW dimensions are strongly correlated: in fact, the first two principal components account for more than 96% of the variance of the distribution of words in ANEW. In other words, the affective space of ANEW is essentially two-dimensional.
The fixed appraisals of actions, which took their values on the two-dimensional semantic map (Fig. 2) with the two principal components labeled "valence" (PC1) and "dominance-arousal" (PC2), were determined as the map coordinates of the words that serve as the action names (Table 1). These numbers are available as part of the materials of Samsonovich and Ascoli (2010) and were taken from those materials.
There were N mental states in the working memory of every agent, each corresponding to the state of awareness of one agent taken at the present moment. The appraisal of a given mental state was assumed to be the same for all appraisers in this model. The underlying assumption was that actors and their actions were appraised by all participating agents identically, because they all received identical information and used identical internal cognitive models (identical semantic maps). Therefore, one and the same generic working memory model was presumed to correctly describe the working memory of any of the participating agents, including human participants (up to the obvious switching of indexicals, such as "I" and "He"). There were no false beliefs or subjective biases assumed in this model, consistent with the assumption that all agents always "correctly interpreted" each other's motivations. This simple choice is justified by the principle of parsimony as a first step in the computational treatment of the problem. The indexical differences among agent perspectives were taken into account in the derivation of the dynamic equations.
In virtual agents, the appraisals of actors A_i determined the probabilities of action selection, based on the match between A_i and the action appraisals {a}. And vice versa, the appraisals of performed actions determined the dynamics of the appraisals of actors. There were N dynamic emotional characteristics in this model: the appraisals A_i of the N agents (the rest of the appraisal values are static). The appraisals A_i were initialized to very small random values at the beginning of the simulation epoch, in order to break the symmetry. Each of the four possible actions has a fixed appraisal given in Table 1. All appraisal values (2-D vectors in this model) were treated, for convenience of implementation, as complex numbers:

A = (valence, dominance).

In this case, valence = Re(A) and dominance = Im(A).
The simulation epoch consists of a sequence of iterations performed at each moment of discrete time t. One iteration includes the following essential steps: (i) compute action probabilities, (ii) select and perform an action, (iii) update the appraisals of the actor and the target of the action. The dynamical equations used to update the dynamic values of appraisals were:

A_target^{t+1} = (1 − r) A_target^t + r a*_action,
A_actor^{t+1} = (1 − r) A_actor^t + r a_action.    (1)

Here t is the moment of discrete time, r is a small positive number (a model parameter that was set to 0.01), and a* denotes the complex conjugate of a, which inverts the dominance component to reflect the change of perspective. The likelihood L of a specific action was computed at each step as follows:

L_action ∝ [Re(a*_action (A_actor + A*_target))]_+.    (2)

Here [x]_+ equals x for positive values of x and is zero otherwise, and A* is the complex conjugate of A. Intuitively, this formula tells us that an action is more likely to be selected when its appraisal matches the appraisal of the actor and also matches the appraisal of the target with the inverted dominance component (to reflect the alteration of indexicals with the perspective change).
Thus, a virtual agent accumulates memories of its past interactions with its peers in the form of the appraisals A. These memories, however, were erased at the beginning of each session: all initial values of Re(A_i) and Im(A_i) were sampled from a uniform distribution on the interval (0, 0.01). This detail should not be considered a drawback attributable to a lack of long-term memory in the agents, which is not the case. Keeping mutual appraisals from previous sessions would be easy and would result in the persistence of the hierarchy across sessions.
The model (1), (2), re-formulated in a simplified form, is relatively easy to solve analytically (Samsonovich, 2012a): first, in a reasonably good approximation, the two dimensions, valence and dominance, become independent; then, an observation can be made that the model dynamics are reducible to iterative multiplication by a matrix, which results in the singling out of its main eigenvector. The details are not presented here. The resultant analytical prediction for a stable configuration of actor appraisals is: (i) either all-positive or all-negative, equal valences; and (ii) a symmetrical distribution of dominances centered at zero, which in the case N = 3 implies one dominant actor, one subordinate actor, and one in between, at zero dominance. The particular allocation of each individual actor in the stable hierarchy is a matter of chance and may depend on initial conditions and first actions. These analytical predictions are consistent with the simulation results for homogeneous teams presented in the next section.
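For concreteness, here is a compact Python re-implementation of the update and selection rules (1) and (2). The original implementation was in Matlab and is not reproduced here; the action appraisals follow Table 1 as reconstructed above, so the exact values should be treated as indicative only:

```python
import numpy as np

# Fixed action appraisals a = valence + 1j*dominance (cf. Table 1).
ACTIONS = {'hit': -0.26 + 1.07j, 'yield': 0.43 - 1.03j,
           'greet': 0.93 + 0.15j, 'ignore': -1.83 - 0.37j}
R = 0.01    # update rate r in Eq. (1)
N = 3       # number of actors

def action_probabilities(A_actor: complex, A_target: complex) -> np.ndarray:
    """Eq. (2): L ~ [Re(conj(a) * (A_actor + conj(A_target)))]_+ ."""
    L = np.array([max(np.real(np.conj(a) * (A_actor + np.conj(A_target))), 0.0)
                  for a in ACTIONS.values()])
    s = L.sum()
    return L / s if s > 0 else np.full(len(ACTIONS), 1.0 / len(ACTIONS))

def run_session(cycles: int = 200, seed: int = 1) -> np.ndarray:
    rng = np.random.default_rng(seed)
    # Small random initial appraisals break the symmetry.
    A = rng.uniform(0, 0.01, N) + 1j * rng.uniform(0, 0.01, N)
    appraisals = list(ACTIONS.values())
    for _ in range(cycles):
        for actor in range(N):
            target = rng.choice([i for i in range(N) if i != actor])
            p = action_probabilities(A[actor], A[target])
            a = appraisals[rng.choice(len(appraisals), p=p)]
            A[actor] = (1 - R) * A[actor] + R * a            # Eq. (1)
            A[target] = (1 - R) * A[target] + R * np.conj(a)
    return A   # expected: equal valences, symmetric dominance hierarchy

print(run_session())
```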
Human participants

Five George Mason University college students, 2 female and 3 male, participated in this study. The ethnic breakdown was as follows: 1 White, 1 Hispanic, 1 Black, and 2 Asian. All students were from Northern Virginia. All students reported English as their native language. In terms of their student status, they were 3 freshmen, 1 sophomore, and 1 junior. All of them were full-time students. The age range was between 20 and 25. The students were majoring in Psychology, Computer Game Design, Computer Engineering, and Neuroscience and Bioengineering, and one reported an undecided major.
Before playing their roles, human participants were first presented with a short animation showing three randomly moving shapes, Circle, Square and Triangle, in a square environment (a snapshot from the animation is shown in Fig. 5A). An explanation was given to the participants that the three shape-actors were foraging for food (scattered brown dots) in the closed environment, where they occasionally met each other; when this happened, they had to perform one of four possible actions with respect to the peer: hit, greet, yield, or ignore. Then the participants were asked to keep this scene in mind and to imagine being in control of the Circle. With these mental settings, participants engaged in interactions with the two other actors. No actual spatial element was included in the game: the target of action was sampled randomly for every actor at each iteration, and no visualization of the shapes was used during the game. Participants performed their actions by typing commands when prompted, after reading incoming messages; e.g.: "Triangle greets Circle. Your action with respect to Square (1:hit, 2:yield, 3:greet, 4:ignore):_" (see Table 2). Again, there were four possible actions that actors could perform: hit, yield, greet, and ignore. Actors performed their actions in turn, one after another, choosing themselves which action to perform, while the target of each action was sampled randomly and independently of the actor. This corresponds to random encounters during spontaneous 2-D motion (Fig. 5A).
Results and analysis

The first task was to observe what stable patterns of emotional relationships among agents (understood as their mutual appraisals) can emerge in this environment, and how the outcome may depend on the eBICA parameters. First, simulations of a homogeneous group consisting of virtual agents only were performed. It was found computationally that stable patterns of mutual appraisals (corresponding to certain configurations of emotional relationships among agents) develop in this model within about 100 iterations.
Fig. 5 (A) The random social interaction paradigm is explained with a short demo to subjects. Agents are embedded in a simplistic virtual environment. They forage for randomly scattered chocolate pellets (brown dots) while spontaneously interacting with each other. (B) A typical macroscopically stable configuration with N = 7 agents on the weak semantic map. While the distribution remains stable, relative vertical positions of actors keep changing with time. Colors are arbitrary.
Table 2 Beginnings of the first sessions of two subjects with the same random seed.

Subject 2, first session (Session 4):
Your action with respect to Square (1:hit, 2:yield, 3:greet, 4:ignore):3
1: Circle greets Square
1: Square greets Triangle
1: Triangle yields Square
Your action with respect to Triangle (1:hit, 2:yield, 3:greet, 4:ignore):2
2: Circle yields Triangle
2: Square hits Triangle
2: Triangle greets Circle
Your action with respect to Square (1:hit, 2:yield, 3:greet, 4:ignore):3
3: Circle greets Square
3: Square hits Triangle
3: Triangle yields Square
Your action with respect to Triangle (1:hit, 2:yield, 3:greet, 4:ignore):4
4: Circle ignores Triangle

Subject 3, first session (Session 9):
Your action with respect to Square (1:hit, 2:yield, 3:greet, 4:ignore):3
1: Circle greets Square
1: Square greets Triangle
1: Triangle yields Square
Your action with respect to Triangle (1:hit, 2:yield, 3:greet, 4:ignore):1
2: Circle hits Triangle
2: Square hits Triangle
2: Triangle yields Circle
Your action with respect to Square (1:hit, 2:yield, 3:greet, 4:ignore):3
3: Circle greets Square
3: Square hits Triangle
3: Triangle yields Square
Your action with respect to Triangle (1:hit, 2:yield, 3:greet, 4:ignore):1
4: Circle hits Triangle
E.g., with the choice of parameters specified above, and given randomly sampled initial appraisal values with all-positive valence components, a stable configuration on the valence-dominance plane always developed in which all N appraisals had positive valence, and the distribution extended vertically and symmetrically along the dominance axis (Fig. 5B). Jumping ahead, the choice of initial conditions turned out to be consistent with the human data: typically, the first action selected by a subject was "greet", and in most cases heterogeneous group sessions ended with all-positive appraisals. If, however, the initially sampled values of appraisals had negative valence components, then a stable configuration with negative valence in all actor appraisals developed. In heterogeneous groups, this kind of outcome was observed in only a few sessions of only one subject (see below).
With a small number of actors in the group (N < 4), each actor controlled by a virtual agent reaches its stationary position on the semantic map in less than 100 iterations, or cycles (each cycle consists of every actor acting on a randomly selected target). In all simulated sessions with N < 4, the order of actors in the dominance hierarchy (spontaneously selected as one out of N! possibilities) never changed after 100 cycles (tested up to 1000 cycles). At N ≥ 4, however, the final configuration remained stationary only "macroscopically", as a "cloud" (Fig. 5B).
Fig. 6 Actor trajectories reaching a stable configuration on the semantic map in 200 cycles. The starting point is the origin of coordinates shown by the cross. (A) A homogeneous team of three virtual agents. (B) A human participant (blue) and two virtual agents. During a short time interval at the beginning of the session, the human valence was negative. All actor appraisals plotted here were computed at each step according to (1) based on the performed actions. Each session was limited to 200 cycles, with each actor performing one move in each cycle.
The qualitative outcome for N = 2 is nearly obvious based on the aforementioned analytical considerations. It is a stationary configuration in which the two agent vectors tend to be complex conjugates of each other with a positive real part. At N = 3, the stable configuration is qualitatively the same for two out of the three vectors, while the third vector takes a position in the middle between the two, at approximately zero dominance. This result has been obtained both analytically, in a simplified version of the model (not described here), and numerically. At N ≥ 4 the configuration is microscopically undetermined: actors do not have permanent stationary positions in the cloud. They tend to spread on the map uniformly in a vertical line at a positive valence, and keep drifting up and down, continuously switching their positions in the dominance hierarchy. Every actor in these conditions, if traced long enough over time, at some point reached the top, and at another point reached the bottom.
Similar sessions were performed with one of the N = 3 virtual agents replaced by a human participant. The appraisals defined by (1) for all actors, including the one controlled by a human, were used as measures of human performance. In most cases the outcome for homogeneous and for heterogeneous groups looked similar; however, in some sessions the human behavior seemed less consistent over time, compared to the virtual agents (Fig. 6). These intuitively noticeable differences in individual sessions, however, were difficult to quantify.
All human subjects developed mutually positive valence relations with the virtual agents, except one subject, who developed mutually negative valence relations in half of the sessions (Fig. 7E) and mutually positive valence relations in the other half (Fig. 7F). The order of sessions did not correlate with the two outcomes. Because of the difference in valences, the two groups of sessions of this subject were treated in the analysis as sessions of two different subjects.
Moreover, in support of this approach to the interpretation of the results, this subject reported that he noticed that there were two alternating conditions in which the virtual partners behaved differently.
A striking difference between human and virtual actors was observed when all recorded outcomes were compared. Specifically, most of the sessions of heterogeneous teams ended up with the same stable hierarchy of actors, ordered from dominant to subordinate. While a certain order of actors in the hierarchy was persistent for each participating subject across multiple sessions, no preference in the ordering of virtual agents was noticeable across subjects. For different subjects, the ordering of actors in the hierarchy was different and selected independently, except possibly for one detail: for actors controlled by human subjects, the stable place in the hierarchy seems to be in the middle. This speculation needs further verification and is based on the following observations. In 2 out of the 4 statistically significant cases of persistent hierarchies (Fig. 7A and F), the place of the human was in the middle. In the two other cases, the stable place of the human actor was at the top (Fig. 7C and E). In the cases that did not show statistical significance (Fig. 7B and D), the tendency of the human to stay in the middle can still be seen.
The fact that in 4 out of 6 cases actor hierarchies were persistent across sessions for each individual subject is nontrivial, because the order of actors in the hierarchy was not pre-determined or favored by any a priori factor in the implementation of this paradigm, including the initial conditions, settings, etc. E.g., initial conditions could not account for the consistency of outcomes, because in all cases except one they were sampled randomly and independently of previous sessions (the case in which, due to an experimenter mistake, the initial conditions and random number generator seeds in two sessions were identical is shown in Table 2).
Fig. 7 Outcomes of sessions of random social interactions within heterogeneous groups including one human subject (A–F) and homogeneous groups (G and H), with three identical virtual agents (G) and with one modified virtual agent (H, blue dots). Marker shapes correspond to the actor names. Session 1 in (A) corresponds to Fig. 6B. (A) Subject 1; (B) Subject 2; (C) Subject 3; (D) Subject 4; (E and F) Subject 5. All sessions in (E) ended with all-negative valences; all other sessions ended with all-positive valences. An asterisk after a letter indicates statistical significance of one-way ANOVA after Bonferroni correction. The null hypothesis is that all mean dominances are equal across sessions. At the beginning of each new session all memory of previous sessions in virtual agents was erased, and initial conditions were sampled randomly.
Therefore, this observation indicates that individual human subjects each had a (pre)determined role and place in the hierarchy for every actor in a group of 3 actors, and were able to enforce this hierarchy in the group by their choice of actions, without awareness that they were the only cause of this outcome. All virtual agents were controlled by identical code, all memories of previous sessions in them were erased before each new session, and all initial conditions were sampled each time independently. Therefore, the only possible cause of the persistence of hierarchies was persistent appraisals, in the human mind, of agents identified by their names. The name of a virtual agent was the only element in the code that was agent-specific and persistent from session to session.
One possible interpretation could be that the human participant simply developed a stereotype for opening a session, and the first several actions pre-determined the final hierarchy in each session. However, this logic does not hold. The final order of actors in the hierarchy could not be determined by the first several actions, because the order changed multiple times during a session. In fact, the human position in the hierarchy typically changed at relatively late times (e.g., see the example of a crossover in Fig. 6B), no matter which session it was: the first or the last.
20 moves, a crossover occurred on average 3.7 times during a session. Interestingly, for 2 out of 5 subjects, relative positions of the two virtual agents in the hierarchy were not persistent across different sessions (Fig. 7B and D). E.g., in all sessions with Subject 2, Triangle ended up at the top of the hierarchy in 3 sessions, and Square in 2 sessions. In sessions with Subject 4, Square ended up on top in 4 sessions, and Triangle in 3 sessions. To quantify these observations, the outcomes represented in Fig. 7A–F were subjected to one-way ANOVA. The null hypothesis was that mean dominances of the 3 actors were equal across sessions. The P-values are as follows: A, P < 0.0055; B, P > 0.34; C, P < 8.1e13; D, P > 0.16; E, P < 4.9e4; F, P < 5.1e4; G, P > 0.37; H, P > 0.082. Therefore, after Bonferroni correction, the null hypothesis can be rejected at the standard level in cases A, C, E, and F. In cases B, D, G and H the null hypothesis cannot be rejected. In the first sessions of Subjects 2 and 3, initial conditions for virtual agents were identical across subjects (this is because the same seed for the random number generator was used in both initial sessions by mistake; however, this was not the case with other sessions and other subjects, where
initial conditions were sampled independently in each session). The beginnings of these two sessions are shown in Table 2. The left column shows the beginning of the first session of Subject 3 (Fig. 7, session 9), and the right column shows the beginning of the first session of Subject 2 (Fig. 7, session 4). These listings, while not conclusive, give an idea of how, in principle, the choice of actions by the human participant could influence emergent relationships among actors.
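To make the statistical procedure above concrete, here is a minimal Python sketch of the test as described: a one-way ANOVA over the per-session dominances of the three actors, followed by a Bonferroni correction. The dominance values below are illustrative placeholders, not the recorded data, and the use of scipy.stats.f_oneway and a test family of eight (panels A–H) are assumptions, since the paper does not specify the software or the correction family.

```python
import numpy as np
from scipy.stats import f_oneway

# Minimal sketch of the statistical test described above: a one-way ANOVA
# testing whether the mean dominances of the three actors differ,
# followed by a Bonferroni correction. The data below are illustrative
# placeholders, not the recorded outcomes.

rng = np.random.default_rng(0)

# Final dominance of each actor at the end of each of 5 sessions
# (one array per actor: the human subject, Square, and Triangle).
dominance = {
    "human":    rng.normal( 0.0, 0.1, size=5),   # stays in the middle
    "Square":   rng.normal( 0.4, 0.1, size=5),   # tends to dominate
    "Triangle": rng.normal(-0.4, 0.1, size=5),   # tends to be subordinate
}

F, p = f_oneway(*dominance.values())
n_tests = 8                          # assumption: panels A-H in Fig. 7
p_corrected = min(p * n_tests, 1.0)  # Bonferroni correction

print(f"F = {F:.2f}, raw p = {p:.2g}, corrected p = {p_corrected:.2g}")
if p_corrected < 0.05:
    print("Null hypothesis (equal mean dominances) rejected.")
```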
Discussion
All virtual agents in the groups of sessions A, C, E and F were controlled by one and the same script and had no memory of previous sessions (although the architecture supports such memory, it was erased); their initial conditions were sampled independently from one and the same uniform probability distribution. Therefore, an interpretation of the results could be that actors acquired subjectively perceived, persistent roles in the human mind, which consistently determined their final place in the hierarchy. This persistence of a hierarchy across sessions cannot be attributed to any factor other than the influence of the human participating in the group. Indeed, for the human participant, the virtual agents had persistent names (Square and Triangle) and therefore could acquire persistent individual roles attributed to them in the human mind, if not ‘‘imaginary personalities’’. Yet, the human subject was not aware of her exclusive role in the final outcome, and was not aware of the fact that both virtual partners were controlled by identical scripts and had no memory of previous sessions. It seems as if the human participant had an idea of how things should normally be in the group, and selected actions that eventually led to the ‘‘normal’’ order. The question is how, within the eBICA framework, these intuitive speculations can be formalized in terms of moral schemas. One approach is explained below: we focus on the persistence of the agent’s own place in the middle of the hierarchy across sessions, leaving aside for now the question of the persistence of others’ places. We assume that the agent has a preexisting moral schema for her role in a group of three: to be in the middle of the hierarchy. Therefore, according to the schema, the ‘‘normal’’ value of the dominance of this agent is zero. Any perceived deviation from zero should be corrected and reduced to zero by altering the agent’s behavior: this is exactly what the schema does. It monitors the value of the agent’s dominance and biases the probabilities of action selection in order to ‘‘correct’’ the value. As a result, the choice-of-action rule changes. Instead of being determined by (2), the action selection process is overridden by the moral schema. Now, to determine the likelihood of an action, the agent uses not the actually perceived self-appraisal $A$, but the imagined appraisal $\bar{A}$, the complex conjugate of $A$. As a result, instead of (2), we have:

$$L_{\mathrm{action}} \propto \left[\,\mathrm{Re}\!\left(a_{\mathrm{action}}\left(\bar{A}_{\mathrm{actor}} + A_{\mathrm{target}}\right)\right)\right]_{+} \qquad (3)$$
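To make rule (3) concrete, the following minimal Python sketch implements the modified action-selection step. It assumes that appraisals are complex numbers, as the discussion above indicates, and that the baseline rule (2), given earlier in the paper, has the same form as (3) but with the perceived self-appraisal in place of its conjugate; all function and variable names here are illustrative, not taken from the eBICA implementation.

```python
import numpy as np

rng = np.random.default_rng()

def action_likelihood(a_action, A_actor, A_target, moral_schema=False):
    """Rectified likelihood of one action, per rules (2) and (3).

    Rule (2) (assumed form): L ~ [Re(a_action * (A_actor + A_target))]_+
    Rule (3): the moral schema substitutes the complex conjugate of
    A_actor (the "imagined" appraisal) for the perceived one.
    """
    A_self = np.conj(A_actor) if moral_schema else A_actor
    return max(np.real(a_action * (A_self + A_target)), 0.0)

def choose_action(actions, A_actor, A_target, moral_schema=False):
    """Sample an action with probability proportional to its likelihood."""
    L = np.array([action_likelihood(a, A_actor, A_target, moral_schema)
                  for a in actions])
    if L.sum() == 0.0:                      # no action matches the appraisals:
        p = np.full(len(L), 1.0 / len(L))   # fall back to a uniform choice
    else:
        p = L / L.sum()
    return rng.choice(len(actions), p=p)

# Example: two candidate actions with opposite appraisals.
actions = [0.8 + 0.3j, -0.8 + 0.3j]         # appraisals a_action
A_actor, A_target = 0.5 + 0.2j, -0.1 + 0.4j
print("chosen action:", choose_action(actions, A_actor, A_target,
                                      moral_schema=True))
```

Sampling in proportion to the rectified likelihood is one natural reading of ‘‘biases the probabilities of action selection’’; the normalization itself is not specified in the paper.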
Implementation of this rule in one of the 3 virtual agents, Circle, produces the results shown in Fig. 7H, which are qualitatively consistent with most of the observed human data (Fig. 7A, B, D, and F). Similarly, the data of Fig. 7C and E could be simulated using a moral schema that sets the normal value of Re(A) to 0.6 instead of zero. An analogous approach could be used to account for the persistence of the actor hierarchy across sessions, using a moral schema that sets ‘‘normal’’ values of dominance for each actor and becomes permanently bound to the actor names, Square and Triangle. Thus, adding moral schemas that affect the agent’s behavior and tend to stabilize a pattern of appraisals believed to be ‘‘normal’’ amounts to overriding a simple rule of action selection based on a match of agent and action appraisals, and in principle allows us to account for the observed human data. The modified rule is similar to the original rule, except that the actually perceived agent appraisal is replaced in it with the over-corrected value, therefore resulting in an opposite behavioral bias. This principle was illustrated here using a simple experimental paradigm, and may have broader implications. One important aspect of this study is that virtual agents were capable of accumulating long-term emotional memories associated with individuals in the form of appraisals. The fact that in this study virtual agents did not retain memory of their experience with the human subject from previous sessions does not reflect a weakness of the model; it is merely a consequence of the paradigm, in which fresh copies of a virtual agent were used in each session. Each virtual agent accumulated emotional memory over the course of a session, which could have been used in the following session, allowing the agent to demonstrate emotional intelligence, but was erased by design of the paradigm. The study needs to be continued to see whether moral schemas can reproduce all details of the human data in small groups, and also whether they can stabilize social relations among agents in a large group, account for various social phenomena, etc. Of particular interest is the replication of the complex of major human social emotions in virtual agents, including shame, pride, guilt, resentment, jealousy, compassion, the sense of humor, etc. Describing them phenomenologically on a case-by-case basis (Ortony et al., 1988) is not sufficient. Why do certain social emotions cluster with each other, and why do they, under certain circumstances, turn into one another? The framework of eBICA should give answers to these questions. An exploratory consideration follows below.
Understanding social emotions
In the context of this study, examples of complex social emotions and emotional relationships, including shame, pride, trust, guilt, jealousy, humor, and compassion, deserve special attention. As pointed out by Lazarus (2001), social emotions are better understood when grouped into clusters. This statement becomes even more meaningful when a cluster of emotions can be described by one moral schema. The following three examples of clusters of social emotions illustrate this point.
Shame and pride
According to the global structure of emotion types in OCC, shame and pride occur as valenced reactions to actions attributed to the self, perceived as an agent (Ortony et al., 1988). On the other hand, Tangney, Wagner, Fletcher, and Gramzow (2001) found shame-proneness to be correlated
with suspiciousness, resentment, and a tendency to blame others. If shame and pride do not need to involve representations of other minds, then how can their social correlates be understood? The answer can be given based on the mental state formalism, if we agree to understand social emotions as emotions involving multiple mental states. For comparison, basic emotions like joy or fear need not rely on other mental states when their source is present in the current situation and does not depend on other agents. In contrast, shame and pride arguably involve appraisal of the self from another mental perspective.
Trust, guilt and jealousy
As pointed out by Baumeister, Stillwell, and Heatherton (2001), in contrast with shame, guilt occurs primarily in the context of ongoing relationships, e.g., those involving trust and responsibility (e.g., love or close friendship), and may be triggered by a failure to meet another’s expectations. Furthermore, Baumeister et al. (2001) show that guilt has the ability to heal broken relationships. In analogy with the emotions of shame and guilt, jealousy is differentiated from envy by Parrott (2001): ‘‘In envy, one’s own appraisal leads to dissatisfaction with oneself. In jealousy, the reflected appraisal of another leads to a lack of security and confidence.’’ In general, trust has been a topic of great interest in recent empirical studies (Berg, Dickhaut, & McCabe, 1995). It is remarkable that in the eBICA framework the inference about an emergent feeling of guilt or jealousy can be made immediately from the same instance of the moral schema of trust (Samsonovich, 2012a), without the need to analyze the entire situation anew, as would be necessary in a traditional approach, e.g., one based on OCC (Ortony et al., 1988); a minimal sketch of this idea is given at the end of this section.
Humor
The nature of humor is one of the topics that still evades scientific analysis. Of all social emotions, the sense of humor is probably the most poorly understood. This could be partially due to the large number of humor subtypes. While this feature is typical of many social emotions (e.g., Parrott, 2001, points to a number of subtypes of both envy and jealousy), the sense of humor is probably unique in this regard. Not surprisingly, there is no precise scientific definition of the sense of humor: the term is used as an umbrella for a broad spectrum of phenomena (Hurley, Dennett, & Adams, 2011; Lefcourt, 2001). While the sense of humor is frequently taken for granted as an exclusively human characteristic, little is known about humor-like emotional states in other animals. Do animals other than humans have a sense of humor? Do they laugh? These are two very different questions. E.g., human laughter can be triggered by tickling rather than by an emotional experience. It was recently found that rats respond to tickling with frequency-modulated 50 kHz vocalization (Burgdorf, Panksepp, & Moskal, 2011; Panksepp, 2007). Furthermore, these studies suggest that this kind of vocalization in rats signifies a positive affective state and has functional similarities with human laughter. Yet, there is no documented evidence suggesting that rats,
or any nonhuman animals in general, may have a sense of humor. Even among cases in which laughter is triggered by emotions, the sensation of humor caused by a joke may be the exception. Bering (2012) points out that laughter is typical of many other emotional contexts, e.g., joy, affection, amusement, cheerfulness, surprise, nervousness, sadness, fear, shame, aggression, triumph, taunt, and schadenfreude (pleasure in another’s misfortune). E.g., the types of laughter studied by Szameitat, Darwin, Wildgruber, Alter, and Szameitat (2011) include joy, tickling, taunting (gloating), and schadenfreude. On the other hand, the sensation of humor triggered by a joke may not be expressed as laughter. The picture from the theoretical perspective does not look any better. A large number of theories of humor have been developed over the centuries. Many of them correctly capture some aspect or type of humor, while missing others. Recent studies have focused on the cognitive aspect of humor, as well as of the other social emotions considered here. More specifically, the focus was on the type of humor that was called ‘‘higher-order humor’’ by Hurley et al. (2011). This is the type of humor that involves the Theory-of-Mind: i.e., the human ability to simulate and understand other minds (Nichols & Stich, 2003). Hurley et al. (2011) use the term ‘‘intentional stance’’ instead of the term ‘‘Theory-of-Mind’’, and the term ‘‘mental space’’ instead of the term ‘‘mental perspective’’. These terminological issues aside, in eBICA a simulationist computational model of the Theory-of-Mind lies at the foundation of the mental state framework, which sheds light on the nature of humor (Samsonovich, 2012b).
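As promised above, here is one possible reading, in a minimal Python sketch, of how a single bound instance of the trust schema can immediately yield inferences about guilt or jealousy without re-appraising the whole situation. The schema structure, event names, and emotion labels are illustrative assumptions loosely based on the descriptions of Baumeister et al. (2001) and Parrott (2001), not the actual eBICA implementation (Samsonovich, 2012a).

```python
from dataclasses import dataclass

@dataclass
class TrustSchema:
    """One bound instance of a hypothetical trust schema."""
    truster: str   # the party who extends trust
    trustee: str   # the party expected to meet the truster's expectations

    def emotion(self, event: str, perspective: str) -> str:
        """Infer an emergent social emotion from one event, given whose
        mental perspective the appraisal is taken from."""
        if event == "expectation_failed":
            # A failure to meet expectations within an ongoing relationship:
            # guilt from the trustee's perspective (Baumeister et al., 2001),
            # resentment from the truster's.
            return "guilt" if perspective == self.trustee else "resentment"
        if event == "attention_diverted_to_third_party":
            # The reflected appraisal of another undermines the truster's
            # security and confidence (Parrott, 2001): jealousy.
            return "jealousy" if perspective == self.truster else "neutral"
        return "neutral"

# Usage: the same schema instance yields different emotion inferences
# depending only on the event and the perspective taken.
schema = TrustSchema(truster="Circle", trustee="Square")
print(schema.emotion("expectation_failed", perspective="Square"))    # guilt
print(schema.emotion("attention_diverted_to_third_party",
                     perspective="Circle"))                          # jealousy
```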
Concluding remarks
Until recently, the cognitive study of emotions was largely neglected or misunderstood by mainstream science. Here is an illustrative example from a chapter summary in a cognitive psychology textbook: ‘‘Research on the cognitive consequences of emotion and anxiety is beginning to show a standard pattern of effects. Emotion, whether positive or negative, tends to impair performance because it distracts attention from task demands.’’ (Ashcraft, 1994). Not surprisingly, in cognitive modeling, emotional information processing is traditionally contrasted with rational cognition, and it is considered a challenge to put the two together in one cognitive architecture. By contrast, one basic idea underlying the present study is that cognitive architectures should be designed in such a manner that all information processing in them could be regarded as ‘‘emotional’’. In particular, this means that (i) goals should originate from intrinsic emotions rather than from externally given instructions; (ii) emotional components should be essential to any part of the cognitive process in the architecture; and (iii) the outcome of each cognitive process should be captured by a certain emotional state. From this point of view, it would be misleading to think that emotions, moods and feelings may only subserve impulsive responses or biases, whereas rational planning and decision making are emotion-independent. Instead, emotional elements should find their proper place in all basic mechanisms of cognition in future cognitive architectures. But this can only happen if
the right approach can be found to implement and use them.
Do artifacts need emotional intelligence?
The example paradigm involving Circle, Square and Triangle was inspired by the paradigm used by Heider and Simmel in their behavioral study of human subjects (Heider & Simmel, 1944). Besides its main purpose, to serve as a test for humans, their example clearly demonstrates that in order to be emotionally understood and accepted by humans on an equal footing, artifacts do not need to be human-level realistic in their appearance or in their ability to control voice and motion. Regardless of physical abilities, they need to demonstrate human-level emotional competency, and therefore they need to be emotionally intelligent at a human level. The same conclusion follows from research on the sense of presence in virtual environments (Herrera, Jordan, & Vera, 2006). On the other hand, research on human learning tells us that emotions play a vital role in self-regulated learning (Zimmerman, 2008); therefore, it seems necessary for artifacts to have human-level emotional intelligence in order to learn like humans.
Emotions in mainstream cognitive architectures
Soar (Laird et al., 1987) and ACT-R (Anderson & Lebiere, 1998) are the two most widely known and used cognitive architectures. Numerous recent works address the implementation and study of emotions in Soar and ACT-R, along with other currently popular features like episodic memory, which is not discussed here. It is relatively easy to implement some aspect of emotions in a cognitive architecture, but this does not necessarily solve the problem. For example, the recently extended version of Soar (Laird, 2008) implements Scherer’s appraisal theory (which belongs to the same family as OCC) in its Appraisal Detector, which is used for reinforcement learning. A limitation here is that appraisal is evaluated as a global characteristic of the situation of the agent. Also, proposals on extending ACT-R with emotional structures were made recently (Dancy, this issue; Dancy et al., 2012; Oltramari, Lebiere, Ben-Asher, Juvina, & Gonzalez, 2013). The works of Marsella and Gratch (2009) implement the theory of Lazarus (1991, 2001). The truth is, however, that in understanding how emotions enter cognition we have not moved far beyond the working memory models of Baddeley, Eysenck, and Anderson (2009).
Potential applications
Today’s large volumes of surveillance data pose the challenge of extracting vital information from the data automatically and/or of enhancing the human ability to do so. Biologically inspired affective cognitive architectures increasingly attract attention in this context as an efficient, robust and flexible solution to the challenge of understanding human real-world data analysis. At the same time, modeling of higher cognition in cognitive architectures is often limited to traditional algorithms, and is separated from biologically inspired information processing. The approach developed in this work supports the addition of a new element to cognitive architectures: natural emotional intelligence, inspired by the human mind, taken in a human-like form
and potentially at the human level. While this work is in part only a proposal for a new step, the significance of this step would be difficult to overestimate. An emergent view is that an artifact can and should have emotional-intelligence abilities similar to those of humans in order to be a successful human partner. In this regard, I would like to conclude with a quote from Allen Newell (1990, p. 495): ‘‘It is entirely possible that before one can define a model of a social agent of any real use in studying social phenomena, substantial progress must be made in welding affect and emotion into central cognitive theories.’’
Acknowledgments I am grateful to many participants of BICA conferences and VideoPanels for discussions of the status of the field of emotional artificial intelligence (some of these discussions are available online at http://bicasociety.org/videos/vp.html). Parts of this work were previously presented at AAAI Workshops and Fall Symposia and accordingly reported in AAAI Technical Reports (Samsonovich, 2012a, 2012b, 2013).
References
Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Lawrence Erlbaum Associates.
Ashcraft, M. H. (1994). Human memory and cognition (2nd ed.). New York: HarperCollins College Publishers.
Baddeley, A. D., Eysenck, M., & Anderson, M. C. (2009). Memory. New York: Psychology Press.
Baumeister, R. F., Stillwell, A. M., & Heatherton, T. F. (2001). Interpersonal aspects of guilt: Evidence from narrative studies. In W. G. Parrott (Ed.), Emotions in social psychology: Essential reading (pp. 295–305). Ann Arbor, MI: Taylor & Francis.
Berg, J., Dickhaut, J., & McCabe, K. (1995). Trust, reciprocity, and social history. Games and Economic Behavior, 10(1), 122–142.
Bering, J. (2012). The rat that laughed: Do animals other than humans have a sense of humor? Maybe so. Scientific American, 307(1), 74–77.
Bradley, M. M., & Lang, P. J. (1999). Affective norms for English words (ANEW): Instruction manual and affective ratings. Technical report C-1. The Center for Research in Psychophysiology, University of Florida.
Buchsbaum, D., Blumberg, B., & Breazeal, C. (2004). Social learning in humans, animals and agents. In AAAI Fall symposium technical report FS-04-05 (pp. 9–16). Menlo Park, CA: The AAAI Press.
Burgdorf, J., Panksepp, J., & Moskal, J. R. (2011). Frequency-modulated 50 kHz ultrasonic vocalizations: A tool for uncovering the molecular substrates of positive affect. Neuroscience and Biobehavioral Reviews, 35(9), 1831–1836.
Castelfranchi, C., & Miceli, M. (2009). The cognitive-motivational compound of emotional experience. Emotion Review, 1(3), 223–231.
Chalmers, D. J. (1996). The conscious mind: In search of a fundamental theory. New York: Oxford University Press.
Dancy, C. L. (2013). ACT-RΦ: A cognitive architecture with physiology and affect. Biologically Inspired Cognitive Architectures (this issue).
Dancy, C. L., Ritter, F. E., & Berry, K. (2012). Towards adding a physiological substrate to ACT-R. In Proceedings of the 21st conference on behavior representation in modeling and simulation (pp. 78–85). Amelia Island, FL.
Goldman, A. I. (2006). Simulating minds: The philosophy, psychology and neuroscience of mindreading. New York: Oxford University Press.
Gray, W. D. (Ed.). (2007). Integrated models of cognitive systems. Series on cognitive models and architectures. Oxford, UK: Oxford University Press.
Heider, F., & Simmel, M. (1944). An experimental study of apparent behavior. American Journal of Psychology, 57, 243–259.
Heise, D. R. (2007). Expressive order: Confirming sentiments in social actions. New York: Springer.
Herrera, G., Jordan, R., & Vera, L. (2006). Agency and presence: A common dependence on subjectivity? Presence: Teleoperators and Virtual Environments, 15(5), 539–552.
Hudlicka, E. (2011). Guidelines for designing computational models of emotions. International Journal of Synthetic Emotions, 2(1), 26–79.
Hurley, M. M., Dennett, D. C., & Adams, R. B. (2011). Inside jokes: Using humor to reverse-engineer the mind. Cambridge, MA: The MIT Press.
Kringelbach, M. L. (2009). The pleasure center: Trust your animal instincts. Oxford University Press.
Laird, J. E. (2008). Extending the Soar cognitive architecture. In P. Wang, B. Goertzel, & S. Franklin (Eds.), Artificial general intelligence 2008: Proceedings of the first AGI conference (pp. 224–235). Amsterdam, The Netherlands: IOS Press.
Laird, J. E. (2012). The Soar cognitive architecture. Cambridge, MA: MIT Press.
Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). SOAR: An architecture for general intelligence. Artificial Intelligence, 33, 1–64.
Landauer, T. K., McNamara, D. S., Dennis, S., & Kintsch, W. (Eds.). (2007). Handbook of latent semantic analysis. Mahwah, NJ: Lawrence Erlbaum Associates.
Langley, P., Laird, J. E., & Rogers, S. (2009). Cognitive architectures: Research issues and challenges. Cognitive Systems Research, 10, 141–160.
Lazarus, R. S. (1991). Emotion and adaptation. New York: Oxford University Press.
Lazarus, R. S. (2001). Relational meaning and discrete emotions. In K. R. Scherer, A. Schorr, & T. Johnstone (Eds.), Appraisal processes in emotion: Theory, methods, research (pp. 37–67). Oxford, UK: Oxford University Press.
Lefcourt, H. M. (2001). Humor: The psychology of living buoyantly. New York: Kluwer Academic.
Marsella, S. C., & Gratch, J. (2009). EMA: A process model of appraisal dynamics. Cognitive Systems Research, 10, 70–90.
McCarthy, J. (1993). Notes on formalizing context. In Proceedings of the 13th international joint conference on artificial intelligence (pp. 555–562). Menlo Park, CA: AAAI Press.
Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.
Nichols, S., & Stich, S. (2003). Mindreading: An integrated account of pretence, self-awareness, and understanding other minds. Oxford: Oxford University Press.
Olds, J. (1956). Pleasure centers in the brain. Scientific American, 105–116.
Oltramari, A., Lebiere, C., Ben-Asher, N., Juvina, I., & Gonzalez, C. (2013). Modeling strategic dynamics under alternative information conditions. In Proceedings of the 12th international conference on cognitive modeling. Ottawa, Canada.
Ortony, A., Clore, G., & Collins, A. (1988). The cognitive structure of emotions. Cambridge, UK: Cambridge University Press.
Osgood, C. E., Suci, G., & Tannenbaum, P. (1957). The measurement of meaning. Urbana, IL: University of Illinois Press.
Panksepp, J. (2007). Neuroevolutionary sources of laughter and social joy: Modeling primal human laughter in laboratory rats. Behavioural Brain Research, 182(2), 231–244.
Parisi, D., & Petrosino, G. (2010). Robots that have emotions. Adaptive Behavior, 18(6), 453–469.
Parrott, W. G. (2001). The emotional experiences of envy and jealousy. In W. G. Parrott (Ed.), Emotions in social psychology: Essential reading (pp. 306–320). Ann Arbor, MI: Taylor & Francis.
Phelps, E. A. (2006). Emotion and cognition: Insights from studies of the human amygdala. Annual Review of Psychology, 57, 27–53.
Picard, R. W. (1997). Affective computing. Cambridge, MA: The MIT Press.
Plutchik, R. (1980). A general psychoevolutionary theory of emotion. In R. Plutchik & H. Kellerman (Eds.), Emotion: Theory, research, and experience. Theories of emotion (Vol. 1, pp. 3–33). New York: Academic.
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178.
Sabater, J., Paolucci, M., & Conte, R. (2006). Repage: Reputation and image among limited autonomous partners. Journal of Artificial Societies and Social Simulation, 9(2), 3.
Samsonovich, A. V. (2009). The constructor metacognitive architecture. In A. V. Samsonovich (Ed.), Biologically inspired cognitive architectures II: Papers from the AAAI Fall symposium. AAAI technical report FS-09-01 (pp. 124–134). Menlo Park, CA: AAAI Press.
Samsonovich, A. V. (2012a). An approach to building emotional intelligence in artifacts. In W. Burgard, K. Konolige, M. Pagnucco, & S. Vassos (Eds.), Cognitive robotics: AAAI technical report WS-12-06 (pp. 109–116). Menlo Park, CA: The AAAI Press.
Samsonovich, A. V. (2012b). Modeling social emotions in intelligent agents based on the mental state formalism. In V. Raskin & J. M. Taylor (Eds.), Artificial intelligence of humor: Papers from the AAAI Fall symposium. AAAI technical report FS-12-02 (pp. 76–83). Palo Alto, CA: AAAI Press.
Samsonovich, A. V. (2012c). On a roadmap for the BICA challenge. Biologically Inspired Cognitive Architectures, 1, 100–107.
Samsonovich, A. V. (2013). Modeling human emotional intelligence in virtual agents. In C. L. Lebiere & P. Rosenbloom (Eds.), Integrated cognition: Papers from the AAAI Fall symposium. AAAI technical report FS-13-04. Palo Alto, CA: AAAI Press.
Samsonovich, A. V., & Ascoli, G. A. (2007). Cognitive map dimensions of the human value system extracted from natural language. In B. Goertzel & P. Wang (Eds.), Advances in artificial general intelligence: Concepts, architectures and algorithms. Proceedings of the AGI workshop 2006. Frontiers in artificial intelligence and applications (Vol. 157, pp. 111–124). Amsterdam, The Netherlands: IOS Press. ISBN 978-1-58603-758-1.
Samsonovich, A. V., & Ascoli, G. A. (2010). Principal semantic components of language and the measurement of meaning. PLoS ONE, 5(6), e10921.1–e10921.17.
Samsonovich, A. V., & Ascoli, G. A. (2013). Augmenting weak semantic cognitive maps with an ‘‘abstractness’’ dimension. Computational Intelligence and Neuroscience, 308176, 1–10. http://dx.doi.org/10.1155/2013/308176.
Samsonovich, A. V., Ascoli, G. A., De Jong, K. A., & Coletti, M. A. (2006). Integrated hybrid cognitive architecture for a virtual roboscout. In M. Beetz, K. Rajan, M. Thielscher, & R. B. Rusu (Eds.), Cognitive robotics: Papers from the AAAI workshop. AAAI technical report WS-06-03 (pp. 129–134). Menlo Park, CA: AAAI Press.
Samsonovich, A. V., De Jong, K. A., & Kitsantas, A. (2009). The mental state formalism of GMU-BICA. International Journal of Machine Consciousness, 1(1), 111–130.
Samsonovich, A. V., & De Jong, K. A. (2005). Designing a self-aware neuromorphic hybrid. In K. R. Thorisson, H. Vilhjalmsson, & S. Marsella (Eds.), AAAI-05 workshop on modular construction of human-like intelligence: AAAI technical report WS-05-08 (pp. 71–78). Menlo Park, CA: AAAI Press.
Samsonovich, A. V., Goldin, R. F., & Ascoli, G. A. (2010). Toward a semantic general theory of everything. Complexity, 15(4), 12–18.
Samsonovich, A. V., & Nadel, L. (2005). Fundamental principles and mechanisms of the conscious self. Cortex, 41(5), 669–689.
Samsonovich, A. V. (2006). Biologically inspired cognitive architecture for socially competent agents. In M. A. Upal & R. Sun (Eds.), Cognitive modeling and agent-based social simulation: Papers from the AAAI workshop. AAAI technical report WS-06-02 (pp. 36–48). Menlo Park, CA: AAAI Press.
Scally, J. R., Cassimatis, N. L., & Uchida, H. (2012). Worlds as a unifying element of knowledge representation. Biologically Inspired Cognitive Architectures, 1, 14–22.
Sellers, M. (2013). Toward a comprehensive theory of emotion for biological and artificial agents. Biologically Inspired Cognitive Architectures, 4, 3–26.
Steunebrink, B. R., Dastani, M., & Meyer, J.-C. (2007). A logic of emotions for intelligent agents. In Proceedings of the 22nd conference on artificial intelligence (AAAI 2007) (pp. 142–147). Menlo Park, CA: AAAI Press.
Strawson, G. (1999). The phenomenology and ontology of the self. In D. Zahavi (Ed.), Exploring the self: Philosophical and psychopathological perspectives on self-experience. Advances in consciousness research (Vol. 23, pp. 39–54). Amsterdam: John Benjamins.
Strawson, G. (2011). The minimal self. In S. Gallagher (Ed.), Oxford handbook of the self. Oxford handbooks in philosophy (pp. 253–278). Oxford, UK: Oxford University Press.
Szameitat, D. P., Darwin, C. J., Wildgruber, D., Alter, K., & Szameitat, A. J. (2011). Acoustic correlates of emotional dimensions in laughter: Arousal, dominance, and valence. Cognition and Emotion, 25(4), 599–611.
Tangney, J. P., Wagner, P., Fletcher, C., & Gramzow, R. (2001). Shamed into anger? The relation of shame and guilt to anger and self-reported aggression. In W. G. Parrott (Ed.), Emotions in social psychology: Essential reading (pp. 285–294). Ann Arbor, MI: Taylor & Francis.
Treur, J. (2013). An integrative dynamical systems perspective on emotions. Biologically Inspired Cognitive Architectures, 4, 27–40.
Wise, R. A. (1978). The role of reward pathways in the development of drug dependence. Pharmacology and Therapeutics, 35(1–2), 227–263.
Zimmerman, B. J. (2008). Investigating self-regulation and motivation: Historical background, methodological developments, and future prospects. American Educational Research Journal, 45(1), 166–183.