Plausible reasoning and plausibility monitoring in language comprehension ✩

Maj-Britt Isberner a, Gabriele Kern-Isberner b

a Department of Psychology, University of Kassel, Germany
b Department of Computer Science, University of Technology Dortmund, Germany

✩ This paper is part of the Virtual special issue on Uncertainty Reasoning, edited by Robert E. Mercer and Salem Benferhat. E-mail address: [email protected] (G. Kern-Isberner).

Article info

Article history: Received 27 November 2016; Received in revised form 8 May 2017; Accepted 9 May 2017.

Keywords: Plausible reasoning; Belief revision; Language comprehension; Spohn's ranking functions; c-representations; c-revisions

Abstract

In psychological research on language comprehension, so-called epistemic Stroop effects illustrate how implausible information can interfere with human action decisions: actions with positive goals can be delayed after implausible information, and vice versa. The basic assumption here is that humans reason from suitable situation models that are built upon background beliefs. In this paper, we present formal models that are apt to simulate cognitive processes that are relevant for language comprehension and these epistemic Stroop effects. Since background knowledge is crucial for the situation model, we use the inductive methods of c-representations and c-revisions, which are capable of processing explicit (conditional) knowledge bases, to make plausible reasoning in the experimental tasks transparent. We argue that the delays in response time are partially caused by belief revision processes which are necessary to overcome the mismatch between plausible context (or background resp. world) knowledge and implausible target words. We also present first tentative results suggesting that different types of knowledge may induce different processing patterns.

1. Introduction


Nonmonotonic logics (cf., e.g., [1]) have been devised to overcome the limitations of classical logics with respect to handling rules with exceptions. As is usually the case in Artificial Intelligence, human reasoning has provided the paradigm for reasoning in those logics, and formal systems that axiomatize nonmonotonic logics, like system P [2], have often been considered also as rationality postulates that human reasoning follows. Likewise, the AGM postulates of belief revision [3] have been motivated by considerations of which belief change operations humans would deem to be rational. While there is an increasing interest in evaluating how well the rationality postulates from nonmonotonic reasoning and belief revision are suited to model human reasoning (cf., e.g., [4,5]), research that looks more closely and empirically into the relationships between formal logic-based models of reasoning on the one hand, and commonsense human reasoning on the other hand, is still rare. To date, we do not know much about how useful our formal models of reasoning actually are for describing how humans reason, whether humans distinguish between different types of knowledge, such as causal or normative knowledge, or how the plausibility of incoming information is evaluated to produce the information that a human would be willing to accept for revision processes, and what influence perceived (im)plausibility has on the reasoning of humans,


or even on their action decisions. The BDI (Beliefs, Desires, Intentions) model of agent theory [6] neatly distinguishes between the modules in which beliefs, desires, and intentions are processed, and how this leads to actions. While there are numerous interactions between all three modules, it is usually assumed that these interactions occur via interfaces, i.e., beliefs influence intentions, but the ways in which beliefs are produced are irrelevant. However, different effects have been observed in psychology: In a seminal study, Stroop [7] showed that when people were asked to name the color in which color words were printed, a mismatch between the color and the meaning of the word (e.g., the word "blue" in red font, or vice versa) resulted in slower and more erroneous responses. This is generally taken as evidence that the process of reading is so strongly automatized that it cannot be suppressed even though it is irrelevant for, and interferes with, the actual task of naming the color. In other words, the automaticity of a process is established by demonstrating its interference with performance in an unrelated task for which the process is not required. This casts general doubt on how well the BDI agent model with its clear modular structure fits the way humans base their action decisions on cognitive processes. The empirical insights provided by Stroop and others suggest that it is not only the result of reasoning that is decisive for human action decisions, but also that irritations during the process of reasoning itself can influence a human's disposition to act.

Plausibility of observations or perceived information (in particular, through reading), and the way this information is processed, play a crucial role in investigating these Stroop-like effects. This creates an interesting connection between knowledge representation in artificial intelligence and language comprehension in psychology. Most modern theories of language comprehension agree that to understand a text, readers need to integrate text information with their knowledge about the world to construct a situation model of what the text is about [8–10]. An important but generally overlooked implication of this assumption is that the process of constructing a situation model must be sensitive to the goodness of fit between incoming information and world knowledge [11]. Therefore, Isberner and Richter [12,13] proposed that knowledge-based plausibility must be routinely monitored during language comprehension. They tested this assumption with a reaction time paradigm in which an assessment of plausibility was irrelevant or even detrimental to performance on the actual experimental task. In three experiments using different experimental tasks, they found interference of task-irrelevant plausibility with task performance, which constitutes evidence that readers cannot actually comprehend information without also assessing its consistency with their plausible beliefs about the world.

In this paper, we elaborate on the relations between formal models of plausible reasoning and belief revision on the one hand, and plausibility monitoring in language comprehension on the other hand. While Isberner and Richter [12,13] are mainly interested in demonstrating Stroop-like effects and measure the impact of plausibility only implicitly, their empirical work nevertheless provides deep insights into the role background knowledge plays for human reasoning.
This is particularly visible and impressive in [13], where the authors explicitly distinguish between items involving high or low knowledge. As a main contribution of this paper, we propose formal models of the cognitive processes that may happen in the reader when he or she encounters plausible and implausible information of the kind used by [12,13] in their experiments, and discuss to what extent these models can account for their empirical findings. These formal models allow for plausible reasoning and belief revision while taking commonsense background knowledge explicitly into account, in order to comply with this crucial aspect of Isberner and Richter's work. As a suitable framework, we choose Spohn's ordinal conditional functions (OCFs) [14,15] and the approach of c-representations and c-revisions [16,17], because this combination provides all methods necessary for a framework of plausible, inductive reasoning from background knowledge and iterated belief revision in the spirit of [3,18]. C-representations allow for (inductive) nonmonotonic reasoning of a very high quality, meeting basically all standards which have been proposed for nonmonotonic logics so far (cf. [16,17]). Moreover, c-revisions generalize c-representations, so that we can take advantage of a seamless methodological framework for all reasoning activities that we consider in the experiments. This is important both from a formal and a psychological point of view because such a unifying theory adequately models the close link between uncertain reasoning and belief revision (cf., e.g., [16]), which has also been pointed out in the psychological literature (cf., e.g., [19]). However, we would like to emphasize that the focus here is on the formal reasoning activities themselves (inductive conditional reasoning, plausible reasoning, and iterated belief revision) as potential causes for observed delays. This means that, unlike [19], the focus of this paper is not on the exact methodologies according to which plausible reasoning and belief revision are performed, but on how suitable formalisms can simulate and explain psychological findings on human thinking in general. Conceivably, other unifying frameworks of plausible reasoning that provide all the mentioned reasoning activities might work as well. It might be an interesting topic of future work to compare different specific formalisms with respect to their adequacy for modeling significant features of human reasoning.

The basic idea is to simulate the test persons' reasoning by first setting up a knowledge base of conditionals which express the relevant beliefs for the situation under consideration in a task within an experiment. Instead of using some kind of plausibility distribution right away, we thereby aim at making the plausible beliefs which form the relevant background knowledge that the test person may use for processing the information shown in the tasks as explicit and transparent as possible. Then, an OCF-c-representation is built up which serves as an epistemic model of this background belief base, making the test person ready for responding to the respective task, whose accompanying (but unrelated) information may require a revision process. Our claim is that this revision takes more or less time and needs more or less effort, depending on how compatible the new information is with the contextual epistemic state, and thus may cause delays or


irritations in the cognitive processes of the test persons. The tasks in the experiments of [12,13] are somewhat different: While the tasks in [12] require iterated propositional revision, the tasks in [13] demand revision by conditionals, or checking the plausibility of conditional statements, respectively. The low-knowledge condition in [13] is particularly challenging here because we have to find ways to represent background knowledge that is not sufficiently informative about the item shown in the task but nevertheless provides a basis for rational plausible reasoning. Using typical examples from the experiments of [12,13], we will illustrate in detail what happens when information comes in, and explain why response delays may occur from a knowledge representation perspective. Moreover, we also re-analyzed the findings of [12,13] in a post-hoc manner to see whether differences in processing different types of commonsense knowledge might be observed. Indeed, the different types of knowledge in the experiments yield different patterns of the Stroop-like effects that hint at differences in the way different types of knowledge are processed, in particular with respect to speed. Of course, further systematic investigations would be necessary, but our results give rise to first hypotheses.


This paper is a revised and substantially extended version of [20]. In particular, it provides the following novel contributions beyond [20]:


• We extend our discussion of viewpoints and results from language comprehension in psychology in order to illustrate the relevance that is assigned to knowledge and plausible beliefs about the world in this field.
• We also develop formal models for the experiment in [13] that addresses low-knowledge vs. high-knowledge situations and requires revision by conditionals.
• We investigate whether different types of knowledge (e.g., normative, ontological, or causal knowledge) in the experiments of [12,13] lead to different patterns in the Stroop-like effects, and present tentative empirical results of post-hoc analyses.

The rest of this paper is organized as follows: In section 2, we recall how c-representations and c-revisions based on ordinal conditional functions (OCFs) can serve as a general model for plausible (inductive) reasoning. Section 3 provides background information on the psychological field of language comprehension, describes the details of the experiments of [12,13], and summarizes their results. Section 4 then presents our formal epistemic modeling for simulating human reasoning in the tasks used by [12,13]. In particular, we deal both with high knowledge and low knowledge cases. For the high knowledge tasks, we distinguish between different types of knowledge and indeed find hints at different patterns of processing in section 5. Section 6 concludes the paper by summarizing and discussing its main contributions and pointing out future research directions.

2. Inductive reasoning and belief revision with OCFs

We build upon a propositional logical framework. Let L be a finitely generated propositional language, with atoms a, b, c, ... (also called propositional variables), and with formulas A, B, C, .... For conciseness of notation, we will omit the logical and-connector, writing AB instead of A ∧ B, and overlining formulas will indicate negation, i.e., Ā means ¬A. Let Ω denote the set of possible worlds over L; Ω will be taken here simply as the set of all propositional interpretations over L. ω ⊨ A means that the propositional formula A ∈ L holds in the possible world ω ∈ Ω; then ω is a model of A. Mod(A) denotes the set of all models of A. As usual, let ⊨ also denote the classical entailment relation between propositions, defined by A ⊨ B iff Mod(A) ⊆ Mod(B). By slight abuse of notation, we will use ω both for the model and the corresponding conjunction of positive or negated atoms. Since both readings of ⊨ coincide in this case, no misunderstandings will arise. The classical consequences of a set S of formulas are given by Cn(S) = {B ∈ L | S ⊨ B}.

Conditionals (B|A) over L, i.e., A, B ∈ L, are meant to express uncertain, plausible rules "If A then plausibly B" and will play a crucial role for modeling an agent's (background) knowledge. The language of all conditionals over L will be denoted by (L|L). A conditional (B|A) is verified by a world ω if ω ⊨ AB, and falsified if ω ⊨ AB̄. If ω does not satisfy A, then the conditional (B|A) is not applicable to ω. This accounts for the three-valued nature of conditionals that strongly distinguishes conditionals from material implications. A (finite) set R = {(B1|A1), ..., (Bn|An)} ⊂ (L|L) expresses plausible beliefs of a human being or an agent and is called a (conditional) knowledge base. We will use the terms knowledge and beliefs rather synonymously to denote propositions the agent is strongly convinced of, or deems to be most plausible. Indeed, a large (if not the major) part of our commonsense knowledge actually consists of beliefs, and conditionals are much better suited than classical binary logics to model such commonsense beliefs. Knowledge bases R should be consistent in the sense that they represent a coherent world view of the agent. This is certainly the case if it is possible to validate the plausibility of all conditionals of R within a formal epistemic framework. Such an epistemic framework can be set up via so-called ordinal conditional functions.

Ordinal conditional functions (OCFs, also called ranking functions) κ: Ω → ℕ ∪ {∞} with κ⁻¹(0) ≠ ∅ were first introduced by [14]. They express degrees of plausibility of propositional formulas A by specifying degrees of disbelief (or implausibility) of their negations Ā. More formally, we have κ(A) := min{κ(ω) | ω ⊨ A}, so that κ(A ∨ B) = min{κ(A), κ(B)}. Hence, due to κ⁻¹(0) ≠ ∅, at least one of κ(A), κ(Ā) must be 0, and altogether we have κ(⊤) = 0, where ⊤ denotes a tautology. A proposition A is believed if κ(Ā) > 0 (which implies in particular κ(A) = 0). Degrees of plausibility can also be assigned to conditionals by setting κ(B|A) = κ(AB) − κ(A). A conditional (B|A) is accepted in the epistemic state represented by κ, written as κ ⊨ (B|A), iff κ(AB) < κ(AB̄), i.e., iff AB is more plausible than AB̄. This can also be understood as a plausible, nonmonotonic inference: from A, we can plausibly derive B, in symbols A |∼κ B, iff κ(AB) < κ(AB̄). So, conditionals and plausible inferences can be related concisely by ordinal conditional functions:

\[ \kappa \models (B|A) \;\;\text{iff}\;\; \kappa(AB) < \kappa(A\overline{B}) \;\;\text{iff}\;\; A \mathrel{|\!\sim}_{\kappa} B. \tag{1} \]
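For readers who prefer executable definitions, the following small Python sketch (our own illustration, not code from the paper; the two-atom OCF is a hypothetical toy example) encodes possible worlds, the rank of a proposition, and the acceptance condition (1):

```python
from itertools import product

ATOMS = ["a", "b"]

def worlds(atoms):
    """Enumerate all possible worlds as dicts {atom: truth value}."""
    return [dict(zip(atoms, vals)) for vals in product([True, False], repeat=len(atoms))]

def rank(kappa, prop):
    """kappa(A) = min{kappa(omega) | omega |= A}; infinity if A has no model."""
    ranks = [r for w, r in kappa if prop(w)]
    return min(ranks) if ranks else float("inf")

def accepts(kappa, A, B):
    """kappa |= (B|A)  iff  kappa(AB) < kappa(A not-B), i.e. A |~ B  (equation (1))."""
    return rank(kappa, lambda w: A(w) and B(w)) < rank(kappa, lambda w: A(w) and not B(w))

# Illustrative OCF: worlds violating the conditional (b|a) get rank 1, all others rank 0.
kappa = [(w, 1 if (w["a"] and not w["b"]) else 0) for w in worlds(ATOMS)]

print(accepts(kappa, lambda w: w["a"], lambda w: w["b"]))  # True: a |~ b
print(rank(kappa, lambda w: not w["a"]))                   # 0: the agent does not believe a
```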

In this way, OCFs can provide semantics for validating conditionals and plausible inferences, and have become quite a popular model for (non-quantitative) epistemic states [21]. The most plausible beliefs represented by an OCF κ are contained in the set Bel(κ), which is the set of all formulas that are satisfied by all most plausible models, i.e., by all ω with κ(ω) = 0. More formally, we have

\[ \mathrm{Bel}(\kappa) = \mathrm{Cn}\Bigl(\textstyle\bigvee_{\omega:\,\kappa(\omega)=0}\,\omega\Bigr). \tag{2} \]

Note that, as explained in the beginning of this section, possible worlds ω are used both as models and as propositions in this paper. This allows for easy notation also in the context of OCFs: for the proposition ω, we have κ(ω) = 0 precisely if κ(ω) = 0 for the model ω, since the proposition ω has exactly one model, which is identified with ω. OCF rankings can be understood as logarithmic probabilities [21], so there are many analogies between OCFs and probability functions. In particular, the uniform OCF κu assigns the same rank to each world: κu(ω) = 0 for all ω ∈ Ω. Note that OCFs treat (plausible) propositions A in the same way as the conditional (A|⊤), so we consider only conditionals as elements of our knowledge bases but keep in mind that knowledge bases can also contain plausible propositions. It would also be possible to take strict knowledge into account, e.g., categorical knowledge like "cars are vehicles" or "penguins are birds", which is processed in a classical logical way by excluding possible worlds. But this would make the formalism a bit more complex without providing significant additional insights because the focus of this paper is on plausible (not necessarily strict) commonsense knowledge.

Given a knowledge base R, which usually expresses most relevant, but only partial, knowledge about the world, a crucial task is to find an epistemic state that validates R and completes it with plausible inferences that can be drawn from R. This process of completing a knowledge base towards an epistemic state is often called inductive reasoning. In our framework, this means that we have to compute an OCF κ that accepts all conditionals in R and can then be used for plausible reasoning. We will use here the approach of c-representations that allows for an easy generalization to also handle revision tasks [16,17]. A c-representation of a knowledge base R = {(B1|A1), ..., (Bn|An)} is an OCF κR of the form

\[ \kappa_{R}(\omega) = \sum_{\substack{1 \le i \le n\\ \omega \models A_i \overline{B_i}}} \kappa_i^{-} \tag{3} \]

with non-negative integers κi⁻ that are chosen in such a way as to ensure that κR ⊨ R, i.e.,

\[ \kappa_i^{-} > \min_{\omega \models A_i B_i} \sum_{\substack{j \ne i\\ \omega \models A_j \overline{B_j}}} \kappa_j^{-} \;-\; \min_{\omega \models A_i \overline{B_i}} \sum_{\substack{j \ne i\\ \omega \models A_j \overline{B_j}}} \kappa_j^{-}. \tag{4} \]

Such a κR can then be used to check arbitrary inferences: for any A, B ∈ L, we can compute and compare κR(AB) and κR(AB̄) and decide if A |∼κR B holds (cf. (1)). Note that we might also have κR(AB) = κR(AB̄); in this case neither A |∼κR B nor A |∼κR B̄ holds, i.e., the agent is undecided about the inference. C-representations are not uniquely determined; (3) and (4) provide rather a schema to build up model-based epistemic representations of knowledge bases. In this paper, we will take minimal c-representations as suitable epistemic models, i.e., c-representations for which the κi⁻ have been chosen in a minimal way. The rationale behind this is that (im)plausibility degrees should be as low as possible in order not to generate differences between ranks arbitrarily, which would result in unjustified conditional knowledge, or plausible inferences, respectively, due to (1). This is in line with similar approaches like system Z (cf. [21]).

Maintaining one epistemic state is not enough: new information comes in, and the agent (or the human) has to incorporate this information into her epistemic state. So, the epistemic state has to be changed, i.e., we have to perform a belief change operation (usually denoted with the symbol ∗) to adapt the agent's beliefs to the current state of the world. Belief revision theory [3] provides many approaches to tackle this problem (for a recent overview, see [22]). In this paper, we make use of c-revisions [16,17] as a powerful approach to handle the advanced belief change tasks that we need here. C-revisions provide a schema for iterated revision not only by (multiple) propositional beliefs, but also by (sets of) conditionals, which will be necessary for dealing with Experiment 2. In more detail, c-revisions are able to solve the following problem: Given a prior OCF κ and a set R of conditionals that represent new information, compute a posterior κ∗ = κ ∗ R that accepts R and still uses as much of the information of the prior κ as possible. Formally, a c-revision of κ by R = {(B1|A1), ..., (Bn|An)} is an OCF κ∗ = κ ∗ R of the form

\[ \kappa^{*}(\omega) = \kappa_0 + \kappa(\omega) + \sum_{\substack{1 \le i \le n\\ \omega \models A_i \overline{B_i}}} \kappa_i^{-} \tag{5} \]

with non-negative integers κi⁻ that are chosen in such a way as to ensure that κ∗ ⊨ R, i.e.,

\[ \kappa_i^{-} > \min_{\omega \models A_i B_i} \Bigl(\kappa(\omega) + \sum_{\substack{j \ne i\\ \omega \models A_j \overline{B_j}}} \kappa_j^{-}\Bigr) \;-\; \min_{\omega \models A_i \overline{B_i}} \Bigl(\kappa(\omega) + \sum_{\substack{j \ne i\\ \omega \models A_j \overline{B_j}}} \kappa_j^{-}\Bigr), \tag{6} \]

and κ0 is just a normalizing factor that makes κ∗ a proper OCF satisfying κ∗(⊤) = 0. The similarity between c-representations and c-revisions is obvious; c-revisions are indeed built in a way that is very similar to c-representations. Actually, c-representations arise from (conditional) c-revisions when a uniform epistemic state κu is revised by a set of conditionals, so both share the same theoretical foundations. Revisions by plausible propositional beliefs are also subsumed as the special cases where the conditionals have a tautological antecedent, due to A ≡ (A|⊤). For this paper, we only need c-revisions in quite a simple form because the new information will be just one plausible proposition or conditional, that is, we only have to find a revision of an epistemic prior κ by a proposition A, or by a conditional (B|A). Under the assumption that we are going for minimal c-revisions, handy and unique forms of c-revisions can be used for these cases (for more details, please see, e.g., [16,17,23]):

\[ (\kappa * A)(\omega) = \begin{cases} \kappa(\omega) - \kappa(A), & \text{if } \omega \models A\\[2pt] \kappa(\omega) + \max\{0,\, -\kappa(\overline{A}) + 1\}, & \text{if } \omega \models \overline{A} \end{cases} \tag{7} \]

\[ (\kappa * (B|A))(\omega) = -\kappa(\overline{A} \vee B) + \begin{cases} \kappa(\omega), & \text{if } \omega \models \overline{A} \vee B\\[2pt] \kappa(\omega) + \kappa(AB) - \kappa(A\overline{B}) + 1, & \text{if } \omega \models A\overline{B} \end{cases} \tag{8} \]

If already κ ⊨ A resp. κ ⊨ (B|A), then κ ∗ A = κ resp. κ ∗ (B|A) = κ; otherwise, for κ ∗ A, A-worlds are shifted downwards by κ(A) and Ā-worlds are shifted upwards by 1, and for κ ∗ (B|A), AB̄-worlds are shifted upwards, and normalization does the rest. This close connection between c-representations and c-revisions is not just a technical trick but allows for a smooth integration of plausible inference and belief revision, in very much the same spirit that is revealed by the so-called Ramsey test [24] (phrased here for epistemic states represented by OCFs):

\[ \mathrm{Bel}(\kappa * A) \models B \quad\text{iff}\quad A \mathrel{|\!\sim}_{\kappa} B. \tag{9} \]

It is straightforward to check that (9) holds for c-revisions. Together with (1), both plausible inference and belief revision are thus also very closely related to conditional reasoning. In particular, checking if Bel(κ ∗ A) ⊨ B is equivalent to checking if κ ⊨ (B|A), and both can be validated iff κ(AB) < κ(AB̄), thanks to (1). After having laid the formal grounds for our approach to explaining monitoring in language comprehension, we explain in more detail how language comprehension is evaluated in [12] and [13].
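To illustrate how the schemata (3), (4), and (7) can be operationalized, the following Python sketch (our own illustration; the bird/penguin knowledge base is a hypothetical toy example, not one of the stimulus items used in the experiments) searches for a minimal c-representation by brute force and then performs a c-revision by a proposition:

```python
from itertools import product

def worlds(atoms):
    return [dict(zip(atoms, vs)) for vs in product([True, False], repeat=len(atoms))]

def rank(kappa, prop):
    ranks = [r for w, r in kappa if prop(w)]
    return min(ranks) if ranks else float("inf")

def accepts(kappa, A, B):
    # kappa |= (B|A)  iff  kappa(AB) < kappa(A not-B), cf. equation (1)
    return rank(kappa, lambda w: A(w) and B(w)) < rank(kappa, lambda w: A(w) and not B(w))

def c_representation(atoms, conditionals, bound=4):
    """Brute-force schema (3): try non-negative impact vectors (kappa_i^-) with the
    smallest sum first and return the first OCF that accepts every conditional."""
    W = worlds(atoms)
    for impacts in sorted(product(range(bound + 1), repeat=len(conditionals)), key=sum):
        kappa = [(w, sum(k for (A, B), k in zip(conditionals, impacts)
                         if A(w) and not B(w))) for w in W]
        if all(accepts(kappa, A, B) for A, B in conditionals):
            return kappa
    raise ValueError("no c-representation found within the search bound")

def revise(kappa, A):
    """Minimal c-revision by a plausible proposition A, cf. equation (7)."""
    k_A, k_notA = rank(kappa, A), rank(kappa, lambda w: not A(w))
    return [(w, r - k_A) if A(w) else (w, r + max(0, 1 - k_notA)) for w, r in kappa]

# Hypothetical toy knowledge base: birds fly, penguins do not fly, penguins are birds.
R = [
    (lambda w: w["b"], lambda w: w["f"]),
    (lambda w: w["p"], lambda w: not w["f"]),
    (lambda w: w["p"], lambda w: w["b"]),
]
kappa = c_representation(["p", "b", "f"], R)

print(accepts(kappa, lambda w: w["b"], lambda w: w["f"]))      # True: birds plausibly fly
print(accepts(kappa, lambda w: w["p"], lambda w: not w["f"]))  # True: penguins plausibly do not
kappa_penguin = revise(kappa, lambda w: w["p"])                # learn "it is a penguin"
print(rank(kappa_penguin, lambda w: w["f"]) > 0)               # True: flying is now disbelieved
```

The last three lines show the nonmonotonic behavior that motivates the whole framework: the generic inference "flies" is withdrawn once the more specific information "penguin" is incorporated by revision.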

3. Plausibility monitoring in psychological language comprehension


A widely accepted view in psychology is that language comprehension and reasoning are separate processes with little overlap. According to this view, reasoning processes operate on the output of the comprehension process and are thus subsequent to language comprehension. They are also often assumed to be optional, in the sense that they are only carried out if the comprehender has the motivation and the ability to reason about the linguistic input she has comprehended. Judgments of the truth or plausibility of linguistic input are assumed to pertain to the domain of reasoning and, thus, not to affect the comprehension stage of information processing. Based on this idea, two-step models of comprehension and validation have been predominant which either assume that comprehension proceeds without any evaluative component (e.g., [25]), or that the linguistic input is by default initially accepted as true and can only effortfully be unbelieved at a later point (e.g., [26,27]). Thus, a common assumption of these two-step models is that readers need to actively question the plausibility of information to notice inconsistencies with their world knowledge. This would imply that it is possible for readers to comprehend linguistic information while ignoring whether or not it aligns with what they know or believe about the world.

From a formal linguistic perspective, this assumption seems plausible: The comprehension process of a sentence such as "Dinosaurs are extinct" should essentially be the same as the comprehension process of a sentence such as "Dogs are extinct". However, studies have shown that this is not actually the case (e.g., [28,29]): Even when people are not asked to perform any judgment of truth or plausibility on sentences they are asked to comprehend, real-world truth or plausibility (and even the agreement with moral positions; [30]) immediately affects language processing, within as little as a few hundred milliseconds after the presentation of a word rendering the linguistic input false, implausible, or inconsistent with the reader's (moral) beliefs. In fact, it has been shown that people can even use their real-world knowledge to anticipate the continuation of linguistic input before it is complete [31]. This shows that knowledge and beliefs about the world are not only used for reasoning activities subsequent to comprehension; they are an inherent part of the comprehension process itself, calling into question a strict distinction between comprehension and reasoning processes. At the same time, it is unlikely that language comprehension would, under normal circumstances, comprise extensive reasoning activities, unless the comprehender has the motivation to effortfully analyze the linguistic input. So, the question is: To what extent are reasoning processes an inherent part of language comprehension?


Table 1
Paradigms that have been used to demonstrate Stroop-like effects.

Stroop paradigm (Task: Name the color; Stroop, 1935)
  Congruent: blue, green, red, yellow (each word printed in the matching font color)
  Incongruent: blue, green, red, yellow (each word printed in a mismatching font color)

Epistemic Stroop paradigm (various tasks; Isberner & Richter, 2013, 2014; Richter, Schroeder, & Wöhrmann, 2009)
  Congruent: positive response after a true/plausible sentence; negative response after a false/implausible sentence
  Incongruent: negative response after a true/plausible sentence; positive response after a false/implausible sentence

In both paradigms: response latencies (congruent) < response latencies (incongruent), and error rates (congruent) < error rates (incongruent).

There are a few exceptions to the separate treatment of reasoning and language comprehension in psychology (e.g., [32]). Most prominently, [8] proposed the concept of mental or situation models to explain both reasoning and language processing. Based on this idea, most modern theories of language comprehension agree that understanding a text is tantamount to constructing an (adequate) situation model of what the text is about, for which readers need to integrate the text information with their knowledge about the world [8–10]. This raises the question of which knowledge readers access and use to construct a situation model. Resulting questions that have been investigated in psychology and that connect both domains are which kinds of inferences are drawn "on-line" (during) vs. "off-line" (after) language comprehension (e.g., [33,34]) and which dimensions of situation models (e.g., causal, temporal or spatial relations) are routinely monitored during comprehension [35].

Richter and colleagues [11] adapted the classical psychological paradigm of the Stroop effect (cf. [7], see also section 1) to show that epistemic truth (or validity) is routinely monitored during language comprehension. Isberner and Richter [12,13] extended this research and tested whether people could ignore plausibility in a task for which an assessment of plausibility was irrelevant and detrimental. They assumed that if plausibility monitoring is automatic, the negative outcome of the plausibility monitoring process for implausible sentences should facilitate negative and interfere with positive responses in any subsequent task, whereas the positive outcome for plausible sentences should have the opposite effect. Thus, they expected a Stroop-like effect (termed epistemic Stroop effect) caused by the incongruence between plausibility and required response, with responses matching the outcome of the assumed plausibility monitoring process being faster and less error-prone than responses mismatching the outcome (cf. Table 1):


(In)Congruence Effect A: Positive responses are facilitated by plausible and interfered with by implausible sentences.
(In)Congruence Effect B: Negative responses are facilitated by implausible and interfered with by plausible sentences.


Facilitation and interference are measured both with respect to response time (i.e., faster vs. slower) and error rate (i.e., less vs. more error-prone). These effects are assumed to be symmetrical, yet it is possible to observe one effect without observing the other. First, we describe the tasks and results of Experiments 1A and 1B; then we turn to Experiment 2.

In the experiments by Isberner and Richter [12], participants read sentence pairs describing everyday situations that were either plausible or implausible based on general world knowledge (cf. the task example in Table 2). These sentence pairs were presented word by word on a computer screen (300 ms/word). Their plausibility hinged on the last word of the sentence pair (target word). 300 ms after the target word appeared, participants were prompted to perform a task on the target word. The task was either an orthographic task in which participants were asked to indicate whether the target word was spelled correctly or not (Experiment 1A), or a color judgment task in which participants judged whether or not the target word had changed color (Experiment 1B). Thus, participants in both tasks were required to provide positive (yes) and negative (no) responses unrelated to the plausibility of the preceding sentence pair.

The results of both experiments showed the expected (in)congruence effect of plausibility and required response; however, it was only found for the response latencies (not the error rates) and only for positive responses, which were significantly slower for target words that made the described situation implausible than for target words that made it plausible, while negative responses were either also slower for implausible target words (Experiment 1A) or not affected (Experiment 1B). That is, only (in)congruence effect A could be reliably confirmed with respect to response times in this experiment (cf. Fig. 1 and Fig. 2).

Isberner and Richter [12] also tried to rule out that their results could be explained by predictability rather than plausibility (plausible target words are usually more predictable than implausible target words) by using target words that were either predictable or unpredictable, where (un)predictability is always assessed with respect to the plausible context. Predictability had been ascertained empirically before. In the context of knowledge representation, predictability could be interpreted as specificity, or informativeness.



Fig. 1. Mean correct response latencies for Experiment 1A from [12].


Fig. 2. Mean correct response latencies for Experiment 1B from [12].


Table 2
Sentence pairs from Experiment 1 by [12].

Condition                        Item version
Plausible, predictable word      Frank has a broken pipe. He calls the plumber.
Implausible, predictable word    Frank has a broken leg. He calls the plumber.
Plausible, unpredictable word    Frank has a broken pipe. He calls the tradesman.
Implausible, unpredictable word  Frank has a broken leg. He calls the tradesman.

Although unpredictable target words were generally responded to more slowly, the overall pattern of a delay of positive responses to implausible as compared to plausible target words did not significantly differ from the pattern for predictable target words, which supports the prediction that this delay is due to plausibility rather than predictability.

In [12], a within-items manipulation of plausibility and predictability is used, meaning that for each possible combination of both variables (i.e., for each condition), a different version of the same item was constructed. Each participant saw only one version of each item but the same number of items in each condition, such that across participants, all versions of each item were used. We chose the items from Table 2 as a typical example of the stimuli in [12]. Each pair of sentences makes up one task. The item version shows the sentences, while the condition specifies the precise combination of plausibility and predictability used for the respective task. Plausibility and predictability refer to the last word of each sentence pair (given the context), which is crucial for the task. Table 3 shows the average response times for positive and negative answers with respect to each combination of plausibility and predictability in Experiment 1A. It is clearly seen that in any case, responses were significantly slower for implausible words.

In another experiment, [13] tried to rule out the concern raised by Wiswede et al. [36] that the evaluative nature of both the orthographic and the color judgment task (i.e., having to make a positive or negative judgment about the orthographical correctness or the color of the target word) might have induced an evaluative mindset in participants which encouraged plausibility judgments and thus might explain the obtained effects of task-irrelevant plausibility.


Table 3
Response times in ms (means (M) and standard deviations (SD) by experimental condition) of Experiment 1A in [12]. Means and standard deviations are based on participants as units of observation.

Condition                      Plausible          Implausible
                               M      (SD)        M      (SD)
Predictable
  Positive response            953    (280)       1034   (333)
  Negative response            873    (261)       927    (331)
Unpredictable
  Positive response            1026   (293)       1171   (371)
  Negative response            995    (317)       1009   (310)


Fig. 3. Trial structure of Experiment 2.


Table 4
Sentence pairs from Experiment 2 by [13].

Knowledge level   Validity   Item version
high              valid      Gold is valuable.
high              invalid    Sunburn is healthy.
low               valid      Woolly rhinos are extinct.
low               invalid    Curry contains salt.

This would mean that plausibility monitoring is in fact not the default mode of language processing (as the experiments by [12] and [11] suggested) but rather an artifact of the task. To rule out this potential alternative explanation, [13] used a non-evaluative task introduced by Wiswede et al. [36] which only required responding to one of two probe words presented immediately after the end of each sentence: "TRUE" or "FALSE" (see Fig. 3). The presented probe word was randomly selected on each trial and was thus independent of the actual truth value of the sentence, and only identification (not evaluation) of the probe word was necessary to perform the task. To make sure that participants still read the sentences, Isberner and Richter presented comprehension questions after a proportion of the trials.

In this experiment, [13] used stimuli as in Table 4, which were similar to those used by Wiswede et al. [36] and identical to those used by [11]. These sentences varied on two dimensions: their truth value (or validity; valid vs. invalid) and whether participants actually possessed the knowledge necessary to assess their truth value (high vs. low knowledge; this had been ascertained empirically beforehand with a separate sample of participants from the same population). As expected, [13] found (in)congruence effects of validity and required response for the high knowledge stimuli but not for the low knowledge stimuli (cf. Figs. 4 and 5). There was again an (in)congruence effect for positive responses in the response latencies, but this time also an (in)congruence effect for negative responses in the error rates.



Fig. 4. Mean correct response latencies and mean error rates for low knowledge for Experiment 2 from [13].


Fig. 5. Mean correct response latencies and mean error rates for high knowledge for Experiment 2 from [13].

In addition, comparisons with the low knowledge condition revealed that having knowledge that allowed participants to assess the real-world truth value of a sentence made it easier to respond to the target word (as in reduced response latencies and error rates) when it matched the actual (but task-irrelevant) truth value of that sentence. In the next section, we show how formal models based on the reasoning and revision techniques from section 2 can help explain the response latencies in the experiments by [12,13].

4. OCF-based models for plausibility monitoring

First, we outline the general design of our formal models for Experiments 1 and 2; both modelings share the same basic set-up and make use of the techniques from section 2. Then we describe the formal models for the experiments in detail, and comment on their relevance for explaining the findings in [12,13].

4.1. General design of the formal epistemic models

The experiments described in section 3 show clearly that the plausibility of information has a significant influence on how information is processed, and how decisions for actions are made. In general, we may take the message from the experiments of [12,13] that implausible information causes delays in the reactions of agents, part of which may be due to resolving conflicts while reasoning. In the following, we focus on the cognitive part of these experiments, namely, on the rational thought processes that allow plausible statements to be processed more smoothly in the mind of agents.

In this section, we elaborate on the reasoning processes that are necessary to solve the tasks by setting up an epistemic model for typical examples used in the experiments with the help of OCFs. By the formal means of conditional and inductive reasoning resp. belief revision (see section 2), we are able to simulate the information processing within a test person's mind and to explain the observations from [12,13] from a cognitive point of view. More precisely, re-interpreting the experiments in the context of agents, we focus on the belief module and leave the closer consideration of the (re)action module (i.e., how can Stroop-like effects be modeled?) for future work. As a key feature of our approach, we will make background knowledge explicit to show its crucial impact on reasoning. In all examples from [12,13], test persons are expected to use commonsense knowledge to validate the statements, and the background knowledge will provide the base for that.

Experiments 1 [12] and 2 [13] both deal with plausibility monitoring, yet the designs of the respective tasks are a bit different: While the tasks in experiment 1 tell a little story and a plausibility conflict may arise after the first sentence


has been processed and then the second sentence is perceived (see Table 2), the tasks in experiment 2 are much briefer, consisting of just one sentence (see Table 4), and might be understood as checking whether some conditional statement is plausible, and hence compatible with background knowledge, or not. In any case, perceived implausibilities will demand conflict resolution and may cause delays in cognitive processing due to (hypothetical) revision activities, even if they result in the decision not to adopt the new information. In Experiment 1, this new information arrives in two separate pieces, while in Experiment 2, it comes in as a conditional statement. Therefore, we model the tasks from Experiment 1 as iterated belief revision tasks, and the tasks of Experiment 2 as conditional revision tasks. But note that in the case of Experiment 2, the revised epistemic state would be of no further use since no further revision is necessary. So it will be enough to evaluate whether the conditional statement is plausible with respect to background knowledge, or not. It is only in the latter case that revision activities are necessary, and hence delays in cognitive processing may occur.

Thanks to the close relation between belief revision, plausible inference, and conditional reasoning via the Ramsey test (9), from the point of view of methodology, this difference in modeling of the two experiments will not cause substantial differences, in particular in the chosen framework of OCFs and c-representations, resp. c-revisions. So, even if we decide to consider the conditional statements in Experiment 2 as information consisting of two pieces (e.g., first process sunburn, then (being) healthy), there will be no difference with respect to plausibility evaluation between whether we consider the tasks as belief revision or conditional acceptance tasks: The second piece of information will be evaluated as implausible after revision by the first piece of information iff the respective conditional statement as an entity is evaluated as implausible with respect to background knowledge, i.e., in the prior epistemic state. This is exactly what the Ramsey test (9) together with (1) says, and indeed, the framework of c-revisions offers both options. We will illustrate this when discussing the formal models for Experiment 2.

Due to this methodological distinction, the courses of the tasks in Experiments 1 and 2 are modeled slightly differently. In both experiments and in all examples, we start with building up a prior epistemic state represented by an OCF that reflects relevant background knowledge in an appropriate way. To this end, we first set up a knowledge base for the respective domain, showing background knowledge explicitly, and then take a c-representation of this knowledge base as prior κ. Then, in Experiment 1, the first sentence A arrives, triggering an adaptation of κ to A to set up the epistemic context (situation model) for evaluating the second sentence B. In our approach, this is modeled by a belief revision operation κ ∗ A which we realize by a (minimal) c-revision. When the second piece of information B arrives, the crucial question for evaluating the plausibility of B is: What is the formal-logical relationship between κ ∗ A and B? If κ ∗ A ⊨ B, then B is plausible in the context of κ ∗ A, but if κ ∗ A ⊨ ¬B, then there is a conflict between the new information B and what the test person's current epistemic context validates as plausible.
Solving this conflict, or even merely deciding to ignore it, takes time and causes the observed delay. In Experiment 2, the stimulus items basically follow a more straightforward pattern ("if A then B" is plausible) and hence will be dealt with as conditionals. In this case, we check whether κ ⊨ (B|A) holds for the prior κ. For Experiment 2, we will compute a full revised epistemic state only exemplarily, to illustrate revision by conditionals. However, considerably more effort for modeling background knowledge is needed in the low knowledge case.
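As a minimal illustration of this decision step (our own sketch, not code from the paper; the single-atom context at the end is a hypothetical example), the following Python function classifies a target proposition B against a contextual ranking κ ∗ A, using the fact that a proposition is believed iff its negation has a positive rank:

```python
def rank(kappa, prop):
    """kappa is a list of (world, rank) pairs; kappa(A) = min rank of a model of A."""
    return min(r for w, r in kappa if prop(w))

def plausibility_status(kappa_ctx, B):
    """Classify the target information B against the contextual state kappa * A."""
    if rank(kappa_ctx, lambda w: not B(w)) > 0:
        return "plausible"   # Bel(kappa * A) |= B: smooth processing expected
    if rank(kappa_ctx, B) > 0:
        return "conflict"    # Bel(kappa * A) |= not-B: a true revision is needed, delay expected
    return "open"            # the context does not decide B

# Tiny illustration with a single atom u ("calls the plumber"): the context believes u.
ctx = [({"u": True}, 0), ({"u": False}, 1)]
print(plausibility_status(ctx, lambda w: w["u"]))       # plausible
print(plausibility_status(ctx, lambda w: not w["u"]))   # conflict
```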


4.2. Formal OCF-model for Experiment 1

We take up the example given in Table 5 and work out the formal model for this example in full; analogous modelings can be set up for all items in the experiment.


First, we have to take care of modeling relevant background knowledge. We use the following logical variables for this:

Variables:
  p  having a broken pipe        l  having a broken leg
  u  calling the plumber         t  calling the tradesman
  d  calling the doctor

p symbolizes the sentence "Frank has a broken pipe", and the other variables have an analogous reading. We specify relevant background knowledge by the following knowledge base Rp = {(u|p), (d|l), (t|u), (t̄|d)}:

Background knowledge Rp:
  (u|p)   If one has a broken pipe, one usually calls the plumber.
  (d|l)   If one has a broken leg, one usually calls the doctor.
  (t|u)   A plumber is (usually) a tradesman.
  (t̄|d)   A doctor is usually not a tradesman.

Applying the technique of c-representations and choosing minimal parameters κi⁻, we obtain the OCF κp shown in Table 5. The calculations for setting up κp are straightforward using equations (3) and (4), so we explain them with just some examples. For a c-representation of Rp, we need four parameters κ1⁻, κ2⁻, κ3⁻, κ4⁻, each κi⁻ being associated with the i-th conditional in Rp, e.g., κ2⁻ is associated with the second conditional (d|l). Since there are no direct conflicts between the conditionals in Rp, we obtain κi⁻ > 0 for all four parameters from (4), so we can choose them minimally by setting κi⁻ = 1 for i = 1, ..., 4. This means that for all worlds ω, we just have to count how many conditionals in Rp are falsified by ω, according to (3). For example, the world uptld (in which Frank has a broken pipe and a broken leg, and he calls someone who is a plumber, a doctor, and a tradesman²) falsifies only the fourth conditional (t̄|d) and hence has the κp-rank 1. Analogously, the world upt̄ld̄ (in which Frank has a broken pipe and a broken leg, and he calls someone who is a plumber, but neither a tradesman nor a doctor) falsifies (d|l) and (t|u) and hence is assigned the rank κ2⁻ + κ3⁻ = 2. Generally, the more conditionals from Rp a world ω falsifies, the higher is its κp-rank, and the less plausible ω is. Exactly six worlds do not falsify any conditional in Rp and thus have κp-rank 0: these are the models uptl̄d̄, up̄tl̄d̄, ūp̄tl̄d̄, ūp̄t̄ld, ūp̄t̄l̄d, and ūp̄t̄l̄d̄. So, in her initial epistemic state, the agent accepts all conditionals in Rp but is indifferent with respect to all logical variables p, l, u, t, d, that is, she believes neither the positive nor the negative form of each variable.

² Note that in a propositional framework, we cannot model statements about different objects, so in order to keep the modeling consistent, it must always be Frank who has a broken pipe or a broken leg, and it must always be some other person who can be a plumber, a doctor, or a tradesman.

Table 5
Prior κp and revised OCFs κp ∗ p, κp ∗ l, and (κp ∗ p) ∗ ūt for the broken pipe example from Table 2.

ω         κp   κp ∗ p   κp ∗ l   (κp ∗ p) ∗ ūt
uptld      1      1        1        2
uptld̄      1      1        1        2
uptl̄d      1      1        2        2
uptl̄d̄      0      0        1        1
upt̄ld      1      1        1        2
upt̄ld̄      2      2        2        3
upt̄l̄d      1      1        2        2
upt̄l̄d̄      1      1        2        2
up̄tld      1      2        1        3
up̄tld̄      1      2        1        3
up̄tl̄d      1      2        2        3
up̄tl̄d̄      0      1        1        2
up̄t̄ld      1      2        1        3
up̄t̄ld̄      2      3        2        4
up̄t̄l̄d      1      2        2        3
up̄t̄l̄d̄      1      2        2        3
ūptld      2      2        2        1
ūptld̄      2      2        2        1
ūptl̄d      2      2        3        1
ūptl̄d̄      1      1        2        0
ūpt̄ld      1      1        1        2
ūpt̄ld̄      2      2        2        3
ūpt̄l̄d      1      1        2        2
ūpt̄l̄d̄      1      1        2        2
ūp̄tld      1      2        1        1
ūp̄tld̄      1      2        1        1
ūp̄tl̄d      1      2        2        1
ūp̄tl̄d̄      0      1        1        0
ūp̄t̄ld      0      1        0        2
ūp̄t̄ld̄      1      2        1        3
ūp̄t̄l̄d      0      1        1        2
ūp̄t̄l̄d̄      0      1        1        2

Then the first information comes in: "Frank has a broken pipe/leg", i.e., the test person comes to know p resp. l and has to incorporate this information into κp. We compute κp ∗ p resp. κp ∗ l, which are also shown in Table 5. Let us first consider κp ∗ p, which is computed via (7) by shifting p̄-worlds upwards by 1. Since there is only one world ω with κp ∗ p(ω) = 0, namely ω = uptl̄d̄, we have κp ∗ p ⊨ uptl̄d̄: after reading "Frank has a broken pipe", the agent believes that Frank has a broken pipe and that he calls the plumber (who is a tradesman), but she does not believe that he has a broken leg, nor that he calls the doctor. So, when she then reads that "He calls the plumber", this fits her beliefs perfectly. Therefore, a revision of κp ∗ p by the new information u is effortless, we have (κp ∗ p) ∗ u = κp ∗ p, and thus it does not cause any delay. However, if the agent first comes to know "Frank has a broken leg", her revision κp ∗ l yields belief in ūp̄t̄ld: now she believes that Frank has a broken leg and calls the doctor, but also that Frank does not have a broken pipe and, in particular, that he does not call the plumber (nor a tradesman). So, the next information "He calls the plumber" contradicts what she believes, and the adaptation of κp ∗ l to u needs a true revision to solve the conflict: (κp ∗ l) ∗ u ≠ κp ∗ l. For the same reason, the sentences "Frank has a broken leg. He calls the tradesman." cause a confusion of the test person because after learning Frank has a broken leg, she believes ūp̄t̄ld, so tradesman is implausible.

The example "Frank has a broken pipe. He calls the tradesman." is more intricate. In the experiments, Isberner and Richter [12] noticed a slight delay here in any case.
This cannot be explained straightforwardly by our modeling since, when learning Frank has a broken pipe, also tradesman is plausible (κp ∗ p ⊨ uptl̄d̄). A first explanation for this effect can be given by looking closer at the knowledge base Rp: Here, the conditional (u|p) establishes an immediate connection between p and u, while t is entailed from p only via a transitive chaining of the conditionals (u|p) and (t|u). This would imply that the agent does not reason from the full epistemic state κp ∗ p in any case but takes the knowledge base as a more compact representation of her beliefs. Only in cases where she is not able to derive an answer directly from the knowledge base does she initiate the more complex reasoning process of computing κp ∗ p. Note that (naive) transitive chaining is not allowed in general because other conditionals might interfere. But in the case considered in this particular example, transitive chaining of (u|p) and (t|u) would be allowed since κp ⊨ (t|p) because of κp(pt) = 0 < 1 = κp(pt̄).
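The following Python sketch (our own illustration, not code from the paper) recomputes the ranks underlying Table 5 and the checks just discussed: κp is obtained by counting the conditionals of Rp falsified by each world, the revisions by p and l follow equation (7), and the plausibility of u ("He calls the plumber") is then read off the revised states:

```python
from itertools import product

ATOMS = ["u", "p", "t", "l", "d"]   # plumber, pipe, tradesman, leg, doctor

# Background knowledge R_p as (antecedent, consequent) pairs over a world dict.
R_p = [
    (lambda w: w["p"], lambda w: w["u"]),       # (u|p)
    (lambda w: w["l"], lambda w: w["d"]),       # (d|l)
    (lambda w: w["u"], lambda w: w["t"]),       # (t|u)
    (lambda w: w["d"], lambda w: not w["t"]),   # (not-t|d)
]

def worlds(atoms):
    return [dict(zip(atoms, vs)) for vs in product([True, False], repeat=len(atoms))]

def rank(kappa, prop):
    return min(r for w, r in kappa if prop(w))

def accepts(kappa, A, B):
    return rank(kappa, lambda w: A(w) and B(w)) < rank(kappa, lambda w: A(w) and not B(w))

def believes(kappa, B):
    """Bel(kappa) |= B iff every rank-0 world satisfies B."""
    return all(B(w) for w, r in kappa if r == 0)

def revise(kappa, A):
    """Minimal c-revision by a proposition A, cf. equation (7)."""
    k_A, k_notA = rank(kappa, A), rank(kappa, lambda w: not A(w))
    return [(w, r - k_A) if A(w) else (w, r + max(0, 1 - k_notA)) for w, r in kappa]

# Minimal c-representation of R_p: all impacts equal 1, so kappa_p just counts
# the conditionals falsified by a world (first column of Table 5).
kappa_p = [(w, sum(1 for A, B in R_p if A(w) and not B(w))) for w in worlds(ATOMS)]

ctx_pipe = revise(kappa_p, lambda w: w["p"])   # "Frank has a broken pipe."
ctx_leg  = revise(kappa_p, lambda w: w["l"])   # "Frank has a broken leg."

print(believes(ctx_pipe, lambda w: w["u"]))                  # True: "He calls the plumber." fits
print(believes(ctx_leg,  lambda w: not w["u"]))              # True: plumber now conflicts -> delay
print(accepts(kappa_p, lambda w: w["p"], lambda w: w["t"]))  # True: kappa_p |= (t|p)
```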

2 Note that in a propositional framework, we cannot model statements about different objects, so in order to keep the modeling consistent, it must always be Frank who has a broken pipe or a broken leg, and it must always be some other person who can be a plumber, a doctor, or a tradesman.


Table 6
Epistemic state κ_s built upon R_s for the sunburn example from Table 4.

ω        κ_s(ω)      ω        κ_s(ω)
sbih     1           s̄bih     1
sbih̄     0           s̄bih̄     0
sbīh     1           s̄bīh     1
sbīh̄     1           s̄bīh̄     1
sb̄ih     2           s̄b̄ih     1
sb̄ih̄     1           s̄b̄ih̄     0
sb̄īh     1           s̄b̄īh     0
sb̄īh̄     1           s̄b̄īh̄     0

Another explanation could be as follows: Reading t leaves the test person with the options ut or ūt. Given Grice's conventions for communication [37], she may assume that the information given to her in the test is as specific as needed³ (which would be u, or ut). That is, after reading t, she might wonder whether actually ūt is meant. But ūt is as implausible as d in the context of κ_p ∗ p: κ_p ∗ p(ūt) = 1 while κ_p ∗ p(ut) = 0. Because of κ_p ∗ p(ut) = 0 < κ_p ∗ p(ūt) = 1, the test person accepts the conditional (u|t) in the epistemic state κ_p ∗ p – if a tradesman is called, it is plausibly a plumber. The option ūt, which might appear realistic, would not only violate the current beliefs of the test person; its incorporation (κ_p ∗ p) ∗ ūt, which is also shown in Table 5, even casts doubt on whether p is true: we have Bel((κ_p ∗ p) ∗ ūt) = Cn(ūptl̄d̄ ∨ ūp̄tl̄d̄) = Cn(ūtl̄d̄), that is, the test person would be uncertain whether p still holds. Anticipating these problems, the test person might adhere to the plausible option ut, but even deliberating about this costs time and may cause delays. The symmetric effect of positive vs. negative responses observed (but not expected) by [12] in the experiments – both are significantly delayed – appears to be completely reasonable on the basis of our model because the revision processes depend only on the (logical) incompatibility between contextual knowledge and new information, not on the polarity of the response.

3 We are grateful to an anonymous reviewer who pointed out the correct phrasing here to us.
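Assuming the same simple counting model for R_p as in the sketch above, the two observations of this paragraph can be checked directly: ūt is exactly as implausible as d in κ_p ∗ p, and revising by ūt leaves the agent undecided about p. Again, all names below are our own illustrative helpers.

```python
from itertools import product

# Rank in kappa_p = number of falsified conditionals of R_p = {(u|p), (d|l), (t|u), (not-t|d)}.
def kappa_p(p, l, u, t, d):
    return int(p and not u) + int(l and not d) + int(u and not t) + int(d and t)

worlds = list(product([True, False], repeat=5))      # assignments to (p, l, u, t, d)

# kappa_p * p as in (7): p-worlds keep their rank, non-p-worlds are shifted upwards by 1.
def after_p(p, l, u, t, d):
    return kappa_p(p, l, u, t, d) + (0 if p else 1)

rank = lambda prop: min(after_p(*w) for w in worlds if prop(*w))

# ut is plausible; the alternative "tradesman but not plumber" is exactly as implausible as d.
assert rank(lambda p, l, u, t, d: u and t) == 0
assert rank(lambda p, l, u, t, d: not u and t) == 1 == rank(lambda p, l, u, t, d: d)

# Among the most plausible (not-u and t)-worlds there is one with p and one without,
# so revising by it would leave the test person uncertain whether p still holds.
best = [w for w in worlds if not w[2] and w[3] and after_p(*w) == 1]
assert {w[0] for w in best} == {True, False}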

4.3. Formal OCF-model for Experiment 2

Here, we base our considerations upon the examples from Table 4 and elaborate our formal model for the cases of high but invalid knowledge and of low but valid knowledge. In principle, the other cases are also covered by our exemplary modelings by simply negating the consequents (i.e., sunburn is not healthy, and woolly rhinos are not extinct, respectively).

Sunburn example. Since this example is a high-knowledge example, modeling background knowledge can be done in obvious ways. Different knowledge bases are possible here, starting from the most straightforward base that contains just the conditional sunburn is not healthy, to very sophisticated knowledge bases which take the chemical processes that lead to cell damage and possible mutations into account. A realistic commonsense knowledge base that is not trivial might consider the following variables:

s   (having a) sunburn
b   (having a) burn
i   (having an) injury
h   feeling healthy

and might contain the following conditionals (which focus on the immediate consequences of a sunburn, not taking problems in the far future into account):

Background knowledge R_s:
(b|s)   A sunburn is a burn.
(i|b)   A burn causes an injury.
(h̄|i)   An injury makes one feel not healthy.

Using the background knowledge R_s = {(b|s), (i|b), (h̄|i)}, the connection between sunburn and (not) healthy can be established only by transitively chaining rules, which is not a trivial task for nonmonotonic reasoning formalisms and is invalid in general. A (minimal) c-representation κ_s of R_s can be determined by the use of (3) and (4) in a straightforward way (since there are no conflicts between the conditionals, all κ_i⁻ can be chosen to be 1); κ_s is shown in Table 6. Checking whether the agent with background knowledge R_s believes that sunburn is healthy comes down to comparing κ_s(sh) and κ_s(sh̄): we find κ_s(sh) = 1 while κ_s(sh̄) = 0, so κ_s(sh̄) < κ_s(sh) and hence κ_s ⊭ (h|s), i.e., the agent does not believe that sunburn is healthy; more precisely, she believes that sunburn is not healthy: κ_s |= (h̄|s). Applying revision would yield the same result: since κ_s(s) = κ_s(s̄) = 0, (7) gives us

κ_s ∗ s(ω) = κ_s(ω)        if ω |= s,
κ_s ∗ s(ω) = κ_s(ω) + 1    if ω |= s̄.


Bel(κ_s ∗ s) reveals what the agent believes in this posterior state; only worlds ω with κ_s ∗ s(ω) = 0 are relevant for this (cf. Section 2). Since the models of s are left unchanged under revision, but all models of s̄ have a degree of at least 1 in κ_s ∗ s, only the left column of Table 6 is relevant, and here we find Bel(κ_s ∗ s) = Cn(sbih̄), according to (2). That is, after the revision by s, the agent believes h̄. This also illustrates the Ramsey test for this example: κ_s |= (h̄|s) iff Bel(κ_s ∗ s) |= h̄.
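The sunburn model can be checked in the same spirit. The short sketch below (helper names ours, unit impacts as stated above) reproduces sample ranks of Table 6, the acceptance of (h̄|s), and the Ramsey-test equivalence just described.

```python
from itertools import product

# kappa_s(w) = number of conditionals of R_s falsified by w, with unit impacts:
# (b|s) is falsified by s-and-not-b, (i|b) by b-and-not-i, (not-h|i) by i-and-h.
def kappa_s(s, b, i, h):
    return int(s and not b) + int(b and not i) + int(i and h)

worlds = list(product([True, False], repeat=4))      # assignments to (s, b, i, h)

# Spot checks against Table 6: the only rank-2 world is (s, not-b, i, h).
assert kappa_s(True, False, True, True) == 2
assert kappa_s(True, True, True, False) == 0

# kappa_s accepts (not-h | s): the best s-and-not-h world lies strictly below the best s-and-h world.
assert min(kappa_s(*w) for w in worlds if w[0] and not w[3]) == 0
assert min(kappa_s(*w) for w in worlds if w[0] and w[3]) == 1

# Revision by s according to (7): s-worlds keep their rank, non-s-worlds move up by 1.
def kappa_s_rev(s, b, i, h):
    return kappa_s(s, b, i, h) + (0 if s else 1)

# Ramsey test: the single most plausible world after revising by s is s, b, i, not-h,
# so Bel(kappa_s * s) |= not-h, in line with kappa_s |= (not-h | s).
assert [w for w in worlds if kappa_s_rev(*w) == 0] == [(True, True, True, False)]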


Woolly rhinos example. This example is a low-knowledge example, i.e., the test persons are generally assumed to have not much knowledge about woolly rhinos; maybe they have never heard about woolly rhinos and find rhinos with a woolly fur hard to imagine. Plausibly they are rhinos, and rhinos live in Africa and are not extinct, so maybe woolly rhinos also live in Africa and are not extinct; but they might also be an exceptional subclass of rhinos that are extinct and did not live in Africa, or maybe they are not even rhinos but just look similar to rhinos. So, in this case, we are facing the problem of how to model lacking knowledge in a reasonable way. Since different test persons may approach this task in different ways, we propose different knowledge bases that all appear to be suitable prima facie to model a test person's knowledge: three knowledge bases for low (or plain) knowledge, and one knowledge base for high knowledge (because there might be people among the test persons who have heard about woolly rhinos, but the assumption is that these people will not have a significant impact on the results of the experiment). Altogether, we consider the following variables and conditionals from which we set up our background knowledge bases:

Variables:
w   (being a) woolly rhino
r   (being a) rhino
a   living in Africa
e   (being) extinct
h   (being a) species the agent has heard of, which sounds familiar and/or imaginable

Conditionals for background knowledge:
r1 = (r|w)    Woolly rhinos are rhinos.
r2 = (a|r)    Rhinos live in Africa.
r3 = (ē|r)    Rhinos are not extinct.
r4 = (h̄|w)    One has never heard about woolly rhinos.
r5 = (e|h̄)    Species that one has never heard about are extinct.
r6 = (e|w)    Woolly rhinos are extinct.

Our first three knowledge bases represent low knowledge: R_w^1 = {r1, r2, r3} (the test person believes that woolly rhinos are rhinos and has some knowledge about rhinos), R_w^2 = {r2, r3} (the test person has no specific knowledge about woolly rhinos at all), R_w^3 = {r1, r2, r3, r4, r5} (the test person believes that woolly rhinos are rhinos but has never heard about them; she has some knowledge about rhinos and about species she has never heard about). The fourth knowledge base represents high knowledge about woolly rhinos: R_w^4 = {r1, r2, r3, r6}. We will not go into the details of calculating minimal c-representations κ_w^1, ..., κ_w^4 for all knowledge bases (all κ_i⁻ are 1 except for κ_1⁻ = 2 = κ_4⁻ for κ_w^3 resp. κ_1⁻ = 2 = κ_6⁻ for κ_w^4) but just list the falsified conditionals and the rankings for each world in Table 7. Moreover, to exemplify conditional revision, we will also show κ_w^∗ = κ_w^1 ∗ (e|w) with

κ_w^∗(ω) = κ_w^1(ω) + 2    if ω |= wē,
κ_w^∗(ω) = κ_w^1(ω)        if ω |= w̄ ∨ e

in Table 7, according to (8). For the low knowledge bases, we calculate the following ranks that are relevant for checking whether the agent believes that woolly rhinos are extinct:

κ_w^1(ēw) = 0 < 1 = κ_w^1(ew)
κ_w^2(ēw) = 0 = 0 = κ_w^2(ew)
κ_w^3(ēw) = 1 = 1 = κ_w^3(ew)

Therefore, with background knowledge R_w^1, the agent believes that woolly rhinos are not extinct, i.e., the agent deems the test sentence to be false, while with R_w^2 and R_w^3 she is undecided about the truth of the test sentence. In all three cases, adopting the test sentence would make belief revision necessary; the revision κ_w^∗ = κ_w^1 ∗ (e|w) is shown to exemplify conditional revision. However, the other two revisions are less drastic because beliefs are just extended and not overridden, due to κ_w^i(ēw) = κ_w^i(ew) for i = 2, 3. This means that the agent's disposition is more neutral, and the revision processes might also be faster. For the high knowledge base R_w^4, the agent clearly believes that woolly rhinos are extinct because this belief is already part of the knowledge base. Here, no belief revision would be necessary, but such informed test persons are usually rare. So, across individuals, all beliefs concerning woolly rhinos and extinct are possible and can be rationally justified. Assuming that most people would tend to adopt background knowledge bases similar to R_w^2 and R_w^3, one might conclude that doubtful indecisiveness and irritation prevail, and that belief revision processes would be necessary.
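For the woolly rhino case, the same counting idea extends to weighted impacts and to conditional revision. The following sketch (again with our own helper names) recomputes the ranks behind κ_w^1, κ_w^2 and κ_w^3, checks the three comparisons listed above, performs the conditional revision κ_w^∗ = κ_w^1 ∗ (e|w) via the two-case shift of (8), and also verifies the inheritance of the plausible habitat (w |∼ a) that is discussed after Table 7.

```python
from itertools import product

worlds = [dict(zip("wraeh", vals)) for vals in product([True, False], repeat=5)]

def kappa(conditionals):
    """Rank of a world = sum of the impacts kappa_i^- of the conditionals it falsifies."""
    def rank(w):
        return sum(k for ante, cons, k in conditionals if ante(w) and not cons(w))
    return [(w, rank(w)) for w in worlds]

def best(kap, prop):
    """Rank of a proposition: the minimal rank among its models."""
    return min(r for w, r in kap if prop(w))

# Conditionals r1..r5 (r6 = (e|w) is only needed for kappa_w^4, which we skip here).
r1 = (lambda w: w["w"], lambda w: w["r"])          # woolly rhinos are rhinos
r2 = (lambda w: w["r"], lambda w: w["a"])          # rhinos live in Africa
r3 = (lambda w: w["r"], lambda w: not w["e"])      # rhinos are not extinct
r4 = (lambda w: w["w"], lambda w: not w["h"])      # one has never heard about woolly rhinos
r5 = (lambda w: not w["h"], lambda w: w["e"])      # unheard-of species are extinct

k1 = kappa([(*r1, 1), (*r2, 1), (*r3, 1)])                       # R_w^1, all impacts 1
k2 = kappa([(*r2, 1), (*r3, 1)])                                 # R_w^2
k3 = kappa([(*r1, 2), (*r2, 1), (*r3, 1), (*r4, 2), (*r5, 1)])   # R_w^3, kappa_1^- = 2 = kappa_4^-

we      = lambda w: w["w"] and w["e"]
w_not_e = lambda w: w["w"] and not w["e"]

assert (best(k1, w_not_e), best(k1, we)) == (0, 1)   # R_w^1: "not extinct" is believed
assert best(k2, w_not_e) == best(k2, we) == 0        # R_w^2: undecided
assert best(k3, w_not_e) == best(k3, we) == 1        # R_w^3: undecided

# Conditional revision kappa_w^* = kappa_w^1 * (e|w) as in (8):
# worlds violating (e|w), i.e. w and not e, are shifted upwards by 2.
k_star = [(w, r + (2 if w_not_e(w) else 0)) for w, r in k1]
assert best(k_star, we) < best(k_star, w_not_e)      # (e|w) is accepted after the revision

# No drowning: woolly rhinos still inherit the plausible habitat of rhinos, w |~ a.
assert best(k1, lambda w: w["w"] and w["a"]) < best(k1, lambda w: w["w"] and not w["a"])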


Table 7
Epistemic states κ_w^1, κ_w^∗ = κ_w^1 ∗ (e|w), κ_w^2, κ_w^3, κ_w^4 for the woolly rhino example from Table 4.

ω         Falsified r_i      κ_w^1(ω)   κ_w^∗(ω)   κ_w^2(ω)   κ_w^3(ω)   κ_w^4(ω)
wraeh     r3, r4             1          1          1          3          1
wraeh̄     r3                 1          1          1          1          1
wraēh     r4, r6             0          2          0          2          2
wraēh̄     r5, r6             0          2          0          1          2
wrāeh     r2, r3, r4         2          2          2          4          2
wrāeh̄     r2, r3             2          2          2          2          2
wrāēh     r2, r4, r6         1          3          1          3          3
wrāēh̄     r2, r5, r6         1          3          1          2          3
wr̄aeh     r1, r4             1          1          0          4          2
wr̄aeh̄     r1                 1          1          0          2          2
wr̄aēh     r1, r4, r6         1          3          0          4          4
wr̄aēh̄     r1, r5, r6         1          3          0          3          4
wr̄āeh     r1, r4             1          1          0          4          2
wr̄āeh̄     r1                 1          1          0          2          2
wr̄āēh     r1, r4, r6         1          3          0          4          4
wr̄āēh̄     r1, r5, r6         1          3          0          3          4
w̄raeh     r3                 1          1          1          1          1
w̄raeh̄     r3                 1          1          1          1          1
w̄raēh     –                  0          0          0          0          0
w̄raēh̄     r5                 0          0          0          1          0
w̄rāeh     r2, r3             2          2          2          2          2
w̄rāeh̄     r2, r3             2          2          2          2          2
w̄rāēh     r2                 1          1          1          1          1
w̄rāēh̄     r2, r5             1          1          1          2          1
w̄r̄aeh     –                  0          0          0          0          0
w̄r̄aeh̄     –                  0          0          0          0          0
w̄r̄aēh     –                  0          0          0          0          0
w̄r̄aēh̄     r5                 0          0          0          1          0
w̄r̄āeh     –                  0          0          0          0          0
w̄r̄āeh̄     –                  0          0          0          0          0
w̄r̄āēh     –                  0          0          0          0          0
w̄r̄āēh̄     r5                 0          0          0          1          0


When looking more closely into the computations of the c-representations, one finds that the conditional r2 = (a|r) has no effect at all on the other κ_i⁻ and is not relevant for the query whether woolly rhinos are extinct or not. And indeed, there is no reason to assume that extinction and living (or having been living) in Africa are related. But note that all OCFs which are based on knowledge bases that consider woolly rhinos to be plausibly a subclass of rhinos (i.e., all OCFs except for κ_w^2) yield the inference w |∼ a – woolly rhinos plausibly live (or lived) in Africa. Although woolly rhinos might be an exceptional subclass of rhinos (with respect to extinction), they inherit their plausible habitat from the class of rhinos. This illustrates that c-representations do not suffer from the drowning problem (for further information, cf. [21]).

5. Distinguishing between different types of knowledge

The experiments in [12,13] make use of a variety of different types of knowledge, e.g., declarative knowledge from experience, causal knowledge, and normative knowledge. In this section, we introduce categories of knowledge to which the experimental items can be assigned and reanalyze the experimental findings in the experiments described above to investigate whether significant differences can be found when dealing with different types of knowledge. For classifying the commonsense knowledge in the tasks of experiments 1 and 2, the following categories of knowledge are useful:

E   General, plausible knowledge from experience
O   Ontological and categorical knowledge
C   Causal and abductive/explanatory knowledge
P   Knowledge about preferences
N   Normative knowledge, orders, and prohibitions

We set up these categories to be disjoint for our investigations, but it is clear that it is often not so easy to assign sentences from natural language to exactly one category.


Table 8
Number of items contained in each knowledge category for experiments 1 and 2.

Category    Experiment 1    Experiment 2
E           9               24
O           17              64
C           27              3
P           11              0
N           0               5


Different contexts may suggest different understandings of sentences, and therefore different categories appear to be suitable. For instance, the item "Selma loves animals. Therefore, she has a cat." might be assigned to category P (because the verb "loves" appears in it), or to category O (because the main focus might be on judging whether the crucial last word belongs to the mentioned class). However, we assigned it to category C because of the word "Therefore", which strongly hints at a causal relationship. Moreover, in a broad sense, all categories contain knowledge that belongs in one way or another to E, so E was realized here as the residual class including all types of plausible knowledge for which there was no good reason to assign them to one of the other classes.

Table 8 contains the numbers of items contained in each of the knowledge categories for both experiments. Note that the items in the experiments were not designed for distinguishing between different types of knowledge, so there was no systematic covering of all type categories. Therefore, the results of this analysis can only be seen as giving rise to cautious hypotheses on the ease and speed with which humans process knowledge of the different types, and on how different types of knowledge interfere with action performance. Due to the different designs of the respective experimental items, big differences in the size of the categories can be noticed for the two experiments. Whereas items in Experiment 1 are more comprehensive (consisting of two sentences), the items in Experiment 2 are brief "Is-a"-sentences or SPO⁴-sentences. Therefore, Experiment 1 allows for the investigation of a wider variety of knowledge types, while Experiment 2 tends to focus on knowledge along ontological lines. Moreover, only the high-knowledge items of Experiment 2 are considered in this analysis, since it is only with these items that the different types of knowledge we are interested in can be assumed to influence processing. This results in the exclusion of category C from the analysis of Experiment 2 because all three items in this category involve low knowledge.

After classifying the items into the different knowledge categories, we inspected the numerical patterns of results for the different knowledge categories in order to generate hypotheses regarding potential differences in the processing of violations of the different knowledge types (cf. Figs. 6 and 7). What can clearly be seen is that the tasks in the three experiments 1A, 1B, and 2 differ in the average response latencies. The orthographical tasks in Experiment 1A produced the longest response latencies and highest error rates, whereas the color tasks in Experiment 1B produced the shortest response latencies and lowest error rates, with the probe tasks of Experiment 2 in between them. This can be taken as an indicator of the level of difficulty of each task, which seems to have been highest for the orthographical tasks and lowest for the color tasks. It also means that the different tasks capture effects at different time points after the presentation of the stimuli, and thus may reflect the time course at which the different kinds of knowledge affect language processing.

Although the exploratory nature of the analyses and the unbalanced number of items in each category preclude a conclusive interpretation, it is interesting to note that there is considerably more evidence for an (in)congruence effect for the O-items in Experiments 1B and 2 (and thus, in the two easier tasks) than in Experiment 1A (as the two lines are almost parallel here). This could indicate that the evaluation of plausibility or validity based on ontological knowledge might be completed more quickly than evaluations based on other kinds of knowledge, thus affecting faster but not slower responses. Moreover, an inspection of the error rates of Experiment 2 reveals somewhat different data patterns for the different knowledge types: whereas the pattern is almost perfectly symmetrical for ontological knowledge, meaning that participants made almost equal proportions of errors when having to respond positively after a false or negatively after a true statement, it is more skewed for E and N knowledge: participants made particularly many errors when having to respond negatively after a true E-statement or positively after a false N-statement. This could indicate that violations of the different types of knowledge are processed differently, or that they are at least perceived as more or less strong. Again, it must be emphasized that these hypotheses are based on post-hoc inspections of data patterns rather than on empirically validated conclusions. However, these analyses can be taken as an indication that it may not only be important to distinguish between different kinds of knowledge (violations) when investigating effects of plausibility and validity during language comprehension, but that it might in fact be a promising endeavor to systematically investigate potential differences between these different kinds of knowledge in language comprehension and reasoning tasks, as it might provide valuable insights into how people evaluate truth or plausibility based on different kinds of knowledge and to what extent this influences language comprehension.


4 SPO = Subject-Predicate-Object.



Fig. 6. Mean correct response latencies for Experiment 1A (left) and 1B (right), evaluated according to knowledge categories.



Fig. 7. Mean correct response latencies (left) and mean error rates (right) for Experiment 2, high knowledge, evaluated according to knowledge categories.


6. Conclusion and future work


This paper presents first steps towards establishing closer connections between the fields of knowledge representation in artificial intelligence and language comprehension in psychology, elaborating on the topic of plausible reasoning, which plays quite different parts in both areas. While plausible reasoning is in the focus of knowledge representation, it is only evaluated implicitly in language comprehension when investigating what influence plausibility has on the processes of understanding language. However, situation models and commonsense knowledge about the world are crucial issues when investigating such processes in psychology and suggest suitable mental counterparts for notions like epistemic state and plausible reasoning in knowledge representation. Therefore, the aim of this paper is to show how formal models of epistemic states and reasoning processes can be used to simulate human lines of thought in psychological experiments and can be helpful to explain findings in such experiments and to generate hypotheses that can be empirically tested.

We focus on epistemic Stroop effects that have been described in [12,13] to show the impact of perceived (im)plausibility through reading on the actions of humans. Although plausible reasoning is not evaluated directly in those experiments, the results of [12,13] provide impressive evidence for the influence of background knowledge about the world on the actions


of humans, and hence for the relevance of epistemic states for the modeling of agents. This paper is meant to fill in details on epistemic processes from a knowledge representation perspective by setting up a formal model of plausible reasoning and belief revision that helps explain findings in psychological experiments on language comprehension. As beliefs about the world are particularly important for language comprehension, we spent particular effort on modeling such background beliefs explicitly to make reasoning more transparent. Therefore, we simulate the test persons' epistemic processes when reading the texts given in the experiments by first reasoning inductively from background knowledge bases, then building up a situation model (contextual epistemic model) from this, and afterwards evaluating and incorporating new information with respect to this epistemic context. We also consider possible formal epistemic models for those tasks in the experiments of [13] that deal with low knowledge to show how even in those cases agents can come to rational conclusions which, however, might be indecisive. We argue that the observed delays in responding to the given tasks might at least partially be caused by belief change processes that are necessary to overcome incompatibilities between plausible context knowledge and obtained information. We also point out the relevance of these psychological findings for agent models, since epistemic Stroop effects show that even if belief revision can resolve conflicts between prior and new information, the conflict itself might lead to interference effects in human acting.

In our post-hoc analyses of the experiments that distinguish between different kinds of knowledge, such ideally crosswise patterns show up in particular for ontological knowledge. But of course, further and more systematic investigations will have to be made to clarify whether the type of knowledge or belief has an impact on how it is processed, or on how people react to belief conflicts. On the formal side, the reasoning model can be fine-tuned and optimized to allow for distinguishing between these different reasoning modes by integrating specific methods for processing, e.g., causal and normative knowledge. A research question that would be interesting for both disciplines is whether general differences between using explicit knowledge in the knowledge base and implicit knowledge derived by some epistemic processes can be observed. Moreover, our formal model that is based on degrees of plausibility might also be relevant to provide a logical environment for explaining results from psychological experiments concerning uncertain reasoning and belief revision, as in [19].⁵

Furthermore, we point out that results from language comprehension may shed some light on the question of how humans extract from new information the relevant and consistent part that they are willing to incorporate into their prior beliefs. This issue has been discussed intensely in the belief revision community and gave rise to approaches like non-prioritized revision (cf., e.g., [38]) or selective revision [39]. Indeed, the success postulate of AGM belief revision [3] is meant to presuppose some preprocessing of information that evaluates which part of the new information (if any) should be taken as input for the revision process. The insights from psychology described in this paper might be relevant for future research on this topic.
Our formal model to simulate human plausible reasoning can serve as a very helpful bridge between the AI discipline of knowledge representation and psychological research on language comprehension and general human reasoning. Crucially, our model is neither restricted to classical logics nor does it make use of probabilities but is located in the wide area of qualitative default logics between these two extremes. Therefore, we also consider our work as a proof of concept that suitable normative models for psychological experiments can be built from axiomatic or inductive approaches to plausible reasoning. The framework of c-representations and c-revisions that we chose for modeling in this paper is an instance of such approaches but of course, other approaches like possibility theory [40], Dempster–Shafer theory [41], or probabilistic networks [42,43] exist. Once we have results from modified experiments which allow us to find out more about actual human inferences in the considered tasks, it will be interesting to compare different formalisms with respect to their cognitive adequacy, e.g., in the style of [5].

5 We are grateful to an anonymous referee for pointing out this paper to us.

Acknowledgement

We are very thankful for the valuable comments of the anonymous reviewers that helped us improve the paper.

References

[1] R. Reiter, A logic for default reasoning, Artif. Intell. 13 (1980) 81–132. [2] S. Kraus, D. Lehmann, M. Magidor, Nonmonotonic reasoning, preferential models and cumulative logics, Artif. Intell. 44 (1990) 167–207. [3] C. Alchourrón, P. Gärdenfors, D. Makinson, On the logic of theory change: partial meet contraction and revision functions, J. Symb. Log. 50 (2) (1985) 510–530. [4] R. Neves, J. Bonnefon, E. Raufaste, An empirical test of patterns for nonmonotonic inference, Ann. Math. Artif. Intell. 34 (1–3) (2002) 107–130. [5] M. Ragni, C. Eichhorn, G. Kern-Isberner, Simulating human inferences in the light of new information: a formal analysis, in: S. Kambhampati (Ed.), Proceedings of the 25th International Joint Conference on Artificial Intelligence, IJCAI 2016, AAAI Press, Palo Alto, CA, USA, 2016, pp. 2604–2610. [6] A.S. Rao, M.P. Georgeff, BDI-Agents: from theory to practice, in: Proc. of the First Int. Conference on Multiagent Systems, San Francisco, 1995. [7] J.R. Stroop, Studies of interference in serial verbal reactions, J. Exp. Psychol. (1935) 643–662. [8] P.N. Johnson-Laird, Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness, Harvard University Press, Cambridge, 1983. [9] T.A. van Dijk, W. Kintsch, Strategies of Discourse Comprehension, Academic Press, New York, 1983. [10] R.A. Zwaan, G.A. Radvansky, Situation models in language comprehension and memory, Psychol. Bull. (1998) 162–185.




[11] T. Richter, S. Schroeder, B. Wöhrmann, You don’t have to believe everything you read: background knowledge permits fast and efficient validation of information, J. Pers. Soc. Psychol. (2009) 538–558. [12] M.-B. Isberner, T. Richter, Can readers ignore plausibility? Evidence for nonstrategic monitoring of event-based plausibility in language comprehension, Acta Psychol. 142 (2013) 15–22. [13] M.-B. Isberner, T. Richter, Does validation during language comprehension depend on an evaluative mindset?, Discourse Process. (2014) 7–25. [14] W. Spohn, Ordinal conditional functions: a dynamic theory of epistemic states, in: W. Harper, B. Skyrms (Eds.), Causation in Decision, Belief Change, and Statistics, vol. II, Kluwer Academic Publishers, 1988, pp. 105–134. [15] W. Spohn, The Laws of Belief: Ranking Theory and Its Philosophical Applications, Oxford University Press, 2012. [16] G. Kern-Isberner, Conditionals in Nonmonotonic Reasoning and Belief Revision, Lecture Notes in Artificial Intelligence, vol. 2087, Springer, 2001. [17] G. Kern-Isberner, A thorough axiomatization of a principle of conditional preservation in belief revision, Ann. Math. Artif. Intell. 40 (1–2) (2004) 127–164. [18] A. Darwiche, J. Pearl, On the logic of iterated belief revision, Artif. Intell. 89 (1997) 1–29. [19] G. Politzer, L. Carles, Belief revision and uncertain revision, Think. Reasoning 7 (3) (2001) 217–234. [20] M.-B. Isberner, G. Kern-Isberner, A formal model of plausibility monitoring in language comprehension, in: Z. Markov, I. Russell (Eds.), Proceedings of the Twenty-Ninth International Florida Artificial Intelligence Research Society Conference, FLAIRS-29, AAAI Press, Palo Alto, Ca, 2016, pp. 662–667. [21] M. Goldszmidt, J. Pearl, Qualitative probabilities for default reasoning, belief revision, and causal modeling, Artif. Intell. 84 (1996) 57–112. [22] E. Fermé, S. Hansson, AGM 25 years – twenty-five years of research in belief change, J. Philos. Log. 40 (2011) 295–331. [23] G. Kern-Isberner, D. Huvermann, Multiple iterated belief revision without independence, in: I. Russell, W. Eberle (Eds.), Proceedings of the TwentyEighth International Florida Artificial Intelligence Research Society Conference, FLAIRS-28, AAAI Press, Palo Alto, Ca, 2015, pp. 570–575. [24] F. Ramsey, General propositions and causality, in: R. Braithwaite (Ed.), Foundations of Mathematics and Other Logical Essays, Routledge and Kegan Paul, New York, 1950, pp. 237–257. [25] L. Connell, M.T. Keane, A model of plausibility, Cogn. Sci. (2006) 95–120. [26] D.T. Gilbert, D.S. Krull, P.S. Malone, Unbelieving the unbelievable: some problems in the rejection of false information, J. Pers. Soc. Psychol. (1990) 601–613. [27] D.T. Gilbert, R.W. Tafarodi, P.S. Malone, You can’t not believe everything you read, J. Pers. Soc. Psychol. (1993) 221–233. [28] P. Hagoort, L. Hald, M. Bastiaansen, K. Petersson, Integration of word meaning and world knowledge in language comprehension, Science (2004) 438–441. [29] I. Fischler, D. Childers, T. Achariyapaopan, N. Perry, Brain potentials during sentence verification – automatic aspects of comprehension, Biol. Psychol. (1985) 83–105. [30] J.J.A. van Berkum, B. Holleman, M.S. Nieuwland, M. Otten, J. Murre, Right or wrong? The brain’s fast response to morally objectionable statements, Psychol. Sci. (2009) 1092–1099. [31] J.J.A. van Berkum, C.M. Brown, P. Zwitserlood, V. Kooijman, P. Hagoort, Anticipating upcoming words in discourse: evidence from ERPs and reading times, J. Exp. Psychol. Learn. Mem. Cogn. 
(2005) 443–467. [32] G.A. Radvansky, D.E. Copeland, Reasoning, integration, inference alteration, and text comprehension, Can. J. Exp. Psychol./Revue canadienne de psychologie expérimentale 2 (2004) 133–141. [33] A.C. Graesser, M. Singer, T. Trabasso, Constructing inferences during narrative text comprehension, Psychol. Rev. 3 (1994) 371–395. [34] G. McKoon, R. Ratcliff, Inference during reading, Psychol. Rev. (1992) 440–466. [35] R.A. Zwaan, M.C. Langston, A.C. Graesser, The construction of situation models in narrative comprehension: an event-indexing model, Psychol. Sci. 6 (1995) 292–297. [36] D. Wiswede, N. Koranyi, F. Müller, O. Langner, K. Rothermund, Validating the truth of propositions: behavioral and ERP indicators of truth evaluation processes, Soc. Cogn. Affect. Neurosci. (2013) 647–653. [37] H.P. Grice, Logic and conversation, in: P. Cole, J.L. Morgan (Eds.), Syntax and Semantics, vol. 3: Speech Acts, Academic Press, New York, 1975, pp. 41–58. [38] S. Hansson, A survey of non-prioritized belief revision, Erkenntnis 50 (2–3) (1999) 413–427. [39] E. Fermé, S. Hansson, Selective revision, Stud. Log. 63 (1998) 331–342. [40] D. Dubois, J. Lang, H. Prade, Possibilistic logic, in: D. Gabbay, C. Hogger, J. Robinson (Eds.), Handbook of Logic in Artificial Intelligence and Logic Programming, vol. 3, Oxford University Press, 1994. [41] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, 1976. [42] J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, San Mateo, Ca, 1988. [43] B.M. Rottman, R. Hastie, Reasoning about causal relationships: inferences on causal networks, Psychol. Bull. 140 (1) (2014) 109–139.
