Multiple neural representations of elementary logical connectives


NeuroImage 135 (2016) 300–310


Giosuè Baggio a,b, Paolo Cherubini c,d, Doris Pischedda c,d,e,g, Anna Blumenthal b,f, John-Dylan Haynes e,g,h,i,j, Carlo Reverberi c,d,⁎

a Language Acquisition and Language Processing Lab, Department of Language and Literature, Norwegian University of Science and Technology, 7491 Trondheim, Norway
b SISSA International School for Advanced Studies, 34136 Trieste, Italy
c Department of Psychology, University of Milano-Bicocca, 20126 Milan, Italy
d NeuroMi - Milan Center for Neuroscience, 20126 Milan, Italy
e Bernstein Center for Computational Neuroscience Berlin, Charité-Universitätsmedizin, 10115 Berlin, Germany
f The Brain and Mind Institute, Western University, N6A 5B7 London, Canada
g Berlin Center for Advanced Neuroimaging, Charité-Universitätsmedizin, 10119 Berlin, Germany
h Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10117 Berlin, Germany
i Excellence Cluster NeuroCure, Charité-Universitätsmedizin, 10117 Berlin, Germany
j Department of Psychology, Humboldt Universität zu Berlin, 12489 Berlin, Germany

Article info

Article history:
Received 3 September 2015
Revised 20 March 2016
Accepted 26 April 2016
Available online 29 April 2016

Keywords:
Logic
Semantics
Reasoning
Broca's area
Prefrontal cortex
Multivariate pattern analysis
Neural code
fMRI

Abstract

A defining trait of human cognition is the capacity to form compounds out of simple thoughts. This ability relies on the logical connectives AND, OR and IF. Simple propositions, e.g., ‘There is a fork’ and ‘There is a knife’, can be combined in alternative ways using logical connectives: e.g., ‘There is a fork AND there is a knife’, ‘There is a fork OR there is a knife’, ‘IF there is a fork, there is a knife’. How does the brain represent compounds based on different logical connectives, and how are compounds evaluated in relation to new facts? In the present study, participants had to maintain and evaluate conjunctive (AND), disjunctive (OR) or conditional (IF) compounds while undergoing functional MRI. Our results suggest that, during maintenance, the left posterior inferior frontal gyrus (pIFG, BA44, or Broca's area) represents the surface form of compounds. During evaluation, the left pIFG switches to processing the full logical meaning of compounds, and two additional areas are recruited: the left anterior inferior frontal gyrus (aIFG, BA47) and the left intraparietal sulcus (IPS, BA40). The aIFG shows a pattern of activation similar to pIFG, and compatible with processing the full logical meaning of compounds, whereas activations in IPS differ with alternative interpretations of conditionals: logical vs conjunctive. These results uncover the functions of a basic cortical network underlying human compositional thought, and provide a shared neural foundation for the cognitive science of language and reasoning.

© 2016 Elsevier Inc. All rights reserved.

⁎ Corresponding author at: Department of Psychology, University of Milano-Bicocca, Piazza dell'Ateneo Nuovo 1, 20126 Milan, Italy. E-mail address: [email protected] (C. Reverberi).

http://dx.doi.org/10.1016/j.neuroimage.2016.04.061
1053-8119/© 2016 Elsevier Inc. All rights reserved.

Introduction

The ability to combine known information in novel ways is a hallmark of human cognition. This capacity relies on a finite repertoire of operators, among which are the logical connectives AND (conjunction), OR (disjunction) and IF (conditional). These elementary logical operators allow us to construct different combinations of the same two propositions (e.g., ‘I'll have a cookie and milk’, ‘I'll have a cookie or milk’, and ‘If I have a cookie, I'll have milk’), and to recursively build larger compounds (e.g., ‘I'll have milk or coffee, and a cookie’). Logical connectives are essential for organizing thought and action in a variety of domains, including planning, reasoning, and language (Anderson and Lebiere, 1998; Piaget, 1957; Stenning and van Lambalgen, 2008). Recently, a number of related issues have been investigated using fMRI, including deductive inference (Bonatti et al., 2015; Goel, 2007; Monti et al., 2007, 2009; Prado et al., 2011; Reverberi et al., 2007, 2009, 2012a), the neural encoding of rules and the compositionality of high-level representations (Baron and Osherson, 2011; Reverberi et al., 2012b, 2012c; Bunge and Wallis, 2007; Woolgar et al., 2011). However, to the best of our knowledge, no study has directly addressed how logical connectives are represented in the human brain. In what format are logical compounds maintained on-line before they are used in cognitive computations? Does that format include their meaning, or is it instead restricted to the compound's surface form? Finally, how is the initial representation modified when meaning is eventually computed and used? Answering these basic questions is essential to solve one of nature's most intriguing puzzles: the neurophysiological bases of human symbolic intelligence.

In logic and linguistics, compounds may be represented in three non-mutually exclusive ways, following the established theoretical distinction between form (phonology and syntax), meaning (semantics) and use (pragmatics) (Carnap, 1948; Morris, 1938). Correspondingly, the brain may encode a logical compound relying on its surface form, initially disregarding its semantics, or it may represent a compound by means of its full logical meaning. Representing the full logical meaning requires representing all models of the compound, that is, all mappings of the compound onto a reference structure (Tarski and Vaught, 1956). For the present purposes, each model corresponds to a single entry in the truth table (Table 1) in which the compound is true. For example, the full logical meaning of the indicative and factual conditional ‘If it is a crow, it is black’ refers to black crows, black non-crows, and non-black non-crows. The conditional is false only if there are non-black crows in the reference structure (Table 1, fifth column). A further possibility is that the brain represents the active compound by means of a simplified interpretation of its full logical meaning, also depending on the (pragmatic) context of use. For example, ‘If it is a crow, it is black’ may be interpreted as being equivalent, in terms of truth tables, to ‘It is a crow and it is black’ (the so-called “conjunctive” interpretation of IF; Politzer, 1981). This reading may be derived by pruning two of the three models that define IF's full logical meaning. Under this interpretation, the conditional is, just like AND, inconsistent with states of affairs involving black non-crows, non-black crows and non-black non-crows (Table 1, sixth column). Importantly, however, the identity of the truth tables for conjunctive-IF and AND does not make the two operators indistinguishable semantically: for conjunctive-IF, but not for AND, the full logical meaning of IF may remain available to participants (i.e., it may still be encoded as a neural representation), and may be recovered should the task or context so require (Johnson-Laird, 2010; Johnson-Laird and Byrne, 2002).
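The two readings of the crow conditional can be made concrete with a short sketch. The Python snippet below is purely illustrative and not part of the study's materials; the names `STATES`, `classical_if`, `conjunctive_if` and `consistent_states` are hypothetical. It lists the states of affairs (models) in which ‘If it is a crow, it is black’ is true under the full logical reading and under the conjunctive reading.

```python
# States of affairs for p = 'it is a crow' and q = 'it is black'.
# Each state maps to the truth values (p, q) of the two propositions.
STATES = {
    "black crow": (True, True),
    "non-black crow": (True, False),
    "black non-crow": (False, True),
    "non-black non-crow": (False, False),
}


def classical_if(p, q):
    """Full logical meaning: false only when p is true and q is false."""
    return (not p) or q


def conjunctive_if(p, q):
    """Simplified conjunctive reading: true only when p and q are both true."""
    return p and q


def consistent_states(reading):
    """Return the names of the states in which the conditional is true."""
    return [name for name, (p, q) in STATES.items() if reading(p, q)]


print("classical IF:  ", consistent_states(classical_if))
print("conjunctive IF:", consistent_states(conjunctive_if))
```

Under these definitions, the conjunctive reading keeps only the model in which both propositions are true, which corresponds to pruning the two models of the classical reading in which p is false.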
In this study, we capitalized on the different neural signatures of these theoretically-motivated representational formats (surface form, full logical meaning and simplified interpretation of IF) to investigate how and where the brain represents the building blocks of human compositional thought: AND, OR and IF. Furthermore, we aimed at exploring how those representations may change in two successive processing phases: a maintenance phase, during which a compound is recalled and held active in working memory, and an evaluation phase, during which participants judge the compatibility of the active compound with a given visual target scene (Fig. 1). Our experimental design included two within-subject factors and one between-subject factor. The within-subject factors were the connective used in each trial (AND, OR or IF) and the task phase. The between-subject factor was the interpretation of IF spontaneously adopted by a participant. Some participants interpreted IF relying upon its full logical meaning, involving three explicit models (Table 1). By contrast, for other participants the meaning of IF was fully captured by only one explicit model; this conjunctive interpretation makes IF equivalent to AND in terms of truth tables. The distinction between full-logical and conjunctive readings of conditionals is well documented in the literature. For example, Politzer (1981) showed that participants (a large sample of N = 242) spontaneously and consistently adopt the classical (from 20% to 75% depending on condition and group) or conjunctive (from 11% to 46%) interpretation of IF. Oberhauer et al. (2007) provided evidence that these two interpretations persist even in tasks where participants are asked to evaluate the probability (instead of the truth or compatibility) of conditionals. Moreover, the existence of the two interpretations is predicted by theories of reasoning. 
In particular, the Mental Models Theory (MMT; Johnson-Laird, 2010; Johnson-Laird and Byrne, 2002) implies that all participants initially interpret conditionals as being true in a single case only, in which both p and q are true (a conjunctive interpretation); some participants, however, will further ‘flesh out’ the initial representation by recovering and making explicit the cases in which p is false and q is true, and both p and q are false (thus accounting for a full logical interpretation). Therefore, there are at least two reasons for using an experimental design in which IF-interpretation is a between-subject variable: first, the aim of a neuroscience of reasoning is to uncover the cortical bases of naturally-occurring interpretations of conditionals, as opposed to interpretations imposed upon participants by a learning protocol; second, our aim here is also to provide (counter-)evidence for the hypothesis, following from MMT, that only some participants represent the full logical meaning of conditionals.

Table 1
Truth table (T: true; F: false) for AND, OR, IF (classical logical interpretation) and IF* (simplified conjunctive interpretation). Each row shows one of the four possible combinations of truth values (T/F) for the propositional variables (p and q) in the compound. In the example discussed in the Introduction, p is ‘it is a crow’ and q is ‘it is black’.

p    q    p AND q    p OR q    IF p THEN q    IF p THEN q*
T    T    T          T         T              T
T    F    F          T         F              F
F    T    F          T         T              F
F    F    F          F         T              F

The three different representational formats (surface form, full logical meaning and simplified interpretation of conditionals) imply both a different representational complexity for each type of logical compound, and a different similarity across compound types (Feldman, 2000). The representational complexity of surface forms is largely matched across types. Linguistically, AND, OR and IF are all monosyllables, and in logical syntax they are all binary operators: they connect exactly two elementary propositions. This stands in contrast with negation (i.e., ‘not-p’) or modals (e.g., ‘it is necessary that p’), which are unary operators. At this stage, no differences related to the interpretation of IF should be found, as these originate from the semantics of the compound, and not from its surface structure.

The representational complexity of the full logical meaning of compounds is instead expected to differ across types. The number of truth table entries that verify a compound (the number of models) may be used as a measure of semantic complexity. Conditionals (classical-IF) and disjunctions (OR) are both semantically more complex than conjunctions (AND) because they are verified by three models rather than one (Table 1). Moreover, classical-IF is more similar to OR than to AND as it shares more models with OR than with AND, i.e., more positive truth-table entries. On the assumption that greater processing demands entail stronger univariate responses, this should yield, in the brain regions that represent or process the full logical meaning of compounds, a higher activation for OR and IF compared to AND. Higher semantic similarity between OR and IF should also be observed in the pattern of similarity of the neural representations associated with each logical connective.
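The complexity and similarity predictions above follow directly from Table 1 and can be checked with a brief sketch. The Python code below is only an illustration under the assumption that semantic complexity is the number of verifying truth-table rows and that similarity is the number of shared verifying rows; the helper names (`model_set`, `complexity`, `shared_models`) are hypothetical and do not come from the study.

```python
from itertools import product

# The four (p, q) rows of Table 1, in the order TT, TF, FT, FF.
ROWS = list(product([True, False], repeat=2))

# Truth functions for the four columns of Table 1.
TRUTH = {
    "AND": lambda p, q: p and q,
    "OR": lambda p, q: p or q,
    "IF": lambda p, q: (not p) or q,  # classical logical interpretation
    "IF*": lambda p, q: p and q,      # simplified conjunctive interpretation
}


def model_set(name):
    """Set of truth-table rows (models) that verify the compound."""
    return {row for row in ROWS if TRUTH[name](*row)}


def complexity(name):
    """Semantic complexity, taken here as the number of models."""
    return len(model_set(name))


def shared_models(a, b):
    """Number of truth-table rows that verify both compounds."""
    return len(model_set(a) & model_set(b))


for name in TRUTH:
    print(name, "models:", complexity(name))
print("IF and OR share", shared_models("IF", "OR"), "models")
print("IF and AND share", shared_models("IF", "AND"), "models")
print("IF* has the same models as AND:", model_set("IF*") == model_set("AND"))
```

Counting shared rows is the simplest reading of ‘shares more models’; other similarity measures over truth tables would be possible and are not implied by the text.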
We assume that two cognitive states that are semantically more similar, relative to a third state, are instantiated in the brain as patterns of activation that are also more similar, relative to a third pattern (Kriegeskorte et al., 2008).

Finally, the brain may instantiate a simplified representation of the full logical meaning of conditionals. A conjunctive interpretation of IF is truth-conditionally equivalent to AND. Such a representation would thus have lower representational complexity than a full logical interpretation of IF. Participants who adopt a simplified conjunctive interpretation of IF should show lower activations in at least one of the relevant regions compared to participants who adopt the classical logical interpretation. In participants adopting a conjunctive interpretation of IF, the neural patterns representing IF should be more similar to AND than to OR given that the two have the same truth tables.

Materials and methods

Participants

Before taking part in the main study, potential participants underwent a screening procedure. All candidates were requested to complete a questionnaire in which they evaluated compounds similar to those used in the main experiment. Candidates were admitted to the fMRI session if: (i) their responses to AND and OR compounds were consistent with their logical meaning and (ii) their responses to IF compounds conformed either to the full logical meaning or to a simplified conjunctive interpretation (Table 1). Overall, 62 people underwent screening; 16 were excluded because they did not adopt one of the possible interpretations of logical connectives, did not meet the safety criteria for fMRI, or could not be scheduled in the time slots available at the lab. Participants who were selected for the fMRI session did not differ from the other screened volunteers in either age (t(60) = −0.60, p = .55) or gender (z = −0.99, p = .32). Additionally, 11 participants took part in the study but were excluded before fMRI data collection: 7 failed the training, 2 quit the experiment because they reported feeling uncomfortable in the scanner, and 2 could not make it to the fMRI session. Therefore, 35 participants took part in the fMRI experiment, and were rewarded with money for their participation. All participants were right-handed native speakers of German, with no neurological or psychiatric disorders, and with normal or corrected-to-normal vision. The study was approved by the ethics committee at the Humboldt University of Berlin. Five participants were discarded, four due to poor behavioral performance in the scanner (see below) and one for technical problems during data acquisition. The remaining 30 participants (15 female, mean age 25.4 years, age range 19–35) were included in the final data analysis.

Fig. 1. Timeline of experimental (left) and baseline (right) trials in the present paradigm. Participants were presented with a graphic sign as a cue to the associated logical compound for 1000 ms. A delay of 6000 ms followed cue presentation. A blank screen with a fixation cross in its center was displayed during the delay. After the delay, a visual target scene was shown. Participants judged whether the target was compatible or incompatible with the compound by pressing the key associated with the chosen option (I: incompatible; C: compatible). In baseline trials, either the letter I or C on the target screen would turn orange, and participants pressed the key associated with the letter that had changed color.

Experimental stimuli

In the main fMRI experiment, participants had to actively maintain one of three possible logical compounds (AND, OR or IF), and then evaluate the compatibility of the compound with a visual target depicting geometrical shapes on a black background.
All three compounds were based on the same two constituent propositions: ‘There is a yellow square’ and ‘There is a green circle’. The two propositions were combined in three alternative ways to form the compounds. That is, we combined the two propositions using a conjunction (‘There is a yellow square AND there is a green circle’), a disjunction (‘There is a yellow square OR there is a green circle’) or a conditional (‘IF there is a yellow square THEN there is a green circle’). The resulting compounds therefore cannot be distinguished based on the elementary propositions involved, but only by the logical connective used to bind them. Each compound, and an additional low-level baseline condition (details below), were associated with two abstract visual cues (Fig. 1). Eight abstract visual cues in total were used; associations between cue pairs and compounds were learned by participants during a training session preceding fMRI scanning (see below). We used visual cues to avoid showing the compounds in written language on the screen. Showing the compounds in written language would create a confound for fMRI analyses: it would make it difficult to disentangle brain responses to the different visual appearance of each compound, from brain responses to the surface form, and from brain responses to the meaning of compounds. The associations between cues and compounds were randomly reshuffled across participants, effectively removing the confounds that would be introduced by a fixed mapping between the compound's visual appearance and its structural and semantic properties. The use of cue pairs associated with each compound provides an extra safeguard in this regard: it makes the mapping between cues and compounds redundant, allowing the decoding of compounds independently of the visual properties of the cues. This approach has been used in previous research employing the same experimental paradigm we adopted here (Reverberi et al., 2012b, 2012c).

During the target phase, participants were presented with a visual scene consisting of two vertically-aligned colored geometrical shapes (Fig. 1). Participants had to evaluate whether the active compound in the current trial was compatible or incompatible with the presented scene. The scene could be one of four different types, obtained by combining either the truth or falsity of the two elementary propositions, representing all four possible entries of the truth table for that compound (Table 1). The visual target always showed a square and a circle, but their colors varied.
The target scene could show either a yellow square and a green circle (both propositions true), a yellow square and a red circle (first true, second false), a blue square and a green circle (first false, second true) or a blue square and a red circle (both false). The relative position of the two geometrical shapes (top vs bottom) was randomized across trials. Two letters were located on the left and right hand sides of the fixation point, indicating which response button corresponded to the compatible (letter ‘C’) or incompatible (letter ‘I’) response. The position of the letters was randomized across trials in order to ensure that participants could not prepare a response prior to the target phase. All experimental materials were in German.

Experimental procedure

Each trial started with the presentation of a visual cue for 1 s. Upon presentation of the cue, participants were expected to retrieve the compound they had learned to associate with that cue (see below for details on training). The cue was immediately followed by a delay, in which a blank screen containing a fixation cross in its center was displayed. During the delay, the participant only had to represent the active compound and to fixate on the cross. The delay lasted for 6 s in experimental trials, and for 1 s in catch trials. Catch trials with a shorter delay (1 s) were introduced to force the participant to immediately represent the compound upon cue presentation, as required by the task instructions. Following the delay, the target screen was shown for 3 s. In both standard and catch trials, the participant had to evaluate whether the target scene was compatible with the logical compound they had recalled and answer as quickly as possible. They responded by pressing the left or the right hand side key on a button box using either the left or the right index finger, respectively, depending on the letter they intended to choose. In low-level baseline trials, the participant had to press the key on the same side as the letter (‘C’ or ‘I’) that turned orange on the screen. The interval between the letter's color change and the onset of the target screen was dynamically adapted during the experiment, such that the average time interval between target onset and response in baseline trials was similar to that in experimental trials. An intertrial interval with variable duration (average 2050 ms) followed the target offset.

During fMRI scanning, participants performed 360 trials, divided into 6 runs. In each run, 60 trials (36 experimental, 12 baseline, 12 catch trials) were administered in random order. The entire experiment lasted approximately 1 h. The experiment was administered using MATLAB and the Cogent2000 toolbox (Functional Imaging Laboratory and Institute of Cognitive Neuroscience, University College London, London, UK).

Training was undertaken within the three days prior to fMRI scanning.
During the first part of the behavioral training (mean duration 45 min, SD = 4.8, range 30–60), participants could learn and practice the 8 cue-compound associations. This first training ended when the participant attained 100% accuracy in 15 consecutive trials. In the second part of the training (mean duration 60 min, SD = 17.7, range 30–90), participants practiced a task similar to the experimental one. The training ended when participants could cope with trials with short delays (1 s) and produced internally consistent answers in all experimental conditions (average consistency in the last 15 trials ≥ 95%). As we were interested in the interpretations of the logical operators that were spontaneously adopted by the participants, in this part of the training we did not give participants any feedback about the correctness of their responses. Participants were only encouraged at the beginning of the session to be consistent in their responses throughout the training (i.e., to always give the same response in the same condition). Participants were never forced to use any particular interpretation of any logical connective.

Image acquisition

Functional imaging data were collected using a 3-T Siemens Trio scanner (Erlangen, Germany), equipped with a 12-channel head coil. In each of the six scanning sessions, we acquired 340 T2*-weighted images in descending order, using gradient-echo echo-planar imaging (EPI) sequences. The volumes were composed of 33 slices (3 mm thick), separated by a gap of 0.75 mm. Imaging parameters were as follows: TR 2000 ms, TE 30 ms, FA 78°, matrix size 64 × 64, and FOV of 192 mm × 192 mm, yielding an in-plane voxel resolution of 3 mm, resulting in a voxel size of 3 mm × 3 mm × 3.75 mm.
At the beginning of the scanning session, a T1-weighted structural dataset was collected, with the following parameters: TR 1900 ms, TE 2.52 ms, FA 9°, matrix size 256 × 256 × 192, FOV of 256 mm × 256 mm × 192 mm, 192 slices (1 mm thick), and a resolution of 1 mm × 1 mm × 1 mm. After the experiment, a magnetic field mapping sequence was also run. Imaging parameters were TR 400 ms, TE 5.19 ms and 7.65 ms, FA 60°, matrix size 64 × 64, FOV of 192 mm × 192 mm, 33 slices (3 mm thick), and resolution 3 mm × 3 mm.
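The functional voxel size and the duration of each run follow from the acquisition parameters reported above, as the short calculation below illustrates. It is a sketch only: the variable names are hypothetical, and reading the 3.75 mm dimension as slice thickness plus inter-slice gap is an assumption made here rather than a statement from the text.

```python
# Functional (EPI) acquisition parameters as reported in the text.
fov_mm = 192.0             # field of view along each in-plane axis
matrix_size = 64           # in-plane matrix size (64 x 64)
slice_thickness_mm = 3.0
slice_gap_mm = 0.75
tr_s = 2.0                 # repetition time (2000 ms)
volumes_per_run = 340
n_runs = 6

# In-plane resolution is the field of view divided by the matrix size.
in_plane_mm = fov_mm / matrix_size

# Assumption: the through-plane voxel dimension is thickness plus gap.
slice_spacing_mm = slice_thickness_mm + slice_gap_mm

voxel_mm = (in_plane_mm, in_plane_mm, slice_spacing_mm)

# Scanning time per run and across all runs.
run_duration_s = volumes_per_run * tr_s
total_scan_min = n_runs * run_duration_s / 60

print("voxel size (mm):", voxel_mm)
print("run duration (s):", run_duration_s)
print("total functional scanning time (min):", total_scan_min)
```

The total functional scanning time obtained this way is in line with the statement that the entire experiment lasted approximately 1 h.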


Behavioral analyses

For every participant, we computed the average consistency and the RTs in each experimental condition. Average consistency was measured as the rate of the most frequent response (either C or I) that each participant gave to each combination of a compound and a target scene. For example, if a participant answers ‘Compatible’ 90% of the time in OR trials (‘There is a yellow square OR there is a green circle’) when a target scene with a yellow square and a green circle is shown, then they are 90% consistent in those trials. One-sample Kolmogorov–Smirnov tests on RTs in each condition (AND, OR, IF) showed no significant deviation from normality (all tests, p > .40).

Neuroimaging analyses

In the fMRI analyses, we first considered within-subject factors (logical compound type, task phase). Next, we explored the between-subject factor (interpretation of IF compounds) by focusing on the brain areas in which we previously detected an effect of logical compound type. This analysis strategy is justified on several grounds. First, two out of the three representational formats examined here (surface form and full logical meaning) are by definition independent of interpretation. Thus, for those, considering the sample as a whole is appropriate to increase the sensitivity of the analysis. Second, the detection of the effects related to interpretation should not be limited by analyzing data from both groups in a between-subjects design. The semantic complexity of the three logical compounds (AND, OR, IF) differs, in one way or another, in both IF-interpretation groups: namely OR > conjunctive-IF ~ AND; OR ~ classical-IF > AND. Therefore, the relevant brain areas should still be detected even when the two interpretation groups are collapsed.
Notwithstanding these a-priori considerations, we also empirically tested whether the two interpretation groups differed in regions that could not be detected by the approach just described: that was not the case (see Supplementary data).

Pre-processing

Functional imaging data were pre-processed and analyzed using SPM12 (Wellcome Trust Centre for Neuroimaging, Institute of Neurology, UCL, London, UK). During pre-processing, volumes were realigned, corrected for geometric distortions using unwrapped field maps, and slice-time corrected. Low-frequency noise was removed using a highpass filter with a cutoff period of 128 s, and an AR model was fit to the residuals to allow for temporal autocorrelations. For the univariate analysis, the data were also normalized to the standard MNI brain and spatially smoothed (FWHM = 6 mm).

Univariate analyses

We used standard univariate statistical analyses to test for differences in mean activation levels for AND, OR and IF compounds during either the maintenance or the evaluation phase. Statistical inferences were based on a random effects approach (Penny et al., 2004) comprising two steps. First, the data were best fitted at every voxel for each participant using a combination of effects of interest. Second, we ran an ANOVA modeling the mean of each effect across participants. For all group-level analyses, statistical thresholds of α = .005¹ at the voxel level, not corrected for multiple comparisons, and α = .05 at the cluster level, family-wise error (FWE) corrected for multiple comparisons, were used (Friston et al., 1996).

¹ Although we used a more liberal p < .005 statistical threshold for cluster definition, we should point out that all reported results are qualitatively the same also at a cluster threshold of p < .001.

For the maintenance phase, we implemented an FIR model for the first-level analysis (Henson, 2003). Four regressors were considered, corresponding to the four conditions: AND, OR and IF compounds, and low-level baseline. Each condition was modeled using 16 time bins lasting 2 s each. The onset vectors of all regressors were defined using cue onset times. Linear contrasts were used to estimate the main effect of each condition of interest in the time bins from 3 to 5. When the hemodynamic delay is considered, these time bins capture the signal related to processing of the cue and to maintenance of the compound, but not to evaluation. This procedure resulted in the generation of four contrast images per participant, one per condition. These images were then subjected to a one-way ANOVA. A two-tailed F-test was used to test for differences in activation between the three compounds (AND, OR, IF). For this analysis, we only considered brain regions in which at least one compound produced a significant activation compared to the low-level baseline (see Section 1.3, Supplementary data online) by applying an inclusive mask (i.e., small volume correction was not used).

The univariate analysis for the evaluation phase was similar to that of the maintenance phase. The only difference was the use of the HRF model, instead of FIR, for first-level modeling. The HRF model was preferred over the FIR model given that it was not necessary to temporally disentangle the evaluation phase from the subsequent task phase within the same trial (Henson, 2003).

Multivariate pattern analysis: classification

We applied multivariate pattern analysis (MVPA) to identify where and how the human brain maintains and evaluates the active compound. An FIR model was applied to the realigned and slice-time corrected images (Henson, 2003). The volumes were neither spatially smoothed nor normalized, in order to preserve fine-grained patterns of activation. We modeled four regressors corresponding to the four conditions: AND, OR, IF and baseline. Each condition was modeled using 16 time bins of 2 s each. The time vectors for all regressors were defined using cue onset times.
The beta parameters estimated by the FIR model for each of the six runs were used to perform a whole-brain MVPA using a searchlight approach (Kriegeskorte et al., 2006; Reverberi et al., 2012c). The same analysis was performed independently for each time bin. To avoid overfitting of the parameter estimates, we implemented a leave-one-run-out cross-validation procedure (Misaki et al., 2010). A linear SVM (Chang and Lin, 2011) with a fixed regularization parameter C = 1 was trained to distinguish between two of the experimental conditions using data taken from 5 of the 6 runs available. The classifier performance was then tested on the data from the remaining run. The procedure was repeated 6 times and the results were averaged. Three different classifications were performed using all possible pairs of experimental conditions: AND vs OR, AND vs IF and OR vs IF compounds. For each searchlight sphere (radius = 4 voxels), a measure of decoding accuracy with respect to the chance level (50%) was obtained. The values were then combined to produce whole-brain accuracy maps. For each participant, the three accuracy maps resulting from the analyses described above were normalized to the MNI space and averaged before performing group-level analyses. The resulting mean accuracy maps, one for each time bin, were submitted to a one-way ANOVA to test in which brain regions the accuracy level was significantly higher than chance across all participants. For MVPA on the maintenance phase, only data from time bins from 3 to 5 were used. For the evaluation phase, analogous to the univariate analysis, we used an HRF model. For all analyses, unless specified otherwise, effects were considered significant for p < .005 at the voxel-level and p < .05 at the cluster level (FWE-corrected for multiple comparisons).
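The cross-validation logic described above can be sketched for a single searchlight sphere and time bin. The Python code below is a simplified illustration, not the authors' pipeline: it substitutes a nearest-centroid classifier for the linear SVM (C = 1) used in the study so that the example needs only the standard library, it runs on synthetic patterns, and all function and variable names are hypothetical.

```python
import random
from itertools import combinations


def centroid(vectors):
    """Element-wise mean of equally long pattern vectors."""
    n = len(vectors)
    return [sum(values) / n for values in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two pattern vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_run_out_accuracy(patterns, cond_a, cond_b):
    """Cross-validated accuracy for one pair of conditions.

    patterns[run][condition] holds the local pattern (e.g., the betas of
    the voxels in one sphere, for one time bin) for that run and condition.
    """
    runs = sorted(patterns)
    correct = 0
    for test_run in runs:
        # Train on all runs except the held-out one.
        train_runs = [r for r in runs if r != test_run]
        centroids = {
            cond: centroid([patterns[r][cond] for r in train_runs])
            for cond in (cond_a, cond_b)
        }
        # Test on the two patterns of the held-out run.
        for cond in (cond_a, cond_b):
            test_vector = patterns[test_run][cond]
            predicted = min(
                centroids,
                key=lambda c: squared_distance(test_vector, centroids[c]),
            )
            correct += int(predicted == cond)
    return correct / (2 * len(runs))


def mean_accuracy_above_chance(patterns, conditions=("AND", "OR", "IF"), chance=0.5):
    """Average, over all condition pairs, of accuracy minus chance level."""
    pairs = list(combinations(conditions, 2))
    above = [leave_one_run_out_accuracy(patterns, a, b) - chance for a, b in pairs]
    return sum(above) / len(pairs)


# Synthetic demo: 6 runs, 10 voxels, clearly separated condition means.
rng = random.Random(0)
signal = {"AND": 0.0, "OR": 1.0, "IF": 2.0}
patterns = {
    run: {cond: [mu + rng.gauss(0.0, 0.1) for _ in range(10)] for cond, mu in signal.items()}
    for run in range(6)
}
print(mean_accuracy_above_chance(patterns))
```

In a searchlight analysis the same computation would be repeated with the sphere centered on every voxel in turn, and the resulting value written to that voxel of the accuracy map.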
Multivariate pattern analysis: representational similarity analysis

The goal of the pattern similarity analysis was to test whether the local activation patterns evoked during maintenance or evaluation of a target compound bore a systematically higher similarity with one of the two other compounds (Kriegeskorte et al., 2008). For example, we asked whether the local patterns of activation for IF compounds were systematically more similar to those of either AND or OR compounds. The representational similarity analysis rests on the assumption that items more similar along a relevant dimension should be encoded in the appropriate brain regions by patterns of activation which bear a higher similarity compared to patterns of activation representing less similar items (Kriegeskorte et al., 2008). To perform this analysis, we relied on an approach similar to the one used for the MVPA analyses. Specifically, we used the same first-level models implemented for the MVPA, and a searchlight approach. Relevant time bins were analyzed independently. For each searchlight sphere (radius = 4 voxels), we extracted a vector representing the activation level of all voxels within that searchlight. We collected one vector for each compound type and for each run. We then assessed the correlation of the pattern for a target compound (e.g., IF) with the patterns for the two other compounds (e.g., AND and OR), and we assigned to the center of the searchlight one of two labels (−1 or +1) corresponding to the most similar compound (e.g., AND: −1, OR: +1). This procedure was repeated for all 6 local patterns of the target compound extracted from the 6 fMRI runs, for all voxels in the brain. Thus, we obtained one similarity map per participant, compound type and time bin. For each participant, we averaged the similarity maps belonging to the phase of interest (maintenance or evaluation). Finally, we performed a group analysis on the similarity maps to test whether the group average for a voxel was different from zero, indicating a systematically higher similarity of the target compound with one of the other two compound types. We ran an ANOVA with target compound (AND, OR, IF) as a factor, and with the similarity value for the target as the dependent variable. The representational similarity analysis was applied to brain regions in which an effect was found in the other analyses.

Region of interest analysis

A region of interest (ROI) analysis was used to focus data analyses anatomically when a priori information on the localization of the effect was available (Brett et al., 2002).
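The labeling step of the representational similarity analysis described in the previous section can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the text does not specify how the two comparison patterns are formed for each run, so the sketch assumes a single reference pattern per comparison compound, and the names `pearson` and `similarity_score` are hypothetical. Each run's target pattern receives −1 or +1 depending on which comparison pattern it correlates with more strongly, and the labels are averaged, so a value away from zero indicates a systematic bias toward one compound.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def similarity_score(target_runs, minus_pattern, plus_pattern):
    """Average of per-run labels for one searchlight.

    target_runs   : one pattern per run for the target compound (e.g., IF)
    minus_pattern : reference pattern labeled -1 (e.g., AND); assumption
    plus_pattern  : reference pattern labeled +1 (e.g., OR); assumption
    """
    labels = []
    for pattern in target_runs:
        r_minus = pearson(pattern, minus_pattern)
        r_plus = pearson(pattern, plus_pattern)
        # Ties are assigned to -1 in this sketch (arbitrary choice).
        labels.append(1 if r_plus > r_minus else -1)
    return mean(labels)


if __name__ == "__main__":
    and_ref = [1.0, 2.0, 3.0, 4.0]          # toy values
    or_ref = [4.0, 3.0, 2.0, 1.0]           # toy values
    if_runs = [[4.0, 3.0, 2.0, 1.5]] * 6    # six runs, toy values
    print(similarity_score(if_runs, and_ref, or_ref))
```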
For all ROIs identified, we extracted the average effect across all voxels within the ROI. The average ROI effects were then submitted to a group-level analysis. When multiple ROIs were considered in the same analysis, FWE was used to correct for multiple comparisons. ROI-based analyses were also used to test for the effect of alternative interpretations of IF. In particular, we tested whether alternative interpretations modulated the effects of maintenance and evaluation of IF compounds in the relevant ROIs (see Results). We performed ANOVAs with ROI (restricted to regions showing a main effect of IF) as a within-subject factor and the interpretation of IF as a between-subject factor. The dependent variable could be either the activation level in the case of univariate analyses, or accuracy values in the case of multivariate analyses.

Results

Behavioral results

The average consistency of responses to the visual target scenes was high in all 3 conditions: AND 89% (SD = 4.5%), OR 89% (3.8%) and IF 91% (4.4%). Mean accuracy on baseline trials was also high at 93% (SD = 3%). On average, RTs from the onset of the visual target were short: AND 1345 ms (SD = 208 ms), OR 1531 ms (175 ms) and IF 1428 ms (211 ms). The difference in RTs across conditions was significant (F(2,58) = 49.7, p < .001). Participants were divided into two subgroups depending on their interpretation of conditionals, as inferred from their responses to the 4 different targets. Thirteen participants spontaneously adopted the full logical interpretation of IF, whereas 16 used the simplified interpretation of IF, and 1 used neither of the two. The latter participant was discarded from further analyses. The two subgroups were comparable with respect to age (t(19) = −0.39, p = .70) and gender (z = 0.54, p = .59). We ran two 3 × 2 mixed ANOVAs, with compound type (AND, OR, IF) as a within-subject factor and IF-interpretation (classical or conjunctive) as between-subject factor.
In one ANOVA, the dependent variable was response consistency; in the other it was RTs. For consistency, the main effect of interpretation was significant (F(1,54) = 5.64, p = .024): participants using the full logical meaning of IF were slightly more consistent compared to those adopting a simplified-IF reading (91% vs 89%, respectively). However, no interaction effect (F(2,54) = 1.02, p = .37) was found. When RTs were examined, there was no main effect of interpretation (F(1,54) = 4.05, p = .054), but a clear interaction of compound type by IF-interpretation (F(2,54) = 9.41, p < .001) was present. This interaction was due to a difference in RTs between the two interpretation groups only when processing IF compounds (1540 ms for the classical vs 1326 ms for the conjunctive group, t(27) = 3.0, p = .006), but not for AND and OR.

Fig. 2. Experimental fMRI results from the delay (or maintenance) phase showing the brain region involved in the representation of the active logical compound. Mean decoding accuracy (0 is chance level, 50%) is shown for the time bins in the maintenance phase (from 4 to 10 s after cue onset), and for the time windows that immediately precede (ITI; up to 4 s) and follow the delay (target or ‘evaluation’ phase; from 10 s on, Fig. 3). The left posterior inferior frontal gyrus (pIFG, BA44) is the only brain region where information about the active compound can be decoded throughout the entire period considered. See Table S1 for detailed statistical values.

Neuroimaging results

Maintenance phase

We first tested for the presence of differences in average activation levels when maintaining AND vs OR vs IF compounds (univariate effect), but no difference was found between compound types. Next, we assessed whether local patterns of brain activity can predict which logical compound is currently being maintained with an accuracy significantly greater than chance: only the posterior portion of the left inferior frontal gyrus (pIFG, primarily BA44, overlapping with Broca's area) contained information on the identity of the active compound for the entire duration of the delay (Fig. 2). We further checked whether this finding could be driven by one or two of the three decoding pairs (i.e., AND vs OR, AND vs IF, OR vs IF). We found no differences in decoding performance across the three pairs (p > .10). Moreover, all three decoding pairs showed a greater than chance decoding performance in left pIFG/BA44 (p < .05, corrected). Once the central role of the left pIFG in the representation of the active compound was established, we assessed whether, in that region, the pattern of activation representing one compound type displays a higher similarity with the pattern representing another compound type.
For example, we tested whether neural patterns representing IF are systematically more similar to those representing OR than to those representing AND. We did not find any evidence for consistent similarities between patterns encoding each compound type in the delay phase. Finally, we tested whether information on the active logical compound could also be found in the regions that were identified during the evaluation phase by means of a ROI analysis (see the next section and Fig. 3). Again, we found that the identity of the active logical compound during the maintenance phase could only be extracted from the pIFG ROI (p < .01, corrected) but not from aIFG or IPS. As a note of caution, however, one should consider that the definition of the ROIs in the evaluation phase might not be independent from activity in the maintenance phase, due to the sluggishness of the hemodynamic response.

Evaluation phase

We first explored whether any brain areas had a different average activation while processing the 3 compound types (univariate effect) and, at the same time, an activation higher than the low-level baseline. We found that activations differed in a relatively restricted left-hemispheric cortical network (Fig. 3), comprising the anterior inferior frontal gyrus (aIFG, primarily BA47), pIFG (primarily BA44), and the intraparietal sulcus (IPS, BA40). We conducted a series of post-hoc t-tests to establish which simple comparisons underlie the observed main effect. Post-hoc tests focused on the 3 ROIs (left aIFG, pIFG and IPS) derived from the preceding analysis (see footnote 2). We found that the overall differences were largely due to an increased activation for IF and OR compared to AND. The activation for IF and OR compounds was similar. This pattern was observed in all 3 regions: left aIFG (BA47; IF–AND: t(28) = 5.17, p < .001; OR–AND: t(28) = 3.47, p = .002; IF–OR: t(28) = 1.15, p = .26), left pIFG (BA44; IF–AND: t(28) = 3.36, p = .002; OR–AND: t(28) = 4.93, p < .001; IF–OR: t(28) = −1.05, p = .30) and left IPS (BA40; IF–AND: t(28) = 4.29, p < .001; OR–AND: t(28) = 4.66, p < .001; IF–OR: t(28) = .48, p = .63).

Next, we performed a classification analysis and a similarity analysis. For these analyses, we focused on the 3 regions that displayed a univariate effect in the evaluation phase (Fig. 3). We found that the identity of the active compound could be decoded from each of these three ROIs (all ps < .001, whole-brain corrected for multiple comparisons). Most interestingly, both in left pIFG (BA44) and in left aIFG (BA47) activation patterns related to IF compounds were more similar to those associated with OR than to AND (p < .001, corrected for multiple comparisons), whereas AND was systematically more similar to OR than to IF in all ROIs (p < .008 in all tests, corrected for multiple comparisons).
OR did not show a significant similarity towards any of the other two compound types. Note that the pattern similarity analysis is independent of the analysis used to define the ROIs. Finally, we applied the same ROI-based approach to the only region that was found to represent the active compound during the maintenance phase (Fig. 2). We found that the identity of the active compound during the evaluation phase could be extracted from this ROI (BA44) (p < .001, corrected). Unlike in the maintenance phase, in the evaluation phase we found a stable pattern of similarity among compound types.

Footnote 2: Given how these ROIs were defined, the post-hoc tests are not independent from the whole-brain ANOVA used to define them. These post-hoc tests are meant to explore the simple effects involved in the ANOVA interaction: we report these post-hoc analyses to provide direct evidence on the source of the interaction effect. Nevertheless, the reported significance levels might be inflated and should thus be interpreted together with the associated ANOVA. In the Supplementary data online (Section 1.2), we also report whole-brain corrected analyses producing similar results.


Fig. 3. Experimental fMRI results from the evaluation phase showing the cortical regions in which activation levels differed between AND, OR and IF compounds. The plots report the average activation level within each region for each compound type. In all these brain regions, the activation was higher for OR and IF compounds compared to AND. IPS: Intraparietal Sulcus; pIFG: posterior Inferior Frontal Gyrus; aIFG: anterior Inferior Frontal Gyrus. See Table S2 for detailed statistical values.

Specifically, the local activation patterns for IF in BA44 were more similar to OR than to AND (p < .005, corrected), AND was more similar to OR than to IF (p < .005, corrected), while OR did not show a significant similarity towards either AND or IF. Again, note that this similarity analysis is independent from the analysis used to define ROIs.

Effects of alternative interpretations of IF

Participants interpreted IF compounds in alternative ways, either adopting the full logical meaning of the conditional, or a simplified conjunctive interpretation (Table 1). The latter interpretation of IF produced a pattern of behavioral choices identical to that observed for AND. Here, we explored whether and how using the full logical meaning of IF, or the conjunctive reading, changed the way the brain represents and processes conditionals. We again focused on the 3 brain regions (left aIFG, pIFG and IPS) for which we had found effects in the preceding analyses (similar results on the interpretation effect are found also by adopting a whole-brain approach, see Section 1.4 of the Supplementary data). First, we considered all brain regions that were found to respond differently when evaluating the different compound types (left aIFG, pIFG and IPS; Fig. 3). In these regions, we tested for the presence of a modulating effect of IF-interpretation on activations associated with the evaluation of IF compounds. Specifically, we performed a 3 (ROI) × 2 (IF-interpretation) mixed ANOVA (Fig. 3), with ROI as a within-subject factor, IF-interpretation as a between-subject factor, and the contrast IF > low-level baseline as the dependent variable. We found that both main effects were significant, indicating that the activation level was different across ROIs compared to low-level baseline (F(2,54) = 14.83, p < .001), and that different interpretations of conditionals led to different activations in the overall left-hemispheric aIFG–pIFG–IPS network (F(1,27) = 18.82, p < .001).
Crucially, we found a significant interaction of ROI by IF-interpretation (F(2,54) = 5.56, p = .006). Post-hoc analyses showed that this interaction was due to higher activation levels in the left IPS in the group that adopted the full logical interpretation of IF (t(27) = 2.93, p = .007). Notably, the effect was absent in left aIFG (t(27) = 1.32, p = .20) and left pIFG (t(27) = 1.59, p = .12) (Fig. 4). Furthermore, we assessed the presence of interpretation effects on the representation of logical compounds during the maintenance phase. In particular, we tested whether the performance of the multivariate classifier in pIFG (Fig. 2) differed in the two interpretation groups. No difference was found (p = .66). Moreover, we did not find any difference across groups in pIFG/BA44 during the evaluation phase (p = .28).

Finally, we tested for interpretation effects on the patterns of similarity across compound types observed in aIFG and pIFG during the evaluation phase. Again, no difference was found across interpretation groups. In particular, the similarity pattern for IF did not change between the full logical and the conjunctive interpretation groups (p > .05, corrected). This negative finding was further confirmed when pattern similarity analyses for IF were explored in the group endorsing a simplified reading of IF: in both aIFG and pIFG, IF remained more similar to OR compared to AND (p < .005, corrected), even though responses to IF and AND were identical in this group.

Discussion

Logical connectives are cornerstones of human cognition, but how they are represented in the brain has so far eluded direct investigation. Our fMRI results support a dynamic, multi-step and multi-format model of the maintenance and evaluation of logical compounds. When compounds are only maintained in working memory, their surface form is represented in the left posterior inferior frontal gyrus (pIFG, BA44,

Fig. 4. Effect of the alternative interpretations of IF compounds during the evaluation phase. The plot reports the average activation in three relevant brain regions (aIFG, pIFG, IPS; see Fig. 3) during the evaluation of IF compounds minus the low-level baseline. The average activation in these regions is displayed separately for the two interpretations of IF compounds: full logical meaning and simplified interpretation.


Broca's area). When compounds must be evaluated against a visual scene, a more complex pattern emerges. The left inferior frontal gyrus (both pIFG, BA44, and aIFG, BA47) represents and processes the full logical meaning of connectives, while the specific interpretation of IF adopted by a participant relies on the left IPS (BA40).

Left posterior inferior frontal gyrus (left pIFG, BA44)

Left pIFG (BA44), which largely overlaps with traditional Broca's area, was the only region that represented the active compound during both the maintenance (Fig. 2) and evaluation (Fig. 3) phases. Our results suggest that pIFG instantiates a dynamic neural code: during maintenance, pIFG encodes the surface form of the active compound, whereas during evaluation it encodes its full logical meaning. During maintenance, it was possible to determine the identity of the active compound by decoding local activation patterns in pIFG. However, pIFG showed no difference in activation levels, and no systematic pattern of similarity across different logical compounds. In contrast, during evaluation, pIFG was more active for IF and OR than AND, and the patterns of activation associated with IF showed greater similarity to OR than to AND. Neither activations nor similarity patterns in pIFG were modulated by the interpretation of IF. This is intriguing considering that IF on a conjunctive reading produces behavioral choices indistinguishable from AND. The left pIFG (BA44) is involved in phonological and syntactic processing and maintenance (Grodzinsky and Santi, 2008; Hagoort, 2005; Makuuchi et al., 2009; Owen et al., 2005; Pallier et al., 2011; Rogalsky and Hickok, 2010), and in representing and processing hierarchical information (Badre, 2008; Koechlin and Jubault, 2006). A recent study by our group showed that left pIFG is a key area in representing the formal structure of a logical problem during deductive reasoning (Reverberi et al., 2012a; see also Reverberi et al., 2009).
Our findings confirm the role of left pIFG in representing both the surface form and the logical meaning of sentences. Our findings suggest that left pIFG (BA44; Broca's area) encodes and maintains the logical connection between elementary propositions in a compound, and shifts rapidly between a surface form representation and a semantic representation of a logical compound within the few seconds spanning a single trial (Stokes et al., 2013). These findings are consistent with previous research suggesting that different linguistic computations are instantiated in fine-grained patterns of spatio-temporal activity in posterior and middle IFG (Sahin et al., 2009).


Left anterior inferior frontal gyrus (left aIFG, BA47)

The left aIFG was differentially recruited when compounds had to be evaluated. Like pIFG, aIFG was more active when processing IF and OR than AND. Activation of aIFG did not change with the interpretation of IF. Whereas pIFG encodes information also during maintenance, aIFG only does so during evaluation. Left aIFG is engaged during controlled retrieval from semantic memory (Badre et al., 2005; Metzler, 2001; Thompson-Schill et al., 1997; Whitney et al., 2011). In our study, controlled retrieval may be required when the compound is semantically complex, as when more models are involved (with IF and OR) as compared to when just one (AND) is. This implies a higher activation of aIFG for IF and OR than AND, which is what we observed. However, the activation of aIFG is not influenced by the alternative interpretations of IF, suggesting that the meaning retrieved at the stage involving aIFG is similar across participants. Left aIFG is also activated, along with other regions, by rule processing and deduction (Bunge et al., 2003; Monti et al., 2007, 2009; Noveck et al., 2004; Prado et al., 2010; Reverberi et al., 2010, 2012a; Rodriguez-Moreno and Hirsch, 2009; Sakai and Passingham, 2003). In a recent study, we showed that the activation in aIFG predicts whether different participants make logically valid inferences (Reverberi et al., 2012a). Two lines of enquiry, one on semantic retrieval and the other on inference making, converge in considering the left aIFG a crucial region for language and reasoning. Inference and controlled semantic retrieval may seem unrelated, but they have deep functional links. First, retrieving meanings may involve inference. For example, one may recover relatively uncommon meanings by building a deductive chain of implications starting from a more easily retrieved meaning (e.g., ‘If candle, then flame’; ‘If flame, then halo’).
Moreover, in formal logic, a conclusion (e.g., of a deductive argument) is a semantic consequence of a given set of premises, if there is no model in which all premises are true and the conclusion is false. According to these approaches, logical inference is the direct consequence of retrieval and integration of the meaning of premises (see also Chierchia, 2013).

Left intraparietal sulcus (IPS, BA40)

Like the left aIFG, the left IPS was more active when processing IF and OR than AND only during evaluation. But in contrast with aIFG, the activation in IPS was modulated by the interpretation of

Fig. 5. Summary of the proposed functional account of experimental data showing the specific contributions of each brain region during the maintenance and evaluation phases: pIFG (BA44, Broca's area) represents the surface form of logical compounds during maintenance, and switches to processing the full logical meaning of compounds during evaluation, together with aIFG (BA47); IPS (BA40) is involved in computing a simplified interpretation of conditionals.


conditionals. IPS was the only area in which an effect of interpretation was observed. IPS may be involved in the post-retrieval selection and manipulation of the context-relevant meaning of compounds. If multiple models are selected and evaluated, as with IF (full logical meaning) and OR, the involvement of IPS is greater. By contrast, in compounds with only one model (AND), or in compounds that are simplified by pruning some models (e.g., conjunctive IF), selection and evaluation demands are lower. This interpretation is consistent with previous models and findings on the functional role of left IPS. Left IPS is part of a general-purpose system involved in representing information relevant for the task at hand and controlling top-down attention (Fedorenko et al., 2013; Humphreys and Lambon-Ralph, 2014; Owen et al., 2005; Power and Petersen, 2013; Whitney et al., 2011; Sarma et al., 2016). This region has been implicated in a variety of tasks, from working memory to arithmetic and inference. As for semantics, a recent TMS experiment showed that left IPS is necessary for the top-down selection of specific features (e.g., size), out of multiple features characterizing an object (Cabeza et al., 2008; Whitney et al., 2011). In studies on inference, a supportive role has been attributed to left IPS, possibly in representing the structure of an argument (Monti et al., 2007, 2009), though not in generating logically valid inferences (Reverberi et al., 2012a). The present experiment confirms and extends previous findings by assigning to the left IPS the role of selecting and manipulating context-relevant representations of logical compounds.

Alternative explanations

Based on theoretical insights from logic and linguistics, we have presented and tested a number of predictions for neural responses to AND, OR and IF compounds, depending on whether (and where) the surface form, the full logical meaning or a simplified interpretation of a compound is represented in the brain.
It is important to emphasize the inferential structure underlying these predictions. In all cases, we used logical theory to form a prediction in terms of neural similarity patterns or activation levels, and we initiated a search for brain areas that might conform to the prediction. The regions discussed above (i.e., left pIFG, aIFG and IPS) fit the predicted patterns. However, it would also be useful to rule out alternative accounts of the function of those regions that are or appear to be equally compatible with the data, in order to strengthen the conclusion that these areas indeed represent and process the surface form, the full logical meaning and the interpretation of logical compounds in different phases of the task. One possible alternative account of the differences in activation between IF, OR and AND during evaluation is that IF and OR require additional ‘effort’, working memory, attention or control, compared to AND which only involves perceptual matching of an internal representation (a yellow square and a green circle) with the visual scene. It is not clear whether this account would explain why, as our results also suggest, IF and OR should be more complex than AND without (implicitly or indirectly) referring to the different truth-table cases that must be considered to evaluate disjunctive and conditional compounds. Furthermore, this account would predict that IF-simplified and AND are equally difficult, and would therefore fail to explain the difference we have observed in IFG between IF (regardless of interpretation) and AND. These considerations apply to a version of this account in which evaluating IF and OR compounds requires additional deductive or ‘rule-based’ processing, a possible source of working memory load. 
The problem here is that one must show, independently from the experimental results that one is trying to predict or explain, that checking the logical consistency between compound and target is more complex computationally when implemented deductively (applying derivation rules) than when implemented semantically (inspecting models). However, deductive systems for propositional logic often consider a number of ‘cases’ that mirror models (truth tables), making it difficult to distinguish rule-based from model-based accounts in this case (Stenning and van Lambalgen, 2008, pp. 118–119). Thus, what determines working memory load here would still be the number of instances (i.e., of models or deduction cases) that have to be considered in order to evaluate a compound relative to the target.

Yet another alternative explanation implies that the observed activations, especially stronger BOLD responses to IF and OR, are driven by RTs, which are also longer for disjunctions and conditionals. However, it seems more likely that RTs are a downstream effect of differences in (semantic) complexity, as revealed by activation patterns, rather than the opposite, namely activation levels being an ‘effect’ of RTs. Moreover, RTs cannot explain the effects we see during the delay, as no response is given or can be prepared at that stage. Also, RTs for IF-simplified and AND (and for OR and classical-IF) are comparable, yet we see differences in activation. A similar observation would apply to compound length and syntactic complexity: IF compounds are one word longer (i.e., ‘then’) than AND and OR, and include a subordinate clause. Here too, the alternative account seems to be empirically incomplete: first, AND and OR compounds have the same length in words, yet we found differences in activation in the evaluation phase; second, simplified and classical-IF have the same length, yet we found differences in IPS.

In brief, alternative accounts of the data, besides being internally consistent, should (a) explain all of the results reported here in a single framework (as our logic-based theory does), (b) avoid (implicit or indirect) recourse to logical notions in providing non-logical accounts of the data, and (c) demonstrate that alternative accounts are empirically distinguishable, and not formal equivalents. The alternative explanations discussed above fail to meet these criteria.
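The notion of "number of models" that runs through the arguments above can be made concrete by enumerating truth-table rows. The following is a minimal sketch rather than part of the study: it assumes an inclusive reading of OR and the material conditional for classical IF, and the names `CONNECTIVES` and `models` are hypothetical. Under these assumptions, AND and conjunctive IF each have a single model, whereas OR and classical IF each have three, matching the one-model vs three-model contrast drawn in the text.

```python
from itertools import product

# Truth functions for each reading (assumptions noted per entry).
CONNECTIVES = {
    "AND": lambda p, q: p and q,
    "OR": lambda p, q: p or q,                      # inclusive reading (assumption)
    "IF (classical)": lambda p, q: (not p) or q,    # material conditional
    "IF (conjunctive)": lambda p, q: p and q,       # simplified reading, same table as AND
}


def models(connective):
    """Truth-value assignments (p, q) under which the compound is true."""
    return [(p, q)
            for p, q in product([True, False], repeat=2)
            if connective(p, q)]


if __name__ == "__main__":
    for name, fn in CONNECTIVES.items():
        print(f"{name}: {len(models(fn))} model(s) -> {models(fn)}")
```

On this view, evaluating a compound against a target scene amounts to checking whether the scene's truth-value assignment is among the compound's models, so the number of candidate models is one plausible index of evaluation load.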
A fronto-parietal network for semantic combination with logical connectives: theoretical implications

Our findings have implications for theories of language and reasoning. Univariate and multivariate analyses indicate that left pIFG, largely overlapping with classical Broca's area, encodes the surface form of compounds during maintenance, and only later, during evaluation, does it process the full logical meaning of connectives, together with aIFG. The left IFG instantiates representations that were previously seen as purely logical, model-theoretic abstractions. Moving from logic to psychology, research has shown that conditionals may be assigned a full logical interpretation or a simplified reading, either equivalent or very similar to AND (Johnson-Laird and Byrne, 2002; Politzer, 1981). In the present experiment, behavioral responses to IF and to AND compounds were indistinguishable for participants endorsing conjunctive interpretations. However, neural codes in pIFG and aIFG did not differ between the two groups: activation was higher for IF than for AND, even for the participants who understood IF as having a similar truth table to AND. In all cases, the semantic functions at work in IFG may enable the computation of the full logical meaning of conditionals. Why, then, are behavioral responses different between the two groups? The answer lies in the contribution of left IPS, the only area sensitive to the alternative interpretations of IF. IPS is activated more when IF is interpreted according to its classical logical meaning (three models), than when the meaning is simplified conjunctively (one model). Our results indicate that the brain encodes the full logical meaning of connectives, but can manipulate it by pruning some models to select a specific contextual meaning that eventually drives behavioral responses.
This finding subverts an important tenet of mental model theory (MMT; Johnson-Laird, 2010), namely that the initial representation of conditionals relies on just one model (like AND), and only subsequently (in some participants) can this representation be expanded to include the full logical meaning of the conditional. Instead, our findings suggest that the full meaning of connectives, including IF, is always available, even though some participants may prune part of that meaning, as did those who adopted the simplified interpretation of IF. This is consistent with findings on the early recovery of multiple meanings of ambiguous words (Onifer and Swinney, 1981; Petten and Kutas, 1987; but see also Seidenberg et al., 1982; Klein and Murphy, 2001; Pylkkänen et al., 2006).

In conclusion, propositional logical operators are represented and processed by three cognitive components, instantiated in particular left-hemispheric regions: one that represents the surface form of compounds (pIFG), one that recovers their full logical meaning (aIFG and pIFG) and one that adapts this meaning to the specifics of the context at hand (IPS) (Fig. 5). It is noteworthy that the left pIFG, Broca's area, besides representing the surface form of compounds, can rapidly switch to a different code and represent their logical meaning. These findings uncover a brain network underlying human compositional thought, and provide a shared neural foundation for the cognitive science of language and reasoning.

Funding

PC, DP and CR were supported by PRIN grant 2010RP5RNM_001 from the Italian Ministry of University and Research. JDH was supported by SFB grant 940 from the Deutsche Forschungsgemeinschaft.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.neuroimage.2016.04.061.

References

Anderson, J.R., Lebiere, C., 1998. The Atomic Components of Thought. Lawrence Erlbaum Associates.
Badre, D., 2008. Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes. Trends Cogn. Sci. 12 (5), 193–200.
Badre, D., Poldrack, R.A., Pare-Blagoev, E.J., Insler, R.Z., Wagner, A.D., 2005. Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron 47 (6), 907–918.
Baron, S.G., Osherson, D., 2011. Evidence for conceptual combination in the left anterior temporal lobe. NeuroImage 55 (4), 1847–1852.
Bonatti, L.L., Cherubini, P., Reverberi, C., 2015. Nothing new under the sun, or the moon, or both. Frontiers in Human Neuroscience 588.
Brett, M., Anton, J.-L., Valabregue, R., Poline, J.-B., 2002.
Region of interest analysis using an SPM toolbox. NeuroImage 16. Bunge, S.A., Wallis, J.D., 2007. Neuroscience of Rule-guided Behavior. Oxford University Press. Bunge, S.A., Kahn, I., Wallis, J.D., Miller, E.K., Wagner, A.D., 2003. Neural circuits subserving the retrieval and maintenance of abstract rules. J. Neurophysiol. 90 (5), 3419–3428. Cabeza, R., Ciaramelli, E., Olson, I.R., Moscovitch, M., 2008. The parietal cortex and episodic memory: an attentional account. Nat. Rev. Neurosci. 9 (8), 613–625. Carnap, R., 1948. Introduction to Semantics and Formalization of Logic. Harvard University Press. Chang, C.-C., Lin, C.-J., 2011. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2 (3), 1–27. Chierchia, G., 2013. Logic in Grammar: Polarity, Free Choice, and Intervention. OUP Oxford, Oxford. Fedorenko, E., Duncan, J., Kanwisher, N., 2013. Broad domain generality in focal regions of frontal and parietal cortex. Proc. Natl. Acad. Sci. U. S. A. 110 (41), 16616–16621. Feldman, J., 2000. Minimization of Boolean complexity in human concept learning. Nature 407 (6804), 630–633. Friston, K.J., Holmes, A., Poline, J.B., Price, C.J., Frith, C.D., 1996. Detecting activations in PET and fMRI: levels of inference and power. NeuroImage 4 (3 Pt 1), 223–235. Goel, V., 2007. Anatomy of deductive reasoning. Trends in Cognitive Sciences 11 (10), 435–441. Grodzinsky, Y., Santi, A., 2008. The battle for Broca's region. Trends Cogn. Sci. 12 (12), 474–480. Hagoort, P., 2005. On Broca, brain, and binding: a new framework. Trends Cogn. Sci. 9 (9), 416–423. Henson, R.N.A., 2003. Analysis of fMRI time series. In: Frackowiak, R.S.J., Friston, K.J., Frith, C., Dolan, R., Friston, K.J., Price, C.J., Penny, W.D. (Eds.), Human Brain Function, second ed. Academic Press. Humphreys, G.F., Lambon-Ralph, M.A.L., 2014. Fusion and fission of cognitive functions in the human parietal cortex. Cereb. Cortex. Johnson-Laird, P.N., 2010. Mental models and human reasoning. Proc. 
Natl. Acad. Sci. 107 (43), 18243–18250. Johnson-Laird, P.N., Byrne, R.M.J., 2002. Conditionals: a theory of meaning, pragmatics, and inference. Psychol. Rev. 109 (4), 646–678. Klein, D.E., Murphy, G.L., 2001. The representation of polysemous words. J. Mem. Lang. 45 (2), 259–282. Koechlin, E., Jubault, T., 2006. Broca's area and the hierarchical organization of human behavior. Neuron 50 (6), 963–974.


Kriegeskorte, N., Goebel, R., Bandettini, P., 2006. Information-based functional brain mapping. Proc. Natl. Acad. Sci. U. S. A. 103 (10), 3863–3868.
Kriegeskorte, N., Mur, M., Bandettini, P., 2008. Representational similarity analysis - connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4.
Makuuchi, M., Bahlmann, J., Anwander, A., Friederici, A.D., 2009. Segregating the core computational faculty of human language from working memory. Proc. Natl. Acad. Sci. U. S. A. 106 (20), 8362–8367.
Metzler, C., 2001. Effects of left frontal lesions on the selection of context-appropriate meanings. Neuropsychology 15 (3), 315–328.
Misaki, M., Kim, Y., Bandettini, P.A., Kriegeskorte, N., 2010. Comparison of multivariate classifiers and response normalizations for pattern-information fMRI. NeuroImage 53 (1), 103–118.
Monti, M.M., Osherson, D.N., Martinez, M.J., Parsons, L.M., 2007. Functional neuroanatomy of deductive inference: a language-independent distributed network. NeuroImage 37 (3), 1005–1016.
Monti, M.M., Parsons, L.M., Osherson, D.N., 2009. The boundaries of language and thought in deductive inference. Proc. Natl. Acad. Sci. U. S. A. 106 (30), 12554–12559.
Morris, C.W., 1938. Foundations of the Theory of Signs. The International Encyclopedia of Unified Science. University of Chicago Press.
Noveck, I.A., Goel, V., Smith, K.W., 2004. The neural basis of conditional reasoning with arbitrary content. Cortex 40 (4-5), 613–622.
Oberauer, K., Geiger, S.M., Fischer, K., Weidenfeld, A., 2007. Two meanings of “if”? Individual differences in the interpretation of conditionals. Q. J. Exp. Psychol. 60 (6), 790–819.
Onifer, W., Swinney, D., 1981. Accessing lexical ambiguities during sentence comprehension: effects of frequency of meaning and contextual bias. Mem. Cogn. 9 (3), 225–236.
Owen, A.M., McMillan, K.M., Laird, A.R., Bullmore, E., 2005. N-back working memory paradigm: a meta-analysis of normative functional neuroimaging studies. Hum. Brain Mapp. 25 (1), 46–59.
Pallier, C., Devauchelle, A.-D., Dehaene, S., 2011. Cortical representation of the constituent structure of sentences. Proc. Natl. Acad. Sci. U. S. A. 108 (6), 2522–2527.
Penny, W.D., Holmes, A.P., Friston, K.J., 2004. Random-effects analysis. In: Frackowiak, R.S.J., Friston, K.J., Frith, C., Dolan, R., Price, C., Zeki, S., … Penny, W.D. (Eds.), pp. 843–850. Academic Press, San Diego (CA).
Petten, C.V., Kutas, M., 1987. Ambiguous words in context: an event-related potential analysis of the time course of meaning activation. J. Mem. Lang. 26 (2), 188–208.
Piaget, J., 1957. Logic and Psychology. Basic Books.
Politzer, G., 1981. Differences in interpretation of implication. Am. J. Psychol. 94 (3), 461–477.
Power, J.D., Petersen, S.E., 2013. Control-related systems in the human brain. Curr. Opin. Neurobiol. 1–6.
Prado, J., Van Der Henst, J.B., Noveck, I.A., 2010. Recomposing a fragmented literature: how conditional and relational arguments engage different neural systems for deductive reasoning. NeuroImage 51 (3), 1213–1221.
Prado, J., Chadha, A., Booth, J.R., 2011. The brain network for deductive reasoning: a quantitative meta-analysis of 28 neuroimaging studies. J. Cogn. Neurosci. 23 (11), 3483–3497.
Pylkkänen, L., Llinás, R., Murphy, G.L., 2006. The representation of polysemy: MEG evidence. J. Cogn. Neurosci. 18 (1), 97–109.
Reverberi, C., Cherubini, P., Rapisarda, A., Rigamonti, E., Caltagirone, C., Frackowiak, R.S.J., ... Paulesu, E., 2007. Neural basis of generation of conclusions in elementary deduction. NeuroImage 38 (4), 752–762.
Reverberi, C., Shallice, T., D'Agostini, S., Skrap, M., Bonatti, L.L., 2009. Cortical bases of elementary deductive reasoning: inference, memory, and metadeduction. Neuropsychologia 47 (4), 1107–1116.
Reverberi, C., Cherubini, P., Frackowiak, R.S.J., Caltagirone, C., Paulesu, E., Macaluso, E., 2010. Conditional and syllogistic deductive tasks dissociate functionally during premise integration. Hum. Brain Mapp. 31 (9), 1430–1445.
Reverberi, C., Bonatti, L.L., Frackowiak, R.S.J., Paulesu, E., Cherubini, P., Macaluso, E., 2012a. Large scale brain activations predict reasoning profiles. NeuroImage 59 (2), 1752–1764.
Reverberi, C., Görgen, K., Haynes, J.-D., 2012a. Compositionality of rule representations in human prefrontal cortex. Cereb. Cortex 22 (6), 1237–1246.
Reverberi, C., Görgen, K., Haynes, J.-D., 2012b. Distributed representations of rule identity and rule order in human frontal cortex and striatum. J. Neurosci. 32 (48), 17420–17430.
Rodriguez-Moreno, D., Hirsch, J., 2009. The dynamics of deductive reasoning: an fMRI investigation. Neuropsychologia 47 (4), 949–961.
Rogalsky, C., Hickok, G., 2010. The role of Broca's area in sentence comprehension. J. Cogn. Neurosci. 23 (7), 1664–1680.
Sahin, N.T., Pinker, S., Cash, S.S., Schomer, D., Halgren, E., 2009. Sequential processing of lexical, grammatical, and phonological information within Broca's area. Science 326, 445–449.
Sakai, K., Passingham, R.E., 2003. Prefrontal interactions reflect future task operations. Nat. Neurosci. 6 (1), 75–81.
Sarma, A., Masse, N.Y., Wang, X.-J., Freedman, D.J., 2016. Task-specific versus generalized mnemonic representations in parietal and prefrontal cortices. Nat. Neurosci. 19 (1), 143–149.
Seidenberg, M.S., Tanenhaus, M.K., Leiman, J.M., Bienkowski, M., 1982. Automatic access of the meanings of ambiguous words in context: some limitations of knowledge-based processing. Cogn. Psychol. 14 (4), 489–537.
Stenning, K., van Lambalgen, M., 2008. Human Reasoning and Cognitive Science. first ed. The MIT Press.
Stokes, M.G., Kusunoki, M., Sigala, N., Nili, H., Gaffan, D., Duncan, J., 2013. Dynamic coding for cognitive control in prefrontal cortex. Neuron 1–12.
Tarski, A., Vaught, R.L., 1956. Arithmetical extensions of relational systems. Compos. Math. 13, 81–102.


Thompson-Schill, S.L., D'Esposito, M., Aguirre, G.K., Farah, M.J., 1997. Role of left inferior prefrontal cortex in retrieval of semantic knowledge: a reevaluation. Proc. Natl. Acad. Sci. U. S. A. 94 (26), 14792–14797.
Whitney, C., Kirk, M., O'Sullivan, J., Lambon Ralph, M.A., Jefferies, E., 2011. Executive semantic processing is underpinned by a large-scale neural network: revealing the contribution of left prefrontal, posterior temporal, and parietal cortex to controlled retrieval and selection using TMS. J. Cogn. Neurosci. 24 (1), 133–147.
Woolgar, A., Thompson, R., Bor, D., Duncan, J., 2011. Multi-voxel coding of stimuli, rules, and responses in human frontoparietal cortex. NeuroImage 56 (2), 744–752.