Modelling information flow along the human connectome using maximum flow

Accepted Manuscript

Youngwook Lyoo, Jieun E. Kim, Sujung Yoon

PII: S0306-9877(17)30631-X
DOI: https://doi.org/10.1016/j.mehy.2017.12.003
Reference: YMEHY 8738
To appear in: Medical Hypotheses
Received Date: 18 June 2017
Accepted Date: 1 December 2017

Please cite this article as: Y. Lyoo, J.E. Kim, S. Yoon, Modelling information flow along the human connectome using maximum flow, Medical Hypotheses (2017), doi: https://doi.org/10.1016/j.mehy.2017.12.003

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

TITLE PAGE

Title Modelling information flow along the human connectome using maximum flow

Authors Youngwook Lyoo (1), Jieun E. Kim, MD, PhD (2,3), Sujung Yoon, MD, PhD (2,3)

Author Affiliations
1 Seoul National University College of Medicine, Seoul, South Korea
2 Ewha Brain Institute, Ewha W. University, Seoul, South Korea
3 Department of Brain and Cognitive Sciences, Ewha W. University, Seoul, South Korea

Corresponding Authors Sujung Yoon and Jieun E. Kim, Ewha Brain Institute and Department of Brain and Cognitive Sciences, Ewha W. University, 52 Ewhayeodae-gil, Seodaemun-gu, Seoul 03760, South Korea, Tel: +82 2 3277 2478, Fax: +82 2 3277 6562, E-mail: [email protected] (S Yoon) and [email protected] (JE Kim)

Source of Support This project was supported by the Undergraduate Research Program of the Korea Foundation for the Advancement of Science and Creativity (Y Lyoo).

ABSTRACT

The human connectome is a complex network that transmits information between interlinked brain regions. Well-known graph theoretical measures of integration between brain regions have been constructed under the key assumption that information flows strictly along the shortest possible paths between two nodes. However, it is now apparent that information also flows through non-shortest paths in many real-world networks such as cellular networks, social networks, and the internet. In the current hypothesis, we present a novel framework that uses the maximum flow to quantify information flow along all possible paths within the brain, by analogy with network traffic. We hypothesize that the connection strengths of brain networks represent a limit on the amount of information that can flow through the connections per unit of time. This allows us to compute the maximum amount of information flow between two brain regions along all possible paths. Using this framework, previous network topological measures are extended to account for information flow through non-shortest paths. The most important advantage of the maximum flow approach is that it can integrate weighted connectivity data in a way that better reflects the real information flow of the brain network. The framework provides insight into how network structure shapes information flow, in contrast to conventional graph theoretical measures based on shortest paths, and suggests future applications such as investigating structural and functional connectomes at a neuronal level.

INTRODUCTION

The brain is an immensely complex network of interconnected neurons. Technological advances in noninvasive neuroimaging, such as diffusion tensor imaging (DTI) and resting-state functional magnetic resonance imaging (rsfMRI), have allowed us to map structural and functional networks of the brain at near-millimeter resolution [1]. This has led to the recent explosive growth of connectomics, the study of the brain’s connections.

Network analysis, and in particular graph theory, has provided great insight into how network structure shapes the integration of neural processes [2]. Graph theory is a mathematical discipline that originated in 1736 with Euler's solution to the famous "Seven Bridges of Königsberg" problem [3]. By considering brain regions as nodes and brain connections as edges, graph theory can be used to analyze the topological structure of brain connectomes. Graph theoretical studies have shown that the brain network comprises modular structures interconnected by hub regions, and that it exhibits the small-world property often observed in social networks [4]. Graph theory may also provide valuable information regarding the biological framework underlying brain function, physiology, and pathology [5-11].

Structural or functional brain networks can be conceptualized as models in which brain regions (nodes) are linked by physical or functional connections (edges). Using graph theoretical analysis, several local and global network measures have been developed to quantify the segregation and integration of the structural brain network [12,13]. Network integration accounts for the brain's ability to communicate across segregated modules [12]. The efficiency of network integration has previously been estimated using the concept of the shortest path length [13]. The underlying assumption of this approach is that information flows through the shortest path, implying that the smaller the shortest path length, the stronger the capability to integrate information flow between brain regions [13]. However, these measures of network integration do not account for information flow through non-shortest paths, even though such flow is an integral part of brain connectivity [19].

In this paper, we first describe novel network measures using the maximum flow model. The maximum flow model enables us to estimate network integration by calculating the maximum amount of information flow possible between two nodes. We hypothesize that maximum flow-based network measures can provide a more accurate and comprehensive picture of information flow in the real brain. We expect that alterations in network metrics based on maximum flow, as compared with those based on the shortest path length, may be a more valid marker for predicting brain function, especially when performing tasks that require the use of the maximum capacity of brain networks.

HYPOTHESIS

A network consists of a set of nodes (neural elements) and interconnecting edges (connections between neural elements), and can be represented by a connection matrix. The matrix can be either binary or weighted. A binary matrix is constructed under the assumption that all connections have equal strength, so that the focus is only on the topological structure of the network. In a weighted matrix, the strength of each connection is represented by a real-valued connection weight, so much more information regarding the structure of the brain network is retained. For example, a human structural connectome built from DTI data usually has nodes defined by anatomically parcellating cortical and subcortical gray matter regions. Once the nodes are defined, the weight of each connection can be defined according to the size, density, or coherence of the reconstructed white matter tracts between the pair of nodes, giving a single connection matrix as shown in Fig. 1. A larger connection weight between two nodes typically reflects a larger and stronger white matter tract connecting the corresponding brain regions.

A core concept in brain network analysis is the measure of integration [12,13]. Measures of integration characterize the brain network’s ability to combine information from distributed brain regions, and are commonly based on shortest path lengths [12,13]. A path is a sequence of edges that connects two nodes, representing a potential route of information flow [13]. The shortest path length between two regions is defined as the length of the path with the smallest number of edges (in a binary graph) or the smallest sum of edge lengths (in a weighted graph). Given the assumption that information flows only along the shortest path, the shortest path length is widely used to estimate the potential for integration between two regions [13]: a lower shortest path length corresponds to stronger integration between the two regions [13].

For estimating the integrative capacity of brain networks, the following two network metrics are widely used in the literature. The first is the characteristic path length of a network, which is the average shortest path length between all pairs of network nodes. Similar to social networks [14], brain networks consistently exhibit small-worldness [4], having shorter characteristic path lengths than expected. This characteristic implies efficient information integration in the brain. The second measure of integration is the global efficiency of a network, which is defined as the average inverse shortest path length and reflects the efficiency of information integration in a parallel system. Individual variations in the global efficiency of brain networks have been linked to global brain functions such as intelligence [15] as well as to various pathological conditions including Alzheimer’s disease [16] and multiple sclerosis [17]. As discussed above, the global efficiency is also estimated under the assumption that information flows only along the shortest paths [18].

One issue with the shortest path length framework is that it uses lengths, not weights. Since larger weights should correspond to shorter lengths, the raw weight data must be transformed to length data. Any monotonically decreasing function can be used, but the current convention is to define the length as the inverse of the weight. Consequently, path lengths do not represent true physical lengths in brain networks. Fig. 2 shows a schematic representation of the relationship between weight and path length in graph theory. One advantage of the maximum flow framework is that it works directly with weight data, eliminating the need for an arbitrary transformation to lengths.
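To make the two shortest-path measures concrete, the following is a minimal Python sketch of how the characteristic path length and the global efficiency could be computed from a weighted connection matrix under the inverse-weight convention described above. The function names and the four-region weight matrix `W` are hypothetical and chosen for illustration only; they are not taken from the manuscript or from any particular toolbox.

```python
import math


def shortest_path_lengths(weights):
    """All-pairs shortest path lengths via Floyd-Warshall.

    Each edge length is taken as 1 / weight (the inverse convention);
    a weight of 0 means "no connection".
    """
    n = len(weights)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j in range(n):
            if i != j and weights[i][j] > 0:
                dist[i][j] = 1.0 / weights[i][j]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def characteristic_path_length(dist):
    """Average shortest path length over all ordered pairs of distinct nodes."""
    n = len(dist)
    total = sum(dist[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def global_efficiency(dist):
    """Average inverse shortest path length over all ordered pairs."""
    n = len(dist)
    # 1 / inf evaluates to 0.0, so disconnected pairs contribute nothing.
    total = sum(1.0 / dist[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


# Hypothetical symmetric weight matrix for four regions (illustrative values).
W = [
    [0, 2, 1, 0],
    [2, 0, 2, 1],
    [1, 2, 0, 4],
    [0, 1, 4, 0],
]

D = shortest_path_lengths(W)
L = characteristic_path_length(D)
E = global_efficiency(D)
print(f"characteristic path length = {L:.4f}, global efficiency = {E:.4f}")
```

In this toy matrix, regions 0 and 3 share no direct connection, and their shortest path length (1.25) is determined by a single route. Any parallel routes between them leave both measures unchanged, which is the limitation the maximum flow framework is meant to address.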

Furthermore, information can also flow along non-shortest paths between two brain regions (illustrated as paths A-E-B and A-D-B in Fig. 2). This calls into question the widely accepted assumption that information transfer occurs only through the shortest paths (illustrated as the path A-C-B in Fig. 2). Indeed, recruitment of the remaining paths may be necessary in order to perform tasks requiring substantial cognitive processing. Furthermore, information could flow along more than one pathway, representing different processing streams.

Considering these issues, the concept of communicability, which indirectly incorporates the effects of non-shortest paths, has been introduced [19]. The communicability of a network accounts for all possible walks between two nodes. A walk differs from a path in that revisiting nodes is allowed. Communicability can be thought of as a weighted count of all possible walks, in which longer walks are heavily penalized (a walk of length k is weighted by 1/k!). We also believe that considering all possible paths is important for a more accurate estimation of network metrics, and therefore propose an information flow model using the maximum flow. In the maximum flow framework, information flow along all possible paths is accounted for in estimating the integrative capacity of the brain network. For instance, the shortest path length between A and B is the same in panels c and d of Fig. 2 (1/2 vs 1/2), but the maximum flow is greater in panel c (7 vs 5), reflecting the contribution of non-shortest paths. A graphical comparison of information flow under the shortest path framework and the maximum flow framework is also shown in Fig. 3.

Originally arising from models of Soviet railway traffic [29], maximum flow has become a fundamental concept in optimization theory, with applications in areas as diverse as airline scheduling. The maximum flow between two regions (from the source to the sink) in a network can be thought of as the maximum amount of information flow possible between the two regions per unit of time [20]. Mathematically, the maximum flow between two regions is defined as the largest flow possible subject to two constraints [20]. Here the value of the flow is defined as the sum of all the flows into the sink region.

The capacity constraint states that the flow along an edge cannot exceed its capacity (Fig. 4). This assumes that the strength of a connection in the connectome places an upper limit on the amount of information that can flow along the edge. One interesting consequence of this assumption is the existence of bottlenecks: weak connections will not be able to handle large flows. Of note, the shortest path is insensitive to the appearance of structural bottlenecks in a network [19].

The conservation constraint states that the sum of the flows entering a node must equal the sum of the flows exiting that node (Fig. 4). This constraint may be too restrictive when modeling information flow in the brain. It is likely that as information is processed and passed on, some of the flow is lost, so that the total flow entering a node is larger than the total flow exiting the same node. It is possible to relax the conservation constraint so that the sum of the flows entering a node is greater than or equal to the sum of the flows exiting that node (Fig. 4). A flow that satisfies the capacity constraint and the relaxed conservation constraint is called a preflow [30]. It has been mathematically shown that the maximum preflow (based on the relaxed conservation constraint) is equal to the maximum flow (based on the conservation constraint) [30]. In the brain network setting, the relaxed conservation constraint states that for every brain region other than the source and the sink, the total amount of information flowing into the region is greater than or equal to the total amount flowing out of the region. Because this allows for loss of information flow, it is a more biologically plausible model of information flow in the brain.

It is noteworthy that in practice, brain connectomes are often binarized using a threshold prior to analysis [13], which raises the following issues. First, estimating network metrics from the binary matrix discards all data about the strength of the connections, because all connections are treated equally regardless of their strength. Second, the choice of a threshold for binarization is highly arbitrary. Therefore, more emphasis is being placed on analyzing weighted connectomes. For example, a recent study has suggested that weak brain connections may also be important in determining general cognitive functions [21]. The maximum flow approach is a natural step in this direction, making full use of the rich data available within weighted connectomes.
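As an illustration of how the maximum flow between two regions could be computed directly from connection weights, the sketch below uses the Edmonds-Karp algorithm (breadth-first augmenting paths), a standard textbook method that the manuscript itself does not prescribe. The two toy networks are hypothetical: their edge weights are invented so that both share the same shortest path length between A and B (1/4 + 1/4 = 1/2) while their maximum flows differ (7 vs 5), mirroring the values quoted in the text. They are not the actual weights of Fig. 2.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Maximum flow from source to sink (Edmonds-Karp) on a capacity matrix.

    capacity[i][j] is the non-negative capacity of edge i -> j; an
    undirected connection is represented by equal capacities both ways.
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    n = len(capacity)
    residual = [[float(c) for c in row] for row in capacity]
    total = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return total  # no augmenting path left: the flow is maximal
        # The weakest edge on the path is the bottleneck (capacity constraint).
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path and update residuals.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def symmetric_capacities(n, edges):
    """Build an undirected capacity matrix from (i, j, weight) triples."""
    cap = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        cap[i][j] = cap[j][i] = float(w)
    return cap


A, B, C, D, E = range(5)
# Hypothetical network with three routes between A and B.
three_routes = symmetric_capacities(
    5, [(A, C, 4), (C, B, 4), (A, D, 2), (D, B, 2), (A, E, 1), (E, B, 1)]
)
# Same network without the route through D.
two_routes = symmetric_capacities(
    5, [(A, C, 4), (C, B, 4), (A, E, 1), (E, B, 1)]
)

flow_three = max_flow(three_routes, A, B)
flow_two = max_flow(two_routes, A, B)
print(flow_three, flow_two)
```

In both toy networks the shortest path is A-C-B, so shortest-path measures cannot tell them apart, whereas the maximum flow also counts what the routes through D and E can carry.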

DISCUSSION AND FUTURE DIRECTIONS

The concept of maximum flow will provide insight into how network structure shapes information flow within the brain. Maximum flow can be thought of as a capacity, or an upper bound on the information flow possible between two regions using all possible pathways [20] (Table 1). We hypothesize that the maximum flow model is best suited to describing the brain’s functional activation when searching for solutions to complex and novel tasks. From an evolutionary perspective, information flow through the shortest, quickest path may confer the benefit of dealing efficiently with potential, imminent dangers [22]. However, performance on highly cognitively demanding and complex tasks may require the use of diverse pathways rather than the efficient but stereotyped process of information integration through the shortest path. For example, evidence suggests that sentence complexity may greatly influence the amount of brain activation [23]. Complex and novel tasks may therefore utilize diverse brain pathways while exploring and evaluating different solutions.

We believe that the maximum flow model may be applied to diverse connectomes, from neuronal-level connectomes such as that of C. elegans [24], to structural connectomes constructed using DTI or radioactive tracers, to functional connectomes obtained using fMRI. Compared to measures based on shortest path lengths, measures based on maximum flow may better predict the levels of performance on highly cognitively demanding tasks requiring creativity. Furthermore, measures utilizing maximum flow may reveal novel information not seen by other graph theoretic measures by considering information flow through multiple possible pathways.

It is also notable that the maximum flow framework gives rise to a rich set of network measures (Table 1). By computing the average maximum flow between all pairs of regions, one can obtain the global maximum flow, a "maximum flow analog" for global efficiency (Table 2). Other important measures such as the local efficiency [18] and the betweenness centrality of a node [25] have natural analogues as well (Table 2). It is also possible to create a maximum flow matrix between multiple regions, akin to a connectivity matrix between multiple regions. The maximum flow can also be generalized to undirected networks and to multiple sources and sinks (Table 2). This makes it possible to study the maximum flow between multiple brain regions. For example, the maximum flow between regions in the temporal lobe and regions in the frontal lobe could be calculated. It is also possible to split the maximum flow into two components, and compare the amount of flow that travels through the shortest paths with the amount of flow that travels through the non-shortest paths. This may allow investigations into the roles that non-shortest paths play in integrating information within the brain.
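The measures described above can be sketched in a few lines of Python. The block below builds a maximum flow matrix, averages it into a global maximum flow, and computes the flow between two groups of regions using the supersource and supersink reduction summarized in Table 2. The helper names and the four-region weight matrix are hypothetical, the maximum flow routine is a generic Edmonds-Karp implementation rather than a method specified in the manuscript, and a large finite capacity stands in for the "infinite" capacity of the added edges.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a capacity matrix (source != sink)."""
    n = len(capacity)
    residual = [[float(c) for c in row] for row in capacity]
    total = 0.0
    while True:
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return total
        bottleneck, v = float("inf"), sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def max_flow_matrix(weights):
    """Maximum flow between every ordered pair of regions (0 on the diagonal)."""
    n = len(weights)
    return [[0.0 if i == j else max_flow(weights, i, j) for j in range(n)]
            for i in range(n)]


def global_max_flow(weights):
    """Average maximum flow over all ordered pairs of distinct regions."""
    n = len(weights)
    mf = max_flow_matrix(weights)
    return sum(mf[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))


def multi_region_max_flow(weights, sources, sinks):
    """Flow from a set of sources to a set of sinks via supersource/supersink."""
    n = len(weights)
    big = sum(map(sum, weights)) + 1.0  # finite stand-in for infinite capacity
    cap = [list(row) + [0.0, 0.0] for row in weights]
    cap += [[0.0] * (n + 2), [0.0] * (n + 2)]
    super_source, super_sink = n, n + 1
    for v in sources:
        cap[super_source][v] = big
    for v in sinks:
        cap[v][super_sink] = big
    return max_flow(cap, super_source, super_sink)


# Hypothetical symmetric weight matrix for four regions (illustrative values).
W = [
    [0, 2, 1, 0],
    [2, 0, 2, 1],
    [1, 2, 0, 4],
    [0, 1, 4, 0],
]

MF = max_flow_matrix(W)
G = global_max_flow(W)
group_flow = multi_region_max_flow(W, sources=[0, 1], sinks=[2, 3])
print(MF, G, group_flow)
```

For a symmetric weight matrix the maximum flow matrix is itself symmetric, so it could be inspected alongside the original connectivity matrix. Computing every pair requires one maximum flow per pair, which is slower than an all-pairs shortest path computation.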

LIMITATIONS

One limitation in applying the maximum flow model is the absence of connectome data optimized for this model. Because maximum flow is defined on weighted directed graphs, ideally connectomes that are not only weighted but also directed should be used. However, structural connectomes obtained using radioactive tracers can differentiate edge directions but lack quantitative weight data, whereas structural connectomes from DTI data have excellent weight data but lack directional information. Despite the absence of directionality, networks constructed from DTI data seem to be the better choice in this situation, as weight information is indispensable for calculating the maximum flow. Network measures of the maximum flow on undirected graphs are also presented in Table 2.

Connectomes created from DTI data have enriched our understanding of the brain in many ways, but still suffer from weaknesses. The spatial resolution limit for DTI data is around a millimeter, and it is likely that a true connectome of the human brain at the neuronal level will require a radically different technology. Furthermore, there are inherent weaknesses in the DTI framework such as the issue of crossing fibers [26]. Also, there is no consensus on the optimal definition of nodes, despite the fact that this is known to have large effects on graph theoretical measures [27]. Many of these problems are being addressed [28], but there is still much work to do before a consensus on the best approach to constructing connectomes is reached.

Both the graph theory model and the maximum flow model describe a highly idealized mathematical model of information flow in the brain. Although the maximum flow model has the added advantage of accounting for information flow through non-shortest paths, it should be noted that these mathematical models still cannot consider all the complex interactions between multiple brain regions. Furthermore, maximum flow cannot account for loops or recurrent information flow, as it accounts only for unidirectional flow between brain regions. Because maximum flow looks at the maximum potential for a network to carry information flow between multiple brain regions, it may not be suited to analyzing temporally dynamic information flows. Although the simplicity of including only two components, input and output flows, rather than potential loop or recurrent flows, has the benefit of making brain networks tractable to model, the current model should be validated using human brain connectome data before it can be applied to studies of human structural and functional networks.

Another point that needs to be addressed is the extent to which non-shortest paths actually take part in conducting information flow across the brain. We still do not know enough about the human brain to answer this decisively. It is, however, possible to use functional imaging to test the hypothesis made earlier that more non-shortest paths are recruited in complex and novel tasks. The constraint assumptions underlying the maximum flow model could also be experimentally tested using structural and functional imaging. As with all models, modifications may have to be made to better reflect the brain.

CONCLUSIONS

In summary, we present a novel framework using the maximum flow to characterize information flow in brain networks. We also hypothesize that more non-shortest paths will be recruited in complex and novel tasks, and give possible routes of investigation. It is hoped that this will provide new insight on how structure shapes function in brain networks.

CONFLICT OF INTERESTS

The authors report no conflicts of interest.

Source of Support

This project was supported by the Undergraduate Research Program of the Korea Foundation for the Advancement of Science and Creativity (Y Lyoo). The sponsor of the study had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.

REFERENCES

1. Le Bihan D, Mangin JF, Poupon C, et al. Diffusion tensor imaging: concepts and applications. J Magn Reson Imaging 2001;13:534-46.

2. Sporns O. Contributions and challenges for network models in cognitive neuroscience. Nat Neurosci 2014;17:652-60.

3. Euler L. Solutio problematis ad geometriam situs pertinentis. Commentarii academiae scientiarum imperialis Petropolitanae 1736;8:140.

4. Watts DJ, Strogatz SH. Collective dynamics of 'small-world' networks. Nature 1998;393:440-2.

5. Bassett DS, Bullmore E, Verchinski BA, Mattay VS, Weinberger DR, Meyer-Lindenberg A. Hierarchical organization of human cortical networks in health and schizophrenia. J Neurosci 2008;28:9239-48.

6. Wang Q, Sporns O, Burkhalter A. Network analysis of corticocortical connections reveals ventral and dorsal processing streams in mouse visual cortex. J Neurosci 2012;32:4386-99.

7. Markov NT, Ercsey-Ravasz M, Lamy C, et al. The role of long-range connections on the specificity of the macaque interareal cortical network. Proc Natl Acad Sci U S A 2013;110:5187-92.

8. He Y, Chen Z, Evans A. Structural insights into aberrant topological patterns of large-scale cortical networks in Alzheimer's disease. J Neurosci 2008;28:4756-66.

9. Stam CJ. Use of magnetoencephalography (MEG) to study functional brain networks in neurodegenerative disorders. J Neurol Sci 2010;289:128-34.

10. Horstmann MT, Bialonski S, Noennig N, et al. State dependent properties of epileptic brain networks: comparative graph-theoretical analyses of simultaneously recorded EEG and MEG. Clin Neurophysiol 2010;121:172-85.

11. Raj A, Mueller SG, Young K, Laxer KD, Weiner M. Network-level analysis of cortical thickness of the epileptic brain. Neuroimage 2010;52:1302-13.

12. Sporns O. Network attributes for segregation and integration in the human brain. Curr Opin Neurobiol 2013;23:162-71.

13. Rubinov M, Sporns O. Complex network measures of brain connectivity: uses and interpretations. Neuroimage 2010;52:1059-69.

14. Milgram S. The small world problem. Psychology Today 1967;1:61-67.

15. Li Y, Liu Y, Li J, et al. Brain anatomical network and intelligence. PLoS Comput Biol 2009;5:e1000395.

16. Supekar K, Menon V, Rubin D, Musen M, Greicius MD. Network analysis of intrinsic functional brain connectivity in Alzheimer's disease. PLoS Comput Biol 2008;4:e1000100.

17. He Y, Dagher A, Chen Z, et al. Impaired small-world efficiency in structural cortical networks in multiple sclerosis associated with white matter lesion load. Brain 2009;132:3366-79.

18. Latora V, Marchiori M. Efficient behavior of small-world networks. Phys Rev Lett 2001;87:198701.

19. Estrada E, Hatano N. Communicability in complex networks. Phys Rev E Stat Nonlin Soft Matter Phys 2008;77:036111.

20. Ford LR, Fulkerson DR. Maximal flow through a network. Canadian Journal of Mathematics 1956;8:399-404.

21. Santarnecchi E, Galli G, Polizzotto NR, Rossi A, Rossi S. Efficiency of weak brain connections support general cognitive functioning. Hum Brain Mapp 2014;35:4566-82.

22. Dehaene S, Changeux JP. Experimental and theoretical approaches to conscious processing. Neuron 2011;70:200-27.

23. Just MA, Carpenter PA, Keller TA, Eddy WF, Thulborn KR. Brain activation modulated by sentence comprehension. Science 1996;274:114-6.

24. Varshney LR, Chen BL, Paniagua E, Hall DH, Chklovskii DB. Structural properties of the Caenorhabditis elegans neuronal network. PLoS Comput Biol 2011;7:e1001066.

25. Freeman LC. A set of measures of centrality based on betweenness. Sociometry 1977:35-41.

26. Jeurissen B, Leemans A, Tournier JD, Jones DK, Sijbers J. Investigating the prevalence of complex fiber configurations in white matter tissue with diffusion magnetic resonance imaging. Hum Brain Mapp 2013;34:2747-66.

27. Fornito A, Zalesky A, Bullmore ET. Network scaling effects in graph analytic studies of human resting-state FMRI data. Front Syst Neurosci 2010;4:22.

28. Behrens TE, Berg HJ, Jbabdi S, Rushworth MF, Woolrich MW. Probabilistic diffusion tractography with multiple fibre orientations: What can we gain? Neuroimage 2007;34:144-55.

29. Schrijver A. On the history of the transportation and maximum flow problems. Mathematical Programming 2002;91:437-45.

30. Mazzoni G, Pallottino S, Scutellà MG. The maximum flow problem: A max-preflow approach. European Journal of Operational Research 1991;53:257-78.

TABLES

Table 1. Mathematical formulation of maximum flow

Basic mathematical notation

Let $G = (V, E, w)$ be a weighted graph. $V$ is the set of all nodes in the network, and $n$ is the total number of nodes. $E$ is the set of all edges (connections) in the network. A connection between the nodes $i$ and $j$ is associated with the connection weight $w_{ij}$. If a connection does not exist we simply set $w_{ij} = 0$. Note that in a directed network $w_{ij}$ does not necessarily equal $w_{ji}$.

Maximum flow

Let $G = (V, E, w)$ be a weighted graph, and let $s$ and $t$ be the source node and the sink node. The capacity of an edge $(i, j)$, denoted by $c_{ij}$, is a mapping $c : E \to \mathbb{R}^{+}$. It represents the maximum amount of flow that can pass through a specific edge. A flow from the source node $s$ to the sink node $t$, denoted by $f_{ij}$, is a mapping $f : E \to \mathbb{R}^{+}$, subject to two constraints.

Constraint 1. (The capacity constraript{}) 

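Table 2. Maximum flow analogues of classical network measures

Each row pairs a measure of the maximum flow model with its counterpart in the graph theoretical model.

Maximum flow
Maximum flow model: $mf_{ij}$, the maximum flow between $i$ and $j$.
Graph theoretical model: Shortest path length. $d_{ij}$, the shortest path length between $i$ and $j$.

Global maximum flow
Maximum flow model: $MF_{glob}(G) = \frac{1}{n(n-1)} \sum_{i \neq j} mf_{ij}$
Graph theoretical model: Global efficiency. $E_{glob}(G) = \frac{1}{n(n-1)} \sum_{i \neq j} \frac{1}{d_{ij}}$

Local maximum flow
Maximum flow model: $MF_{loc}(G) = \frac{1}{n} \sum_{i \in V} MF_{glob}(G_i)$, where $G_i$ is the subgraph of neighbors of the node $i$.
Graph theoretical model: Local efficiency. $E_{loc}(G) = \frac{1}{n} \sum_{i \in V} E_{glob}(G_i)$, where $G_i$ is the subgraph of neighbors of the node $i$. This is a measure of fault tolerance of the network.

Maximum flow centrality [weighted network]
Maximum flow model: $C_{MF}(v) = \sum_{s \neq v \neq t} \frac{mf_{st}(v)}{mf_{st}}$, where $mf_{st}$ is the maximum flow between $s$ and $t$, and $mf_{st}(v)$ is the amount of those flows passing through $v$.
Graph theoretical model: Betweenness centrality [binary network]. $C_{B}(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$, where $\sigma_{st}$ is the total number of shortest paths between node $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths passing through $v$.

Extensions of maximum flow

Undirected max flow: Maximum flow in undirected graphs is a special case of maximum flow in directed graphs. Simply set the capacity $c_{ij} = c_{ji} = w_{ij}$ for all $(i, j) \in E$.

Multiple source, multiple sink max flow: Given a network with a set of sources $S$ and a set of sinks $T$, the multi-source multi-sink flow between the sets $S$ and $T$ can be reduced to the single-source single-sink problem by adding a supersource connecting each vertex in $S$, and a supersink connecting each vertex in $T$, with infinite capacity on each edge.

Binary max flow: Maximum flow in binary graphs is a special case of maximum flow in weighted graphs. Simply set all nonzero capacities $c_{ij} = 1$ for all $(i, j)$ such that $w_{ij} > 0$.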

FIGURE LEGEND

Figure 1. A schematic flowchart for constructing structural and functional connection networks.

Figure 2. Measures for network integration using the shortest path length framework

Figure 3. A visual depiction of the concept of 'maximum flow' in the maximum flow model (left) and the 'shortest path length' in the shortest path length framework (right), for modelling weighted connection strength (middle). For instance, the maximum flow between A and D is defined as the sum of the information flows along the paths A-B-D, A-C-D, and A-E-F-D. In contrast, the shortest path length framework considers only the information that flows through the shortest path, which is A-B-D.

Figure 4. Assumptions underlying maximum flow model
