Multi-agent large-scale parallel crowd simulation


Procedia Computer Science 108C (2017) 917–926

International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland

Artur Malinowski¹, Paweł Czarnul¹, Krzysztof Czuryło², Maciej Maciejewski², and Paweł Skowron²

¹ Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland
² Intel Technology Poland Sp. z o.o., Gdansk, Poland

Abstract

This paper presents the design, implementation and performance results of a new modular, parallel, agent-based, large-scale crowd simulation environment. A parallel application, written in C with MPI, was implemented and run in this environment for simulation and visualization of an evacuation scenario at Gdansk University of Technology, Poland, and further in the area of districts of Gdansk. The application uses parallel MPI I/O, run on two different clusters with a two- or three-node Parallel File System (PFS), to store the current state in a file. In order to make this implementation efficient, we used our previously developed and tuned Byte-addressable Non-volatile RAM Distributed Cache, a solution that allows efficient access to small data chunks from scattered locations within a file. We present application execution times versus the number of agents (up to 100 000), versus the number of simulation iterations (up to 25 000), versus map size (up to 6 km²) and versus the number of processes (up to more than 650), showing high speed-ups.

Keywords: parallel crowd simulation, multi-agent large-scale simulation, scalability, NVRAM distributed cache

© 2017 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the scientific committee of the International Conference on Computational Science.

ISSN 1877-0509. doi:10.1016/j.procs.2017.05.036

1 Introduction

Computer simulations are among the most useful high performance computing (HPC) applications. Ever since supercomputers were introduced, such simulations have been used to predict the weather (e.g. 24-hour weather forecast [21]), investigate the effects of people's behavior (e.g. network traffic simulation [10]) or determine expected parameters of hypothetical substances (e.g. simulation of fluid adsorption in buckytubes [9]). Years of research on both software and hardware have led to much more accurate models and significantly larger simulation domains. A good example of a modern computer simulation is the Blue Brain project¹, which plans to prepare a digital reconstruction of the brain; at the current project stage, researchers have been able to simulate a part of a rat's brain [3].

Another phenomenon that may benefit from accurate computer simulations is crowd behavior. The topic is increasingly relevant: large areas such as cities have to deal with constantly growing populations, many European universities recruit more and more students, conference centers organize bigger and bigger events each year, etc. In such cases crowd simulation makes it possible to predict whether a place offers enough capacity for a given number of individuals, to calculate queuing time in bottlenecks and, probably most useful, to plan an evacuation. Fast and safe evacuation is especially important for Polish academics: in 2015 three students died in a panic-driven stampede at one of the Polish universities.

Increasing the complexity of the model and the scale of a simulation environment is a constant challenge for HPC. More powerful hardware is usually not enough; the main focus should be on algorithms and application architecture. In this paper we propose an agent-based crowd simulation application with a modular, file-based architecture that is able to perform large-scale simulations (thousands of individuals, areas of square kilometers). We focus on the environment for individual agents: we describe the architecture, the full technology stack and implementation details. The efficiency and performance of the proposed solution are demonstrated by a set of experiments run on two different HPC clusters.

2 Related Work

There is a great deal of research on modeling crowd behavior, and most of it focuses on an agent-based approach (sometimes also called individual-oriented). In a typical case, agents move through a two-dimensional map. The map is divided into small squares; each square either represents an obstacle (e.g. a wall, tree or fence), is free, or is occupied by an agent. The simulator triggers the behavior of consecutive agents, one step per simulation iteration [7]. The behavior of each agent in the environment is determined by a provided model. Many models are quite sophisticated: they include not only basic parameters (e.g. agent speed), but also consider, for instance, the impact of relationships between agents (e.g. family, stranger), their need for help (e.g. young, old, disabled), or their level of altruism [2].

In terms of performance, experimental results in the literature usually involve at most hundreds of individuals [7]. In 2014, A. Gutierrez-Milla et al. proposed a simulator implemented both in NetLogo and in C using the Message Passing Interface (MPI) [4]. Although the technology stack and the approach are similar to our solution, that paper focuses mainly on a single agent model and does not give many details about how agents interact. The experimental results presented there were promising, but the scale was not very large: the simulation contained at most 15 000 agents and 16 MPI processes.

A significant group of agent-based simulation software is connected to computer graphics and is widely used in movies and video games. Although these applications are often efficient and precisely described, their methods cannot be applied here, mainly because of their single-node architecture [15, 19, 18].

Many papers propose concepts other than an agent-based approach. Roger L. Hughes derived a set of equations for calculating properties of the crowd as a whole, rather than as a sum of individuals [6]. The biggest disadvantage of this method, however, is its lack of flexibility, e.g. it cannot easily be applied to a heterogeneous crowd. Another solution, hybrid modeling, combines a macroscopic approach (focused on the features of the whole crowd) with a microscopic (individual-based) one [22, 8]. The main motivation for hybrid modeling was the insufficient performance of the agent-based approach. On the other hand, hybrid modeling increases the complexity of the application.

In most cases, the simulation area is restricted to a single building or a part of a city. However, some situations require a specific approach, e.g. evacuation on a ship [1, 14] or on a moving platform in general [17]. Additional conditions such as ocean waves are out of the scope of this paper; our solution is designed to perform best in stable areas.

¹ http://bluebrain.epfl.ch/

3 Motivations

Considering the existing solutions presented in related work, our aim is to create an agent-based crowd simulation with an architecture that addresses the following concerns:

1. Performance. The platform should be able to efficiently run simulations of thousands of agents on a large map. To the best of our knowledge, no solution so far has allowed simulation of tens of thousands of people in the area of a district or even a small city.
2. Scalability. A common property of HPC applications: with an increasing number of cluster nodes, the execution time of the application should decrease.
3. Modularity. With the increasing complexity of applications and the growing number of people working on them, there is a need for software that can be developed jointly by different teams. We want to create the architecture in a modular way, with a precise specification of the interfaces between modules. Moreover, we do not focus on a model of agent behavior, so it needs to be easily replaceable.

4 Proposed Solution

The basic architecture of the application contains three component types:
• simulation environment,
• agent models (one or more),
• output components (in a basic scenario, one for gathering statistics and one for visualization).

4.1 Simulation Environment

The simulation environment is the main part of the application. Its first task is loading a map. In the simulation, a two-dimensional map is divided into 40 cm x 40 cm tiles (the space needed to hold one individual [4]). Each tile has a fixed size; in this paper we operate on four-byte tiles, but this can easily be extended if needed. The environment component does not specify the properties or format of the data stored within a tile; however, it must know how to recognize an agent and the obstacles an agent cannot pass through (e.g. buildings or water reservoirs). To this end, a programmer needs to register two binary masks: one for an agent and one for a blocked tile. A single tile can store information about the type of terrain (as it may influence, for instance, the speed of an agent) and, if occupied by an agent, visible agent parameters such as age, sex or disability. As an agent is able to observe a part of the map around it, saving data in a tile is the only way to pass information between agents.

After map initialization, in each iteration of the main loop, each agent is triggered to make a move. The component passes two pieces of information to an agent: its position and its state. The position is given by the actual coordinates of the agent on the map. The state is an object in which the agent should store its internal state. In the first iteration the state is empty, and the agent is responsible for initializing it. After an agent finishes its task, it is expected to return the list of positions it is trying to move to. The environment component then checks whether the move is possible and moves the agent by changing the information stored in tiles. As observing the map is a relatively expensive operation for an agent, it is up to the agent whether and when the observation procedure is executed.

The last task of the simulation environment is executing output components. As the whole state of the map and all objects are stored in a file, the only information passed to the components is a file handle and the map dimensions. To reduce the overhead of output operations, they are executed once every n iterations of agents' moves. Dependencies between modules are illustrated in Figure 1.
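The tile encoding and the move check described in this section can be sketched as follows. This is a minimal illustration in Python rather than the authors' C/MPI code: the mask values, the `Environment` class and its method names are assumptions made for the example, agent parameters and terrain bits are omitted, and the map lives in an in-memory array instead of a file.

```python
from array import array

# Hypothetical bit layout: the paper states only that two masks are
# registered (agent, blocked), not which bits they occupy.
AGENT_MASK = 0x00000001
BLOCKED_MASK = 0x00000002


class Environment:
    """Row-major grid of unsigned-int tiles with registered masks."""

    def __init__(self, width, height, agent_mask, blocked_mask):
        self.width = width
        self.height = height
        self.agent_mask = agent_mask
        self.blocked_mask = blocked_mask
        # Typecode "I" is a C unsigned int (four bytes on common platforms).
        self.tiles = array("I", [0] * (width * height))

    def index(self, x, y):
        """Position of tile (x, y) in the flat, row-major tile array."""
        return y * self.width + x

    def is_free(self, x, y):
        """True if (x, y) is on the map and holds neither agent nor obstacle."""
        if not (0 <= x < self.width and 0 <= y < self.height):
            return False
        occupied = self.agent_mask | self.blocked_mask
        return (self.tiles[self.index(x, y)] & occupied) == 0

    def try_move(self, pos, candidates):
        """Move the agent at `pos` to the first free candidate tile.

        Returns the agent's position after the attempt (unchanged if every
        candidate is blocked, occupied or off the map).
        """
        for x, y in candidates:
            if self.is_free(x, y):
                # Clear the agent bit at the source, set it at the target.
                self.tiles[self.index(*pos)] &= ~self.agent_mask & 0xFFFFFFFF
                self.tiles[self.index(x, y)] |= self.agent_mask
                return (x, y)
        return pos
```

In this sketch the environment, not the agent, decides which of the requested positions is taken, which mirrors the division of responsibilities described above.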

Figure 1: Components of the proposed architecture.

The simulation environment was implemented in the C language using an MPI library; the natural way of communicating with a PFS in MPI is MPI I/O. Storing the application state in a file causes frequent PFS requests, most of them only a few bytes in size. Moreover, with randomly distributed agents, there is no possibility of rearranging requests into larger blocks. Such an I/O pattern is strongly discouraged for performance reasons [20, 5, 16], so we decided to use an additional layer between MPI I/O and the PFS. The layer is a slightly modified version of the Byte-addressable Non-volatile RAM Distributed Cache, a solution that allows efficient access to small data chunks from scattered locations of a file [11]. Specifically, there are cache managers, one per cluster node, that act as proxies for reading and writing data between an MPI application process and the PFS. As NVRAM is assumed to be present in each node, it can be used for very fast byte-level access. In that solution, a file is prefetched into NVRAM and the data is operated on in NVRAM at application runtime.

In the context of this paper, the first modification concerns the underlying memory: instead of NVRAM we used regular RAM. This results in high-performance RAM-based storage with caching and a file I/O API. The second modification introduces a new cache operation that enables locking a request at the level of a single map tile. This change was required to ensure correct synchronization between agents accessing the same location simultaneously. Another advantage of incorporating the distributed cache is the possibility of switching to byte-addressable NVRAM when available, which would add support for larger input maps and provide some failure recovery routines at almost no cost [12].
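Why tile-level locking is needed can be illustrated with a small single-machine sketch. It is not the authors' distributed cache: threads stand in for MPI processes, a `bytearray` stands in for the cached file, and the `TileCache` class, its `claim` operation and the lock-striping scheme are all assumptions introduced for the example.

```python
import threading


class TileCache:
    """In-memory stand-in for a cache that can lock single four-byte tiles."""

    TILE_SIZE = 4  # bytes per tile, as in the paper

    def __init__(self, n_tiles, n_locks=64):
        self.data = bytearray(n_tiles * self.TILE_SIZE)
        # Lock striping: tiles share a fixed pool of locks. This is a choice
        # made for the sketch, not a documented property of the real cache.
        self.locks = [threading.Lock() for _ in range(n_locks)]

    def _lock_for(self, tile):
        return self.locks[tile % len(self.locks)]

    def read(self, tile):
        off = tile * self.TILE_SIZE
        return int.from_bytes(self.data[off:off + self.TILE_SIZE], "little")

    def write(self, tile, value):
        off = tile * self.TILE_SIZE
        self.data[off:off + self.TILE_SIZE] = value.to_bytes(self.TILE_SIZE, "little")

    def claim(self, tile, agent_bits):
        """Occupy `tile` if it is empty; the check and the write are atomic."""
        with self._lock_for(tile):
            if self.read(tile) != 0:
                return False
            self.write(tile, agent_bits)
            return True


def race_for_tile(cache, tile, n_agents):
    """Let `n_agents` threads compete for one tile; return the winners' ids."""
    winners = []

    def worker(agent_id):
        if cache.claim(tile, agent_id):
            winners.append(agent_id)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(1, n_agents + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

Without the lock, two agents could both read an empty tile and both write to it; with it, exactly one of the competing agents ends up occupying the tile.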

4.2 Agent models

An agent model is responsible for controlling the behavior of a single agent. As stated in the related work section, the logic behind an agent can be quite complicated, since calculating the next move may depend on many different properties. An agent compatible with our architecture must implement a single function in C, so the most straightforward technology for creating an agent is plain C/C++, possibly enhanced by OpenMP, OpenCL, NVIDIA CUDA etc. Creating model implementations in other languages is also possible, e.g. Python, increasingly popular among scientists [13], could be used via CFFI tools².

The possibility of leveraging GPGPU to improve the efficiency of a model implementation depends on additional conditions. In each iteration of a simulation, the overhead of GPU initialization/deinitialization and of synchronizing memory between the host system and the GPU should be compensated by the gain from execution on GPU cores. Reducing the execution time of an agent action is possible only with parallel, more computationally demanding algorithms. For example, the behavior of an agent model could rely on artificial intelligence: such an agent would work out the best movement strategy for a specific environment on its own. Using other computational accelerators may require a different approach, e.g. running the application on Intel® Xeon® Phi should not require any additional modifications because it natively supports MPI. On the other hand, the MPI I/O Distributed Cache has not yet been tested on this architecture, so its efficiency there requires further research.

At the current phase of the project we focused on the architecture of the simulation application. Considering a particular model in detail is out of the scope of this paper. However, in order to verify the solution and obtain performance results, we implemented a basic model. An agent starts at a selected position and walks in a straight line to its goal. If this is not possible (because of an obstacle or another agent), it tries to bypass the occupied tile. To simulate more demanding computations, we included a 1 millisecond delay in each step.
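A basic model of this kind might look like the sketch below, written in Python for brevity although the paper's agent interface is a C function. The names `make_basic_agent` and `sign`, the dictionary used as state, and the particular bypass order are assumptions of the example; the greedy step toward the goal only approximates a straight line.

```python
import time


def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def make_basic_agent(goal, delay_s=0.001):
    """Build an agent function that heads for `goal`.

    The returned function follows the interface described in the text: it
    receives the agent's position and a state object, and returns the list
    of positions the agent would like to move to, in order of preference.
    """

    def agent(position, state):
        if not state:                 # first iteration: the state is empty
            state["goal"] = goal
            state["steps"] = 0
        time.sleep(delay_s)           # stands in for heavier computations
        state["steps"] += 1

        x, y = position
        gx, gy = state["goal"]
        dx, dy = sign(gx - x), sign(gy - y)
        if dx == 0 and dy == 0:
            return []                 # already at the goal, nothing to request

        # Preferred step first, then neighbouring tiles as bypass options.
        if dx and dy:
            bypass = [(dx, 0), (0, dy)]
        elif dx:
            bypass = [(dx, 1), (dx, -1)]
        else:
            bypass = [(1, dy), (-1, dy)]
        return [(x + sx, y + sy) for sx, sy in [(dx, dy)] + bypass]

    return agent
```

Because the agent only proposes positions, the decision whether a tile is actually free stays with the environment, so the model can be replaced without touching the synchronization logic.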

4.3 Output components

Output components are triggered every n iterations; their goal is to save the application state at a certain step of the simulation. The simplest component would prepare a copy of the processing file for use in post-processing, while a more advanced component can prepare a visualization and gather useful statistics (e.g. the number of evacuated people, bottleneck locations). In our example we prepared a simple ASCII-based visualization component for debugging and a more advanced one that prepares a map with agents marked. Sample fragments of both are presented in Figure 2.
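An ASCII output component can be sketched as a function that receives only a file handle and the map dimensions, as the architecture prescribes. The sketch below is an assumption-laden illustration in Python: the mask values, the little-endian four-byte tile layout, the characters used, and the names `ascii_component` and `should_run_outputs` are all hypothetical.

```python
import io
import struct

# Hypothetical masks; the real values are whatever the programmer registers.
AGENT_MASK = 0x1
BLOCKED_MASK = 0x2


def ascii_component(handle, width, height):
    """Render the map stored in `handle` as text, one character per tile.

    Assumes row-major, little-endian four-byte tiles starting at offset 0.
    """
    handle.seek(0)
    raw = handle.read(4 * width * height)
    tiles = struct.unpack(f"<{width * height}I", raw)
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            tile = tiles[y * width + x]
            if tile & BLOCKED_MASK:
                row += "#"
            elif tile & AGENT_MASK:
                row += "A"
            else:
                row += "."
        rows.append(row)
    return "\n".join(rows)


def should_run_outputs(iteration, n):
    """Output components run once every n iterations."""
    return iteration % n == 0
```

For a quick check, an `io.BytesIO` object can play the role of the file handle, e.g. `ascii_component(io.BytesIO(struct.pack("<4I", 0, 1, 2, 0)), 2, 2)`.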

5 Experiments

5.1 Testbed environment

All of the experiments were performed using two clusters: Lap06 and the larger K2.

Specification of Lap06:
• six computing nodes, two PFS nodes;
• each node equipped with two Intel® Xeon® E5-4620 2.20 GHz CPUs, 32 GB of RAM and an SSD;
• nodes connected with 40 Gb/s InfiniBand.

Specification of K2:
• one hundred computing nodes, three PFS nodes;
• each node equipped with two Intel® Xeon® E5345 2.33 GHz CPUs, 8 GB of RAM and an HDD;
• nodes connected with 10 Gb/s Ethernet.

In both environments the same software was installed: CentOS 6.5, MPICH 3.2 as the MPI implementation and OrangeFS 2.8.7 as the PFS.

Figure 2: Two sample output modules: (a) ASCII-based visualization prepared for debugging and (b) a fragment of the visualization of an evacuation simulation at Gdansk University of Technology.

² http://cffi.readthedocs.io

5.2 Test simulation details

In the tests we used a map of the area of three districts in the city of Gdansk, Poland. Our visualization component prepared a full dump of the current state every thousand iterations, which corresponds to approximately 5 minutes of simulated evacuation. In order to measure performance with the application under full load all the time, an agent that reaches its destination selects another destination point at random. Each of the presented test cases was repeated six times; the charts present average values. The difference between single measurements and the average value did not exceed 20% in any of the test runs.
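The relation between iterations and simulated time follows directly from the figures quoted here, as the short calculation below shows. The function names are illustrative, and the walking-speed estimate additionally assumes that an agent advances one 40 cm tile per iteration, which the text does not state explicitly.

```python
# From the text: one thousand iterations is approximately 5 minutes.
SECONDS_PER_ITERATION = 5 * 60 / 1000   # 0.3 s of simulated time

TILE_EDGE_M = 0.4                        # tile edge length from Section 4.1


def simulated_minutes(iterations):
    """Simulated evacuation time, in minutes, for a number of iterations."""
    return iterations * SECONDS_PER_ITERATION / 60


def implied_speed_m_per_s(tiles_per_iteration=1):
    """Walking speed implied by the time step.

    Assumes an agent moves `tiles_per_iteration` tiles per iteration; the
    default of one tile is an assumption of this sketch.
    """
    return tiles_per_iteration * TILE_EDGE_M / SECONDS_PER_ITERATION
```

With these numbers, the 12 500 iterations used in the experiments correspond to 62.5 minutes, in line with the "about 1 hour" given in the figure captions.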




5.3 Results

Figures 3 and 4 show that we were able to perform a simulation for a large number of agents (up to 100 000) on an area of several square kilometers (6 km²). As expected, execution time grows linearly with the number of agents. Execution time is also directly proportional to the number of iterations (illustrated in Figure 5).

Figure 3: Execution time vs the number of agents. Lap06 cluster, 12 500 iterations (about 1 hour of simulated evacuation), map of 6 km².

Figure 4: Execution time vs the number of agents. K2 cluster, 12 500 iterations (about 1 hour of simulated evacuation), map of 6 km².

The application is scalable (Figure 6) but, understandably, efficiency is not ideal. The reason is that some parts of the code, as well as PFS operations, are executed sequentially. In our implementation, visualization of the large map is performed using a single thread. Moreover, using a distributed cache to handle file requests involves initialization and deinitialization overhead. The duration of this overhead depends on the size of the file and on PFS performance.
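The effect of sequential parts on scalability can be made concrete with the standard definitions of speed-up and efficiency and with Amdahl's law. The helper names below are illustrative, and any serial fraction plugged in is hypothetical: the paper does not report one.

```python
def speedup(t_ref, t_p):
    """Speed-up of a run taking t_p relative to a reference run taking t_ref."""
    return t_ref / t_p


def efficiency(t_ref, p_ref, t_p, p):
    """Parallel efficiency relative to a reference run on p_ref processes."""
    return (t_ref * p_ref) / (t_p * p)


def amdahl_speedup(serial_fraction, p):
    """Upper bound on speed-up with p processes (Amdahl's law).

    `serial_fraction` is the share of the single-process run time that
    cannot be parallelized, e.g. single-threaded visualization or cache
    initialization.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)
```

As an illustration only, a hypothetical serial fraction of 5% would cap the speed-up below 20 regardless of the number of processes, which is why sequential visualization and cache setup matter more as the process count grows.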


Figure 5: Execution time vs the number of iterations (time of evacuation). Lap06 cluster, 15 000 agents, map of 6 km².

Figure 6: Execution time vs the number of processes. K2 cluster, 12 500 iterations (about 1 hour of simulated evacuation), 15 000 agents, map of 6 km².

One of the biggest advantages of the proposed architecture is its ability to handle large simulation areas. Figure 7 shows a relatively slow growth of the execution time with increasing map size. Moreover, this growth is not related to the simulation itself but results from the visualization component (more data to copy) and distributed cache overhead (incurred only when opening and closing a file).
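How much data a larger map adds can be estimated from the tile size and tile width given earlier (40 cm x 40 cm tiles, four bytes each). The calculation below is a back-of-the-envelope sketch with illustrative function names; it ignores any file header or padding the real format may have.

```python
TILE_EDGE_CM = 40   # tile edge length, from Section 4.1
TILE_BYTES = 4      # bytes per tile, from Section 4.1
CM2_PER_KM2 = 10 ** 10


def map_tiles(area_km2):
    """Number of tiles needed to cover a map of the given area."""
    return round(area_km2 * CM2_PER_KM2 / (TILE_EDGE_CM * TILE_EDGE_CM))


def map_bytes(area_km2):
    """Size of the raw tile data for a map of the given area."""
    return map_tiles(area_km2) * TILE_BYTES
```

For the largest map used in the experiments, 6 km², this gives 37.5 million tiles and 150 000 000 bytes of tile data, which is the volume the visualization component has to copy on every dump.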

6 Summary and Future Work

Figure 7: Execution time vs the size of the simulation area. Lap06 cluster, 12 500 iterations (about 1 hour of simulated evacuation), 15 000 agents.

In this paper we presented the architecture, execution and tests of a parallel agent-based crowd simulation application. The detailed description of the three modules, their dependencies and the technology stack shows the ease of the development process, which could be an advantage when adjusting the application to another use case (for instance, a different model of agent behavior). Performance measured against different parameters, together with the scalability results, shows that the architecture is able to perform simulations of thousands of agents on an area of districts or even cities.

Future work may include development of a more sophisticated agent model and preparation of an evacuation simulation with a detailed map of large areas. We expect that such experiments could detect parts of the city that may need special attention in order to ensure that the whole area would be safe in case of danger. Another future task is performance analysis in the situation where the distributed cache is stored on an NVRAM device. In such a case the application state could be recovered from the time before a failure, which is very important for a simulation that takes a large amount of time.

Acknowledgments

The research in the paper was supported by a grant from Intel Technology Poland.

References

[1] Marina Balakhontceva, Vladislav Karbovskii, Dmitrii Rybokonenko, and Alexander Boukhanovsky. Multi-agent simulation of passenger evacuation considering ship motions. Procedia Computer Science, 66:140–149, 2015.
[2] A. Braun, S. R. Musse, L. P. L. de Oliveira, and B. E. J. Bodmann. Modeling individual behaviors in crowd simulation. In Computer Animation and Social Agents, 2003. 16th International Conference on, pages 143–148, May 2003.
[3] Henry Markram et al. Reconstruction and simulation of neocortical microcircuitry. Cell, 163(2):456–492, 2015.
[4] A. Gutierrez-Milla, F. Borges, R. Suppi, and E. Luque. Individual-oriented model crowd evacuations distributed simulation. Procedia Computer Science, 29:1600–1609, 2014.
[5] Bilel Hadri. Introduction to Parallel I/O, October 2011. https://www.olcf.ornl.gov/wp-content/uploads/2011/10/Fall_IO.pdf.
[6] Roger L. Hughes. A continuum theory for the flow of pedestrians. Transportation Research Part B: Methodological, 36(6):507–535, 2002.
[7] Hiroyuki Kobayashi, Yutaka Ishimoto, Masaki Fujioka, and Kenichi Ishibashi. A multi-agent evacuation simulator to design safe cities for high quality of life with computer clustering. In SICE, 2007 Annual Conference, pages 3043–3046, Sept 2007.
[8] Haiying Liu, Zhixin Yan, Robert W. Lindeman, Gangyi Ding, and Tianyu Huang. A hybrid crowd simulation framework towards modeling behavior of individual avoidance of crowds. In Proceedings of the 14th ACM SIGGRAPH / Eurographics Symposium on Computer Animation, SCA '15, pages 193–193, New York, NY, USA, 2015. ACM.
[9] M. W. Maddox and K. E. Gubbins. Molecular simulation of fluid adsorption in buckytubes and MCM-41. International Journal of Thermophysics, 15(6):1115–1123, 1994.
[10] Hani S. Mahmassani, R. Jayakrishnan, Kyriacos C. Mouskos, and Robert Herman. Network traffic simulation and assignment: Supercomputer applications. Res. Report CRAY-SIM-1988-F, 1988.
[11] Artur Malinowski, Paweł Czarnul, Piotr Dorożynski, Krzysztof Czuryło, Łukasz Dorau, Maciej Maciejewski, and Paweł Skowron. A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache. In M. Ganzha, L. Maciaszek, and M. Paprzycki, editors, Position Papers of the 2016 Federated Conference on Computer Science and Information Systems, volume 9 of Annals of Computer Science and Information Systems, pages 133–140. PTI, 2016.
[12] Artur Malinowski, Paweł Czarnul, Maciej Maciejewski, and Paweł Skowron. A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications. In Adam Grzech, Jerzy Swiatek, Zofia Wilimowska, and Leszek Borzemski, editors, Information Systems Architecture and Technology: Proceedings of 37th International Conference on Information Systems Architecture and Technology, ISAT 2016, Part II, pages 137–147. Springer International Publishing, 2017.
[13] K. Jarrod Millman and Michael Aivazis. Python for scientists and engineers. Computing in Science & Engineering, 13(2):9–12, 2011.
[14] Jin-Hyung Park, Dongkon Lee, Hongtae Kim, and Y.-S. Yang. Development of evacuation model for human safety in maritime casualty. Ocean Engineering, 31(11–12):1537–1547, 2004.
[15] N. Pelechano, J. M. Allbeck, and N. I. Badler. Controlling individual agents in high-density crowd simulation. In Proceedings of the 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA '07, pages 99–108, Aire-la-Ville, Switzerland, 2007. Eurographics Association.
[16] NASA High-End Computing Program. Lustre Best Practices, August 2015. http://www.nas.nasa.gov/hecc/support/kb/lustre-best-practices_226.html.
[17] Dmitriy Rybokonenko, Marina Balakhontceva, Daniil Voloshin, and Vladislav Karbovskii. Agent-based modeling of crowd dynamics on a moving platform. Procedia Computer Science, 66:317–327, 2015.
[18] Mankyu Sung, Michael Gleicher, and Stephen Chenney. Scalable behaviors for crowd simulation. Computer Graphics Forum, 23(3):519–528, 2004.
[19] Branislav Ulicny and Daniel Thalmann. Towards interactive real-time crowd behavior simulation. Computer Graphics Forum, 21(4):767–775, 2002.
[20] Philippe Wautelet. Best practices for parallel IO and MPI-IO hints, March 2015. http://www.idris.fr/media/docs/docu/idris/idris_patc_hints_proj.pdf.
[21] P. M. Wolff and P. G. Kesel. Accurate 24-Hour Weather Forecasts: An Impending Scientific Breakthrough, pages 437–460. Springer US, Boston, MA, 1977.
[22] Muzhou Xiong, Michael Lees, Wentong Cai, Suiping Zhou, and Malcolm Yoke Hean Low. Hybrid modelling of crowd simulation. Procedia Computer Science, 1(1):57–65, 2010.