Highlights


- Our coastal model repository predicts storm surges, tsunamis, flooding, and more.
- Our system will make it easier for scientists to model hazards and share data.
- We use advanced cyberinfrastructure to make experiments repeatable and easy to use.
- Our software and the models it leverages are open source and free to use.


A Sustainable Collaboratory for Coastal Resilience Research

Steven R. Brandt and Shuai Yuan
Center for Computation and Technology, Louisiana State University
Email: [email protected], [email protected]

Qin Chen, Ling Zhu, and Reza Salatin
Department of Civil and Environmental Engineering and Department of Marine and Environmental Sciences, Northeastern University
Email: [email protected], [email protected], [email protected]

Rion Dooley
Data Machines Corp
Email: [email protected]

Presented at Gateways 2018, University of Texas, Austin, TX, September 25–27, 2018. https://gateways2018.figshare.com/

Abstract—Our goal is to create a cloud-ready repository of open-source coastal modeling tools which enables scientists and engineers to use high performance computers to study a variety of physical and ecological processes. The system we are building leverages Jupyter notebooks, Docker, Singularity, and the Agave Platform to create a platform for running jobs and analyzing data in a way that is (1) intuitive, (2) repeatable, and (3) collaborative. This paper describes the methodology used to integrate the four technologies into the system that serves the coastal resilience research collaboratory. Four open-source numerical models are used to demonstrate the utility of the system. Simulation results of ocean waves generated by Hurricane Isaac (2012), coastal wave evolution, and wave forces on a bridge deck are presented as an illustration. The sustainability of the collaboratory is discussed in detail.

I. INTRODUCTION

Communities on modern river deltas (with total populations > 500 million people) are threatened by global reductions in river sediment, land subsidence, and rising sea level. Hundreds of billions of dollars must be invested to secure human communities from these threats in the face of continued population growth. The many risk mitigation projects needed to address these issues will require an intensive effort in computer simulations that are integrated with data collection and engineering analytics for guidance. This is a grand challenge for earth system science, as dynamic environmental processes at appropriate space and time scales must be integrated into observation networks and coupled simulation models.

Our vision (which was funded by NSF [Cyb]) was to build a Coastal Resilience Collaboratory (CRC) to advance research, enrich training, inspire collaboration, and inform decision making through highly available, innovation-enabling cyberinfrastructure (CI), with a particular focus on geosciences and engineering integrated with ecosystem ecology for the sustainability of deltaic coasts. We proposed an integrated, coupled modeling framework built on top of a platform-transparent cloud technology tailored for the coastal modeling community. Its primary focus is to facilitate the deployment of complex models on cloud and cloud-like architectures with negligible performance overhead through the use of low-level virtualization technology.

We demonstrate a system that enables and facilitates research and collaboration at the CRC. The system is built upon (1) Jupyter notebooks [KRKP+16a] to create a customizable, interactive tool for scientific discovery and engineering analysis, (2) the Agave framework [DBF18] to simplify sharing data and accessing high performance computing resources, (3) Singularity [KSB17] to deploy and run simulation codes on cloud-enabled resources (making use of MPI and multiple compute nodes), and (4) Docker [Mer14] both to construct the images for Singularity and to deploy the Jupyter notebooks to user machines and workstations. The integration of these technologies provides an intuitive and easy-to-use way for users to interact both with high performance computers and with each other.

II. RELATED WORK

The most closely related work is perhaps CyberGIS-Jupyter [YLP+17], which (like our project) uses a Jupyter notebook to provide access to HPC resources. Like us, they also employ containers to launch their notebooks, but they run them from resources such as Jetstream [SCF+15]. Considerable effort has gone into enabling remote execution within the Jupyter project through the JupyterSpawner community projects [jup], [TCC+17], [Mil17]. Our project is inspired by theirs, but uses a Docker container for a Jupyter Notebook rather than a JupyterHub. Our goal was to make the installation and use of the Coastal Model Repository (CMR) as decentralized and lightweight as possible, needing only minimal support from HPC staff. Indeed, our system is intended to be able to make use of a workstation or a small group-managed cluster.

Delegating authentication, provenance, file movement, sharing, and job management to Agave has been a successful solution for a number of science gateways [HVM+15], [HVM+14], [LGD15], [CMZ+18], etc. In choosing it, we greatly reduced the amount of work required to build our gateway. It is not the only choice available to gateway developers, however; other projects have utilized a variety of approaches to manage remote computation. Middleware libraries such as SAGA [GJK+06], Vine [RDG+07], and DARE [MKEKJ12] have been popular in the gateway community over the years. Programmatically invoking containers and raw code from within the Python kernel is common in high-trust and single-user environments. Co-locating notebook servers with the container runtime, and forking direct system callouts, are also frequently used. Each of these approaches requires knowledge of the target execution environment and valid user credentials to run correctly. Adopting them would have introduced a trade-off between security, portability, reusability, and reproducibility into our notebooks and forced users without experience in these areas to make such choices.

III. METHODOLOGY

A. Docker

Docker is a tool designed to make it easier to create, deploy, and run applications using self-contained container environments. It consists of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows. Docker provides a convenient framework for standardized, repeatable builds that are readable and understandable. We use Docker to launch a consistent Jupyter environment for users of our tool, ensuring that the relevant Jupyter widgets and the Agave command-line tools are properly installed. To provide a persistent working environment, we instantiate a Docker volume as well. We orchestrate the instantiation of the tool using a Docker Compose file which, additionally, allows us to name the container as it starts up, making it easier for users to copy files into and out of the running container (an optional method for providing the input files required for simulations). A minimal sketch of such a Compose file follows.
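The following is an illustrative docker-compose.yml along the lines described above, not the CMR's actual file. The image and volume names are hypothetical; the host port (8003) and the mount point (/home/jovyan) are the ones described in "Putting it All Together" below, and the container port is assumed to be Jupyter's default.

version: "3"
services:
  notebook:
    image: cmr/jupyter-notebook    # hypothetical image name
    container_name: cmr-notebook   # a fixed name simplifies "docker cp"
    ports:
      - "8003:8888"                # host port 8003; 8888 assumed as Jupyter's default
    volumes:
      - cmr-home:/home/jovyan      # persistent working environment

volumes:
  cmr-home: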

B. Singularity

Docker, however, because of its reliance on root permissions for so many of its tasks, has limited availability in academic high performance computing (HPC) environments. For this reason, Singularity and other similar tools have been created. Singularity does not allow regular users any form of sudo access. However, because root access is often needed during the creation of images (e.g., for installing software), Singularity frequently relies on Docker to obtain images for use on HPC machines.

A difficulty with Singularity is that, while it can be used for running MPI jobs, the MPI version inside the container must exactly match one installed on the host, even (in many cases) including compiler options. In this project we developed a technique to circumvent this problem by using a command in the ~/.ssh/authorized_keys file to start the desired image. This allows us to launch a job from inside Singularity and use only the MPI version inside the container; we do not need the same version of MPI (or any version of MPI at all) to exist on the host.

Getting this to work is not as straightforward as it sounds. When OpenMPI starts up, for example, it runs a command similar in form to "ssh orted ...", where orted is the name of OpenMPI's remote execution program and "..." is the argument list passed to orted. We need to change the command "orted ..." to "singularity exec my.simg orted ...". It might be possible to modify OpenMPI or MPICH to make this change; however, we have found a simpler solution based on ssh. Normally, ssh does not make the command it is executing available through the argument list (the variable $* in bash) or an environment variable. It will, however, make the command available if a command is specified in the ~/.ssh/authorized_keys file, i.e., if the line containing the public key starts with the command directive, e.g.

command="bin/auth-cmd.sh" ssh-rsa AAB...

With this change, ssh will put the relevant command, e.g. "orted ...", in the special environment variable SSH_ORIGINAL_COMMAND. If, in our auth-cmd.sh program, we execute the following:

exec singularity exec my.simg bash \
    --login -c "$SSH_ORIGINAL_COMMAND"

then we have a solution which enables us to use a single MPI version, the one installed inside the container, without the need for a matching installation (or indeed, any MPI installation) on the host. For this project we developed scripts to make this exec command select the correct image from the many available on the host as execution begins on each node, allowing one user account to be used for running many types of jobs.

The question may arise, however: can this installation of MPI make proper use of InfiniBand or a proprietary interconnect? The answer is yes. The current images are built with MPICH and can, therefore, take advantage of the ABI compatibility initiative which began in 2013 [mpi]. This means that the .so files created on the host machine and those in the container are binary compatible. Building on this property, the containers set LD_LIBRARY_PATH=/usr/local/mpich/lib, a directory that can be mounted from the host machine to supply the libraries needed by the host machine for MPI. Our base container uses MPICH 3.1.4, which has wide binary compatibility with later versions of MPICH.

C. Jupyter

Although the Agave Platform's reference science gateway, Agave ToGo, provides a wizard and form-driven interface for submitting jobs, our target audience finds the lower-level Python interface made available through Jupyter Notebooks [KRKP+16b] more convenient, particularly for visualization services, which ToGo does not provide.

Jupyter is a tool for interactive computing that sits halfway between a GUI and a command-line tool, offering the best of both worlds for many users. For beginners, using the notebooks to visualize a simulation is as simple as editing parameters (e.g., which time step or color model to use) and re-running a cell. More advanced users can easily copy and modify the code used to manipulate and visualize the data, based on the examples provided in the notebooks. Each model included in the CMR will, therefore, come with one or more notebooks which load data (from ASCII, HDF5, or other formats) and visualize it with matplotlib or paraview-python (other visualization tools may be added over time). The sketch below shows the general shape of such a cell.
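The following is a minimal sketch of a visualization cell, assuming a hypothetical HDF5 output file (output/eta_00030.h5) and dataset name (eta); the CMR's actual notebook code differs from model to model.

# Minimal sketch of a CMR-style visualization cell.
# The file name and the dataset name "eta" are hypothetical;
# actual names depend on the model being run.
import h5py
import matplotlib.pyplot as plt

with h5py.File("output/eta_00030.h5", "r") as f:
    eta = f["eta"][:]              # a 2-D surface elevation snapshot

plt.imshow(eta, origin="lower", cmap="viridis")
plt.colorbar(label="surface elevation (m)")
plt.title("Surface elevation, t = 30 s")
plt.show()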

D. The Agave Platform

One step removed from our container runtimes is the Agave Platform. Just as Docker and Singularity provide a consistent interface for running our notebook code, we need an abstraction to access data and launch jobs within our Singularity containers. Because our project does not provide ongoing compute and storage infrastructure to support a persistent JupyterHub instance, and because we do not provide shared resource allocations to run published containers, it is imperative that anyone in the community be able to run coastal modeling codes from any JupyterHub, Notebook, or Lab environment on any system that can run our community images. By delegating the mechanics of managing application assets, interacting with systems and schedulers, monitoring, data staging and archiving, notifications, and logging to Agave, we can treat each simulation, regardless of size and duration, as an asynchronous process and conceptually deal with it as a high-level step in our experiment. Every aspect of the experiment is recorded both in the notebook and in Agave's provenance trail (including version numbers for the base image, libraries, and model codes) and can be shared, published, and repeated by anyone to whom access is granted.

Our coastal modeling notebooks leverage TACC's public tenant, an independently managed instance of the platform available for anyone in the open science community to use free of charge. In that tenant, we created three execution systems, one representing LSU's Shelob HPC cluster and the other two representing workstations (the latter are used for small runs and testing). Finally, we created Agave apps that accept a single input and either target Shelob's PBS scheduler or simply fork a job outside the scheduler. A sketch of what submitting such a job looks like from Python appears below.
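The sketch below submits a job using the agavepy Python SDK, which is one way to drive Agave from a notebook (the CMR notebooks can equally use the Agave command-line tools). The app ID, storage URL, access token, and callback URL are hypothetical.

# A sketch of job submission through Agave, assuming the agavepy SDK.
# The app ID, storage URL, token, and callback URL are hypothetical.
from agavepy.agave import Agave

ag = Agave(api_server="https://public.agaveapi.co",  # TACC's public tenant
           token="ACCESS_TOKEN")

job = {
    "name": "funwave-shoal-demo",
    "appId": "crc-funwave-tvd-3.1",                       # hypothetical app
    "archive": True,                                      # keep the outputs
    "inputs": {"inputTarball": "agave://my-storage/input.tgz"},
    "notifications": [{                                   # e.g., an SMS relay
        "url": "https://example.org/notify?status=${JOB_STATUS}",
        "event": "*",
        "persistent": True,
    }],
}

resp = ag.jobs.submit(body=job)
print(resp["id"], resp["status"])   # the job can now be polled or forgotten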

E. Putting it All Together

To use the system, the user first starts the Jupyter notebook through our Docker image with docker-compose. A persistent volume is mounted at /home/jovyan inside the container, and the notebook server is started on port 8003 (an arbitrary choice which could be reconfigured with docker-compose). Once the user starts the environment, they configure the notebook, providing it with their Agave credentials. This configuration information persists, removing the need to run the configuration again. Additionally, the ability to run jobs on an HPC resource requires the permission of the owner of a community account for the given resource; these permissions can easily be granted through Agave. The CMR, however, also allows the user to configure such an account on any publicly accessible workstation or cluster that they control (where, minimally, Singularity and the images to be run need to be installed). Example configuration files for different kinds of systems with different batch schedulers (PBS, Slurm, simple forking of jobs, etc.) are provided.

All of the models that we run using the CMR require a complex set of input files with hundreds of configurable parameters. Typically, these are provided through text files with simple grammars, usually sequences of elements of the form "name = value." Displaying such a large number of inputs is impractical for any kind of GUI; the set of parameters exposed for any given run needs to be cut down in some intelligent way. Our observation is that researchers typically focus on changing one small, limited set of parameters at a time, depending on their needs. We have, therefore, introduced a concept of templates. Each application can have one or more sets of input files containing magic strings such as ${value=3.4} or ${option=T/F,default=T}. These are read and parsed dynamically, converted into a simple GUI, and rendered using Jupyter notebook widgets. See Fig. 1 for an example of what such a GUI looks like; a sketch of the parsing appears below. Just before the job is run, the magic strings are converted to appropriate values and the converted files are passed to the application. Researchers can create and upload these templates at any time using the docker cp command, adjusting them as necessary for classes, workshops, or research questions. Alternatively, a researcher can forego the GUI and upload their own hand-crafted input files.

Figure 1. A GUI for constructing input parameters for one of the CMR applications.
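The following minimal sketch, illustrative rather than the CMR's actual implementation, shows how magic strings of the two forms above can be parsed into Jupyter widgets and the chosen values substituted back into the input file.

# Minimal sketch: turn magic strings such as ${value=3.4} or
# ${option=T/F,default=T} into notebook widgets, then substitute the
# chosen values back into the input file. Illustrative only.
import re
import ipywidgets as widgets

MAGIC = re.compile(r"\$\{(\w+)=([^}]*)\}")

def widgets_for(template_text):
    """Create one widget per magic string found in the template."""
    controls = {}
    for name, spec in MAGIC.findall(template_text):
        if ",default=" in spec:                    # e.g. "T/F,default=T"
            choices, default = spec.split(",default=")
            controls[name] = widgets.Dropdown(
                options=choices.split("/"), value=default, description=name)
        else:                                      # e.g. "3.4"
            controls[name] = widgets.FloatText(value=float(spec),
                                               description=name)
    return controls

def fill_template(template_text, controls):
    """Replace each magic string with its widget's current value."""
    return MAGIC.sub(lambda m: str(controls[m.group(1)].value),
                     template_text)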

A design goal of the CMR is to simplify research without limiting what users can do. This is one of the reasons that we use Jupyter notebooks: the full power of the command line and Python remains available for those users who need it and want to use it.

In the next version of this modeling tool, we plan to encode these template parameters as metadata and provide a "Search and Share" interface for discovering previously run jobs for retrieval, reanalysis, or sharing with other users of the CMR, leveraging Agave's ability to provide fine-grained access permissions to the output of runs or sets of runs. In this fashion, the results of calculations by any single user of the system can be studied by any number of others. Using this methodology, we can easily accommodate new applications and allow users of the system to extend the functionality of the existing models by uploading new templates.

To simplify the running of jobs, the Docker/Singularity images that we create have some features in common. All of them, for example, contain a directory called /workdir from which mpi is invoked, and a script named /usr/local/runapp.sh which launches the MPI job. The inputs are sent through Agave to the image as input.tgz, and the results come back as output.tgz. These choices make it possible to use a single standard batch script which receives the image name as a parameter. From Agave's perspective, all jobs, regardless of the image they use, look the same; the notebook, however, has custom code for visualizing computational results.

As jobs progress through the stages QUEUED, RUNNING, FINISHED, etc., Agave calls back to web servers to provide notifications. At present, we make use of PushBullet [Pus] to receive SMS notifications, primarily due to its pervasive use among the authors, but any service accessible through the web could be utilized (a sketch of such a relay appears below).
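For example, a tiny relay along the following lines can turn an Agave callback into a Pushbullet push; the access token and message below are placeholders, and the CMR's actual relay may differ.

# A minimal sketch of relaying an Agave job-status callback to
# Pushbullet; the access token and message are placeholders.
import requests

def push_note(access_token, title, body):
    """Send a Pushbullet 'note' push via their v2 REST API."""
    requests.post(
        "https://api.pushbullet.com/v2/pushes",
        headers={"Access-Token": access_token},
        json={"type": "note", "title": title, "body": body},
        timeout=10,
    )

# Called, e.g., from the web handler that Agave notifies:
push_note("o.XXXXXXXX", "CMR job update", "funwave-shoal-demo is FINISHED")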

The notification service allows a user to submit a job, then walk away or work on other things while waiting for the job to make it through the queue and run. Once the data is ready to analyze, they can return to the notebook, or spin up a new one, and continue working. Alternatively, while waiting for a job to complete, a user can reload data from any previous job and analyze it. Currently, the notebook offers matplotlib, paraview-python, and ParaView itself (though this last option must run through X windows rather than inside a notebook) as visualization tools.

Fig. 2 shows how all these pieces are connected: Singularity provides the interface to the HPC cluster, Docker and Jupyter provide the user interface, and Agave sits at the center, orchestrating the movement of files, the starting and monitoring of jobs, and the sending of notifications through SMS.

Figure 2. Schematic of the CMR workflow.

IV. RESULTS

The system described above incorporates a number of open-source numerical models commonly used by the coastal science and engineering community. For demonstration purposes, we show the following applications: (1) simulation of ocean waves generated by Hurricane Isaac (2012) in the Gulf of Mexico (GoM) using SWAN (Simulating WAves Nearshore [SWA]); (2) simulation of nearshore nonlinear wave propagation using FUNWAVE-TVD (the TVD version of the FUlly Nonlinear WAVE model [FUN]); (3) an alternate implementation of the FUNWAVE-TVD code using the Cactus Framework [GAL+02], which is able to make use of AMR [CBCS17]; and (4) simulation of the dynamic solitary wave force exerted upon a bridge deck using OpenFOAM (Open source Field Operation And Manipulation [Ope]).

Deploying each of these applications followed the same four steps. First, the Docker images of the models (SWAN, FUNWAVE, etc.) were generated. Second, a Jupyter notebook that incorporates the functionality of Agave as well as data visualization was developed. Third, a Docker image of the Jupyter notebook was created. Fourth, Singularity images were generated from the Docker images on the target machines. For our applications, a user can simply download the Jupyter notebook image, modify the commonly used parameters in the input file of the model, run numerical simulations, and visualize the model results. All of this happens without any fixed, centralized client-facing server and without any fixed HPC resources on which to run. Even the Agave tenant that we use (currently hosted at TACC) is configured by the notebooks; the CMR could, in principle, be switched to use another tenant elsewhere.

A. Modeling of Hurricane-Generated Waves Using SWAN

SWAN is a spectral wave model based on the wave energy (action) balance equation that computes random, short-crested, wind-generated waves. It accounts for a range of physics, including wave propagation (shoaling and refraction), wave generation by wind, three- and four-wave interactions, energy dissipation due to steepness-limited and depth-limited wave breaking, bottom friction and aquatic vegetation, etc. In this demonstration, SWAN is applied to simulate the wave field in the GoM during Hurricane Isaac in 2012. The hurricane wind field and the bathymetry at the computational grids are prepared by the user. Fig. 3(a) shows the graphical input interface generated by the Jupyter notebook, where users can upload the input file, edit the parameters in the input file (e.g., the time step), and choose the format of the output. Fig. 3(b) exhibits the track of Hurricane Isaac (2012) and the bathymetry of the GoM. Fig. 3(c) presents the modeled significant wave heights in the study area, visualized through the Jupyter notebook.

Figure 3. (a) Graphical input interface generated by the Jupyter notebook, (b) track of Hurricane Isaac (2012), and (c) modeled significant wave heights in the GoM.

B. Modeling of Nearshore Wave Processes Using FUNWAVE-TVD

FUNWAVE-TVD is a phase-resolving wave model based on the fully nonlinear Boussinesq equations [Che06]. It is capable of modeling nearshore nonlinear wave processes and breaking-generated circulation, as well as tsunami waves at both regional and coastal scales. It employs a shock-capturing numerical scheme, a wetting-drying moving boundary condition, and a static sponge boundary condition [LD83]. In this demonstration, FUNWAVE-TVD is applied to simulate wave propagation over an elliptic shoal resting on a plane beach (slope 1/50). The bottom topography, shown in Fig. 4(a), is prepared by the user. Fig. 4(b) presents a snapshot of the surface elevation at 30 s, visualized through the Jupyter notebook.

Figure 4. Contour plots of (a) water depth and (b) surface elevation in the study area.

C. Modeling of Nearshore Wave Processes Using CaFunwave

CaFunwave is an implementation of FUNWAVE-TVD using the Cactus Framework [GAL+02]. While the two codes overlap greatly in formulation, they have some non-overlapping capabilities. In particular, CaFunwave adds the possibility of using adaptive mesh refinement for some problems [CBCS17], as well as more efficient I/O schemes based on HDF5. The CaFunwave implementation is, however, based on an older version of the original FUNWAVE-TVD and misses much of the functionality present in FUNWAVE-TVD, including the ability to model ship wakes and bedload sediment transport.

CaFunwave was employed to simulate wave attenuation by coastal wetlands during Hurricane Isaac (2012). As with the Jupyter notebook for FUNWAVE-TVD, the user prepares the bathymetry and other input files. Fig. 5 shows the modeled wave field over the wetlands with and without vegetation. It is seen that wetland vegetation dissipates a considerable amount of wave energy.

Figure 5. Modeled wave fields over (a) vegetated wetland and (b) non-vegetated wetland by CaFunwave; (c) the study site and the wetland under normal conditions.

D. Modeling of Solitary Wave Force Exerted on a Bridge Deck Using OpenFOAM

OpenFOAM is an open-source C++ toolbox for solving the Navier-Stokes equations of fluid dynamics. In this demonstration, OpenFOAM is applied to simulate the dynamic wave force exerted upon a bridge deck in a wave flume. Geometrical properties of the numerical model, solitary wave properties, and turbulence properties are set by the user. A snapshot of the wave-deck interaction, rendered using ParaView (Parallel Visualization Application [AGL05]), is shown in Fig. 6(a). In Fig. 6(b), the numerical results for vertical wave forces are compared with the results of a similar experimental setup conducted by French [FRE69]. In this figure, the vertical and horizontal axes are normalized by F_s and √(g/d), respectively, where g is the gravitational acceleration and d is the water depth. As illustrated in Fig. 6(c), F_s represents the weight of water in the approaching solitary wave above the deck platform. The uplift wave forces predicted by the OpenFOAM numerical model are in good agreement with the experimental data.

Figure 6. (a) Snapshot of wave-deck interaction at t = 4.75 s; (b) comparison of numerical and experimental uplift wave forces; (c) weight of water in the approaching solitary wave above the deck platform [HX09].

V. SUSTAINABILITY

The CMR will provide and maintain a minimal operational system image to synchronize with upstream development by the open source community for bug fixes, performance improvements, and feature enhancements. It will also automatically check community code repositories, such as the Community Surface Dynamics Modeling System (CSDMS [CSD]) and DesignSafe-CI [Des], as well as the download page of each model hosted on the CMR. Once a new release of a model is detected, the CMR will compile and test the model to see whether a new standard image can safely be created and made available through the tool (although it will still be possible to run or re-run older versions of the models). With the help of the CMR, a coastal researcher will be able to start running state-of-the-art models on the latest cloud-ready computing systems in minutes, as well as verify earlier calculations on older, tried-and-true models and images. Workflow management tools will be able to take advantage of the CMR to quickly deploy coastal models on academic and commercial cloud platforms while continuing their support on traditional HPC systems. With community involvement, we will build a software ecosystem for coastal models around the CMR to better serve coastal researchers and to speed up scientific discovery and decision making.

VI. CONCLUSION

For years, many coastal scientists and engineers who did not run large-scale applications of numerical models regularly were inhibited by the effort needed to gain the specialized knowledge required to use HPC resources effectively in their research. Their time could be better spent on their research if they did not have to worry about how to compile software and run their applications in different HPC environments. Even more computer-literate coastal researchers tend to focus on the few machines with which they are most familiar. They, too, could spend their time in better ways than keeping track of which version of which piece of software is deployed on which HPC resource. All coastal scientists and engineers can benefit from a system that not only makes all machines look the same, so that they can easily switch from one system to another, but also encapsulates and tracks the provenance information of experiments, making it easier to understand and reproduce previous results.

But coastal science is not the limit. The system we have developed for this purpose is general in its construction. Codes from other fields of science could, in principle, be added to the collection as easily as the coastal codes that are our focus. In the future, we intend to explore the possibilities of this generalization.

The second component of our Coastal Model Repository (CMR), the collection of images targeting cloud and cloud-like architectures, has value independent of the tool we have constructed to use them. This collection will serve as a community repository for standard open source models that are widely used by coastal researchers. While the source code for the various models and libraries will always be publicly available, the CMR collection will introduce a standard, containerized build and method of distribution for these coastal models, which can run on any cloud-like architecture directly and with negligible system overhead or effort.

VII. ACKNOWLEDGMENTS

We are pleased to acknowledge NSF Award 1856359 [Cyb] and the Louisiana State High Performance Computing facility, including the clusters SuperMike and Shelob.

REFERENCES

[AGL05] James Ahrens, Berk Geveci, and Charles Law. ParaView: An end-user tool for large data visualization. The Visualization Handbook, 717, 2005.

[CBCS17] Agnimitro Chakrabarti, Steven R. Brandt, Qin Chen, and Fengyan Shi. Boussinesq modeling of wave-induced hydrodynamics in coastal wetlands. Journal of Geophysical Research: Oceans, 122(5):3861–3883, 2017.

[Che06] Qin Chen. Fully nonlinear Boussinesq-type equations for waves and currents over porous beds. Journal of Engineering Mechanics, 132(2):220–230, 2006.

[CMZ+18] Brian D. Corrie, Nishanth Marthandan, Bojan Zimonja, Jerome Jaglale, Yang Zhou, Emily Barr, Nicole Knoetze, Frances M. W. Breden, Scott Christley, Jamie K. Scott, et al. iReceptor: A platform for querying and analyzing antibody/B-cell and T-cell receptor repertoire data across federated repositories. Immunological Reviews, 284(1):24–41, 2018.

[CSD] CSDMS model repository. https://csdms.colorado.edu/wiki/Model_download_portal.

[Cyb] NSF Award 1856359. https://www.nsf.gov/awardsearch/showAward?AWD_ID=1856359&HistoricalAwards=false.

[DBF18] Rion Dooley, Steven R. Brandt, and John Fonner. The Agave platform: An open, science-as-a-service platform for digital science. In Proceedings of the Practice and Experience on Advanced Research Computing, page 28. ACM, 2018.

[Des] DesignSafe-CI. https://www.designsafe-ci.org/.

[FRE69] J. A. French. Wave uplift pressures on horizontal platforms (analysis of positive and negative hydrodynamic pressure variation effects on horizontal piers and platforms). 1969.

[FUN] FUNWAVE. https://fengyanshi.github.io/build/html/index.html.

[GAL+02] Tom Goodale, Gabrielle Allen, Gerd Lanfermann, Joan Massó, Thomas Radke, Edward Seidel, and John Shalf. The Cactus framework and toolkit: Design and applications. In International Conference on High Performance Computing for Computational Science, pages 197–227. Springer, 2002.

[GJK+06] Tom Goodale, Shantenu Jha, Hartmut Kaiser, Thilo Kielmann, Pascal Kleijer, Gregor von Laszewski, Craig Lee, Andre Merzky, Hrabri Rajic, and John Shalf. SAGA: A Simple API for Grid Applications. High-level application programming on the Grid. Computational Methods in Science and Technology, 12(1):7–20, 2006.

[HVM+14] Matthew R. Hanlon, Matthew Vaughn, Stephen Mock, Rion Dooley, Walter Moreira, Joe Stubbs, Chris Town, Jason Miller, Vivek Krishnakumar, Erik Ferlanti, et al. The Arabidopsis information portal: An application platform for data discovery. In 2014 9th Gateway Computing Environments Workshop, pages 38–41. IEEE, 2014.

[HVM+15] Matthew R. Hanlon, Matthew Vaughn, Stephen Mock, Rion Dooley, Walter Moreira, Joe Stubbs, Chris Town, Jason Miller, Vivek Krishnakumar, Erik Ferlanti, et al. Araport: An application platform for data discovery. Concurrency and Computation: Practice and Experience, 27(16):4412–4422, 2015.

[HX09] Wenrui Huang and Hong Xiao. Numerical modeling of dynamic wave force acting on Escambia Bay bridge deck during Hurricane Ivan. Journal of Waterway, Port, Coastal, and Ocean Engineering, 135(4):164–175, 2009.

[jup] Jupyter spawner. https://github.com/jupyterhub/dockerspawner.

[KRKP+16a] Thomas Kluyver, Benjamin Ragan-Kelley, Fernando Pérez, Brian E. Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica B. Hamrick, Jason Grout, Sylvain Corlay, et al. Jupyter notebooks: A publishing format for reproducible computational workflows. In ELPUB, pages 87–90, 2016.

[KRKP+16b] Thomas Kluyver, Benjamin Ragan-Kelley, Fernando Pérez, Brian E. Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica B. Hamrick, Jason Grout, Sylvain Corlay, et al. Jupyter notebooks: A publishing format for reproducible computational workflows. In ELPUB, pages 87–90, 2016.

[KSB17] Gregory M. Kurtzer, Vanessa Sochat, and Michael W. Bauer. Singularity: Scientific containers for mobility of compute. PLoS ONE, 12(5):e0177459, 2017.

[LD83] Jesper Larsen and Henry Dancy. Open boundaries in short wave simulations: A new approach. Coastal Engineering, 7(3):285–297, 1983.

[LGD15] Carol M. Lushbough, Etienne Z. Gnimpieba, and Rion Dooley. Life science data analysis workflow development using the BioExtract Server leveraging the iPlant Collaborative cyberinfrastructure. Concurrency and Computation: Practice and Experience, 27(2):408–419, 2015.

[Mer14] Dirk Merkel. Docker: Lightweight Linux containers for consistent development and deployment. Linux Journal, 2014(239):2, 2014.

[Mil17] Michael Milligan. Interactive HPC gateways with Jupyter and JupyterHub. In Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact, PEARC17, pages 63:1–63:4, New York, NY, USA, 2017. ACM.

[MKEKJ12] Sharath Maddineni, Joohyun Kim, Yaakoub El-Khamra, and Shantenu Jha. Distributed Application Runtime Environment (DARE): A standards-based middleware framework for science gateways. Journal of Grid Computing, 10(4):647–664, December 2012.

[mpi] MPICH ABI compatibility initiative. http://www.mpich.org/abi/.

[Ope] Open source Field Operation And Manipulation. https://openfoam.com/.

[Pus] The Pushbullet service. https://www.pushbullet.com/.

[RDG+07] Michael Russell, Piotr Dziubecki, Piotr Grabowski, Michał Krysiński, Tomasz Kuczyński, Dawid Szejnfeld, Dominik Tarnawczyk, Gosia Wolniewicz, and Jarosław Nabrzyski. The Vine Toolkit: A Java framework for developing grid applications. In Parallel Processing and Applied Mathematics, Lecture Notes in Computer Science, pages 331–340. Springer, Berlin, Heidelberg, September 2007.

[SCF+15] Craig A. Stewart, Timothy M. Cockerill, Ian Foster, David Hancock, Nirav Merchant, Edwin Skidmore, Daniel Stanzione, James Taylor, Steven Tuecke, George Turner, et al. Jetstream: A self-provisioned, scalable science and engineering cloud environment. In Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure, page 29. ACM, 2015.

[SWA] SWAN spectral wave model. http://swanmodel.sourceforge.net/.

[TCC+17] Rollin Thomas, Shane Canon, Shreyas Cholia, Lisa Gerhardt, and Evan Racah. Toward interactive supercomputing at NERSC with Jupyter. In Cray User Group (CUG) Conference Proceedings. Cray User Group (CUG), 2017.

[YLP+17] Dandong Yin, Yan Liu, Anand Padmanabhan, Jeff Terstriep, Johnathan Rush, and Shaowen Wang. A CyberGIS-Jupyter framework for geospatial analytics at scale. In Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact, page 18. ACM, 2017.


Steven Brandt obtained his Ph.D. from the University of Illinois at Urbana-Champaign for his research in numerical simulations of rotating black holes. He currently serves as Assistant Director at the Center for Computation and Technology at Louisiana State University and as an adjunct faculty member in the Department of Computer Science & Engineering. He is involved in research into making parallel programming more effective. He works with the STEllAR team led by Dr. Hartmut Kaiser, and helps lead the Cactus Frameworks effort. He is PI on grants relating to Cactus Frameworks development and cyberinfrastructure.


Shuai Yuan was born and raised in Sichuan, China. He is a second-year M.S. student in the Division of Computer Science and Engineering at Louisiana State University. Prior to arriving at LSU, he earned a master's degree at the University of Electronic Science and Technology of China in 2016, and a B.S. degree at Chengdu University of Technology in 2013. He currently works on the Coastal Resilience Research project with Dr. Brandt at the Center for Computation and Technology at Louisiana State University.


Dr. Qin Chen is a Professor of Civil & Environmental Engineering and Marine & Environmental Sciences at Northeastern University. He specializes in the development and application of numerical models for coastal dynamics, including ocean waves, storm surges, nearshore circulation, fluid-vegetation interaction, and sediment transport and morphodynamics. His research includes field experiments and application of remote sensing and high-performance computing technologies to solve engineering problems. He has been leading the Coastal Resilience Collaboratory funded by the NSF CyberSEES award.


Ling Zhu is an Associate Research Scientist at the Department of Civil and Environmental Engineering, Northeastern University. She finished her undergraduate study in 2006 at East China Normal University with a major in Mathematics and Applied Mathematics, and obtained her M.S. degree in Computational Mathematics in 2009 from the same university. In 2015, she received her Ph.D. degree in Civil and Environmental Engineering from Louisiana State University. Her research activities involve developing numerical wave models, investigating wave-vegetation interactions in nearshore areas, modeling storm surges, etc.


Reza Salatin is a PhD student in the Department of Civil Engineering at Northeastern University. He received his B.Sc. in Civil Engineering from the University of Tabriz, Iran, in 2015 and his M.Sc. in Structural Engineering from the Middle East Technical University, Turkey, in 2018. His research interests include structural optimization, structural health monitoring, wave-structure interaction, coastal engineering and nearshore processes.


Rion Dooley is principal investigator on the Agave Project, a Science-as-a-Service API platform allowing researchers worldwide to manage data, run code, collaborate freely, and integrate their science anywhere. His previous projects span areas of identity management, distributed web security, full-stack application development, data management, cloud services, and high performance computing. He earned his Ph.D. in computer science from Louisiana State University. Rion actively puts his wife and two daughters at the top of his list of accomplishments. He hopes his work can someday edge out dancing teddy bears and smear-proof lipstick on their lists of favorite inventions.