Fusion Engineering and Design 87 (2012) 2045–2051
Web-based (HTML5) interactive graphics for fusion research and collaboration

E.N. Kim ∗, D.P. Schissel, G. Abla, S. Flanagan, X. Lee

General Atomics, P.O. Box 85608, San Diego, CA, USA
Article info

Article history: Available online 24 April 2012

Keywords: Web; HTML5; Interactive graphics; Data visualization
Abstract

With the continuing development of web technologies, it is becoming feasible for websites to operate much like scientific desktop applications. This has opened up more possibilities for using the web browser for interactive scientific research and has provided new means of on-line communication and collaboration. This paper describes the research and deployment of these enhanced web graphics capabilities in fusion research tools, which has led to a general toolkit that can be deployed as required. The toolkit allows users to dynamically create, interact with, and share the large sets of data generated by fusion experiments and simulations. Hypertext Preprocessor (PHP), a general-purpose scripting language for the Web, is used to process a series of inputs and to determine the data source types and locations from which to fetch and organize the data. Protovis, a JavaScript and SVG based web graphics package, then quickly draws the interactive graphs and makes them available to a worldwide audience. This toolkit has been deployed in both the simulation and experimental arenas. The deployed applications are presented, along with the architecture and technologies used in producing the general graphics toolkit.

© 2012 Elsevier B.V. All rights reserved.
1. Introduction

The Web was established on the principle of Universal Readership: “once information is available, it should be accessible from any type of computer, in any country, and an (authorized) person should only have to use one simple program to access it” [1]. About twenty years later, the Web has evolved significantly to serve as a global data sharing space for a broad range of users. It began as a collaboration tool for a group of scientists at CERN; there are now an estimated 1.9 billion users worldwide [2], and its supporting technologies continue to advance rapidly. The ubiquity of the client and the easy accessibility of other data sources make the Web a very attractive means of information sharing.

For over a decade, the Web has played a big role in documentation, monitoring and collaboration in the magnetic fusion community. Tools such as the DIII-D experimental web-portal [3], Alcator C-MOD’s electronic logbook [4] and the monitoring web-portal for the Center for Simulation of RF Wave Interactions with Magnetohydrodynamics (SWIM) (refer to “Interactive Monitoring Portal for Fusion Simulations” by G. Abla, to be published in Fusion Engineering and Design) use recent web technologies to allow users to remotely monitor and participate in ongoing experiments and simulations in real time. Open source packages such as MediaWiki, WordPress, and Bugzilla have been popular and actively used for collaborative writing and tracking application
∗ Corresponding author: E.N. Kim.
0920-3796/$, see front matter. © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.fusengdes.2012.03.041
enhancements. Overall, the Web has provided fusion scientists with an extremely useful and accessible way to catalogue and share information about their research. However, since this information is provided mostly as text or static images, scientists must still rely on external applications to investigate their data graphically.

With the recent advancements in web technologies, it has become more feasible for websites to operate like desktop applications. However, these developments have introduced limitations that constrain the idea of Universal Readership. First, it is common for web pages to require client-side plugins to display properly, and it is usually up to the users to download, install and maintain those plugins. Second, there has been a significant increase in the use of mobile platforms, which often behave differently and vary in the plugins they support. Web pages that require certain plugins may simply fail on some mobile platforms. Plugins and mobile platforms have thus added new constraints to the philosophy of one simple program.

Technologies such as HTML5, Scalable Vector Graphics (SVG), the third revision of Cascading Style Sheets (CSS3) and JavaScript help avoid such limitations. They can produce highly interactive, graphics-dense web pages without a browser plugin, so the pages can also be displayed on any mobile platform that supports a modern browser. These graphics can be rendered on the client-side with dynamic data collected from the server-side, while the overall interface uses the features provided by both ends. This has opened up substantially more possibilities for utilizing the Web for
E.N. Kim et al. / Fusion Engineering and Design 87 (2012) 2045–2051
Fig. 1. An example of a multi-dimensional signal plot showing interactive features: toggle dimensions, time slice bar, tracking x/y values, pan and zoom.
rich, interactive scientific research, while simultaneously providing new means of communication and collaboration through the Web.

2. Approach

Due to the large volume of information collected during experiments and simulations, data visualization tools are critical for magnetic fusion researchers. At DIII-D, tools like ReviewPlus [5] have been developed and used for more than a decade, providing interactive 2D and 3D graphs. The application’s user interface and capabilities have been enhanced with user feedback over time. Given this rich history, much of ReviewPlus’s general interface design was considered when deploying interactive visualization on the Web. Although the rich, fine details of stand-alone applications cannot all be reproduced with the current state of web technology, some of their basic features can easily be adopted on the Web. Examples are shown in Fig. 1, including a slice bar for browsing data in time or space, a button to toggle the dimensions of multi-dimensional data, and other essential interactions such as crosshair, zoom and pan.
3. Architecture design and technologies used

Interactive visualization on the Web should offer a way of monitoring and processing complex experimental or simulation data quickly and effortlessly. This capability needs to be general and modular enough to enhance any existing web page, from simple overview pages to more complex web-portals. It must also work with the present computational environment to collect the necessary data. These requirements can be met by combining client-side and server-side Web technologies. Fig. 2 shows the workflow between the two sides. The overall process is a sequence of three stages: (1) Input processing, (2) Data collection, and (3) Graphics display.

3.1. Input processing

The input process schema can vary depending on the individual website’s purpose and design. In some cases, the user is unaware of this process and the values are determined by the hyperlink path he or she has taken to visualize the data. For
Fig. 2. An overview diagram showing the workflow sequence.
example, a user arrives at a visualization page by clicking on a series of links, where each click narrows down to more specific information he or she is looking for. By the end of this exploration, the page has gathered all the input necessary to create the graphical output. On the other hand, some web pages prompt the user for a set of input data. This case includes pages that mimic external visualization applications, where the user’s specific input directly drives the output. Regardless of how they are defined, the inputs are critical because they establish the desired data’s location and type.

For our deployment, PHP is the key server-side technology where the input processing is done. Its ability to integrate smoothly across various platforms, software and data sources allows this scripting language to play a very important role in connecting the scientific data and the client-side interface. After PHP accepts the input parameters, it verifies their validity and determines the requested data’s location and type by communicating with numerous data acquisition and database storage systems.

3.2. Data collection

Most of the experimental and simulation data can be accessed with MDSplus or SQL. MDSplus is a set of software tools for data acquisition and storage and a methodology for management of complex scientific data. The system was specifically designed to enable users to easily construct complete and coherent data sets with a few basic commands, simplifying data access even into complex structures [6]. Both SQL and MDSplus provide an extension for PHP, which is used to determine further information about the requested data, such as the most efficient path to the data and the data’s dimensionality. Once all the prerequisites are confirmed, PHP makes its final connection to the appropriate storage system to collect the requested data. It then organizes this dynamically generated data and passes it back to the client-side.
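The server-side steps of Sections 3.1 and 3.2 can be sketched end to end. The deployment described here uses PHP together with MDSplus and SQL; the Python below is only an illustrative stand-in, and every name in it (`PlotRequest`, `parse_request`, `handle_request`, the `source`/`signal`/`shot` parameter keys, the signal naming rule) is a hypothetical choice rather than part of the toolkit. The data sources are passed in as plain callables so that no real MDSplus or SQL call is assumed. The final step applies a uniform subsample sized to the plot width, which is the kind of pixel-based reduction this section goes on to describe.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]
# A fetcher takes (signal, shot) and returns (x, y) pairs; hypothetical interface.
Fetcher = Callable[[str, int], List[Point]]

# Assumed naming rule for signals; a real facility may allow other characters.
SIGNAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_:.]*")


@dataclass(frozen=True)
class PlotRequest:
    source: str  # hypothetical labels: "mdsplus" or "sql"
    signal: str
    shot: int


def parse_request(params: Dict[str, str]) -> PlotRequest:
    """Validate raw request parameters before any data source is contacted."""
    source = params.get("source", "mdsplus").lower()
    if source not in ("mdsplus", "sql"):
        raise ValueError(f"unknown data source: {source!r}")
    signal = params.get("signal", "").strip()
    if not SIGNAL_NAME.fullmatch(signal):
        raise ValueError(f"invalid signal name: {signal!r}")
    shot_text = params.get("shot", "").strip()
    if not re.fullmatch(r"[0-9]+", shot_text) or int(shot_text) == 0:
        raise ValueError(f"invalid shot number: {shot_text!r}")
    return PlotRequest(source, signal, int(shot_text))


def uniform_subsample(points: Sequence[Point], max_points: int) -> List[Point]:
    """Keep at most max_points evenly spaced samples, including both end points."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    n = len(points)
    if n <= max_points:
        return list(points)
    step = (n - 1) / (max_points - 1)
    return [points[round(i * step)] for i in range(max_points)]


def handle_request(
    params: Dict[str, str], fetchers: Dict[str, Fetcher], plot_width_px: int
) -> List[Point]:
    """Validate the input, fetch from the matching source, then reduce the data."""
    request = parse_request(params)
    data = fetchers[request.source](request.signal, request.shot)
    return uniform_subsample(data, plot_width_px)
```

Calling `handle_request({"signal": "bt", "shot": "111203"}, fetchers, 800)` with a dictionary of fetchers would return no more points than the assumed 800-pixel plot width. Injecting the fetchers keeps the validation and reduction logic independent of how a particular facility exposes its data.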
To ensure user interactivity with minimal latency, the number of data points is compared to the number of screen pixels available to render the graph. Sending more data than the screen can display is wasteful: it increases processing time, network traffic and client-side memory use. To avoid these bottlenecks, a data decimation algorithm has been written in MDSplus’ TDI language so that only a sub-sampled dataset is passed to and rendered on the client-side.

3.3. Graphics display

HTML5 Canvas and SVG are the two options presently available for producing rich graphics on the Web without a plugin. HTML has been the core language of the Web from the beginning, and its design continues to be adapted as other Web technologies and the usage of the Web progress. HTML5 is the fifth major revision of HTML and introduces new features to help the development of web applications [7]. Canvas was introduced in HTML5: it is a low-level, procedural model that updates a bitmap and does not have a built-in scene graph [8]. SVG, whose development began in 1999, is also part of HTML5. It is at a higher level than Canvas and is XML-based, which allows better access to the graphical elements via the Document Object Model (DOM) [9]. This ability to interact with the client-side through the DOM makes SVG a more attractive solution for web-based graphics.

Web graphics can be rendered at any desired point: when a page is loaded, when a document object is clicked, or when any other user action event recognizable by JavaScript occurs. When the request event is made, (1) the input is processed and (2) the data is collected and pushed back to the client-side. Once received, the data is ready to be rendered in the web browser. The main technology that completes this
sequence to produce the interactive graphics on the Web is Protovis [10], a JavaScript-based data visualization library. It uses SVG for web-native visualization to render the data in the final interactive graphical form. Protovis supports interactive functionality such as slider bars, zoom, pan and crosshair, re-rendering the graphics quickly on user action events as needed.

As the data is collected and pushed back to the client-side, PHP also dynamically produces the appropriate Protovis code to send back to the client. While processing the user input and the retrieved data, PHP determines the configuration of the graphics, such as the total number and types of graphs and their corresponding interactions. It also calculates the minimum and maximum x and y values, which are later used to determine the range of the axes. It decides whether the graphs are to be overlaid or displayed individually and defines the user interactions and their corresponding event triggers. Using JavaScript’s event handlers, the dynamically created Protovis code provides highly interactive output to the users, such as the multi-dimensional time slice bar as well as pan, zoom and crosshair.

4. Applications

The general capability for interactive web-based graphics has been deployed in both the simulation and experimental areas of magnetic fusion research. The tools implemented include different levels of user interaction, from a simplified overview page for real-time monitoring to a highly dynamic and interactive page for further analysis. The initial user interfaces have been adapted to fit the needs of the corresponding audience, but the main engine driving the graphics remains the same throughout the applications.

4.1. Electronic logbook

The electronic logbook was developed by Alcator C-MOD and allows users to systematically input, organize and share text comments in real time.
As data visualization tools are critical to researchers in analyzing large amounts of data efficiently, an image embedded within their notes can help elucidate the author’s idea. The electronic logbook has been updated for DIII-D to process two types of images: manually uploaded static images and dynamically created signal plot images. Fig. 3 shows the updated view of the electronic logbook’s output and the Add/Edit window.

“Insert Image” and “Insert MDSplus Plot” sections have been added to the application’s Add/Edit window to support the new capability. The “Insert Image” section provides a way for each user to upload images to his or her individual image library. Uploaded images are placed on the server and their corresponding metadata are entered into the relational database. The upload timestamp is attached to each image’s name to prevent file name collisions. “Insert MDSplus Plot” accepts “Signal Name” and “Shot #”, the two fields MDSplus requires for fetching the visualization data. This input combination is verified and then inserted into the relational database. A Python script using matplotlib is launched to create a static thumbnail image of the requested plot.

Once the appropriate input is provided, both static and dynamic plot images can be embedded anywhere within the entry’s text with wiki-like syntax. For example, “[[Image:test.20110501.jpg]]” will embed “test.20110501.jpg”, and “[[Signal:bt|Shot:111203]]” will visualize the data for signal BT for shot 111203. The “Insert Image” and “Embed Plot” buttons create the wiki-like syntax and append it to the end of the logbook entry when clicked. When a logbook’s main display page is loaded, thumbnail images will be displayed wherever Image or Signal tags are
Fig. 3. The updated electronic logbook interface allows users to embed static images and dynamic MDSplus plots within the entry.
specified. For Image tags, a thumbnail of the uploaded image will appear. Clicking on the thumbnail will open a full-size view of the uploaded image in a new window. For Signal tags, a static plot image will appear if the Python script successfully generated the thumbnail. If the script was not able to produce an image, a generic placeholder thumbnail will be displayed. When the cursor is placed over the generic thumbnail for the plot, a tool tip will appear with the signal name and shot number. Clicking on either thumbnail image will open an interactive view of the signal plot in a new window.

4.2. Simulation overview real-time data plots

The SWIM monitoring website provides scientists with a way to monitor their code runs rapidly and efficiently. With runs organized by run ID number, users can quickly scan through the latest code runs and the related metadata, such as the responsible person, code name, status and wall time. Each run ID also provides a detailed
view, listing the individual code’s log output. Fig. 4 shows the two variations of the web-based graphics which have been added to this detailed view. The first is a lightweight overview page of all available plots, without any interaction, so that scientists can rapidly visualize their data. Clicking on a plot leads to an interactive view of the selected plot, offering zoom, pan, crosshair and the slider bar when necessary. The second is a downloadable PDF file of all available plots that scientists can quickly view. This PDF file is produced by a Python script that simply prints the overview page to a PDF file and places it on the web server.

These real-time overview plots are made possible by the data that is continually made available to the SWIM MDSplus data system. The SWIM SQL database catalogues metadata on all stored runs and is queried for ongoing runs every few minutes. When ongoing runs are found, their data files are downloaded, read, and loaded into MDSplus by a Python script. The data files are regularly updated with new data as the run progresses, so the MDSplus storage also reflects the latest data snapshot until the run is completed.
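The polling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SWIM implementation: the `runs` table, its `run_id` and `status` columns, the `"running"` status value and the five-minute default interval are all hypothetical, `sqlite3` merely stands in for the SWIM SQL database, and the download-and-load-into-MDSplus step is represented by a caller-supplied `load_run` function rather than any real MDSplus call.

```python
import sqlite3
import time
from typing import Callable, List


def find_ongoing_runs(db: sqlite3.Connection) -> List[int]:
    """Return the IDs of runs whose metadata marks them as still in progress."""
    rows = db.execute(
        "SELECT run_id FROM runs WHERE status = ? ORDER BY run_id",
        ("running",),  # hypothetical status value
    ).fetchall()
    return [row[0] for row in rows]


def poll_once(db: sqlite3.Connection, load_run: Callable[[int], None]) -> List[int]:
    """Refresh the stored data for every ongoing run and report which runs were seen."""
    run_ids = find_ongoing_runs(db)
    for run_id in run_ids:
        # load_run is assumed to download the run's data files and store the
        # latest snapshot, overwriting what an earlier poll stored.
        load_run(run_id)
    return run_ids


def poll_forever(
    db: sqlite3.Connection,
    load_run: Callable[[int], None],
    interval_seconds: float = 300.0,  # "every few minutes"; the exact value is assumed
) -> None:
    """Repeat the poll at a fixed interval until the process is stopped."""
    while True:
        poll_once(db, load_run)
        time.sleep(interval_seconds)
```

Because each poll reloads every ongoing run, the stored data follows the run until its status changes, at which point the run simply stops appearing in the query.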
Fig. 4. SWIM monitoring website provides real-time, interactive signal plots as soon as the data becomes available.
4.3. Experimental and simulation signal plot analysis

Signal plots are used to show how a signal variable changes over time or space. They are one of the most essential ways to analyze large amounts of data quickly and efficiently. The signal plot application is a new addition for both the experimental and the simulation teams and is based on the design of ReviewPlus, an IDL-based external visualization tool. Fig. 5 shows the user input form and its corresponding result.

The user input form accepts a set of specific information: Data location, Signal name, New box and Shot number. Data location specifies the MDSplus server. As long as the firewall allows access and the data format is consistent, this application can connect to and fetch data from other facilities. New box is a checkbox that determines whether the entry is drawn in a new box rather than overlaid on top of the previous plot. This new application lets scientists visually compare and analyze experimental data from their own institution as well as from other labs.

4.4. Experimental relational database scatter plot

Scatter plots are used to show how much one variable is affected by another. For our deployment, this is a general purpose plotting tool which makes multiple 2D plots of data from a relational database. It accepts a series of inputs from the user: the database server,
the database name and the table name. Once this information is entered, all numeric column names of the selected table are offered in two separate dropdowns, x and y. Once the form is submitted, the dataset is fetched, pushed back to the client and then rendered. Users can also download the data as a CSV file. Fig. 6 shows the user input form with dropdown options for the x and y values and each field’s range, together with the form’s resulting scatter plot. This particular result is color and symbol coded to differentiate a specific value in the database.

5. Future work and summary

The Web has always played the role of a visual medium, allowing users to share information and data in the form of text and graphics. Its ability to easily reach a wide audience has been a key factor in its function in collaboration. However, its capability to allow user interaction has been fairly limited. Recent advancements supporting highly interactive data visualization on the Web have opened significantly more possibilities for this valuable tool and made it a more essential part of the scientific process.

The general web-based interactive graphics mechanism has been developed and deployed for the experimental and the simulation researchers in the fusion community. Its modular design allows for rapid deployment to multiple web pages and web
Fig. 5. Signal plot analysis plotting multiple SWIM signals, a DIII-D signal dataset and a signal dataset from CMOD.
applications. As long as the data is stored in an organized format with a well-defined API, it should be fairly simple to adapt this capability for other applications.

The data source connection methodology will need to be explored. With the web-based graphics, the connection to the data source is closed after each request. ReviewPlus keeps this connection open until the application is closed entirely. Although both tools connect, fetch
and display the data at a similar level of performance, ReviewPlus can process subsequent requests faster than the first, in which the initial connection is made. Because of the cost of connecting to the data source on every request, the web-based graphics is slower than ReviewPlus when processing multiple requests.

The general data handling mechanism will continue to be reviewed so that data is delivered as efficiently and effectively as possible. For example, the current method for reducing the number of data points to fit the available pixels on the screen takes a uniform subsample of the whole dataset. A more intelligent data decimation algorithm will be implemented so that significant outliers missed by the uniform subsample are considered and made available for the initial display and during user interaction.

During the initial deployment period, users have already requested the addition of the web-based graphics to existing web pages such as the DIII-D summaries pages. The overall mechanism will continue to be supported and applied to existing and new pages as needed and requested. Also, as the technology continues to grow, new features and supporting tools will continue to be tested. New graphics libraries such as WebGL [11] will be reviewed and investigated.

Acknowledgments
Fig. 6. Two numeric columns selected as X and Y plotted as a scatter plot.
The authors would like to thank the staff and collaborators at the DIII-D National Fusion Facility and SWIM for their feedback, Tom Fredian for his MDSplus-PHP support and the Stanford Visualization Group for their work on Protovis. This work was supported by the U.S. Department of Energy under DE-FC02-04ER54698.
References

[1] http://www.w3.org/Talks/General/Concepts.html.
[2] http://www.internetworldstats.com/stats.htm.
[3] G. Abla, E.N. Kim, D.P. Schissel, S.M. Flanagan, Customizable scientific web portal for fusion research, Fusion Engineering and Design 85 (2010).
[4] T.W. Fredian, J.A. Stillerman, Web based electronic logbook and experiment run viewer for Alcator C-Mod, Fusion Engineering and Design 81 (2006).
[5] J. Schachter, Q. Peng, D.P. Schissel, Data analysis software tools for enhanced collaboration at the DIII-D national fusion facility, Fusion Engineering and Design 43 (1998).
[6] J.A. Stillerman, T.W. Fredian, K.A. Klare, G. Manduchi, MDSplus data acquisition system, Review of Scientific Instruments 68 (1997).
[7] http://dev.w3.org/html5/spec/Overview.html.
[8] http://en.wikipedia.org/wiki/Canvas_element.
[9] http://en.wikipedia.org/wiki/Svg.
[10] M. Bostock, J. Heer, Protovis: a graphical toolkit for visualization, in: IEEE Trans. Visualization & Comp. Graphics (Proc. InfoVis), 2009.
[11] http://www.khronos.org/webgl/.