A new long-pulse data system for EAST experiments

F. Yang a,b,∗, B.J. Xiao a,c, F. Wang a, S. Li a, W.J. Huang a

a Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031, China
b Department of Computer Science, Anhui Medical University, Hefei 230030, China
c Center of Hefei Physical Science and Technology, University of Science and Technology of China, Hefei 230031, China
Highlights

• The slice storage mechanism of MDSplus has been adopted as an effective solution for continuous and quasi-real-time data storage.
• A circular linked list method solves the speed mismatch between the network transmission and the MDSplus writing.
• Using LVS (Linux Virtual Server) load balancing technology, the new system provides a safe, highly scalable, and highly available network service for users to access data.
Article info

Article history:
Received 24 May 2013
Received in revised form 14 March 2014
Accepted 17 March 2014
Available online xxx

Keywords:
Slice storage
LVS
MDSplus
Long-pulse
Abstract

A long-pulse discharge requires high-throughput data acquisition. As more physics diagnostics with high sampling rates are applied and the pulse length becomes longer, the original EAST (Experimental Advanced Superconducting Tokamak) data system no longer satisfies the requirements of real-time data storage and quick data access. A new system was established to integrate the various data acquisition hardware and software for easy expansion and management. The slice storage mechanism in MDSplus is now used for continuous and quasi-real-time data storage. For every data acquisition thread and process, sufficient network bandwidth is ensured. Moreover, the digitized data is temporarily cached in computer memory in doubly linked circular lists to avoid possible data loss caused by occasional storage or transfer jams. These data are in turn archived in MDSplus format using the slice storage mechanism called "segments". For quick user access to the archived data, multiple data servers are used; these are linked by LVS (Linux Virtual Server) load balancing technology to provide a safe, highly scalable, and highly available data service.

© 2014 Elsevier B.V. All rights reserved.
1. Introduction

The original EAST data system saved data only after a shot finished [1]. However, as the EAST experiment has gradually entered the long-pulse stage, with pulse lengths soon to reach 1000 s or even longer, a new solution for data acquisition was needed. Owing to the diverse requirements of the acquisition systems, and for historical reasons, there are various acquisition units with a variety of sampling frequencies, resolutions, and hardware structures. To date, the data acquisition rate for EAST has reached a total of approximately 500 MB/s, excluding various local data systems such as Thomson scattering raw data and camera data. For a 1000 s discharge, the total data volume will therefore exceed 500 GB.
∗ Corresponding author at: Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031, China. Tel.: +86 551 5591354; fax: +86 551 5593350.
E-mail address: [email protected] (F. Yang).
For a long-pulse discharge, these data must be stored at least in quasi-real time and must be accessible for viewing by the scientists during the shot. For these purposes, we describe in this paper the design and implementation of the new system. As ITER approaches its operation phase, fusion facilities worldwide are exploring long-pulse real-time data storage and distribution. For example, Alcator C-Mod has adopted MDSplus segment storage and achieved good performance [2]. LHD uses a new method, the so-called "sub-shot", which divides the data stream into 10 s chunks [3]. ITER itself is developing a new data system to handle quasi-real-time long-pulse data.

2. System design

The new data system architecture is shown in Fig. 1. So far, for the central data acquisition, EAST collects about 2300 signals via various acquisition computers. The most frequently used digitizers are DAQ-2204/2206, D-TACQ, and PXI-2020/2022,
Fig. 1. System scheme.
which are mainly used for data acquisition at sampling rates from 10 kSPS (kilosamples per second) to 250 kSPS, or even higher. The acquisition console is responsible for the unified deployment and coordination of every acquisition computer and data storage computer (storage server). Considering performance and load, the acquisition computers are responsible only for collecting and transmitting data, not for writing data locally. Each slice of data is transmitted to the storage server after a certain interval and then archived to a local disk on the storage computer in segmented MDSplus [4–6] format. Once data acquisition for the current shot is completed, the system automatically moves the MDSplus data to a shared storage area. The storage server is mainly responsible for receiving and writing data. Each storage server maps to the shared storage area via the Network File System (NFS) [7] protocol. Adopting a unified interface for network communication between acquisition computers and storage servers is effective: the system can be expanded simply by adding more data acquisition computers and/or data storage computers. An LVS [8] load balancing mechanism is used for real-time viewing and analysis of the experimental data; the load scheduler automatically assigns a relatively idle "publish" server, which accesses the data via NFS.

3. Data acquisition and transmission

An acquisition computer consists of acquisition cards, a chassis, an industrial personal computer (IPC), etc. Initialization parameters are sent to each acquisition computer from the acquisition console, including the sampling card number, channel number, gain, trigger time, number of slices, the mapping between acquisition cards and storage servers, etc. The new system treats each data acquisition card as an independent unit that communicates with a storage server. Each data slice from each acquisition card is transmitted to the storage server via TCP/IP. The communication process between an acquisition card and the storage server is shown in Fig. 2. Before data collection, a double buffer associated with the first-in-first-out (FIFO) buffer of an acquisition card is allocated in IPC memory according to the slice size. In general, the slice size is set to hold 5 s of data being acquired by the card; the 5 s duration is configurable and can be changed if necessary.
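For illustration, the following short sketch shows the buffer sizing just described. It is our own illustration under the paper's example figures (16-bit samples, 250 kSPS, 16 channels per card, 5 s slices); the names are assumptions, not EAST code.

// Minimal sketch of the double-buffer allocation in IPC memory.
#include <cstddef>
#include <cstdint>
#include <vector>

struct SliceBuffers {
    std::vector<int16_t> buf[2];   // double buffer: one filled by DMA while the other is transmitted
};

SliceBuffers allocateSliceBuffers(std::size_t sampleRate,   // samples/s per channel
                                  std::size_t channels,     // channels per card
                                  double sliceSeconds)      // slice duration (5 s here)
{
    // One slice = rate * duration * channels samples (2 bytes each for 16-bit data).
    std::size_t samplesPerSlice =
        static_cast<std::size_t>(sampleRate * sliceSeconds) * channels;
    SliceBuffers b;
    b.buf[0].resize(samplesPerSlice);
    b.buf[1].resize(samplesPerSlice);
    return b;
}

// Example: 250,000 SPS, 16 channels, 5 s -> 20,000,000 samples = 40 MB per slice,
// matching the 40 MB slice size used in the tests of Section 6.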
When a new shot arrives, the acquisition card begins to work. All channels in the acquisition card convert analog signals to digital signals. The digital samples are stored in the FIFO buffer and transferred to the first buffer in IPC memory by DMA (direct memory access). After 5 s the first buffer is full; the digital samples are then directed to the second buffer while, at the same time, the acquisition program communicates with the storage server process via TCP/IP and the first 5 s of data in the first buffer are transmitted to the storage server. In the same way, when the second buffer is full, the digital samples are directed back to the first buffer, overwriting the existing data, and the data in the second buffer are transmitted to the storage server. This cycle repeats until the end of the discharge.

Because of the speed mismatch between the network transmission and the MDSplus writing, if the slice data arriving from the network were written to MDSplus directly, it could happen that the current slice has not been completely written when the next slice arrives and cannot be written. To solve this problem, data transmission and writing are carried out separately, with a doubly linked circular list used as a shared cache area. For each acquisition card, a storage process is set up on the storage server. There are two threads in this process: one receives the slice data from the acquisition machine, and the other writes the slice data into MDSplus. When a slice arrives, the receiving thread creates a linked-list node, puts the slice data into the node, and inserts it at the head of the doubly linked circular list. The writer thread continuously checks whether the list is empty; if not, the slice data in the last node is written to MDSplus, after which the node memory is released and the node deleted. Once data acquisition for the current shot is completed, the thumbnail data [13] is written to MDSplus, and a dump thread automatically monitors the CPU status; when the CPU is idle, the MDSplus data of the current shot is uploaded to the shared EMC [9] storage area over NFS. The proposed circular linked list method ensures that every slice is written to MDSplus without concern for the speed mismatch between the network transmission and the MDSplus writing. When multiple acquisition machines correspond to one storage server, multiple storage processes are set up on that server; they map to the acquisition cards and share a communication port.
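As a minimal, self-contained sketch of this two-thread cache, the following example uses std::list (a doubly linked list) in place of the paper's hand-built doubly linked circular list; writeToMDSplus() is a stand-in stub, and all names are our own illustration, not the EAST storage process.

// Receiver inserts at the head; writer drains from the tail, so slices
// are archived in arrival order even when MDSplus writing is slower.
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <iostream>
#include <list>
#include <mutex>
#include <thread>
#include <vector>

using Slice = std::vector<int16_t>;       // one 5 s slice of samples

std::list<Slice> cache;                   // shared cache (doubly linked list)
std::mutex m;
std::condition_variable cv;
bool done = false;                        // set when the shot ends

void writeToMDSplus(const Slice& s) {     // stub for the slow segment write
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    std::cout << "wrote slice of " << s.size() << " samples\n";
}

// Receiving thread: insert each arriving slice at the head of the list.
void receiver(int nSlices) {
    for (int i = 0; i < nSlices; ++i) {
        Slice s(1000, static_cast<int16_t>(i));   // pretend network data
        { std::lock_guard<std::mutex> lk(m); cache.push_front(std::move(s)); }
        cv.notify_one();
    }
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_one();
}

// Writer thread: take the last (oldest) node, write it, release it.
void writer() {
    std::unique_lock<std::mutex> lk(m);
    for (;;) {
        cv.wait(lk, [] { return !cache.empty() || done; });
        if (cache.empty() && done) break;
        Slice s = std::move(cache.back());
        cache.pop_back();                 // node memory released here
        lk.unlock();
        writeToMDSplus(s);                // slow path runs outside the lock
        lk.lock();
    }
}

int main() {
    std::thread w(writer), r(receiver, 10);
    r.join();
    w.join();
}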
Fig. 2. Work flow chart of data acquisition and data transmission.
Within the performance limits of the storage server, adding an acquisition machine requires only specifying the storage server's IP address and communication ports in the acquisition program.

4. Data storage

For a 1000 s discharge at a sampling rate of 250 kHz, the data size of one signal is about 500 MB. In other words, for 1000 acquisition channels, the data size of one shot is nearly 0.5 TB. It is not feasible to store such large data on the local server, so EMC Isilon products are used as a shared storage pool. The EMC storage nodes are mounted on the storage servers over NFS, and users can access data on the remote systems almost as if they were local files. Because of the high cost of file locking on a network file system [6], writing the slice data to a remote MDSplus tree over NFS takes much longer; at present, therefore, the slice data is written directly into MDSplus on the local disk of the storage server and then uploaded to the storage pool. As EAST gradually moves into the long-pulse phase, physicists must analyze data in a timely manner, which requires reading data while it is still being written to MDSplus. In the international fusion community, MDSplus is a widely used software system for data acquisition and storage [5]. As it has evolved, MDSplus has provided APIs for slice storage [10], which are applied to long-pulse quasi-real-time storage for the EAST experiments. The main design concept supporting long-pulse data in MDSplus is the use of "segmented records" [6]. Each data slice is written to the MDSplus data file as one data segment, one slice after another. The working mechanism of slice storage is shown in Fig. 3. A segmented record consists of the data segments and an index of the segments that includes the start and end times of each segment. As a data slice arrives, a new data segment is allocated in the MDSplus data file, the slice data is appended to the new segment, and a new index entry is added. The slice storage code for each data segment is as follows:

Data *startD = new Float64(start);                                   // segment start time
Data *endD = new Float64(end);                                       // segment end time
Data *dimensionData = new Range(startD, endD, new Float64(period));  // timebase for the slice
node->makeSegment(startD, endD, dimensionData, slicedata);           // write the slice as one segment
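For context, a fuller, self-contained sketch of this segment-writing loop under the MDSplus C++ object API is given below. The tree name ("east" with shot -1), the node path ("\\TEST:SIG"), the helper name, and the slice parameters are our illustrative assumptions, not the EAST production code.

// Sketch: append successive slices as MDSplus segments.
#include <mdsobjects.h>
using namespace MDSplus;

void storeSlices(short *samples, int nPerSlice, int nSlices, double period)
{
    Tree *tree = new Tree((char *)"east", -1);             // hypothetical tree/shot
    TreeNode *node = tree->getNode((char *)"\\TEST:SIG");  // hypothetical node
    for (int i = 0; i < nSlices; i++) {
        double start = i * nPerSlice * period;             // slice start time (s)
        double end = start + (nPerSlice - 1) * period;     // slice end time (s)
        Data *startD = new Float64(start);
        Data *endD = new Float64(end);
        Data *dim = new Range(startD, endD, new Float64(period)); // slice timebase
        Array *sliceData = new Int16Array(&samples[i * nPerSlice], nPerSlice);
        node->makeSegment(startD, endD, dim, sliceData);   // one slice = one segment
        deleteData(dim);                                   // release MDSplus objects
        deleteData(startD);
        deleteData(endD);
        deleteData(sliceData);
    }
    delete node;
    delete tree;
}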
5. Data publish

In order to reduce the workload on the storage servers, a high-performance publish server has been built. It provides a web service and an MDSplus service. A top-level tree, named east, has been created on the publish server; it maps to all the sub-trees stored on each storage server and in the EMC shared pool. Users who want to view EAST data online or offline can obtain it by accessing the publish server. In a real environment, however, one publish server cannot meet the demand because of the heavy workload. Based on load balancing technology, an LVS virtual server cluster has been built that appears to outside clients as a single server. The LVS comprises three layered modules: load scheduling, the server pool, and shared storage. Fig. 4 shows the data publish scheme adopted in the present work.

5.1. Load scheduling

The load balancer is the external front-end machine of the whole server cluster and is responsible for directing user requests to the publish servers. Load scheduling in LVS uses the direct routing (VS/DR) method [8] to implement load balancing. On receiving a user request, the load balancer dynamically selects a lightly loaded publish server according to the scheduling algorithm, and the results are finally returned directly to the user. In direct routing mode, the publish server sends its response to the client directly rather than through the load balancer, which greatly reduces the load balancer's workload. A scheduler failure could paralyze the entire system, so to ensure stability and robustness the system adds a backup scheduler and uses heartbeat technology [11] for fault detection. If the load balancer fails, the backup scheduler takes over its work, and the load balancing algorithm reassigns tasks among the available publish servers.

5.2. Server pool

The server pool contains the set of real data publish servers that actually execute the user requests. Each publish server provides the same functionality and can be selected by the load balancer.

5.3. Shared storage

All EAST data, namely the MDSplus trees, are stored on EMC storage devices, which are easy to share over NFS. There is a separate EMC storage device for each server. To speed up the first display of a waveform, a thumbnail node corresponding to each signal node in the pulse tree is added [12].
Fig. 3. Principle of slice storage operation.
The whole structure of the LVS is transparent to the client: client applications are not affected by the server cluster and need no modification. It is convenient to add or delete a server node in the cluster, and by detecting node or process failures and resetting the system appropriately, high availability is achieved.

6. Test and analysis

In the test, the acquisition parameters were set as follows: sampling rate 250 kHz, sampling time 1000 s, acquisition resolution 16 bit, with each acquisition machine containing 6 acquisition cards and every acquisition card capturing 16 channels. The data capacity of each 5 s slice is 40 MB (250,000 samples/s × 2 bytes × 5 s × 16 channels). The total processing time of each slice comprises the data channel processing time, the network transmission time, and the storage time (writing to MDSplus). In order to test whether the speed of data acquisition matches that of storage, we tested one sampling machine per storage server and two sampling machines per storage server, respectively. The transmission and storage times of each slice are shown in Table 1.
Fig. 4. Overview diagram of data publish.
Table 1
Processing time of slice data on multiple acquisition cards.

Acquisition machines   Acquisition cards   Slice data size (MB)   Process time (s)   Transmission time (s)   Storage time (s)
1                      1                   40                     0.38               0.32                    0.89
1                      6                   40                     0.48               0.42                    1.04
2                      2 × 6               40                     0.78               0.49                    1.34
The results show that the total processing time of each slice is far less than the acquisition time (5 s); even though the write speed is slower, data transmission is not affected. It can therefore be concluded that the circular linked list method is feasible and ensures sustained, stable data transmission and storage. In order to speed up data display (short pulse or long pulse), three new methods, frequency reduction, slice display, and thumbnails, have been adopted in WebScope [12,13]. Discharge data can be displayed online by WebScope during the discharge process.

7. Summary

Adopting a unified interface favors the expansion and management of the acquisition system. The circular linked list method solves the speed mismatch between the network transmission and the MDSplus writing, ensuring that the data are written to MDSplus completely and accurately. The data publish system has been built and deployed based on load balancing technology. Tests show that the new data system can satisfy long-pulse data acquisition and steady-state data collection.

Acknowledgements

This work was supported by the National Magnetic Confinement Fusion Science Program of China (No. 2012GB105000), the Anhui Provincial Natural Science university research project (No. KJ2012A144), the Grants for Scientific Research of BSKY (No. XJ201125) from Anhui Medical University, the Anhui Provincial Science Foundation for Outstanding Young Talent (No. 2012SQRL265), and the Young and Middle-aged Academic Backbone finance fund from Anhui Medical University. The authors would like to thank all of the colleagues of the Computer Application Division at the Institute of Plasma Physics, Chinese Academy of Sciences, for their contributions.

References

[1] Y. Liu, J.R. Luo, G.M. Li, Y.F. Zhu, S. Li, The EAST distributed data system, Fusion Eng. Des. 82 (2007) 339–343.
[2] Alcator C-Mod home page, http://www.psfc.mit.edu/research/alcator/
[3] H. Nakanishi, M. Emoto, Y. Nagayama, T. Yamamoto, S. Imazu, C. Iwata, et al., Data acquisition system for steady-state experiments at multiple sites, Nucl. Fusion 51 (2011) 113014.
[4] T.W. Fredian, J.A. Stillerman, MDSplus: current developments and future directions, Fusion Eng. Des. 60 (2002) 229–233.
[5] G. Manduchi, A. Luchetta, C. Taliercio, T. Fredian, J. Stillerman, Real-time data access layer for MDSplus, Fusion Eng. Des. 83 (2008) 312–316.
[6] MDSplus documentation library, 2013, available at: http://www.mdsplus.org
[7] A. Osadzinski, The network file system (NFS), Comput. Stand. Interfaces 8 (1998) 45–48.
[8] LVS documentation, 2013, available at: http://www.linuxvirtualserver.org/
[9] EMC home page, 2013, available at: http://www.emc.com
[10] T. Fredian, J. Stillerman, G. Manduchi, MDSplus extensions for long pulse experiments, Fusion Eng. Des. 83 (2008) 317–320.
[11] Z.B. Shen, Y. Luo, Using heartbeat to implement dynamic standby system on Linux, Comput. Eng. Appl. 19 (2002) 129–131 (in Chinese).
[12] F. Yang, B.J. Xiao, A web based MDSplus data analysis and visualization system for EAST, Fusion Eng. Des. 87 (2012) 2161–2165.
[13] F. Yang, N.N. Dang, B.J. Xiao, WebScope: a new tool for fusion data analysis and visualization, Plasma Sci. Technol. 12 (2010) 253–256.