Computer Physics Communications 140 (2001) 92–101 www.elsevier.com/locate/cpc
Software tools at the Rome CMS/ECAL Regional Center Giovanni Organtini 1 Università di Roma “La Sapienza” and INFN, Sez. di Roma, Dipartamneto di Fisica, Via della Vasca Navale 84, 00146 Roma, Italy
Abstract The construction of the CMS electromagnetic calorimeter is under way in Rome and at CERN. To this purpose, two Regional Centers were set up in both sites. In Rome, the project was entirely carried out using new software technologies such as object oriented programming, object databases, CORBA programming and Web tools. It can be regarded as a Use Case for the evaluation of the benefits of new software technologies in high energy physics. Our experience is positive and encouraging for the future. 2001 Elsevier Science B.V. All rights reserved.
1. Introduction The electromagnetic calorimeter of the CMS [1] experiment is being built in CERN and in Rome. Those two sites are referred in the following as Regional Centers. The calorimeter [2] is composed of about 80 000 PbWO4 scintillating crystals, each of which must be carefully characterized before being mounted in the supporting structure. The complexity of the operations to be done for each part and the huge number of them has lead the collaboration to develop suitable systems to make the whole process as automatic as possible, reducing the manipulation of parts and related data by Center operators. In Rome an automatic machine (ACCOR [3]) was built to fully characterize crystals in terms of size, transmission and light yield uniformity. We took this opportunity to test, on a limited scale, but on a real project, the possible benefits of the new software E-mail address:
[email protected] (G. Organtini). 1 Present address: INFN – Sez. di Roma, P. le A. Moro 2, 00185 Roma, Italy.
technologies that are becoming popular in high energy physics. In fact the realization of the Regional Center for the production of the CMS ECAL is the first real LHC project in Italy realized using entirely new software technologies. 2. Object Oriented design ACCOR is built in a modular way: a crystal server manipulates parts and position them in front of the instruments aiming to measure them. Three devices actually do the measurements, namely, a 3D machine for mechanical measurements, a compact spectrophotometer for transmission and a photomultiplier tube with its DAQ chain for light yield determination. Each part is controlled by a separate piece of software in order to maintain the same modularity of the hardware. Object Oriented technology is perfectly adapted to model modular pieces of hardware so we have chosen to use it in this work. 2.1. Server modeling The server carries four stepper motors whose movements are under the user control. Motors are connected
0010-4655/01/$ – see front matter 2001 Elsevier Science B.V. All rights reserved. PII: S 0 0 1 0 - 4 6 5 5 ( 0 1 ) 0 0 2 5 9 - 4
G. Organtini / Computer Physics Communications 140 (2001) 92–101
Fig. 1. The logical model of the motor set of the machine.
to a computer by a serial cable and can be controlled by means of character commands sent over this line. Each motor drives a different part of the machine: the crystal server belts, the light source and the detector (an integrating sphere coupled with a compact spectrophotometer) for the transmission measurement and a radioactive source for light yield determination. A common class is provided (Motor) who encapsulates all the parameters of the motors and the communications over the serial line, leaving to the user only few basic methods directly related to the operations to be done at the center (see Fig. 1), i.e. the movements of the crystals in two directions along the server line, the positioning of the lamp and of the integrating sphere and the movement of the source for uniformity measurement. 2.2. Instruments modeling Each instrument along the measuring chain is automatically controlled by a Workflow Management System developed by the CRISTAL collaboration [4]. Instruments are defined, with respect to CRISTAL, as machines able to send characters over a TCP/IP network. Each instrument connects to CRISTAL establishing a socket connection with one of the CRISTAL components: the Instrument Agent. Sending appropriate strings over the network, instruments can request jobs to be done on a given crystal (part) to CRISTAL: the Instrument Agent gets the jobs from the CRISTAL Data Base (based on Objectivity™) using a CORBA protocol, converts them into strings of commands and sends them to the instrument. The jobs are computed from the status of the part for which they were requested according to its workflow stored in the Data Base. Instruments then decode command strings and execute them, if possible. At the end of the measurements, instruments send data to the Instrument Agent
93
still in the form of character strings. Outcomes are interpreted by the Instrument Agent and converted into objects, then the Data Base is updated. Both commands and outcomes exchanged between Instrument Agent and Instruments are encoded in a way similar to XML: 2 in this way the only requirement for an Instrument is to be able to connect to an Internet socket and to send characters over this line, while the complex dialog between the CRISTAL components is left to the Instrument Agent. In our Regional Center the DAQ programs are written in C++. Two classes are provided to simplify the CRISTAL-Instrument dialog: an Instrument class and a PartOnInstrument class: the first encapsulates the functionalities for connecting to CRISTAL and registering parts to be measured. Such parts are modeled by the PartOnInstrument class who provides methods for requesting jobs to CRISTAL, decoding them and sending outcomes. Both classes inherit from the XMLParser class, a class to decode any possible combination of XML-like elements foreseen for CRISTAL-Instrument communication. Fig. 2 illustrates the sequence of messages exchanged between the various components. Using the provided classes the CRISTAL interface for instruments is very simple and the program structure is the same for any instrument, regardless of the particular device that makes in fact the measurements. For them, specialized classes are used to model their functionality, improving the abstraction. For example, the light yield measuring device is based on CAMAC, so we used classes for CAMAC crates and modules. A further level of abstraction defines a DAQ object with relations with all the classes described above (see Fig. 3). 2.3. Remarks The use of Object Oriented technologies was proven to be very appropriate especially for modeling hardware objects, where the elements are easy to be mapped into classes even without a formal analysis and design. The programs controlling hardware, then, result to be very simple both to be developed and to be 2 XML became popular after the system was designed, so that, even having a very similar syntax and semantics with respect to XML, our protocol is slightly different.
94
G. Organtini / Computer Physics Communications 140 (2001) 92–101
Fig. 2. The sequence of messages exchanged between instruments and CRISTAL. In this drawing messages from and to Instrument and PartOnInstrument objects are character strings sent over a TCP/IP connections, while other arrows describes CORBA messages.
maintained. In fact, program development translates just the concepts that people usually use when talking about the hardware principles of operation, the details being encapsulated into the component classes.
The language used is very close to the common language and the design of the software can be easily discussed with people who are not familiar with OO programming and design.
G. Organtini / Computer Physics Communications 140 (2001) 92–101
Fig. 3. The logical model of the DAQ system in Rome.
Dealing with modular devices makes the object technology particularly appealing being a modular programming language. Our experience suggests that, nevertheless, a careful design at the beginning of the project, even not too detailed, is of capital importance for future maintainance. At the beginning of our project, in fact, we started coding for one of the instruments used in the Regional Center without any formal design in mind. The final program resulted to be not easily maintainable even after few weeks, when a new device was integrated into the machine. A second instrument was modeled after a very simple analysis of the requirements. Even though the analysis was not made using formal tools, the final product is perfectly adapted to our needs and its maintenance is always very easy, each modification involving only few methods of, usually, no more than one class. A subsequent deeper analysis made using formal methods and patterns shows that a further optimization of the model can be achieved together with a further simplification of program structures. Adding a further abstraction layer every possible measuring device can be modeled in exactly the same way. Only the imple-
95
mentation of the specific, low level, single measurements must be provided by the user. To this respect, a basic knowledge of a formal, common language, such as UML [5] is of capital importance. People not involved in software development too needs some training in order to use a vocabulary widely understood. In our class definitions we tried to avoid as much as possible the use of pointers. When resizable containers are needed Standard Template Libraries (STL) are used, instead. According to our experience explicit use of pointers often causes memory leaks in user programs. Moreover non experienced people have much more difficulties in the transition from FORTRAN style to Object Oriented programming. The use of STL greatly simplifies the approach to OO technologies and makes programs simpler and even appealing for FORTRAN programmers. Moreover, reusability is one of the fundamental characters of OO programming languages and STL provide an excellent example for that. The efficiency of the software is usually greatly improved too. Actually no explicitly declared pointers appear in our programs. In Real Time applications performances are quite important. C++ programs in HEP are frequently reported to be less efficient then old style FORTRAN programs. We argue that this is the case only because this programming style is quite new for high energy physicists, who are not so smart as they are in FORTRAN programming. We compared the performances of our C++ programs for data acquisition with the ones of our old traditional programs written for the same devices in the past. At least we did not found any significant difference in performances. In few cases C++ provides tools such that some operation can be even faster. Of course calling methods (especially if they are virtual) is quite expensive in terms of time and in case of special needs we provided shortcuts. On average, the time needed to execute programs containing many virtual methods calls are in fact slower than FORTRAN subroutine calls, but the development of the program, its maintenance and its readability is so better that usually the price in terms of performances is worth to be paid.
96
G. Organtini / Computer Physics Communications 140 (2001) 92–101
3. Operating systems integration In the Rome Regional Center there are computers running both Windows NT and Linux. The CRISTAL system runs on Windows NT, while instrument programs run on Linux computers. The choice of the operating systems depends on the boundary conditions: Windows NT was chosen for CRISTAL because, at the time, there were no robust ORBs and Object Data Bases for other PC platforms. Linux was chosen for instruments in Rome to provide remote access, full control and multi-user and multi-tasking capabilities. Only one of the instruments (the 3D machine for dimensional measurements) runs its software (written by the company who produced the machine) on Windows NT. The CRISTAL software is quite independent and requires interaction only with instruments: the solution is provided by the Instrument Agent so, in fact, no integration problem exists for it. The 3D machine, instead, can only be operated using a windows interface. This is of course not suitable for our needs since the machine must be fully automatic, so it must not be operated by humans, but by another program. The solution was found this way: the machine can be programmed to make a given set of automatic measurements when started. The program consists, in fact, of an infinite loop during which the machine reads appropriate input files containing numeric codes based on which the 3D program executes the measurements. Collected data are stored in text files produced by the same program. The interaction with CRISTAL is made possible using a C++ program written using the Instrument and PartOnInstrument classes described above on a Linux computer. This program connects to CRISTAL, gets the jobs to be executed and, in case such jobs are to be executed by the 3D machine, writes the appropriate codes into the input files for the 3D program. At the end of the measurements, it reads the produced data file and builds the outcome string to be sent to CRISTAL. Both the input and the data files reside on disks belonging to the Linux computer: those disks are shared with Windows machines using Samba [6]. Samba is a tool who makes Unix file systems readable and writable by Windows applications. Single directories can be exported to and mounted on other machines with the
Windows file systems, appearing as shared folders. In this way it is possible to share data between Linux and Windows. It has to be noted that Samba, as most of the Linux software, is free of charge. The usage of Linux and its tools reduces quite a lot the total cost of the system. The 3D machine being mounted along the crystal server, is inside a dark room and is not accessible during the measurement. It is of course desirable to have the possibility to interact with its control program even during the measurements in case of problems or just to control its flow. Unfortunately the cable connecting the machine to the control computer cannot be too long and the computer must be inside the dark room too. A simple solution was found in VNC [7], a freeware product initially developed by an Olivetti–Oracle consortium and now being maintained by AT&T. VNC can be downloaded for free from the network and is composed of two tools: a VNC server to be installed on a Windows machine (on NT it can be installed as a service) and a VNC viewer to be installed on a Linux computer. The viewer can be run on Xwindows and connects to the VNC server, showing the Windows desktop inside an X-window. VNC is used to remotely monitor the 3D machine status during the measurements. Windows systems are only able to export their data sharing disks, but the remote machine control is not available unless third party products are used. VNC is still not good enough for remote control, its main disadvantages being the speed of the connection and the fact that it is not multiuser. The desktop being exported via network makes in some cases the connection quite slow. In our Regional Center, however, speed is not an issue since all computers are connected between them in a local network and, moreover, the connection has to be used just in case of problems: ideally no interaction will be needed between Center operators and the 3D PC. The Linux Open Source strategy, of course, makes easier the realization of tools for machine integration. Conversely, according to our experience, Windows is less suitable to be integrated in laboratories environments, due to its closed nature. It is, of course, greatly appreciable for office tasks.
G. Organtini / Computer Physics Communications 140 (2001) 92–101
4. Web tools Even if the ACCOR machine has been built to operate in conjunction with CRISTAL, it can also be operated without using it. The way in which it operates is always the same, but in the standalone mode the Instrument Agent is not linked to CRISTAL, but it simply gets commands from dedicated input ASCII files. The only change required in the system in order to run the machine in standalone mode is the setting of an environment variable holding the name of the computer running the Instrument Agent. In order to simplify the user interaction nearly all the operations that can be done on the machine can be accessed via the Web. On the Linux machine who controls the instruments, the apache [8] server, usually distributed with the operating system, provides all the functionalities to easily setup a Web server. In the following the various operations that can be done using the Web are described. The pages design was kept as simple as possible to provide just the relevant information with the minimum load for the server and the network. The ACCOR home page has two frames: one contains the main menu and the other is used to display the results of the requested operations. 4.1. Documentation The simplest job for a Web server is to provide documents. Clicking on various items in the Documentation menu in the first frame, pages are shown containing the local documentation for each piece of hardware or software present in the center. In particular, the hardware documentation is organized in such a way that each piece of hardware is mapped into an hyper-link. Following it the user can browse the complete documentation about the use of the object inside the center. Links are provided to the corresponding data sheet and to the producer where available for a more general understanding of the object functionalities. Those pieces connected between them (i.e. NIM and CAMAC modules) by cables are logically connected by hyperlinks: for each cable connecting two modules a link is shown on the corresponding page, showing the connections and the purpose of them.
97
This kind of documentation is static since it has to be changed manually from time to time only if some change is made to the hardware setup. It is more effective than paper documentation since it can be navigated easily and is available everywhere. Users, moreover, must not take the responsibility to have the latest version of the documentation in their hands. 4.2. Machine control Web tools are also used, in the Rome Regional Center, to control the machine. Using forms, authorized users can interact with the system providing information about the measurements to be done, i.e. the parts to be measured, the type of measurements and the parameters to do them. Data provided in the form are then collected by Perl [9] scripts that automatically generates configuration files for the machine operation, according to the user requests. The system has the advantage that the user must not know about the details of the configuration, but only know about the high level operations to be done. Moreover the user interface is always the same even in the case of a redesign of the configuration protocol. The Web interface is very simple to understand even to non experienced users and let the developers to concentrate their efforts into the realization of the operations instead of using their time in implementing graphical interfaces. Another important benefit of this approach is that the ACCOR machine can be operated even remotely, without the need of operators near to it, but when the machine has to be loaded with parts. 4.2.1. Functional decomposition In order to be able to be operated by the Web, instrument control programs have no graphical interface. When needed, output is given in ASCII form, regardless of its type. Rather than being a disadvantage, such a choice has been proven to be very effective. Program structure is simpler and several programs can interact between them getting inputs from character pipes created between them, resulting in an improved collaboration between the various components. Thanks to this functional decomposition operations can be splitted in sub-programs who can collaborate in a versatile way, reducing both the number of them and their complexity, since each functional component is just responsi-
98
G. Organtini / Computer Physics Communications 140 (2001) 92–101
Fig. 4. Relationships between software components in the Rome Regional Center. Where not shown, the name of the relationship, must be read as “uses”.
ble for part of the global task to be executed. Whenever needed graphical interface is provided either by a Web browser or by Perl scripts that starts other graphic browsers. As the diagram in Fig. 4 shows, the relationships between the various functional components are quite a lot. Nevertheless, the system is extremely flexible, while, at the same time, each component remains quite simple in its structure. Components are frequently put together using Perl scripts. The ability of Perl to deal with regular expressions and its excellent interaction with the operating system, as well as the great variety and flexibility of its data types, makes it a very useful tool for all the
problems in which string manipulation and component collaboration is needed. 4.3. Machine monitoring Monitoring the machine status is another operation realized by means of Web tools. Several methods are implemented to monitor the machine status. As explained above each instrument has its own software independent on each other. Any program, nevertheless, writes logging messages on log files named after the date of the measurement. The log messages are formatted in such a way that the first characters identi-
G. Organtini / Computer Physics Communications 140 (2001) 92–101
fies the program who wrote it and the time. Users can look at log files providing the date by means of a form, whose output is given to a Perl script that automatically unzips log files (if they were compressed), reads and shows them using a different color for each instrument. In this way the user can easily identify the parts in which he is interested in. Environmental and slow control data (such as the temperature, the high voltages, etc.) are also available after the automatic execution of scripts that runs programs to read these data and to display them in a readable format. Off-line monitoring is also available: all the data recently collected by the machine are stored on disks accessible by the Web server computer. The main access to Regional Center data is CRISTAL, but
99
during measurements it is useful to be able to make fast analysis on recently collected data. Enough space is provided in order to store data coming from the measurement of about 300 parts. When space becomes to be short older files are automatically purged. Perl scripts looks for available data inside the default directories and dynamically generates lists of hyperlinks for each part for which a measurement has been done. Clicking on them other scripts are run who read those data and provides plots, histograms and combined numerical data to the user via the browser window. No plot is stored on disks, saving disk space, and no maintenance is required to the Web master to the pages providing this kind of data since they are generated on the fly using the scripts cited above. An example is provided in Fig. 5.
Fig. 5. One of the pages of the ACCOR Web Server: users can look at transmission plots just after they were measured clicking on the identifier of the crystal. Plots are generated on the fly by a Perl script using gnuplot, a GNU application to make plots.
100
G. Organtini / Computer Physics Communications 140 (2001) 92–101
In order to keep the computer and the machine under continuous control and in order to collect historical data, use is made of a freeware tool known as MRTG [10]. The Multi-Router Traffic Grapher is a tool originally developed to monitor network traffic on Internet routers, but it can be used for other purposes too. In fact, it is able to build historical plots provided by scripts whose output is the data to be plotted against time. A simple configuration file is the only duty of the user: the result is the production of gif files containing plots of data against time with daily, weekly, monthly and yearly basis. Those plots can be used inside Web pages too. In our case plots of the usage of CPU, memory, disk space and network by means of the instrument computer are available in order to monitor the computer load, as well as plots concerning the amount of measurements.
5. Object Databases and CORBA As already mentioned, the CRISTAL system uses Objectivity to makes objects persistent and CORBA to allow communications between objects. The used CORBA implementation is IONA Orbix. Object Databases are the best solution for object storing and retrivial since they provide the tools to obtain persistency and implements several methods to guarantee data consistency. Commercially available products are not yet mature enough for high energy physics needs, where performances are an issue. In our case the access speed is enough, due to the low rate of updates (a crystal is measured within few minutes). There are several implementations of CORBA protocols, but few of them are strongly supported by companies. Only recently few ORBs show up on the market (even in the freeware one) and few of them appear to be quite robust end efficient. According to our experience, the main issue, at the moment, is again the speed. The marshaling/unmarshaling phase and the type conversion are the most time consuming tasks performed by CORBA servers and clients. When extremely high speed is not required, however, CORBA is a very effective way to promote the functional decomposition and the distribution of objects.
6. Conclusions 6.1. General remarks No more doubts should exist, in high energy physics, that Object Oriented Programming is the best solution for large projects. In this work we experienced the benefits of this technology even at a smaller scale. Both the development and the maintenance of a medium size experiment can be greatly improved using OO programming. Currently there is not enough knowledge in this field and efforts must be made in order to improve it. A strongest collaboration with software engineers, in particular, is desirable. A very practical design strategy we have found is the splitting of a project into smaller independent pieces collaborating between them. We call this approach the functional decomposition of the project. During the development phase the needs of sharing the responsibilities between different tasks may not appear to be essential, but some effort should be put in the identification of the modularity of the project since it may help a lot in the future, when the project becomes larger. Functional components then result to be simple to maintain and reusable in many applications, improving the benefits of Object Oriented programming. Explicit use of pointers arithmetics is discouraged: it weaken the robustness of the software and is frequently against the OO approach, to our opinion. 6.2. On the use of commercial products A lot of discussion has been made in the recent past about the use of the so called commercial products. The current strategy is to push towards the use of them as much as possible. There are several advantages in this strategy: • timescales for the development of projects are greatly reduced; • documentation is usually of high quality; • the use of the resources are optimized, both in terms of hardware and of personnel; • support is guaranteed on large time scales, the project being commercially supported and not depending on people’s scientific interests.
G. Organtini / Computer Physics Communications 140 (2001) 92–101
Nevertheless care must be taken in choosing a commercial product for physics needs, taking care of the following points: • commercially available software is tuned for the needs of the largest possible part of potential users: with this respect physicists usually represent a negligible part of the possible market, so companies have no big interests in providing specialized software tools; • the support from companies is not always such as expected; • the risk to become company-dependent is high: any software company will try to sell its own product, so the efforts of consortia aiming to standardization, actually are non completely effective. This was proven to be true for both ODBMS, CORBA and even for the C++ language itself. All of these pros and cons have been experienced during our work. Another common practice is to consider commercial only the software produced by well identified companies, maintained by them and sold as binaries. Open source software is not usually considered as good as traditional commercial software. Open source strategy, instead, allows for support by companies, while leaving the user the freedom to patch the software for its own special needs, if any. In order to save resources and not to loose the advantages, in terms of standardization and maintainability of such products, companies can even be hired to maintain and integrate, if needed, those packages.
101
Having these points in mind, the choice of a commercial product should always be welcome, if solves more problems than it creates. For the future, national and international agencies should push for the birth of specialized software houses who will produce or maintain software modules for physicists in analogy with what happens for electronics.
References [1] The Compact Muon Solenoid, Technical Proposal, CERN/LHCC 94-38; CMS ECAL Technical Design Report, CERN/LHCC 97-33. [2] The Electro Magnetic Calorimeter Project, Technical Design Report, CERN/LHCC 97–33. [3] I. Dafinei et al., Measuring techniques for large scale production of PWO scintillators, in: Proc. of Int. Conf. on Inorg. Scintillators and their Appl., SCINT99, Moscow, Russia, 1999. [4] J.-M. Le Goff et al., CRISTAL/Concurrent repository & information system for tracking assembly and production lifecycles — A data capture and production management tool for the assembly and construction of the CMS ECAL detector from ftp://cmsdoc.cern.ch/documents/96/note96_003. [5] G. Booch, J. Rumbaugh, I. Jacobson, The Unified Modeling Language User Guide, Addison-Wesley, Reading, MA, 1999. [6] To find information about Samba see http://www.samba.org. [7] See the home page of the project: http://www.uk.research.att.com/vnc/index.html. [8] See the home page of the project: http://www.apache.org/. [9] L. Wall, T. Christiansen, R.L. Schwartz, Programming Perl, O’Reilly. [10] Documentation can be found at http://www.mrtg.org/.