Chemometrics and Intelligent Laboratory Systems
Monitor
Electronic Laboratory Notebooks
Although Laboratory Information Management Systems (LIMS) provide a suitable vehicle for recording data taken in quality control and assurance laboratories, their rigid format renders them unsuitable for the research and development laboratory. Recently, interest has arisen in the development of a program suited to these needs that might be called an Electronic Laboratory Notebook. This approach envisages a terminal on the lab bench that can be used to enter experimental procedures and results, that will electronically capture data, record and store bitmapped images from spectrometers or text sources, and eventually permit interleaving of these into reports and manuscripts via compound document architecture standards. Although the scientist may view this as a tool for his convenience, the patent attorneys in many companies view it as a means of helping to establish precedence in pursuit of a patent and to demonstrate diligence. All of this information must, of course, be stored in an electronic form that would be accepted by the courts in any legal actions that might ensue. The media must have a very long shelf life, and the newer optical storage techniques are ideal for this purpose. The basic idea began, it would appear, in the mind of Keith Caserta of Procter and Gamble. A number of
others have elaborated on the subject and helped it achieve considerable momentum. At the Scientific Computing and Automation Conference held in Philadelphia in November 1988, this author organized a symposium with the title of the Electronic Laboratory Notebook. The speakers included, besides the author, Kewal Likhyani of DuPont. At that symposium this author demonstrated the electronically stored screens and images that comprised an Electronic Laboratory Notebook he had been keeping for over six months. This employed off-the-shelf software packages that included MDL's ChemTEXT, Generic CADD, SuperCalc, a shareware program for personal databases called 3 X 5, dBASE and rBASE, among many others. FORTH was used to capture original experimental data and to create automatic indices for each experiment that included experimental conditions. Hewlett Packard Scan-Jet units were used to capture SEM photographs, book pages, technical specification sheets, and other hard copy. Although fusion of the results was not seamless, and was at times very frustrating, it demonstrated the feasibility of the concept. Subsequently, in April of 1989, Likhyani called together a meeting of interested parties in Philadelphia to discuss further the possibilities of implementing such a system. There
were 22 attendees, almost all from large American industries. These included representatives from the petrochemical, pharmaceutical, and heavy machinery industries. Also in attendance were about six lawyers and legal counsels, and Linus Liddle from the U.S. Patent Department. This group has considerable experience in storing the millions of U.S. patents as bit-mapped images, and searching them using an SGML language. After a morning of formal papers the group split into four focus sessions, each having scientific and legal counsel representatives. From those sessions came a series of draft documents that indicated specifications for the Electronic Laboratory Notebook. All items mentioned above were specified as being essential. However, the scientist might be surprised at some of the features that the legal counsels felt necessary. These included some history of purchase orders associated with a project, to allow demonstration of diligent pursuit; some means of limiting search by opposing legal counsels who might be tempted to go on a fishing expedition; and adequate controls to assure legal acceptance of the electronic storage media. It is interesting that the availability of optical drives in the form of WORM disk subsystems resolved any concerns in the lawyers' minds. The write-once, read-many-times nature of these disks, the use of Cross-Interleaved Reed-Solomon redundant
code to allow error detection/correction, and the natural interleaving and date stamping of the materials recorded were felt to provide suitable deterrence to forgery, alteration, and falsification. Access to such electronic documents by Clerks of Court responsible for compilation of pertinent data for courtroom trials could be limited to boundaries of reasonable search, just like current search restrictions. Indeed, bodies have already begun to discuss the required mechanisms. Volume integrity could be achieved by physical number stamping, duplicate storage of the CIRC redundant code on a separate volume, and intertrack ID information. None of these techniques preclude falsification, but the routes would be much more technically difficult than the changing of current paper documents. The Heads of the Focus Teams have recently met for a second time, and it would appear that a consensus is being reached on what might be acceptable to industrial laboratories. The academic and governmental scientist would find such an Electronic Laboratory Notebook equally exciting. A few problems remain to be addressed, and interested readers are challenged to think of suitable strategies. For example, how does one verify that a particular piece of data was entered by a particular person without engendering time-consuming and frustrating password, ID, or hardware security devices? Or, how can an interest group develop generic
concepts that would stimulate vendors to develop compatible software? There is an indication that at least two large firms are developing some form of electronic notebook, and that several small firms are also boldly entering this new arena. Efforts in mainframe computer companies directed toward compound document standards for instrument data entry are already in progress. Eventually one sees this Electronic Laboratory Notebook complementing and coupling with standard LIMS packages in creating the computer integrated laboratory.
RAYMOND E. DESSY Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, U.S.A.
Software Review
The NIST/EPA/MSDC Mass Spectral Database, Personal Computer Versions 1.0 and 2.0
Distributor and price:
U.S. National Institute of Standards and Technology, Office of Standard Reference Data, Gaithersburg, MD 20899, U.S.A. (Version 1 US$ 750, Version 2 US$ 975, upgrade from Version 1 to 2 US$ 225).
Technical specifications:
Computer: Version 1: IBM XT, AT, PS/2 or compatibles with Hercules monochrome or CGA, EGA color monitor for optional graphics display. Version 2: IBM AT, PS/2 or compatibles with VGA, EGA, CGA color display for optional graphics.
Operating system: MS DOS
Minimum memory: 512 K (Version 1, 2), 640 K (Version 2 with all options).
Peripherals required: Hard disk (Version 1: 8 to 14 Mbytes, Version 2: 9 to 22 Mbytes). Regular Epson compatible printer for text only. For graphics printing HP Laserjet + or compatible printer.
Introduction
The recent availability of the U.S. National Institute of Standards and Technology (formerly National Bureau of Standards)/Environmental Protection Agency/Mass Spectrometry Data Center (NIST/EPA/MSDC) Database in an inexpensive version for personal computers should greatly increase its use. This collection of reference mass spectra is jointly maintained by NIST, the U.S. Environmental Protection Agency, and the Mass Spectrometry Data Center of the Royal Society of Chemistry (U.K.). It was originally known as the EPA/NIH Mass Spectral Database. The personal computer Version 1 of this database is dated September, 1987, and consists of 43990 electron ionization spectra with only one spectrum per Chemical Abstracts Registry Number. There are more than 42000 compounds represented