COMPUTERS
AND
BIOMEDICAL
A Medical Information JEFFREY Division
R. BUDD,?
of Pediatric
21, 419-433 (1988)
RESEARCH
WARREN
Relational J. WARWICK,
Pulmonology, Department School, Minneapolis,
Database System (MIRDS)* AND CATHERINE
of Pediatrics, Minnesota
University 55455
L. WIELINSKI
of Minnesota
Medical
AND STANLEY Division
of Health University
M. FINKELSTEIN
Computer Sciences, Department of Minnesota Medical School,
of Laboratory Minneapolis,
Medicine Minnesota
and Pathology, 55455
Received June 1, 1987
A medical information relational database system (MIRDS) which is resident on a relational database machine and is accessed via microcomputers has been created for a pediatric pulmonary division of a research hospital. The power and flexibility of MIRDS has permitted the integration of clinical tasks, research interests, and laboratory functions. Procedures have been devised to assure data integrity, allow flexibility in data retrievals, produce standardized report formats, and permit data access for users with a wide range of query expertise. There are few impediments to the integration of additional clinical, research, and laboratory functions as the system evolves. 8 1988 Academic PXSS, IIX.
The Division of Pediatric Pulmonology at the University of Minnesota is a specialized clinical unit at a research hospital which, in addition to its primary role of patient care, provides laboratory results for patients being seen throughout the hospital. Its goals and the goals of its supporting database system are multifaceted. The division’s database must provide support for the clinical care of the patients being seen by division physicians. It must retain clinical and research data for ongoing and future research projects. Finally the division’s laboratory must interface with the database to produce timely,
* This work was supported in part by the National Institutes of Health Grants 5-PSO-HL-27355 and l-ROl-HL-37504-01. t Present address: Pediatrics Department, Box 184 Mayo Memorial Building, University of Minnesota, 420 Delaware Street S.E., Minneapolis, MN 55455. 419 OOlO-4809/88 $3.00 Copyright 6 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.
420
BUDD
ET AL
accurate, and readable pulmonary function results for both hospitalized and ambulatory patients. These three goals. presented in reverse order, roughly follow the evolution of the division’s computer system, each goal corresponding to a different hardware/software configuration. In the first stage a Digital Equipment Corp. (DEC) PDP-8E laboratory minicomputer was used to control and support standard pulmonary function tests, print the results, and store the data on floppy disks (I). Given the disjointed nature of the data storage media this system was never used for compiling data for research projects. In the second stage a Data General S-140 minicomputer system was used not only to control and support pulmonary testing but also to store most of the quantifiable data from the patient record for the population of cystic fibrosis (CF) patients being seen within the division (2). The software used for this purpose was a CLINFO (3) dialect, called INSIGHT, a medical database package originally written in BASIC which was intended for clinical trial research projects. As such it integrated data storage and manipulation with statistical analysis functions in a menu-driven structure. As the size of the database increased, however, the response time of queries became unacceptable. This problem plus INSIGHT’s inability to integrate text with numbers and its single user per data set limitation made the system impractical for the next stage of clinical support. In searching for a system to support clinical care, research interests, and laboratory functions a number of requirements were specified. First, the varied and unpredictable nature of research needs dictated a database structure with a high degree of flexibility in integrating data during queries. Second, the system needed to allow the complete integration of laboratory functions within the database software to relieve the laboratory staff from manual entry of results and to permit immediate review of results upon test completion. Third, both entry and retrieval of data needed to be fast even while multiple accesses of the system were being performed. Fourth, anticipated demand required that the system be expandable to accommodate more user stations as its functionality increased. Fifth, given the range of user knowledge, software had to permit both retrieval experts and computer novices to use the system to their maximum potential. Sixth, in order to achieve full functionality without writing all application software in-house the integration of the data with commercially available software was required. Seventh and last, the cost of the initial purchase and the stepwise expansions of the system needed to be within the limited budget of the division. The data for which this system was to be built include extensive clinical and research information on a population of children with chronic lung problems. In a few cases the pulmonary division has been following “children” for over 25 years. Of the 1624 patients in the database, 683 have CF while the rest of the patients have a variety of other lung diseases such as asthma or chronic bronchitis. To organize the multiple types of medical information the data have been grouped into six categories: background, event, outpatient, pulmonary, other laboratory, and research.
MEDICAL
RELATIONAL
DATABASE
421
The background information includes basic items such as name, sex, race, birth data, diagnosis data, and current status. Correspondence information is also stored such as addresses and phone numbers for the patient, parents, and personal physicians. Another large portion of the background information is obtained from a questionnaire which is based on a standardized epidemiological format created for the National Institutes of Health (4). The event information category details when events of interest occurred and describes each event. Separate relations in this category describe clinic visits, hospitalizations, clinical procedures, and participation in research projects. The final relation in the category is taken from a unique staging form which details the disease involvement of each organ system. Outpatient information is a complete record of every outpatient visit. These data include the standard measurements of height, weight, blood pressure, etc., a record of the physical exam, a health history filled out by the patient, and a record of prescribed therapy. This information is available on well-structured forms which have stayed remarkably consistent since 1975. The one noncodified relation is the record of prescribed therapy which is stored as medication name, concentration, and a textual description of prescription directions. Pulmonary information is listed as a separate category because of its importance in the evaluation and management of lung disease. It includes the Brasfield X-ray score (5) performed on all CF chest X-rays, flow-volume spirometry, blood gases, ear oximetry, body plethysmography, CO diffusion capacity, gas mixing efficiency (6), cilia function, and mixed venous COZ measured by the rebreathing technique (7). The cilia function, spirometry, gas mixing, and mixed venous CO* tests were designed and built in the pediatric pulmonary laboratory. These last three tests plus the CO diffusion and body plethysmograph tests are those for which direct database entry was desired. The other laboratory tests include all of the standard clinical laboratory tests such as antibiotic level, blood chemistry, electrophoresis, hematology, immunoglobulins, microbiology, microsensitivities, and urinalysis. Test results which are more common for lung disease patients such as a-1-antitrypsin, aminophylline level, retinal binding protein, sweat electrolyte level, and vitamin A and E levels are also stored. The final category of information is derived from research projects focused on children’s lung diseases. One type of data common to most research projects is the general information saved for all control subjects. The other data groupings correspond to specific research projects and are formatted by each investigator. The volume of this research data is approximately half as much as the clinical data listed above. SYSTEM DESCRIPTION
The data as described above are structured in MIRDS using the relational model (8, pp, 83-95). The flexibility of this model permits the creation of relations with widely varying numbers of key attributes (Fig. 1) and has worked
422
BUDD ET AL. a)
Chart
numb-,
b)
Chart
number,dati,
c)
Chart
number.
d)
v
e)
Chart
FIG.
date.
=*-date.
address,
city,
vitamin
A,
time,
fasting?,
tj&e-_swce.
state,
vitamin
zip
code
E blood
glucose
u&Jx~~Q&.
growth
numb~$W&.~~!z.~z%x!&eL~!k~~
aJ&i.!Dotic-cnqn,
f)
street
&ccedure-cpee,
sensitivity procedure-name
I. In these examples of relations the underlined attributes specify the primary key
well for specifying and managing all the division’s clinical data. The power of this model is most apparent however when it is applied to research. To quote Wiederhold: The important point in scientific usage of databases is that information is produced not by the retrieval and inspection of a few values but rather from the relating of many findings in accordance with hypothesized cause-and-effect relationships. (9. p. 188)
The fact that the interconnections between relations are not predetermined but can be quickly and dynamically established means that few constraints are placed on retrievals by the data structure. The one instance where the data structure is not truly normalized (8. pp. 237-272) is in the storage of time information. The importance of this attribute in medical databases has been well documented (3, IO). To accommodate research and clinical interests this information is saved both as date and as number of days old. This duplication of data storage occurs upon data entry when the days since birth is computed automatically by the data entry procedure. This additional format is useful in making comparisons across age cohorts regardless of date. The hardware on which the database resides is centered around a Britton Lee’ IDM 500 relational database machine. It is somewhat analogous to an intelligent disk controller with hardware, firmware, and software optimized to maintain data in a relational structure. Because it was designed for this one task it is faster and less expensive for this one function than software implementations of relational systems on large general purpose computers. One of the ways the system is able to achieve this performance is through efficient data storage. The database of 570,000 tuples from 89 relations is currently stored in 42 Mbytes of disks space. When compared to the minicomputer database implementation the data have been compressed to less than one-quarter of its previous storage size. One consequence of providing specialized hardware support for the relational database function is the need for other general purpose front-end computers. These systems serve to translate data requests from the user into database machine format and to present the resulting data in human readable ’ Britton Lee. Inc., 14600 Winchester Blvd., Los Gatos, CA 95030.
MEDICAL
RELATIONAL
DATABASE
423
form. A large number of firms have written software to perform this front-end function for computers ranging from micros to mainframes. Front-end computers can interface to this database machine through either serial, parallel, block mode, and/or Ethernet connections. To meet the listed specifications the division is using IBM-compatible microcomputers as front-end computers. Currently there are 13 microcomputers which have access to the database machine via RS232 interfaces. These computers were chosen because of the low cost to expand the number of users on the system and the availability of a wide selection of software for processing data retrieved from the database. The Department of Epidemiology has linked their DEC VAX 780 to the database machine via an IEEE 488 interface. While the data sets from these two sources are currently separate these two disparate computer systems could easily share data if permission was granted within the security system. The software which is available on the front-end computers can be in the form of either higher level language interfaces, stand-alone query languages, or menu-driven access systems. The language interfaces are generally a package of subroutines for COBOL, C, BASIC, FORTRAN, or other higher level languages which give them the relational features needed to access the database. Such systems are useful for integrating the power of the database machine into applications which have already been written or for building a highly specialized system from scratch in which a procedural language is needed to perform required functions. The nonprocedural relational languages such as the Structured Query Language (SQL) are useful for ad hoc queries in which a high degree of flexibility is required (8, pp. 117-143). Such languages are easier to use than subroutine packages but they still require some skill in structuring individual queries. Higher level access systems on the other hand encompass features such as query by example which require little user sophistication and yet still retain a great deal of flexibility. One power enhancing feature of the database machine is its ability to store precompiled commands. This allows, for example, the creation of a complex retrieval using SQL which can then be executed from any software system. Given the range of activities performed by the division it was necessary to have the capabilities of each of these software options. The HDR2 SQL language used by the division allows the storing of query results into local files. This facility has proven to be useful for subsequent processing of retrievals with personal computer software packages such as word processors, graphics programs, spreadsheets, and statistical routines. FREEFORM3 is a screenoriented, menu-driven system which incorporates query by example and report generation into a unified package. Within each stage in the screen processing it is possible to program background functions in a PASCAL-like interpreter. z HDR Systems, Inc., 8404 Indian Hills Dr., Omaha, NE 68114. 3 Dimension Software Systems, Inc., 1717 Walnut Hill Lane, Suite 104, Las Colinas, TX 75038.
424
BUDD ET AL. a)
b)
select growth,count(grouth) from micro where dkn1985*” and group by growth
microbe-code:2005
select days-old/365.25, 02-sat. fefl/fvc from test t. fvctest f where t.id : f.id and t.dt :: f.dt and and 02-sat>0 and 02-sat<90 order by days-old
t.tm
: f.tm
FIG. 2. These are two examples of MIRDS retrievals using SQL where “id” “dt” is date, and “tm” is time.
is chart number.
User-defined subroutines, implemented in the Lattice4 C language, can also be linked to FREEFORM and subsequently called by this interpreter. By also using stored commands defined by the SQL language a great deal of power and flexibility have been written into an easy to use interface package. Finally, user written C language programs can be linked to Relational C routines (Britton Lee) to create stand-alone programs which can directly access the database. SYSTEM
FUNCTIONS
The multifaceted access options of MIRDS allow it to support many division functions. Research support is one of the most demanding of these functions. Research queries are primarily performed using SQL to store formatted data in a file for subsequent processing by statistical, spreadsheet, or graphics package. While the flexibility of SQL has been previously explored, actual examples can illustrate this in the context of a medical system. A query to select the number of low, medium, and high growths of Pseudomonas aeruginosa (code = 2005) in 1985 from the microbiology relation is shown in Fig. 2a. A query to select a list of age, arterial O2 saturation, and the proportion of a forced expiratory blow which occurs in the first second for all subjects with an arterial O2 saturation below 90% is shown in Fig. 2b. These data are retrieved from the “test” and “fvctest” relations and sorted by age. Database administration functions are available both within database utilities provided with SQL and within FREEFORM. These include machine wide functions such as reconfiguring ports, setting the database machine time and date, and listing databases with their owners; database functions such as backup, creation, destruction, extension, and access permission; relation functions such as creation, destruction, name changes, indexing, and access permission; and attribute functions such as adding to relations, removal, name changing, and access permissions. When relations are created they must be specified as logged or unlogged. Logged relations are automatically tracked by a transaction file which is maintained for each database on the database machine. This transaction file stores all the additions, deletions, and changes plus their time of occurrence. This file ’ Lattice, Inc., P.O. Box 3072. Glen Ellyn, IL 60138
MEDICAL
RELATIONAL
425
DATABASE
FREEFORHtm mm_-__ System Menu --w--m Enter Number of Selection Review patient information Create a report ::: Correspondence information 22. Research studies 23. Add or change subsets Append or change data i:: Verify data Request database changes f? . Pulmonary lab tests Create a plot 2:: Transfer a file 52. Electronfc memo ii.
Enter
the
selection
number
or
"00"
to
exit
FRBEFORH
FIG. 3. These options are presented on the pulmonary laboratory computer. A subset of these options are presented on other systems.
permits the administrator to roll the database back to a specific point in time or, if a data destructive crash occurs, to load a past database backup version and roll forward. The transaction file backups are performed weekly and the complete database backups are performed monthly from the VAX system’s tape drive. The remaining functions available on MIRDS are summarized in the menu presented on the laboratory microcomputer after entering FREEFORM (Fig. 3). Once the data structure is established the first step in using the database is the appending, or entering, of data (selection No. 31). Here FREEFORM’s screen generation features offer a high degree of flexibility. By embellishing the data entry screens they can perform data checks for range and data type and can cross reference data with other relations. For example, each patient number is compared to the master patient list to confirm that the number is correct and each date is ranged between the patient’s birth date and the current date. All of these data checks are performed as the data are being entered into the database with no reduction in keying speed. The second step in the data entry procedure is data verification (No. 32). In the previous database configuration it was impossible to verify the entered data so a complex scheme of data validation was devised (II). In the current configuration, however, a FREEFORM screen is created for each relation to perform this function. During this procedure, which takes about as long as data entry, keyed entries are compared with stored data and any confirmed errors are automatically changed in the database. Anyone familiar with the data or the database structure can use the “database change” screen (No. 33). This screen allows problems or requests for changes to be immediately noted in a format which assures prompt action by the database administrator. The data access screens for the patient record data are organized as background, event, outpatient, pulmonary, or laboratory information (No. 11).
426
BUDD ET AL. Staging
Screen
EXEC
MODE
y
Chart
nuber
lLUQJ&l
type
i8
r
Date Stage Problem Stage
code code
L
Problem
la
Stage
name name
Gastrointestinal BectalDrolaasa
FIG. 4. This is an example of a view (MODE = VI which was executed (EXEC = y) on a particular patient (1010101) for annual stagings (stage type = a). This is a joined screen with the names of the coded information appearing on the right.
When relations are created data items are often coded to make data entry easier and more consistent. When access screens are created these relations are joined with the code-name relations so a review of the data is more understandable (Fig. 4). In other instances two or more relations may be joined to make the presented data coherent and complete. All of the preformatting of the data and the menu structure of the access screens are designed to allow people who are less adept at access techniques to view and understand data without extensive training. Relations from long-term research projects are integrated into the menu as a separate selection (No. 22). The submenu under this selection links to study menus which access screens for the relations from that study. These screens are used for both data entry and retrieval since principal investigators are responsible for their own data. The adding or changing of subsets is an important function for research studies as well as for day-to-day database tasks (No. 23). The screens in this selection assure that no subsets will be created without defining their creator, their purpose, and their anticipated longevity. These screens also permit the adding or deleting of individual subjects as well as the deletion of an entire subset including the subject tuples with a single command. The correspondence screens (No. 21) can be used to retrieve telephone numbers and addresses of patients and family physicians, or to create an address list for any group of patients. Such mailing lists can be printed on labels for mass mailings such as newsletters or integrated with word processing packages for generation of individualized form letters. Report printing (No. 12) and plot making (No. 42) are designed to make data more understandable. These options present a number of formats for specific relations where standardized output is regularly required. The four-color digital plotter is used primarily for plotting pulmonary function test data or for longitudinal data plots used in determining patient progress (Fig. 5). The printer is
MEDICAL
RELATIONAL
U of M - F'EOIATRICS Fvc TEST
427
DATABASE WJECT,FEMLE 2222222-Z
5/25/87---2832 (* = PREDICTED)
VOLUME (LITERS)
- 1.
a
1.
3.
2 THE
4.c
-a
6ECUNO9
FIG. 5. Using a graph such as this, normally presented in multiple colors, physicians can compare a patient’s test effort to an expected effort from a well child of the same height and gender (marked by *I.
used for pulmonary function test reports (Fig. 6), outpatient therapy reports, and other formatted data retrievals. Two other functions, file transfer (No. 51) and electronic memo (No. 52), are useful because of the centralized structure of the database system. The file transfer utility allows the dumping of any host computer file in either ASCII or binary format to the database machine where it is stored as a block of data rather than as a relation. This data block can be retrieved by any other host computer connected to the database machine. The electronic memo is a feature which can take the place of phone calls or written messages. The most complex function of MIRDS is the performance of laboratory tests (No. 41). FREEFORM was used in this application so the laboratory technician-computer interactions could be designed using its screen generation capabilities. Database access is performed directly by FREEFORM or through
428
BUDD ET AL. University
of
Date:
Minnesota
Pediatric Pl
Pulmonary
Function
05/25/87
Time:
ID Number : 2222222.2 Name: SUBJECT, FEMALE DX: CF Room Relative
Height: AW : 59.1Kg Temperature: -50
since 02
Forced
Vital
c %
last
bronchial :
Capacity Best
Result Pred
73
(BTPS): % Pred
Meall
.89 1.95 2.89
FVC PF FEF.75 FEFl .O
2.06 5.39 1.77 1.51
3.56 6.94 3.80 3.18
57. 77. 46. 47.
VC./FET MEF.50 MEF.75 MMEFR
.61 1.30 .54 1.18
1.23 4.92 2.42 3.83
49. 2.6. 22. 30.
Mixed Small Percent
Airway Score: f2irw.y Score: Predicted Score:
on
maneuvers
Patient There Patient
FVC
might was
no was
better
coughing limited
44
Kg
.72 3.28 3.61
% Pred
cv
80 168 124
2.07 18.39 6.74
1.96 6.38 1.65 1.40
54 91 43 44
4.71 14.11 6.94 6.83
.54 1.18 .47 1.05
44 24 19 27
11.29 8.53 14.01 11.00
43. 29. 54.
t-e :
#
performed do
not
82. 134. 116.
46 32 53
procedu
m m Hg
93
2162 3.37
FVC
months
drainage.
FEVl/VC PF/FVC FET
3
I
760
Weight: AI : 74.4 OUTPATIENT
c
Saturation
Comment
FEMALE 28 years,
Pressure:
cm 37
hours
Arterial
20 60
107
20:32
Sex: Age :
Temperature: Humidity:
Laboratory
with
practice
during by
1 was
the pain
or test
or
best more
cooperation
procedure. fatigue.
FIG. 6. In this example report form from a forced vital capacity test the “Best” columns are the results from the best test maneuver while the “Mean” columns are the average results from the three best tests maneuvers. The “CV” column is the coefficient of variability and is used to judge the consistency of the patient effort. “AW” is allometric weight (predicted) while “Al” is allometric index (percentage of predicted weight).
stored commands created with SQL while the tests themselves are programmed in C routines linked to FREEFORM (Fig. 7). This system speeds up test performance and result availability. The technician no longer needs to enter background information on each patient before a test or key in the results and then verify them after the test is completed. In addition, for those who perform pre- and postbronchodilator or exercise tests the results of both procedures can be combined and presented in a unified format. After printing the results and plotting the data, the results are saved in MIRDS and the raw data
429
MEDICALRELATIONALDATABASE Microcomputer
Input-Output
Database
Machine
Board
FIG. 7. In the current software schema FREEFORM screens can access data in the pediatric database directly or through stored commands created with SQL. These screens access laboratory hardware through C routines linked to FREEFORM.
points are stored in a local file on the microcomputer’s hard disk. Each day these raw data files are saved on floppy disks. The date, which is in the title of each of these files, specifies in which file the raw data for each patient’s test resides. A file pointer stored with the test results can then be used to locate these data in this file once the correct floppy disk is loaded. This scheme permits the retrieval of raw data for trials of new analysis techniques or for test replotting. The next step in the functionality of this system will be to integrate it more closely with the clinical care functions of the division. The speed of the database machine and the flexibility available with personal microcomputers will reduce the need for paper records of the outpatient visit. One major clinical function, the updating and printing of individual patient therapy prescription lists, is currently being performed with this system. This procedure not only assures timely and accurate therapy prescription lists but also permits the long-term recording of the course of patient therapies. SYSTEM PERFORMANCE EXPERIENCE The division’s experience with MIRDS can be described by first reviewing the steps required for its creation and second by examining its current level of performance. Often one of the most difficult tasks in creating a relational database is designing a data structure which makes full use of the power of the relational model while minimizing missing data storage and maximizing interrelation connectivity and retrieval speed. By starting with an existing database, as the division did, this process becomes much easier. Once the individual relations are specified the attributes must be categorized into integer, floating point, or character and their maximum values or lengths determined. From this information FREEFORM permits the creation of any individual relation within minutes. A more difficult task for the division was to translate the minicomputer data formats, using individual BASIC programs, into the newly designed formats on the database machine. Given the ease of data transfer into the database machine, however, a computerized database with more flexible
430
BUDD
ET AL.
output options and a faster, nonserial interface could transfer a larger database in less than a day. Immediately after this data transfer stage retrievals can be performed using either SQL or temporary FREEFORM screens. With the data structure created and the data entered the next step is creating user interfaces for purposes other than data retrieval. Such purposes are exemplified by three major division tasks: the creation of the menu structure (Fig. 3). the integration of a laboratory function. and the application of a research result to a clinical task. In order to create the menu structure it was essential to be cognizant of the data and its intraconnections, the capabilities of the database machine, the features of the system software, and the needs of users. During the initial stages of this process a complete menu structure for data entry and retrieval was created in I programmer-month. The following 2 years have seen an evolution and growth of the menu structure as all aspects of the system have become better understood. The creation of the submenu structure for performing the laboratory tests has been more involved. Even so, for performing spirometry the creation of the screens for retrieving background information, performing the test, saving the data, creating the plot files, and printing the reports took only 2 programmermonths. This was in conjunction with translating the original minicomputer FORTRAN routines into microcomputer C routines and integrating them with FREEFORM which took an additional 4 programmer-months. This included time to learn the C language and understand the linking structures to FREEFORM. This spirometry structure is now stable and other tests are being integrated. The third example task involved a study which follows CF patients with daily home health diaries (12). A FORTRAN discriminate analysis routine had been written to categorize patients by projected health status based on home diary data retrieved into a file by SQL (13). This program was converted to C and integrated with the needed MlRDS retrievals using Relational C routines (Britton Lee) within I programmer-week. This program now permits the automated generation of clinical lists of patients on the home monitoring program along with their projected clinical status using a single microcomputer command. MIRDS’ high level of functionality would not be useful without a comparable level of performance. A description of this performance requires some background information on the database machine. The IDM 500 is a bus-based multiprocessor device with four types of processor boards: a database processor that controls the other boards and runs the bulk of the data management code, channel processors that control interfaces to host systems. disk controllers that also perform some data processing. and an optional database accelerator that helps with CPU-bound tasks. It can support 4096 host computers, 6 Mbytes of buffer memory used for disk cache and program space, and 32 Gbytes of disk storage. Within these constraints there could be any number of tuples within 50 self-contained databases with 32K relations per database and 755 attributes per relation for a maximum tuple width of 2 kbyte. It can support
MEDICAL
RELATIONAL
DATABASE
431
up to 400 simultaneous queries with no read locks but when queries and updates are performed 2-kbyte page level locking occurs. The access speed of MIRDS can be illustrated with a few examples by varying the number of users, the type of retrievals, and the mix of relations accessed. Simple queries such as retrievals of patient name given chart number take roughly 1 set with no increase in time with multiple users. When the retrievals in Fig. 2 are performed the first retrieval, from “micro” which contains 39,428 tuples of width 23 bytes, takes 35 sec. The second retrieval, from “test” with 24,904 tuples of width 72 bytes and “fvctest” with 25,006 tuples of width 56 bytes, takes 72 sec. When performing multiple user queries the retrieval time increases for the first example to 37 set for two concurrent queries and 45 set for three concurrent queries and for the second example to 115 and 167 set, respectively. These times would be lower if MIRDS had more than 1 Mbyte of memory, a database accelerator, and/or an upgraded processor board. These results are consistent with other IDM 500 performance evaluations. One demonstrated that query throughput increased 2.73 times as the number of users increased from 1 to 24 (14). Another explored the effects of relation size, tuple size, and retrieval type on access speed for single user retrievals (15). In a study of database benchmarks numerous retrieval parameters were explored. It found that retrieval times decrease with increased data sharing between queries and increase when queries begin to compete for buffer memory space (16). Finally one study found that an IDM 500 was 3.4 times faster than an INGRESS software relational system on a DEC VAX 1 l/782 when using a retrieval mix of nine different queries by 12 concurrent users (17). The database machine hardware has proven to be highly reliable. The only documented period of downtime was caused by an improperly formatted disk. Given that there has been no system failure downtime for over 9 months and no data destructive crash in over 2 years of use it is felt that the weekly/monthly backup schedule is adequate. This degree of reliability is echoed in two other accounts of this system (18, 19). DISCUSSION
The system described here has followed a somewhat different evolution than most medical database systems. By starting as a laboratory system it has emphasized the need to interact with laboratory instruments and to print timely laboratory results as well as provide long-term storage for both results and raw data. By providing for the needs of a major research effort on a stable patient population it has emphasized the long-term storage of a large portion of the patient record in a format which provides both speed and flexibility in database retrievals. The final stage of this system’s evolution, support of clinical care, is usually the first goal of medical databases. However, in this case the attainment 5 Relational Technology Inc., 1080 Marina Village Parkway, Alameda, CA 94501.
432
BUDD
ET AL.
of the first two goals provides a solid foundation for the successful implementation of a clinical support system. Anticipated progress will see the microcomputers being used for patient, clinic staff, and physician input of the necessary information. The long history of codified input of this data should make this direct input more feasible. The output will include graphical presentation of longitudinal data to the physician for review during the examination. Additional information from the clinical laboratories will be transferred directly from the hospital computer and formatted for entry into the database. This direct transfer plus longitudinal presentations of the data will permit a more timely and more encompassing overview of the patient chart when the data are being reviewed for possible therapy changes. Finally these changes will be entered directly by the physicians for the later printing of the outpatient therapy prescription lists. A number of well-established medical data systems are based on a hierarchical model because their designers see medical data as being intrinsically hierarchical in nature (20, 21). While the merits of this contention can be debated, the debate itself has little bearing on the practicality of using a relational model instead of a hierarchical one. Theoretically the relational model is a superset of the hierarchical model (10) and can be structured to mimic such a model. Some such structuring is performed in this implementation through the screen menus. However, by using newly created screens or SQL the full flexibility of the relational model can be exploited. One criticism of the relational model is that it does not permit the easy deletion of a person’s complete record (21). This has not been a concern in this system since it is designed for long-term storage. However, such a function could be provided through a series of stored commands or a Relational C program. The hardware-software solution provided by this system could serve as a model for a larger purpose. The interaction of clinical. research, and laboratory functions on a single system indicates the system has few functional limitations. The possibility of sharing data between disparate types of computers indicates the system has few connectivity limitations. Since the data storage and manipulation capabilities are centralized on the database machine and since peripheral computers serve as translators for their own unique machine and software-dependent formats, data can readily be shared among many types of computers through this machine. In a medical example, one could visualize a hospital with a database machine as the centralized depository of data for the administration’s mainframe. the clinical laboratories’ minicomputers, and the physicians’ individual personal computers. In conclusion, a relational database machine system which is accessed via microcomputers has proven to be a powerful and flexible system for the storage of medical record information. This technology shows promise fat providing even more functionality within the medical arena.
MEDICAL
RELATIONAL
DATABASE
433
REFERENCES 1.
2. 3. 4. 5.
BUDD, J. R., FINKELSTEIN, S. M., AND WARWICK, W. J. A minicomputer system for a pediatric pulmonary function laboratory. In “Proceedings, 32nd ACEMB,” p. 200. IEEE, New York, 1979. BUDD, J. R., WARWICK, W. J., FINKELSTEIN, S. M., AND WIELINSKI, C. L. A medical information system for research and clinical care. In “Proceedings, 36th ACEMB,” p. 220. IEEE, New York, 1983. GRONER, G. F., PALLEY, N. A., HOPWOOD, M. D., SIBLEY, W. L., AND FISHMAN, B. “CLINFO Users Guide: Release 3.” Rand, Santa Monica, CA, 1977. FERRIS, B. G. Epidemiology standardization project. Amer. Rev. Respir. Dis. 118,1 (1978). BRASFIELD. D., HICKS, G., SOONG. S., PETERS, J., AND TILLER, R. Evaluation of scoring systems of the chest radiograph in cystic fibrosis: A collaborative study. Amu. J. Roentgenol. 134,
1195 (1980).
6. BUDD, J. R., FINKELSTEIN, S. M., AND WARWICK, W. J. An automated preschool pulmonary function test. In “Proceedings, 5th SCAMC,” pp. 445-449. IEEE, New York, 1981. 7. LEAVELL, K. L., FINKELSTEIN, S. M., WARWICK, W. J., AND BUDD, J. R. Automated noninvasive determination of mixed venous pCOz. Med. Znstrum. 20, 247 (1986). 8. DATE, C. J. “An Introduction to Database Systems,” Vol. 1. Addison-Wesley, Reading, MA, 1981. 9. WIEDERHOLD, G. Database technology in health care. J. Med. Syst. 5, 175 (1981). JO. MARTIN, J. M., JABOT, F., AND MARCEL, P. How to organize the medical data of chronically ill patients in the computer. Methods Inf. Med. 24, 5 (1985). II. FINKELSTEIN, S. M., BUDD, J. R., EWING, L. B., WIELINSKI, C. L., WARWICK, W. J., AND KUJAWA, S. J. Data quality assurance for a health monitoring program. M&hods Inf. Med. 24, 192 (1985). 12. FINKELSTEIN, S. M., BUDD, J. R., WARWICK, W. J., KUJAWA, S. J., WIELINSKI, C. L., AND EWING, L. B. Feasibility and compliance studies of a home measurement monitoring program for cystic fibrosis. J. Chron. 0s. 39, 195 (1986). 13. BROWN-EWING, L. J., FINKELSTEIN, S. M., BUDD, J. R., RICH. S., KUJAWA. S. J.. WIELINSKI, C. L., AND WARWICK, W. J. A rule for the early detection of chronic changes in cystic fibrosis patient status. J. Chron. Dis., in press. 14. TOSSY, M. “IDM Performance Test 1: Simple Retrieves Varying the Number of Users.” Britton Lee Technical Report, 1983. 15. BOGDANOWICZ, R., CRACKER, M., HSIAO, D. K., RYDER, C., STONE, V., AND STRAWSER. P. Experiments in benchmarking relational database machines. In “Database Machines” (H. 0. Leilich and M. Missikoff, Eds.), pp. 106-134. Springer-Verlag, New York, 1983. 16. BORAL, H., AND DEWITT, D. J. “A Methodology for Database Performance Evaluation,” Computer Science Technical Report No. 532. University of Wisconsin, Madison, WI, 1984. 17. BITTON, D.. AND TURBYFILL, C. “Design and analysis of multi-user benchmarks for database systems.” Computer Science Technical Report 84-589. Cornell University, Ithaca, NY, 1984. 18. RIECHMANN, C. IDM-500 within a mainframe environment-Some first experiences. In “Database Machines” (H. 0. Leilich and M. Missikoff, Eds.), pp. 106-134. Springer-Verlag, New York, 1983. 19. SCHUMACHER, G. GEI’s experience with Britton-Lee’s IDM. In “Database Machines” (H. 0. Leilich and M. Missikoff, Eds.), pp. 106-134. Springer-Verlag. New York, 1983. 20. BARNETT, G. O., SOUDER, D. E.. BOWIE, J. E., AND JUSTICE, N. MUMPS: A support of medical information systems. Med. lnformatics 1, 183 (1976). 21. KING, C., MANIRE, L., STRONG, R. M., AND GOLDSTEIN, L. Data management systems in clinical research. In “Information Systems for Patient Care” (B. I. Blum, Ed.), pp. 404-415. Springer-Verlag, New York, 1984.