7 Computerized automation of clinical trials

NICHOLAS JOHNSON
RICHARD J. LILFORD
There are eight problems associated with running any clinical trial:

1. Mediocre literature searches
2. Cheating
3. Poor recruitment
4. Ambiguity with the protocol
5. Poor compliance
6. Missing and erroneous data
7. Unblinded assessment of outcome
8. Poor data processing and analysis
Computers can help with all of these problems. The role of the computer in clinical trials ranges from using a hand calculator to add up results to full automation of a trial including selection, recruitment, data collection, processing and analysis. Once such a fully automated trial is underway the only human action is to interpret the computer's conclusion. In this chapter the standard uses of computers in data processing and analysis are considered, and the more revolutionary concept of trial automation is also discussed.

AUTOMATED LITERATURE SEARCHES
Although computerized literature searches do have their limitations (Schoones, 1990), they are an excellent way to begin planning a clinical trial. The British Library offers on-line access to biomedical and toxicological databases from the National Library of Medicine in Bethesda, Maryland, USA, via its BLAISE-LINK service. A wide range of databases is available, including MEDLINE (Index Medicus On-Line), POPLINE (Population Information On-Line) and BIOETHICSLINE. MEDLINE is the service most commonly used by obstetricians and gynaecologists and contains more than 4 750 000 records from articles in over 3000 biomedical journals. SDILINE (Selective Dissemination File) contains the current month's additions only. POPLINE covers research into human fertility, contraception and other related issues, and BIOETHICSLINE has information on ethics and morality.
The databases are accessed via a local telephone call to the nearest PSS (packet switch stream) node. To gain access, the user needs a password, a personal computer terminal, a telephone line, a modem, an NUI (network user identifier) from British Telecom and a current subscription to BLAISE-LINK. Once contact has been made with a local node, the user is prompted for his/her identification code before being connected to the PSS network. From this network the address code for BLAISE-LINK transfers the call to IPSS (International Packet Switching Service), which activates the satellite link to the computer in Maryland.

The most important method of searching the databases is by subject. All records on MEDLINE and SDILINE have been indexed with a controlled vocabulary known as MeSH (medical subject headings). To search efficiently it is essential to have access to a printed list of MeSH headings. Because MEDLINE is a computerized version of Index Medicus, the MeSH headings are the same and are available from Index Medicus or the beginning of the BLAISE-LINK user manual. Subjects that are too new to have been assigned a MeSH heading may be located using keywords from titles or abstracts. The search may be restricted to particular journals or sets of journals, to relevant years or to languages.

CD-ROM (Compact Cambridge, Bethesda, Maryland) is a database containing references from Index Medicus and includes some abstracts. The data are stored on annual optical discs and accessed with a microcomputer. Most randomized perinatal trials (including unpublished trials) are listed in the Oxford Database of Perinatal Trials (Chalmers, 1988). This database can be loaded on an IBM-compatible microcomputer with a disc storage capacity exceeding 11 megabytes.

AUTOMATION OF RECRUITMENT INTO TRIALS

The computer may play an active or passive role in recruiting subjects for a clinical trial. A sophisticated computer program can identify suitable subjects and recruit them automatically. At a slightly lower level of automation, systems may require the clinician to decide whether or not to recruit subjects and provide 'help' screens to remind users of eligibility and exclusion criteria. More primitive systems may simply remind the clinician about ongoing trials. Finally, the computer may act entirely as a passive database, requiring retrospective data entry prior to statistical analysis.

Although there are no well-known published examples of randomized clinical trials where recruitment was entirely performed by the computer, such trials could easily be constructed without offending ethical standards. For example, if a team of labour ward registrars held polarized views on the advantages of ventouse extraction over forceps, a duty roster could randomly allocate a supporter of either ventouse extraction or forceps to any day. As many labour ward computers already collect birth details, including the name of the attendant, data collection could also be automated. Similarly, if two colposcopists treating cervical dysplasia used either laser or cautery exclusively, but followed similar protocols in all other respects,
the outpatient computer could randomly allocate the 'Dear consultant, please see this abnormal smear and treat' requests to either colposcopist. Provided that the computers in the cyto/histology and outpatient departments were linked, unbiased comparisons of recurrence rates and other outcomes following laser and cautery could be made.

Although few randomized clinical trials involve automatic recruitment, automation can be applied to concurrent but non-randomized trials. An example of this type of trial is a comparison of two different policies practised in the same hospital. Subjects are already recruited into groups (e.g. consultant units) and outcome data are available from the labour ward computer used for birth notification, Korner returns etc. The computer can be programmed to exclude from the study cohorts of patients who may unbalance recruitment or fail to fulfil certain entry criteria. For example, if one unit serves a satellite clinic in a wealthy village and the other holds a clinic in a nearby industrial town, subjects with these addresses could be excluded automatically. Examples of such natural experiments include a comparison of active management of labour versus no intervention, a comparison of different induction of labour techniques, or a comparison of different antenatal care regimens.

The classic example of a fully automated clinical trial is audit using historical controls. By definition this is a non-randomized and non-concurrent trial, and the results following a new intervention are compared with the outcome from a previous series. Recruitment is automatic if the data have already been stored on a computer for another purpose. For example, it would be easy to perform a computerized comparison of the abnormal smear report rate before and after a change in policy (e.g. following legal action taken against another cytology department for alleged false negative reports). The cytology department computer automatically records data from all smears and it is simple to compare 'before and after' reports. Equally simple would be a trial evaluating the effect of introducing an epidural service on forceps and caesarean section rates. The effect of a change in policy or introduction of a new technology can be evaluated in this way if the data are already available on computer.

The second source of historical control data is the literature. Although these observations are subject to bias, this may be the only available control group (Moertel, 1984). Data on the active group can be recruited automatically from the computer's database, and data from the control group are obtainable using a computer literature search (Smith, 1987).

Computerized recruitment of individual subjects requires sophisticated programming skills and assumes informed consent (Lilford, 1990). However, a randomized trial can be designed without the need to approach subjects with the idea of randomization and without the need for complex computer technology. These study designs rely on group allocation (composite randomization design). Groups of individuals, clinics or communities are randomized into groups. For example, in a multi-centre ovarian cancer chemotherapy trial each unit would be allocated a different chemotherapy regimen. Data would be collected from a microcomputer in each unit and sent down telephone lines intermittently to the central investigator.
Once the study is underway, recruitment to the central station is automatic until the trial is terminated.

In many trials, especially randomized trials, fully automatic recruitment and entry of individual subjects is seldom, if ever, acceptable, because adequate counselling cannot be achieved by a computer or written leaflet. However, randomized trials can be semi-automated, and highly sophisticated computer technology can be used to identify suitable patients and to encourage clinicians to enter such patients into a trial. Consider a comparison between laser ablation and laser excision of CIN III. Women with an abnormal smear who have had a colposcopically directed biopsy analysed will return to the laser clinic for treatment. A microcomputer in the laser suite linked to the main-frame computer in the histology department can retrieve the colposcopic biopsy report and other patient details. The computer can recognize details fulfilling the entry criteria and could prompt the clinician to enter the subject into the trial. If a suitable subject is not entered, the computer could request an explanation. This will improve recruitment by making it uncomfortable for the colposcopist not to participate without good reason. Once the patient agrees to participate in the trial, the computer can randomly allocate the subject to either group and collect subsequent follow-up data from its link with the computer in the cyto/histology department.

Computers used specifically for data collection can be programmed to remind clinicians about a clinical trial. In this situation the computer plays a passive role in recruitment, but once it has been established that the subject is eligible, fully informed and willing, the computer may automatically randomize subjects and collect data. Imagine, for example, a randomized controlled trial comparing two labour ward policies: routinely rupturing the membranes of all women admitted in labour, and leaving the membranes intact unless amniotomy is specifically indicated. On admission to the delivery suite, patient data are entered into the labour ward computer in the usual way. The admissions officer is presented with a screen asking: 'Is your patient suitable for the rupture membranes trial?' A help screen containing the entry and exclusion criteria is available. The screen is cleared if the woman is unsuitable or unwilling, but if she is enrolled into the trial the computer randomly allocates her to one or other group and informs the user accordingly (a sketch of this screening and prompting logic is given at the end of this section). Outcome variables are collected alongside general data entered for Korner and birth notification. Automatic recruitment is worthwhile only if trials require a large sample size and if the end-points are already measured and stored on a database.
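By way of illustration, the following minimal sketch (in Python, with entirely hypothetical field names and entry criteria; no real trial protocol is implied) shows the logic of such a recruitment prompt: an admission record is screened against the entry criteria, the admissions officer is prompted, and an explanation is requested whenever an eligible patient is not entered.

```python
# Illustrative sketch only: hypothetical entry criteria for a 'rupture membranes'
# style trial; a real protocol would define these precisely.
from dataclasses import dataclass

@dataclass
class Admission:
    gestation_weeks: float
    cervical_dilatation_cm: float
    membranes_intact: bool
    singleton_pregnancy: bool

def eligible(a: Admission) -> bool:
    """Assumed entry criteria, for illustration only."""
    return (a.singleton_pregnancy and a.membranes_intact
            and 37 <= a.gestation_weeks <= 42
            and a.cervical_dilatation_cm < 6)

def screen(a: Admission, reasons_log: list) -> bool:
    """Prompt the admissions officer; log the reason if an eligible woman is not entered."""
    if not eligible(a):
        return False
    if input("Is your patient suitable for the rupture membranes trial? (y/n) ").lower().startswith("y"):
        return True
    # Requesting an explanation makes it uncomfortable not to participate without good reason.
    reasons_log.append(input("Eligible patient not entered; please give the reason: "))
    return False

log: list = []
entered = screen(Admission(39.0, 3.0, True, True), log)
```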
AUTOMATING THE RANDOMIZATION PROCESS

Not all clinical trials require randomization, but randomization minimizes selection bias, produces comparable groups and ensures the validity of the statistical analysis. The technique chosen to implement the randomization process is very important, for if randomization is compromised, all of its advantages are lost. Unfortunately, 'cheating' in randomized clinical trials is common. Chalmers et al (1983) examined the randomization process of 102 clinical trials. The process was unknown and hidden from the investigator in 57 trials, and 14% of these studies had at least one baseline variable unbalanced. There were 45 trials where the randomization technique was known, and these studies had twice as many maladjusted prognostic variables (26.7%).

There are several possible randomization techniques: envelopes, coded vials, coded tablets, randomization by telephone or randomization by computer. Envelope systems are subject to errors and tampering. In one study an investigator opened the envelopes and arranged the assignments to fit their own preferences, accommodating friends and relatives in the trial (Friedman et al, 1985). In two local trials midwives appear to have opened the envelopes, decided that they preferred the alternative option and then drawn a second envelope (Griffith-Jones et al, 1990; Barrett et al, 1989). Many studies use the telephone system of randomization to protect against this problem. This is expensive, time consuming and requires dedicated personnel, especially if 24-hour patient entry is required.

Randomizing subjects by computer solves these problems. Computers make it impossible to pre-empt or find out which arm of the trial a subject will be allocated to, and because the computer can recognize when a subject is re-entered it is impossible to change treatment allocation without detection. There are no envelopes to swap or lose, and randomly allocated treatment vials cannot be examined and deliberately broken so as to select another. If the investigator decides to change the allocated treatment, the initial randomization code is known to the computer, so that outcome can be analysed against the original unbiased group allocation. For example, a proposed ventouse extraction/forceps trial using a primitive randomization system such as randomization by duty roster could be flawed if registrars swap duties. However, when the trial is automated the swap is detected because the name of the attending surgeon will not match the name on the roster. Although an occasional change in allocation of registrars will weaken the power of the study, its conclusion is still unbiased if outcomes are compared in the original randomized groups. In trials contaminated by cheating, changes in the initial random allocation cannot be identified; no honest conclusions can then be drawn and the results may be misleading.

The second advantage of computerized randomization is that the trial is not restricted to the hours when the central investigating unit is staffed by telephone. Computers are easy to use and they eliminate the daunting and tedious task of trying to contact investigators, and therefore recruitment is improved.

The third advantage of computerized randomization systems is that complex and more powerful randomization techniques can be used. The most elementary form of randomization is 'simple randomization': subjects are allocated to the treatment or intervention group on the toss of a coin (or from a random-number table etc). However, in small studies there is a danger of recruiting an unequal number of subjects into each group, thereby reducing power. The solution is to use block randomization or adaptive randomization; both methods ensure equal allocation into each group.

Block randomization (synonym: permuted-block randomization) randomly assigns subjects to small subgroups (blocks) made up of an equal number of places for each arm of the study.
For example, if the block size is four, each block will contain two allocations for group A and two for group B, in any of the following orders: AABB, ABAB, ABBA, BBAA, BABA and BAAB. Any of these arrangements may be selected, and the block size can be varied to prevent an astute staff member from spotting the pattern and manipulating the entry of the last subject in a block. Adaptive randomization (minimization) attempts to balance the number of subjects in each treatment group based on previous allocations. If an imbalance begins to occur, the randomization process is manipulated by weighting the allocation towards the group with fewer subjects (biased coin). Both of these techniques prevent gross imbalances and can be specified prospectively in the randomization code.

However, in many obstetric, gynaecological or oncology trials there are multiple prognostic variables (e.g. parity, age, gestational age etc) and, if the trial has a small sample size, there is a considerable risk that these variables may become unequally distributed between the groups. To prevent this imbalance, groups should be stratified before randomization. Stratified randomization helps achieve comparability between the groups for each factor. Each variable or prognostic factor is identified at recruitment and subjects are divided into the various strata (subsets). Each stratum is randomized separately using block or adaptive randomization. This ensures a similar number of subjects within each group and each subgroup. For example, if Metrodin and Pergonal are to be compared with respect to the success of ovulation induction, it would be desirable to have similar numbers of subjects in each of the following subgroups: old or young, luteinizing hormone/follicle stimulating hormone ratio greater or less than 3, presence or absence of tubal pathology or sperm abnormality. If more older subjects were recruited into one group, an imbalance created by simple randomization would reduce the power of the study.

It is difficult to randomize separate and multiple subgroups using envelopes or treatment vials because each subgroup requires a different set of randomization codes. White and Friedman (1978) have implemented a special card index system, but automatic stratified randomization is easy to produce from a simple computer program utilizing data entered by the admissions clerk (a minimal sketch follows). This relieves the investigator from having to decide which subgroup each subject belongs to and from choosing an envelope containing a random number from one of many boxes. Computerized stratified randomization maintains the security of prospective randomization and minimizes cheating with the minimum of effort and cost.
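A minimal sketch of stratified, permuted-block randomization with a re-entry lock is given below (Python; the two-arm design, block size and strata are illustrative assumptions, not taken from any published protocol). Because the allocation table is keyed on the hospital number, a subject entered twice simply receives the original allocation again, so the outcome can always be analysed against the first, unbiased assignment.

```python
import random

ARMS = ["A", "B"]
BLOCK_SIZE = 4                 # could itself be varied to hide the pattern

blocks: dict = {}              # stratum -> remaining allocations in the current block
allocations: dict = {}         # hospital number -> arm (the re-entry lock)

def _new_block() -> list:
    block = ARMS * (BLOCK_SIZE // len(ARMS))   # equal places for each arm
    random.shuffle(block)
    return block

def randomize(hospital_number: str, stratum: tuple) -> str:
    """Allocate within the subject's stratum; re-entered subjects keep their arm."""
    if hospital_number in allocations:
        return allocations[hospital_number]
    if not blocks.get(stratum):
        blocks[stratum] = _new_block()
    arm = blocks[stratum].pop()
    allocations[hospital_number] = arm
    return arm

# Strata built from prognostic factors entered by the admissions clerk (illustrative).
print(randomize("H123456", ("age >= 35", "LH/FSH > 3")))
print(randomize("H123456", ("age >= 35", "LH/FSH > 3")))   # same arm returned on re-entry
```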
AUTOMATED FOLLOW-UP

A major difficulty facing investigators involved in clinical studies is ensuring timely follow-up of subjects once they have been recruited. Investigating units lacking the advantage of automated recall are haunted by lost addresses, and investigators are continually having to remember when to arrange the next visit. A computerized automated recall system eliminates these irritations, and provisional experience suggests it can reduce unexplained defaulting (Leduc et al, 1984; Fulcher and Burris, 1988).

Automated recall systems are based on databases, and a simple system sends out regular follow-up letters to all subjects who fulfil the follow-up criteria. For example, a regional prenatal diagnostic clinic will have data on many women whose previous pregnancy has been complicated by a chromosomal abnormality. Using a system based on the outpatient department's computer to send out appointments, a computerized prenatal diagnostic clinic can automatically send out letters to general practitioners or patients at regular intervals until the patient is 50 years old, asking for data on subsequent pregnancies. Failure of response can be followed by a second letter etc, and if both general practitioner and patient fail to reply the computer can generate a list of phone numbers or addresses for the genetic counsellor to visit.

Some studies require more precise follow-up. For example, a study of recurrence rates following laser cone or surgical cone biopsy requires follow-up at fixed intervals. Two programs, CALENDER and APPOINT (Fulcher and Burris, 1988), generate follow-up appointment dates for each patient and identify defaulters.
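The sketch below illustrates, in Python, the kind of logic such recall programs embody: follow-up dates are generated at fixed intervals after treatment and any overdue, unattended visit is flagged as a default. The intervals, identifiers and 30-day attendance window are illustrative assumptions rather than features of CALENDER or APPOINT themselves.

```python
from datetime import date, timedelta

FOLLOW_UP_MONTHS = (6, 12, 24)       # assumed schedule, e.g. after cone biopsy

def follow_up_dates(treated: date) -> list:
    """Approximate follow-up dates at fixed intervals after treatment."""
    return [treated + timedelta(days=30 * m) for m in FOLLOW_UP_MONTHS]

def attended(due: date, visits: list) -> bool:
    return any(abs((visit - due).days) <= 30 for visit in visits)

def defaulters(treatment_dates: dict, attendances: dict, today: date) -> list:
    """List (patient, due date) pairs where a past follow-up visit was never attended."""
    overdue = []
    for patient, treated in treatment_dates.items():
        for due in follow_up_dates(treated):
            if due < today and not attended(due, attendances.get(patient, [])):
                overdue.append((patient, due))
    return overdue

treatment_dates = {"P001": date(1990, 1, 10), "P002": date(1990, 3, 5)}
attendances = {"P001": [date(1990, 7, 12)]}
print(defaulters(treatment_dates, attendances, today=date(1990, 12, 1)))
```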
AUTOMATED DATA COLLECTION

The trial is only as good as its data. There are three problems with data collection: (i) missing data; (ii) erroneous data; and (iii) variations in the way the same observation is reported.

Missing data

Missing data (incomplete or irretrievable data) are a source of great frustration to the investigator as notes are lost, laboratory reports are missing and information is not recorded. Computer collection acquires significantly more data and is associated with fewer omissions than pen-and-paper worksheets (Friedman et al, 1983). This beneficial effect is unique to the computer and does not carry over into subsequent periods when the computer system is removed or rendered temporarily inoperative (Barnett et al, 1978; McDonald et al, 1980).

There are two reasons why computers should be less likely to miss data. The first relates to the way in which data are entered into a computer database: the system requires the user to move the cursor past an empty space to omit an entry, whereas omitting an entry on paper is simply too easy. The second explanation relates to prompts. For example, when treatment side-effects are manually recorded, rare side-effects may be overlooked and common ones will be accepted as normal and not worthy of note. Kent et al (1985) examined missing data from a clinical trial comparing different chemotherapeutic agents in which data were recorded either manually or by computer. Automated data collection improved the frequency with which drug toxicity was recorded from less than 1% to 45%, and general chemistry results from 36% to 82%.
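The difference between the two media can be made concrete with a small sketch (Python; the field and side-effect lists are invented for illustration): every field on the electronic form must receive either a value or a deliberate 'unknown', and rare side-effects are prompted for one by one instead of being left to the recorder's memory.

```python
# Illustrative only: field names and side-effects are assumptions, not a real case report form.
FIELDS = ["haemoglobin (g/dl)", "white cell count", "platelet count"]
SIDE_EFFECTS = ["nausea", "alopecia", "peripheral neuropathy"]

def collect_record() -> dict:
    record = {}
    for field in FIELDS:
        value = input(f"{field} (value, or 'unknown'): ").strip()
        while not value:
            # Unlike a paper worksheet, the cursor cannot simply pass over a blank space.
            value = input(f"{field} must be completed (value, or 'unknown'): ").strip()
        record[field] = value
    for effect in SIDE_EFFECTS:
        # Prompting for each side-effect stops rare ones being overlooked.
        record[effect] = input(f"{effect} present? (y/n): ").strip().lower() == "y"
    return record
```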
Erroneous data
The second problem with data collection is the inclusion of inaccurate information. For example, our hospital managers were told that we performed six hysterectomies on men last year and discharged several women 6 hours after hysterectomy. The first error was due to carelessness and the second occurred because the clerk coding the data misread hysterectomy for hysteroscopy on the senior house officer's discharge summary.

Quality control is the key to minimizing erroneous data. This involves selecting random cases and rechecking each piece of data from the original source. However, this retrospective check system is limited because errors may be repeatedly missed owing to illegible writing or carelessness, and once the event has passed it may be impossible to recover the true result. Checking the data at the time of entry will minimize errors and allow entry of corrected data. For example, if a patient's weight is recorded as 7 kg, retrospective analysis of the data can only detect an error and the truth may never be known, whereas prospective checking prompts the operator to enter the data correctly. Such checking systems are limited to automated computerized systems with error-traps (data validation rules) (Kronmal et al, 1978; Karrison, 1981; Friedman et al, 1983).

There are two types of automated error-trap: range checks that detect values outside a defined range, and consistency checks that test for inconsistent and incompatible data. An error-trap may reject data, preventing the program from continuing until realistic data are entered, or the trap may recognize unusual data and flag a message asking the operator to recheck and, if appropriate, re-enter the data. For example, an input of 440 weeks for gestational age would be incompatible with life and the data would be rejected; an entry of 44 weeks would activate a flag: 'Are you sure? Please re-enter the data.'

The consistency checking procedure is powerful and can be sophisticated. Primitive systems may recognize an admission date of 1.1.90 with a discharge date of 2.1.89 as impossible, and two potassium results a week apart of 3.3 and 5.5 mmol/l as unlikely. Sophisticated consistency check systems may accept this unusual change in potassium if it accompanies a rise in urea. An additional advantage of consistency checks is that the computer can flag a warning. For example, if subjects with ovarian cancer treated with cyclophosphamide are studied and the white cell count is recorded as 2.5 × 10³, the operator may be greeted with a warning 'danger of myelosuppression: reduce dose of cyclophosphamide', and the computer may ask for additional information, e.g. 'enter platelet count' or 'enter new dose of cyclophosphamide', a theme taken up in Chapter 3. A simple sketch of both kinds of error-trap is given below.
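The following minimal Python sketch shows one range check and two consistency checks; the limits, flags and field combinations are illustrative assumptions rather than validated clinical rules.

```python
from datetime import date

def range_check_gestation(weeks: float) -> str:
    """Range check: reject impossible values, query unusual but possible ones."""
    if weeks < 20 or weeks > 45:
        return "reject: re-enter data"
    if weeks > 42:
        return "flag: are you sure? please re-enter or confirm"
    return "accept"

def consistency_checks(admitted: date, discharged: date,
                       potassium_before: float, potassium_after: float,
                       urea_rising: bool) -> list:
    """Consistency checks across fields; illustrative thresholds only."""
    problems = []
    if discharged < admitted:
        problems.append("discharge date precedes admission date")
    if abs(potassium_after - potassium_before) > 1.5 and not urea_rising:
        problems.append("large change in potassium without a rise in urea: recheck")
    return problems

print(range_check_gestation(44))                       # flagged for confirmation
print(consistency_checks(date(1990, 1, 1), date(1989, 1, 2), 3.3, 5.5, urea_rising=False))
```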
Variations in data interpretation

The third problem with collecting data for clinical trials can be getting all investigators to understand precisely what data are required. Hard end-points are easy to quantify (e.g. death or survival), but dichotomous end-points such as death are uncommon in obstetrics and gynaecology. If a trial is to have a reasonable chance of detecting a change in perinatal mortality rate using death as the end-point, the sample size would need to be in excess of a third of a million subjects (Lilford, 1987).
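The arithmetic behind that statement can be sketched with the standard sample-size formula for comparing two proportions (Python). The mortality rates, significance level and power below are illustrative assumptions, chosen only to show why rare, dichotomous end-points demand enormous trials; they are not the figures used by Lilford (1987).

```python
from math import ceil

def n_per_group(p1: float, p2: float, z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Approximate subjects per arm: two-sided 5% significance, 80% power."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Detecting a fall in perinatal mortality from 10 to 8 per 1000 (assumed rates):
print(n_per_group(0.010, 0.008))    # roughly 35 000 subjects per arm, about 70 000 in total
```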
To reduce the sample size, surrogate outcomes are often measured, but these may be soft end-points and difficult to interpret. Blood pressure (Rose et al, 1964; Wilcox, 1961), cardiotocographs (Barrett et al, 1990) and X-ray pelvimetry are all interpreted differently. When a data-notification sheet asks an investigator to record diastolic blood pressure, does it mean Korotkoff IV or V; what precisely is a type II deceleration on a cardiotocograph; and what is an abnormal electrocardiogram? Cardiologists who examined the same ECG on two occasions disagreed with themselves in one in eight cases (Smyllie et al, 1965), and a similar observer variation exists with cardiotocographs and Apgar scores (Barrett et al, 1990). Ambiguous outcome definitions should be clarified before a trial, but variable data expression is usually noted only in the middle or at the end of the study.

Automated trials have their protocols specified in advance, and the encoding process may reveal serious omissions or inadvertent ambiguities in the protocol documentation. These faults may go unnoticed by a clinician but they are obvious to a computer programmer. For example, four oncology protocols combining chemotherapy and radiotherapy failed to specify precisely when the radiotherapy should have been given. It was not until these protocols were encoded for computerization that the designers of these studies became aware that the timing of radiotherapy had been omitted from the original protocol description and treatment had been given with ad hoc timing. Inadvertent ambiguities can cause deviations from the protocol in 40-50% of cases (Cortez et al, 1983; Hickam et al, 1984). Although clinicians may still follow the spirit of the trial, it is not surprising that they are ready to use personal discretion when the protocol specifies vague instructions such as 'treatment (chemotherapy) may be continued in a modified fashion if counts are borderline' (Musen, 1987).

Writing protocols that are unambiguous is not easy, particularly when the treatment regimen is complex, as in clinical oncology. The total number of specifications is enormous and the effect of multiple stipulations all occurring simultaneously can be hard to anticipate. However, writing a protocol based on similar previous studies is particularly easy if the protocol is stored on a word-processor. A computer-based protocol authoring system has the advantage that knowledge remains organized and that all details are specified. The framework for an oncology system, termed OPAL, is under development, linking an advice system with automated protocols (Musen et al, 1987). The disadvantage of these systems is that the objective of the trial changes from a comparison of a treatment regimen used in practice to a comparison of a regimen used according to one specific protocol. The question that a 'pragmatic' trial should answer is 'which treatment is superior taking into account the variables of clinical life' rather than 'which drug is superior according to a specific inflexible protocol'.

In Leeds, a new computer-aided clinical trial design tool called the medical significance explorer (MSE) is under development.
It allows the clinician to weigh all the outcome variables and explore the significance of each parameter before the protocol is formalized. This should reduce the number of changes made to the trial protocol, minimize ad hoc comparisons, and prevent clinicians from performing a study without prospectively defining what it is they want to prove. The system then runs a trial design system (TDS) to produce a protocol. For example, if a unit wanted to examine the benefits of vaginal delivery compared with caesarean section for premature breech babies, the clinicians would ascribe a value (utility) to the neonatal morbidity associated with either mode of delivery and score maternal and future maternal risks and economic variables. MSE would identify the variables that need to be measured, and TDS designs the trial and calculates the sample size.

QUALITY CONTROL

Poor quality data can be minimized by designing unambiguous protocols, using clear concise data-sheets with explanations, standardizing laboratory techniques, identifying vague measurements, supervising and training investigators to make those measurements, and audit. Audit permits investigators to identify the features of a protocol that yield a high proportion of unsatisfactory data early in the trial. In randomized trials the randomization codes should be periodically checked for missing numbers, lost envelopes etc. Drugs should be checked for expiry dates and contamination, and data-forms should be examined for completeness, inconsistencies and default rates.

Automation of clinical trials will improve data quality by minimizing deviations from the randomization code and by reducing the omissions and errors that have been documented with manual systems (Bagniewski et al, 1983). Paper data-sheets are not required when data are collected directly into a computer terminal, and direct data exportation eliminates the errors of data transfer. For example, the original Leeds Ruptured Membranes Trial (Barrett et al, 1989) was a manual clinical trial designed to ask the question, 'Should membranes be ruptured in every woman who is admitted in labour or should they be left intact?' Suitable patients were randomized by envelopes, and at the end of the study 362 case notes were recalled and examined. Data were transferred to data-sheets and then on to a computer statistics package. These are standard techniques, but quality control revealed that 32 randomization envelopes were lost and that some may have been opened and the random allocation changed. After the data had been extracted it was noted that an important parameter had not been recorded and all the case notes had to be re-examined; twenty case notes could not be retrieved. Some data were corrupted as they were transferred to data-sheets and further typographical errors occurred at the statistical input stage. Although these errors were detected, they are typical of a large clinical trial involving multiple investigators. It is important that data are accurate and complete because the accuracy of data is often the only aspect of the study that cannot be validated by external referees. To improve data quality the trial was repeated, but this time it was semi-automated.
The labour ward computer randomized patients using a stratified block randomization code, and data collection was superimposed on the standard delivery details. After 300 patients had been entered in the trial, quality control revealed that several patients had been entered into the trial on more than one occasion, in some cases up to three times within 10 minutes. Clearly one or two workers were doing this to get the allocation they preferred for their patients. The trial program was altered to ensure that once a patient had been allocated to a group they remained within that group. After delivery a different midwife usually enters the birth notification data, including the time of membrane rupture, and this should identify cheating so that the groups can be analysed by the initial allocation. However, we have to accept that the power of the study will be reduced by including patients who are not in labour and who may have spontaneously ruptured their membranes when they re-present on the labour ward. At the end of the study, data will be transferred electronically to a statistics package. Error rates associated with the automated clinical trial are below 1%, compared with an error rate of 1.5% for manual data-sheets. The automated trial recruited subjects twice as quickly as the conventional manual study (650 subjects per year compared with 350 per year), despite the fact that recruitment in the first trial was enthusiastically encouraged and the computerized trial recruited subjects passively.

AUTOMATIC DATA COLLECTION SYSTEMS

Data can be collected on hand-held portable computers, on microcomputers using a commercially available or custom-built system, or on local network systems using a central server. Portable computers offer the prospect of bridging the gap between a large desk-top computer and the patient's home. They have been used in general practitioners' surgeries to record details from subjects in a randomized clinical trial (Tattersall and Ellis, 1989). Detachable data-packs are posted to the central unit at regular intervals and loaded into a data management package. In a pilot study with eight general practitioners, initially unfamiliar with computers, all were able to participate after only 1 hour of tuition. One expressed a preference for written forms, two had no preference and five preferred the computer (Tattersall and Ellis, 1989). The disadvantages of hand-held computers are the limited memory, the need to swap data-packs, and the limited screen available for help messages.

There has been extensive experience with data collection for clinical trials on desk-top computers and there are several packages available commercially. COMPACT (Timber Lake Ltd, 40b Royal Hill, London, UK) is a data management package designed specifically for entering, checking and editing data from clinical trials. Its particular strength is handling longitudinal trials, but it can also manage cross-sectional studies. Data validation is accomplished with error-traps, and inconsistent and missing data are listed in a problems file. COMPACT is a database designed to act as an interface between a clinical trial and standard statistical packages such as SPSS and SAS (Chilvers et al, 1988; Neil, 1988).
It has been written with oncology trials in mind but can be adapted for smaller, less complex trials.

ONCOCIN is a medical consultant system designed for oncology trials. It is based on a program called MYCIN (Buchanan and Shortliffe, 1984) and is termed an expert system because it uses artificial intelligence techniques based on knowledge acquired from expert oncologists. Before ONCOCIN, the clinical trialist followed a protocol defining the principal treatment options and collected data on flow sheets. ONCOCIN replaces the flow sheets and records the same data, but it also combines clinical information with the protocol guidelines to provide management advice (Hickam et al, 1984, 1985). This capability is a major advantage of the system, and consequently a physician is more likely to enter data than other personnel.

The other subspecialty in obstetrics and gynaecology that involves vast quantities of data is reproductive medicine. In vitro fertilization units frequently computerize their data storage and processing, and customized programs are being designed to semi-automate local clinical trials.

Automation of data collection is not restricted to stand-alone units. Some automated clinical trials use a networked system. For example, if the computers in the antenatal clinic, ultrasound department, Doppler unit, pharmacy and labour ward were linked to a central server, an automated trial comparing different antenatal therapies could be run using birth details as the outcome variables (e.g. the effect of essential oils or aspirin on hypertension, the effect of raspberry tea on induction rates, or the effect of iron supplementation on transfusion requirements or haemoglobin levels could all be studied automatically).

More sophisticated data collection systems involve peripheral collecting systems interfacing with a central station. In France 233 senior psychiatrists, most of them private practitioners, participated in a semi-automated trial using MINITEL. Every medical record was computerized, processed and transferred through phone lines to a database located in Paris, the headquarters of the nationwide multi-centre trial. Each investigator could call up the trial protocol and flow chart on their microcomputer 500 miles from the database. There were several procedures to ensure that eligible cases were not inadvertently missed, and recalls were automatically scheduled at 7 and 28 days, with reminders if necessary. Abnormal or unacceptable responses were filed, and exclusion criteria and error-traps prevented the investigator from entering spurious data or ineligible patients. Although the mean connection time was 15 minutes (SD 3 min), 1700 patients were recruited in 3 months using this technology (Benkelfat et al, 1986).

DATA PROCESSING
There are many examples of data processing systems that have satisfactorily managed large trials (Cassano et al, 1983; Dreyfus et al, 1984; Hawkins and Singer, 1986). Automating the link between data collection and processing eliminates the typographical errors and reduces the administrative burden
for the investigators. In small trials the burden is light and manual data processing can be checked easily; in large clinical studies automation is a significant advance. A good clinical trial database system should allow local participating centres to enter data which can be easily updated, altered and retrieved. The co-ordinating centre must be able to maintain and summarize the data, and it must have facilities to export data to a statistical package for analysis.

DATA ANALYSIS

The results of almost every clinical trial are analysed using computerized statistical techniques. Mental arithmetic or a calculator can be used to calculate confidence intervals and perform Fisher's exact and χ² tests, but almost all other statistics must be performed on a computer. Automated clinical trials have the advantage of directly dumping (exporting) data from a database to a statistical package. With the recent advances in computer intercommunication, using a database as the front-end of a clinical trial computer system and networking the information to a main-frame system allows powerful data analysis and is cost-effective (Trapilo and Friedman, 1984). However, although this may seem a great advantage, in practice it allows clinicians to pull out every possible subset of data for statistical analysis, with the disadvantage that the clinician is likely to find a significant association by chance alone. A combined data processing and analysis package has a narrow gateway between the database management system and the statistical analysis package; although this has limitations, it prevents ad hoc comparisons (Johnson and Lilford, 1990).

Automation of the link between the data processor and the statistical package allows the data to be analysed sequentially (continually adjusting the probability value as data are entered and adjusting for the fact that the data are sampled repetitively). This allows a clinical trial to continue until the computer automatically stops it when a predetermined level of P (the significance level) is reached; a minimal sketch of such a stopping rule is given below. The second advantage is that updated and repeated summary plots may be constructed on individual patients, reflecting their clinical course.
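As an illustration (Python), the sketch below runs a two-proportion z-test at each interim look and stops the trial when the P value falls below a nominal boundary that has been reduced to allow for repeated testing; the boundary shown is an approximate Pocock-style constant for five planned looks, and all of the figures are illustrative assumptions.

```python
from math import sqrt, erfc

NOMINAL_P = 0.0158        # approximate Pocock boundary: 5 planned looks, overall alpha 0.05

def two_proportion_p(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Two-sided P value from a pooled two-proportion z-test."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = abs(events_a / n_a - events_b / n_b) / se
    return erfc(z / sqrt(2))

def interim_look(events_a: int, n_a: int, events_b: int, n_b: int) -> str:
    p = two_proportion_p(events_a, n_a, events_b, n_b)
    return "stop trial" if p < NOMINAL_P else "continue recruiting"

# Example interim analysis on accumulating outcome data (figures invented):
print(interim_look(30, 200, 55, 200))
```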
EXAMPLES OF AUTOMATED STUDIES

There are no well-known published examples of fully automated obstetric trials. Many units have audited their practices using data extracted from the labour ward computer. For example, following three third-degree tears associated with deliveries performed by student midwives, Johnson and Gupta extracted retrospective data from the labour ward computer linking attendant seniority to episiotomy rates and second- and third-degree tear rates (N. Johnson and J. Gupta, unpublished data). They were able to satisfy themselves that the registrars had the highest episiotomy rate because they performed the difficult deliveries (forceps and breech), and that student midwives were just as capable of achieving an intact perineum as senior midwives.

The Leeds Ruptured Membranes Trial (Johnson and Lilford, 1989) is the first attempt at a semi-automated randomized trial, but it is not surprising that renal physicians, who collect enormous amounts of data on each patient, were the first to perform non-randomized automated studies. Pollak et al (1985) studied the risk factors involved in predicting the outcome in patients with end-stage renal disease. Outcome data were retrieved from the computerized medical information system used in clinical practice and, using multivariate analysis, they calculated relative risks and survival curves and identified factors associated with survival.

MIDBIM (Medical Integrated Database Interactive Management) is a database written for a clinical trial of plasmapheresis therapy in systemic lupus erythematosus nephritis in four centres. The updated version, SID (Statistical Interactive Database Manager), was first used in a multi-centre clinical trial of cyclosporin in the management of diabetes. The software is Canadian and it supports an external connection to the Canadian phone system for exporting and importing standard ASCII files. The program can run multi-centre clinical trials and permits individual centres to enter, update and retrieve their data and produce summaries, reports and graphic outputs, while allowing detailed analysis to take place at a central station (Koval et al, 1987).
SUMMARY

Over the past decade computers have revolutionized medical record systems. Data are now legible, organized and easily retrievable. The next decade will see a revolution in the way large multi-centre clinical trials are carried out. Computerized automation of trials improves compliance with the protocol, identifies deviations from the randomization system and reduces missing and erroneous data. Computerization also improves recruitment and eliminates mistakes during data transfer. All of these proven advantages are bought for a fraction of the cost of the investigators' time.
Acknowledgements I would like to thank Dr Mike Kelly for his advice during the preparation of this manuscript.
REFERENCES

Bagniewski A, Curtis C, Fox K et al (1983) Data quality control in a distributed data processing system: nature and method. Controlled Clinical Trials 4(2): 148 (abstract).
Barnett GO, Winickoff R, Dorsey JL, Morgan MM & Lurie RS (1978) Quality assurance through automated monitoring and concurrent feedback using a computer-based medical information system. Medical Care 14: 962-970.
Barrett JFR, Jarvis GJ, Macdonald HN, Buchan PC, Tyrrell SN & Lilford RJ (1990) Inconsistencies in clinical decisions in obstetrics. Lancet 336: 549-551.
Barrett JFR, Phillips K, Savage J & Lilford RJ (1989) Randomised trial of routine amniotomy versus the intention to leave the membranes intact until the second stage of labour. Congress Book of Abstracts, p 114. RCOG Silver Jubilee British Congress of Obstetricians and Gynaecologists. London: Royal College of Obstetricians and Gynaecologists.
Benkelfat C, Gay C & Renardet M (1986) Use of interactive computer technology in open clinical trials. Neuropsychobiology 15 (supplement 1): 10-14.
Buchanan BG & Shortliffe EH (1984) Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Reading, MA: Addison-Wesley.
Cassano GB, Conti L & Massimetti G (1983) Recherches psychopharmacologiques multicentriques et banque de données. (Multicentre psychopharmacological research and the data bank.) Acta Psychiatrica Belgica 83: 267-281.
Chalmers I (1988) Oxford Database of Perinatal Trials. Oxford: Oxford University Press.
Chalmers TC, Celano P, Sacks HS et al (1983) Bias in treatment assignment in controlled clinical trials. New England Journal of Medicine 309: 1358-1361.
Chilvers CED, Fayers PM, Freedman LS et al (1988) Improving the quality of data in randomized clinical trials: the COMPACT computer package; COMPACT Steering Committee. Statistics in Medicine 7: 1165-1170.
Cortez MM, Jackson PM, Torti FM & Carter SK (1983) A comparison of the quality of participation of the community affiliates and that of the universities in a Northern California Oncology Group. Journal of Clinical Oncology 1(10): 640-644.
Dreyfus JF, Blanchard C & Guelfi JD (1984) Un programme d'ordinateur pour le traitement des données recueillies lors des essais thérapeutiques en psychiatrie. (A computer program for the processing of data collected during therapeutic trials in psychiatry.) Annales Médico-psychologiques 142: 902-907.
Friedman RB, Entine SM & Carbone PP (1983) Experience with an automated cancer protocol surveillance system. American Journal of Clinical Oncology 6: 583-592.
Friedman LM, Furberg CD & De Mets DL (1985) Fundamentals of Clinical Trials, 2nd edn, p 63. Massachusetts: PSG Publishing Company.
Fulcher SFA & Burris TE (1988) A computerized recall system for clinical trials. Annals of Ophthalmology 20: 10-16.
Griffith-Jones M, Tyrrell SN & Tuffnell DJ (1990) A prospective trial comparing intravenous oxytocin with vaginal prostaglandin E2 tablets for labour induction in cases of spontaneous rupture of the membranes. Obstetrics and Gynaecology Today 1: 104-105.
Hawkins BS & Singer SW (1986) Design, development and implementation of a data processing system for multiple controlled trials and epidemiologic studies. Controlled Clinical Trials 7: 89-117.
Hickam DH, Shortliffe EH, Bischoff MB & Jacobs CD (1984) The effect of enhancing cancer chemotherapy protocol guidelines with expert knowledge in a computer-based treatment consultant. Medical Decision Making 4(4): 533 (abstract).
Hickam DH, Shortliffe EH, Bischoff MB, Scott AC & Jacobs CD (1985) A study of the treatment advice of a computer-based cancer chemotherapy protocol advisor. Annals of Internal Medicine 101(6): 928-936.
Johnson N & Lilford RJ (1989) Trial by computer: using the labour ward computer to perform clinical trials in perinatology. Congress Book of Abstracts, RCOG Silver Jubilee British Congress of Obstetricians and Gynaecologists. London: Royal College of Obstetricians and Gynaecologists.
Johnson N & Lilford RJ (1990) Statistics in obstetrics and gynaecology. In Phillips E (ed.) Scientific Principles in Obstetrics and Gynaecology. Oxford: Heinemann (in press).
Karrison T (1981) Data editing in a clinical trial. Controlled Clinical Trials 2: 15-29.
Kent DL, Shortliffe EH, Carlson RW, Bischoff MB & Jacobs CD (1985) Improvements in data collection through physician use of a computer-based chemotherapy treatment consultant. Journal of Clinical Oncology 3(10): 1409-1417.
Koval JJ, Kwarciak LM, Grace MGA & Lockwood BJ (1987) A comprehensive database management system for a variety of clinical trials. Methods of Information in Medicine 26: 24-30.
Kronmal RA, Davis K, Fisher LD, Jones RA & Gillespie MJ (1978) Data management for a large collaborative clinical trial (CASS: Coronary Artery Surgery Study). Computers and Biomedical Research 11: 553-566.
Leduc D, Loesser H, Hercz L & Pless IB (1984) A computerized recall system for office practice. Pediatrics 73: 233-237.
Lilford RJ (1987) Clinical trials in obstetrics. British Medical Journal 295: 1298-1300.
Lilford RJ (1990) What is informed consent? In Bromham DR, Dalton ME & Jackson J (eds) Proceedings of the First International Conference on Philosophical Ethics in Reproductive Medicine (PERM), pp 211-227. Manchester: Manchester University Press.
McDonald CJ, Wilson GA & McCabe GP Jr (1980) Physician response to computer reminders. Journal of the American Medical Association 244(14): 1579-1581.
Moertel CG (1984) Improving the efficiency of clinical trials: a medical perspective. Statistics in Medicine 3: 455-465.
Musen MA, Rohn JA, Fagan LM & Shortliffe EH (1987) Knowledge engineering for a clinical trial advice system: uncovering errors in protocol specification. Bulletin du Cancer 74: 291-296.
Neil HAW (1988) COMPACT: a computer package for clinical trials. Diabetic Medicine 5: 795.
Pollak VE (1985) The computer in medicine: its application to medical practice, quality control and cost containment. Journal of the American Medical Association 253: 62-68.
Rose GA, Holland WW & Crowley AE (1964) A sphygmomanometer for epidemiologists. Lancet i: 296-300.
Schoones JW (1990) Searching publication data bases. Lancet i: 481.
Smith R (1987) Computerised literature searches: online access to biomedical information. In Dalton KJ & Fawdry RDS (eds) The Computer in Obstetrics and Gynaecology, pp 35-40. Oxford: IRL Press Ltd.
Smyllie HC, Blendis LM & Armitage BP (1965) Observer disagreement in physical signs of the respiratory system. Lancet ii: 412-413.
Tattersall AB & Ellis R (1989) The use of a hand-held computer to record clinical trial data in general practice: a pilot study. Journal of International Medical Research 17: 185-189.
Trapilo TC & Friedman RH (1984) The impact of technology on clinical trials: the clinical trials computer system. Psychopharmacology Bulletin 20: 59-63.
White SJ & Friedman FR (1978) Allocation of patients to treatment groups in a controlled clinical study. British Journal of Cancer 37: 849-857.
Wilcox J (1961) Observer factors in the measurement of blood pressure. Nursing Research 10: 4-20.