Detecting data errors with statistical screens

Abstracts

(of STEIN-type) for the regression parameters and a way for the diagnostic validation of the compared method. As a result of the proposal, the method comparison experiment needs to be performed only until the change of clinical-chemical test method has (with a fixed probability) no consequences for the diagnostic decision. The new procedure is therefore cost-optimal for the actual task at hand. A new quality characteristic, the "diagnostic relevant residual variance," is defined. With this characteristic, the STOP criterion can be computed from the diagnostic demands alone. The mathematical-statistical properties of the new strategy are demonstrated and the procedure of the method comparison experiment is described.

P100 DATA ENTRY TRAINING FOR A SMALL CLINICAL TRIAL
Scott D. Corley, Grace Ng, and Mary Jo Gillespie

Biostatistics Cardiovascular Research Unit, University of Washington, Seattle, Washington

In August 1990, TRAP (Trial to Reduce Alloimmunization to Platelets) Study Coordinators from seven Clinical Centers were trained in data entry on the PC. Top-down organization, balance between group instruction and hands-on practice, and handouts prepared in advance made it possible to provide in-depth training in each of the data entry procedures while maintaining the cohesiveness and momentum of the seminar. The seminar started with an overview of the data entry cycle and continued with coverage of each part of the cycle. Coverage of each part, in turn, started with an overview and continued with detailed instruction and practice (top-down organization). One of the more difficult aspects of the training was balancing the amount of group instruction with the amount of time devoted to practicing on the PCs. A good balance was struck by combining introductory group instruction with ample practice time and one-on-one assistance. Accommodation of different skill levels was important. Four types of practice forms and several work sheets containing hypothetical data were prepared in advance. Errors were intentionally included on some of the forms. The Coordinators were allowed to keep the handouts so that they could practice at their sites prior to the start of the study. A data entry manual was provided to serve as a step-by-step guide and detailed reference volume.

P101 DETECTING DATA ERRORS WITH STATISTICAL SCREENS
Robert Ledingham, Melissa Huther, Ruth McBride, Barbara Bane, Gunnel Hedelin, and Gunnel Schlyter

University of Washington, Seattle, Washington

The Cardiac Arrhythmia Suppression Trial (CAST) is a multicenter RCT. Data in clinical records are transcribed to paper forms and entered. Range and logical checks are performed at data entry, and data must be verified before transmission to the Coordinating Center (CC). Certain key data are also elicited directly by the CC, and consistency is checked between these key data and the corresponding data entered at the clinic. Statistically screening the data to find "outliers" may catch further errors. The most desirable screens are those which minimize the work load (number of outliers to verify) and maximize the yield (number of errors found). An obvious screen for continuous variables is to consider a data point an outlier if it is ≥ X standard deviations (SD) from the clinic-specific norm. We report here on pilot data (screening 8,478 data points) to determine the relative merits of using X = 3.0 SD vs. X = 2.5 SD. The method was: i) compute clinic-specific norms; ii) generate a report of outliers for each clinic; iii) the clinic verifies the value or, if it is incorrect, submits the correct value.

Screen      True outliers   Errors   "Outliers" (work load)
SD = 2.5    126             7        133
SD = 3.0    55              6        61

Work load ("outliers") = true outliers + errors.

We conclude that the SD = 2.5 screen requires approximately twice as much work as the SD = 3.0 screen for a gain of 17% in error detection.
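The screen described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the CAST implementation: the function name, the (clinic, value) record layout, the use of the sample standard deviation as the measure of spread, and the example values are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def screen_outliers(records, threshold=3.0):
    """Flag values lying at least `threshold` SDs from their clinic's mean.

    `records` is an iterable of (clinic_id, value) pairs (assumed layout).
    Returns the flagged (clinic_id, value) pairs, i.e. the work load that
    the clinic would be asked to verify.
    """
    # Step i: group values by clinic so norms are clinic-specific.
    by_clinic = defaultdict(list)
    for clinic, value in records:
        by_clinic[clinic].append(value)

    flagged = []
    for clinic, values in by_clinic.items():
        if len(values) < 2:
            continue  # sample SD is undefined for a single value
        mu = mean(values)
        sd = stdev(values)
        if sd == 0:
            continue  # all values identical, nothing to flag
        # Step ii: report every value at or beyond the threshold.
        for v in values:
            if abs(v - mu) >= threshold * sd:
                flagged.append((clinic, v))
    return flagged


# Hypothetical data: 140 is unusual for clinic "A" but typical for clinic "B".
clinic_a = [70, 72, 68, 71, 69, 73, 67, 70, 72, 68,
            71, 69, 70, 71, 69, 70, 72, 68, 70, 70, 140]
clinic_b = [138, 140, 142, 139, 141, 140, 138, 142, 139, 141, 140]
records = [("A", v) for v in clinic_a] + [("B", v) for v in clinic_b]

outliers = screen_outliers(records, threshold=3.0)
```

Because the norms are computed per clinic, the same value can be flagged at one clinic and pass at another. The figures reported in the table are consistent with the conclusion: 133/61 is about 2.2 times the work load, and (7 - 6)/6 is about 17% more errors found.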

P102 PROCESS OF OBTAINING NDI INFORMATION TO DETERMINE THE VITAL STATUS AND CAUSE OF DEATH OF CASS REGISTRY PATIENTS
Grace Ng, Mary Jo Gillespie, and Kathryn Davis

University of Washington, Seattle, Washington

Identifying data on all CASS (Coronary Artery Surgery Study) registry patients who are still believed to be alive, except for randomized cases and patients from the Montreal Heart Institute, are submitted to the NDI (National Death Index) for potential death information. For all deaths identified, NDI provides the names of the states where the deaths occurred, along with the corresponding death certificate numbers and dates of death. Among the possible NDI matches, CASS is interested only in matches on social security number or on personal data (exact match of first name, last name, sex, race, and month, day, and year of birth). To verify that the matches are true matches and to obtain the cause-of-death information, CASS then makes the necessary arrangements with the appropriate state vital statistics offices to obtain death certificates. Each death certificate received is logged and visually checked against the CASS patient information to verify that the certificate is for a CASS patient. The matching death certificates are then sent to the nosologist for classification. When all the cause-of-death forms for a year's search have been entered, they are merged with the CASS main database, and the cycle is repeated.
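The matching rule described in this abstract (a match on social security number, or an exact match on all personal data) can be sketched as below. This is an illustrative reading of the rule only: the record layout, field names, and function name are hypothetical and are not taken from CASS or from the NDI.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical record layout; field names are illustrative only."""
    first_name: str
    last_name: str
    sex: str
    race: str
    birth_date: date
    ssn: Optional[str] = None


def is_match_of_interest(patient: PersonRecord, candidate: PersonRecord) -> bool:
    """Return True if the candidate matches on SSN or on all personal data."""
    # Rule 1: social security number match (only when the patient's SSN is known).
    if patient.ssn is not None and patient.ssn == candidate.ssn:
        return True
    # Rule 2: exact match of first name, last name, sex, race,
    # and month, day, and year of birth.
    return (
        patient.first_name, patient.last_name, patient.sex,
        patient.race, patient.birth_date,
    ) == (
        candidate.first_name, candidate.last_name, candidate.sex,
        candidate.race, candidate.birth_date,
    )


# Made-up example records.
patient = PersonRecord("Ann", "Lee", "F", "W", date(1930, 1, 2), "123-45-6789")
same_ssn = PersonRecord("A", "Lee", "F", "W", date(1930, 1, 2), "123-45-6789")
same_person_no_ssn = PersonRecord("Ann", "Lee", "F", "W", date(1930, 1, 2))
different_birthday = PersonRecord("Ann", "Lee", "F", "W", date(1930, 1, 3))
```

A candidate that passes this filter is still only a possible match; as the abstract notes, the death certificate is obtained and checked visually before the death is accepted.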

P103 USE OF COMPUTER APPLICATIONS TO FACILITATE RETROACTIVE UPDATES TO A DATABASE
Rita M. Pelusio, Charlyne Miller, Constance Orme, Mark L. Andrews, Cindy Casaceli, Jenny Ueou, Karl Kieburtz, Jory Wixsom, Ira Shoulson, and the Parkinson Study Group

DATATOP Coordination Center, University of Rochester, Rochester, New York

DATATOP (Deprenyl and Tocopherol Antioxidative Treatment of Parkinsonism) is a multicenter controlled clinical trial investigating new therapies for the treatment of early-stage Parkinson's disease. A highly successful recruitment campaign resulted in accelerated enrollment of subjects at all 28 study centers, with concomitant acceleration of database development and management of the trial [1]. Rapid collection of an extremely large volume of data resulted in interim analysis early in the trial [2]. Routine monitoring of the database by the trial's Safety Monitoring Committee identified a need to extract greater detail surrounding data variables with the generic code designation of "other" in the adverse experiences data sets, for which a text description was not readily available in the electronic database. Prior to implementation of procedures for prospective review and coding of variables reported as "other," a plan was developed to update approximately 2,000 records in the database which were previously designated as "other." Manual and electronic methods were employed to facilitate review, correction, and updates to the data by means of a team approach including operations, programming, and clinical staff, and study site coordinators.

1. Pelusio, et al.: Controlled Clinical Trials (abstract) 11:304, 1990.
2. Parkinson Study Group: NEJM 321:1364-1371, 1989.

(Supported by NS24778, NINDS, USPHS)

P104 IMPROVED MONITORING OF MULTICENTER TRIALS BY AN EXTENDED DATABASE
Rolf Holle, Volker Weinberger, and Christine Fischer

University of Heidelberg, Heidelberg, Germany

During the conduct of multicenter clinical trials several monitoring tasks have to be performed by the statistics and data center, aiming at three different levels of information: interim results about the study hypotheses, protocol adherence, and completeness and correctness of the data. The information concerning