Journal of Clinical Epidemiology 54 (2001) 1195–1203
Using the national registry of HIV-infected veterans in research: Lessons for the development of disease registries

Linda Rabeneck (a,b,*), Terri Menke (a,b), Michael S. Simberkoff (c), Pamela M. Hartigan (d), Gordon M. Dickinson (e), Peter C. Jensen (f), W. Lance George (g), Matthew B. Goetz (g), Nelda P. Wray (a,b)

(a) Department of Veterans Affairs (VA) Health Services Research and Development (HSR&D) Center for Excellence, Houston, TX, USA
(b) Department of Medicine, Baylor College of Medicine, Houston, TX, USA
(c) Manhattan VA Medical Center (VAMC) and New York University, New York, NY, USA
(d) VA Cooperative Studies Program Coordinating Center, West Haven, CT, USA
(e) VAMC and University of Miami, Miami, FL, USA
(f) VAMC and University of California, San Francisco, CA, USA
(g) West Los Angeles VAMC and University of California, Los Angeles, CA, USA

Received 29 September 2000; received in revised form 26 April 2001; accepted 16 May 2001
Abstract

Disease-specific registries have many important applications in epidemiologic, clinical, and health services research. Since 1989 the Department of Veterans Affairs (VA) has maintained a national HIV registry. VA’s HIV registry is national in scope and contains longitudinal data and detailed resource utilization and clinical information. Our objectives were to describe the structure, function, and limitations of VA’s national HIV registry, and to test its accuracy and completeness. The VA’s national HIV registry contains data that are electronically extracted from VA’s computerized comprehensive clinical and administrative databases, called Veterans Integrated Health Systems Technology and Architecture (VISTA). We examined the number of AIDS patients and the number of new patients identified to the registry, by year, through December 1996. We verified data elements against information obtained from the medical records at five VA sites. By December 1996, 40,000 HIV-infected patients had been identified to the registry. We encountered missing data and problems with data classification. Missing data occurred for some elements because of the computer programming that creates the registry (e.g., pharmacy files), and for other elements because manual entry is required (e.g., ethnicity). Lack of a standardized data classification system was a problem, especially for the pharmacy and laboratory files. In using VA’s national HIV registry we have learned important lessons, which, if taken into account in the future, could lead to the creation of model disease-specific registries. © 2001 Elsevier Science Inc. All rights reserved.

Keywords: HIV infection; Disease registry; Health outcomes
* Corresponding author. VA Medical Center (111D), 2002 Holcombe Blvd, Houston, TX 77030. Tel.: 713-558-4514; fax: 713-748-7357. E-mail address: [email protected] (L. Rabeneck).
0895-4356/01/$ – see front matter © 2001 Elsevier Science Inc. All rights reserved. PII: S0895-4356(01)00397-3

1. Introduction

A registry has been defined as “a file of documents containing uniform information about individual persons, collected in a systematic and comprehensive way, in order to serve a predetermined purpose” [1]. Registry data systems are powerful tools used by researchers in a variety of health-related fields. Disease-specific registries (disease registries) have many applications in epidemiologic, clinical, and health services research. For example, they are used to estimate disease prevalence and incidence, to estimate health care resource utilization and clinical outcomes, and to track changes in these parameters over time. Disease registries may also serve as data sources for comparing health care utilization and outcomes across different categories of patients, for example, according to ethnicity, gender, age, or geographic area. Disease registries make it possible to estimate the demand for health services. In addition, disease registries may serve as sampling frames for selecting patients who fulfil specific study eligibility criteria.

Cancer registries are well known, but many other disease registries have been developed, including cleft lip [2], trauma [3], pediatric neurologic disorders [4], vascular surgery [5], and spinal cord dysfunction [6]. In 1989 the Department of Veterans Affairs established a national registry of HIV-infected veterans. Although several city, county, and statewide HIV or AIDS registries exist [7–12], the VA HIV Registry is unique because to our knowledge it is the only national HIV registry in the US. We recently completed a pilot study that made extensive use of
the VA HIV Registry. This was the first time that the registry was used for research purposes. During the conduct of our pilot study, we gained experience that we believe has broad relevance for the development and use of disease registries. The purpose of this article is to describe the structure, function, applications, and limitations of the national VA HIV Registry, and to discuss how investigators developing disease registries in the future could benefit from our experience.

2. Background of the national VA HIV Registry

2.1. History and implementation

2.1.1. History of the registry

In 1989 a group of VA infectious disease physicians with specific expertise in the management of HIV-infected patients identified the need for a national VA HIV registry. The registry was developed to obtain an accurate count of HIV workload to ensure appropriate funding was provided to the facilities. On December 2, 1991, Directive 10-91-142 was distributed to all VA facilities from the Chief Medical Director at VA Headquarters in Washington, DC. The directive mandated immediate installation and use of the VA registry software by all facilities. The directive provided a major impetus for use of the registry because funding for HIV care for a facility was dependent on the number of HIV-infected patients identified to the registry.

2.1.2. Identification of patients to the HIV registry

The national VA HIV Registry is a longitudinal database containing data which are electronically extracted from the computerized Veterans Integrated Health Systems Technology and Architecture (VISTA) databases and electronically transmitted to a central repository. VISTA is a comprehensive clinical and administrative database that captures almost all recorded clinical information at VA medical centers. VISTA data are obtained from a variety of sources and services, including Medical, Dental, Nursing, Laboratory, Radiology, Pharmacy and Medical Administration.

Because VISTA is in use at all VA facilities, the VA HIV Registry contains data pertaining to all veterans who are diagnosed with HIV infection in the VA Health Care System. The diagnosis of HIV infection is based on a documented positive HIV serologic test done at a VA facility, regardless of whether the patient is an outpatient or an inpatient. At the time of diagnosis, each HIV-infected patient is identified to the registry at the facility (i.e., manually entered into the registry and assigned a registry category, described below). The responsibility for these tasks rests with a registry coordinator at each facility. Once a patient is entered, relevant VISTA data are extracted and automatically transmitted to the Hines VA Information Service Center (ISC), which houses the registry. The VISTA information that is transmitted includes any available retrospective data from the time before the date the patient was identified to the registry as well as all subsequent prospective data. Each patient’s
file is updated nightly. To maintain confidentiality, patient records contained in the VA HIV Registry are listed according to an encrypted social security number (registry number) that is generated at Hines ISC. The registry number is different from the encrypted social security number used to identify veterans in all other VA administrative databases.

2.2. Contents of the registry

2.2.1. VISTA packages

The registry software is programmed to extract data from five VISTA packages: Medical Administration Service (e.g., hospitalizations, surgical procedures, outpatient clinic visits), Laboratory (e.g., hematology, chemistry, CD4 count), Radiology (e.g., plain films, CT scans), Pharmacy (prescriptions, unit costs), and Dental (clinic visits, dental procedures). The registry contains rich and detailed data on all health care provided to HIV-infected patients in VA facilities.

2.2.2. Registry files

The information in the VA HIV Registry is arranged in 14 files. Table 1 summarizes the data elements in each file. All files contain the patient’s registry number and VA facility number, allowing linkage of records for individual patients or facilities. The patient file summarizes important demographic and clinical information, including the dates of first and most recent transmission of data to the registry, demographic information, HIV transmission category (e.g., homosexual), date of diagnosis of HIV infection, date of death, date/type of first AIDS-defining diagnosis, and dates of HIV categories (see below). Each of the other 13 files contains data on a specific type of health care utilization. The outpatient file contains all scheduled and unscheduled clinic visits, the type of clinic, the number of clinic visits, and the dates of visits (by month/year).
Several files are devoted to inpatient care, providing data for each admission, such as the type of inpatient unit, up to 10 ICD-9 discharge diagnoses, dates of admission and discharge, procedure dates and up to 5 ICD-9 procedure codes for each date. Several files are devoted to laboratory and radiology tests. They contain information on the type of test, CPT codes, dates of tests, and test results. The prescription drug files provide data on all prescriptions dispensed by VA pharmacies. The dental visit file contains information on dental visits and procedures.

2.2.3. HIV registry category definitions

Changes have occurred in the registry category definitions over time as follows:

1. At its inception in 1989, the registry classified patients into three categories: asymptomatic; symptomatic (AIDS-related complex [ARC]); and AIDS (presence of AIDS-defining condition);
2. In 1990, the classification was changed to four categories: category 1 (CD4 ≥500, asymptomatic); category 2 (CD4 <500, asymptomatic); category 3
Table 1. Data elements in VA’s national HIV registry

Name of file | Contents of file
Patient | Registry number (a), facility number (a), date of first data transmission, date of most recent data transmission, gender, date of birth, country of birth, race, transmission category, date of diagnosis of HIV infection, date of ARC diagnosis, date of AIDS diagnosis, date of death, date/type of first AIDS-defining diagnosis, category when first identified to the registry, date of category 1, date of category 2, date of category 3, date of category 4
Outpatient | Scheduled clinic visits, unscheduled clinic visits, clinic codes (e.g., general medicine, infectious diseases), number of visits, date of visit (month/year)
Inpatient | Discharge bedsection (b) (e.g., general medicine, acute psychiatry), up to 10 ICD9 codes, admission date, discharge date, type of disposition (e.g., outpatient, transfer to another facility)
Bedsection (b) | Bedsection, up to 10 ICD9 codes, date admitted to bedsection, date discharged from bedsection
Inpatient surgical procedures | Procedure date, up to 5 ICD9 procedure codes for each date
Other inpatient procedures | Procedure date, bedsection, up to 5 ICD9 procedure codes for each date
Radiology | Name of radiology test, CPT code, date of test
Outpatient prescription | Name of prescription, date filled, quantity, number of days supply, unit cost
Unit dose pharmacy | Order number, name of drug, medication route, schedule (e.g., tid), schedule type (e.g., continuous, prn), total dispensed ($ amount), starting date, ending date
Laboratory test history (c) | Data on hematology and chemistry tests: name of test, date of test, result of test
Laboratory test utilization (d) | Lab test name, number of tests, date of test (month/year), result of test
Microbiology test history (e) | Source of specimen (e.g., right upper lobe bronchus), type of specimen (e.g., bronchial washing), name of organism identified, quantity on culture, antibiotic sensitivities, date of test
Intravenous pharmacy | No data in file
Dental visit history | Date of visit, bedsection, type of procedure

(a) All files contain this information.
(b) Bedsection refers to the type of inpatient unit (e.g., general medicine, intensive care).
(c) This file is a subset of the Laboratory Test Utilization file. It contains information on selected tests of particular relevance to HIV (e.g., CD4 count).
(d) This file contains information on specific hematology and chemistry tests of relevance to HIV.
(e) This file contains information only on specimens for which the bacterial cultures were positive.
ARC: AIDS-related complex; category 1: CD4 ≥500; category 2: CD4 200–499; category 3: CD4 <200; category 4: AIDS-defining condition.
(symptomatic, no AIDS-defining condition); category 4 (AIDS; presence of AIDS-defining condition [13]);
3. In January 1993, the category definitions were changed to take into account the 1993 revised CDC AIDS surveillance case definition: category 1 (CD4 ≥500); category 2 (CD4 200–499); category 3 (AIDS, denoted by CD4 <200 or CD4 percent <14); category 4 (AIDS, presence of AIDS-defining condition [14]).

On April 10, 1993, Directive 10-93-045 was distributed to all VA facilities from the VA Under Secretary for Health indicating that the new category definitions were to be applied to all patients seen after January 1, 1993. The categories of those patients in the registry who had been seen prior to this date, but who were not subsequently seen for care, were not affected by the new definition (i.e., these patients’ categories were not to be “updated”). If a registry coordinator does not assign the patient a category at the time the patient is first identified to the registry, the software automatically assigns the patient to category 1 (i.e., category 1 is the default category).

2.2.4. Manual updating of specific information

Although CD4 counts are electronically transmitted to the registry as part of VISTA’s laboratory package, all registry category changes that occur must be manually entered by the registry coordinators at each facility. Only those category changes that reflect a worsening are to be entered (i.e., a patient’s category cannot improve). In addition, the specific AIDS-defining conditions and deaths occurring outside VA must be manually entered by the registry coordinator.

3. Methods

3.1. Pilot study using the national VA HIV Registry

3.1.1. Overview

Our VA pilot study entitled “Resource Utilization and Costs of HIV Care in the VA” (VA SDR 92-003) involved the collection of primary data obtained from random samples of HIV-infected patients enrolled at five participating VA Medical Centers (New York, NY; Houston, TX; Miami, FL; Los Angeles, CA; San Francisco, CA) over a 1-year period during 1994 and 1995. Briefly, the goals of the study were to develop a model to forecast future numbers of HIV-infected patients who will receive treatment, to identify the determinants of resource utilization [15], and to estimate future resource utilization.

Using the HIV registry as a sampling frame, randomly generated subsets of HIV-infected patients were identified at each of the five sites. To do this we identified all patients at the five sites who were listed in the registry as alive on March 31, 1994. A random number generator was used without replacement, within site, to generate successive random samples of 140 patients at each site by HIV category (category 1, n = 29; category 2, n = 29; category 3, n = 29; category 4, n = 53). We used a graduated sampling frame to sample more intensively from those expected to consume more
resources. The patients in these subsets were contacted by our research nurses and asked to participate. The study also involved secondary data contained in the national VA HIV registry. Prior to enrollment, patients at the participating sites gave informed consent, which included permission to access their medical records as well as the VISTA data that pertained to them at the site.

3.1.2. Maintaining confidentiality of VA HIV Registry data

Patient consent is not required before a veteran is identified to the VA HIV Registry. Because of concerns regarding patient confidentiality, during our pilot study we were not permitted to link registry data to information in other VA databases, such as the patient treatment file (PTF), which is the VA’s national discharge database, or the Beneficiary Identification and Record Locator Subsystem (BIRLS), which records the dates of death of veterans whose survivors request death benefits. As a result, we were unable to verify registry data against other VA databases and instead had to verify the registry data against the patients’ records at the participating sites.

3.2. Registry verification

3.2.1. Number of patients entered into the registry over time

We examined the cumulative number of AIDS patients and the number of new patients with HIV infection and with AIDS identified to the registry, by year, through December 1996. The purpose of this was to identify any large information gaps.

3.2.2. Distribution of data elements by region and by participating site

We examined the number of records in each file by region (northeast, north central, south, west) and by participating site, through December 1996. The purpose of this was to identify any missing data by region or facility, and to identify large changes from year to year that might signify incomplete data in years with fewer records.

3.2.3. Comparison of registry data with information obtained at participating sites

We selected several data elements for verification against information obtained at the sites. These were veteran status, HIV status, CD4 count, vital status, and presence and type of AIDS-defining diagnosis. Using information available in the patients’ charts and in VISTA, the research nurses determined the registry categories of the patients in the random samples as of March 31, 1994. The first set of random samples of 100 patients per site was used as a practice exercise to ensure the research nurses understood the HIV category definitions. The results of this practice exercise were reviewed in detail with the research nurses by one of the study co-chairs (L.R.). After the practice exercise, West Haven CSPCC generated four successive random samples per site, each containing 140 patients. For each random sample the research nurses determined the patients’ registry categories using data available in the charts and in
VISTA. The results were reviewed in detail by the study co-chair to ensure that category definitions were appropriately and uniformly applied across the sites, and the percent disagreement was calculated.

4. Results

4.1. Entry of patients into the registry

By December 1996, 40,000 patients had been identified to the VA HIV Registry, nationally. Fig. 1 shows the cumulative number of AIDS patients identified to the registry from its inception to December 1996. As expected, the cumulative number of AIDS patients rose from 0 in the early 1980s to about 30,000 by 1996. Fig. 2 shows the number of new AIDS patients identified to the registry over the same time period. Small numbers of patients were identified in the mid-1980s. Increasing numbers of new patients were identified in the late 1980s and early 1990s as awareness of the disease spread. Fig. 3 shows the number of new HIV-infected patients (with and without AIDS) identified to the registry. The sharp increase in the number of patients in 1991 that appears in Figs. 2 and 3 illustrates the influx of new patients identified to the registry during that year, when use of the registry was mandated for all facilities. The sharp increase is likely accounted for by the entry of patients during that year who should have been entered earlier. We concluded that the number of AIDS and HIV-infected patients appeared reasonable, aside from the sharp rise in incidence in 1991.

4.2. Completeness and accuracy of registry data

4.2.1. Non-veterans, non-HIV-infected patients, and deaths

Table 2 shows that a few patients at each site who had been entered into the registry were either not veterans or were not HIV-infected. Table 2 also shows that identification of deaths to the registry was incomplete. For a given random sample, the number of deaths not identified varied across the sites from lows of 0 at Miami and San Francisco to a high of 27 (27/140 or 19%) at Los Angeles.
Detailed review of these results indicated that the under-identification of deaths to the registry occurred because of lack of manual updating of deaths by the registry coordinators. This was particularly a problem for patients who died outside of a VA facility, such as those who died at home or in a hospice. Because of confidentiality concerns, as described earlier, the research team did not have access to patients’ SSNs that would have allowed the registry files to be merged with BIRLS. Because BIRLS contains the dates of death of all veterans whose family members request death benefits (estimated at about 95% of all veterans’ deaths), this would have allowed us to ascertain deaths that had not been identified to the registry.

4.2.2. Registry categories and deaths

As shown in Table 3, the percent disagreement in registry category between the research nurses and the registry
Fig. 1. Cumulative AIDS cases in veterans using the VA system.
varied within and across the sites. Examining the results across sites shows that Los Angeles (3.6–11.5%) and New York (11.2–13.9%) had the least disagreement, San Francisco was intermediate (13.6–17.1%), and Houston (18.7–28.9%) and Miami (36.0–41.0%) had the most disagreement. The disagreement also varied across registry categories. Detailed review of the results indicated two major reasons for the disagreement. First, as described previously, changes occurred in the registry category definitions over time, reflecting changes in the CDC AIDS Surveillance Case Definition. However, the changes were not consistently applied across the sites. For example, although the
registry coordinators were advised not to “update” the categories of patients in the registry unless the patients presented for care after January 1, 1993, in Miami the coordinator made an intensive effort to update the categories of all the patients. The second reason for the disagreement was the lack of manual updating of registry category changes by the registry coordinators. For example, consider a patient with a CD4 count of 250 at the time s/he was first identified to the registry. On the basis of this CD4 count, the patient would be assigned to registry category 2. Suppose that 7 months later the patient’s CD4 count had fallen to 180. A CD4 count of 180 means that the patient should be assigned to
Fig. 2. AIDS incidence in veterans using the VA system.
Fig. 3. Number of HIV patients entered into the National VA HIV Registry.
registry category 3. However, since all category updates had to be done manually, this required an additional effort by the registry coordinator. Lack of manual updating by the registry coordinator was an important reason for disagreements reported in Table 3.

4.2.3. Missing data

Missing data occurred for two reasons. First, information on some items (e.g., ethnicity, transmission category) was missing because it was not manually entered by the registry coordinator at the time the patient was identified to the registry. Thus, of the 16,344 HIV-infected patients who used the VA in 1993, in 14% the ethnicity was unknown, and in 34% transmission category was unknown (data not shown). Very similar proportions of the 15,966 patients who used the VA in 1994 had missing data for these two items.

The second reason for missing data related to the computer programming that creates the registry. This software needs to know the location of the VISTA data at each facility, so that it can “point to it” and extract the needed information. If the VISTA location is incorrect no data will be extracted. This problem particularly affected the pharmacy files. The outpatient pharmacy file contained no information on antiretroviral prescriptions for one of the sites (New York). The unit dose pharmacy file contained no data in the file for two regions, which in turn affected three participating sites.

4.2.4. Lack of uniformity of data

In some instances, there was lack of uniformity in grouping and naming structure in the data. This was particularly noted for the laboratory file. For example, chemistry data are listed as multiple entries, such as Na, K, Cl, HCO3, etc.
Table 2. Comparison of HIV registry data with information obtained by the research nurses at the participating sites

Number of patients with disagreement | Miami | Houston | New York | San Francisco | Los Angeles
Random sample 1 (a)
Non-HIV (b) | 2 | 1 | 6 | 0 | 0
Deaths (c) | 3 | 16 | 10 | 1 | 27
Random sample 2 (a)
Non-HIV (b) | 1 | 5 | 3 | 0 | 1
Deaths (c) | 0 | 23 | 11 | 0 | 12
Random sample 3 (a)
Non-HIV (b) | 3 | 6 | 3 | 1 | 0
Deaths (c) | 2 | 18 | 10 | 3 | 6
Random sample 4 (a)
Non-HIV (b) | 1 | 4 | 1 | na | 0
Deaths (c) | 3 | 10 | 10 | na | 4

(a) Random samples of 140 patients at each site who were denoted in the registry as being alive on March 31, 1994.
(b) Patients who should not have been in registry (e.g., non-veteran; not HIV-infected).
(c) Patients whose death status had not been entered into the registry.
na: Information not available because of time constraints experienced by the research nurse.
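
The random samples used in Tables 2 and 3 were drawn within each site, without replacement, in fixed numbers per registry category (29, 29, 29, and 53). The following is a minimal sketch of that kind of stratified draw, not the procedure actually run at the coordinating center; the function name, the record layout, and the seed are illustrative assumptions.

```python
import random

# Per-category quota from the study design (29 + 29 + 29 + 53 = 140).
QUOTA = {1: 29, 2: 29, 3: 29, 4: 53}


def draw_successive_samples(patients, n_samples, seed=0):
    """Draw successive stratified samples without replacement at one site.

    patients: list of (registry_number, category) tuples (hypothetical layout).
    Returns a list of samples; each sample is a list of registry numbers,
    and no patient appears in more than one sample.
    """
    rng = random.Random(seed)
    # Shuffle each category pool once, then cut it into consecutive blocks.
    pools = {}
    for cat in QUOTA:
        pool = [rid for rid, c in patients if c == cat]
        rng.shuffle(pool)
        pools[cat] = pool
    samples = []
    for i in range(n_samples):
        sample = []
        for cat, k in QUOTA.items():
            block = pools[cat][i * k:(i + 1) * k]
            if len(block) < k:
                raise ValueError(f"category {cat} has too few patients")
            sample.extend(block)
        samples.append(sample)
    return samples
```

Shuffling each category once and slicing it into blocks is one simple way to guarantee that successive samples do not overlap; other schemes would satisfy the same description.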
Table 3. Comparison of HIV registry category data with information obtained by the research nurses at the participating sites

Number of patients with disagreement, by site

Registry category | Sample size (n = 140) | Miami | Houston | New York | San Francisco | Los Angeles
Random sample 1 (a)
1 | 29 | 2 | 11 | 1 | 4 | 5
2 | 29 | 8 | 12 | 10 | 2 | 4
3 | 29 | 21 | 8 | 1 | 3 | 7
4 | 53 | 20 | 3 | 3 | 10 | 0
Disagreement (%) (b) | | 37.0 | 24.5 | 11.2 | 13.6 | 11.4
Random sample 2 (a)
1 | 29 | 8 | 11 | 1 | 4 | 3
2 | 29 | 1 | 13 | 10 | 4 | 4
3 | 29 | 24 | 10 | 1 | 0 | 7
4 | 53 | 17 | 5 | 4 | 16 | 2
Disagreement (%) (b) | | 36.0 | 28.9 | 11.7 | 17.1 | 11.5
Random sample 3 (a)
1 | 29 | 5 | 9 | 1 | 0 | 5
2 | 29 | 8 | 6 | 11 | 2 | 3
3 | 29 | 21 | 8 | 4 | na | 2
4 | 53 | 20 | 2 | 3 | 13 | 0
Disagreement (%) (b) | | 39.4 | 18.7 | 13.9 | 13.6 | 7.1
Random sample 4 (a)
1 | 29 | 4 | 12 | 3 | na | 2
2 | 29 | 7 | 7 | 12 | na | 2
3 | 29 | 24 | 15 | 3 | na | 1
4 | 53 | 22 | 4 | 1 | na | 0
Disagreement (%) (b) | | 41.0 | 27.9 | 13.7 | na | 3.6

(a) Random samples of 140 patients at each site who were denoted in the registry as being alive on March 31, 1994; category 1: CD4 ≥500; category 2: CD4 200–499; category 3: CD4 <200; category 4: AIDS-defining condition.
(b) [Total number of patients with disagreement]/[140 minus number of non-HIV patients].
na: Information not available because of time constraints experienced by the research nurse.
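
The formula in footnote b can be checked directly against the tables. As an illustration, the sketch below recomputes the random sample 1 percentages for Miami, Houston, and New York from the Table 3 disagreement counts and the Table 2 non-HIV counts; the function name is ours.

```python
def percent_disagreement(disagreements, n_non_hiv, sample_size=140):
    """Footnote b: total disagreements / (sample size minus non-HIV patients)."""
    return 100.0 * sum(disagreements) / (sample_size - n_non_hiv)


# Random sample 1: per-category disagreement counts (Table 3)
# paired with the number of non-HIV patients (Table 2).
sample_1 = {
    "Miami": ([2, 8, 21, 20], 2),
    "Houston": ([11, 12, 8, 3], 1),
    "New York": ([1, 10, 1, 3], 6),
}

for site, (counts, non_hiv) in sample_1.items():
    print(f"{site}: {percent_disagreement(counts, non_hiv):.1f}%")
# Miami: 37.0%
# Houston: 24.5%
# New York: 11.2%
```

Where a category is marked na (San Francisco, random sample 3), the reported 13.6% appears to be reproduced only if the 29 unassessed patients are also removed from the denominator (15/110); this is our inference from the arithmetic, not something the footnote states.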
rather than being grouped together as a single entry, called “chemistry,” or “electrolytes.” Similarly, liver enzymes appear as multiple entries, such as SGOT, AST, alkaline phosphatase, rather than as a single entry called “liver chemistry,” or “liver enzymes.” Thus, tests that in clinical practice are ordered by physicians as a group or “panel” are disaggregated into their constituents in the laboratory file, which means that extensive computer programming is needed to aggregate and classify this information before it can be used by investigators.

In addition, there is lack of a standardized name for the same item. For example, CD4 count appeared in dozens of different ways, such as: helper T, help-T, absolute T4, helper T cells, T helper, T4 helper, abs CD4, abs. CD4, abs CD4-onc, CD4 lymphs, CD 4(T4: T-helper), CD4 (helper cells), CD4 abs, CD4 absolute, helper T cell, CD4 lymphocyte, etc. To make use of CD4 count information in the laboratory file, the file first had to be manually reviewed by the study co-chair and all entries referring to CD4 count identified and grouped under the same variable name.

The lack of grouping and lack of a standardized name also pertained to the outpatient prescription file. A given drug, such as zidovudine, appeared in different ways, such as: AZT, ZDV, zidovudine, zidovudine capsule, zidovudine/retrovir, retrovir, retrovir/zidovudine, and x-AZT. To make use of this information, the outpatient pharmacy file was manually reviewed by the study co-chair, and the entries were grouped into 11 different categories of drugs (antimicrobials, antiretrovirals, psychiatric agents, decongestants, analgesics, nutritional supplements and vitamins, gastrointestinal agents, non-steroidal anti-inflammatory drugs, steroids, cardiovascular agents, and others).

5. Conclusions

In this article we report our experience with the VA’s national HIV registry, which, to our knowledge, is the only national HIV registry in the US. The basis of this experience is our pilot study, which was the first time that the registry had been used for research. During the conduct of the pilot study we learned many valuable lessons about the development, implementation and use of the VA’s national HIV registry. We believe these lessons have broad application for clinical, epidemiologic, and health services research involving disease-specific registries.

The VA’s national HIV registry was created by frequent, automatic, electronic transmission of data to a central repository from specific administrative and clinical computer packages that are in place at all VA facilities. Three key strengths of the VA’s national HIV registry are that it is national in scope, it contains longitudinal data, and it contains detailed resource utilization and clinical information.

Despite its considerable strengths, the VA’s HIV registry has major limitations for research applications. The three
main problems we identified are missing data, lack of uniformity of the data, and patient confidentiality issues.

In the future, for the VA’s national HIV registry as well as other disease-specific registries that might be developed using a similar approach, the problem of missing data could be addressed in two ways. The first is by avoiding the manual entry and updating of information into the registry as much as possible. For example, patients could be automatically electronically identified to the registry at the time the first positive HIV test appears in a VISTA laboratory package at a facility. In addition, the manual updating of HIV categories should be replaced by automatic updating based on CD4 count information that is electronically transmitted to the registry. An important principle for investigators to bear in mind in developing disease-specific registries is that the less manual data entry and updating is required, the lower the chance of missing data. For information that requires manual entry because it is not available in any VISTA package (e.g., HIV transmission category), a computerized monitoring system could be devised to “flag” the missing data and automatically send a message to the registry coordinator at the facility to indicate the need to enter the information.

The second way that the problem of missing data could be addressed is by having a full-time individual, such as a computer programmer, who is responsible for the ongoing maintenance of the registry. This individual would be responsible for a system that monitors data extraction for completeness and accuracy, and communicates directly with the VISTA package coordinators at the facilities when problems occur. This would ensure that regular computerized data checks are conducted to detect large gaps that occur because the registry software does not “point” to the correct location of the VISTA information at the facilities.
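
The automatic category updating proposed above can be sketched as follows, assuming the 1993 category definitions (category 1: CD4 ≥500; category 2: CD4 200–499; category 3: CD4 <200) and the rule that a category may only worsen. The function names are hypothetical. The sketch ignores the CD4 percent criterion and leaves category 4 to manual entry, since an AIDS-defining condition cannot be derived from CD4 counts alone.

```python
def category_from_cd4(cd4_count):
    """Map an absolute CD4 count to registry category 1-3 (1993 definitions)."""
    if cd4_count >= 500:
        return 1
    if cd4_count >= 200:
        return 2
    return 3


def update_category(current_category, cd4_count):
    """Return the updated category; it never drops below the current one."""
    return max(current_category, category_from_cd4(cd4_count))


# Example from Section 4.2.2: first CD4 count 250, then 180 seven months later.
category = category_from_cd4(250)          # category 2
category = update_category(category, 180)  # worsens to category 3
category = update_category(category, 320)  # stays 3: categories cannot improve
print(category)  # 3
```

Taking the maximum of the stored and newly computed category is what enforces the worsening-only rule; a patient already in category 4 is left unchanged by any CD4 result.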
In the case of the VA’s national HIV registry, these checks would ensure that the large data gaps that we observed in the outpatient pharmacy, unit dose pharmacy, and IV pharmacy files do not occur.

The problem of lack of uniformity of registry data could be addressed in three ways. The first is to develop standardized definitions and labels for the data elements that would be used at all facilities. This would ensure that the information that is transmitted is standardized. The second is to aggregate the data in a standardized format that is retained during data transmission from the facility. For example, in the case of laboratory tests, the tests could be aggregated in user-friendly bundles with standardized labels, such as “chemistry panel,” rather than transmitted as multiple individual tests (e.g., Na, K, glucose). In the case of medications, a drug recorded under different names (e.g., AZT and zidovudine) could be given a single label (zidovudine), and the drug could also be identified by type (antiretroviral agent). This would avoid the very cumbersome, labor-intensive work of having to review the pharmacy files by hand to create standardized drug labels and to identify the types. The third is to have the computerized monitoring system run data checks for breaks in the use of standardized drug labels and types so that these could be promptly remedied.

The confidentiality issue poses a very real problem for users of VA’s national HIV registry because it precludes investigators from linking the registry to other VA databases, such as the patient treatment file (PTF), which is VA’s hospital discharge database, or VA’s Beneficiary Identification and Record Locator Subsystem (BIRLS), which contains information on veterans’ deaths. The ability to link the HIV registry to the PTF would allow a more extensive verification of the registry data. In the case of deaths, linkage with BIRLS would obviate the need for manual entry of dates of death. The inability to link the registry to BIRLS means that deaths are underidentified to the registry, which has important implications for research that is intended to estimate health care resource use and outcomes.

The VA is currently in the process of revamping some of the procedures for entry of data to the HIV registry and for maintenance of the registry. An HIV Coordinating Center has been established at the San Diego VA Medical Center. One of its responsibilities is to develop procedures to improve the quality and completeness of the registry data. These procedures include identifying missing data and establishing a mechanism to feed this information back to VA facilities with database problems. In addition, the Hines VA Medical Center, which houses the registry, will now be permitted to merge death data from BIRLS into the registry, eliminating the problem of manual entry of deaths.

The importance of our work with the VA’s national HIV registry is that the lessons learned could be used to guide the development of model disease-specific registries in the future.
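As one illustration of these lessons, the standardized drug labels and the check for breaks in their use, both discussed above, might be sketched as follows. The synonym table, function names, and sample entries are hypothetical; only the AZT/zidovudine pairing and the "antiretroviral agent" type come from the text.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical synonym table: lower-cased local name -> (standard label, drug type).
DRUG_MAP: Dict[str, Tuple[str, str]] = {
    "azt": ("zidovudine", "antiretroviral agent"),
    "zidovudine": ("zidovudine", "antiretroviral agent"),
}


def standardize(local_name: str) -> Optional[Tuple[str, str]]:
    """Map a facility's local drug name to its standard label and type, if known."""
    return DRUG_MAP.get(local_name.strip().lower())


def find_label_breaks(local_names: Iterable[str]) -> List[str]:
    """Return the distinct local names that have no standardized label.

    A non-empty result is a break in the use of standardized labels that
    the monitoring system would report for prompt correction.
    """
    return sorted({name for name in local_names if standardize(name) is None})
```

A table-driven mapping is used here because it keeps the standard labels in one place that can be reviewed and extended centrally; entries returned by `find_label_breaks` would either be added to the table or referred back to the facility.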
Within VA, given that the VISTA packages are in place in all VA facilities, VISTA data can be extracted and electronically transferred to a central repository in a similar fashion, creating longitudinal databases that are national in scope. Outside VA, any large health care system or HMO could potentially create such a registry using a similar approach. Such registries can serve as incredibly rich sources of clinical and health care resource utilization information that are not generally available in administrative databases.

Acknowledgments

Supported by the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service, SDR Project # 92-003. Dr. Rabeneck is the recipient of a VA Health Services Research and Development (HSR&D) Advanced Research Career Development Award.

References

[1] Brooke EM. The current and future use of registers in health information systems. Geneva: World Health Organization, 1974.
[2] Hammond M, Stassen L. Do you CARE? A national registry for cleft lip and palate patients. Br J Plastic Surgery 1999;52:12–7.
[3] Owen JL, Bolenbaucher RM, Moore ML. Trauma registry databases: a comparison of data abstraction, interpretation, and entry at two level 1 trauma centers. J Trauma: Injury, Infec Crit Care 1999;46:1100–4.
[4] Kozinetz CA, Skender ML, MacNaughton NL, Del Junco DJ, Almes MJ, Schultz RJ, Glaze DG, Percy AK. Population-based registries using multidisciplinary reporters: a method for the study of pediatric neurologic disorders. J Clin Epidemiol 1995;48:1069–76.
[5] Taylor SM, Robison JG, Langan EM, Crane MM. The pitfalls of establishing a statewide vascular registry: The South Carolina experience. Am Surgeon 1999;65:513–8.
[6] Samsa G, Hoenig H, Carswell J, Sloane R, Bovender CR, Lukas CV, Horner RD. Developing a national registry of veterans with spinal cord dysfunction: experiences and implications. Spinal Cord 1998;36:57–62.
[7] Reynolds P, Saunders LD, Layefsky ME, Lemp GF. The spectrum of acquired immunodeficiency syndrome (AIDS)-associated malignancies in San Francisco, 1980–1987. Am J Epidemiol 1993;137:19–30.
[8] Centers for Disease Control and Prevention. Surveillance of tuberculosis and AIDS co-morbidity—Florida, 1981–1993. Morb Mortal Wkly Rep 1996;45:38–41.
[9] Thompson MA. The AIDS Research Consortium of Atlanta database. J Acquir Immune Defic Syndr Hum Retrovirol 1998;17(Suppl 1):S20–2.
[10] Crystal S, Kersting RC. Stress, social support, and distress in a statewide population of persons with AIDS in New Jersey. Soc Work Health Care 1998;28:41–60.
[11] Moore M, McCray E, Onorato IM. Cross-matching TB and AIDS registries: TB patients with HIV co-infection, United States, 1993–1994. Public Health Rep 1999;114:269–77.
[12] Chaisson MA, Berenson L, Li W, Schwartz S, Singh T, Forlenza S, Mojica BA, Hamburg MA. Declining HIV/AIDS mortality in New York City. J Acquir Immune Defic Syndr 1999;21:59–64.
[13] Centers for Disease Control. Revision of the CDC surveillance case definition for acquired immunodeficiency syndrome. Morb Mortal Wkly Rep 1987;36:1–15S.
[14] Centers for Disease Control and Prevention. 1993 Revised classification system for HIV infection and expanded surveillance case definition for AIDS among adolescents and adults. Morb Mortal Wkly Rep 1992;41(RR-17):1–19.
[15] Menke TJ, Rabeneck L, Hartigan PM, Simberkoff MS, Wray NP. Clinical and socioeconomic determinants of health care use among HIV-infected patients in the Department of Veterans Affairs. Inquiry 2000;37:61–74.