Pergamon
Journal of Government Information, Vol. 24, No. 1, pp. 27-37, 1997
Copyright © 1997 Elsevier Science Ltd
Printed in the USA. All rights reserved
1352-0237/97 $17.00 + .00
PII S1352-0237(96)00051-2
PRELIMINARY FINDINGS ABOUT THE SELECTION OF U.S. CENSUS DATA BY STATE AND LOCAL GOVERNMENT AGENCIES AND NONPROFIT ORGANIZATIONS*

LEONARD M. GAINES**
New York State Department of Economic Development, One Commerce Plaza, Albany, NY 12245, USA, Internet:
[email protected]
Abstract-Historically, accurately determining the data needs of state and local government agencies and nonprofit organizations has been difficult. Many of these types of organizations participate in the Census Bureau’s State Data Center and Business/Industry Data Center programs. This paper presents preliminary findings of a diary-based study that looks at the census data used by over 200 organizations participating in these programs. By having staff of these agencies record summaries of the census data they use, including decennial and economic censuses and current statistics, it is possible to determine objectively the relative demand for the various data subjects. It is further possible to determine the frequency with which these data are used along subject and geographic lines. While the diaries are being maintained over a period of one year, this paper presents findings based on the first three months of these diaries, covering about 2,000 individual data requests. The preliminary findings indicate that these organizations routinely use a very wide variety of census data covering the full range of geographic detail. Additionally, this paper looks at the census data used by various types of organizations. Copyright © 1997 Elsevier Science Ltd

Keywords-State Data Center, Census data, Information selection, State government, Local government, Nonprofit organizations
INTRODUCTION

Historically, the U.S. Bureau of the Census has had difficulty in accurately determining which data from the decennial census are needed by state and local government agencies. Through diary records being collected for a study on the relationship between organizational characteristics and the range and amount of census data used by these organizations, this study identifies the decennial census data actively being used by state and local governments. This paper details the problem, describes the study’s methodology, presents preliminary results from the diaries, states the major limitations of this report, and discusses the significance of these results.

*This paper is an expanded version of a paper presented at the 1995 Joint Statistical Meetings in Orlando, Florida on August 12-17, 1995, sponsored by the American Statistical Association, the Biometric Society Eastern and Western North American Regions, and the Statistical Society of Canada. The views expressed in this paper are solely those of the author and do not represent those of the New York State Department of Economic Development, the State University of New York-Empire State College, or Rensselaer Polytechnic Institute. This paper is taken in part from a thesis to be submitted in partial fulfillment for the degree of Doctor of Philosophy in Urban and Environmental Studies at Rensselaer Polytechnic Institute. The author is grateful to Dr. Robert Kraushaar, Dr. J. R. Norsworthy, Dr. Douglas Rebne, Robert Scardamalia, Jeffrey Schnur, and to the three anonymous readers for the useful comments they provided in reviewing drafts of this paper.

**Leonard Gaines is a Program Research Specialist in the New York State Department of Economic Development’s Division of Policy and Research, an Adjunct Professor at the State University of New York Empire State College, and a Ph.D. candidate at Rensselaer Polytechnic Institute.

PROBLEM STATEMENT
This is the point in the decade when the Census Bureau makes a number of significant decisions regarding the next decennial census. One of these decisions relates to the topics to be included in the upcoming census. Specifically, it involves the question of the geographic detail at which data users actually use each data topic. The answer to this question helps determine which data topics are to be collected by surveys such as the Current Population Survey and which topics are to be collected as part of the decennial census. In developing the background for these decisions, the Census Bureau consults with a wide variety of data users.

The answer to this question is particularly important in planning for the 2000 Census because the Census Bureau is under pressure from Congress to reduce the number of, or even eliminate, the questions asked of a sample of the population in the decennial census due, in part, to a decline in the mail response rate between the 1980 and 1990 Censuses. This pressure is coming in spite of an evaluation done by the National Research Council showing that the number of questions included on the sample questionnaire does not significantly affect the response rate [1].

Many of the data users from outside the federal government are represented on Census Advisory Committees by their trade or professional organizations, including the American Marketing Association, representing the business community, and the American Statistical Association, representing the statistical and social science research communities. One group of major users is state and local governments. Due to this group’s diversity, however, no major organization represents its common interests. Consequently, the data needs of state and local government agencies have been historically underrepresented in the decennial census planning process except to the extent that they are members of organizations represented on the advisory committees.
The Census Bureau has conducted customer-satisfaction-type surveys of non-federal data users centered on the question of how important they believe certain data topics and tables are, rather than how much they use these data or why they need them. As reported by Mills and Cresce, the Census Bureau conducted its first survey that looked at the needs of non-federal data users this past winter [2]. The purpose of that survey was to determine which data topics non-federal data users need to perform their jobs properly and what alternative data sources might be available. The survey showed that each data topic included in the 1990 Census was needed by at least some non-federal data users, generally for program planning, evaluation, or development purposes. It also found that in most cases there were no acceptable alternative data sources.

For the purposes of reporting the results of this survey to the U.S. Office of Management and Budget and to Congress, the U.S. Bureau of the Census divided the data topics into three categories: those specifically required by federal law (mandatory), those required for the administration of federal programs but not specifically required by federal law (required), and those traditionally used for programmatic purposes by the federal government (programmatic) [3]. Table 1 identifies which data topics are included in each of these categories.

Table 1. 1990 Census topics by federal need category

Mandatory: Age; Citizenship; Educational Attainment; Ethnicity; Housing Units; Income; Income by Source; Language Spoken at Home; Marital Status; Place of Birth; Place of Work; Population; Race; Relationship; Rooms in Unit; School Enrollment; Sex; Tenure; Travel Mode; Travel Start; Travel Time; Vacancy Status; Vehicles Available; Veteran Status; Year Structure Built; Year of Entry

Required: Bath, Presence of; Class of Worker; Condo Fees; Contract Rent; Disability; Electricity Cost; Gas Cost; Heating Fuel; Heating System; Hours Worked Last Week; Industry; Insurance Cost; Kitchen, Presence of; Labor Force Status; Number of Bedrooms; Occupation; Type of Housing; Units in Structure; Water Cost

Programmatic: Ancestry; Children Ever Born; Meals in Rent; Residence in 1985; Sewage Disposal; Telephone in Unit; Type of Sewer; Value of Housing Unit; Water Source; Work Status Last Year

Also, the Association of Public Data Users (APDU) studied the ways non-federal data users use the data topics classified by the Census Bureau as programmatic, based on the responses of about 250 non-federal data users to a widely distributed census data use description form. The APDU study reported that these data topics are used for small geographic levels by non-federal statistical users as standard measuring sticks, largely to identify and establish priorities for their own actions or to comply with governmental regulations. This study also found that there are now no alternatives to the Census for these data [4].

In preparing the National Research Council report mentioned above, Conrad, Citro, and Edmonston surveyed the State Data Centers in 18 states about their perception of the importance of the various data topics, their priorities for the features of census data, and their reaction to various potential census designs. This survey, like the others mentioned above, found that state and local governments rely on all of the census’ data topics for all geographic levels. It also found that these data centers felt a need for accurate, timely data for small geographic units such as census tracts, block groups, and blocks. Finally, it found that the data centers consider the census to be the only source for much of the data, and the only source for reliable data, about each of the data topics included in the census [5].

While the studies described above examined the data needs of non-federal users of the census, none of them attempted to measure the relative frequency of data use. By looking at the amount of use of the 1990 Census topics during a three-month period several years after they had been released, this study begins to fill that gap.

PROCEDURES

Study Population

The organizations selected for this study are participants in the U.S. Bureau of the Census’ State Data Center and Business/Industry Data Center (SDC/BIDC) programs. By participating in these programs, and in exchange for agreeing to provide the data to the public, these approximately 1,800 organizations receive data products from the U.S. Bureau of the Census free of charge. The types of organizations participating in these programs are state government agencies (15 percent), regional or local government agencies (56 percent), university-based organizations (22 percent), and nonprofit organizations and trade associations (7 percent).

General Structure
The study providing the data for this analysis is being conducted in two data collection phases. The first phase, already completed, was the collection of organizational characteristics, which also served to recruit participants for the study’s second phase. Of the 819 organizations surveyed in this phase, 517 completed questionnaires.

The second phase requires study participants to maintain diaries of the census data requested by their own organization. These diaries are to be maintained for 17 one-week periods, every third week, for approximately one year. On a rotating basis, roughly one-third of the agencies participating in this phase maintain diaries during any given week. Of the 517 organizations returning questionnaires in the first phase, 211 agreed to maintain diaries. These were divided into six subpanels; the first three subpanels included a total of 144 organizations. The first panel began its diaries the week of October 3, 1994; the second and third panels started keeping their diaries in the following two weeks; and the fourth through sixth panels started maintaining their diaries on December 5, 1994 and in the following two weeks. This report is based on the diaries completed through December 30, 1994. Therefore, the organizations in the first three panels maintained diaries for four weeks and those in the second three panels kept them for one week.

The Diaries
The participants in the diary phase of the study were asked to record any request from within their own organization for data produced by the U.S. Bureau of the Census, noting in particular the subject content and geographic levels of each request. During the three-month period used as the basis for this paper, 119 of the participants reported at least one request for Census Bureau data. The remaining participants either did not return the diaries or reported no internal requests for census data.

The participants were not provided a list of standard data topics to use in recording data requests. Therefore, the diaries that were returned recorded such requests as “demographic data” or “poverty status.” In order to relate these to the topics included in the 1990 Census, these requests were recoded to the original topics whenever possible. For example, a request for “poverty status” was recoded as requests for income, age, and relationship, since those are the data topics used by the Census Bureau to define poverty status. If a reported request was not specific enough to permit this type of recoding, or was based on a Census Bureau product other than the decennial censuses, it was dropped from the list of requests. As a result of these deletions and additions, and due to multiple topics in a request, the original 2,196 requests reported by the study’s participants resulted in 2,212 specific topic/geographic level combinations.

RESULTS AND DISCUSSION
This section presents the actual data requests by topic, geographic level, and topic and geographic level reported by this study’s participants. It also provides information on the number of states and the types of organizations in which people are using these data topics.
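The recoding described under The Diaries, which turns free-text diary entries into the topic/geographic level combinations tabulated in this section, can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used in the study: the `RECODE` table, the function names, and the sample entries are all hypothetical, and only the "poverty status" mapping (to income, age, and relationship) is taken from the text.

```python
from collections import Counter

# Hypothetical recode table: free-text diary subject -> 1990 Census topics.
# Only the "poverty status" entry comes from the paper; the other entries are
# illustrative one-to-one mappings for subjects that already name a topic.
RECODE = {
    "poverty status": ["Income", "Age", "Relationship"],
    "population": ["Population"],
    "income": ["Income"],
}


def recode_request(subject, level):
    """Return the (topic, geographic level) pairs for one diary entry.

    Subjects that cannot be tied to specific topics (for example
    "demographic data") return an empty list, i.e. they are dropped.
    """
    topics = RECODE.get(subject.strip().lower(), [])
    return [(topic, level) for topic in topics]


def tally(entries):
    """Count recoded requests by topic, by level, and by (topic, level)."""
    by_topic, by_level, by_pair = Counter(), Counter(), Counter()
    for subject, level in entries:
        for topic, lvl in recode_request(subject, level):
            by_topic[topic] += 1
            by_level[lvl] += 1
            by_pair[(topic, lvl)] += 1
    return by_topic, by_level, by_pair


# Made-up diary entries, for illustration only.
entries = [
    ("Poverty status", "County"),
    ("demographic data", "State"),
    ("Population", "Census Tract"),
]
by_topic, by_level, by_pair = tally(entries)
print(by_topic)
print(by_level)
```

In this toy example three diary entries yield four topic/geographic level combinations: one entry expands into three topics and one is dropped. The same mechanism is why the number of combinations in the study can differ from the number of requests originally reported.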
Topic Usage
From Table 2 it can be seen that the most heavily requested data topics are: (1) income (396 or 18 percent of the 2,212 requests), (2) population (377 or 17 percent), (3) age (344 or 16 percent), (4) relationship (215 or 10 percent), and (5) race (196 or 9 percent). All of these topics are classified as mandatory. Of these, only income is collected on a sample basis. According to Table 2, the most commonly requested required subjects are: (1) labor force status (92 or four percent), (2) occupation (47 or two percent), and (3) disability (22 or one percent). This table also shows that the value of the housing unit is the most heavily used programmatic variable, with 26 requests (or one percent).

A number of topics included in two of the three federal need categories had no requests within the time period of this report. These topics included the programmatic subjects “work status last year” and “last year worked” and the required subject “year moved into unit.”

Table 2. Number of requests for 1990 census topics reported by selected SDC/BIDC organizations

Topic                       Total Requests
Income                      396
Population                  377
Age                         344
Relationship                215
Race                        196
Sex                         99
Labor Force Status          92
Ethnicity                   69
Educational Attainment      62
Occupation                  47
Housing Units               42
Value of Housing Unit       26
Disability                  22
Tenure                      13
Insurance Cost              12
Type of Housing Unit        12
Electricity Cost            12
Gas Cost                    12
Water Cost                  12
Units in Structure          10
Type of Sewer               10
Industry                    9
Contract Rent               8
Condo Fees                  8
Place of Work               7
Travel Time                 7
Vacancy Status              7
Water Source                7
Income by Source            7
Marital Status              7
Year Structure Built        7
Veteran Status              6
Ancestry                    5
Hours Worked Last Week      5
Class of Worker             5
Vehicles Available          4
Language Spoken at Home     4
Residence in 1985           4
Citizenship                 4
Bath, Presence of           3
Children Ever Born          3
Kitchen, Presence of        3
Heating Fuel                2
Travel Mode                 2
Travel Start                2
Year of Entry               1
Meals in Rent               1
Heating System              1
Place of Birth              1
School Enrollment           1
Rooms in Unit               1

Use of Geographic Levels
As mentioned above, in order to determine the appropriate method of collecting potential census data topics, there is a need to know the geographic levels being requested by data users. From Table 3 it can be seen that the organizations involved with the SDC/BIDC programs primarily use data at the county and municipal levels. Together, these levels account for over one-half of the requests reported by these organizations. This appears to be consistent with the local nature of the organizations involved in these programs. These organizations are also heavy users of data for geographic levels smaller than governmental units. Nearly 15 percent of the requests reported by these organizations were for data at the census tract, block group, or block level. Requests for data at the block group or block levels were more common than requests for the entire United States.

Table 3. Number of requests by geographic level and number of subjects by geographic level

Geographic Level           Requests   Subjects
County                     648        42
Municipality               636        40
State                      367        34
Census Tract               174        28
US                         96         12
Block Group                76         18
MSA                        73         14
ZIP Code                   41         7
Block                      41         10
PUMA                       19         8
Native American Area       7          5
Traffic Analysis Zone      7          5
Unspecified                6          4
School District            4          3
Division                   4          3
Region                     4          4
Congressional District     4          4
Urban/Rural                3          2
Urbanized                  2          2

Topic and Geographic Level

As noted above, one question of interest in planning the data content of the next decennial census is the lowest geographic level being used for each data topic. Table 4 addresses this question by presenting the data topics grouped by the lowest geographic level requested. As can be seen from this table, the participating organizations use many of these data topics at geographic levels smaller than governmental units. Moreover, there are a large number of data topics that need to be collected using a very large sample in order to produce the data at the geographic levels used by this study’s participants.

Table 4. Data topics by lowest geographic level requested

State: Place of Birth; Rooms in Unit; School Enrollment; Year of Entry

County: Heating System; Language Spoken at Home; Meals included in Rent; Residence in 1985

Municipality: Ancestry; Children Ever Born; Citizenship; Condominium Fees; Electricity Cost; Gas Cost; Heating Fuel; Hours Worked Last Week; Industry; Insurance Cost; Marital Status; Occupation; Veteran Status; Water Cost

Census Tract/BNA: Class of Worker; Source of Income; Place of Work; Tenure; Travel Mode; Travel Start Time; Travel Time; Type of Housing

Block Group: Bath, Presence of; Disability; Educational Attainment; Income; Kitchen, Presence of; Type of Sewer; Units in Structure; Vehicles Available; Year Structure Built

Block: Age; Contract Rent; Ethnicity; Housing Units; Population; Race; Relationship; Sex; Vacancy Status; Value of Housing Unit

Table 5. Number of states with requests for census data by topic

Topic                       States Using Data   Number of Requests
Age                         34                  344
Ancestry                    5                   5
Bath, Presence of           2                   3
Children Ever Born          2                   3
Citizenship                 3                   4
Class of Worker             3                   5
Condo Fees                  4                   8
Contract Rent               5                   8
Disability                  9                   22
Educational Attainment      22                  62
Electricity Cost            4                   12
Ethnicity                   19                  69
Gas Cost                    4                   12
Heating Fuel                1                   2
Heating System              1                   1
Hours Worked Last Week      3                   5
Housing Units               16                  42
Income                      36                  396
Income by Source            --                  7
Industry                    --                  9
Insurance Cost              4                   12
Kitchen, Presence of        2                   3
Labor Force Status          25                  92
Language Spoken at Home     3                   4
Marital Status              3                   7
Meals in Rent               1                   1
Occupation                  14                  47
Place of Birth              1                   1
Place of Work               3                   7
Population                  39                  377
Race                        27                  196
Relationship                32                  215
Residence in 1985           3                   4
Rooms in Unit               1                   1
School Enrollment           1                   1
Sex                         23                  99
Tenure                      7                   13
Travel Mode                 1                   2
Travel Start                1                   2
Travel Time                 5                   7
Type of Housing             4                   12
Type of Sewer               3                   10
Units in Structure          4                   10
Vacancy Status              3                   7
Value of Housing Unit       11                  26
Vehicles Available          1                   4
Veteran Status              2                   6
Water Cost                  4                   12
Water Source                2                   7
Year Structure Built        5                   7
Year of Entry               1                   1

(-- indicates a value that is not legible in the source.)
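The grouping shown in Table 4, which assigns each topic to the lowest geographic level at which it was requested, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the study's own procedure: the ordering in `LEVELS` simply follows the levels named in Table 4 from largest to smallest, the function names are hypothetical, and the sample requests are made up.

```python
from collections import defaultdict

# Geographic levels from largest to smallest, following the levels of Table 4.
LEVELS = ["State", "County", "Municipality", "Census Tract/BNA", "Block Group", "Block"]
RANK = {level: i for i, level in enumerate(LEVELS)}


def lowest_level_by_topic(requests):
    """Map each topic to the smallest geographic level at which it was requested."""
    lowest = {}
    for topic, level in requests:
        if level not in RANK:
            continue  # levels outside this hierarchy (e.g. ZIP Code, MSA) are ignored here
        if topic not in lowest or RANK[level] > RANK[lowest[topic]]:
            lowest[topic] = level
    return lowest


def group_topics_by_level(lowest):
    """Invert the topic -> level mapping into level -> sorted list of topics."""
    grouped = defaultdict(list)
    for topic, level in lowest.items():
        grouped[level].append(topic)
    return {level: sorted(grouped[level]) for level in LEVELS if level in grouped}


# Made-up (topic, geographic level) requests, for illustration only.
sample_requests = [
    ("Income", "County"),
    ("Income", "Block Group"),
    ("Income", "ZIP Code"),
    ("Age", "State"),
    ("Age", "Block"),
    ("Occupation", "Municipality"),
]
lowest = lowest_level_by_topic(sample_requests)
grouped = group_topics_by_level(lowest)
for level, topics in grouped.items():
    print(f"{level}: {', '.join(topics)}")
```

Treating the levels as a single ordered hierarchy is a simplification; levels such as ZIP Code or MSA do not nest cleanly within it, which is why this sketch skips them.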
State Usage
The 119 participants reporting requests come from 44 states and territories. As shown in Table 5, there is no relationship between the number of states with participants requesting a data topic and whether the topic is classified as mandatory, required, or programmatic. Consistent with common, but previously unmeasured, beliefs, the number of requests for a topic and the number of states from which those requests come are very highly correlated (r = 0.9183). Further, it should be noted that, due to the nonlinear relationship between these two variables, the actual correlation is probably even higher than stated. This high correlation implies that the demand for the more frequently requested data topics is widespread rather than concentrated in just a few areas. In particular, it should be noted that the eight topics with requests coming from over 20 states are also the most commonly requested data topics.

Organizational Type
As can be seen in Table 6, the most commonly used topics are used by all of the types of organizations included in this study. The overall distribution of requests by organization type fairly closely follows that of the SDC/BIDC participants. The distributions shown in this table imply that the more heavily requested data topics are requested by a wide range of organizational types.
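The correlation reported under State Usage relates, for each topic, the number of states with requests to the number of requests. Assuming that r denotes an ordinary Pearson correlation coefficient (the paper does not name the statistic), the calculation can be sketched as below. The function name is hypothetical, and the nine rows are only a subset of Table 5, so the printed value is not expected to reproduce 0.9183.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# (states using data, number of requests) for nine topics taken from Table 5.
# This is a subset chosen for illustration, not the full set of topics.
rows = {
    "Age": (34, 344),
    "Income": (36, 396),
    "Population": (39, 377),
    "Race": (27, 196),
    "Relationship": (32, 215),
    "Sex": (23, 99),
    "Labor Force Status": (25, 92),
    "Educational Attainment": (22, 62),
    "Ethnicity": (19, 69),
}
states = [s for s, _ in rows.values()]
requests = [n for _, n in rows.values()]
r = pearson_r(states, requests)
print(f"r = {r:.4f} over {len(rows)} topics")
```

Because Pearson's r measures only linear association, a curved relationship between the two counts would tend to be understated by it, which is in line with the remark that the actual correlation is probably higher than stated.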
LIMITATIONS

If, as is generally believed, data are most heavily used when they are first released, the timing of this study has the effect of understating the use of 1990 census data by the types of organizations included. These data were also collected over a relatively short period of time. This limitation may have the effect of understating the use of census data by not capturing periodic uses, such as an increase in usage related to evaluating new program alternatives, as is often done during legislative sessions.

Further, this descriptive study looks at the actual levels of census data usage rather than the reasons for this usage. This scope limits the ability to understand why certain data are used more than others. Also, it is possible that participants recorded the primary topics of interest rather than all topics. For example, someone interested in educational attainment by race may have reported just educational attainment, since that topic was their primary interest. This reporting problem has the effect of decreasing the reported level of usage of some data topics.

Finally, the organizations involved in the SDC/BIDC programs tend to be interested in particular kinds of data topics. Therefore, the number of requests for specific data topics may understate the need for topics reporting little usage. For example, the small number of transportation agencies involved with these programs may explain the small amount of reported usage of the travel-related topics.
Table 6. Census topics requested by organization type

[The per-topic counts by organization type are not legible in the source. The per-topic Total Requests column repeats the totals listed in Table 2. The totals rows are given below.]

Organization Type    Topic Total   Percent of Total   Percent of Organizations
Nonprofit            38            1.7%               7%
Regional/Local       1228          55.5%              56%
State                638           28.8%              15%
University           308           13.9%              22%
Total                2212
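The totals rows of Table 6 can be checked, and compared with each organization type's share of the participating organizations, with a few lines of arithmetic. The sketch below uses only the totals reported in Table 6; the variable and function names are hypothetical.

```python
# Totals rows of Table 6: requests by organization type, and each type's
# share of the participating SDC/BIDC organizations (in percent).
REQUESTS = {"Nonprofit": 38, "Regional/Local": 1228, "State": 638, "University": 308}
ORG_SHARE = {"Nonprofit": 7, "Regional/Local": 56, "State": 15, "University": 22}


def request_shares(requests):
    """Return each organization type's percentage of all requests."""
    total = sum(requests.values())
    return {org_type: 100 * count / total for org_type, count in requests.items()}


shares = request_shares(REQUESTS)
for org_type, share in shares.items():
    gap = share - ORG_SHARE[org_type]  # percentage-point difference
    print(f"{org_type:15s} {share:5.1f}% of requests, "
          f"{ORG_SHARE[org_type]:3d}% of organizations, gap {gap:+.1f}")
```

The gap column shows, in percentage points, where the distribution of requests departs from the distribution of organizations.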
CONCLUSIONS

Because this study may tend to understate the total usage of census data by local and state organizations, it is clear that such agencies make use of essentially all data topics included in the decennial census. Further, it is clear that many of these data topics are used for very small geographic levels. It is also apparent that, while they are used, several of the data topics included in the 1990 Census are of comparatively little interest to the organizations that participate in the SDC/BIDC programs.

Based on the findings of this report, it is clear that there is active use for most, if not all, of the data topics included in the 1990 Census at geographic levels for which it is necessary to use very large samples to obtain accurate data. Combined with the National Research Council finding that the inclusion of additional questions on a long-form questionnaire has minimal impact on a census’ cost and mail response rate, it becomes clear that, in order to meet the needs of state and local governments and nonprofit organizations, all of the data topics included in the 1990 Census should be asked in the 2000 Census using both complete count and sample questions. Naturally, it would be sufficient if another method could be found to provide accurate small area data for this large variety of data topics at a lower cost and on a more timely basis than the decennial censuses. That, however, is unlikely.

NOTES

1. U.S. National Research Council, Modernizing the U.S. Census (Washington, DC: National Academy Press, 1995).
2. Karen M. Mills and Arthur R. Cresce, “Surveying Non-Federal Data Users for 2000 Census Needs,” in American Statistical Association, 1995 Proceedings of the Government Statistics Section (Alexandria, VA: The Association, 1996), 210-5.
3. U.S. Bureau of the Census, [Survey of Census Needs of Non-Federal Data Users. Questionnaire (Form S-631) and Information Sheet] (Washington, DC: The Bureau, 1994).
4. Association of Public Data Users, APDU Year 2000 Census Content Working Group Final Report on Activities: Report 1 Overview and Summary of Responses (Princeton, NJ: The Association, 1995).
5. Michele L. Conrad, Constance F. Citro and Barry Edmonston, “State and Local Needs for Census Data” (Paper presented at the Joint Statistical Meetings of the American Statistical Association, Orlando, FL, August 1995).