Census Mapping
D. Martin, University of Southampton, Southampton, UK
© 2009 Elsevier Ltd. All rights reserved.
Glossary
Census: Complete enumeration of a population for statistical purposes.
Choropleth: Map constructed from shaded areas in which shading represents the value of the attribute being mapped.
Ecological Fallacy: The fallacy of making inferences about individuals from aggregated data.
Enumeration: Method for collecting and recording characteristics of each member of the population, conducted by a census officer called an ‘enumerator’.
Geocoding: Conversion of addresses, place names, or other geographic references to coordinates, such as latitude and longitude.
Geodemographic Classification: Classification of areas based on their socioeconomic characteristics so as to place the most similar areas into the same groups.
Gerrymandering: Deliberate manipulation of geographical boundaries so as to maximize the electoral advantage of a particular party.
Modifiable Areal Unit Problem: The dependence of analysis of aggregate data on the precise reporting zones used for aggregation.
Introduction
Census mapping refers to the cartographic presentation of the results of censuses in order to reveal spatial patterns in the data. Despite difficulties in achieving complete enumeration, the high population coverage of censuses means that it is usually possible to obtain data for very small geographical areas, which can form the basis for detailed maps. Censuses are usually administered by means of a questionnaire for each member of the population covering a range of demographic and socioeconomic topics. Postenumeration processing of these forms permits the production not only of simple counts and percentages, such as the number of households without a car or the percentage of persons unemployed, but also of a wide range of cross-tabulations and derived indicators such as social class, geodemographic classifications, and multivariate deprivation indicators. This article considers the overall role of censuses, the uses of census mapping, and a range of census mapping techniques.
The Place of Censuses
Censuses of population play an important part in the government of nations, particularly with regard to the spatial allocation of resources. The earliest modern censuses, from which results have been widely published, can be dated from around the start of the nineteenth century, with the first census of population in 1790 in the US and 1801 in England and Wales. Censuses represent one of a range of methods for counting population which also includes continuous administrative and registration-based systems, particularly favored in Scandinavia, and indirect methods such as estimation from satellite remote sensing. A census is an attempt, typically decennial, to measure the location and characteristics of the entire population at a single point in time, in contrast to continuous approaches which rely on tracking population members, usually for administrative rather than statistical purposes. These approaches all contrast with sample survey methods which do not attempt to directly record the entire population. A successful census offers the single most powerful combination of population coverage, geographical and socioeconomic detail within a single integrated dataset, and can support a variety of data outputs, including, for example, microdata samples and interaction data which describe migration and commuting flows. The most traditional output has been aggregate responses for geographical areas, which form the basis for most census mapping. Historically, there have been multiple motivations for conducting a census. An understanding of population resources may be required for purposes as diverse as raising an army, organizing political representation, levying taxation, or allocating welfare funding.
The results of a census may be particularly politically sensitive when, for example, they demonstrate the relative strength of different population subgroups, such as the proportions and degree of mixing between Protestant and Catholic groups in Northern Ireland or the various ethnic groups in South Africa. Where there are conflicts within a population, subgroups may actively boycott census enumeration in protest about government policy, challenge the results of a census as not adequately representing their interests or lobby for the inclusion of specific questions or definitions. Some population subgroups such as unregistered migrants may actively seek to avoid detection by government, including a census. A longstanding criticism of censuses and their analysis is that they include only questions which are acceptable to
the prevailing government and are therefore fundamentally limited in their ability to challenge established power relations in a society. Modern censuses are generally conducted under specific statistical legislation and incorporate sophisticated data protection techniques in an attempt to provide respondents with privacy assurances. These include the random swapping of individuals between areas, rounding, or modification of small counts and suppression of results for areas with small populations. The extent to which the census technical and legislative framework provides effective protection to the individual, rather than simply contributing to surveillance by the state, is a strongly contested issue and varies between national contexts. Any data adjustment process will have an impact on census mapping. In general, statistical disclosure control procedures have the biggest impact on data at the neighborhood or village level, but are negligible at the scale of cities and regions. Nevertheless, since the advent of computer-based census operations the user base for census data has increased dramatically. Computer-readable census outputs are widely used by central and local government, healthcare planners, academic researchers, and the commercial sector for demographic analysis and business or service planning. Census mapping, in the 1970s the preserve of university computer laboratories, is now readily undertaken online by schoolchildren. There is thus a strong tension between increasing the content and outputs of censuses to meet contemporary demands versus a desire to maintain comparability over time and preserve response rates. Internationally, census response rates are falling due to both public concerns over the privacy of personal information and increasing difficulty in accessing and delivering census forms to every household. 
In the early 2000s nations are adopting different strategies to provide the statistical information required by government – either through increasing reliance on administrative data sources or by spreading census enumeration over time. France has adopted a rolling census, aiming to cover all areas of the country during a 5-year time period, while the US has moved to a short-form only census accompanied by a continuously administered sample survey. In addition to these alternative strategies, increasing effort is required in order to deal with the impacts of census nonresponse, including enumerator follow-up of households from which there is no return and statistical imputation of missing values in the census data.
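Two of the disclosure control measures mentioned in this section, suppression of small counts and rounding, can be illustrated with a minimal sketch. The threshold of 3, the rounding base of 3, and the function names are assumptions chosen for illustration; they are not the rules of any particular statistical agency.

```python
import random


def suppress_small_counts(counts, threshold=3):
    """Replace non-zero counts below `threshold` with None (suppressed).

    `counts` maps an area code to a count. The threshold is an
    illustrative assumption, not an official rule.
    """
    return {
        area: (None if 0 < n < threshold else n)
        for area, n in counts.items()
    }


def random_round(n, base=3, rng=random):
    """Randomly round `n` to a neighbouring multiple of `base`.

    The probability of rounding up is remainder/base, so the expected
    value of the rounded count equals the original count.
    """
    remainder = n % base
    if remainder == 0:
        return n  # already a multiple of the base, left unchanged
    lower = n - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower
```

Because rounding to base 3 alters a count by at most 2, the adjustment can be large relative to a neighborhood count of 5 (up to 40%) but is negligible relative to a city count in the tens of thousands, which is consistent with the scale dependence noted above.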
The Role of Census Mapping
Modern census organization involves the coordination of a very large workforce and geographical information
systems (GISs) are routinely used to manage the logistics of enumeration. This organizational mapping does not generally form part of published outputs but nevertheless underpins the operational success of the census. Successful geocoding of individual census returns, linking them to the appropriate areas, is essential if counts are to be correctly aggregated. The management of address lists and geography lookup tables is a major element of census operations and often serves to underpin the geocoding of many noncensus data sources. Enumeration areas used are not necessarily the same as those used for publication of results, which usually need to be meaningful in terms of administrative and electoral geographies or are designed for general-purpose statistical reporting. Mapping makes a unique contribution to the interpretation of census results due to the ability of the human eye to identify pattern in graphical data. Thus, complex patterns of social geography may be readily conveyed by mapped census data which would be almost impossible to extract from tabular results. Ongoing fascination with these representations can be traced back to nineteenth-century interests in the mapping of social conditions such as Charles Booth’s poverty maps of London, through to the development of social area analysis and geodemographic classification whereby areas are grouped according to their characteristics defined across a large pool of census variables. The production of a final map, for example, of an area classification scheme, may be the result of extensive unseen manipulation in GISs and statistical software. The dual independent map encoding (DIME) digital data structure developed to accompany the 1970 US census was influential in the emergence of GIS standards and one of the first examples of a statistical organization’s digital boundary data to accompany a census.
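The role of geography lookup tables in aggregating individual returns, described earlier in this section, can be sketched as follows. The postcodes, area codes, and record layout are hypothetical; real census geocoding works from far larger address registers and more elaborate matching rules.

```python
from collections import Counter

# Hypothetical lookup table linking a postcode to an output area code.
LOOKUP = {
    "SO17 1BJ": "OA001",
    "SO17 1BK": "OA001",
    "SO14 0AA": "OA002",
}


def aggregate_returns(returns, lookup):
    """Count census returns per area using a postcode lookup table.

    Each return is a dict with a "postcode" key. Returns a tuple of
    (counts per area, list of returns that could not be geocoded).
    """
    counts = Counter()
    unmatched = []
    for ret in returns:
        area = lookup.get(ret["postcode"])
        if area is None:
            unmatched.append(ret)  # needs manual or fuzzy matching
        else:
            counts[area] += 1
    return counts, unmatched


returns = [
    {"postcode": "SO17 1BJ"},
    {"postcode": "SO17 1BK"},
    {"postcode": "SO14 0AA"},
    {"postcode": "XX99 9XX"},  # not present in the lookup table
]
counts, unmatched = aggregate_returns(returns, LOOKUP)
```

Keeping the unmatched returns separate, rather than discarding them, reflects the point that every return must be linked to an area if the aggregated counts are to be correct.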
Digital boundaries allow users to map and analyze census data using their own software and provide the population base layer for many GIS applications. While geodemographic classification tends to find commercial applications, such as direct marketing or site location for retail outlets, similar approaches also serve as the starting point for area-based government policy, particularly the weighting of additional resources to neighborhoods with high levels of unemployment and multiple social deprivation. All aggregated census data are susceptible to the modifiable areal unit problem (MAUP). This relates to the fact that observed patterns are a reflection of the number and placement of area boundaries as much as of the underlying population characteristics: by redesigning area boundaries it is frequently possible to change the apparent pattern in the map. Such aggregate representations are also subject to the ecological fallacy, whereby associations seen in the data at one level of aggregation will not necessarily be observed between individuals in the population, or at other levels of aggregation. These
phenomena are particularly relevant to the use of census mapping in political districting. The general principle requires that small census areas be grouped into larger electoral districts of similar population size to achieve equality of representation, either by manual or automated means. The ability to concentrate supporters of particular parties by boundary redesign can also lead to the direct manipulation of boundaries for purposes of electoral gain, known as gerrymandering.
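The modifiable areal unit problem can be demonstrated with a small worked example. The six small areas, their populations, and their unemployment counts below are invented for illustration; the point is only that the same underlying data yield different mapped patterns under different zonings.

```python
def zone_rates(numerators, denominators, zoning):
    """Percentage rate for each zone, where a zone is a tuple of the
    indices of the small areas it contains."""
    rates = []
    for zone in zoning:
        num = sum(numerators[i] for i in zone)
        den = sum(denominators[i] for i in zone)
        rates.append(100.0 * num / den)
    return rates


# Invented data: six small areas in a row, each with 100 residents.
population = [100, 100, 100, 100, 100, 100]
unemployed = [10, 10, 40, 40, 10, 10]

# Zoning A groups the areas in pairs; zoning B groups them in threes.
zoning_a = [(0, 1), (2, 3), (4, 5)]
zoning_b = [(0, 1, 2), (3, 4, 5)]

rates_a = zone_rates(unemployed, population, zoning_a)  # [10.0, 40.0, 10.0]
rates_b = zone_rates(unemployed, population, zoning_b)  # [20.0, 20.0]
```

Under zoning A a central concentration of unemployment is clearly visible, whereas under zoning B the map would be uniform. Neither zoning is wrong, which is why the choice of boundaries matters for both analysis and districting.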
Techniques and Examples
By far the most widely used cartographic representation of census data is the choropleth, or shaded area, map. The example provided in Figure 1 shows the 2001 census percentage of white ethnic groups in the city of Southampton, England. Values in the map are based on the aggregation of all the individual census results within each output area, clearly revealing how the outer suburbs are almost entirely white while some central areas are dominated by nonwhite ethnic groups. It is generally inappropriate to map counts in this way, as the values will often be related to the size of the area rather than prevalence in the population. A disadvantage of all choropleth mapping is that the visual dominance of an
area is related to its geographical area rather than its population size. The largest areas in Figure 1 comprise mostly docks and open water, while the most densely populated areas are so small as to be barely visible. This type of map is increasingly available online from the websites of national statistical agencies, such as that illustrated in Figure 2, from the Neighborhood Statistics service for England and Wales, which again shows Southampton, this time the percentage of households without a car in 2001 for larger census areas called wards. In this case, additional interactivity is provided by dynamic linkage between the histogram and the map: pointing at an area on one will highlight its identity and position in the other. Compared to Figure 1, the figure also demonstrates the effects of generalization: the map has been simplified to speed the web application, with the consequence that these area boundaries would be difficult to locate accurately on a street map of the city. The availability of some census data within online mapping and virtual globe software such as Google Earth has not to date produced new forms of mapping so much as increased options for access and interaction. If area boundaries are not available, it is possible to produce simple maps by the generation of synthetic boundaries, known as Thiessen or Voronoi polygons, or the placement of proportional symbols at centroid locations representing each area, although these types of representations tend to introduce additional interpretational difficulties, for example, due to altered area geometry or overlapping symbols in areas of high population density.

Figure 1 Choropleth map showing percentage population from white ethnic groups for output areas in Southampton, England. Source: Office for National Statistics, 2001 Census: Key Statistics (England and Wales). ESRC/JISC Census Programme, Census Dissemination Unit, MIMAS (University of Manchester); 2001 Census, Output Area Boundaries. Census output is Crown copyright and is reproduced with the permission of the Controller of HMSO and the Queen’s Printer for Scotland.

Figure 2 Online mapping of national statistical agency data: Percentage of households without a car for wards in Southampton, England. Source: National Statistics website: www.statistics.gov.uk. Crown copyright material is reproduced with the permission of the Controller of HMSO.

All census mapping is affected by general cartographic design considerations. Choices of map scale, number and placement of class intervals, selection and order of colors, and other layout considerations will affect how the map reader interprets the data. A further challenge facing census users is that of comparing data from successive censuses when there have been changes in the area boundaries. True comparisons can only be made for areas which encompass equivalent population groups. Where boundaries have changed it is necessary either to aggregate to the smallest possible comparable areas or to interpolate values from one time period to another. This challenge makes long-term comparison of local populations especially difficult where census outputs are tied to administrative boundaries which themselves experience high levels of intercensal change.
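One simple way of interpolating counts between incompatible sets of boundaries is area weighting, sketched below. This is only one of several possible interpolation methods and is not necessarily the one used in any given census comparison. It assumes that population is spread evenly within each source zone, which is rarely true in practice, and that the overlap areas supplied cover each source zone completely. The zone names and areas are invented.

```python
def area_weighted_interpolate(source_counts, overlaps):
    """Estimate counts for target zones from counts for source zones.

    `source_counts` maps a source zone to its count. `overlaps` maps a
    (source zone, target zone) pair to the area of their intersection.
    Each source count is shared among target zones in proportion to
    the overlap area (uniform density assumption).
    """
    # Total area of each source zone, summed from its overlaps.
    source_area = {}
    for (src, _tgt), area in overlaps.items():
        source_area[src] = source_area.get(src, 0.0) + area

    # Allocate each source count to the target zones it overlaps.
    target_counts = {}
    for (src, tgt), area in overlaps.items():
        share = source_counts[src] * area / source_area[src]
        target_counts[tgt] = target_counts.get(tgt, 0.0) + share
    return target_counts


# Invented example: two old zones re-cut into two new zones.
old_counts = {"old1": 1000, "old2": 500}
overlap_areas = {
    ("old1", "new1"): 3.0,
    ("old1", "new2"): 1.0,
    ("old2", "new2"): 2.0,
}
new_counts = area_weighted_interpolate(old_counts, overlap_areas)
```

The method preserves the overall total (1500 in this example) but any individual estimate is only as good as the uniform density assumption, which is one reason aggregation to the smallest comparable areas is often preferred where it is possible.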
A variety of alternative techniques are available which address some of the deficiencies of choropleth mapping. In dasymetric mapping, additional information is used to restrict the shading of mapped values. For example, a map of inhabited areas might be superimposed on census boundaries and shading displayed only within the inhabited areas, thus providing a much more accurate impression of the population distribution. Another approach designed to address this type of difficulty is population surface modeling, whereby census variables are redistributed onto the cells of a regular grid, as illustrated in Figure 3. This figure again shows the Southampton region, here showing population density in 200 × 200 m grid cells from the 1991 census. In this case, many of the cells remain unpopulated and the resulting map more accurately reflects the true distribution of population. These regular grid cells can aid comparison over time or with the output of other spatial models but the unconventional appearance of the map can deter users familiar with choropleth representations.
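The dasymetric idea of confining population to inhabited land can be sketched on a small grid. This is a simplified illustration and not the surface modeling method used to produce Figure 3: each zone's population is shared equally among its inhabited cells, and the grid, mask, and populations are invented.

```python
def dasymetric_grid(zone_grid, inhabited, zone_pop):
    """Allocate zone populations to inhabited grid cells only.

    `zone_grid` is a 2D list of zone codes, `inhabited` a 2D list of
    booleans of the same shape, and `zone_pop` maps zone code to
    population. Uninhabited cells receive 0.0. A zone with no
    inhabited cells contributes nothing to the output grid.
    """
    # Count the inhabited cells belonging to each zone.
    cells_per_zone = {}
    for r, row in enumerate(zone_grid):
        for c, zone in enumerate(row):
            if inhabited[r][c]:
                cells_per_zone[zone] = cells_per_zone.get(zone, 0) + 1

    # Share each zone's population equally among its inhabited cells.
    surface = []
    for r, row in enumerate(zone_grid):
        out_row = []
        for c, zone in enumerate(row):
            if inhabited[r][c]:
                out_row.append(zone_pop[zone] / cells_per_zone[zone])
            else:
                out_row.append(0.0)
        surface.append(out_row)
    return surface


# Invented example: zone A is mostly uninhabited (for example, docks).
zone_grid = [["A", "A", "B"],
             ["A", "A", "B"]]
inhabited = [[True, False, True],
             [False, False, True]]
zone_pop = {"A": 400, "B": 100}
surface = dasymetric_grid(zone_grid, inhabited, zone_pop)
```

A choropleth map of the same data would shade all four cells of zone A at 100 persons per cell, whereas here the single inhabited cell carries all 400 and the remaining cells are shown as empty.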
Figure 3 Population surface model showing population density for 200 × 200 m grid squares in the Southampton region, England, with coastline and local government boundaries overlaid. Source: 1991 Census: Small Area Statistics (England and Wales). Census output is Crown copyright and is reproduced with the permission of the Controller of HMSO and the Queen’s Printer for Scotland. Digitized Boundary Data (England and Wales) provided with the support of the ESRC and JISC and uses boundary material which is copyright of the Crown and the ED-Line Consortium.
An entirely different data visualization is produced by the use of a population cartogram, in which locational accuracy is relaxed in order that each area can be represented proportional to its population size. Using this approach, areas with larger populations will take up a greater proportion of the map, but with the consequence that distance and direction cannot be accurately interpreted. An example of this approach is illustrated in Figure 4. This map shows change in the percentage of households without a car between 1991 and 2001. The cartogram illustrates the enormous population dominance of the London conurbation, represented by the dark area to the lower right of the map, where growth in car ownership has been much slower than in more remote areas with smaller populations. The figure embodies two important innovations over a conventional choropleth map: first, the sizes of the areas (here, counties and unitary authorities) are proportional to their population sizes and second, the use of stable geographical areas has permitted the creation of a map showing change over time. Censuses continue to provide an enormously rich resource for those concerned with social geography. The large statistical data volumes make mapping an essential key to understanding geographical trends and processes in these data, although much spatial analysis is performed without the production of intermediate maps. The choropleth method continues to dominate census mapping, both in the popular media and academic publications, although its representational weaknesses are well documented and a variety of alternative methods are available which allow the user to manipulate and interrogate census data with greater geographical and statistical rigor.
Figure 4 Population cartogram showing change in percentage of households without a car for counties and unitary authorities in Britain. Source: 1991 and 2001 Censuses: Small Area Statistics/Census Area Statistics (England and Wales, Scotland). Census output is Crown copyright and is reproduced with the permission of the Controller of HMSO and the Queen’s Printer for Scotland. Boundary data: JISC CHCC project www.ccg.leeds.ac.uk/teaching/chcc.
See also: Geodemographics; Georeferencing, Geocoding; Geovisualization; Gerrymandering; Internet/Web Mapping; Modifiable Areal Unit Problem.
Further Reading
Brewer, C. A. (2005). Designing Better Maps: A Guide for GIS Users. Redlands, CA: ESRI Press.
Brewer, C. A. and Suchan, T. A. (2001). Mapping Census 2000: The Geography of US Diversity. Washington, DC: US Census Bureau.
Cook, L. (2004). The quality and qualities of population statistics, and the place of the census. Area 36, 111–123.
Curry, M. R. (1998). Digital Places. London: Routledge.
Dewdney, J. C. (1983). Censuses past and present. In Rhind, D. W. (ed.) A Census User’s Handbook, pp 1–16. London: Methuen.
Dorling, D. and Thomas, B. (2004). People and Places: A 2001 Census Atlas of the UK. Bristol: Policy Press.
Durham, H., Dorling, D. and Rees, P. (2006). An online census atlas for everyone. Area 38, 336–341.
Martin, D. (1996). An assessment of surface and zonal models of population. International Journal of Geographical Information Systems 10, 973–989.
Martin, D. (2006). Last of the censuses? The future of small area population data. Transactions of the Institute of British Geographers 31, 6–18.
Monmonier, M. (1996). How to Lie with Maps (2nd edn.). Chicago, IL: University of Chicago Press.
Openshaw, S. and Rao, L. (1995). Algorithms for reengineering 1991 census geography. Environment and Planning A 27, 425–446.
Singer, E. (2003). The eleventh Morris Hansen lecture: Public perceptions of confidentiality. Journal of Official Statistics 19, 333–341.
United Nations (2000). Handbook on Census Management for Population and Housing Censuses. New York: United Nations.
Wood, D. (1992). The Power of Maps. London: Routledge.
Relevant Websites
http://www.census.gov/main/www/stat_int.html List of national statistical organizations maintained by US Census Bureau.
http://www.neighborhood.statistics.gov.uk Neighborhood Statistics Service for England and Wales census mapping tools.
http://www.personal.psu.edu/cab38/ColorBrewer/ColorBrewer.html Online guidance on class intervals and color schemes for thematic mapping, Penn State Personal Web Server.
http://www.ccg.leeds.ac.uk The CCG Online GIS atlas British census cartogram tools.
http://www.statistics.gov.uk UK Statistics Authority.
http://factfinder.census.gov US Census Bureau’s American Factfinder census mapping tools.