Basic Mapping Principles for Visualizing Cancer Data Using Geographic Information Systems (GIS)


Cynthia A. Brewer, PhD

Abstract:

Maps and other data graphics may play a role in generating ideas and hypotheses at the beginning of a project. They are useful as part of analyses for evaluating model results and then at the end of a project when researchers present their results and conclusions to varied audiences, such as their local research group, decision makers, or a concerned public. Cancer researchers are gaining skill with geographic information system (GIS) mapping as one of their many tools and are broadening the symbolization approaches they use for investigating and illustrating their data. A single map is one of many possible representations of the data, so making multiple maps is often part of a complete mapping effort. Symbol types, color choices, and data classing each affect the information revealed by a map and are best tailored to the specific characteristics of data. Related data can be examined in series with coordinated classing and can also be compared using multivariate symbols that build on the basic rules of symbol design. Informative legend wording and setting suitable map projections are also basic to skilled mapmaking. (Am J Prev Med 2006;30(2S):S25–S36) © 2006 American Journal of Preventive Medicine

Introduction

A geographic information system (GIS) allows epidemiologists and cancer researchers to investigate spatial patterns within their data and understand relationships between cancer and other health, socioeconomic, and environmental variables. High-quality maps also allow researchers to present a compelling case to others who are interested in their work. GIS is an additional tool in the exploration, analysis, and communication of cancer data, and knowledge of the basic principles for representing data can help cancer researchers make the most of GIS and the opportunities for insight it offers.

This article is structured in three sections: mapping methods, mapping multiple variables, and map finishing. Two common symbol types, choropleth mapping and proportional symbols, are featured, and decisions involved in making effective use of these symbols are summarized. Supporting figures present maps of prostate cancer data to correspond with the topic of this special issue of the American Journal of Preventive Medicine. These maps were produced in ArcGIS (ESRI, Redlands CA, version 9) with no further augmentation in illustration software. The data and geography used for these maps are from the National Cancer Institute (NCI) Cancer Mortality Maps and Graphs Website.1

From the Department of Geography, Pennsylvania State University, University Park, Pennsylvania Address correspondence and reprint requests to: Cynthia A. Brewer, PhD, Department of Geography, Pennsylvania State University, 302 Walker Building, University Park PA, 16802-5011. E-mail: [email protected].

The available years of prostate cancer mortality data are mapped (aggregations for 1950–1994), drawing from the counts, rates per 100,000 person-years (age adjusted using 1970 populations), and upper and lower bounds of the 95% confidence interval (CI) offered on the site for black and white races. The map area is cropped to produce compact demonstration figures that can be compared in series. The data are freely available through the NCI website to other mapmakers who would like to work with the methods described.

The basic overview of thematic mapping offered in this article has wide application in cancer and epidemiologic mapping. Other tools in GIS are also of use to epidemiologists, such as address geocoding and network analysis. The links between spatial statistics software tools and GIS are also improving.2 The focus of this short article, however, is limited to symbolizing statistical data, which is a common use of GIS. Basic criteria for choosing symbols to map derived values, significance levels, model results, and smoothed rates are the same as for simpler measures such as crude rates. Likewise, multivariate maps that combine or overlay model results with original data or related variables can illuminate relationships between them by combining symbolization approaches.

Cartographers use visual tools, and epidemiologists use statistical tools to investigate their data. This is an oversimplification, to be sure, but it seems to be a core difference in approach between the two fields, and each could be enhanced by further use of the other’s methods. The tools cartographers use to improve their visual representations of data can complement epidemiologists’ sophisticated adjustments for potentially spurious rates and small numbers. For example, cartographers adjust class breaks when mapping a given data set. Epidemiologists adjust the data and then map it with a given classing algorithm, such as quantiles, without adjusting the breaks. Epidemiologists question the data; cartographers question the symbolization. This contrast is exaggerated in the hope that this brief introduction will encourage epidemiologists to expand their insights from data by expanding their approaches to data representation.

Am J Prev Med 2006;30(2S) © 2006 American Journal of Preventive Medicine • Published by Elsevier Inc.

0749-3797/06/$–see front matter doi:10.1016/j.amepre.2005.09.007

Mapping Methods

A basic characteristic of cancer data that guides choice of a map symbol is whether categories or quantities are recorded. Categorical differences in cancer may be case/control or benign/malignant. They may code race differences (Figure 1) as well as many other socioeconomic categories. Quantitative data may be counts, ranks, or derived values such as rates and percentages (Figure 2).

Symbols that show categorical differences well are color hue and symbol shape. Symbols well suited to representing quantities are color lightness and symbol size. There are a variety of other symbol characteristics (such as pattern spacing) that may be used for data representation, and these are organized as “visual variables” in the cartographic literature.3 The workhorse visual variables are hue, shape, lightness, and size for the types of data common in cancer mapping, and common symbolization methods have names: For example, choropleth maps use lightness to represent quantitative areal data, and proportional symbols use size to represent quantitative data at points or for areas.

Choropleth Mapping

Choropleth maps present areal enumeration units, such as states, counties, ZIP codes, and census tracts, filled with colors that symbolize ranges in the data (Figure 2). In addition to choosing which type of enumeration unit best suits your mapping goals,4 two basic decisions for choropleth mapping are color selection and data classing. The decision criteria for color and classing choices can also be applied to point symbols, but they are presented in this article with choropleth examples.

Figure 1. Hue symbolizes categorical difference in counts for two race groups. Pie chart symbols are scaled to a constant size and show relative proportions of mortality for two populations: black and white males.

Color Symbols

The main goal in choosing colors for choropleth maps is to order lightness so it parallels ordering in the data. The simple case is light-to-dark color for low-to-high values with a constant hue (blue is used in Figure 2a). Adding hue variation can help make it easier to see differences between color symbols. A lightness sequence combined with a progression through adjacent hues produces some of the best sequential choropleth color schemes (for example, yellow–green–blue are adjacent in the ordering of hues through the spectrum; Figure 2b). These hue and lightness sequences are more challenging to design, and useful series of ready-made sequential schemes are offered online through ColorBrewer.org (Figure 3) to assist mapmakers who are not experienced with color specification.5,6

Many map readers find spectral (rainbow) schemes appealing (Figure 2c). These color schemes are not well suited to sequential data because lightness varies through the spectrum (yellow and cyan are often lighter than other hues). Spectral schemes can be adjusted to better order lightness, and the intrinsically light yellow hue can also be used to emphasize critical values within a data range.7,8 For these diverging schemes (versus sequential schemes), lightness diverges from a mid-range critical value toward two contrasting hues. Figure 4 shows a modified spectral diverging scheme (Figure 4a) and other diverging examples that use fewer hues (Figure 4b,c).
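The lightness-ordering rule for sequential schemes can be checked numerically. The following is a minimal sketch, not part of the original article: it uses the standard sRGB relative-luminance formula as a stand-in for perceived lightness, and the hex values are assumptions chosen for illustration (the first two are meant to resemble five-class ColorBrewer schemes and should be verified against ColorBrewer.org before use).

```python
def relative_luminance(hex_color):
    """Approximate relative luminance (0 = black, 1 = white) of an sRGB hex color."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve before weighting the three channels.
    linear = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def lightness_is_ordered(scheme):
    """True if the colors run strictly from light to dark."""
    lum = [relative_luminance(c) for c in scheme]
    return all(a > b for a, b in zip(lum, lum[1:]))


# Hex values are illustrative assumptions, not taken from the article.
single_hue_blue = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]
yellow_green_blue = ["#ffffcc", "#a1dab4", "#41b6c4", "#2c7fb8", "#253494"]
# A naive rainbow at full saturation: blue, cyan, green, yellow, red.
naive_spectral = ["#0000ff", "#00ffff", "#00ff00", "#ffff00", "#ff0000"]

for name, scheme in [("single hue", single_hue_blue),
                     ("hue transition", yellow_green_blue),
                     ("naive spectral", naive_spectral)]:
    print(name, lightness_is_ordered(scheme))
```

Under these assumptions, the two sequential schemes pass the check and the naive rainbow fails it, because yellow and cyan are much lighter than their neighbors.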

American Journal of Preventive Medicine, Volume 30, Number 2S

www.ajpm-online.net

Figure 2. Three color schemes are shown for the same data set. (a) Sequential, single hue scheme (blue). (b) Sequential scheme with hue transition (yellow-green-blue). (c) Spectral scheme. The spectral scheme is used as a diverging scheme with the lightest colors marking the overall U.S. rate.

Diverging data may have an obvious structure, such as positive and negative values diverging from zero (Figure 4c). Dark red to white to dark blue is an example color scheme that parallels this diverging structure. Data may also be presented as diverging from a calculated value such as a national rate, threshold value in disease incidence, or median. These data might be equally well represented using a sequential scheme, and looking at distributions using a variety of representations may offer the most insight (compare Figure 2b to Figure 2c).

Color blindness in map readers becomes an issue when using multiple hues.9 About 8% of men and <1% of women have one of the varied forms of red–green color vision deficiency. Color blind people do see many hues, but there are predictable groupings of hues that will be confused with each other. The extent of color confusions depends on the severity of a person’s color vision deficiency. The range of hues from red through orange, brown, yellow, and green may all look the same or similar if they are also similar in lightness. This set of color confusions means that some popular color schemes, such as spectral and “stop light” (red–yellow–green) schemes, produce maps that are difficult to read for a substantial number of people. Red–green combinations are not the only hues that are confused by people with common color vision deficiencies. Other example sets of hues that can be confusing are magenta–gray–cyan and blue–purple. Example hue pairs that work well as the anchors in diverging schemes for color blind readers are: red–blue, red–purple, orange–blue, orange–purple, brown–blue, brown–purple, yellow–blue, yellow–purple, yellow–gray, and blue–gray.10 The colorfulness of spectral schemes can also be taken advantage of while still accommodating most readers’ vision impairments by using a spectral scheme that skips the greens: dark red, orange, yellow, light blue, blue–purple, dark purple (Figure 4a).8 ColorBrewer includes a variety of these diverging schemes with full color specifications. The tools at Vischeck.com are also useful for correcting the appearance of graphics to accommodate people with color vision impairments.

Figure 3. An example screen from ColorBrewer.org, an online tool offering color specifications for each color in schemes suited for thematic maps. Color schemes are grouped into sequential, diverging, and qualitative sets.

February 2006

Figure 4. Example diverging schemes. (a) Spectral scheme modified to accommodate color blind map readers by skipping green hues. (b) Two hues (green and purple) diverging from a central light class at the U.S. rate. (c) Change in rates between two time periods with diverging reds (increasing rates) and blues (decreasing).

Classing

Data classing is another basic decision made when creating choropleth maps of data. For example, in Figure 5a, counties with data values between 19.23 and 21.54 are grouped into one class and represented by a green color. There are numerous methods for classing data11 and most GIS and mapping programs offer a selection that often includes quantiles (Figure 5a), equal intervals (Figure 5b), and a Jenks optimized method (Figure 5c). Other choices include classing by standard deviations and minimizing differences across boundaries. There is no one correct way to class a data set, and different methods will produce different map patterns, especially if data are skewed or include extreme outliers.

Quantile classing assigns the same number of enumeration units to each class (it is a generalized form of percentiles). Four quantiles (quartiles) allocate one quarter of the data values to each class with the median at the middle break. For example, four classes of 391 units each are shown in Figure 5a. Equal interval classing breaks the data range into equal segments for predictable and equal class ranges (unlike the variation in quantile ranges, as seen in Figure 5a, where the first class has a range of 19.2 deaths and the second has a range of 2.3). The number of counties in each class varies with equal intervals (Figure 5b). Jenks methods (called natural breaks in ArcGIS; Figure 5c) minimize variation within classes and maximize variation between classes. With this approach, enumeration units that share a color are statistically more similar to each other than to units in other color classes.3

Cartographers most commonly choose a Jenks method for their first look at data. In contrast, quantile classing is the more common choice of epidemiologists, perhaps because variation in calculated values produced by different types of standardization and age adjustment means death rates may be usefully seen as ranked values.11 Cartographers recommend looking at a histogram (Figure 6) or other aspatial graph of the data to assist in choosing classes.3

Generally, a sound approach is to start with a standard classification and adjust breaks to improve the map based on knowledge of the data and the audience. For example, a useful adjustment is to group extreme outliers into their own class and then class the rest of the data range using a standard method. In Figure 7a, for example, rates <12.71 and >30.64 are in separate classes and equal intervals are applied to the remainder of the data range (compare Figure 7a to Figure 5b). Likewise, when there are many zero values in a data set, it works well to separate them into their own class and then class the remainder of the data set. Another adjustment strategy is to apply Jenks for good statistical breaks and then adjust classes to include the national rate and round data values to assist map reading by a general audience (Figure 7b).12

Watch map patterns while changing methods and adjusting breaks to check the sensitivity of the distribution. The more classes used, the less changeable the map pattern will be with different classing methods and adjustments. There are diminishing returns with increasing numbers of classes, and it becomes difficult to assign colors that readers can tell apart with too many classes (the extreme being an n-class map). Seven classes is often the most you will want to use on a choropleth map, and an optimal number of classes can be calculated by examining diminishing reductions in variance with increasing numbers of classes.3 A quick look at a rough proportional symbol map can also provide an alternative understanding of the data distribution that helps you judge how well the classed view represents the data.

Figure 5. Three classifications of the same data set showing different patterns resulting from different classing methods. (a) Quantile. (b) Equal interval. (c) Jenks optimized classification (natural breaks). The number of counties in each class is shown to the right of each legend.

Figure 6. Example histogram display in the classify window of ArcMap (ESRI, Redlands CA, v9).
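The contrast between quantile and equal interval classing described in the Classing section can be seen on a small data set. The sketch below is illustrative only: the function names and the twelve rates are hypothetical, break values are treated as upper class limits, and the quantile rule shown is one simple convention that ignores ties (GIS packages may place breaks slightly differently). A Jenks optimization is not implemented here.

```python
from bisect import bisect_left


def equal_interval_breaks(values, k):
    """Upper class limits that split the data range into k equal-width segments."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)] + [hi]


def quantile_breaks(values, k):
    """Upper class limits that put about the same number of units in each class."""
    ordered = sorted(values)
    n = len(ordered)
    # The i-th break is the largest value among the lowest i/k share of the units.
    return [ordered[(n * i) // k - 1] for i in range(1, k + 1)]


def class_counts(values, breaks):
    """Count units per class; a value equal to a break falls in the lower class."""
    counts = [0] * len(breaks)
    for v in values:
        counts[bisect_left(breaks, v)] += 1
    return counts


# Hypothetical, right-skewed rates with one extreme outlier (60).
rates = [12, 14, 15, 16, 18, 19, 20, 21, 22, 24, 30, 60]

q_breaks = quantile_breaks(rates, 4)
e_breaks = equal_interval_breaks(rates, 4)
print("quantile:", q_breaks, class_counts(rates, q_breaks))
print("equal interval:", e_breaks, class_counts(rates, e_breaks))
```

With this skewed sample, quantiles give equal counts but very unequal class ranges, while equal intervals give equal ranges and crowd most units into the lowest class. This is the kind of difference that motivates placing outliers in their own class before classing the remainder.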

Proportional Symbols

Another way to represent quantitative data, for either points or areas, is with symbols that vary by size in proportion to data values. Symbols such as circles and squares are usually scaled by the software in proportion to the square root of each data value so that symbol areas visually represent the data values. Sizes of linearly scaled symbols, such as bars, are more accurately interpreted by map readers, but they soon become impractical with large data ranges. A symbol scaled by area, such as a square, is more compact and easier to associate with the location for which it represents data. Proportional symbols may be placed directly at data points, such as cities or address locations, or they may be centered in areas. Drawing the symbols in an order that places smaller symbols above larger ones aids map reading.

Use of proportional symbols for enumeration areas is particularly useful for count data (total number of incidences, for example). On choropleth maps, the size of an enumeration unit has a big effect on the amount of color shown on the map, but unit area may have little relationship, or even an inverse relationship, to base populations and related counts. United States counties are a good example of this inversion, with dense populations in small eastern counties and sparse populations in large western counties that then have an overwhelming impact on the look of the data distribution. This is a common failing of choropleth maps that is improved on by using proportioned symbols.

Symbols that vary by size may also be used to represent data ranges, as with choropleth classing. The same decisions in choosing classes are required for this symbol form, and a selection of readily differentiated symbol sizes are usually assigned to the classes. One name for this symbol type is graduated symbols. They are sometimes used when data ranges are too great to practically represent the full range on a small map. Another solution to proportioning symbols to extreme data ranges is to assign all values below a threshold (<100 in Figure 8) to a single small symbol before proportioning. Thus, Figure 8 combines a graduated symbol (for the range 0 to 99) and proportioned symbols (for counts of 100 and over). Another option for extreme data ranges is to switch to a volumetric symbol and cube-root scaling.

Figure 7. Classification examples using more customization to suit a particular dataset. (a) High and low extreme values grouped into separate classes with the remaining range classed using equal intervals. (b) Rounded Jenks classes for improved map reading (data are re-classed but minimal change to map pattern results; compare to Figure 5c).

Figure 8. Example of a proportional symbol map with legend showing example symbol sizes and data values they represent. Each map symbol is scaled to an individual county value.
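The scaling rules for proportional symbols can be written out in a few lines. This is a rough sketch with hypothetical function names and sizes: square-root scaling makes symbol area proportional to the data value, cube-root scaling does the same for the implied volume of a volumetric symbol, and values below a threshold are collapsed to one small graduated symbol, as described for Figure 8.

```python
import math


def circle_radius(value, max_value, max_radius):
    """Radius such that circle AREA is proportional to value (square-root scaling)."""
    return max_radius * math.sqrt(value / max_value)


def sphere_radius(value, max_value, max_radius):
    """Radius such that implied VOLUME is proportional to value (cube-root scaling)."""
    return max_radius * (value / max_value) ** (1 / 3)


def symbol_radius(value, max_value, max_radius, threshold, small_radius):
    """One fixed small symbol below the threshold, proportional scaling at or above it."""
    if value < threshold:
        return small_radius
    return circle_radius(value, max_value, max_radius)


# Hypothetical county counts; the largest count gets a 30-unit radius.
counts = [40, 100, 2500, 10000]
for c in counts:
    print(c, round(symbol_radius(c, 10000, 30, threshold=100, small_radius=2), 1))
```

A count one quarter of the maximum gets a circle with half the maximum radius, so its area is one quarter of the largest symbol. Cube-root scaling compresses the size range further, which is why it is an option for extreme data ranges.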

Other Symbols

Many other symbol types are common in mapping although less useful in cancer mapping. Cartograms are used much like proportioned symbols, with the forms of the enumeration units warped or resized to produce areas proportioned to data values. Cartograms were popular during and after the 2004 U.S. presidential elections for interpreting voting data,13 so epidemiologists may find that they are now a more accepted symbolization form. Dot density representations that vary the number of dots in enumeration areas in proportion to data values (for example, one dot represents 100 people) are well suited to sparse or discrete phenomena. Continuous surfaces, such as air quality, are represented with isolines, filled isolines, or smooth gradations of color based on values across a continuous surface. Linear phenomena may be represented with flow lines or networks, and “spider” maps offer a version of linear symbol sometimes used for epidemiologic data, such as connecting locations to service points.14

Mapping Multiple Variables

Improved representations of data and relationships between cancer variables may be revealed by overlaying symbols for one variable onto those of another; mapping related variables as a series; mapping differences, modeled indices, residuals, or other derived values; and mapping with symbols that combine variables. Examples of these approaches are described in this section.

Map Series

Comparisons among maps of related data are facilitated by using the same class breaks on all maps in the series.2,11 This often means that manual breaks are applied to each map and that some maps in the series will not include classes from the entire data range. Comparisons among map patterns are also aided by arranging them as small multiples,15 with many small maps on a page or screen. Figures 9 and 10 show two series of three maps. In the first series (Figure 9), each map is classed using quantiles. The second series (Figure 10) uses a set of breaks shared among all three maps. The shared breaks are a mix of rounded values and the U.S. rate for each 5-year aggregation. These two series provide quite different views of the data. The quantiles show how the relative locations of the highest and lowest rates change, with the highest rates shifting to the east by 1990–1994. In contrast, the shared breaks of Figure 10 also make the overall increase in rates through time more obvious (Figure 9 requires careful study of the legends for this information).

Figure 9. Example of a map series with each map classed separately using quantile classing. The maps are a time series.

Figure 10. The same map series seen in Figure 9 with all maps sharing the same set of classes to aid map comparison within the time series. Class breaks based on the U.S. rate for each time period are included on all maps; the U.S. rate for the 5-year period mapped is highlighted in each map’s legend.
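The effect of sharing one set of class breaks across a map series can be illustrated with a toy example. Everything in this sketch is hypothetical (the period labels, the rates for five counties, and the rounded manual breaks); it only shows that with shared breaks a rise in rates appears as a shift toward higher classes, and that some maps in the series will not use every class.

```python
from bisect import bisect_left

# Hypothetical rates for the same five counties in three time periods.
series = {
    "period 1": [18.0, 21.5, 24.0, 26.5, 29.0],
    "period 2": [20.0, 24.5, 27.0, 30.5, 33.0],
    "period 3": [23.0, 27.5, 31.0, 35.5, 39.0],
}

# Manual, rounded breaks (upper class limits) applied to every map in the series.
shared_breaks = [20, 25, 30, 35, 40]


def classify(values, breaks):
    """Class index for each value; a value equal to a break falls in the lower class."""
    return [bisect_left(breaks, v) for v in values]


classes_by_period = {p: classify(v, shared_breaks) for p, v in series.items()}

for period, classes in classes_by_period.items():
    unused = sorted(set(range(len(shared_breaks))) - set(classes))
    print(period, classes, "unused classes:", unused)
```

If each period were instead classed with its own quantiles, every map would use every class, and the overall increase would be visible only in the legend values rather than in the map colors.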

Overlay and Special Classes

Standard mapping methods are often improved with augmentations such as overlaying reliability information or other data. For example, a cancer distribution may produce a revealing pattern of rates across counties with reasonably high populations, but the extreme highs and lows associated with small numbers may interfere with the evaluation of this pattern. Masking or hatching counties with low populations (or using another relevant variable such as significance) helps bring the more stable patterns to the fore.16,17 Figure 11 shows data on low numbers of deaths as area symbols beneath death rates symbolized with point symbols to signal sparse and unreliable data. This added information allows readers to focus their attention on counties with high rates where data are not sparse. Figure 12 includes overlays that mask areas with no deaths, mask areas where rates are not significantly different from the U.S. rate, and hatch areas where data are sparse. A variety of special categories appear in the legend of Figure 12 to indicate these exceptions to the regular choropleth classes. Small-number problems may also be handled by statistical modeling or Bayesian smoothing and representation of a more generalized surface. Likewise, aggregating to larger enumeration units, through longer spans of time, and across related cancer types may improve the meaningfulness of maps of cancer data.4

Multivariate Symbols

Maps may combine variables by including them within one representation. Symbols that combine two or more variables include proportioned pie charts and two-variable choropleth maps.3 More generally, multivariate symbols can be grouped as category/category, category/quantity, and quantity/quantity combinations. A category/category symbol may use shape for one variable and hue for another. An example category/quantity representation is size for the quantitative variable and hue for the category variable. Quantity/quantity symbols may be separable, such as size and lightness (Figure 13), or more integral, such as the bivariate choropleth shown in Figure 14. Captions for Figures 13 and 14 provide tips on how to read the two-dimensional legends on these bivariate maps.
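A bivariate choropleth assigns each enumeration unit a legend cell from its class on two variables. The sketch below is a simplified illustration rather than a reconstruction of Figure 14: it uses a single break per variable (each group's own U.S. rate as quoted in the Figure 14 caption, 22.0 and 47.2), giving a 2 × 2 legend, whereas the published legend may use more classes. The function name and the sample county rates are hypothetical.

```python
from bisect import bisect_left

# Break values quoted in the Figure 14 caption; using one break per axis is a
# simplifying assumption made for this sketch.
WHITE_BREAKS = [22.0]  # columns
BLACK_BREAKS = [47.2]  # rows


def bivariate_class(white_rate, black_rate,
                    white_breaks=WHITE_BREAKS, black_breaks=BLACK_BREAKS):
    """Return (column, row) of the legend cell; 0 is the lowest class on each axis.

    A rate must be strictly above a break to move into the higher class.
    """
    return bisect_left(white_breaks, white_rate), bisect_left(black_breaks, black_rate)


# Hypothetical counties: (white male rate, black male rate).
counties = {"A": (18.0, 40.0), "B": (25.0, 40.0), "C": (18.0, 52.0), "D": (25.0, 52.0)}
for name, (white, black) in counties.items():
    print(name, bivariate_class(white, black))
```

Each (column, row) pair would then be looked up in a two-dimensional color table, with the lightest color in cell (0, 0) and the darkest in the highest cell, so that covariation between the two variables shows up along the legend diagonal.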

Map Finishing

Guidelines for mapmakers often do not discuss the seemingly trivial issues of map titles and legend titles or issues of file export. The look and credibility of final maps are also affected by selecting a map projection suited to the area mapped. These are keys to making a map that can be understood, be presented in multiple media, and be distributed to a wide audience.

Wording

Completed maps may be missing critical information about the calculations behind the symbols they present or, conversely, they may have such laborious titles that the main issue presented by the map is obscured. A map title should present the basic topic of the map and invite the reader to investigate further. The legend title should provide details of the map calculations (i.e., it should not be labeled “legend” or something so terse as “%”). If the calculation is complex, then the clarification is best continued in a note in small type on the map or in associated text. Map sources, data sources, and authorship are also given in small type on the map. This format varies with media. For example, a journal publication may move the title and note information to a figure caption, but the legend title within a map figure should still be complete.

Figure 12 shows a map title, legend title, and note suited to publication of a single map. The wording that comes up automatically for the corresponding map on the NCI Cancer Mortality website1 (www.cancer.gov/atlasplus) lists the data parameters chosen by the user in the map title: “Cancer mortality rates by state economic areas (age-adjusted 1970 U.S. population), Prostate Gland: black males, 1970–1994, all ages.” The corresponding legend title online is: “Rates per 100,000 person-years, 1970–1994.” The lengthy and complete online title is reasonable because the user creating the map selected the parameters and is already involved with exploring the data. The wording in Figure 12 is better suited to presenting the map to a wider audience who need to be invited to read the map, to understand its primary topic, and to then learn about the particular subset of data presented by the map. The title is shorter and details are in the legend title and note in Figure 12.

Figure 11. Two variables are shown on the map. Proportioned point symbols for rates overlay choropleth symbols for number of deaths. Counts of zero, <6, and <12 are used to indicate sparse populations and suggest caution in judging rates (especially extreme rates) in these counties.

Figure 12. Special classes improve a map: zero deaths are separated to a class and significance and sparse data symbols are overlaid on a diverging choropleth representation. This map also shows more complete wording for a thematic map, with general information in the map title, and specific information about the calculation and data mapped in the legend title and note.

File Export

When preparing a GIS map, prepare in advance to export it for distribution to others. It is difficult to share a map project (such as an .MXD from ESRI ArcGIS) directly with graphics production people, and often it does not even work to share it with others who do have the GIS software because paths to associated geography files and data tables are difficult to preserve. If the map is going to press, the publishers will likely request that it be exported in an Adobe Illustrator format (.AI) or a bitmap format (such as .TIFF or .JPEG). Many mapping packages export a version of the .AI format that can be read by other graphics software, not just Illustrator. The high resolutions and high-quality compression required for bitmap files to preserve small lettering and fine lines will produce large files that may make file transfer more challenging for some authors. Likewise, very detailed geographic databases, such as county shapes, may also produce large .AI files. If you want to post map products on the web, exports to .PDF and .JPEG will be good choices.

All of these export formats will cause some problems, so be careful with interesting fonts (that you are not licensed to share); special type effects (such as halos); complex symbols (that may be derived from fonts that are not on others’ computers); and patterns (that may produce very large files). These choices may not export and transfer as you intend. The basic rule of thumb is to test exported files before you do much custom work.

Figure 13. A bivariate map showing both number of deaths and death rates. Size is used for the count variable (rows; larger symbols for more deaths) and hue and lightness are used for the rate data (columns; light yellow to dark red). For example, the large dark square represents counties with many deaths and high rates (upper right in legend). The combination of small size and color reduces the visual prominence of counties with few deaths and thus less reliable rates. Size and color are separable visual variables. The 2-D legend reads much like a cross-tabulation table.

Figure 14. A bivariate choropleth map offers a visual combination of two variables, making visible their covariation. Breaks between classes for white male death rates separate columns. Breaks for black male death rates separate rows. Overall U.S. rates for black and white groups are used as class breaks for both races. The lightest color (light green) shows the lowest values for both groups: both are below the U.S. rate for white males. The darkest color (top right in legend) represents counties where each group is above their U.S. rate. Blacks’ rates (rows) are above 47.2 and whites (columns) are above 22.0 in counties filled with dark purple.

Map Projection

Another basic mapmaking issue is map projection.3 A projection transforms the base geography from a spherical model of the Earth to the flat page or screen. GIS implements map projections by applying a series of equations to geographic coordinates. Projections that preserve area, called equal area projections, are suitable for most epidemiologic maps. The Albers Equal Area projection is commonly used for the U.S. (seen in all figures in this article). In addition, customized projections can be created to suit any map scale and world area. If the mapmaker does not attend to map projection, software defaults usually present the mapped area underpinned with a regular grid of latitude and longitude lines, producing an inappropriate east–west stretch and north–south compression at U.S. latitudes. These distortions interfere with readers’ judgments of densities and relative areas of cancer rates, which are crucial for much epidemiologic map interpretation. Slab-like default projections also mark a map as the product of an amateur, calling into question the competence of data handling and other GIS decisions. Figure 15 shows a portion of the southeast U.S. on three maps at the same scale with a graticule (latitude and longitude lines) overlay. In Figure 15a, no projection is set, the graticule is square, and the counties and states are distorted by being stretched east to west. Figure 15b is projected using an Albers Equal Area projection with settings for the entire U.S., causing this eastern portion to be tilted, as seen in the angled latitude lines. The third example, Figure 15c, is adjusted by changing the central meridian so north is up in the center of the mapped area. This third example has suitable projection settings: it is equal area, and the central meridian runs through the middle of the area mapped.

Figure 15. Example of effects of different projections of the same map area in the southeast United States. (a) No projection is set so latitude and longitude remain in the square-grid default arrangement. (b) Albers Equal Area projection for the U.S. (c) Albers modified by adjusting the central meridian to the center of the mapped area to position north as up (this rotation may be made with the Data Frame toolbar in ArcMap). The gray-filled counties had more than five prostate cancer deaths for black males, 1970–1994.
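The two projection effects described above can be illustrated numerically. The following is a minimal sketch rather than the procedure used for the figures in this article: it uses the textbook spherical form of the Albers Equal Area Conic equations, and the standard parallels (29.5°N and 45.5°N), origin latitude (23°N), central meridian (96°W), Earth radius, and sample point (84°W, 33°N) are all assumptions chosen for illustration. The function names are hypothetical and do not correspond to any GIS package.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical Earth model (assumption)


def unprojected_stretch(lat_deg):
    """East-west stretch factor when latitude and longitude are drawn
    on a square grid (no projection set)."""
    return 1.0 / math.cos(math.radians(lat_deg))


def albers_sphere(lon_deg, lat_deg, central_meridian,
                  std_parallels=(29.5, 45.5), origin_lat=23.0):
    """Spherical Albers Equal Area Conic; returns (x, y) in km.
    Parameter defaults are illustrative assumptions."""
    phi = math.radians(lat_deg)
    phi0 = math.radians(origin_lat)
    phi1, phi2 = (math.radians(p) for p in std_parallels)
    n = (math.sin(phi1) + math.sin(phi2)) / 2.0
    c = math.cos(phi1) ** 2 + 2.0 * n * math.sin(phi1)
    rho = EARTH_RADIUS_KM * math.sqrt(c - 2.0 * n * math.sin(phi)) / n
    rho0 = EARTH_RADIUS_KM * math.sqrt(c - 2.0 * n * math.sin(phi0)) / n
    theta = n * math.radians(lon_deg - central_meridian)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


def north_tilt(lon_deg, lat_deg, central_meridian):
    """Angle in degrees between true north at a point and 'up' on the page.
    Negative values mean north leans toward the left of the page."""
    x0, y0 = albers_sphere(lon_deg, lat_deg, central_meridian)
    x1, y1 = albers_sphere(lon_deg, lat_deg + 0.01, central_meridian)
    return math.degrees(math.atan2(x1 - x0, y1 - y0))


if __name__ == "__main__":
    # No projection: east-west stretch at a southeastern U.S. latitude.
    print(round(unprojected_stretch(33.0), 2))
    # Albers with a whole-U.S. central meridian: north is tilted.
    print(round(north_tilt(-84.0, 33.0, -96.0), 2))
    # Central meridian moved to the mapped area: north is up.
    print(round(north_tilt(-84.0, 33.0, -84.0), 2))
```

Under these assumptions, the unprojected grid stretches features at 33°N by roughly 19% east to west, the whole-U.S. Albers settings tilt north by about 7 degrees at 84°W, and moving the central meridian to 84°W removes the tilt while the projection remains equal area.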

Conclusion

Maps and other data graphics can play a role in generating ideas and hypotheses at the beginning of a project. They are useful as part of analyses for evaluating model results and then at the end of a project when researchers present their results and conclusions to varied audiences, such as their local research group, decision makers, or a concerned public. Cancer researchers are gaining skill with GIS mapping as one of their many tools and are broadening the symbolization approaches they use for investigating and illustrating their data. A single map is one of many possible representations of the data, so making multiple maps is often part of a complete mapping effort. Symbol types, color choices, and data classing each affect the information revealed by a map and are best tailored to the specific characteristics of data. Related data can be examined in series with coordinated classing and can also be compared using multivariate symbols that build on the basic rules of symbol design. Cartography texts,3,18–20 more advanced reference books,21,22 and handbooks oriented to wider audiences23–26 include guidance on mapping choices implemented in GIS that epidemiologists may find useful.

No financial conflict of interest was reported by the authors of this paper.

References

1. National Cancer Institute. Cancer mortality maps and graphs. 1999. Available at: www.cancer.gov/atlasplus. Accessed August 1, 2005.
2. Wiggins L, Using geographic information systems technology in the collection, analysis, and presentation of cancer registry data: a handbook of basic practices. Springfield, IL: North American Association of Central Cancer Registries, 2002.
3. Slocum TA, McMaster RB, Kessler FC, Howard HH. Thematic cartography and geographic visualization. 2nd ed. Upper Saddle River, NJ: Pearson Education, 2005.
4. Boscoe FP, Pickle LW. Choosing geographic units for choropleth rate maps, with an emphasis on public health applications. Cartogr Geogr Inf Sci 2003;30:237–48.
5. Brewer CA, Hatchard GW, Harrower MA. ColorBrewer in print: a catalog of color schemes for maps. Cartogr Geogr Inf Sci 2003;30:5–32.
6. Harrower MA, Brewer CA. ColorBrewer.org: an online tool for selecting colour schemes for maps. Br Cartogr Soc 2003;40:27–37.
7. Brewer CA, MacEachren AM, Pickle LW, Herrmann D. Mapping mortality: evaluating color schemes for choropleth maps. Ann Assoc Am Geogr 1997;87:411–38.
8. Brewer CA. Spectral schemes: controversial color use on maps. Cartogr Geogr Inf Sci 1997;24:203–20.
9. Olson JM, Brewer CA. An evaluation of color selections to accommodate map users with color-vision impairments. Ann Assoc Am Geogr 1997;87:103–34.
10. Brewer CA. Guidelines for selecting colors for diverging schemes on maps. Br Cartogr Soc 1996;33:79–86.
11. Brewer CA, Pickle LW. Evaluation of methods for classifying epidemiological data on choropleth maps in series. Ann Assoc Am Geogr 2002;92:662–81.
12. Brewer CA. Reflections on mapping Census 2000. Cartogr Geogr Inf Sci 2001;28:213–35.
13. Gastner MT, Shalizi CR, Newman MEJ. Maps and cartograms of the 2004 U.S. presidential election results. 2004. Available at: www-personal.umich.edu/~mejn/election. Accessed August 1, 2005.
14. Armstrong MP, Densham PJ, Lolonis P, Rushton G. Cartographic displays to support locational decision making. Cartogr Geogr Inf Sys 1992;19:154–164.
15. Tufte ER. Envisioning information. Cheshire, CT: Graphics Press, 1990.
16. MacEachren AM, Brewer CA, Pickle LW. Visualizing georeferenced data: representing reliability of health statistics. Environ Plan A 1998;30:1547–61.
17. Pickle LW, Mungiole M, Jones GK, White AA. Atlas of United States mortality. Hyattsville, MD: National Center for Health Statistics, 1996.
18. Dent BD. Cartography: thematic map design. 5th ed. Boston: McGraw-Hill, 1999.
19. Robinson AH, Morrison JL, Muehrcke PC, Kimerling AJ, Guptill SC. Elements of cartography. 6th ed. New York: Wiley, 1995.
20. Kraak MJ, Ormeling F. Cartography: visualization of geospatial data. 2nd ed. Upper Saddle River, NJ: Pearson Education, 2003.
21. MacEachren AM. How maps work: representation, visualization, and design. New York: Guilford Press, 1995.
22. Fairchild MD. Color appearance models. 2nd ed. New York: John Wiley & Sons, 2005.
23. Brewer CA. Designing better maps: a guide for GIS users. Redlands, CA: ESRI Press, 2005.
24. Brown A, Feringa W. Colour basics for GIS users. Upper Saddle River, NJ: Pearson Education, 2003.
25. Krygier J, Wood D. Making maps: a visual guide to map design for GIS. New York: Guilford, 2005.
26. Monmonier M. Mapping it out: expository cartography for the humanities and social sciences. Chicago, IL: University of Chicago, 1993.
