Government Information Quarterly 29 (2012) 532–542
Government Information Quarterly journal homepage: www.elsevier.com/locate/govinf
The wireless abyss: Deconstructing the U.S. National Broadband Map
Tony H. Grubesic
Geographic Information Systems and Spatial Analysis Laboratory, College of Information Science and Technology, Drexel University, Philadelphia, PA 19104, USA
Article info
Available online 31 July 2012
Keywords: Wireless; Broadband; Uncertainty; National Broadband Map; Policy; GIS
Abstract
The U.S. National Broadband Map (NBM) is arguably the most complex articulation and synthesis of telecommunications data ever generated by the federal government. Drawing upon information collected by fifty U.S. states, five territories and the District of Columbia, broadband provision is tabulated at the Census block level and made available to the general public in a variety of formats (e.g., maps, tabular databases, and geographic coverages). One major policy challenge associated with deepening our understanding of wireless broadband provision in the United States is developing a methodological process for accurately rearticulating NBM wireless data collected at the block level to more meaningful economic units (e.g., Census block groups or tracts). Without this ability, policy analysis and an objective evaluation of the goals set forth in the National Broadband Plan are compromised. The purpose of this paper is to outline such a methodology, while simultaneously highlighting several consistency checks for ensuring completeness and data aggregation integrity. © 2012 Elsevier Inc. All rights reserved.
1. Introduction
The National Broadband Map (NBM) is a collective effort between the National Telecommunications and Information Administration (NTIA), the Federal Communications Commission (FCC), fifty states, five territories, and the District of Columbia to provide a detailed snapshot of broadband provision in the United States. According to the FCC, broadband in the United States constitutes download (i.e., to the customer) speeds of at least 4 megabits per second (Mbps) and upload (i.e., from the customer) speeds of at least 1 Mbps (FCC, 2010a,b). To confuse matters, the NTIA defines broadband as 768 Kbps download and 200 Kbps upload speeds. Both definitions have evolved significantly over the past decade; as recently as 2004, the FCC defined broadband as download speeds of at least 200 Kbps (FCC, 2004). Clearly, as delivery technologies continue to evolve and improve, so too will the definition of broadband.
The NBM is a small but significant facet of a much larger National Broadband Plan (“Plan”) (FCC, 2010b). The Plan outlines a strategic agenda for both developing and enhancing broadband infrastructure because of its perceived importance to a variety of critical sectors in the U.S. economy (e.g., health care, education, energy) as well as government performance, public safety and civic engagement (FCC, 2010b). Understandably, one of the major challenges set forth in the Plan was to determine broadband provision levels throughout the United States. As noted by Grubesic (2012), issues of information asymmetry have plagued broadband-related development efforts in the U.S. for many years. One major reason that asymmetries exist is the lack of quality data regarding broadband provision, pricing and
0740-624X/$ – see front matter © 2012 Elsevier Inc. All rights reserved. doi:10.1016/j.giq.2012.05.006
quality of service (QOS) at the local level (Greenstein, 2007). For example, prior to the release of the NBM, the only viable broadband provision data available to analysts was the FCC Form 477 database, which had been aggregated to the ZIP code or Census tract level. While these data were suitable for a rough snapshot of advanced telecommunications provision, their lack of spatial resolution was a significant hindrance to understanding disparities in broadband and related competitive effects. In addition, because the Form 477 data did not include pricing information, robust evaluations of the economic impact of broadband and associated telecommunications policies were difficult. Sadly, while pricing information is still unavailable in the NBM, the spatial resolution of provision data is greatly improved. Detailed provision data are now available at the Census block level (e.g., providers, upload/download bandwidth, etc.), which is the smallest geographic unit that the Census Bureau publishes decennial survey information on. As noted by Grubesic (2012), however, there are several major problems with the NBM data. First, provider participation in the NBM varied significantly between states, ranging from 27% (Virginia) to 100% (Indiana, Illinois, and six others). Second, issues of data uncertainty for digital subscriber line (xDSL) service were not rectified, likely leading to a significant overestimation of broadband xDSL provision coverage in the U.S. Third, the NBM currently identifies thousands of zero population blocks (i.e., no residents or businesses) as having broadband. In part, these errors can be attributed to providers claiming an ability to provide service to these eligible locations within a 10 day window (NTIA, 2011), but as Grubesic (2012) notes, there is an important difference between regions that could have broadband and regions that have broadband. Finally, the sheer size of the databases associated with the NBM creates problems. 
For example, the wireline provider database contains approximately 12.5 million records. As a result, any
effort to aggregate these data to alternative units for analysis is computationally burdensome, time consuming and prone to error. Although the problems associated with wireline NBM data are fairly well understood, much less attention has been paid to the wireless NBM data. This is a notable gap because there is growing sentiment that wireless options may have a disruptive effect on the overall broadband market, making wireline options (e.g., fiber to the home) less attractive (Middleton & Given, 2011). This suggests that, now more than ever, developing an understanding of where wireless broadband options are available is critical to evaluating disparities in broadband and the relative success or failure of the National Broadband Plan over time. Unfortunately, in their current form, the NBM wireless provision data are unwieldy and far from user-friendly. With over 50 million individual records for wireless provision, efforts to manipulate, analyze and visualize these data at a national scale are both time consuming and computationally intensive. Thus, the purpose of this paper is to provide a methodological framework for rearticulating the raw, wireless broadband provision data from the NBM to more meaningful economic units for policy evaluation, spatial econometric analysis and geographic visualization. Specifically, provision data are aggregated from Census blocks to block groups using a multistep process that leverages the data manipulation abilities of a geographic information system (GIS). A variety of data consistency checks for ensuring completeness and data aggregation integrity are also detailed.
2. Wireless broadband in the United States
Wireless broadband comes in many forms, connecting a home or business to the internet without wires, typically via a radio link between a customer's location and a facility operated by a service provider (Sawada, Cossette, Wellar, & Kurt, 2006). 
A simple typology to differentiate between types of wireless broadband is fixed and mobile. There are also subtle differences between platforms that use licensed and unlicensed spectrum (Sirbu, Lehr, & Gillett, 2006). Where the latter is concerned, unlicensed spectrum is shared among internet service providers, while licensed spectrum is dedicated to a single provider. For wireless platforms, fixed technologies allow subscribers to access the internet from a fixed point (while stationary), and usually require a direct line-of-sight between the wireless transmitter and receiver. Fixed wireless technologies include WiFi and WiMAX (Abichar, Peng, & Chang, 2006; Vaughan‐Nichols, 2004) and have proven to be popular in rural and remote areas where wireline and mobile technologies are not as widespread (Zhang & Wolff, 2004). Mobile wireless connections provide broadband in specific geographic locations to mobile objects (cars, trucks, boats, pedestrians, etc.) using spectrum that is dedicated to an internet service provider. Mobile wireless technologies include 3GPP Long Term Evolution (LTE) and CDMA2000 (EVDO) among others (Agashe, Rezaiifar, & Bender, 2004; Dahlman, Parkvall, Skold, & Beming, 2008). Finally, it is important to note that satellite broadband technologies from providers such as Hughes, Wild Blue and Spacenet are also used throughout the United States, although the number of households subscribing to satellite services remains very small (~ 1 million during the first quarter of 2010) (NSR, 2010). For a brief overview of wireless broadband platforms and their associated speeds, see Table 1. To put the U.S. wireless market in perspective, consider the recent statistics published within the National Broadband Plan (FCC, 2010b). Wireless broadband use is growing exponentially, with Cisco projecting that wireless networks in North America will carry approximately 740 petabytes per month by 2014, a 40-fold increase from 2009 (~ 17 petabytes). 
In part, this massive increase is attributable to the growing use of smart phones, but it is also fueled by the use of LTE-enabled laptop computers and tablet devices. The FCC (2010a,b, 77) also notes that machine-based wireless communications will increase dramatically within the next few years, as sensor networks
Table 1
Wireless broadband overview.
Technology | Air interface (a) | Downlink data rate (Mbps) | Uplink data rate (Mbps)
WiMAX | OFDM, OFDMA | 75 | 75
UMTS-TDD | TDMA | 16 | 16
3GPP LTE | OFDM, OFDMA | 100 | 50
CDMA2000/EVDO | FDMA | 3.1 | 1.8
3GPP2 ultra mobile broadband | OFDMA | 275 | 75
MBWA | OFDMA | 1 | 1
(a) OFDM = Orthogonal frequency-division multiplexing. OFDMA = Orthogonal frequency-division multiple access. TDMA = Time division multiple access. FDMA = Frequency division multiple access.
Source: Kong, D.T., Liang, P-Y. and Chang, Y. (2009). Wireless Broadband Networks. Wiley: Hoboken, NJ.
and “smart devices take advantage of the ubiquitous connectivity afforded by high-speed, low-latency, wireless packet data networks.” The notion of broadband ubiquity is interesting. While there is no doubt that the number of smart devices leveraging wireless broadband networks is on the rise, the ubiquity of wireless broadband is less certain (Middleton & Bryne, 2011; Sawada et al., 2006). Fig. 1 provides some perspective on the spatial dimensions of new technology rollouts by private telecommunications providers. Specifically, it highlights locales throughout Indiana and Ohio that have access to the new wireless 4G WiMAX network built by Clear Communications.1 This system is designed to provide average download speeds of 3 to 6 Mbps, with bursts up to 10 Mbps. Although this does not reflect “true” 4G speeds as defined by the ITU (100 Mbps) (ITU, 2008), it is representative of average 4G speeds for mobile devices in the U.S. These speeds can support a variety of internet-based activities ranging from streaming audio and video to online gaming (GAO, 2010). Dark green portions of Fig. 1 denote areas with excellent coverage and high bandwidth capacities, while light green areas have only partial coverage and lower available bandwidth. Areas with no green shading represent coverage gaps in the Clear Communications network. Given this information, it is evident that portions of the Cincinnati and Columbus, Ohio metropolitan areas have excellent coverage and high bandwidth capacities (dark green), but the Dayton, Ohio metropolitan area (pop. 847,502) is without any coverage from Clear Communications (Fig. 1). Further, the second largest metropolitan area in the Midwest, Indianapolis, Indiana (pop. 1.83 million), is only partially covered and likely under-capacitated in terms of available bandwidth (light green). 
While early, this geographic perspective on the rollout of the much touted 4th generation network highlights underserved regions that might benefit from policy interventions if these gaps in provision persist. Granted, this is a geographic snapshot of a single provider in a mixed urban/suburban/rural region, but the notion of 4G ubiquity remains relatively far-fetched, particularly for more rural areas. Further, there are concerns that the architecture and capabilities of wired and wireless access networks will never converge, primarily because of the limitations associated with wireless spectrum availability and its associated capacity (Lehr & Chapin, 2010). That said, the FCC recently moved to open television spectrum to wireless broadband in the hope of relieving the strain on existing spectrum allocations (Benton Foundation, 2010). Regardless of one's stance on these issues, the ability to identify heterogeneities in provision between urban, suburban, rural and remote communities is critical to obtaining a better understanding of wireless broadband provision and access in the United States and supporting the National Broadband Plan.
1 Both Indiana and Ohio have a good mix of urban, suburban and rural settings, providing a fairly representative snapshot of wireless coverage and technology for the U.S.
Fig. 1. Clear Communications 4G WiMAX coverage: portions of Indiana and Ohio. Legend: stronger coverage; weaker coverage. Source: http://www.clear.com/coverage.
3. Wireless provision data and the national broadband map
The National Broadband Map details broadband provision information for wireline and wireless services (including satellite), as well as documenting “anchor institutions” that can serve as gateways for broadband access (e.g., schools, colleges and libraries).2 Similar to the wireline data collection and reporting process, wireless broadband provision data are linked to Census blocks in the national map. Table 2 illustrates a sample of the wireless provision data at the block level for the state of Ohio and provides the working definition of the fields in the database (USDOC/NTIA, 2010). Perhaps the most interesting column in Table 2 is labeled “pct_blk_in_shape”. The supporting documentation available from the NBM website does not contain any reference to this field, nor does it provide a definition of it. Upon closer inspection, however, this column begins to make sense when one deconstructs the methodological process used to generate wireless provision data for Census blocks in the NBM. Unlike the process associated with estimating provision for wireline carriers (Grubesic, 2012), wireless provision at the block level is estimated through a series of interpolation techniques in a GIS environment.3 While there are some minor variations in the methodologies applied and modeling details given by each of the agencies charged with tabulating provision in each state, the bulk of them used wireless signal propagation modeling to generate a geographic coverage associated with each tower and its antennae. For example, the end product for the state of Ohio “depicts a graphical illustration of the theoretical propagation characteristics of a selected frequency range based on defined variables (receiver sensitivity of the home/mobile device, foliage factor, and digital elevation terrain input)” (Connect Ohio, 2011). 
2 For more details on broadband use in anchor institutions such as public libraries, see McClure, Jaeger, and Bertot (2007) and Jaeger, Bertot, McClure, and Langa (2006).
3 Readers are referred to the NBM methodologies page (http://www2.ntia.doc.gov/files/broadband-data/All_Grantees_Methodologies_December-2010.zip) for further details.
Put more simply, Connect Ohio (the grantee charged with developing both wireline and wireless broadband provision data for Ohio) provided the NTIA with a map of wireless coverage, for all of the participating providers, for the state. Again, the approach used for generating these coverages varied somewhat for each state. Some rather awkward questions arise regarding the NBM data once it is realized that these raw geographic coverages of wireless broadband provision are used to estimate broadband provision at the block level. For example, consider Fig. 2, which displays the provision footprints associated with each of the wireless broadband providers in the state of Ohio. Please note that the information represented in Fig. 2 corresponds to the ArcGIS compatible shapefile available for download from the NBM site (USDOC/NTIA, 2010). Aside from the diversity in geographic footprints associated with each provider, there are a number of important observations associated with the underlying data that must be noted. First, returning to the discussion of Table 2 for a moment, it becomes clear that these wireless coverage footprints are used to generate the values in the “pct_blk_in_shape” column. For example, Census block 391336007032011 is served by Mikulski Communications LLC. However, only 6.98% of this particular block is within the Mikulski service footprint. Can this block actually be considered served by Mikulski? If not, how much coverage is necessary to consider a block served? Fifty percent? Seventy-five percent or greater? Second, the ubiquity of wireless networks is also in question. In this particular instance, in addition to Mikulski, block 391336007032011 is 100% covered by three other providers (Cellco [doing business as Verizon], Sprint, and AT&T). That said, there are many other instances where blocks have partial coverage only. 
For example, consider block 390019901001000, also located in Ohio, which is 40% covered by one Scioto Wireless coverage zone, 20% covered by another and 1.9% covered by Verizon. 
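The threshold question can be made concrete with a small sketch. The Python below is illustrative only: it takes the six sample Ohio records shown for the NBM wireless data (block FIPS, provider name and pct_blk_in_shape) and counts how many blocks would be treated as served under several candidate cut-offs. The function names and the cut-offs of 0, 0.5 and 0.75 are assumptions chosen to mirror the questions posed above; they are not rules used by the NBM.

```python
from collections import defaultdict

# Sample rows: (censusblock_fips, provider name, pct_blk_in_shape).
RECORDS = [
    ("391336007032011", "Mikulski Communications LLC", 0.0698),
    ("391336010001020", "Mikulski Communications LLC", 0.7886),
    ("391336008003014", "Mikulski Communications LLC", 0.6763),
    ("391390030022004", "Mechcom Dot Net", 0.0215),
    ("391390010009024", "Mechcom Dot Net", 0.0624),
    ("391390021014005", "Mechcom Dot Net", 0.6473),
]


def served_blocks(records, min_share):
    """Map each block to the providers whose footprint covers at least
    min_share (a fraction between 0 and 1) of the block."""
    served = defaultdict(set)
    for block, provider, share in records:
        if share >= min_share:
            served[block].add(provider)
    return {block: sorted(names) for block, names in served.items()}


def count_served(records, thresholds=(0.0, 0.5, 0.75)):
    """Count the blocks considered served under each hypothetical cut-off."""
    return {t: len(served_blocks(records, t)) for t in thresholds}


if __name__ == "__main__":
    print(count_served(RECORDS))
```

With any-overlap counting (a cut-off of 0), all six sample blocks are served; the count shrinks as the cut-off rises, which is the continuum of provision at issue here.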
Again, does this instance qualify as coverage? Fig. 3 provides a spatial perspective on this issue, highlighting the geographic manifestations of coverage for this block. There are notable gaps, but it is important to remember that these wireless coverages are estimates based on cellular propagation models. Actual coverage may vary. Thus, from a policy perspective, can one consider this block as having been provided wireless broadband? Regardless of how one might answer these types of questions, it is clear that a continuum of wireless provision exists
Table 2
National Broadband Map Wireless Broadband Provision Data, December 2010.
objectid | FRN | PROVNAME | DBANAME | HOCONUM | HOCONAME | censusblock_fips | pct_blk_in_shape | TRANSTECH | MAXADDOWN | MAXADUP | TYPICDOWN | TYPICUP
3-1-39-6-22 | 17116831 | Mikulski Communications LL | Mikulski Net | 900470 | Mikulski Communications LLC | 391336007032011 | 0.0698 | 70 | 4 | 3 | 0 | 0
3-1-39-6-22 | 17116831 | Mikulski Communications LL | Mikulski Net | 900470 | Mikulski Communications LLC | 391336010001020 | 0.7886 | 70 | 4 | 3 | 0 | 0
3-1-39-6-22 | 17116831 | Mikulski Communications LL | Mikulski Net | 900470 | Mikulski Communications LLC | 391336008003014 | 0.6763 | 70 | 4 | 3 | 0 | 0
3-1-39-6-23 | 19792696 | Mechcom Dot Net | Mechcom Dot Net | 900541 | Mechcom Dot Net | 391390030022004 | 0.0215 | 70 | 5 | 5 | 5 | 5
3-1-39-6-23 | 19792696 | Mechcom Dot Net | Mechcom Dot Net | 900541 | Mechcom Dot Net | 391390010009024 | 0.0624 | 70 | 5 | 5 | 5 | 5
3-1-39-6-23 | 19792696 | Mechcom Dot Net | Mechcom Dot Net | 900541 | Mechcom Dot Net | 391390021014005 | 0.6473 | 70 | 5 | 5 | 5 | 5
FRN = FCC Registration Number. PROVNAME = Provider Name. DBANAME = Doing Business As Name. TRANSTECH = Technology Code (see below for valid values). MAXADDOWN = Maximum Advertised Download Speed (see below for valid values) from record level. MAXADUP = Maximum Advertised Upload Speed (see below for valid values) from record level. TYPICDOWN = Typical Download Speed (see below for valid values). TYPICUP = Typical Upload Speed (see below for valid values). DOWNLOADSPEED = Maximum Advertised Download Speed if provided from Overview table. UPLOADSPEED = Maximum Advertised Upload Speed if provided from Overview table.
Source: U.S. Dept of Commerce, National Telecommunications and Information Administration, State Broadband Initiative (December 30, 2010).
throughout the United States, manifesting at a highly local level. This issue will be discussed more thoroughly later in the paper. There are several additional aspects of the wireless broadband provision data from the NBM that are worth highlighting. As noted previously, there are 50,013,289 records for wireless broadband provision in the NBM database. Needless to say, this is a massive amount of information and is extremely difficult to manipulate, visualize and analyze without a significant commitment of computational resources. The sheer size of this database also makes aggregation of wireless broadband provision data to more meaningful administrative units such as block groups, transportation analysis zones, or municipal boundaries difficult. Granted, analysts could break the national database into more manageable file sizes, and to their credit, the NTIA has already done this for the state level. However, to conduct any national-level analysis, this means analysts must re-aggregate state-level data to compile a national database. Moreover, this process still does not address the question of how to efficiently aggregate 50 million block data records to alternative geographical units. The reason this is so important is because much of the truly rich demographic, socio-economic and business data are only available from the Census Bureau and other agencies at larger levels of aggregation. Further, because there are so many different types of administrative boundaries (e.g., congressional districts, school districts, state legislative districts, urbanized areas, etc.) that are not neatly tied to the Census administrative hierarchy, more simplistic aggregation approaches that are available in statistical analysis packages like SAS and SPSS may not be viable. 
So, if one is interested in modeling the economic impacts of wireless broadband (Thompson & Garbacz, 2011), particularly at a local level, both provision data and information on the demographic/socio-economic/business composition of an administrative unit are required. Further, an efficient and reliable methodology for rearticulating the raw wireless provision data must be developed. In the next section, a process is outlined to make this more manageable for analysts interested in policy and spatial econometric analysis of wireless broadband in the United States.
4. Data and methodology
As highlighted in the previous section, two databases were acquired from the National Broadband Map for analysis. The first is the complete U.S. broadband availability database, by comma separated values (CSV) (USDOC/NTIA, 2010). The second is the complete U.S. broadband availability database, by shapefile (SHP), a popular geospatial vector data format compatible with a wide range of GIS software (ibid). Both databases correspond to the December 2010 NBM records on broadband provision. There are some fundamental differences between these two files that are worth detailing. For example, the raw version of the CSV file on broadband provision from the NBM contains 50,013,289 individual records at the block level. Considering that there were only 8,262,363 Census blocks in the United States during 2000, which is the vintage of the geographic base files from the Census used for tabulating the NBM (Grubesic, 2012), this equates to an average of six provision records for each block. Again, the structure of this database is highlighted in Table 2. This inflation of raw record counts occurs because the NBM contains a record for each instance of provision for each block. Thus, while some blocks in the United States do not appear in the NBM database, others appear multiple times. 
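The record inflation described above can be checked with a short sketch. The Python below streams a block-level CSV and tallies provision records per block, then reproduces the national average from the two totals reported in the text. The column names censusblock_fips and PROVNAME follow the NBM field names; the seven sample rows are a hypothetical extract built from the two Ohio blocks discussed earlier, with provider labels abbreviated as in the text, and the exact CSV layout is an assumption.

```python
import csv
import io
from collections import Counter

# Hypothetical extract: one row per instance of provision for a block.
SAMPLE_CSV = """censusblock_fips,PROVNAME
391336007032011,Mikulski Communications LLC
391336007032011,Cellco
391336007032011,Sprint
391336007032011,AT&T
390019901001000,Scioto Wireless
390019901001000,Scioto Wireless
390019901001000,Verizon
"""


def records_per_block(handle):
    """Stream rows one at a time and count provision records per block."""
    counts = Counter()
    for row in csv.DictReader(handle):
        counts[row["censusblock_fips"]] += 1
    return counts


def mean_records_per_block(total_records, total_blocks):
    """Average number of provision records per Census block."""
    return total_records / total_blocks


counts = records_per_block(io.StringIO(SAMPLE_CSV))
# Totals reported in the text: NBM wireless records and Census 2000 blocks.
national_mean = mean_records_per_block(50_013_289, 8_262_363)

if __name__ == "__main__":
    print(dict(counts))
    print(round(national_mean, 2))
```

Because the reader is consumed row by row, the same tally could in principle be run against the full file without loading it into memory, although the sketch has only been exercised on the small sample.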
The SHP file for broadband provision contained 29,088 unique wireless coverages in the United States, but these coverages were composed of thousands of individual and/or multipart polygons. Multipart polygons refer to coverages that consist of non-contiguous (i.e., spatially separate) components that are stored as a single feature. For example, T-Mobile's wireless coverage in Ohio consists of two unique coverages, one with a maximum download speed of 4 and a maximum upload of 2 and the other with a maximum download of 6 and a maximum upload of 4. Other providers in the state have up to four different coverages. To put this in perspective, NBM records in the state of Montana included
Fig. 2. Wireless broadband coverages for the state of Ohio, December 2010 (map labels: Cleveland, Cincinnati; scale bar in kilometers). Source: USDOC/NTIA (2010). U.S. Dept of Commerce, National Telecommunications and Information Administration, State Broadband Initiative (SHP format December 30, 2010).
16,258 unique coverages for Montana Internet Corporation alone. While there was certainly some variation in speed between these coverages, a closer examination of the data for Montana suggests that the mountainous terrain throughout the state helped generate thousands of non-contiguous patches of wireless coverage. Given this data ecosystem within the NBM, analysts are basically forced to make a choice between: (a) managing 50 million block records in a CSV file and determining a way to aggregate these data to alternative administrative units such as block groups, congressional districts or another geographic unit without double counting provision instances, or (b) developing a strategy for handling 29,088 geographic coverages to accomplish the same task. Considering that the CSV file took nearly 40 minutes to open in SPSS (while caching locally) on a sixteen-core Intel Xeon system with 8 gigabytes of memory, the latter option seems sensible, but is certainly not the only option. One major advantage to dealing with geographic coverages in the NBM is that they served as
Fig. 3. Marginal wireless broadband coverage for a census block in rural Ohio (legend: Verizon 1, Scioto Wireless 1, Scioto Wireless 2, Block 390019901001000; scale bar in kilometers). Source: USDOC/NTIA (2010). U.S. Dept of Commerce, National Telecommunications and Information Administration, State Broadband Initiative (SHP format December 30, 2010).
Fig. 4. Aggregation strategy for NBM wireless broadband data. Flowchart steps: (1) start with the wireless NBM data (geographic coverages by state); (2) generate a unified wireless broadband provision coverage layer (ArcGIS Merge command); (3) disentangle multipart provider polygons (ArcGIS Explode command); (4) repair polygon geometry (automated inspection of each feature in a wireless coverage for geometry problems; upon discovery of a geometry problem, a relevant fix is applied); (5) reconstruct wireless coverages (ArcGIS Dissolve command); (6) check whether providers are consistent and, if not, apply manual record correction; (7) compile unique wireless coverages (ArcGIS Dissolve command); (8) if aggregating to a Census unit (block groups, tracts, etc.), evaluate the geographic intersection (ArcGIS Assign Data by Spatial Location command, e.g., complete containment, intersection, etc.); (9) end.
the foundation for determining broadband provision to Census blocks in the CSV file. Thus, choice (b) will provide more control over the raw NBM data for the rearticulation process to alternative geographic units, although there are some computational complexities that make this option challenging as well. These will be discussed in the next section.
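For comparison, option (a) can be sketched in a few lines when the target unit nests within the Census hierarchy. The Python below is a minimal illustration, not the procedure used in this paper: it relies on the fact that a 15-digit Census block identifier begins with the 12-digit identifier of its block group, and it stores providers in a set so that a provider serving several blocks in the same block group is counted once. The function names and the sample records (including the second block and the providers labeled A, B and C) are hypothetical.

```python
from collections import defaultdict


def block_group_of(block_fips):
    """Return the 12-digit block group identifier for a 15-digit block FIPS
    (state 2 + county 3 + tract 6 + block 4; the first block digit is the
    block group)."""
    if len(block_fips) != 15 or not block_fips.isdigit():
        raise ValueError(f"expected a 15-digit block FIPS, got {block_fips!r}")
    return block_fips[:12]


def providers_by_block_group(records):
    """Count unique providers per block group from (block_fips, provider)
    pairs, without double counting repeated provision instances."""
    groups = defaultdict(set)
    for block_fips, provider in records:
        groups[block_group_of(block_fips)].add(provider)
    return {bg: len(names) for bg, names in groups.items()}


# Hypothetical records: provider A appears in two blocks of one block group.
SAMPLE = [
    ("391336007032011", "A"),
    ("391336007032015", "A"),
    ("391336007032015", "B"),
    ("391390030022004", "C"),
]

if __name__ == "__main__":
    print(providers_by_block_group(SAMPLE))
```

This prefix approach only works for units that nest within the Census hierarchy (block groups, tracts, counties). Units such as school or legislative districts do not share identifiers with blocks, which is one reason the geographic coverages offer more control.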
4.1. Deconstructing NBM wireless coverages
Fig. 4 details the multistep process used for rearticulating the NBM wireless data to alternative economic units for policy and spatial econometric analysis. The purpose of this subsection is to provide
some contextual details on the process, highlighting several potential pitfalls for analysts when attempting to compile a comprehensive wireless database using NBM data. The first step of this process is to merge all of the separate SHP files (one for each state) published by the NBM to create a single geographic coverage for the United States. By creating a single, master database of wireless broadband provision for the U.S., one will not need to repeat subsequent steps for each state individually.4 The end product is a GIS shapefile that consists of all fifty states (and the District of Columbia) and their associated wireless coverages. The second step is to explode each of the individual wireless coverages associated with each provider in the master database. This serves two purposes. First, because the raw NBM data consisted of hundreds (and sometimes thousands) of individual coverages for the same provider in certain states (e.g., Montana), not accounting for these duplicative, multipart polygons could create aggregation problems (i.e., double counting) later in this process. Second, this will allow analysts to create a single, nationwide coverage for each provider in the NBM database (e.g., Verizon), rather than multiple coverages from single providers in all 50 states. In a nutshell, the explosion process will help reduce data redundancy and improve the geographic consistency of wireless provision information. The purpose of the third step is to improve the geometric characteristics of each provider's wireless coverage area. In this context, it is important to remember that the shapefile (.shp) is an open format to which many software packages write. As noted by ESRI (2012), errors are common because of bugs in the software or programs not following the documented specification of the shapefile format. 
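The merge and explode steps can be illustrated without a GIS. The Python sketch below is a simplified stand-in for the ArcGIS commands, not a reproduction of them: each coverage is modeled as a provider name, a state and a tuple of polygon rings, merging concatenates the state layers, and exploding turns every part of a multipart feature into its own single-part feature. The Coverage class, the helper names and the toy square polygons are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Ring = Tuple[Tuple[float, float], ...]


@dataclass(frozen=True)
class Coverage:
    """One wireless coverage feature; parts holds one ring per polygon part."""
    provider: str
    state: str
    parts: Tuple[Ring, ...]


def merge_layers(state_layers: Iterable[Iterable[Coverage]]) -> List[Coverage]:
    """Concatenate per-state layers into one national layer (merge step)."""
    return [feature for layer in state_layers for feature in layer]


def explode(features: Iterable[Coverage]) -> List[Coverage]:
    """Split multipart features so that each output feature has one part."""
    return [
        Coverage(f.provider, f.state, (part,))
        for f in features
        for part in f.parts
    ]


def square(x: float, y: float) -> Ring:
    """Toy unit square used only to build sample geometry."""
    return ((x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1), (x, y))


# Hypothetical layers: a three-part coverage in one state, one part in another.
montana = [Coverage("Provider A", "MT", (square(0, 0), square(5, 5), square(9, 9)))]
ohio = [Coverage("Provider A", "OH", (square(20, 20),))]
single_parts = explode(merge_layers([montana, ohio]))

if __name__ == "__main__":
    print(len(single_parts))
```

Working with single-part features makes it possible to inspect and repair each polygon on its own before the parts are recombined by provider.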
Because the NBM consists of data collected from a diverse range of agencies across 50 states, bugs and errors in the generated shapefiles are to be expected. Therefore, one way to ensure the objective evaluation of wireless broadband provision when using NBM geographic coverages is to mend geometric errors in the database. This is an automated process that both identifies and repairs geometric inconsistencies. Specifically, ArcGIS version 10, a commercial geographic information system, was used to identify and repair errors in the wireless coverages.5 In total, 181 errors in wireless coverage geometry were found for the 50 states and the District of Columbia and all of them were repaired successfully.6 Once the geometry was repaired for each of the wireless coverages, individual coverages were re-assimilated into a single coverage for each provider in the United States. Specifically, the dissolve command in ArcGIS allowed each of the individual polygons associated with a provider to be compiled into a single polygon based on the operating carrier's name. Although using provider names is not an ideal option for conducting this type of geoprocessing, there is a compelling reason for this approach. Surprisingly, the wireless NBM database does not contain a column that assigns a unique identification number to individual providers. Worse, although the FCC Registration Number (FRN) should serve as a unique identifier for each provider, it does not. For example, in the state of Ohio, there are seventeen providers that have an FRN code of 9999.7 Holding company number (HOCONUM) is not a viable option either because it is not included as a field in the wireless coverage shapefile.
4 If the study area of interest consists of a single state, this step is unnecessary.
5 For more details on the identification and repair process for geometric errors in ArcGIS, see http://tinyurl.com/czggd4n.
6 The majority of these errors were polygon “self intersections,” where areas of overlap within a polygon are dissolved. Other fixes included the removal of null or empty geometries and incorrect ring ordering in the polygon (where its boundaries are mixed clockwise and counterclockwise orientations). For more details and a general overview of geometric errors in spatial data coverages, see Ubeda and Egenhofer (1997).
7 Wilkshire Communications, Wabash Communications, Slane Telecom, Skymax Broadband, SAA bright.net, Redbird Internet Services, RAA Services, Mango Bay Communications, LightSpeed Technologies, Jenco Wireless, g wireless, Coyote Wireless Broadband, Champaign Telephone, Blu Sky Wireless, Access Ohio Valley, Computers4U, and Avolve.
As a result,
the only columns that are remotely close to being unique in the NBM database are PROVNAME and DBANAME. For this paper, PROVNAME was utilized to ensure that providers doing business with different names in a region were not double counted (e.g., Clearwire Corporation as “Clearwire” and/or “Clear”). Alas, using the PROVNAME column was difficult due to numerous spelling errors and typos. For example, there were many instances where “T-Mobile USA, Inc” was entered as “T-Mobile USA Inc” (missing a comma). Similar errors for other providers were also found. Individually, these errors are not hugely problematic, but collectively, they all required correction before the final unique wireless coverages were compiled for the U.S. Once the manual process of fixing errors in the PROVNAME column was complete, the ArcGIS dissolve command was used for the final time to compile the unique wireless coverages for each provider in the United States. The end product is a shapefile that consists of 819 unique wireless coverages for the U.S. that are geographically consistent, geometrically correct and free of duplicate entries.⁸ The final wireless coverage map for the United States includes all terrestrial fixed wireless (licensed and unlicensed) as well as terrestrial mobile wireless (Fig. 5). Furthermore, not only is it an exact cartographic match to the map displayed by the National Broadband Map site for 2010 data, but it also consists of a radically simplified underlying database with a single coverage for each unique provider. From this point, the aggregation process is relatively straightforward. For example, to count the number of unique wireless broadband providers for each block group in the United States, a commercial GIS must evaluate individual wireless coverage intersections (n = 819) with each of the block groups (n = 207,507), yielding nearly 170 million combinations.
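The counting step described above can be sketched in a few lines. This is a minimal illustration, not the ArcGIS workflow used in the paper: it assumes that each coverage and block group is reduced to an axis-aligned bounding box (real NBM coverages are arbitrary polygons), and the names `Box`, `intersects`, `count_providers`, the sample providers and the block group identifiers are all hypothetical. The last line reproduces the candidate-pair arithmetic from the text.

```python
from typing import Dict, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a polygon (a simplifying assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes overlap or touch."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def count_providers(coverages: Dict[str, Box],
                    block_groups: Dict[str, Box]) -> Dict[str, int]:
    """Count the providers whose coverage intersects each block group."""
    counts = {}
    for bg_id, bg_box in block_groups.items():
        # Every coverage is tested against every block group (brute force).
        counts[bg_id] = sum(intersects(cov, bg_box) for cov in coverages.values())
    return counts


# Hypothetical dissolved coverages, one per provider.
coverages = {
    "Provider A": Box(0, 0, 10, 10),
    "Provider B": Box(8, 8, 20, 20),
    "Provider C": Box(30, 30, 40, 40),
}
# Hypothetical block groups.
block_groups = {
    "BG-1": Box(1, 1, 5, 5),
    "BG-2": Box(9, 9, 12, 12),
    "BG-3": Box(50, 50, 60, 60),
}

provider_counts = count_providers(coverages, block_groups)

# Candidate pairs for the national run: coverages x block groups.
candidate_pairs = 819 * 207_507  # 169,948,233, i.e. nearly 170 million
```

Because every coverage is compared with every unit, the number of tests in this sketch grows with the product of the two counts; production GIS software typically reduces the work with spatial indexing, which is not shown here.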
Fortunately, ArcGIS is relatively efficient at this task, although it does take time and resources. The entire process required 23 h to complete on a 24-core IBM server with 16 gigabytes of RAM. It is important to note that computational effort increases/decreases linearly with the number of administrative units. Given these computational challenges, one must ask whether geoprocessing 170 million record combinations has any advantages over processing 50 million records in standard tabular format. Recall that the primary motivation for the spatial deconstruction of the wireless provision data is the added flexibility and control it offers for data aggregation to alternative units. Again, once the process is complete, one can aggregate to block groups, tracts, ZIP codes, counties, school districts, state legislative districts or any other unit of interest. Further, because this process maintains provider coverages in their native format, it allows for geometric inconsistencies to be identified and enables analysts to explore alternative coverage operators. Specifically, there are a number of options for evaluating the presence of wireless coverage, including operators for intersection, contain, within, and closest functions (Table 3). For the purposes of this paper, an intersection rule was applied. That is, if a wireless coverage shape intersected a block group, the block group was considered “covered” by the wireless provider. This is a relatively liberal definition of coverage that ensures the inclusion of any provider with a presence in a census block group. Rightly or wrongly, this is essentially how the NBM captures wireless provision (recall the example illustrated in Fig. 3). A stricter interpretation of wireless provision would be to consider a block group covered only when a provider's geographic service area completely contains the block group in question.
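To make the difference between the liberal and strict rules concrete, the sketch below evaluates three of the operators summarized in Table 3 for a single coverage and block group. It is an illustration under stated assumptions, not the ArcGIS implementation: geometries are simplified to rectangles given as `(xmin, ymin, xmax, ymax)`, and the function names and sample coordinates are hypothetical. The helper `overlap_share` is not one of the Table 3 operators; it is added only to show how small the overlap can be when the intersect rule still reports coverage.

```python
from typing import Tuple

# (xmin, ymin, xmax, ymax): rectangles stand in for polygons (an assumption).
Rect = Tuple[float, float, float, float]


def intersect(cov: Rect, unit: Rect) -> bool:
    """Coverage fully or partially overlaps the unit (the rule applied in the paper)."""
    return (cov[0] <= unit[2] and unit[0] <= cov[2]
            and cov[1] <= unit[3] and unit[1] <= cov[3])


def contain(cov: Rect, unit: Rect) -> bool:
    """The unit falls inside the coverage; shared boundaries are permitted."""
    return (cov[0] <= unit[0] and cov[1] <= unit[1]
            and cov[2] >= unit[2] and cov[3] >= unit[3])


def centroid_in(cov: Rect, unit: Rect) -> bool:
    """The centroid of the unit falls inside the coverage or on its boundary."""
    cx = (unit[0] + unit[2]) / 2
    cy = (unit[1] + unit[3]) / 2
    return cov[0] <= cx <= cov[2] and cov[1] <= cy <= cov[3]


def overlap_share(cov: Rect, unit: Rect) -> float:
    """Fraction of the unit's area lying under the coverage (illustrative helper)."""
    width = min(cov[2], unit[2]) - max(cov[0], unit[0])
    height = min(cov[3], unit[3]) - max(cov[1], unit[1])
    unit_area = (unit[2] - unit[0]) * (unit[3] - unit[1])
    return max(width, 0.0) * max(height, 0.0) / unit_area


# A large hypothetical block group and two hypothetical coverages.
big_block_group = (0.0, 0.0, 100.0, 100.0)
edge_coverage = (95.0, 0.0, 150.0, 100.0)     # spills over the eastern edge only
full_coverage = (-10.0, -10.0, 110.0, 110.0)  # blankets the whole block group

edge_result = (
    intersect(edge_coverage, big_block_group),    # True: counted under the liberal rule
    contain(edge_coverage, big_block_group),      # False: dropped under the strict rule
    centroid_in(edge_coverage, big_block_group),  # False
)
full_result = (
    intersect(full_coverage, big_block_group),
    contain(full_coverage, big_block_group),
    centroid_in(full_coverage, big_block_group),
)
edge_share = overlap_share(edge_coverage, big_block_group)
```

In this toy case the edge coverage touches only a thin strip of the block group, yet the intersect rule counts the provider, while the contain and centroid rules do not.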
This stricter rule would effectively eliminate the partial coverage questions associated with providers illustrated in Fig. 3 and provide an interesting, alternative perspective on wireless ubiquity for the U.S.
⁸ For the purposes of this paper, satellite providers were excluded from the final database. There are two reasons for this. First, many of the geographic coverages associated with these providers were complete state boundaries. Second, because some states included these boundaries in their report to the NBM and others did not, the satellite broadband provision data in the NBM are highly inconsistent. These data uncertainties combined with the relatively small market for satellite broadband make its exclusion relatively low impact when evaluating provision nationally.
Fig. 5. Simplified wireless broadband coverage map derived from NBM data: December 2010.
5. Results and discussion
Fig. 6 illustrates the derived representation of wireless broadband provision in the United States at the block group level by standard deviation. Specifically, this map is the final product of all the error checks and GIS manipulations outlined previously. The descriptive statistics for wireless broadband provision by block group are: max = 15, min = 0, average = 4.82 and a standard deviation of 1.73. The corresponding geographic patterns are not particularly surprising; wireless broadband provision options are strongly tied to major metropolitan areas and transportation corridors. However, there are several interesting anomalies in this wireless landscape. First, there appears to be an agglomeration of providers along some state borders (e.g., Idaho and Washington; California and Nevada). In the case of Idaho and Washington, there are a number of regional service hubs in this corridor, including the university cities of Pullman (Washington State University) and Moscow (University of Idaho) as well as
the river ports of Clarkston, Washington and Lewiston, Idaho. All of these cities are connected by major state highways, and are served by numerous wireless providers.
Table 3 Overview of selected spatial selection operators.
Operator | Explanation | Remarks
Within a distance of | Creates buffers around a source feature and returns all the features intersecting the buffer zones. | Select cities within 100 m of a river.
Within | The geometry of the target feature must fall inside the geometry of the source feature. Overlapping boundaries are permitted. | The state of Minnesota can be selected even though it shares a boundary with the United States.
Completely within | To be selected, all parts of the target features must fall inside the geometry of the source feature. Touching of boundaries is not allowed. | All counties within Ohio, but not spatially adjacent to its border, would be selected.
Contain | The geometry of the source feature must fall inside the geometry of the target feature, including its boundaries. | Inverse of the “within” operator.
Completely contain | All parts of the target feature must completely contain the geometries of the source feature. Touching of boundaries is not allowed. | Inverse of the “completely within” operator.
Centroid in | A target feature is selected if the centroid of its geometry falls into the geometry of the source feature or on its boundaries. | Works for points, lines and polygons.
Intersect | Returns any feature that either fully or partially overlaps the source feature. | Can be used for point, line and polygon coverages.
Fig. 6. Rearticulated wireless broadband provision by block group (Mean = 4.82).
In the case of California and Nevada, there are two facets of local geography at work. First, the presence of Lake Tahoe, numerous ski resorts, Interstate 80 and the cities of Reno, NV, Sparks, NV and Truckee, CA certainly enhances wireless provision in the region. However, the size and geographic extent of block groups in this region also impact provider counts. As highlighted by Grubesic (2008), larger administrative units are prevalent in the western states and issues of broadband ubiquity and coverage are impacted by these regional geographic characteristics. Specifically, the enormous block group (~1162 km²) that basically mimics the boundary of Washoe County, Nevada (oriented north/south) on the western border of Nevada has nine wireless providers and is between 1.5 and 2.5 standard deviations above average for wireless broadband provision in the United States. Much like the example illustrated in Fig. 3, this block group is collecting marginal wireless coverage from a variety of sources: some spilling in from California, some from Oregon and some from nearby areas in Nevada. For verification purposes, one only needs to consult Fig. 5 to see that this region is a functional
dead-zone for wireless coverage, yet the aggregation process has attributed 9 providers to this block group. There are other instances of these geographic aggregation errors for Death Valley, CA, the National Grasslands in Northern Colorado and the Grand Canyon (Arizona and Nevada border) among others. All of these large block groups have collected marginal coverage from numerous providers in/around their borders.⁹ Clearly, this presents a significant problem for policy and spatial econometric analysis of wireless broadband provision in the United States. While the wireless provider counts illustrated in Fig. 6 are accurate, it would be wrong to suggest that these marginally covered block groups represent “core” wireless regions. If anything, the bulk of these areas would be at the opposite end of the continuum: sparsely covered with poor quality of service.
⁹ This may also be a function of residual geometric errors in the database. According to the Census Bureau, Cartographic Boundary Files are vector files generalized from 1:100,000-scale TIGER Line Files and are designed for use at scales from about 1:500,000 to 1:5,000,000 (Census Bureau, 2010). At 1:100,000, the spatial accuracy of block groups would be ±166.67 ft. Thus, non-matching geographic base files, unintended coverage spillover and marginal wireless coverage could all be contributing to the derived results.
Given this landscape and the problems highlighted with the National Broadband Map data, where does this leave the National Broadband Plan and telecommunications policy? First, it is important to note that these aggregation errors are not fatal. Where spatial econometric analysis is concerned, the unified provider coverages that are displayed in Fig. 5 can be used to tag regions exhibiting marginal geographic coverage. For example, cross-sectional econometric models that use economic units such as block groups can differentiate geographic space with spatial regime (i.e., dummy) variables. In this instance, if one is interested in discriminating between large blocks with marginal coverage and those with more complete coverage, one can generate a relevant binary assignment variable (0, 1) as a control mechanism to differentiate coverage variation over geographic space (Anselin, 1992; Grubesic, 2003; Mack & Grubesic, 2009). Second, one could alter the coverage operator. Instead of using intersect, it is possible that some
type of contain or within function may be more appropriate for portions of the West. Alternatively, one could simply remove these outliers/anomalies from the database prior to statistical analysis. This is a reasonable strategy for two reasons. Not only does it eliminate bias from statistical models, but it also reflects a hard truth: wireless providers are unlikely to make infrastructure investments in these regions. While the removal of these areas is couched in an analytical context, it does hint at the practical problems in 1) developing a ubiquitous wireless broadband network for the United States, and 2) obtaining good data for benchmarking network development and generating evidence-based policy using the data that are available. Both problems will be discussed in turn. Where ubiquity is concerned, the architecture and capabilities of wired and wireless access networks are slow to converge. As noted earlier, one major problem has been the limitations associated with wireless spectrum availability and its associated capacity (Lehr & Chapin, 2010). However, with additional TV spectrum being made available for broadband (Benton Foundation, 2010), the landscape is about to change. Broadband convergence aside, it is also important to acknowledge that private firms have difficulty in adequately providing “public goods” such as communications infrastructure in a spatially balanced manner (Grubesic, 2006). Simply put, private firms have no interest in making huge infrastructure investments in regions where populations are sparse and there is virtually no demand for terrestrial or mobile wireless services. As a result, it is possible that leaving portions of the United States unconnected is a viable national strategy given the tremendous expense associated with providing wireless (and wireline) broadband to rural and remote locations.
This is undoubtedly a controversial view, but one that needs to be considered given the economics associated with infrastructure provision. It is possible that satellite broadband technologies can fill in the gaps, although the spotty data on satellite service provision, availability, geographic reach and quality of service currently prevent analysts from making any definitive statements on this alternative. This is not to say, however, that broadband options in rural areas are a vain hope. Broadband mesh networks are already providing
service for many of the more remote regions of the country, including those with challenging local terrain. Mesh systems are fully wireless and employ multihop communications to forward traffic to/from wired internet access points (Bruno, Conti, & Gregori, 2005; Shillington & Tong, 2011). These multihop systems effectively eliminate the need for “line-of-sight” connections and allow regions with mountainous (or just hilly) topography to receive wireless broadband. For example, in portions of southwestern Illinois, where the terrain is rolling and line-of-sight wireless systems are impractical, the Illinois Rural Electric Cooperative has employed a wireless broadband mesh network that leverages internet connections from the utility's power substations to deliver last mile connectivity to co-op members (IREC, 2012). The last mile connections are essentially WiFi, but the installed infrastructure from SkyPilot Networks helps mitigate interference, and subscribers are provided hardware that supports a 5 GHz network signal at up to 7 miles from a node with dedicated bandwidth (SPN, 2012). Pricing is reasonable, with a basic 512 k upload and 128 k download package for $27 per month and a 2 MB download and 512 k upload for $60 per month. Of course, activation and installation fees apply (~$250). Given the results of this paper and previous empirical work regarding the National Broadband Map (Grubesic, 2012), it is also clear that the current iteration of the NBM wireless provision data leaves much to be desired. Aside from the sheer size of the NBM database and the inability of average desktop computers and software to deal with 50 million records efficiently, coverage geometries were determined to be problematic, and there were thousands of duplicate wireless records and hundreds of errors in the database (misspelled provider names, lack of unique identification numbers).
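Two of the database problems noted above, identifiers shared by several providers and inconsistent spellings of provider names, can be flagged with simple tabular checks. The sketch below is a hedged illustration rather than the procedure used in the paper, where name corrections were made manually. The field names `FRN` and `PROVNAME` come from the NBM database as described in the text; the helper functions, the normalization rule and the placeholder FRN value `0000000000` are assumptions made for the example.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def normalize_name(name: str) -> str:
    """Collapse case, punctuation and extra whitespace so near-identical names match."""
    cleaned = re.sub(r"[^\w\s-]", "", name.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def shared_ids(records: Iterable[dict], id_field: str = "FRN",
               name_field: str = "PROVNAME") -> Dict[str, Set[str]]:
    """Return identifiers that are attached to more than one distinct provider."""
    names_by_id = defaultdict(set)
    for rec in records:
        names_by_id[rec[id_field]].add(normalize_name(rec[name_field]))
    return {key: names for key, names in names_by_id.items() if len(names) > 1}


def name_variants(records: Iterable[dict],
                  name_field: str = "PROVNAME") -> Dict[str, List[str]]:
    """Group raw spellings that normalize to the same provider name."""
    variants = defaultdict(set)
    for rec in records:
        variants[normalize_name(rec[name_field])].add(rec[name_field])
    return {key: sorted(raw) for key, raw in variants.items() if len(raw) > 1}


# Sample rows: provider names and the 9999 code are taken from the text;
# the T-Mobile FRN is a placeholder, not a real registration number.
records = [
    {"PROVNAME": "T-Mobile USA, Inc", "FRN": "0000000000"},
    {"PROVNAME": "T-Mobile USA Inc", "FRN": "0000000000"},
    {"PROVNAME": "Wilkshire Communications", "FRN": "9999"},
    {"PROVNAME": "Wabash Communications", "FRN": "9999"},
]

non_unique_frns = shared_ids(records)   # FRN values that cannot serve as a key
spelling_variants = name_variants(records)  # names needing manual review
```

Checks of this kind only surface candidates for review; deciding whether two spellings refer to the same carrier still requires the analyst's judgment.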
Not only do these problems impact one's ability to conduct reliable analysis, they also mean that there is no convenient path to developing meaningful evaluations of the National Broadband Plan, or of broadband telecommunications policy in general, using NBM data. Again, while data resolution at the census block is certainly a welcome change from ZIP codes, most of the relevant demographic, socio-economic and firm-level data are only available for larger economic units (e.g., block groups and tracts), forcing analysts to develop aggregation routines to make the NBM data truly accessible. Finally, given the limitations of these data, can any meaningful policy analysis be generated for wireless broadband provision in the United States? This is a difficult question to answer and the results of this paper are meant neither to condemn nor to condone inferences that may be drawn from the NBM data. Although the wireless coverages provided by the NBM are flawed and are extremely difficult to use (both in tabular and GIS format), they appear to give a relatively viable snapshot of where wireless broadband is available. The subtleties associated with measures like speed and quality of service are more difficult to assess. For example, reconsider Fig. 3. If Verizon and Scioto Wireless are offering different speeds within their respective coverages, a subscriber using a mobile device (in a car) who crosses one of the coverage areas will not necessarily see an immediate and/or instantaneous drop-off in speed or service quality. The influence of terrain, tower loads and differences in mobile device quality can all impact service performance. Finally, the lack of pricing data on wireless broadband from the NBM is a limitation that is shared with its companion, wireline broadband data. Pricing data must be made available for policy evaluation. Without them, policy analysis will continue to be hampered by information asymmetries (Grubesic, 2012).
6. Conclusion
The purpose of this paper was to outline a methodological framework for rearticulating the raw, wireless broadband provision data from the NBM to more meaningful administrative units for policy evaluation, spatial econometric analysis and geographic visualization.
Specifically, provision data were aggregated from Census blocks to block groups using a multistep process that leverages the data manipulation abilities of a geographic information system. However, the true utility of the outlined approach is the flexibility to rearticulate the wireless data to any administrative unit. A variety of data consistency checks for ensuring completeness and data aggregation integrity were also detailed. Results suggest that the National Broadband Map is fraught with data inconsistencies but that, if tracked carefully, these can be managed and mitigated to develop a more accurate representation of wireless broadband provision in the United States. It is hoped that the outlined strategy will make the NBM data more accessible to those interested in telecommunications policy evaluation, disparities in communications availability and the growing information economy. Further, although this paper is somewhat critical of the NBM, it is also important to note that these data, while imperfect, represent a major improvement when compared to previous efforts.
References
Abichar, Z., Peng, Y., & Chang, J. M. (2006). WiMax: The emergence of wireless broadband. IT Professional Magazine, 8, 44–48.
Agashe, P., Rezaiifar, R., & Bender, P. (2004). CDMA2000® high rate broadcast packet data air interface design. IEEE Communications Magazine, 42, 83–89.
Anselin, L. (1992). Spatial data analysis with GIS: An introduction to application in the social sciences. Technical Report 92-10. Santa Barbara: National Center for Geographic Information and Analysis, University of California.
Benton Foundation (2010). Opening TV spectrum to new wireless broadband services. http://benton.org/node/45917
Bruno, R., Conti, M., & Gregori, E. (2005). Mesh networks: Commodity multihop ad hoc networks. IEEE Communications Magazine, 43(3), 123–131.
Census Bureau (2010). Cartographic boundary files. http://www.census.gov/geo/www/cob/scale.html
Connect Ohio (2011). Official April 2011 Update Submission to the National Telecommunications and Information Administration Under the State Broadband Data and Development Grant Program for the State of Ohio. Retrieved from http://www2.ntia.doc.gov/files/broadband-data/All_Grantees_Methodologies_December-2010.zip
Dahlman, E., Parkvall, S., Skold, J., & Beming, P. (2008). 3G evolution: HSPA and LTE for mobile broadband (2nd ed.). Academic Press.
Environmental Systems Research Institute [ESRI] (2012). Retrieved from http://resources.esri.com/help/9.3/arcgisengine/java/gp_toolref/data_management_toolbox/checking_and_repairing_geometries.htm
FCC (2004). Availability of advanced telecommunications capability in the United States. Washington, D.C.: Federal Communications Commission. Retrieved from http://www.fcc.gov/broadband/706.html
FCC (2010a). Mobile broadband: The benefits of additional spectrum. Federal Communications Commission. Retrieved from http://hraunfoss.fcc.gov/edocs_public/attachmatch/DOC-302324A1.pdf
FCC (2010b). Connecting America: The national broadband plan. Federal Communications Commission. Retrieved from http://transition.fcc.gov/Daily_Releases/Daily/Business/2010/db0720/FCC-10-129A1.pdf
GAO (2010). National broadband plan reflects the experiences of leading countries, but implementation will be challenging. Report: United States Government Accountability Office: Report to Congressional Requesters. http://www.gao.gov/products/GAO-10-825
Greenstein, S. (2007). Data constraints and the internet economy: Impressions and imprecision. NSF/OECD meeting on factors shaping the future of the internet. Retrieved from http://www.oecd.org/dataoecd/5/54/38151520.pdf
Grubesic, T. H. (2003). Inequities in the broadband revolution. The Annals of Regional Science, 37, 263–289.
Grubesic, T. H. (2006). A spatial taxonomy of broadband regions in the United States. Information Economics and Policy, 18, 423–448.
Grubesic, T. H. (2008). Spatial data constraints: Implications for measuring broadband. Telecommunications Policy, 32, 490–502.
Grubesic, T. H. (2012). The U.S. national broadband map: Data limitations and implications. Telecommunications Policy, 36, 113–126.
Illinois Rural Electric Cooperative [IREC] (2012). http://www.e-co-op.com/
International Telecommunications Union [ITU] (2008). Requirements related to technical performance for IMT-Advance radio interface(s). http://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-M.2134-2008-PDF-E.pdf
Jaeger, P. T., Bertot, J. C., McClure, C. R., & Langa, L. A. (2006). The policy implications of internet connectivity in public libraries. Government Information Quarterly, 23(1), 123–141.
Kong, D. T., Liang, P.-Y., & Chang, Y. (2009). Wireless Broadband Networks. Wiley: Hoboken, NJ.
Lehr, W., & Chapin, J. M. (2010). On the convergence of wired and wireless access network architectures. Information Economics and Policy, 22, 33–41.
Mack, E. A., & Grubesic, T. H. (2009). Forecasting broadband provision. Information Economics and Policy, 21, 297–311.
McClure, C. R., Jaeger, P. T., & Bertot, J. C. (2007). The looming infrastructure plateau? Funding, connection speed, and the ability of public libraries to meet the demand for free internet access. First Monday, 12(12–3).
Middleton, C. A., & Bryne, A. (2011). An exploration of user-generated wireless broadband infrastructures in digital cities. Telematics and Informatics, 28, 163–175.
Middleton, C. A., & Given, J. (2011). The next broadband challenge: Wireless. Journal of Information Policy, 1, 36–56.
NTIA (2011). National broadband map: Data comparison methodology. National Telecommunications and Information Administration. Retrieved from http://tinyurl.com/3sjv37u
Sawada, M., Cossette, D., Wellar, B., & Kurt, T. (2006). Analysis of the urban/rural broadband divide in Canada: Using GIS in planning terrestrial wireless deployment. Government Information Quarterly, 23, 454–479.
Shillington, L., & Tong, D. (2011). Maximizing wireless mesh network coverage. International Regional Science Review, 34, 419–437. http://dx.doi.org/10.1177/0160017610396011
Sirbu, M., Lehr, W., & Gillett, S. (2006). Evolving wireless access technologies for municipal broadband. Government Information Quarterly, 23, 480–502.
Sky Pilot Networks [SPN] (2012). Rural utility delivers broadband services using scalable wireless mesh. http://tinyurl.com/86cl7e2
Northern Sky Research [NSR] (2010). Satellite Broadband Holding Its Own. Now at 1 Million+ Subscribers. http://tinyurl.com/7h4kang
Thompson, H. G., & Garbacz, C. (2011). Economic impacts of mobile versus fixed broadband. Telecommunications Policy, 35, 999–1009.
Ubeda, T., & Egenhofer, M. J. (1997). Topological error correcting in GIS. In M. Scholl, & A. Voisard (Eds.), Advances in spatial databases. Lecture Notes in Computer Science. http://dx.doi.org/10.1007/3-540-63238-7
USDOC/NTIA (2010). US Dept of Commerce, National Telecommunications and Information Administration, State Broadband Initiative. SHP format December 30, 2010.
Vaughan-Nichols, S. J. (2004). Achieving wireless broadband with WiMax. Computer, 37, 10–13.
Zhang, M., & Wolff, R. S. (2004). Crossing the digital divide: Cost-effective broadband wireless access for rural and remote areas. IEEE Communications Magazine, 42, 99–105.
Tony H. Grubesic is an associate professor in the College of Information Science and Technology and Director of the Geographic Information System and Spatial Analysis Laboratory (GISSA) at Drexel University. His research and teaching interests are in geographic information science, regional development and public policy evaluation, critical infrastructure, geospatial intelligence and urban health disparities. Grubesic obtained a B.A. in Political Science from Willamette University, a B.S. in Geography from the University of Wisconsin-Whitewater, an M.A. in Geography from the University of Akron, and a Ph.D. in Geographic Information Science from the Ohio State University.