Regional Science and Urban Economics 31 (2001) 571–599 www.elsevier.nl / locate / econbase
An empirical test of geographic knowledge spillovers using geographic information systems and firm-level data

Scott J. Wallsten*

Stanford University and The World Bank, Stanford Institute for Economic Policy Research (SIEPR), 579 Serra Mall at Galvez St., Stanford, CA 94305, USA

Received 31 August 1999; accepted 25 September 2000

Abstract

Most research on economic geography focuses on large geographic areas, such as nations and states. I use a geographic information system and a firm-level dataset to explore agglomeration and spillovers at the firm level over discrete distances. I calculate the distance between each firm-pair to explore co-location, and use these calculations to devise a test of spillovers: is participation in the Small Business Innovation Research (SBIR) program, which provides R&D grants to small firms, a function of whether nearby firms win SBIR grants? I find that the number of other SBIR firms within a fraction of a mile predicts whether a firm wins awards, even controlling for regional, firm, and industry characteristics. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Spillovers; Agglomeration; GIS

JEL classification: R110; R120
*Tel.: +1-650-724-4371; fax: +1-786-513-5727. E-mail address: [email protected] (S.J. Wallsten).
0166-0462/01/$ – see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0166-0462(00)00074-0

1. Introduction

The economic success of areas such as California’s Silicon Valley has caused the study of economic geography to boom as researchers explore the extent, implications, and causes of industrial concentration. Researchers find that industrial clustering in regions is widespread (e.g. Krugman (1991a, 1998)) and greater than would be expected if geographic distributions were random (Ellison and Glaeser, 1997). Others find evidence of regional knowledge spillovers (e.g. Jaffe (1989) and Jaffe et al. (1993)). Unfortunately, data limitations have forced the existing literature to explore these issues only within large geographic units such as nations or states. As a result, while we know that industries cluster in certain regions and that firms located in the same regions benefit from knowledge spillovers, we know little about distances between individual firms within regions and whether distance matters for spillovers.

These limitations are unfortunate because some industries and firms cluster in areas much smaller than states or cities. Venture capital, for example, is famously concentrated: about a third of all venture capital in the United States originates in a 2-mile stretch of Sand Hill Road in Menlo Park, California. A company tracking local real estate markets notes that venture capitalists ‘all want to be within walking distance of each other, and that little stretch of Sand Hill Road is where you’ve got to be’ (Colliver, 1998).

Policy makers and private entities believe that benefits arise when firms locate very close together. Governments and large firms fund more than 550 business incubators (facilities that house tens or even hundreds of small firms) in North America (National Business Incubation Association, 1998). Nonetheless, we have not yet empirically explored the effects of tight concentrations of firms.

I use a computerized Geographic Information System (GIS) and firm-level data to explore economic geography over distances rather than within pre-defined geographic units. The paper has two primary components. Firstly, I develop a method of measuring spatial agglomeration at the firm level.
Using GIS location coordinates I calculate the distance between each firm-pair in a large dataset of small, high technology firms. For each firm I then create ‘density variables’, which measure the number of other firms within any radius. These variables allow a detailed look at how close together firms locate and how they are distributed over specific distances. Secondly, I use these variables and time variation in the data to devise and implement a test of spillovers. The test asks whether these small, high-technology firms are more likely to receive a government grant from the Small Business Innovation Research (SBIR) program if their neighbors received grants.1

1 SBIR is a government program that provides grants to small firms for R&D aimed at producing a commercially viable project. Firms submit proposals and the government chooses whether to fund them. See Wallsten (1998, 2000) or www.sba.gov/SBIR for more detail on the SBIR program.

The results are intriguing. Firstly, firms that participate in SBIR tend to locate very close together (almost 20% are within one-tenth of a mile of at least one other SBIR firm) and are not distributed uniformly over small distances. The fact that firms are co-located motivates a test of spillovers over distance. Empirical tests reveal that short distances matter: the number of neighboring firms in the SBIR program is a strong predictor of the observed firm’s participation, even when controlling for firm and regional factors that may also influence participation. Most interestingly, the number of other firms within one-tenth of a mile that won an SBIR grant in time t is a robust predictor of whether a firm will win an additional award in time t + 1. The number of SBIR firms more than one-half mile away, however, has no measurable impact.

The paper proceeds as follows. Section 2 discusses factors involved in industrial and firm agglomeration and why geographic aggregation can be problematic in economic geography research. Section 3 introduces the dataset I construct. Section 4 describes how the firms are located across states and MSAs. The section goes on to explain a new method of exploring firm agglomeration, which involves calculating distances between firms. Section 5 uses that information to implement the test of spillovers. Section 6 concludes.
2. Clustering, spillovers, and empirical problems caused by aggregating over large geographic areas It is well-known that industries concentrate in particular geographic areas. In the early nineteenth century, US manufacturing was concentrated in a small part of the Northeast and the Midwest. Historically, shoes were produced in Massachusetts and rubber in Akron, Ohio. Carpet producers are still disproportionately located in Dalton, Georgia, and jewelry producers around Providence, Rhode Island (Krugman, 1991a). Today, high technology firms concentrate in areas like Silicon Valley. Marshall (1920, as cited in Davenport, 1935) hypothesized three reasons for industry localization: benefits of a pooled labor supply, access to specialized inputs, and information flows between people and firms. These factors can generate positive feedback loops as additional concentration brings additional labor and other inputs as well as more people to share ideas (Arthur, 1994; Krugman, 1991b). The first two factors are regional — firms may benefit from common labor and input markets by co-locating in the same city or state. The relevant geographic size for knowledge spillovers is less clear. Some knowledge flows through mechanisms unrelated to geography, such as the Internet and journals. Other knowledge flows through mechanisms closely linked to geography. Firms are generally believed to benefit from locating near universities, for example. Jaffe (1989) finds that university research positively impacts corporate patenting on a state level. Firms and universities can benefit from each other in many ways. One way they might benefit is when firm employees and university faculty and students attend each others’ seminars. This knowledge-transfer mechanism probably best operates within cities or regions — geographic areas
small enough to facilitate low-cost seminar attendance.2 People are more likely to attend a weekly seminar if attendance requires only a few hours and little advance planning. Few are likely to travel extensively on a regular basis to attend a 2-hour seminar. They can also benefit by collaborating on research projects. Some types of collaboration require that researchers be geographically close enough to facilitate frequent face-to-face interactions. Some knowledge flows between firms are also regional in nature. Organized interactions may require that firms be within commuting distance, but not necessarily any closer. Jaffe et al. (1993) find empirical evidence of spillovers within nations, states, and MSAs.3 Others have attempted to explore the relationship between knowledge spillovers and clustering within these regions.4 Clearly, some spillovers and agglomerations occur within relatively large regions, and the literature has explored many aspects of these regional phenomena. Other spillovers and clusters, however, may occur within much smaller areas. Because data typically force researchers to aggregate up to large geographic units, we have been unable to study how firms co-locate over short distances and whether some spillover mechanisms operate within small areas. The following sub-section details some problems that arise when studies aggregate geographically.
2 Saxenian (1994) reports that Stanford University engineering faculty have an open invitation to seminars at Xerox’s Palo Alto Research Center and that Stanford faculty can make up a large share of the audience at these seminars.
3 Jaffe et al. (1993) find that ‘citations to domestic patents are . . . more likely to come from the same state and SMSA as the cited patents, compared with a ‘control frequency’ reflecting the pre-existing pattern of related research activity’.
4 Glaeser et al. (1991) find that ‘industries that are more heavily concentrated in the city than they are in the US as a whole’ grew more slowly from 1956 to 1987 than industries that were less heavily concentrated in the city. They conclude that geographic knowledge spillovers must be between industries rather than within industries. Audretsch and Feldman (1996a,b) test whether geographic knowledge spillovers can help account for geographic clustering of innovations within states. They find that innovations cluster spatially in industries that are R&D-intensive and employ a high degree of skilled labor. Because they assume that geographic knowledge spillovers are more prevalent in industries where new knowledge is important, they conclude that geographic knowledge spillovers cause spatial clustering. Harrison et al. (1996) explore how adoption of an automation technology in manufacturing plants is affected by county urbanization and the presence of other, similar, manufacturing establishments. They find that plants in urban areas are more likely to adopt the technology, but that the presence of other plants has no significant impact when controlling for plant size.

2.1. Limitations arising from aggregating up to large geographic areas

One problem arising from geographic aggregation is purely econometric. Aggregation introduces bias into any estimation by discarding firm-specific variation. Feige and Watts (1972) found, in an empirical exercise using data on banks, that aggregating up to states introduced so much bias that estimates became
almost meaningless. Firm-level data avoids this bias. Additional problems from geographic aggregation are more conceptual. Firstly, states and MSAs are often large and disparate. Jaffe (1989), critiquing his own work, notes that ‘though convenient, the use of states as the unit of observation is conceptually problematic for our purposes. Thinking of geographic spillovers as occurring similarly in Rhode Island and California strains credulity’. Even MSAs may be too large to capture some aspects of spillovers and clustering. Firms that are next-door neighbors may interact differently from firms on opposite sides of an MSA. Secondly, geographic aggregation prevents us from investigating different spillover mechanisms. Jaffe (1989) and Jaffe et al. (1993) note that we know little about knowledge ‘transport mechanisms’. Given the importance of knowledge flows to economic growth (Romer, 1986, 1990), we should try to understand how these mechanisms function. Some knowledge, for example, may flow through informal conversations. According to Jaffe (1989), ‘if . . . the (knowledge transfer) mechanism is informal conversations, then geographic proximity to the spillover’s source may be helpful or even necessary in capturing the spillover benefits’. Informal conversations are believed to be an important mechanism for transferring knowledge between small, high tech firms. Saxenian (1994), for example, reports that informal social network and communication are important sources of information among engineers in Silicon Valley. If informal conversations are, in fact, an important knowledge transfer mechanism, then some knowledge may flow more easily between firms located very close together than between firms located across town from each other. Studies that aggregate up to large geographic regions may not detect spillovers over small distances. But as Glaeser et al. 
(1991) noted, ‘(t)he cramming of individuals, occupations, and industries into close quarters provides an environment in which ideas flow quickly from person to person’. By calculating precise distances between firms this paper presents a way to explore clustering and spillovers within close quarters and begins to answer the Jaffe et al. (1993) question of whether there is a spillover ‘advantage to nearby firms’.
3. Data

The data consist of firms that received grants from the SBIR program and similar firms that did not. SBIR is a federal program that provides R&D grants to small, high-technology firms. Ten federal agencies participate in the program, funding a broad range of research, all with the goal of commercializing products in the civilian market. Because the program grants several thousand awards a year, it is a good mechanism for identifying small, high-tech firms. The SBIR group consists of 3784 firms that received at least one grant from 1993 through 1996. The non-SBIR group consists of 48,239 small, high-tech firms. This group represents firms in the American Business Disc CD-ROM database that have fewer than 100 employees and a primary Standard Industrial Classification (SIC) identical to one of the 16 most represented SICs among the SBIR group. Table 1 lists these SIC categories.5

Table 1
Standard industrial classification of non-SBIR firms

SIC | Description | Number of firms
3674 | Semiconductors and related devices | 855
7371-4 | Computers, system designers and consultants | 33,349
8711-11,56 | Engineers - consulting, research | 11,868
8731 | Commercial physical, and biological research (Including: 8731-01 Laboratories-R&D; -03 Solar energy R&D; -05 Human factors R&D; -06 Electronics R&D; -08 Pharmaceutical R&D; -21 Cryogenic R&D; -25 Lasers-communication R&D; -28 Chemical research; -98 Commercial physical research; -04 Medical R&D; -16 Plastics R&D) | 2167

Each observation contains a firm name, street address, state, and zip code. I matched each firm to a county through the firm’s zip code. I then added economic variables for the county in which the firm is located.6 The county economic variables come from the US Census Regional Economic Information System (REIS).

For the SBIR firms I also have the number of grants received and a written description of each funded project. Based on those project descriptions, I categorized the firms into seven ‘technology areas’ defined by the Small Business Administration, which tracks the SBIR program. Each technology area contains several sub-areas. Because a firm can receive funding for multiple research projects, its awards may fall into more than one sub-area. These are small firms, however, and a firm’s projects typically fall into only one of the seven major areas. When a firm’s projects did cross major technology areas, I placed it into the technology area that fit the majority of its awards. The technology areas are (1) computers, information processing, and analysis; (2) electronics; (3) materials; (4) mechanical performance of vehicles, weapons, and facilities; (5) energy conversion and use; (6) environment and natural resources; and (7) life sciences (primarily biotech). Table 2 lists the seven technology areas, their corresponding sub-areas, and the number of firms in each of the seven categories.
5 American Business Disc (ABD) compiles firms from listings in telephone book yellow pages, and confirms the information by calling each business. These firms come from the 1998 edition of ABD, meaning that the firms advertised in 1997.
6 Occasionally zip codes fall into more than one county. In that circumstance I aggregated the data from the two counties to create the relevant zip code data.
Table 2
Technology areas and subareas

Technology area | Subareas | Number of firms
Computer, information processing, analysis | Computer and communication systems; Information processing and management; Signal and image processing; Systems studies; Mathematical sciences | 920
Electronics | Microelectronics; Electronics device performance; Electronic equipment and instrumentation; Electromagnetic radiation / propagation; Microwave and millimetre wave electronics; Optical devices and lasers | 756
Materials | Advanced materials; Materials processing and manufacturing; Coatings, corrosion, and surface phenomena; Materials performance; Fundamentals and instrumentation | 462
Mechanical performance of vehicles, weapons, and facilities | Hydrodynamics; Aerodynamics; Acoustics; Mechanical performance of structures and equipment; Control; Mechanical measurements | 351
Energy conversion and use | Transport sciences; Propulsion / combustion technology; Large scale energy usage; Energy conversion / electric power | 212
Environment and natural resources | Ocean science; Atmospheric sciences; Water management; Earth sciences; Environment protection | 171
Life sciences | Medical instrumentation; Biotechnology and microbiology; Behavioral sciences; Physiology and miscellaneous | 1002
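The classification rule described in Section 3, under which a firm is placed in the technology area that fits the majority of its awards, can be sketched in a few lines. This is only an illustration, not the author's procedure: the function name `assign_technology_area`, the example award list, and the tie-breaking behavior (the first area encountered among equally frequent ones) are assumptions, since the text does not say how ties were resolved.

```python
from collections import Counter


def assign_technology_area(award_areas):
    """Return the technology area that accounts for the most awards.

    award_areas: list of technology-area labels, one per award.
    Tie-breaking is an assumption (the paper does not specify it):
    among equally frequent areas, the first one encountered wins.
    """
    if not award_areas:
        raise ValueError("firm has no awards to classify")
    counts = Counter(award_areas)
    area, _ = counts.most_common(1)[0]
    return area


# Hypothetical firm with three awards, two of them in electronics.
example_awards = ["Electronics", "Materials", "Electronics"]
print(assign_technology_area(example_awards))
```

A firm whose awards all fall in one area is simply assigned that area, which the text suggests is the typical case for these small firms.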
4. Spatial distribution of SBIR firms This section describes the spatial distribution of the SBIR firms. It first demonstrates that the firms are concentrated in certain regions of the country, as
we would expect for a group of small, high technology firms. It then turns to a new way of measuring the spatial distributions using the GIS. This new approach demonstrates that the SBIR firms co-locate very close together — often within one-tenth of a mile — and are not distributed uniformly over space.
4.1. Regional distributions

The SBIR firms exhibit regional concentration patterns expected for small, high-tech firms. Fig. 1 shows the location of SBIR firms across the country. Each dot on the map represents one firm.7 This national view shows firms concentrated in population centers: the Boston–Washington corridor on the East Coast, Los Angeles, the San Francisco Bay Area and Seattle on the West Coast, and major cities in the interior of the country. Table 3 documents that these small, high-tech firms are concentrated in certain states. California hosts almost 24% of all SBIR firms, and Massachusetts has more than 11%. New York is a distant third, with about 6% of all SBIR firms. This crude state-level view provides little information about where within a state these firms locate. Table 4 documents firm location by MSA. Boston leads with 10.7% of all SBIR firms, followed by the Washington, DC metro area with 8.5%, Los Angeles with 5.3%, and San Jose with 5.2%. Simply counting the firms in a region, or using those regions as the unit of observation, however, discards valuable information in the data. The next section discusses a way to refine and exploit that information.
7 Some of the dots actually correspond to more than one firm. Some firms are located at identical street addresses, but are located in different suites within the building at that address.

Fig. 1. SBIR firms across the continental US (each dot represents one SBIR firm).

Table 3
Concentration of SBIR firms by state

State | No. firms | Percent | State | No. firms | Percent | State | No. firms | Percent | State | No. firms | Percent
CA | 923 | 23.8 | IL | 75 | 1.9 | MO | 28 | 0.7 | DE | 7 | 0.2
MA | 440 | 11.4 | CT | 74 | 1.9 | WI | 27 | 0.7 | SD | 6 | 0.2
NY | 225 | 5.8 | NM | 61 | 1.6 | NH | 27 | 0.7 | ND | 6 | 0.2
MD | 211 | 5.4 | MN | 60 | 1.5 | KS | 15 | 0.4 | NV | 5 | 0.1
VA | 195 | 5.0 | AZ | 55 | 1.4 | RI | 14 | 0.4 | ME | 5 | 0.1
TX | 156 | 4.0 | OR | 50 | 1.3 | OK | 12 | 0.3 | IA | 5 | 0.1
CO | 143 | 3.7 | NC | 48 | 1.2 | LA | 11 | 0.3 | ID | 5 | 0.1
PA | 140 | 3.6 | GA | 45 | 1.2 | VT | 10 | 0.3 | WY | 2 | 0.1
NJ | 132 | 3.4 | AL | 38 | 1.0 | HI | 9 | 0.2 | WV | 2 | 0.1
OH | 131 | 3.4 | UT | 36 | 0.9 | NE | 8 | 0.2 | MS | 2 | 0.1
WA | 119 | 3.1 | TN | 33 | 0.9 | KY | 8 | 0.2 | AR | 2 | 0.1
FL | 116 | 3.0 | DC | 30 | 0.8 | SC | 7 | 0.2 | AK | 2 | 0.1
MI | 77 | 2.0 | IN | 29 | 0.7 | MT | 7 | 0.2 | | |

Table 4
Concentration of SBIR firms in top 25 metropolitan statistical areas

Metropolitan statistical area | No. firms | Percent
Boston, MA | 404 | 10.7
Washington, DC | 319 | 8.5
Los Angeles, CA | 199 | 5.3
San Jose–Santa Clara, CA | 195 | 5.2
San Diego, CA | 169 | 4.5
Oakland (Alameda, Contra Costa), CA | 92 | 2.4
Philadelphia, PA | 87 | 2.3
Baltimore, MD | 83 | 2.2
Seattle, WA | 78 | 2.1
Orange County, CA | 77 | 2.0
New York, NY | 63 | 1.7
Chicago, IL | 62 | 1.6
Denver, CO | 60 | 1.6
Albuquerque, NM | 56 | 1.5
Houston, TX | 55 | 1.5
Minneapolis–St. Paul, MN | 58 | 1.5
San Francisco, CA | 55 | 1.5
Boulder, CO | 54 | 1.4
Pittsburgh, PA | 53 | 1.4
Atlanta, GA | 46 | 1.2
Dayton–Springfield, OH | 44 | 1.2
Middlesex–Somerset–Hunterdon, NJ | 45 | 1.2
Albany–Schenectady–Troy, NY | 39 | 1.0
Dallas, TX | 38 | 1.0
Nassau–Suffolk, NY | 37 | 1.0

4.2. A new approach to clustering

The street addresses in the data allow me to recover each firm’s exact longitude and latitude coordinates from the GIS.8 I use these coordinates to calculate the distance between each pair of firms.9 With the distances I create ‘density variables’ for each observation: the number of other firms within one-tenth, one-half, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 45, 50, 100, and 200 miles of each firm in the data. These counts provide a measure of firm density at each firm’s location.

8 The dataset originally consisted of all 4983 firms that won SBIR awards from 1993 through 1996, and 66,179 non-SBIR firms. The GIS software could ‘geocode’ only about 75% of the SBIR firms and 80% of the non-SBIR firms (the non-SBIR sample is somewhat smaller than 80% of 66,179 because I removed 4880 firms that had multiple locations and 367 firms that turned out to have received SBIR grants). Most of the remaining firms reported P.O. boxes as their addresses, preventing geocoding. In addition, some firms reported addresses on streets that the GIS software could not locate.
9 The following equation produces the distance, in miles, between points A and B on the earth:

$$\text{Distance in miles} = \left\{ \arccos\left[ \sin(A_1)\sin(B_1) + \cos(A_1)\cos(B_1)\cos(A_2 - B_2) \right] \right\} \times 69$$

where $A_1$ and $B_1$ are latitudes of points A and B, respectively, and $A_2$ and $B_2$ are longitudes of points A and B, respectively (Snyder, 1987).

Table 5
Co-location of firms that won SBIR awards in 1993 and 1994
Number of firms within r miles of z other firms (percent of total in parentheses)

Radius r (miles) | z = 1 | z = 2 | z = 3 | z = 4 | z = 5 | z = 10
0.1 | 436 (17.3) | 146 (5.8) | 76 (3.0) | 50 (2.0) | 32 (1.3) | 0 (0.0)
0.5 | 915 (36.3) | 523 (20.7) | 331 (13.1) | 232 (9.2) | 160 (6.3) | 29 (1.1)
1 | 1301 (51.5) | 889 (35.2) | 638 (25.3) | 500 (19.8) | 410 (16.2) | 133 (5.3)
2 | 1745 (69.1) | 1337 (53.0) | 1049 (41.6) | 895 (35.5) | 775 (30.7) | 448 (17.7)
3 | 1981 (78.5) | 1668 (66.1) | 1405 (55.7) | 1217 (48.2) | 1065 (42.2) | 679 (26.9)
4 | 2129 (84.4) | 1897 (75.2) | 1681 (66.7) | 1478 (58.6) | 1322 (52.4) | 896 (35.5)
5 | 2226 (88.2) | 2038 (80.7) | 1857 (73.6) | 1668 (66.1) | 1528 (60.5) | 1067 (42.3)
10 | 2381 (94.3) | 2284 (90.5) | 2203 (87.3) | 2109 (83.6) | 2046 (81.1) | 1682 (66.6)
20 | 2450 (97.1) | 2401 (95.1) | 2347 (93.0) | 2275 (90.1) | 2233 (88.5) | 2028 (80.3)
50 | 2488 (98.6) | 2459 (97.4) | 2436 (96.5) | 2398 (95.0) | 2370 (93.4) | 2271 (90.0)

Table 5 shows co-location among firms that won SBIR awards in 1993 or 1994. The table demonstrates that these firms tend to locate very close together. For example, 17% of the firms are within one-tenth of a mile of at least one other SBIR firm, and more than half are within 1 mile of at least one other SBIR firm. Six percent are within one-half mile of five other SBIR firms, and 31% are within 2 miles of five other SBIR firms.

We can get an idea of the geographic distribution of the SBIR firms within small areas by calculating the implied average firm density (in firms per square mile) within each radius. Table 6 shows these calculations.

Table 6
Mean number of SBIR firms within r miles of each firm that won awards 1993–1994 and implied firm density
(Mean = mean number of other SBIR firms within r miles; Density = implied SBIR density in firms per square mile)

Radius r (miles) | Area (square miles) | US: Mean | US: Density | San Jose: Mean | San Jose: Density | Boston: Mean | Boston: Density | LA: Mean | LA: Density
0.1 | 0.031 | 0.30 | 9.53 | 0.25 | 7.96 | 0.64 | 20.24 | 0.08 | 2.67
0.5 | 0.79 | 1.00 | 1.28 | 1.56 | 1.99 | 2.47 | 3.14 | 0.62 | 0.79
1 | 3.14 | 2.30 | 0.73 | 4.55 | 1.45 | 6.41 | 2.04 | 1.66 | 0.53
2 | 12.6 | 5.30 | 0.42 | 13.17 | 1.05 | 16.19 | 1.29 | 4.17 | 0.33
3 | 28.3 | 9.09 | 0.32 | 24.01 | 0.85 | 27.46 | 0.97 | 8.19 | 0.29
4 | 50.3 | 13.24 | 0.26 | 35.55 | 0.71 | 41.01 | 0.82 | 12.11 | 0.24
5 | 78.5 | 17.48 | 0.22 | 46.98 | 0.60 | 56.09 | 0.71 | 15.66 | 0.20
6 | 113.1 | 21.90 | 0.19 | 57.27 | 0.51 | 72.30 | 0.64 | 19.66 | 0.17
7 | 153.9 | 26.65 | 0.17 | 68.97 | 0.45 | 88.48 | 0.57 | 23.47 | 0.15
8 | 201.1 | 31.30 | 0.16 | 78.83 | 0.39 | 105.32 | 0.52 | 27.31 | 0.14
9 | 254.5 | 35.82 | 0.14 | 86.96 | 0.34 | 122.01 | 0.48 | 30.89 | 0.12
10 | 314.2 | 40.37 | 0.13 | 96.33 | 0.31 | 137.60 | 0.44 | 33.81 | 0.11
12 | 452.4 | 49.07 | 0.11 | 110.95 | 0.25 | 169.47 | 0.37 | 40.98 | 0.09
14 | 615.8 | 57.27 | 0.093 | 124.20 | 0.20 | 195.12 | 0.32 | 49.01 | 0.08
16 | 804.2 | 65.00 | 0.081 | 134.08 | 0.17 | 216.68 | 0.27 | 58.81 | 0.07
18 | 1017.9 | 72.07 | 0.071 | 141.49 | 0.14 | 234.85 | 0.23 | 68.79 | 0.07
20 | 1256.6 | 78.67 | 0.063 | 148.20 | 0.12 | 251.99 | 0.20 | 78.69 | 0.06

For each radius r the table shows the area in square miles, the average number of other SBIR firms located within r miles of each SBIR firm, and the implied number of firms per square mile. If the firms were distributed uniformly over space, the implied density would remain constant as the radius increased. The table shows, however, that the implied density does not remain constant. Instead, firms bunch together tightly. The average number of other SBIR firms within one-tenth of a mile implies an SBIR density of about 9.5 firms per square
mile. The average number of other SBIR firms within 1 mile of each SBIR firm implies a density of 0.7 firms per square mile, and the average number within 5 miles implies a density of only 0.2 firms per square mile.10 On average, the firms within any given radius of a firm tend to be concentrated close to that firm. This result demonstrates both that SBIR firms tend to locate close together and that they are not distributed uniformly over small areas.

This result is not driven by any particular region of the country. Table 6 also shows the implied average firm densities in San Jose, Boston, and Los Angeles. Firms in Boston are the most concentrated, with an average of 0.64 other SBIR firms within a tenth of a mile of an observed firm. The corresponding averages are 0.25 and 0.08 for San Jose and Los Angeles, respectively. These numbers imply an average SBIR firm density within one-tenth of a mile of 20.2, 8.0, and 2.7 SBIR firms per square mile for Boston, San Jose, and Los Angeles, respectively, in line with the relative densities of the cities. Although the absolute numbers differ by region of the country, the trend is the same: the smaller the radius, the higher the implied firm density. Firms are not distributed uniformly.11

By themselves these results provide only limited information. Zoning laws, industrial parks, large buildings that house many firms, and the layout of a region’s infrastructure will cause most firms to be located close to some other firms, and most likely close to similar firms. The point here is not that SBIR firms (or high-tech firms generally) necessarily locate closer together than do other firms. Instead, the analysis demonstrates simply that they co-locate, and that it is therefore sensible to look for spillovers over those short distances.

The regional description in the first part of this section confirms that these small, high-tech firms concentrate in certain geographic regions.
The new approach using actual distances between firms shows that they also locate close together in very small areas within geographic regions.12 Jaffe (1989) noted that the use of states as the geographic unit of analysis is inappropriate for studying some issues in economic geography. That similar firms locate so close together suggests, moreover, that geographic units of analysis defined by political or other borders may be too large for exploring some aspects of clustering and spillovers. Firms 5 miles apart may impact each other differently than do firms only a fraction of a mile apart, even though they may be in the same state or city. The next section begins to explore that question using the distance information.

10 These calculations do not include the observed firm. Including the observed firm obviously adds one to each average number of firms within the given radius. This would increase the implied density at each radius, but not the trend of decreasing densities at larger radii.
11 The implied average densities beyond some radius, of course, cease to be meaningful. Local geography will dictate the maximum meaningful radius. Silicon Valley, for example, is about 30 miles long, stretching north from San Jose towards San Francisco. The mountains to the west and the San Francisco Bay to the east squeeze much of Silicon Valley to perhaps five miles wide. The Atlantic Ocean to the east of Boston means that for many firms a radius of more than a few miles will include the ocean, where, obviously, no firms locate. As a result, the implied number of firms per square mile will decline rapidly as the radius increases beyond several miles. Nonetheless, the implied densities within small radii, say, less than 2 miles, show that the firms tend to cluster together tightly.
12 These empirical observations are consistent with Krugman’s (1995) theoretical prediction and numerical simulations of firm equilibrium location. He notes that opposing forces cause firms to both like and dislike locating near other firms. These ‘centripetal’ and ‘centrifugal’ locational forces can produce clusters of firms (business districts) within cities. The empirical evidence presented here of firms clustering in very small areas is consistent with Krugman’s observations.
5. Empirical method and results A common theme of the economic geography literature is that firm and industry agglomeration can generate positive feedback loops. Spillovers among firms involved in the SBIR program could lead to a similar phenomenon. If so, firm participation in SBIR may be a function of whether neighboring firms win awards. In this section I use the density measures developed above to test whether co-location in small areas can encourage such spillovers. Specifically, I conduct two tests. Firstly, does density affect whether a firm participates in the program? Secondly, does density affect whether a firm wins awards in multiple time periods instead of just one? SBIR participation could be a function of the participation of nearby firms for several reasons. One is that firms learn about the program from their neighbors. If informal conversations are a mechanism for knowledge transfer, then certain knowledge may spread between densely concentrated firms more quickly than between isolated firms. Knowledge that could flow most easily in this manner would be information that is valuable for the firm to have, but not harmful to share. Small, high tech firms share a common quest for funding, and employees of neighboring firms may talk about their various fund-raising strategies with each other. One strategy could be to apply to the SBIR program for research grants. SBIR awards more than $1 billion per year to small firms, with each grant worth $100,000 to $750,000. Because each award is a tiny share of the total available pool, sharing information about the program with neighbors would not significantly harm one’s own chances of winning an award. Indeed, some state technology development agencies encourage firms to talk to repeat-winners to learn the secrets of winning an SBIR award.13 Employees of neighboring firms may therefore be willing — and even encouraged — to discuss SBIR with each other. 
13 See, for example, the Wyoming SBIR Initiative Newsletter, June 26, 1998. http://www.zyn.com/sbir/rnews/wy980626.htm.
If they do, the number of neighboring firms with SBIR awards could increase the chances that a firm learns about the program. Once the firm knows about the program it can improve its chances of winning awards by learning about the program’s myriad details. For example, each participating federal agency runs its SBIR program independently, meaning
proposal requirements can vary. Each agency may also have its own priorities when choosing which proposals to fund. A firm can gain this information from sharing information with nearby firms that also have awards. A second possibility is that large firms that are ineligible for the program may incorporate their own researchers as small firms in order to win these grants. While the empirical tests will control for the number of large firms in the radius, that variable may not accurately control for this phenomenon. The large firm, for example, may be headquartered far away and thus not located in that small radius. Under this scenario, firms that win multiple awards may be close to other award winners because those firms are all connected to the same parent company. A third possibility is that similar firms locate very close together. Such similarities could also cause them to win SBIR awards. The second test attempts to control for this by including technology area dummy variables and the average number of awards each firm won per year. The technology areas are quite broad, however, and may not adequately capture firm similarities. Firms, for example, may work on complementary technologies and thus choose to locate close together. Such similarities could encourage both firms to apply for SBIR awards. And if one firm’s research is funded, complementary research may be funded, as well. These complementarities may not be captured by the technology area dummy variables. The sub-sections below describe the tests and present results.
5.1. Does SBIR density affect whether a firm enters the program? If geographic clustering encourages spillovers, then firms near award-winners may be more likely to enter the program than are firms further away from award winners. To investigate this hypothesis I first classify the SBIR firms as ‘old’ and ‘new’ award-winners. Old firms won at least one award in 1993 or 1994. New firms won at least one award in 1995 or 1996, but none in 1993 or 1994. These dates separate old and new SBIR firms because of an exogenous policy change that increased the number of SBIR grants federal agencies had to award. In 1994, SBIR funding increased by approximately one-third — from about $600 million in both 1993 and 1994 to more than $900 million in both 1995 and 1996. Although most of the increase went to the old firms, which won more awards than they did previously, the increase also allowed additional firms to take advantage of the program. The 3784 firms break down into 2524 old and 1350 new SBIR firms. The general nature of the test will be to explore whether new award-winners tend to be closer to old award winners than are non-SBIR firms. Does being near a firm that had an award in the first time period increase the chances that a firm that had not previously won an award wins one in the second period? To put it differently, is a firm located close to award-winners more likely to enter the program than is a more isolated firm? I calculate SBIR density using the method described above. I count how many
old award-winners are within a given radius of each new award-winner, and how many old award-winners are within a given radius of each of the 48,239 non-SBIR firms. Table 7 shows that, on average, 0.26 old winners are within one-tenth of a mile of each new winner, compared with only 0.06 for each non-SBIR firm. Within 1 mile the averages are 1.99 and 0.82, respectively. In other words, new award-winners tend to have more old SBIR winners nearby than non-SBIR firms do.

Table 7
Mean number of nearby firms (standard deviation in parentheses)

          Mean no. old SBIR firms within x miles of          Mean no. non-SBIR firms
x miles   New winners     Old winners     Non-SBIR firms     within x miles of old winners
0.1       0.26 (0.84)     0.29 (0.86)     0.06 (0.32)        1.09 (2.25)
0.5       0.85 (1.95)     1.00 (1.99)     0.32 (1.02)        6.05 (13.4)
1         1.99 (4.34)     2.30 (4.50)     0.82 (2.28)        15.7 (30.3)
2         4.84 (8.79)     5.30 (8.93)     2.21 (5.24)        42.1 (65.5)
3         8.31 (13.3)     9.09 (13.8)     3.99 (8.66)        76.2 (105.0)
4         12.3 (18.3)     13.2 (19.2)     5.91 (12.0)        112.9 (141.3)
5         16.3 (23.8)     17.5 (24.9)     7.91 (15.4)        151.1 (180.4)
10        37.2 (49.9)     40.4 (52.1)     20.2 (34.3)        385.6 (405.4)
20        71.3 (81.8)     78.7 (86.5)     44.6 (62.7)        851.9 (746.5)
50        122.8 (109.9)   131.8 (112.5)   93.6 (99.8)        1788.2 (1452.8)

Co-location of firms that win SBIR awards could occur for reasons that have nothing to do with spillovers. As Jaffe et al. (1993) note, ‘the most difficult problem confronted by the effort to test for spillover-localization is the difficulty of separating spillovers from correlations that may be due to a pre-existing pattern of geographic concentration of technologically-related activities’. Localization also arises from what Ellison and Glaeser (1997) call ‘natural advantage’. Natural advantage, as they put it, ‘includes the forces that lead the wine industry to concentrate in California and large shipyards to locate on bodies of water’. This test, then, must be cognizant of the many factors that may cause firms to locate in certain regions and to win awards. For example, the Department of Defense makes more than half of all SBIR awards. Firms that do other contract work for the DoD
may be more likely to take advantage of the SBIR program and already locate near military installations. I run probit estimations to control for non-spillover factors that may influence whether a firm wins an award. These regressions include as observations all new award winners and non-SBIR firms. An observation is a firm and the dependent variable is whether the firm is a new award-winner or a non-SBIR firm. Eq. (1) provides the precise specification

$$\Pr(\text{winning award}) = \Phi\{\beta_0 + \beta_1 X_{\mathrm{old}} + \beta_2 X_{\mathrm{large}} + \delta(\text{demogs}) + \alpha(\text{regions})\} \tag{1}$$

where $\Phi$ represents the standard normal cumulative distribution. The independent variable of interest is $X_{\mathrm{old}}$, the number of old award-winners within x miles of each observed firm. A positive $\beta_1$ coefficient would mean that additional old award-winners nearby increase the probability that the observed firm is a new award-winner. The remaining variables control for other factors that could affect whether a firm wins an award. $X_{\mathrm{large}}$ is the number of large firms within x miles. The large firms, like the small non-SBIR firms, come from the American Business Disc CD-ROM.14 These firms have more than 500 employees (making them ineligible for SBIR awards) and the same primary SICs listed above.15 $X_{\mathrm{large}}$ controls for the possibility that small firms spin off of large firms to benefit from the program. These spinoffs would be densely concentrated if they locate near the parent. ‘Demogs’ includes county-level demographic data that control for local economic factors that may influence whether a firm enters the program. These variables include federal employment, military employment, manufacturing employment, and per capita income in the firm’s county in 1995. Federal and military employment control for the fact that these awards are doled out by federal agencies. Certain firms locate near federal installations and focus exclusively on dealings with the government.
14 An obvious omitted variable in this equation is the number of non-SBIR firms in the specified radius. Unfortunately, calculating this variable for each firm essentially means generating a 50,000 × 50,000 matrix, which was beyond the computing power available to me.
15 Geocoding left 267 large firms in these industries. Using the longitude and latitude, I calculated how many large firms are within the given radius of each observed small firm.
Higher military employment in a county indicates the presence and size of military installations. Proximity to such installations may affect whether a firm enters the program. Manufacturing employment controls for the size of the county’s industrial base. Per capita income controls for regional wealth, which may be related to the population’s ability to start and support small, high-tech firms. Finally, ‘regions’ include state and MSA dummy variables to control for region-specific factors that may affect whether a firm enters the program, such as
regional technology boards whose objectives include attracting SBIR awards to their areas. I estimate Eq. (1) several times with different values of x to explore the effects of distance. Table 8 highlights the results. Each column reports results for a different radius, starting at one-tenth of a mile and working out to 5 miles. Firms located in counties with higher (non-military) federal employment are more likely to enter the program. Firms located in counties with higher military employment are less likely to enter the program, ceteris paribus. These results may reflect general trends in federal spending and federal funding allocation patterns. R&D spending by the Department of Defense was approximately 10% lower (in real dollars) in 1995 and 1996 than it was in 1993 and 1994. If firms near military installations receive SBIR funding primarily from DoD we would expect fewer firms to enter the program in a period of reduced spending. Non-military federal employment, meanwhile, may be a signal of a legislator’s effectiveness in allocating discretionary federal spending to her district. If so, SBIR funding could be similarly affected. We would then expect federal employment in a county to help predict whether a firm wins SBIR funding. Interestingly, firms located near large companies are less likely to enter the program. A possible explanation is that small firms locate near large firms they serve as contractors. These small firms may depend less on federal funding than do other small firms and are thus less likely to apply for SBIR funding. The coefficient of interest, $\beta_1$, is positive. The number of nearby firms that won SBIR awards in the first period positively predicts whether a firm will enter the program in the second period. That is, firms are more likely to win an SBIR award if they are located near firms that won awards in the previous time period.
Although the magnitude of the coefficient decreases as the radius increases, it is difficult to interpret the meaning of this result. On one hand, the marginal effect of an additional firm on the probability of winning an award decreases as the radius increases. On the other hand, the average number of firms increases with the radius. As Table 8 demonstrates, a one standard deviation change in the number of firms has a similar marginal impact at each radius. To investigate further, I run the regression again, this time including a variable for the number of firms at each distance. This new regression includes simultaneously the number of firms within one-tenth of a mile, one-tenth to one-half mile, one-half to 1 mile, 1 to 2 miles, and 2 to 5 miles. Table 9 shows the regression results.16 The number of firms within one-tenth of a mile remains positive, significant, and almost equal in magnitude to the coefficient on that variable in Table 8. The number of firms at other distances, with one exception, has no significant impact on the probability of the firm being a new award-winner.
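The distance bands used in this second regression can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions, not the GIS procedure used for the paper: the haversine formula, the Earth-radius constant, and the names `haversine_miles`, `ring_counts`, and `cumulative_counts` are all introduced here. Each old award-winner is assigned to the first band whose outer edge it falls within, so the bands are mutually exclusive, and cumulative sums recover the within-radius counts used in the single-radius regressions.

```python
import math
from itertools import accumulate

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch

# Outer edges of the bands in miles: 0-0.1, 0.1-0.5, 0.5-1, 1-2, 2-5.
RING_EDGES = [0.1, 0.5, 1.0, 2.0, 5.0]


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def ring_counts(firm, old_winners, edges=RING_EDGES):
    """Count old award-winners in each distance band around one firm.

    firm and each element of old_winners are (latitude, longitude) tuples.
    The caller is expected to leave the observed firm out of old_winners.
    """
    counts = [0] * len(edges)
    for lat, lon in old_winners:
        d = haversine_miles(firm[0], firm[1], lat, lon)
        for i, edge in enumerate(edges):
            if d <= edge:
                counts[i] += 1
                break  # a neighbor belongs to exactly one band
    return counts


def cumulative_counts(counts):
    """Turn band counts into counts within each outer radius."""
    return list(accumulate(counts))


if __name__ == "__main__":
    # Hypothetical coordinates, chosen only to exercise the functions.
    firm = (37.4, -122.1)
    old = [(37.401, -122.1), (37.405, -122.1), (37.42, -122.1), (37.5, -122.1)]
    bands = ring_counts(firm, old)
    print(bands, cumulative_counts(bands))
```

The nested loop is quadratic in the number of firms, which is fine for a few thousand award-winners; a spatial index would be the natural refinement for much larger samples.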
16 The table shows only the estimates on the density variables. All other results are almost identical to Table 10.
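Tables 8 and 9 report dP/dX and (dP/dX) × S.D. alongside the probit coefficients. For a probit, the marginal effect of a regressor is the standard normal density evaluated at the linear index, multiplied by that regressor's coefficient. The text does not say at which values of the covariates the index is evaluated, so the sketch below leaves the index as an input. The function names are assumptions made for illustration. The loop multiplies the dP/dX and standard deviation reported in Table 8 for each radius; differences from the printed products are presumably due to rounding of the published dP/dX.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_marginal_effect(beta, index):
    """dP/dX for a probit: phi(index) * beta, where index is the linear predictor."""
    return normal_pdf(index) * beta


def one_sd_effect(dp_dx, sd):
    """Change in probability from a one standard deviation change in X."""
    return dp_dx * sd


# (radius in miles, reported dP/dX, reported S.D., reported (dP/dX) x S.D.), from Table 8.
TABLE8_ROWS = [
    (0.1, 0.016, 0.35, 0.0054),
    (0.5, 0.0051, 1.06, 0.0054),
    (2.0, 0.0012, 5.38, 0.0065),
    (5.0, 0.00064, 15.75, 0.010),
]

if __name__ == "__main__":
    for radius, dp_dx, sd, reported in TABLE8_ROWS:
        print(radius, round(one_sd_effect(dp_dx, sd), 4), reported)
```

The per-firm effect falls by a factor of about 25 between 0.1 and 5 miles, while the standard deviation of the count rises by a factor of about 45, which is why the one standard deviation effects stay in a similar range.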
Table 8
Probit estimation. Prediction of probability of entering the SBIR program as a function of number of nearby firms (absolute t-statistics in parentheses)a
Dependent variable: enter SBIR program (yes/no)

Radius r                                  r = 0.1 miles                         r = 0.5 miles
Constant                                  -2.46 (32.0)       -2.46 (32.0)       -2.43 (31.8)       -2.43 (31.8)
Number award winners within radius r      0.26 (12.6)        0.26 (12.6)        0.08 (9.93)        0.09 (10.3)
Mean no. in radius r                                         0.06                                  0.33
Standard deviation                                           0.35                                  1.06
dP/dX                                                        0.016                                 0.0051
(dP/dX) × S.D.                                               0.0054                                0.0054
Number large firms within radius r                           -0.06 (0.44)                          -0.12 (2.79)
County federal employment in 1995         0.80E-05 (3.81)    0.80E-05 (3.82)    0.75E-05 (3.59)    0.81E-05 (3.83)
County military employment in 1995        -0.77E-05 (2.21)   -0.77E-05 (2.21)   -0.79E-05 (2.24)   -0.81E-05 (2.30)
County manufacturing employment in 1995   0.32E-06 (1.01)    0.32E-06 (1.01)    0.37E-06 (1.19)    0.35E-06 (1.12)
County per capita income in 1995          0.82E-05 (3.04)    0.82E-05 (3.04)    0.71E-05 (2.62)    0.70E-05 (2.59)
Log-likelihood                            -5805              -5805              -5831              -5827
Percent correct predictions               97.3               97.3               97.3               97.3

Radius r                                  r = 2 miles                           r = 5 miles
Constant                                  -2.44 (31.9)       -2.44 (31.9)       -2.42 (-31.6)      -2.41 (-31.6)
Number award winners within radius r      0.02 (9.51)        0.02 (9.20)        0.009 (10.8)       0.01 (10.9)
Mean no. in radius r                                         2.28                                  8.14
Standard deviation                                           5.38                                  15.75
dP/dX                                                        0.0012                                0.00064
(dP/dX) × S.D.                                               0.0065                                0.010
Number large firms within radius r                           -0.02 (1.60)                          -0.02 (2.90)
County federal employment in 1995         0.74E-05 (3.53)    0.76E-05 (3.62)    0.61E-05 (2.90)    0.63E-05 (3.00)
County military employment in 1995        0.72E-05 (-2.08)   -0.74E-05 (2.11)   -0.77E-05 (-2.18)  -0.79E-05 (-2.22)
County manufacturing employment in 1995   0.38E-06 (1.24)    0.38E-06 (1.23)    0.44E-06 (1.42)    0.45E-06 (1.45)
County per capita income in 1995          0.69E-05 (2.56)    0.68E-05 (2.51)    0.54E-05 (1.99)    0.53E-05 (1.96)
Log-likelihood                            -5833              -5832              -5817              -5813
Percent correct predictions               97.3               97.3               97.3               97.3

n = 49,589
a State and MSA dummy variables included but not shown in table.
Table 9
Probit estimation. Effect of nearby firms on probability of being a new award-winnera

              Descriptive statistics      Probit results
Radius        Mean no. firms  Standard    Coefficient   t-statistic   dP/dX      (dP/dX) × S.D.
(in miles)    in radius       deviation   estimate
Tenth         0.063           0.35        0.23          10.7          0.014      0.00473
Tenth–half    0.27            0.91        0.0053        0.45          0.00034    0.000309
Half–one      0.52            1.57        0.010         1.26          0.00061    0.000958
1–2           1.42            3.52        0.0046        1.1           0.00027    0.000945
2–5           5.86            11.47       0.0084        6.26          0.00050    0.005684
5–10          12.52           21.72       0.00077       0.89          4.56E-05   0.00099

a Other included variables same as in Table 10; estimation results not shown here.
The effect of other firms on whether a firm wins an award comes largely from firms located within one-tenth of a mile. Although the results are provocative, the data have several limitations. Firstly, the non-SBIR firms are not properly stratified by industry. While I attempted to select non-SBIR firms in the same industries as firms that won awards, the share of the non-SBIR sample in each industry is not proportional to the share of SBIR firms in each industry. Because the SBIR firms are not classified by SIC group it was not possible to ensure the correct proportions. The two groups of firms may therefore not be directly comparable. Indeed, the fact that out of more than 50,000 firms pulled from the American Business Disc database only 367 turned out to have won awards suggests the non-SBIR sample comes from a different population. Secondly, while this test avoids the aggregation bias discussed earlier, it suffers from another type of bias. In the probit estimation the dependent variable is whether the firm won an award. The non-SBIR firms, however, were chosen precisely because they did not win awards. Because I selected this group on the basis of the value of the dependent variable, the regression yields biased results (Maddala, 1983). Because the equation has multiple regressors and is estimated in a non-linear fashion, it is unfortunately impossible to sign the bias (Greene, 1993). The next section eliminates these problems by considering only a group of firms that won awards. The results of the next section are qualitatively similar to those presented in this section, suggesting that the bias does not affect the sign of the coefficients.
5.2. Does SBIR firm density affect whether a firm wins awards in multiple periods? The above regressions attempt to control for factors besides knowledge spillovers that may influence whether or not a firm enters the SBIR program. Other
unobserved factors, however, may cause the SBIR firms both to win awards and to locate very close together. To deal with that issue and to eliminate the bias discussed above, this section investigates two groups of firms that won awards. I divide the old award-winners into two subgroups: ‘repeat-winners’ and ‘one-period winners’. Repeat winners won at least one award both in the first (1993–1994) and in the second (1995–1996) periods. One-period winners won at least one award in the first period, but no new awards in the second. The 2524 old SBIR firms break down into 1017 repeat-winners and 1507 one-period winners. If geographic proximity encourages spillovers, then locating near other firms with awards may help a firm become a repeat winner. Geographic spillovers and positive feedback loops suggest that firms may continue to win awards if they are near other firms that also win awards. I test whether being located close to other award winners predicts being a repeat-winner. As above, to control for observable factors that may affect whether a firm becomes a repeat-winner, I run probit estimations. The regressions in this section include all old award-winners. An observation is an award-winner and the dependent variable is whether that old award winner is a repeat or one-period winner. Eqs. (2) and (3) provide the precise specifications

$$\Pr(\text{repeat winner}) = \Phi\Big\{\beta_0 + \beta_1 X_{\mathrm{old}} + \beta_2 X_{\mathrm{large}} + \beta_3 X_{\mathrm{non}} + \gamma_1(\text{awds}) + \gamma_2(\text{min}) + \sum_{i=1}^{7} d_i\,[\theta_i + \delta_i(\text{demogs})] + \alpha(\text{regions})\Big\} \tag{2}$$

$$\Pr(\text{repeat winner}) = \Phi\Big\{\beta_0 + \sum_{i=1}^{7} d_i\,[\theta_i + \beta_{1i} X_{\mathrm{old}} + \delta_i(\text{demogs})] + \beta_2 X_{\mathrm{large}} + \beta_3 X_{\mathrm{non}} + \gamma_1(\text{awds}) + \gamma_2(\text{min}) + \alpha(\text{regions})\Big\} \tag{3}$$

where $\Phi$ represents the standard normal cumulative distribution. The variables are defined as follows:

$X_{\mathrm{old}}$, number of old award-winners within x miles
$X_{\mathrm{large}}$, number of large firms within x miles
$X_{\mathrm{non}}$, number of non-SBIR firms within x miles
awds, average number of awards a firm won in years that it won at least one award
min, minority dummy variable
demogs, county economic variables, including military employment, federal (non-military) employment, manufacturing employment, and per-capita income
regions, state and MSA dummy variables
$d_i$, technology area dummy variables (i = 1, . . . , 7).
$\delta$ and $\alpha$ are vectors of coefficients, allowing each regional variable to have a separate coefficient. Controls included here but not in Eq. (1) are dummy variables for technology area, a proxy for firm size, and the number of non-SBIR firms within x miles of each firm. The technology area dummy variables control for industry-specific effects. The average number of awards a firm wins in years that it won at least one award should be a powerful variable. The more awards a firm wins in years that it wins at least one award, the more likely it should be to win awards in multiple time periods. This variable thus helps control for the many unobserved factors that influence whether a firm wins an award. The minority ownership dummy is included because one of SBIR’s goals is to increase minority access to federal R&D dollars. Thus, minority-ownership may affect a firm’s probability of winning an award. The number of nearby non-SBIR firms helps control for the possibility that firm clustering in general, but not necessarily SBIR-specific clustering, may cause firms to win awards. Both equations allow regional factors to affect the probability of winning an award in the second period in each technology area differently by interacting the industry dummy variables with the regional variables. Eq. (2) does not allow the clustering effect to differ across technology areas. Eq. (3) does allow the clustering effect to differ across technology areas by interacting the industry dummy variables with the number of firms in a given radius. As above, the coefficient of interest is $\beta_1$, the coefficient on the density variable. Several aspects of this variable can yield insight into whether clustering promotes spillovers. Firstly, I estimate each equation with several values of x. Comparing changes in the value and statistical significance of $\beta_1$ as x increases allows us to explore the effects of distances.
Secondly, I estimate each equation with the number of firms within radius x in the same technology area as the observed firm, and then estimate each equation with the number of firms within radius x in all other technology areas. Comparing the value of $\beta_1$ across these estimations allows us to examine whether the density effects differ within and across technology areas. Table 10 shows the result of the probit estimation of Eq. (2). I estimate the equation separately for number of firms within one-tenth of a mile, one-half mile, 2 miles, and 5 miles of the observed firm. The table shows that the regional variables, number of large firms, and number of non-SBIR firms are largely insignificant. The average number of awards a firm wins in years that it wins at least one award, as expected, strongly predicts whether the firm is a repeat winner.17 Minority status also predicts whether the firm is a repeat winner. Nearby large and non-SBIR firms have no statistically significant impact on whether a firm is a repeat winner.
17 I also interacted the technology area dummies with number of awards, but the coefficient was positive, significant, and of similar magnitude for all technology areas.
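The difference between Eqs. (2) and (3) can be made concrete with a small sketch. The Python below is an illustration only: the function names and all coefficient values are hypothetical and are not estimates from the paper. Because each d_i is a 0/1 dummy, the sum over technology areas keeps only the term for the firm's own area; every other control, including the demogs interactions, is bundled into a single `other_terms` argument to keep the example short. Eq. (2) then appears as the special case of Eq. (3) in which every technology area shares one slope on the density variable.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution, the Phi in Eqs. (2) and (3)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def eq3_probability(x_old, area, beta0, theta, beta1_by_area, other_terms=0.0):
    """Pr(repeat winner) with an area-specific slope on X_old, as in Eq. (3).

    theta and beta1_by_area map technology area -> coefficient.
    other_terms stands in for all remaining controls in the index.
    """
    index = beta0 + theta[area] + beta1_by_area[area] * x_old + other_terms
    return normal_cdf(index)


def eq2_probability(x_old, area, beta0, theta, beta1, other_terms=0.0):
    """Eq. (2): the same index, but one common slope beta1 for every area."""
    common_slope = {a: beta1 for a in theta}
    return eq3_probability(x_old, area, beta0, theta, common_slope, other_terms)


# Hypothetical coefficients, chosen only for illustration.
BETA0 = -1.0
THETA = {"computers": 0.0, "energy": -0.2}
BETA1_BY_AREA = {"computers": 0.4, "energy": -0.5}

if __name__ == "__main__":
    for area in THETA:
        p0 = eq3_probability(0, area, BETA0, THETA, BETA1_BY_AREA)
        p1 = eq3_probability(1, area, BETA0, THETA, BETA1_BY_AREA)
        print(area, round(p0, 3), round(p1, 3))
```

With area-specific slopes, one more nearby old winner can raise the predicted probability in one technology area and lower it in another, which a single common slope cannot represent.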
Table 10
Probit estimation. Predicted probability of being a repeat-winner as a function of nearby award-winners (absolute t-statistics in parentheses)a
Dependent variable: repeat winner (yes/no)

Radius r                                  r = 0.1 miles                       r = 0.5 miles                       r = 2 miles       r = 5 miles
Constant                                  -1.10 (3.81)      -1.09 (3.79)      -1.06 (3.68)      -1.03 (3.57)      -1.06 (3.61)      -1.08 (3.68)
Number award-winners w/in radius r        0.10 (2.95)       0.09 (2.54)       0.05 (3.33)       0.05 (2.99)       0.01 (1.93)       0.003 (1.47)
Mean no. in radius r                                        0.30                                1.01              5.33              17.5
S.D.                                                        0.86                                1.99              8.96              24.95
dP/dX                                                       0.030                               0.02              0.003             0.001
(dP/dX) × S.D.                                              0.026                               0.030             0.027             0.024
Number small non-SBIR firms in radius r                     0.013 (0.93)                        0.12E-02 (0.52)   0.22E-04 (0.03)   0.76E-02 (0.43)
Number large firms within radius r                          0.43 (1.39)                         0.06 (0.58)       0.007 (0.20)      -0.15E-03 (0.42)
Average number of awards                  0.81 (18.9)       0.82 (18.9)       0.81 (18.8)       0.81 (18.8)       0.81 (18.8)       0.81 (18.8)
Minority                                  0.35 (4.32)       0.35 (4.35)       0.35 (4.34)       0.35 (4.37)       0.35 (4.34)       0.34 (4.28)
Computers dummy (d1)                      -0.65 (1.61)      -0.65 (1.61)      -0.66 (1.66)      -0.68 (1.69)      -0.63 (1.58)      -0.62 (1.55)
Materials dummy (d3)                      -1.22 (2.43)      -1.23 (2.44)      -1.25 (2.49)      -1.27 (2.53)      -1.26 (2.50)      -1.24 (2.46)
Mechanical dummy (d4)                     -0.048 (0.96)     -0.039 (0.08)     -0.06 (0.12)      -0.07 (0.15)      -0.05 (0.10)      -0.04 (0.07)
Energy dummy (d5)                         -0.042 (0.06)     -0.039 (0.06)     -0.10 (0.16)      -0.13 (0.20)      -0.14 (0.21)      -0.12 (0.17)
Environment dummy (d6)                    -0.53 (0.78)      -0.55 (0.81)      -0.56 (0.84)      -0.60 (0.89)      -0.58 (0.86)      -0.58 (0.86)
Life sciences dummy (d7)                  -0.60 (1.61)      -0.60 (1.62)      -0.61 (1.65)      -0.62 (1.65)      -0.59 (1.59)      -0.59 (1.60)
Log-likelihood                            -1361             -1359             -1360             -1359             -1362             -1363
Percent correct predictions               71.1              70.9              71.4              71.5              70.9              70.8

n = 2524
a Interaction terms and state and MSA dummies not shown.
The coefficient of interest, on the density variable, is positive and significant. The probability of being a repeat-winner increases with the number of SBIR firms within a given radius. The magnitude and statistical significance of the coefficient decrease as the radius increases. An additional firm a tenth of a mile away has a relatively large and statistically significant impact on the probability of being a repeat-winner. An additional firm 5 miles away has no statistically significant impact, and the coefficient estimate itself is much smaller. That is, the effect of a given number of surrounding firms on a firm’s probability of being a repeat-winner decreases the further away those firms are. These results are somewhat easier to interpret than in the previous section since $\beta_1$ not only becomes smaller as the radius increases, but quickly becomes statistically insignificant, as well. Nonetheless, they do not completely tell us which distances matter. For example, the number of firms within 2 miles has a statistically significant impact on a firm’s probability of being a repeat winner, but we do not know whether that effect is driven by immediate neighbors or by firms 2 miles away. To answer this question I run the regression again, simultaneously including variables for the number of old award winners within one-tenth of a mile, one-tenth to one-half mile, one-half to 1 mile, 1 to 2 miles, and 2 to 5 miles. The results, presented in Table 11, are striking. The number of old winners within one-tenth of a mile significantly and positively predicts whether a firm is a repeat-winner. The number of firms from one-tenth to one-half mile does not significantly predict whether a firm is a repeat winner (t-statistic 1.32). As the distances increase the statistical significance of the estimate decreases even more.
Table 11
Probit estimation. Predicted probability of being a repeat-winner as a function of nearby award-winners. All distances estimated simultaneously

              Descriptive statistics      Probit results
Radius        Mean no. firms  Standard    Coefficient   t-statistic   dP/dX      (dP/dX) × S.D.
(in miles)    in radius       deviation   estimate
Tenth         0.30            0.86        0.082         2.29          0.025      0.02174
Tenth–half    0.71            1.61        0.031         1.32          0.0097     0.015512
Half–one      1.31            3.02        -0.0015       0.11          -0.00046   -0.0014
1–2           3.02            5.38        0.0051        0.59          0.0016     0.008454
2–5           12.21           18.11       0.00053       0.21          0.00017    0.0030

Even if the coefficient estimates were significant, calculations reveal that the marginal effect of a one standard deviation change in the number of firms decreases as distance from the firm increases. These regression results are robust. Including or excluding other variables has little impact on the effect of the density variables. Even including the average number of awards a firm won over years that it won awards, which is a powerful predictor of whether a firm is a repeat winner, has almost no impact on the
results.18 The number of immediate neighbors that had awards in the first period — old award winners one-tenth of a mile away or less — predicts whether a firm is a repeat winner, but firms further away do not. Next I examine whether these effects differ by technology area. The impact of a firm’s neighbors could depend, in part, on whether the firms do similar work. Table 12 shows the results of the probit estimations of Eq. (3), which allows co-location to affect each technology area differently. The table shows only the coefficients on the clustering variables interacted with industries. The coefficient estimates on all other variables are essentially identical to those above and are not shown in the table. The table shows side-by-side results of 10 regressions. I again estimate radii of tenth, half, 1, 2, and 5 miles. For each radius I estimate the equation with the density variable defined as the number of firms within the observed firm’s technology area and then in all other technology areas. The first important result is that the effect of nearby firms is not identical across technology areas. Firms in the materials and energy technology areas are less likely to be repeat winners if they are near other old winners. But these technology areas represent only a small share of the data.19 Firms in the computers, electronics, mechanical performance, and life sciences technology areas are more likely to be repeat winners if they are near other old winners. These technology areas together represent about 80% of the firms in the data. Firms in these technology areas seem to benefit from having other firms very close by. The second important result is that neighboring firms in the same technology area have a bigger impact on repeat-winning than do neighboring firms in other technology areas. This result holds for all technology areas, whether that impact is positive or negative. 
The result also holds regardless of whether one considers the marginal effect of one additional firm in the radius or of a one standard deviation change.20 That is, either an additional firm or a one standard deviation change in the number of firms in a given radius has a larger impact if those firms are in the same technology area. It is conceptually sensible that firms in the same industry would have a greater impact on each other than would firms in different industries. Firms in similar industries are more likely to work on similar technologies. They may cooperate on research projects, and their employees may interact because they work in similar fields. At the very least, they will probably care more about what a neighbor is doing if that neighbor does similar work than if the neighbor is in a different industry.

18 Additional evidence of a real effect from only neighboring firms comes from aggregating up to the county level, where the density variable becomes the number of old award winners in the county. (Regression results not shown.) The probit finds that this density variable is positive and significant (P-value 0.07) if the average number of awards variable is excluded. Including the average number of awards, however, causes the density variable to disappear completely (P-value 0.96). This result is analogous to the results of Harrison et al. (1996), who found that without controlling for firm size, adoption of a new technology in a manufacturing firm is positively correlated with the number of other firms in the county. Controlling for firm size made their county-wide density variable disappear, as well.
19 The small number of firms in these technology areas seems to cause these results. The energy technology area, for example, contains only 72 repeat winners. Only one of those 72 firms is located within one-tenth of a mile of an old winner in the energy technology area. Meanwhile, seven of the 76 one-period winners are located within one-tenth of a mile of an old winner in the energy technology area.
20 Tables not shown.

Table 12
Probit estimation. Prediction of probability of being a repeat-winner as a function of nearby winners within and across technology areas (absolute t-statistics in parentheses)a
Dependent variable: repeat award winner? (yes/no)
Reported coefficients: industry dummy * no. firms within radius r. n = 2524

Within technology area
Industry dummy =             Tenth mile     Half mile      1 mile         2 miles        5 miles
Computers                    0.48 (2.35)    0.22 (2.69)    0.12 (2.61)    0.06 (2.35)    0.02 (1.66)
Electronics                  0.43 (1.68)    0.13 (0.91)    0.09 (0.96)    0.03 (0.70)    -0.004 (0.24)
Materials                    -0.46 (1.29)   -0.15 (0.62)   -0.12 (0.78)   -0.06 (0.61)   -0.008 (0.22)
Mechanical performance       0.41 (0.95)    0.46 (1.79)    0.13 (0.75)    0.01 (0.07)    -0.01 (0.18)
Energy                       -1.64 (-2.28)  -1.60 (2.24)   -0.39 (1.09)   -0.20 (0.69)   0.11 (0.89)
Environment                  b              -0.34 (0.47)   0.71 (1.39)    0.13 (0.41)    0.08 (0.60)
Biotech/life sciences        0.19 (1.85)    0.06 (1.49)    0.16 (0.80)    0.02 (1.57)    0.003 (0.45)
Log-likelihood               -1352          -1354          -1358          -1361          -1362
Percent correct predictions  70.8           71.3           70.9           70.9           70.7

Across technology areas
Industry dummy =             Tenth mile     Half mile      1 mile         2 miles        5 miles
Computers                    0.24 (2.08)    0.12 (2.73)    0.04 (2.19)    0.02 (1.78)    0.006 (1.60)
Electronics                  0.19 (1.78)    0.08 (1.66)    0.01 (0.62)    0.01 (1.25)    0.005 (1.37)
Materials                    -0.03 (0.24)   -0.10 (1.71)   -0.06 (1.72)   -0.02 (1.15)   -0.005 (0.11)
Mechanical performance       0.10 (0.85)    0.15 (1.99)    0.04 (0.88)    0.01 (0.95)    -0.009 (0.17)
Energy                       -0.05 (0.32)   0.001 (0.01)   0.04 (0.88)    0.03 (1.18)    0.008 (0.10)
Environment                  -0.03 (0.20)   0.01 (0.17)    0.03 (1.10)    0.02 (1.10)    0.008 (1.18)
Biotech/life sciences        0.07 (0.83)    0.05 (0.51)    0.04 (1.66)    0.005 (0.48)   0.003 (0.69)
Log-likelihood               -1359          -1354          -1358          -1360          -1362
Percent correct predictions  70.7           71.3           70.9           71.0           70.8

a Included variables not shown: number non-SBIR small firms in radius r, number large firms in radius r, state dummies, MSA dummies, technology area dummy variable * (military employment, federal employment, manufacturing employment, average income in county in 1995), technology area dummy variables, average number of new awards over years firm won awards.
b Cannot be estimated. No firms in the ‘environment’ category are within one-tenth of a mile of another firm in that technology area.

The final important result is that the effect of other firms on a firm’s probability of being a repeat winner comes from immediate neighbors: firms not more than one-tenth of a mile away, as above. Both the coefficient estimates and their statistical significance become smaller as distance from the firm increases. The number of firms within a tenth of a mile of firms in the computers, electronics, and life-sciences technology areas strongly predicts repeat-wins. But the effect becomes much smaller within 1 mile, smaller still within 2 miles, and almost completely disappears by 5 miles. That is, one additional firm within a tenth of a mile of a firm in the computer technology area, for example, substantially increases that firm’s probability of winning additional awards. An additional firm within 5 miles, however, has almost no effect at all.
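To make the distance decay concrete, the following sketch converts the within-technology-area probit coefficients for computers reported in Table 12 (0.48, 0.22, 0.12, 0.06, and 0.02) into changes in predicted probability. It is an illustration only, not the calculation behind the paper's unreported marginal-effect tables: the baseline probit index `BASELINE_INDEX` is a hypothetical value chosen for the example, and the function names are assumptions introduced here.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def marginal_effect(z0, beta):
    """Derivative of P = Phi(z) with respect to the regressor, evaluated at index z0."""
    return normal_pdf(z0) * beta


def discrete_change(z0, beta, delta=1.0):
    """Change in P when the regressor rises by delta (one firm, or one standard deviation)."""
    return normal_cdf(z0 + beta * delta) - normal_cdf(z0)


# Coefficients from Table 12: computers, within technology area.
COMPUTERS_WITHIN = [
    ("tenth mile", 0.48),
    ("half mile", 0.22),
    ("1 mile", 0.12),
    ("2 miles", 0.06),
    ("5 miles", 0.02),
]

# ASSUMPTION: hypothetical baseline index (predicted probability 0.5), not from the paper.
BASELINE_INDEX = 0.0


def effects_by_radius(coefs=COMPUTERS_WITHIN, z0=BASELINE_INDEX):
    """Return (radius, marginal effect, effect of one additional firm) for each radius."""
    return [(radius, marginal_effect(z0, b), discrete_change(z0, b)) for radius, b in coefs]


if __name__ == "__main__":
    for radius, slope, one_more_firm in effects_by_radius():
        print(f"{radius:>10}: slope={slope:.4f}  one more firm={one_more_firm:.4f}")
```

Passing the standard deviation of the density variable as `delta` gives the one standard deviation version of the same comparison. Because each radius is a separate regression with its own baseline, the absolute magnitudes here depend entirely on the assumed index; only the ordering across radii follows from the coefficients.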
6. Conclusion
This paper proposes a way to study firm clustering and geographic knowledge spillovers using discrete distances and firm-level data. I compile a large dataset of small, high-tech firms and use a Geographic Information System to recover their longitude and latitude coordinates and create density variables. I use this information to explore spatial distributions over small areas and to devise and implement a test of spillovers. The test asks whether proximity and clustering with other SBIR firms over short distances affect the probability that a firm will win an SBIR award. I find that they do. Firms clustered with SBIR winners are more likely to enter the program and to win awards in multiple time periods than are isolated firms, even when controlling for regional, firm, and industry characteristics. Firm co-location in a very small radius (not more than a fraction of a mile) strongly predicts winning SBIR awards. Previous research has demonstrated firm and industry agglomerations and their effects within regions; the analyses in this paper show that distances matter. The results are consistent with a general spillovers hypothesis. They suggest that, to answer Jaffe’s question, nearby firms do have an advantage in certain spillovers. While the data do not allow us to rule out other hypotheses (e.g. that similar firms locate close together), the robust results suggest that interesting phenomena occur over very short distances. Geographic information systems provide us with a powerful tool for studying these phenomena. Future research should take advantage of them to empirically address these issues in economic geography.
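The density variables summarized above (the number of other firms within a given radius of each firm) can be sketched from latitude and longitude coordinates as follows. This is a minimal illustration under stated assumptions, not the GIS procedure used in the paper: it substitutes the haversine great-circle formula for whatever projection and distance routine the GIS software applied, and the firm identifiers and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def density_counts(firms, radii):
    """For each firm, count the other firms within each radius (in miles).

    firms: list of (firm_id, latitude, longitude) tuples.
    Returns {firm_id: {radius: count}}.
    """
    counts = {firm_id: {r: 0 for r in radii} for firm_id, _, _ in firms}
    for i, (id_a, lat_a, lon_a) in enumerate(firms):
        # Visit each firm-pair once and credit both firms.
        for id_b, lat_b, lon_b in firms[i + 1:]:
            d = haversine_miles(lat_a, lon_a, lat_b, lon_b)
            for r in radii:
                if d <= r:
                    counts[id_a][r] += 1
                    counts[id_b][r] += 1
    return counts


# Radii matching those reported in Table 12.
RADII = (0.1, 0.5, 1.0, 2.0, 5.0)

# HYPOTHETICAL firms and coordinates, for illustration only.
SAMPLE_FIRMS = [
    ("A", 37.4200, -122.2100),
    ("B", 37.4205, -122.2100),  # a few hundred feet from A
    ("C", 37.4500, -122.2100),  # roughly 2 miles north of A
    ("D", 40.0000, -100.0000),  # far from all others
]

if __name__ == "__main__":
    for firm_id, by_radius in density_counts(SAMPLE_FIRMS, RADII).items():
        print(firm_id, by_radius)
```

The pairwise loop is quadratic in the number of firms, which is workable for a few thousand firms; a spatial index would be the usual choice for much larger datasets.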
Acknowledgements
I thank Tim Bresnahan, Roger Noll, John Quigley, Paul Romer, and two anonymous referees for their thoughtful and detailed comments. I, of course, am responsible for all mistakes. I gratefully acknowledge financial support from the Alfred P. Sloan Foundation.
References
Arthur, B., 1994. Increasing Returns and Path Dependence in the Economy. The University of Michigan Press, Ann Arbor.
Audretsch, D., Feldman, M., 1996a. R&D spillovers and the geography of innovation and production. The American Economic Review 86, 630–640.
Audretsch, D., Feldman, M., 1996b. Innovative clusters and the industry life cycle. Review of Industrial Organization 11, 253–273.
Colliver, V., 1998. Highest rents are on Sand Hill Road. The San Francisco Examiner, June 27.
Davenport, H.J., 1935. The Economics of Alfred Marshall. Cornell University Press, Ithaca.
Ellison, G., Glaeser, E., 1997. Geographic concentration in US manufacturing industries: a dartboard approach. Journal of Political Economy 105, 889–927.
Feige, E., Watts, H., 1972. An investigation of the consequences of partial aggregation of microeconomic data. Econometrica 15, 343–360.
Glaeser, E., Kallal, H., Scheinkman, J., Shleifer, A., 1991. Growth in cities. NBER, Cambridge, Working paper series no. 3787.
Greene, W., 1993. Econometric Analysis, 2nd Edition. Macmillan, New York.
Harrison, B., Kelley, M., Gant, J., 1996. Innovative firm behavior and local milieu: exploring the intersection of agglomeration, firm effects, and technological change. Economic Geography 62, 233–258.
Jaffe, A., Trajtenberg, M., Henderson, R., 1993. Geographic localization of knowledge spillovers as evidenced by patent citations. Quarterly Journal of Economics 108, 577–598.
Jaffe, A., 1989. Real effects of academic research. The American Economic Review 79, 957–970.
Krugman, P., 1998. Space: the final frontier. Journal of Economic Perspectives 12, 161–174.
Krugman, P., 1995. Innovation and agglomeration: two parables suggested by city-size distributions. Japan and the World Economy 7, 371–390.
Krugman, P., 1991a. Geography and Trade. MIT Press, Cambridge.
Krugman, P., 1991b. Increasing returns and economic geography. Journal of Political Economy 99, 483–499.
Maddala, G.S., 1983. Limited Dependent and Qualitative Variables in Econometrics. Cambridge University Press, Cambridge.
National Business Incubation Association, 1998. http://www.nbia.org/facts
Romer, P., 1990. Endogenous technological change. Journal of Political Economy 98, S71–S102.
Romer, P., 1986. Increasing returns and long-run growth. The Journal of Political Economy 94, 1002–1037.
Saxenian, A., 1994. Regional Advantage: Culture and Competition in Silicon Valley and Route 128. Harvard University Press, Cambridge.
Snyder, J., 1987. Map projections: A working manual. In: US Geological Survey Professional Paper Series. United States Government Printing Office, Washington, Paper no. 1532.
Wallsten, S., 1998. Rethinking the small business innovation research program. In: Branscomb, L., Keller, J. (Eds.), Investing in Innovation: Creating a Research and Innovation Policy that Works. MIT Press, Cambridge.
Wallsten, S., 2000. The effects of government-industry R&D programs on private R&D: the case of the Small Business Innovation Research program. RAND Journal of Economics 31, 82–100.