Is there persistence in innovative activities?


International Journal of Industrial Organization 21 (2003) 489–515 www.elsevier.com / locate / econbase

Is there persistence in innovative activities? Elena Cefis* Department of Economics, University of Bergamo, via dei Caniana 2, 24127 Bergamo, Italy Received 10 February 2001; received in revised form 21 November 2001; accepted 5 June 2002

Abstract

This paper examines firm innovative persistence using patent applications of 577 UK manufacturing firms. Non-parametric techniques show that the empirical distributions of patents are neither Geometric nor Poisson. There exists a threshold effect represented by the first patent: the probability of going from zero to one patent is uniformly much lower than that of going from n to n + 1 patents, with n ≥ 1. Transition Probability Matrices show little persistence in general, but strong persistence among ‘great’ innovators that account for a large proportion of patents requested: innovative activities, at least those which are captured by patents, are persistent. There is heterogeneity across industrial and size classification. © 2002 Elsevier Science B.V. All rights reserved.

JEL classification: L20; L60; O31; D21

Keywords: Innovation; Patents; Persistence; Transition probability matrices

1. Introduction The purpose of this paper is to analyse patent time series as a proxy for innovative activities at firm level. The focus in particular is upon the dynamic features of the patent time series. Pertinent questions concern whether one observes e.g. convergence to the mean, persistence or explosive paths.

*Tel.: +39-035-2052-547; fax: +39-035-277-549. E-mail address: [email protected] (E. Cefis). 0167-7187/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. doi:10.1016/S0167-7187(02)00090-5


E. Cefis / Int. J. Ind. Organ. 21 (2003) 489–515

These issues have implications for patterns of firms’ growth, the nature of innovative activities, and industrial dynamics. An important stylised fact concerning the dynamics of firms’ growth is that firms display persistent differences (Dosi et al., 1995; Coriat and Dosi, 1998, Bottazzi et al., 2001, 2002). These differences (or asymmetries) pertain to productivity and costs (Nelson and Winter, 1982; Baily and Chakrabarty, 1985), to profitability (Mueller, 1990; Geroski et al., 1993), and to innovative output (Griliches, 1986; Patel and Pavitt, 1991). What is particularly intriguing is the persistence of these asymmetries, so that for example, firms enjoying higher (lower) profits can be expected to earn higher (lower) profits also in the future: that is to say, profits do not seem to converge to a common rate of return 1 . Persistent asymmetries among firms involve interesting questions, such as what their sources are, why competitive interactions do not make them vanish, and what their consequences are for industrial dynamics. Since technological innovation is one of the ‘driving forces’ of firms’ growth, and, at the same time, there are differences among firms in innovative output, it is of particular interest to ask whether data on innovative activities at the firm level show persistence across firms. The issue of persistence in innovative activities is particularly relevant in the context of the discussion about the properties of the patterns of innovative activities. Since Schumpeter, the economic literature on technological change has developed two main views of the innovation process. Referring to what Schumpeter states in The Theory of Economic Development (1934) (known as the Schumpeter Mark I model) the process of technological change is considered a process of ‘creative destruction’. 
Conversely, referring to the Schumpeter of Capitalism, Socialism, and Democracy (1942) the process is seen as a process of ‘creative accumulation’ (or Schumpeter Mark II model). The difference between the two depends on fundamental assumptions about the properties of technology and of the innovative process 2 . In a rather simplified way, in Schumpeter Mark I technology is equally accessible to everybody and technological change is a random process, driven by a

1 Substantial research effort has been devoted to the examination of profit persistence. Recent literature has addressed the following question: do industrial profit rates eventually converge to a common rate? Several empirical studies have shown that firms display persistent differences in profitability (Mueller, 1990; Geroski and Jacquemin, 1988). That is, profits do not seem to converge to a common rate of return. Moreover, evidence seems to indicate that the adjustment of profits to their firm-specific ‘permanent’ values is rather quick, although a significant variability is observed across different countries (see for example, Odagiri and Yamawaki, 1990; Cubbin and Geroski, 1987). However, it is hard to say to what extent the observed persistence in profitability differentials reflects the persistence of differential ‘efficiency’ levels which are not eroded away by the competitive process. Only very recently have attempts been made to link the strong inertia of firm profits to the persistence in innovation activities (Cefis, 2001).

2 See Dosi (1988), Martin (2001, ch. 14) and Winter (1984).


population of homogeneous firms who have a certain probability of realizing technological opportunities. Innovation generates monopoly power that is at best only temporary, since it is quickly challenged and eroded by competitors. Since the relevant knowledge base is easily available, new innovators systematically substitute for incumbents and typical innovators are small, newly established firms. Conversely, in Schumpeter Mark II technical knowledge has a strong tacit component and is highly specific to individual firms. Innovation results from the accumulation of technological competencies by heterogeneous firms. Firm-specific technical change is cumulative, meaning that the generation of new knowledge builds upon what has been learned in the past and accumulated competencies significantly constrain the future technological performance of the firm. Over time, the firm-specific, tacit, and cumulative nature of the knowledge base builds high barriers to entry. A few (large) firms eventually come to dominate the market in a stable oligopoly. Thus, the presence or absence of persistence in innovative activities is a major property of the innovative process and an important feature of the patterns of technological change, which has significant implications for both theory 3 and policy-making. Because of its cumulative nature, technological change is usually characterised by dynamic increasing returns (learning-by-doing, learning-to-learn, research breeds new opportunities, etc.). Persistence in innovative activities might mean that technological change could be one source of increasing returns that can support persistent growth (Barro and Sala-i-Martin, 1995). Persistence, in general, would give some support to the ‘competence-based’ theory of the firm at the microeconomic level (Nelson and Winter, 1982; Teece and Pisano, 1994), as well as to endogenous growth theories at the macroeconomic level. 
On the contrary, persistence in innovative activities would weaken those interpretations of the processes of growth of firms, industries and countries (ranging from simple Gibrat-type processes to the models of the real business cycle) where dynamics is essentially driven by small uncorrelated shocks. More generally, understanding whether innovative activities are persistent or not at the firm level would constitute an important piece of evidence for founding and improving current theories of industrial dynamics and evolution, where some forms of dynamic increasing returns play a major role in determining degrees of concentration and its stability over time, rates of entry and exit and so forth (Klepper, 1996; Jovanovic, 1982; Hopenhayn, 1992; Nelson and Winter, 1982; Dosi et al., 1995).

3 Models as different in inspiration as those of Nelson and Winter (1982) and Ericson and Pakes (1992) show that these two alternative patterns of technological change can be interpreted as two faces of the stochastic process which drives technological accumulation at the firm level and thereby drives the dynamics of the industry.


However, very little is known about the relative empirical relevance of persistence in innovative activities. Recently, a few studies have begun to provide some (and somewhat contrasting) empirical results. Malerba and Orsenigo (1999), examining the patterns of innovative dynamics, find that occasional innovators (firms who patent just once) constitute a large part of the whole population of innovators but a much lower share of the total number of patents at any given time. Furthermore, large innovators (both in terms of patents and employees) tend to remain large over time and constitute a relatively stable core that persistently generates innovative activities, accounting for a very large share of total patents. Conversely, Geroski et al. (1997) estimate a Proportional Hazards Model of the probability that the spell of time in which a firm innovates will end at any particular moment. They find little evidence of persistence at the firm level. Very few firms innovate persistently and this happens only after a threshold level (5 patents or 3 major innovations) which only a few firms ever reach. In the present study I analyse the persistence properties of patent data, using a non-parametric approach based on Transition Probability Matrices (TPMs). The data set is composed of patents requested from the European Patent Office by a random sample of 577 UK manufacturing firms during the period 1978–1991. Examining the properties of the empirical distributions of patents, I found that patent distributions are not geometric. They do not display the lack-of-memory property and exhibit negative duration dependence. In every year of the sample there appears to be a threshold effect represented by the first patent. It is much more difficult to apply for the first patent than to go from n to n + 1 patents, with n ≥ 1.
The transition probabilities across states suggest that in general there is little persistence in innovative activities, but at the same time there is evidence of ‘bimodality’ in the estimated TPMs, especially as the transition period is longer. In other words, there is a strong persistence in remaining in the polar states, namely in the state in which firms do not apply for a patent every year or in the state in which firms apply for many patents (at least 6). The ‘great’ innovators (firms that apply for at least six patents per year) are very few in number (2.37% on average), but they account for the large majority of patents requested (77.85% on average). These results suggest that innovative activities, at least those that are captured by patents, are persistent. Finally, there is evidence of heterogeneity across the dimensions of the sample, industrial and size classification. The mechanical engineering sector shows low persistence and low bimodality, while the chemical industry shows high persistence and high bimodality, implying that innovative activities are sector specific. Large firms show more persistence than small firms, in line with the Schumpeter Mark II hypothesis. The rest of the paper is organised as follows: the next section describes the data, while Section 3 describes the methodologies used to analyse the data. Section 4 presents the results and Section 5 the sensitivity analysis. Conclusions follow.


2. The data

The data set of this study is a time-series of patents requested by 577 U.K. manufacturing firms during the period 1978–1991. The random sample of firms was chosen from the population of 6052 U.K. firms that from 1978 to 1991 have applied 4 for at least one patent to the European Patent Office 5 . In addition, for the year 1991, there are variables concerning firm size, industrial classification and other firm characteristics (quoted vs. non-quoted, independent vs. subsidiary). The number of employees was chosen as a proxy for size. According to the number of employees in 1991, firms were divided into four size groups: small (from 1 to 99 employees), medium (100–499), medium-large (500–999), and large (at least 1000 employees). Firms were grouped according to two-digit industrial classification, although information at four-digit level for each firm was available (provided by ICC and Datastream), mainly because firms are usually very diversified at four-digit level (and it is very difficult to assign a firm to a particular four-digit industry 6 ), and, secondly, because to undertake a meaningful empirical analysis it is necessary to have sub-groups with a sufficient number of observations. Because of these data constraints, only four sectors were analysed (namely, chemical, 69 firms; mechanical engineering, 132; electrical and electronic machinery, 100; and instrument engineering, 56). The histograms of the patents for every year in the sample are very similar to each other: there is a large mass at zero and a very long tail 7 . The proportion of firms that do not apply for any patent during one year ranges from 97% in 1978 to 77% in 1989: on average, the large majority of the firms, 83%, do not apply for a patent. Except for 1978 and 1979, at least 13.9% of the firms apply for at least 1

4 Since the purpose of the paper is to analyse the properties of patent data as a proxy of firm innovative activities, patent applications per firm were chosen instead of the number of patents actually issued. Applying for a patent is costly from the point of view of the firm, but even when the patent is not issued the firm receives from the Patent Office a detailed report on the latest state of the art in the related technological field, which constitutes a valuable source of information for firms who are active on the technological frontier. On average, 85% of patent applications were actually issued by the European Patent Office.

5 The data on patent applications from the European Patent Office data-bank were kindly provided by the CESPRI of the Bocconi University, Milan. I thank Franco Malerba and Luigi Orsenigo of Bocconi University, who allowed me to use these data.

6 Typically, there are two ways to get around this problem. One is to assign to each firm the four-digit industry data for its principal industry. The other is to use weighted average industry data for each firm, with the weights approximating the share of the firm’s activities (sales, employment, profits) in each industry. Neither of these solutions is possible in this case since information either on the share of a firm’s various activities or on its main branch of activity is not available for many firms in the sample, especially for non-quoted firms that represent 71% of the overall sample.

7 This feature explains the high values of the skewness and the kurtosis of all patent variables. Descriptive statistics may be found in the supplementary material on the IJIO’s Editorial Office web site.


patent per year; while if I include 1978 and 1979 I obtain that, on average, 10% apply for a patent and 6% apply for more than one patent per year. The frequency distribution of total patents requested over the entire period 1978–1991 shows that 53% of the firms request only one patent over the entire period, 16% two patents and 15% apply for between half a dozen and two hundred patents. The figure that clearly emerges from this data is that about half of the firms who patent, did so only once, while 16% produce at least one patent per year on average. Reorganising the data according to firm characteristics provides interesting additional evidence. On average, 17.5% of quoted firms apply for at least 1 patent every year, compared with only 9% of the non-quoted firms. The difference between the two averages is statistically significant, while the difference between the same averages for the independent firms (25%) and the subsidiaries firms (24%) is not significant. While it matters that a firm is quoted or non-quoted, it makes no difference with respect to the ‘production’ of patents that it is independent or subsidiary. Indeed, the subsidiary sub-group includes research laboratories of large business firms / groups (when they are an autonomous firm) that apply for many patents per year. The distribution of patents by industry is very stable over time. On average, 25% of the firms in chemical industry have at least 1 patent per year, 16.6% in mechanical engineering, 17% in electrical and electronic machinery, and slightly above 20% in instrument engineering. Cumulatively, 23% of firms have at least 6 patents in chemical industry, 14% in mechanical engineering, 15% in electrical and electronic machinery, and 18% in instrument engineering. 
The data show that the chemical industry has the highest propensity to patent, although to put these numbers into a correct perspective we have to keep in mind that these are the four major innovative sectors of the industry classification.

3. Methodology

3.1. The distributional properties of patent data

To analyse the distributional properties of patent data I examined the empirical distributions of patents for every year in the sample. In particular, I am interested in testing whether these distributions are geometric or Poisson. If patent distributions are geometric, they will display ‘lack of memory’ and a constant hazard rate, while if they are Poisson it means that the patenting process is random with a given intensity (or mean capacity) to patent and certainly cannot be cumulative or path-dependent. In order to test these hypotheses, I assume that the individual variables, the number of patents requested each year by a firm, are independent, which is a strong assumption to make in view of the rest of the analysis. To partially overcome this limitation, I perform the same tests on the distribution of the cumulative sum of patents for every firm over the 14 years of the sample.
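The testing step described in this subsection can be illustrated with a short sketch. It is not the procedure actually used in the paper: the pooling of the right tail into a single "max_cell or more" cell, the moment-based parameter estimates, the function names and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter

def poisson_pmf(x, lam):
    """Poisson probability of exactly x events with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def geometric_pmf(x, p):
    """Geometric probability on x = 0, 1, 2, ..., so F(x) = 1 - (1 - p)**(x + 1)."""
    return p * (1 - p) ** x

def pearson_chi2(counts, pmf, max_cell):
    """Pearson statistic over cells 0..max_cell-1 plus a pooled tail cell.

    The pooled tail ('max_cell or more') is an assumption of this sketch.
    """
    n = len(counts)
    observed = Counter(min(c, max_cell) for c in counts)
    probs = [pmf(x) for x in range(max_cell)]
    probs.append(1.0 - sum(probs))  # probability of the pooled tail cell
    return sum((observed.get(k, 0) - n * pk) ** 2 / (n * pk)
               for k, pk in enumerate(probs))

def fit_and_test(counts, max_cell=6):
    """Fit both candidate distributions by matching the mean, then test each."""
    mean = sum(counts) / len(counts)
    lam = mean              # Poisson: the mean equals lambda
    p = 1.0 / (1.0 + mean)  # geometric on {0, 1, ...}: mean = (1 - p) / p
    return {
        "poisson": pearson_chi2(counts, lambda x: poisson_pmf(x, lam), max_cell),
        "geometric": pearson_chi2(counts, lambda x: geometric_pmf(x, p), max_cell),
    }

# Hypothetical yearly patent counts for 100 firms: large mass at zero, long tail.
patents = [0] * 83 + [1] * 8 + [2] * 3 + [3] * 2 + [5, 12, 40, 150]
stats = fit_and_test(patents)
print(stats)
```

Each statistic would then be compared with a chi-square critical value whose degrees of freedom equal the number of cells minus one minus the number of estimated parameters. With data of this shape both statistics are large, and the Poisson one is far larger because a single mean cannot accommodate both the mass at zero and the long tail.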

3.2. Persistence properties I measure persistence as a firm’s probability of remaining in its initial state. Namely, if a firm is a systematic innovator (that is, applies for at least one patent in a year) I am interested in knowing what is the probability that it remains a systematic innovator as time passes 8 . To investigate whether patents data show some persistence, one could simply model the data for each firm as an autoregressive model and estimate the persistence parameter by standard econometric tools. However, given the shortness of the patents time series, standard OLS regression tools give a biased estimate of the true persistence parameter in small samples. One needs to exploit both the cross sectional and the time series information of the sample. An econometric strategy suggested by Quah (1993a,b) deals with the dynamics and cross-section dimensions, based on what is called Random Fields. At each point in time there is a cross-section distribution of firms’ patents, which is the realisation of a random element in the space of distributions. The idea is to describe their evolution over time, which will allow us to analyse intra-distribution mobility and persistence of the firms’ innovative activities. In the context of Random Fields, the realisation of a random element is a cross-section distribution function that can be estimated from the data (Silverman, 1986, Section 2.10). However, there are two limitations of the distribution functions in this context. One is that persistence is generally a dynamic concept while the cross-section distributions are point-in-time estimates, available for 1978–1991. Further, the distribution functions do not give any information about the firm’s relative situation and its movement over time. To deal with these limitations, it is necessary to derive a law of motion for the cross-section distributions in a more formal structure. 
Let $F_t$ denote the distribution of patents across firms at time $t$, and let us describe the evolution of $\{F_t : \text{integer } t\}$ by the law of motion:

$$F_{t+1} = P \cdot F_t \qquad (1)$$

where $P$ maps one distribution into another, and tracks where points in $F_t$ end up in $F_{t+1}$. Eq. (1) is analogous to a standard first-order autoregression, except that its

8 It is worth emphasising that, since ‘innovators’ are defined as firms that applied at least once for a patent in the period of observation, our sample consists only of innovating firms. The purpose of the paper is to detect whether the firms that innovate do so persistently. Therefore, problems of endogenous sample selection do not arise because properties of the total population of firms are not inferred from this sample but only properties of the sub-group of innovating firms.


values are distributions (rather than scalars or vectors of numbers), and it contains no explicit disturbance or innovation. By analogy with autoregression, there is no reason why the law of motion for $F_t$ need be first-order, or why the relation need be time-invariant. Nevertheless, (1) is a useful first step for analysing the dynamics of $\{F_t\}$, and afterwards I will drop these assumptions and test the robustness of the results to these hypotheses. Operator $P$ of (1) can be approximated by assuming a finite state space 9 for firms $S = \{s_1, s_2, \ldots, s_r\}$, where $s_i$ ($i = 1, \ldots, r$) are the possible states. In this case $P$ is simply a Transition Probability Matrix (TPM). $P$ encodes the relevant information about mobility and persistence within the cross-section distributions. Therefore, assuming that the law of motion for $F_t$ is time-invariant and of first-order, the one-step transition probability is defined by:

$$p_{ij} = P(X_{t+n} = j \mid X_t = i) \qquad (2)$$

with $t = 1978, 1979, \ldots, 1993$ and $n = 1, 5, 10$ years. The TPM $P$ is the matrix with $p_{ij}$ as elements measuring the probability of moving from state $i$ to state $j$ in one period. This TPM offers useful information for analysing persistence since it measures the probability that a firm goes from one state to another in one period. A state is identified by the number of patent applications filed each year. TPMs can be computed on the percentiles of the empirical distribution or on arbitrary bounds selected by the user. The analysis focuses on the transition of firms from the state in which they do not apply for a patent in a given year to the state in which they apply for at least one patent in the subsequent year. Within this latter state, the attention focuses on the transitions between states in which firms applied for a low, medium or great number of patents. Subsequently, two- and four-state TPMs are computed. In the two-state TPM, the first state is defined as having requested no patents at all in a year (what is called the ‘occasional innovator’ state), while the second one represents having requested at least one patent (the ‘systematic innovator’ state). In the four-state TPM, the states were selected as follows: first state (occasional innovator): having requested no patents at all; second state (small innovator): having requested 1 patent; third state (medium innovator): having requested from 2 to 5 patents; fourth state (great innovator): having requested at least six patents. Once TPMs have been obtained, the first-order autoregressive parameter implied

9 Suppose a moving particle and denote its range by $I$. This may be a finite or infinite set of integers and it may be an arbitrary countable set of elements, provided that the definition of random variables is extended to take values in such a set. Let us call $I$ the state space and an element of it a state. The particle moves from state to state. There is a set of transition probabilities $p_{ij}$ such that, if the particle is in state $i$ at any time, the probability that it will be in state $j$ after one step is given by $p_{ij}$.


by each chain is calculated, as suggested in Amemiya (1994). This will be used as a synthetic measure of persistence in innovative activities. Let $x_t$ be a stochastic process approximated by a two-state Markov chain with transition probabilities:

$$P[X_t = j \mid X_{t-1} = i] = \begin{bmatrix} p & 1-p \\ 1-q & q \end{bmatrix}. \qquad (3)$$

The implied AR(1) process for $x_t$ can be constructed as:

$$x_t = (1 - q_1) + \rho_1 x_{t-1} + v_t, \qquad (4)$$

where $\rho = p + q - 1$. According to our definition, there is persistence in innovative activities if $\rho$ is greater than 0. This certainly happens when both the diagonal elements of the TPM are larger than 0.5. When the elements of the main diagonal are all equal to 1, there is perfect persistence. Conversely, if $p$ and $q$ are both smaller than 0.5, $\rho < 0$, that is, there is a tendency to revert from one state to the other in every period, and the innovative activities could be characterised as non-persistent. TPMs are computed for three different period lengths: (i) 1 year; (ii) 5 years to capture medium run dynamics; (iii) 10 years to illustrate the long run dynamics. Therefore, three different first-order autoregressive parameters $\rho_1$, $\rho_5$, and $\rho_{10}$ are calculated which measure persistence in the following AR(1) processes:

$$x_t = (1 - q_1) + \rho_1 x_{t-1} + v_t$$
$$x_t = (1 - q_5) + \rho_5 x_{t-5} + v_t$$
$$x_t = (1 - q_{10}) + \rho_{10} x_{t-10} + v_t$$

Once the TPMs of interest have been calculated, the resulting probabilities are estimates of the transition probabilities. Then a non-parametric approach is used to assess the accuracy of these estimates. This approach consists in applying the bootstrapping methodology to the transition matrices to find the standard errors associated with transition probabilities (see Cefis, 1999 for bootstrapping applied to TPMs).
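The estimation steps of this subsection can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: transitions are pooled over all firms and starting years (the time-invariant, first-order case), and the function names, the `bounds` parameter and the toy panel are hypothetical.

```python
def state_of(patents, bounds=(1, 2, 6)):
    """Map a yearly patent count to a state index.

    The default bounds give the four states in the text:
    0 patents, 1 patent, 2 to 5 patents, and 6 or more patents.
    """
    return sum(patents >= b for b in bounds)

def estimate_tpm(panel, lag=1, bounds=(1, 2, 6)):
    """Row-normalised transition frequencies over `lag` years.

    panel: one sequence of yearly patent counts per firm.
    Rows with no observations are left as zeros.
    """
    r = len(bounds) + 1
    counts = [[0] * r for _ in range(r)]
    for series in panel:
        for t in range(len(series) - lag):
            i = state_of(series[t], bounds)
            j = state_of(series[t + lag], bounds)
            counts[i][j] += 1
    tpm = []
    for row in counts:
        total = sum(row)
        tpm.append([c / total if total else 0.0 for c in row])
    return tpm

def implied_rho(tpm2):
    """AR(1) coefficient implied by a two-state chain: rho = p + q - 1."""
    return tpm2[0][0] + tpm2[1][1] - 1.0

# Hypothetical toy panel: four firms observed for six years.
panel = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [7, 9, 6, 8, 10, 7],
    [1, 2, 0, 0, 3, 1],
]
two_state = estimate_tpm(panel, lag=1, bounds=(1,))
four_state = estimate_tpm(panel, lag=1, bounds=(1, 2, 6))
rho_1 = implied_rho(two_state)
print(two_state, rho_1)
```

Setting `lag=5` or `lag=10` gives the medium and long run matrices in the same way. Standard errors could be approximated by resampling firms with replacement and re-estimating the matrix on each resample, which is one plausible reading of the bootstrap step and is not confirmed by the text.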

4. Results

4.1. Examining the distributional properties of patent data

Examining the histograms of the patents for every year in the sample suggests that patents data could come from a geometric distribution. I use a graphical procedure, namely a Q–Q plot (Wilk and Gnanadesikan, 1968), in order to understand if the assumption is reasonable 10 . Assuming that the empirical patent distributions are geometric distributions with distribution function $F(x) = 1 - (1 - p)^{x+1}$ with $x \ge 0$, I have plotted $-\ln(1 - F(x)) = -(x + 1)\ln(1 - p)$, where $F(x)$ is the empirical distribution function calculated in $x$ (the number of patents requested in a given year). If the theoretical distribution assumed were correct, the graph would look approximately like a straight line. The Q–Q plots were constructed for all the years on the overall sample and on each industrial and size sub-sample. Neither the aggregate plot (Fig. 1), nor the disaggregate ones (as an example see Fig. 2 11 ) display a straight line: the patent data, both at aggregate and at disaggregate level, do not come from a geometric distribution. Indeed, the Pearson $\chi^2$ test rejects the null hypothesis of the geometric distribution as parent distribution for the overall sample as well as for all sub-samples, confirming the results of the graphical analysis (see Table 1) 12 . The result is primarily due to the very long right tail that the patents distribution

Fig. 1. Q–Q plot for the overall sample (year 1983).

10 A Q–Q plot on linear rectangular coordinates is a collection of two-dimensional points specifying corresponding quantiles from two distributions. Typically, one distribution is empirical and the other is a hypothesised theoretical one. The primary purpose is to determine visually if the data could have arisen from the given theoretical distribution. If the empirical distribution is similar to the theoretical one, the expected shape of the plot is approximately a straight line; conversely, large departures from linearity suggest a different distribution and may indicate how the distributions differ.

11 The other Q–Q plots, as well as the histograms, may be found in the supplementary material on the IJIO’s Editorial Office web site.

12 The Table reports only the results for the year 1983 and for the cumulative sum of the patents. The other results may be found in the supplementary material on the IJIO’s Editorial Office web site.


Fig. 2. Q–Q plot for sub-samples (year 1983).


Table 1
The Pearson χ² test (year 1983)

Sub-sample                 | H0: Poisson | H0: Geometric | Cumulative sum of patents, H0: Poisson | Cumulative sum of patents, H0: Geometric
Small firms                | 1.60e+14    | 823.56        | 60 457                                 | 189.31–0.06
Large                      | 7.00e+11    | 404.77        | 5.49e+15                               | 857.370
Chemical                   | 854 010     | 142.69        | 2.89e+31                               | 124.53
Mechanical engineering     | 109 800     | 224.04        | 49 236 000                             | 182.80
Electrical and electronics | 50 704 000  | 146 610       | 232 160                                | 394.93
Instrumental engineering   | 3 165 500   | 120.88        | 993.33                                 | 121.51

displays for every year in the sample. The geometric distribution is also rejected for the cumulative variable. Rejecting the geometric distribution means that our distributions do not possess the ‘lack of memory’ property. This property is formally defined by the following condition:

$$P(X - u \ge z \mid X \ge u) = \frac{P(X \ge u + z)}{P(X \ge u)} = P(X \ge z) \quad \text{for arbitrary } z, u \ge 0 \qquad (5)$$

The property means that, with origin at any value $u$, the distribution function is unchanged. That is, truncation on the left makes no difference. In our case, on the contrary, truncation on the left makes a difference (that is, the distribution function depends on the threshold I choose): the probability of getting an additional patent is always larger if a firm has already applied for at least one patent than if it has not patented at all. For example, for 1984, $P(x \ge 7 \mid u = 6) = 0.83$ while $P(x \ge 1 \mid u = 0) = 0.104$. This result holds for each patent distribution and no matter how we rearrange the sample. In addition, our distributions do not display another characteristic property of a geometric distribution, the ‘constant hazard rate’ property. Roughly, the hazard rate is the rate at which spells are completed after duration $x$, given that they last until $x$. The hazard function is $h(x) = f(x) / (1 - F(x))$, where $F(x)$ is the distribution function and $f(x)$ the density at $x$ of the number of patents requested. For the geometric distribution $h(x)$ is a constant, for each $x$. The geometric distribution is the only discrete distribution that possesses this property. In this case, the property states that the probability that a firm stops applying for an additional patent does not depend on $x$ (the number of patents already requested), but is always the same. Patent distributions display decreasing hazard rates or negative duration dependence. For example, for 1984 for the overall sample, I obtain: $h(0) = 4.43$, $h(1) = 1.3$, $h(2) = 0.58$, $h(3) = 0.46$, $h(4) = 0.25$, $h(5) = 0.08$ and so on, with the hazard rate decreasing as the number of patents requested increases.
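The hazard function defined above can be computed directly from a sample of yearly patent counts. The sketch below is illustrative only: it takes $F(x)$ to be $P(X \le x)$, which is an assumption about the convention used, and the function names and toy data are hypothetical.

```python
from collections import Counter

def empirical_hazard(counts, max_x):
    """h(x) = f(x) / (1 - F(x)) for x = 0..max_x, with F(x) = P(X <= x).

    Integer frequencies are used so that the ratio is exact up to one division.
    """
    n = len(counts)
    freq = Counter(counts)
    hazards, cumulative = [], 0
    for x in range(max_x + 1):
        cumulative += freq.get(x, 0)
        remaining = n - cumulative  # firms with more than x patents
        hazards.append(freq.get(x, 0) / remaining if remaining else float("inf"))
    return hazards

def geometric_hazard(p, max_x):
    """The same ratio for a geometric pmf f(x) = p (1 - p)**x on x = 0, 1, ..."""
    return [(p * (1 - p) ** x) / ((1 - p) ** (x + 1)) for x in range(max_x + 1)]

# Hypothetical counts for 100 firms: large mass at zero and a long right tail.
patents = [0] * 83 + [1] * 8 + [2] * 3 + [3] * 2 + [5, 12, 40, 150]
h_emp = empirical_hazard(patents, 3)
h_geo = geometric_hazard(0.3, 5)
print(h_emp)
print(h_geo)
```

Under this convention the geometric hazard equals $p/(1-p)$ at every $x$, while the toy sample shows the sharp drop from $h(0)$ to $h(1)$ that the text describes as negative duration dependence.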


For each year, the hazard rate drastically decreases from h(0) to h(1) and then decreases more smoothly. This feature, jointly to the fact that the probability of getting an additional patent is always larger if a firm has already applied for at least one patent than if it has not patented at all, suggests that there is a threshold to patenting, where the threshold is represented by the first patent. A firm that patents a lot has a higher probability of getting an additional patent than a firm that has not patented at all. In other words, it is much more difficult to obtain the first patent than to go from n to n 1 1 patents, with n > 1. These characteristics are maintained no matter how I rearrange the samples into groups according to firms characteristics 13 . Quite often in the economic literature, the patenting process is represented as a Poisson process. The Poisson distribution is the counting distribution for the corresponding Poisson process. The process continues over a certain index set of time or space where specified rare events occur at random in the index with some fixed mean rate. Let X(t) be the number of occurrences of the specified events (for example, the number of the patents requested in a year) in the time interval [0, t) with t 5 0 and X(0) 5 0. Usually, the standard postulates reported in the supplementary material are made to define the Poisson process. Under the standard postulates the probability qx (t) of exactly x events (in our case, the probability to apply for exactly x patents) occurring in the time interval [t, t 1 t) is qx (t) 5 ( lt)x e 2 l t /x! for x 5 0, 1, 2, . . . ., which is nothing but the Poisson distribution with parameter lt, and hence X(t) is called the Poisson process with intensity (or mean) l (in this case l represents the propensity, or better, the average capacity to patent). To get the form of qx (t) heuristically, the following intuitive approach is available. 
If the time interval $[0, t)$, for example a given year, is divided into $n$ disjoint intervals of equal length $h = t/n$ (they can be days, working hours, etc.), then, under the postulates above, the probability of obtaining a certain number of patent applications, $x$, in a year, $t$, is given by a Poisson distribution. In other words, if the patenting process is Poisson, the patent data should have a Poisson distribution. To verify whether the empirical patent distributions are Poisson distributions, I performed a Pearson $\chi^2$ test, in which the null hypothesis is that the patent data come from a Poisson distribution. As Table 1 shows, the null hypothesis is rejected for the overall sample as well as for the industrial and size sub-samples. Rejecting the Poisson distribution means that the postulates on which a Poisson process is based may not hold. In particular, if the patenting process is Poisson, it means that the process is certainly not cumulative and path-dependent, due to the stationarity assumption (the probability of a number of events occurring in a time interval depends only on the number of events and on the length of the interval, but not on time) and the independence assumption (the numbers of events occurring in non-overlapping time intervals are mutually independent [14]). Rejecting the Poisson distribution does not mean that the patenting process is cumulative or path-dependent, but, at least, we cannot exclude the hypothesis.

[13] These results are not in contrast with those of Geroski et al. (1997). They found that there exists a threshold represented by the 5th patent or the 3rd major innovation beyond which ‘‘some form of ‘dynamic scale economies’ may govern the production of patents’’. This effect is captured later in this study, where it is shown that estimated TPMs display strong ‘bimodality’ (see footnote 15), that is, ‘great’ innovators tend with a very high probability to remain great innovators.
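The goodness-of-fit check described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure behind Table 1: the cell boundaries (0 to 4 patents plus an open-ended cell), the use of the sample mean as the estimate of $\lambda$, and the toy data are all hypothetical choices made here.

```python
import math
from collections import Counter


def poisson_pmf(x, lam):
    """Poisson probability of exactly x events with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def pearson_chi2_poisson(counts, max_cell=5):
    """Pearson chi-square statistic for H0: counts ~ Poisson(lam_hat).

    Cells are 0, 1, ..., max_cell - 1 plus an open-ended cell
    'max_cell or more'. lam_hat is the sample mean (an illustrative
    choice, assumed to be positive). Returns (statistic, dof).
    """
    n = len(counts)
    lam_hat = sum(counts) / n
    observed = Counter(min(c, max_cell) for c in counts)

    # Expected cell probabilities under the fitted Poisson distribution.
    probs = [poisson_pmf(x, lam_hat) for x in range(max_cell)]
    probs.append(1.0 - sum(probs))  # open-ended upper cell

    stat = sum(
        (observed.get(cell, 0) - n * p) ** 2 / (n * p)
        for cell, p in enumerate(probs)
    )
    dof = len(probs) - 1 - 1  # cells - 1 - one estimated parameter
    return stat, dof


# Made-up yearly patent counts: many zeros plus a few heavy patenters.
toy_counts = [0] * 80 + [1] * 8 + [2] * 4 + [3] * 2 + [10] * 3 + [25] * 3
stat, dof = pearson_chi2_poisson(toy_counts)
print(f"chi2 = {stat:.1f} on {dof} degrees of freedom")
```

With 4 degrees of freedom the 5% critical value of the $\chi^2$ distribution is 9.488, so a statistic above that value rejects the Poisson null for the toy data. In practice cells with very small expected counts would normally be pooled before the statistic is computed.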

4.2. Persistence properties

Considering first the overall sample (see Table 2), the two-state TPMs show a strong persistence of the occasional innovator state: a firm that has not applied for a patent in a given year $t$ will, with high probability (0.8769), not apply for one in the following year (if a one-year transition period, $t + 1$, is considered). Besides, there is a tendency (0.5525) to revert towards the state in which firms do not apply for a patent if they started from the state of applying for at least one patent in a year. The longer the transition period, the stronger the reversion. Indeed, the autoregressive coefficient goes from 0.3244 for transitions over 1 year to 0.1103 for transitions over 10 years (see Table 7). The four-state TPMs show that there is little persistence in general (not all the diagonal elements are substantially larger than 0.5), but there is a high probability of remaining in the polar states (for the one-year transition: 0.8769 and 0.7841), which with some abuse of terminology I refer to as bimodality [15]. In other words, there is a very strong persistence in remaining in the polar states (the occasional innovator state and the great innovator state). Note also that over a 10-year period there is a probability of 0.20 that a firm applying for at least 6 patents moves to the state in which it does not apply for a patent, while there is only a probability of 0.01 of going the other way around: it is easier to lose the knowledge and organisational capabilities necessary to innovate than to acquire them, even in the long run.

Concerning the firm classification by industrial sectors (Tables 3 and 4 [16]), the two-state TPMs show that in the chemical industry, and contrary to the other three sectors, there is a strong persistence of remaining in the state in which the firm started regardless of the length of the transition period. For the other three industries there is a stronger tendency to go from the state in which firms apply for at least one patent to the state in which firms do not apply for a patent, than to

[14] See The Poisson Process, postulates 1 and 2, in the supplementary material on the IJIO’s Editorial Office web site.
[15] It is worth emphasising that by the term ‘bimodality’ I am not referring to a bimodal distribution; it is only a way of saying that in a TPM the probabilities at the extremes of the main diagonal are much higher than the other diagonal probabilities.
[16] Only the results on the chemical and the mechanical sectors are reported, as well as those on the large and small firms sub-samples. The other results can be found in Cefis (1999) or in the supplementary material on the IJIO’s Editorial Office web site.


Table 2
Overall sample

(a) Two-state transition probabilities

                        No patents        Patents
One year transition
  No patents            0.8769 (0.0032)   0.1231 (0.0032)
  Patents               0.5525 (0.0271)   0.4475 (0.0271)
Five year transition
  No patents            0.8339 (0.0062)   0.1661 (0.0062)
  Patents               0.6341 (0.0320)   0.3659 (0.0320)
Ten year transition
  No patents            0.8130 (0.0104)   0.1870 (0.0104)
  Patents               0.7027 (0.0486)   0.2973 (0.0486)

(b) Four-state transition probabilities

                        No patents        1 Patent          2–5 Patents       At least 6 patents
One year transition
  No patents            0.8769 (0.0032)   0.0985 (0.0027)   0.0227 (0.0019)   0.0019 (0.0006)
  1 Patent              0.7436 (0.0179)   0.1556 (0.0131)   0.0893 (0.0101)   0.0115 (0.0042)
  2–5 Patents           0.3734 (0.0350)   0.2215 (0.0226)   0.3070 (0.0324)   0.0981 (0.0182)
  At least 6 patents    0.0227 (0.0127)   0.0398 (0.0157)   0.1534 (0.0357)   0.7841 (0.0477)
Five year transition
  No patents            0.8339 (0.0608)   0.1219 (0.0044)   0.0365 (0.0034)   0.0077 (0.0019)
  1 Patent              0.7938 (0.0219)   0.0804 (0.0117)   0.0928 (0.0145)   0.0330 (0.0110)
  2–5 Patents           0.5127 (0.0429)   0.1624 (0.0265)   0.1878 (0.0342)   0.1371 (0.0353)
  At least 6 patents    0.1238 (0.0519)   0.0381 (0.0203)   0.1333 (0.0407)   0.7048 (0.0798)
Ten year transition
  No patents            0.8130 (0.0100)   0.1270 (0.0741)   0.0460 (0.0055)   0.0139 (0.0037)
  1 Patent              0.8500 (0.0355)   0.0071 (0.0072)   0.0786 (0.0238)   0.0643 (0.0263)
  2–5 Patents           0.5962 (0.0836)   0.0192 (0.0190)   0.1346 (0.0528)   0.2500 (0.0718)
  At least 6 patents    0.2000 (0.0118)   0.0667 (0.0555)   0.0667 (0.0493)   0.6667 (0.1401)

N.B.: Rows give the initial state, columns the final state. Standard errors in parentheses.

remain in the state in which they apply for at least one patent each year. This tendency is stronger the longer the transition period. Firms in the mechanical engineering industry show a much higher probability of not applying for a patent, given that they started by applying for at least one patent (for $t + 1$, 0.6748; for $t + 5$, 0.7625; and for $t + 10$, 0.9623), than firms belonging to all other industries. The four-state TPMs show a very strong bimodality in the estimated TPMs of the chemical industry: the two polar state probabilities are very high, and bimodality becomes stronger as the transition period lengthens. On the other hand, the mechanical engineering industry has a very strong tendency toward the state in


Table 3
Chemical industry

(a) Two-state transition probabilities

                        No patents        Patents
One year transitions
  No patents            0.8793 (0.0096)   0.1207 (0.0096)
  Patents               0.2869 (0.0518)   0.7131 (0.0518)
Five year transitions
  No patents            0.8149 (0.0202)   0.1851 (0.0202)
  Patents               0.2781 (0.0637)   0.7219 (0.0637)
Ten year transitions
  No patents            0.7634 (0.0305)   0.2366 (0.0305)
  Patents               0.2308 (0.0718)   0.7692 (0.0718)

(b) Four-state transition probabilities

                        No patents        1 Patent          2–5 Patents       At least 6 patents
One year transitions
  No patents            0.8793 (0.0095)   0.0913 (0.0070)   0.0232 (0.0074)   0.0062 (0.0030)
  1 Patent              0.6667 (0.0593)   0.1609 (0.0395)   0.1609 (0.0351)   0.0115 (0.0111)
  2–5 Patents           0.1818 (0.0651)   0.1818 (0.0421)   0.4394 (0.0896)   0.1970 (0.0533)
  At least 6 patents    0.0204 (0.0173)   0.0204 (0.0184)   0.0918 (0.0348)   0.8664 (0.0510)
Five year transitions
  No patents            0.8149 (0.0200)   0.1298 (0.0135)   0.0447 (0.0116)   0.0106 (0.0049)
  1 Patent              0.6383 (0.0838)   0.0638 (0.0292)   0.1277 (0.0451)   0.1702 (0.0772)
  2–5 Patents           0.2391 (0.0678)   0.1304 (0.0507)   0.3043 (0.0898)   0.3261 (0.1226)
  At least 6 patents    0.0172 (0.0220)   0.0000 (0.0000)   0.1034 (0.0507)   0.8793 (0.0590)
Ten year transitions
  No patents            0.7634 (0.0302)   0.1563 (0.0239)   0.0402 (0.0143)   0.0402 (0.0135)
  1 Patent              0.5714 (0.1450)   0.0000 (0.0000)   0.2143 (0.1070)   0.2143 (0.0999)
  2–5 Patents           0.2105 (0.0907)   0.0526 (0.0510)   0.2632 (0.1142)   0.4737 (0.1323)
  At least 6 patents    0.0000 (0.0000)   0.0526 (0.0692)   0.0526 (0.0555)   0.8947 (0.0946)

which firms do not apply for any patent. This tendency becomes stronger as the transition period lengthens. In fact, over a 5-year period, firms in the mechanical engineering sector have a 0.50 chance of not applying for a patent (0 patents) when starting as ‘great’ innovators (at least 6 patents), while over a 1-year period they have a 0.50 probability of remaining ‘great’ innovators. In the mechanical engineering industry innovative activities appear to be very temporary, while in the chemical industry they appear to be very persistent. Indeed, for the chemical industry the first-order autoregressive coefficient is 0.5924 over a 1-year period and 0.5326 over a 10-year period, while for the mechanical industry it is 0.2028 over a 1-year period and


Table 4
Mechanical engineering industry

(a) Two-state transition probabilities

                        No patents        Patents
One year transitions
  No patents            0.8776 (0.0067)   0.1224 (0.0067)
  Patents               0.6748 (0.0472)   0.3252 (0.0472)
Five year transitions
  No patents            0.8599 (0.0112)   0.1401 (0.0112)
  Patents               0.7625 (0.0435)   0.2375 (0.0435)
Ten year transitions
  No patents            0.8421 (0.0208)   0.1579 (0.0208)
  Patents               0.9623 (0.0280)   0.0377 (0.0280)

(b) Four-state transition probabilities

                        No patents        1 Patent          2–5 Patents       At least 6 patents
One year transitions
  No patents            0.8776 (0.0062)   0.1014 (0.0051)   0.0197 (0.0041)   0.0013 (0.0010)
  1 Patent              0.7598 (0.0415)   0.1453 (0.0272)   0.0894 (0.0192)   0.0055 (0.0050)
  2–5 Patents           0.4918 (0.0798)   0.2459 (0.0512)   0.2295 (0.0597)   0.0328 (0.0332)
  At least 6 patents    0.0000 (0.0000)   0.1667 (0.3982)   0.3333 (0.1754)   0.5000 (0.2631)
Five year transitions
  No patents            0.8599 (0.0114)   0.1148 (0.0090)   0.0243 (0.0048)   0.0009 (0.0009)
  1 Patent              0.8220 (0.0432)   0.0763 (0.0246)   0.0847 (0.0286)   0.0169 (0.0118)
  2–5 Patents           0.6111 (0.0827)   0.3056 (0.0813)   0.0556 (0.0364)   0.0278 (0.0262)
  At least 6 patents    0.5000 (0.2651)   0.1667 (0.3966)   0.1667 (0.0884)   0.1667 (0.0884)
Ten year transitions
  No patents            0.8421 (0.0206)   0.1116 (0.0139)   0.0442 (0.0123)   0.0021 (0.0021)
  1 Patent              0.9459 (0.0388)   0.0000 (0.0000)   0.0270 (0.0278)   0.0270 (0.0273)
  2–5 Patents           1.0000 (0.0000)   0.0000 (0.0000)   0.0000 (0.0000)   0.0000 (0.0000)
  At least 6 patents    1.0000 (0.4778)   0.0000 (0.0000)   0.0000 (0.0000)   0.0000 (0.0000)

−0.1202 over a 10-year period (see Table 7). The other two industries lie between these two extreme cases. One also observes important differences in persistence across size classes (see Tables 5 and 6): in general, persistence increases as size increases. For a one-year transition period, the autoregressive coefficient is 0.0973 for small firms, 0.2212 for medium firms, 0.3143 for medium-large firms, and 0.4866 for large firms. The difference in persistence across size classes is maintained as the transition period lengthens. This result is in line with the Schumpeter Mark II hypothesis (the larger firms are more innovative


Table 5
Small firms

(a) Two-state transition probabilities

                        No patents         Patents
One year transitions
  No patents            0.8899 (0.01085)   0.1101 (0.01085)
  Patents               0.7949 (0.07126)   0.2051 (0.07126)
Five year transitions
  No patents            0.8722 (0.01772)   0.1278 (0.01772)
  Patents               0.8627 (0.06237)   0.1373 (0.06237)
Ten year transitions
  No patents            0.8750 (0.02576)   0.1250 (0.02576)
  Patents               0.9375 (0.06553)   0.0625 (0.06553)

(b) Four-state transition probabilities

                        No patents         1 Patent           2–5 Patents        At least 6 patents
One year transitions
  No patents            0.8899 (0.01111)   0.0927 (0.01054)   0.0157 (0.00498)   0.0017 (0.00174)
  1 Patent              0.8730 (0.04805)   0.0952 (0.03715)   0.0317 (0.02063)   0.0000 (0.00000)
  2–5 Patents           0.4667 (0.16601)   0.2667 (0.11975)   0.2667 (0.09727)   0.0000 (0.00000)
  At least 6 patents    NA*                NA                 NA                 NA
Five year transitions
  No patents            0.8722 (0.01692)   0.1053 (0.01310)   0.0201 (0.00753)   0.0025 (0.00249)
  1 Patent              0.9024 (0.04304)   0.0732 (0.03946)   0.0244 (0.02357)   0.0000 (0.00000)
  2–5 Patents           0.7000 (0.17077)   0.1000 (0.07861)   0.2000 (0.12614)   0.0000 (0.00000)
  At least 6 patents    NA                 NA                 NA                 NA
Ten year transitions
  No patents            0.8750 (0.02675)   0.0978 (0.02305)   0.0217 (0.01322)   0.0054 (0.00535)
  1 Patent              0.9286 (0.07084)   0.0714 (0.07084)   0.0000 (0.00000)   0.0000 (0.00000)
  2–5 Patents           1.0000 (0.48169)   0.0000 (0.00000)   0.0000 (0.00000)   0.0000 (0.00000)
  At least 6 patents    NA                 NA                 NA                 NA

* NA = Not available.

than the smaller ones), since the persistence of the distributions increases with firm size. Persistence decreases dramatically within each size group as the transition period lengthens: for example, for small firms the autoregressive coefficient goes from 0.0973 to −0.0101 to −0.0896 as the transition period lengthens. It is worth noting that bimodality is generally not observed: it appears only in the largest size class and, even in this case, it decreases rapidly as time goes by because the probability of remaining in the great innovator state decreases significantly. These results are not too surprising. If economies of scale exist in innovative


Table 6
Large firms

(a) Two-state transition probabilities

                        No patents        Patents
One year transitions
  No patents            0.8480 (0.0075)   0.1520 (0.0075)
  Patents               0.3614 (0.0291)   0.6386 (0.0291)
Five year transitions
  No patents            0.7598 (0.0118)   0.2402 (0.0118)
  Patents               0.4485 (0.0382)   0.5515 (0.0382)
Ten year transitions
  No patents            0.6924 (0.0214)   0.3076 (0.0214)
  Patents               0.4964 (0.0600)   0.5036 (0.0600)

(b) Four-state transition probabilities

                        No patents        1 Patent          2–5 Patents       At least 6 patents
One year transitions
  No patents            0.8480 (0.0074)   0.1076 (0.0053)   0.0415 (0.0049)   0.0029 (0.0012)
  1 Patent              0.5858 (0.0301)   0.2249 (0.0206)   0.1598 (0.0208)   0.0296 (0.0108)
  2–5 Patents           0.3037 (0.0372)   0.2290 (0.0300)   0.3271 (0.0372)   0.1402 (0.0271)
  At least 6 patents    0.0163 (0.0113)   0.0380 (0.0154)   0.1413 (0.0346)   0.8043 (0.0481)
Five year transitions
  No patents            0.7598 (0.0147)   0.1480 (0.0098)   0.0756 (0.0086)   0.0166 (0.0073)
  1 Patent              0.6495 (0.0409)   0.1289 (0.0218)   0.1598 (0.0297)   0.0619 (0.0533)
  2–5 Patents           0.4545 (0.0721)   0.1364 (0.0460)   0.2424 (0.0524)   0.1667 (0.0671)
  At least 6 patents    0.0901 (0.0678)   0.0180 (0.0159)   0.1171 (0.0604)   0.7748 (0.1045)
Ten year transitions
  No patents            0.6924 (0.0217)   0.1752 (0.0156)   0.0979 (0.0121)   0.0345 (0.0091)
  1 Patent              0.7059 (0.0508)   0.0441 (0.0111)   0.1765 (0.0471)   0.0735 (0.0197)
  2–5 Patents           0.4375 (0.1337)   0.0313 (0.0323)   0.1250 (0.0495)   0.4063 (0.1253)
  At least 6 patents    0.1795 (0.1475)   0.0513 (0.0541)   0.0769 (0.0705)   0.6923 (0.1621)

activities, due for instance to the fixed and sunk costs linked to R&D (Cohen and Klepper, 1996), large firms would turn out to be both more innovative and more persistent. Notice, however, that the direction of causation might well go from persistence to high innovative performance in the long run rather than the other way around. The advantage of size might be linked to the fact that e.g. the accumulation of competencies and infrastructures for R&D generates more persistent innovative activities over time and hence more innovations, even if static economies of scale are irrelevant or even negative. For the overall sample, a 20-state TPM was estimated in order to investigate whether the bimodality was due to right truncation or to the fact that in the last


Table 7
Estimates of the first-order autoregressive parameters

                             ρ_1      ρ_5      ρ_10
Overall                      0.32     0.20     0.11
Small                        0.10    −0.01    −0.09
Medium                       0.22     0.05    −0.05
Medium-large                 0.31     0.16    −0.06
Large                        0.49     0.31     0.20
Chemical                     0.59     0.54     0.53
Mechanical engineering       0.20     0.10    −0.12
Electrical & electronics     0.33     0.20     0.10
Instrumental                 0.37     0.09     0.02

class there was another mode of the empirical distributions. The bounds of this matrix were defined as follows: having applied for 0 patents, for 1 patent, for 2–3 patents, for 4–5 patents, and so on up to the last state, defined as having applied for more than 20 patents. The results show that evidence of bimodality due to right truncation still appears. In conclusion, bimodality in the four-state TPMs is due to the fact that the last state (at least 6 patents) is an open-ended class. Using an open-ended class permits me to examine the persistence properties of the activities of those defined here as ‘great’ innovators. Persistence in this state, together with the fact that the large majority of patents are requested each year by ‘great’ innovators (as shown in Fig. 3), suggests that innovative activities, at least those captured by patents, are persistent.
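The first-order autoregressive parameters quoted in the text can be related to the two-state matrices by a simple identity. For a chain with $X_t \in \{0, 1\}$, $E[X_{t+1} \mid X_t] = p_{01} + (p_{11} - p_{01}) X_t$, so the slope equals $p_{00} + p_{11} - 1$. The sketch below applies this identity to the diagonal entries of Table 2(a). It is offered as a consistency check under the assumption that the reported coefficients correspond to this slope; it is not a description of the estimation procedure actually used.

```python
def ar1_from_two_state(p_stay_zero, p_stay_one):
    """First-order autoregressive coefficient of a two-state Markov chain.

    With X_t in {0, 1}: E[X_{t+1} | X_t] = p01 + (p11 - p01) * X_t,
    so the slope is p11 - p01 = p00 + p11 - 1.
    """
    return p_stay_zero + p_stay_one - 1.0


# Diagonal entries (p00, p11) of the two-state matrices in Table 2(a),
# keyed by the length of the transition period in years.
table2_diagonals = {
    1: (0.8769, 0.4475),
    5: (0.8339, 0.3659),
    10: (0.8130, 0.2973),
}

for years, (p00, p11) in table2_diagonals.items():
    rho = ar1_from_two_state(p00, p11)
    print(f"{years:>2}-year transition: rho = {rho:.4f}")
```

The one-year and ten-year values reproduce the 0.3244 and 0.1103 quoted for the overall sample, and the five-year value rounds to the 0.20 shown in Table 7.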

Fig. 3. The importance of great innovators.


4.3. The great innovators

Fig. 3 shows clearly the importance of the great innovators in innovative activities: they are very few in number (2.37% on average each year), but they account for the large majority of the total number of patents requested each year (77.85% on average). Defining a great innovator as a firm that has applied for 6 or more patents in at least one year over the 14 years of the sample period, I find 40 great innovators among the 577 firms (7% of the UK random sample). Considering the industrial classification, the majority of the great innovators belong to the chemical sector (on average the chemical firms represent 50% of the sample, excluding the first year), followed by the electrical and electronic sector and by the instrumental engineering sector. This rank is not surprising, since it reflects exactly the ranking of the sectors in terms of persistence. It is worth noting that the ranking does not change as time passes and the percentages of firms belonging to these sectors remain almost constant over time. The classification of great innovators according to firm size gives a picture in which the large firms play the most important role (on average the percentage of large firms is 75%, excluding the first year), followed by the medium-large class. Firms with more than 500 employees represent almost the totality of the great innovators. In this case too, the percentages of the size classes are quite stable over time.

For the great innovator sample, two- and four-state TPMs were estimated for three transition periods [17]. The estimated probabilities are different from the ones analysed in the previous section, since these probabilities are conditional on the fact that firms, sooner or later in the sample period, must have applied for at least 6 patents. Formally, I have estimated the following probabilities:

$$p_{ij} = P\left(X_{t+n} = j \mid X_t = i,\ \max_t X_t \ge 6\right) \qquad (6)$$
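A minimal sketch of how the conditional probabilities in Eq. (6) could be estimated by counting transitions is given below. The panel layout (a dictionary from a firm identifier to its yearly patent counts), the function names and the toy data are assumptions introduced for illustration; only the four-state bounds and the 'at least 6 patents in some year' condition come from the text.

```python
def to_state(patents):
    """Map a yearly patent count to the four states used in the text:
    0 patents, 1 patent, 2-5 patents, at least 6 patents."""
    if patents == 0:
        return 0
    if patents == 1:
        return 1
    if patents <= 5:
        return 2
    return 3


def estimate_tpm(panel, lag=1, n_states=4):
    """Row-normalised transition frequencies over `lag` years.

    `panel` maps a firm identifier to its list of yearly patent counts.
    Rows with no observed transitions are returned as None.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for series in panel.values():
        states = [to_state(x) for x in series]
        for i, j in zip(states, states[lag:]):
            counts[i][j] += 1
    tpm = []
    for row in counts:
        total = sum(row)
        tpm.append([c / total for c in row] if total else None)
    return tpm


def great_innovators(panel, threshold=6):
    """Keep firms whose maximum yearly count reaches the threshold."""
    return {f: s for f, s in panel.items() if max(s) >= threshold}


# Hypothetical panel of three firms observed for five years.
toy_panel = {
    "firm_a": [0, 0, 1, 0, 0],
    "firm_b": [7, 9, 6, 8, 10],
    "firm_c": [0, 2, 6, 7, 3],
}
subsample = great_innovators(toy_panel)
tpm = estimate_tpm(subsample, lag=1)
for origin, row in enumerate(tpm):
    print(origin, row)
```

Setting `lag=5` or `lag=10` gives the longer transition periods. Conditioning on the whole-sample maximum means that firms enter the sub-sample on the basis of information from later years, which is why these probabilities differ from the unconditional ones.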

Not surprisingly, two-state TPMs show that there is high persistence, but that it declines as the transition period lengthens. Indeed, the first-order autoregressive parameter goes from 0.66 for a 1-year transition period, to 0.29 for 5 years, to 0.13 for 10 years. However, the decline is due to the dramatic decrease of the probability of remaining in the state in which firms do not apply for a patent. The same picture is given by the four-state TPMs: there is persistence and bimodality. As the transition period lengthens all the diagonal elements, except the last on the right, decrease quite rapidly, while the probabilities to the right of the main diagonal increase. Moreover, the sum (by row) of the off-diagonal elements to the right of the main diagonal is always greater than the sum of the elements to the left. It is more likely that the number of patent applications increases than that it decreases. This is exactly the opposite of what happens in the TPMs previously estimated on the random samples.

[17] TPMs for the great innovator sub-sample may be found in the supplementary material on the IJIO’s Editorial Office web site.

5. Sensitivity analysis

The above analysis has been conducted under various assumptions. In this section I study whether the results obtained are robust to changes in some of these assumptions. In particular, I will consider: (i) the possibility that the transition probabilities are not time invariant but display structural breaks within the 14 years of the sample and (ii) the possibility that the data can be represented by second-order Markov chains [18].

Concerning the first assumption, the sensitivity analysis shows time homogeneity of the sample and that the features of the TPMs are time invariant, suggesting that (short-term) business cycle considerations have little importance for the qualitative features of the results.

As regards the second assumption, the transition probabilities I have considered thus far allowed the probability measure $p_{ij}$ to depend on the state at time $t - 1$, but not on the state at time $t - 2$ or other lagged values. This limitation is not as restrictive as it may at first seem: any second (or higher) order Markov process (that is, one in which two (or more) lagged values affect the distribution of the current value) can be viewed as a first-order process with an expanded state space (Stokey and Lucas, 1989, Ch. 8.4). Given this, I consider the possibility that the data can be represented by second-order Markov chains. I estimate two- and four-state second-order TPMs, especially in order to see whether they suggest a totally different picture and/or whether some additional features can emerge from this representation. Assuming that Markov chains are homogeneous on the parameter space and of second order, the one-step transition probability is defined by:

$$p^{k}_{ij} = P\left(X_t = j \mid X_{t-1} = i,\ X_{t-2} = k\right) \qquad (7)$$

with $t = 1978, 1979, \ldots, 1991$. The second-order TPM $Q$ is the matrix with $p^{k}_{ij}$ as elements, measuring the probability of moving to state $j$ at time $t$, given that the firm was in state $i$ at time $t - 1$ and in state $k$ at time $t - 2$.

[18] Complete results of the sensitivity analysis may be found in the supplementary material on the IJIO’s web site.
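The second-order probabilities in Eq. (7) can be estimated by counting consecutive triples of states, which amounts to treating the pair $(X_{t-1}, X_{t-2})$ as a single state of a first-order chain on an expanded state space. The following is a sketch under that reading; the function name, the return layout and the toy histories are hypothetical.

```python
from collections import Counter


def estimate_second_order(sequences, n_states=2):
    """Estimate p^k_{ij} = P(X_t = j | X_{t-1} = i, X_{t-2} = k).

    Returns a dict keyed by (i, k) whose value is the list of
    probabilities over j; histories never observed are omitted.
    """
    triples = Counter()
    for seq in sequences:
        # k is the state two periods back, i one period back, j today.
        for k, i, j in zip(seq, seq[1:], seq[2:]):
            triples[(i, k, j)] += 1

    tpm = {}
    for i in range(n_states):
        for k in range(n_states):
            row = [triples[(i, k, j)] for j in range(n_states)]
            total = sum(row)
            if total:
                tpm[(i, k)] = [c / total for c in row]
    return tpm


# Hypothetical two-state histories (0 = no patent, 1 = at least one).
toy = [
    [0, 0, 0, 0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
]
second_order = estimate_second_order(toy)
for (i, k), row in sorted(second_order.items()):
    print(f"X(t-1)={i}, X(t-2)={k}: {[round(p, 3) for p in row]}")
```

Each key of the returned dictionary corresponds to one row of a table laid out like Table 8, with the row indexed by the two lagged states and the columns by the current state.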


Table 8
Overall sample second-order Markov chain

(a) Two-state transition probabilities

                                No patents: X_t = 0   Patents: X_t = 1
X_{t-1} = 0, X_{t-2} = 0        0.8772                0.1228
X_{t-1} = 0, X_{t-2} = 1        0.7473                0.2527
X_{t-1} = 1, X_{t-2} = 0        0.8365                0.1635
X_{t-1} = 1, X_{t-2} = 1        0.2919                0.7081

(b) Four-state transition probabilities

                                No patents:   1 Patent:   2–5 Patents:   At least 6 patents:
                                X_t = 0       X_t = 1     X_t = 2        X_t = 3
X_{t-1} = 0, X_{t-2} = 0        0.8905        0.052       0.0141         0.0002
X_{t-1} = 0, X_{t-2} = 1        0.8170        0.1312      0.0518         0.0000
X_{t-1} = 0, X_{t-2} = 2        0.6202        0.2558      0.1085         0.0155
X_{t-1} = 0, X_{t-2} = 3        0.0000        0.0000      0.8571         0.1429
X_{t-1} = 1, X_{t-2} = 0        0.8182        0.1299      0.0487         0.0032
X_{t-1} = 1, X_{t-2} = 1        0.6122        0.2245      0.1429         0.0204
X_{t-1} = 1, X_{t-2} = 2        0.2069        0.3276      0.4483         0.0172
X_{t-1} = 1, X_{t-2} = 3        0.0909        0.6364      0.0909         0.1818
X_{t-1} = 2, X_{t-2} = 0        0.1414        0.7238      0.1309         0.0039
X_{t-1} = 2, X_{t-2} = 1        0.7856        0.1455      0.0657         0.0031
X_{t-1} = 2, X_{t-2} = 2        0.4920        0.2781      0.2139         0.0160
X_{t-1} = 2, X_{t-2} = 3        0.0909        0.0909      0.5454         0.2727
X_{t-1} = 3, X_{t-2} = 0        0.5385        0.1538      0.2301         0.0769
X_{t-1} = 3, X_{t-2} = 1        0.0769        0.2308      0.3846         0.3077
X_{t-1} = 3, X_{t-2} = 2        0.0833        0.0278      0.3611         0.5278
X_{t-1} = 3, X_{t-2} = 3        0.0000        0.0165      0.0661         0.9174

The second-order TPMs display overall features which are similar to the first-order ones. As Table 8 shows, in the two-state TPM the probabilities of being in the polar states, that is, the probability of applying for no patents given that the firm did not apply for a patent in the two previous years, and the probability of applying for at least one patent given that the firm applied for at least one patent in each of the two previous years, are very large, suggesting stronger persistence and bimodality features than those displayed by first-order TPMs. There is stronger evidence of heterogeneity across the size dimension of the sample and especially across industrial classification (see Table 8). The four-state TPMs reinforce the results obtained by two-state TPMs: there is more persistence and more bimodality. For example, in the chemical industry the probability of applying for no patent given that the firm has not applied for a patent in the previous two years is 0.9042, and the probability of applying for at least six patents given that the firm has applied for at least six patents in each of the previous two years is 0.9740, while in the mechanical industry the two probabilities are respectively 0.8825 and 0.50. There is another interesting feature that suggests a stronger persistence in innovative activities: the probability of applying for no patent given that in each of the previous two years the firm has applied for at least six patents is always 0, and, vice versa, the opposite probability of applying for at least six patents given that in the previous two years the firm has not applied for a patent is always 0 or almost 0, for the overall sample and for every sub-sample across size and industrial classification. This result suggests that firm innovative activities are cumulative and path-dependent: it is quite difficult for a ‘great’ innovator to lose suddenly all its innovative capabilities and to exhaust all its innovating opportunities, and vice versa it is quite difficult for a firm that has innovated once in a while in the past to become suddenly a firm that has the knowledge and organisational capabilities to innovate on a continuous basis.

The estimation of second-order Markov chains suggests that the patenting process is not a first-order Markov process [19]. Indeed, if it were, the conditional probabilities of moving among states from time $t - 1$ to time $t$ would not depend on the state in which the firm was at time $t - 2$, and the estimated probabilities would be very similar. The actual estimated probabilities, instead, show that the probabilities of moving from one state to another depend crucially on the state in which the firm was in previous periods. This observation suggests that a history longer than simply the state in the last period matters in determining firms’ current innovative activities.
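The argument that the process is not first-order can be made concrete with the two-state entries of Table 8(a): under a first-order chain, the probability of patenting at time $t$ given $X_{t-1}$ would be the same whatever the value of $X_{t-2}$. The sketch below simply computes the gaps. It is descriptive arithmetic on the published point estimates, ignores sampling error, and is not a formal test of the Markov order; the helper name is introduced here for illustration.

```python
# Probability of applying for at least one patent at t, from Table 8(a),
# keyed by (X_{t-1}, X_{t-2}).
p_patent = {
    (0, 0): 0.1228,
    (0, 1): 0.2527,
    (1, 0): 0.1635,
    (1, 1): 0.7081,
}


def second_lag_effect(probabilities, last_state):
    """Change in P(patent at t) when X_{t-2} moves from 0 to 1,
    holding X_{t-1} fixed at `last_state`."""
    return probabilities[(last_state, 1)] - probabilities[(last_state, 0)]


# Under a first-order chain both gaps would be close to zero.
for last in (0, 1):
    gap = second_lag_effect(p_patent, last)
    print(f"X(t-1)={last}: effect of X(t-2) on P(patent) = {gap:+.4f}")
```

Both gaps are positive, and the gap for firms that patented at $t - 1$ is much the larger of the two, which is the dependence on the longer history that the text describes.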

6. Conclusions

The paper analysed the statistical properties of the stochastic process generating the patent time series of UK manufacturing firms. Although the empirical distributions of patents for every year in the sample look geometric, the geometric distribution is rejected. The distributions do not display the lack-of-memory property and show a decreasing hazard rate and, therefore, negative duration dependence. This result suggests that there exists a threshold to patenting and that the threshold is represented by the first patent. The probability of going from zero to one patent application is uniformly much lower than the probability of going from $n$ to $n + 1$ patents, with $n \ge 1$. Moreover, applying for an additional patent becomes easier (in the sense that the probability of getting an additional patent becomes higher) as the number of patent applications becomes higher. This can be interpreted as suggesting that once the threshold is crossed, innovative activities carried out inside the firm enjoy economies of scale.

For the overall sample, TPMs show that there is little (but not negligible) persistence in general, but a strong bimodality. That is, there is a high probability of remaining in the polar states, namely in the states in which firms do not apply for a patent (the ‘occasional innovator’ state) and in which firms apply for at least six patents (the ‘great innovator’ state), especially for longer transition periods. This result, together with the observation that the large majority of patents are requested each year by ‘great innovators’, suggests that innovative activities, at least those captured by patents, are persistent.

There is evidence of heterogeneity across the industrial classification and size dimensions of the sample. Differences in the estimated entries of the TPMs are particularly important across industrial sectors. The patent data show that in the chemical sector there is a high probability of remaining in the state in which the firms started regardless of the length of the transition period, and the bimodality of the estimated TPMs is striking. In the mechanical engineering sector the state in which firms have not applied for patents is almost an absorbing state. This suggests that innovative activities are sector specific, that is, they proceed differently across industries: in some sectors persistence is very low, in others quite high. In line with the Schumpeter Mark II hypothesis, persistence tends to increase monotonically with firm size, with large firms being more persistent than small firms. Finally, the estimated second-order Markov chains suggest that firm innovative activities are path-dependent: it is quite difficult for a ‘great’ innovator to lose suddenly all its innovative capabilities and to exhaust all its innovating opportunities, and vice versa it is quite difficult for a firm that has innovated once in a while in the past to become suddenly a firm that has the knowledge and organisational capabilities to innovate on a continuous basis.

[19] However, this does not mean that the estimated first-order TPMs lose their ‘descriptive’ validity: we are just saying that the process that generates the patent series is not a first-order Markov process.
The analysis presented above should be considered as a piece of evidence towards a systematic analysis of the sources and implications of persistence in innovative activities. Indeed, so far, we know nothing about the determinants of persistence and these results do not allow us to draw any strong theoretical conclusions. Yet, on the whole, these findings suggest that innovation is not a purely random phenomenon driven by small shocks, but that it implies systematic heterogeneity across firms and/or some forms of dynamic increasing returns. Among firms that innovate (the majority do not innovate), many do so only occasionally and very few persistently. Innovating is difficult and innovating persistently is even more difficult. Even for great innovators it is often hard to maintain their innovativeness over prolonged periods of time. However, persistent innovators (large and small) originate a high share of all innovative activities. From a theoretical point of view, we might interpret these results as supporting the theory of ‘dynamic capabilities’, in which innovative performance is generated and has to be supported by systematic and continuous processes of accumulation of resources and competencies over time (Teece and Pisano, 1994), more than the ‘competence-based’ theory of the firm (Nelson and Winter, 1982).

From a normative perspective, these results show that great innovators are persistent innovators. This might imply either that persistence is an important ingredient for high innovative performance or that success in innovation breeds further success. In both cases, these results suggest the hypothesis that persistence, rather than the size of firms or the size of investments in innovative activities per se, might be an appropriate target for economic policies supporting innovation and for managerial strategies.

Acknowledgements

I am particularly grateful to Soren Johansen, Franco Malerba, Luigi Orsenigo and Luis Phlips for encouragement and helpful suggestions. I would also like to thank Giovanni Dosi, Giuseppe Espa, Luigi Marengo, Stephen Martin, Chiara Monfardini and Danny Quah for interesting comments. This is a substantially revised version of a paper presented at RES 96, IGIER 96, EEA 96, ESEM 96, EARIE 96, University of Trento 98, and at Universitat Pompeu Fabra 98. I thank the participants and the seminar audiences for useful comments. The financial support of the EU TRM program, ERBFMBICT 96-0805, and of the University of Bergamo (grant ex 60%, n. 60CEFI01, Dept. of Economics) is gratefully acknowledged.

References

Amemiya, T., 1994. Introduction to Statistics and Econometrics. Harvard University Press, Cambridge, MA.
Baily, M.N., Chakrabarti, A.K., 1985. Innovation and productivity in US industry. Brookings Papers on Economic Activity 2, 609–632.
Barro, R., Sala-i-Martin, X., 1995. Economic Growth. McGraw Hill, New York.
Bottazzi, G., Dosi, G., Lippi, M., Pammolli, F., Riccaboni, M., 2001. Innovation and corporate growth in the evolution of the drug industry. International Journal of Industrial Organization 19 (7), 1161–1187.
Bottazzi, G., Cefis, E., Dosi, G., 2002. Corporate growth and industrial structures: some evidence from the Italian manufacturing industry. Industrial and Corporate Change 11 (4), 705–723.
Cefis, E., 1999. Persistence in innovative activities. An empirical analysis. Ph.D. Thesis, European University Institute, Florence.
Cefis, E., 2001. Persistence in innovation and profitability. Submitted to Rivista Internazionale di Scienze Sociali.
Cohen, W.M., Klepper, S., 1996. A reprise of size and R&D. Economic Journal 106 (437), 925–951.
Coriat, B., Dosi, G., 1998. Learning how to govern and learning how to solve problems: on the co-evolution of competences, conflicts and organizational routines. In: Chandler, A.D., Hagstrom, P., Solvell, O. (Eds.), The Dynamic Firm: The Role of Technology, Strategy, Organization, and Regions. Oxford University Press, Oxford.
Cubbin, J., Geroski, P., 1987. The convergence of profits in the long run: inter-firm and inter-industry comparisons. Journal of Industrial Economics 35 (4), 427–442.
Dosi, G., 1988. Sources, procedures and microeconomic effects of innovation. Journal of Economic Literature 26 (3), 1120–1171.
Dosi, G., Marsili, O., Orsenigo, L., Salvatore, R., 1995. Learning, market selection and the evolution of market structure. Small Business Economics 7, 411–436.
Ericson, R., Pakes, A., 1992. An alternative theory of the firm and industry dynamics. Cowles Foundation Discussion Paper No. 1041, Cowles Foundation for Research in Economics at Yale University.
Geroski, P., Jacquemin, A., 1988. The persistence of profits: a European comparison. Economic Journal 98, 375–389.
Geroski, P., Machin, S., Van Reenen, J., 1993. The profitability of innovating firms. RAND Journal of Economics 24 (2), 198–211.
Geroski, P.A., Van Reenen, J., Walters, C.F., 1997. How persistently do firms innovate? Research Policy 26 (1), 33–48.
Griliches, Z., 1986. Productivity, R&D and basic research at the firm level in the 1970s. American Economic Review 76, 141–154.
Hopenhayn, H.A., 1992. Entry, exit and firm dynamics in long run equilibrium. Econometrica 60 (5), 1127–1151.
Jovanovic, B., 1982. Selection and the evolution of industry. Econometrica 50 (3), 649–670.
Klepper, S., 1996. Entry, exit and innovation over the product life cycle. American Economic Review 86 (3), 562–582.
Malerba, F., Orsenigo, L., 1999. Technological entry, exit and survival. Research Policy 28, 643–660.
Martin, S., 2001. Advanced Industrial Economics, 2nd Edition. Blackwell, Oxford and Cambridge, MA.
Mueller, D.C., 1990. Profits and the process of competition. In: Mueller, D.C. (Ed.), The Dynamics of Company Profits: An International Comparison. Cambridge University Press, Cambridge.
Nelson, R., Winter, S., 1982. An Evolutionary Theory of Economic Change. The Belknap Press of Harvard University Press, Cambridge, MA.
Odagiri, H., Yamawaki, H., 1990. The persistence of profits in Japan. In: Mueller, D.C. (Ed.).
Patel, P., Pavitt, K., 1991. Europe’s technological performance. In: Freeman, et al. (Ed.), Technology and the Future of Europe: Global Competition and the Environment in the 1990s. Pinter Publisher, London.
Quah, D., 1993a. Empirical cross-section dynamics in economic growth. European Economic Review 37, 426–434.
Quah, D., 1993b. Galton’s fallacy and tests of the convergence hypothesis. Scandinavian Journal of Economics 4, 427–443.
Schumpeter, J.A., 1934. The Theory of Economic Development. Harvard Economic Studies, Cambridge, MA.
Schumpeter, J.A., 1942. Capitalism, Socialism and Democracy. Harper and Brothers, New York.
Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.
Stokey, N.L., Lucas, R.E., 1989. Recursive Methods in Economic Dynamics. Harvard University Press, Cambridge, MA.
Teece, D., Pisano, G., 1994. The dynamic capabilities of firms: an introduction. Industrial and Corporate Change 3 (3), 537–555.
Wilk, M.B., Gnanadesikan, R., 1968. Probability plotting methods for the analysis of data. Biometrika 55, 1–17.
Winter, S., 1984. Schumpeterian competition in alternative technological regimes. Journal of Economic Behaviour and Organization 5 (3–4), 287–320.