European Journal of Operational Research 217 (2012) 509–518
Production, Manufacturing and Logistics
Specification and estimation of primal production models
Subal C. Kumbhakar
Department of Economics, State University of New York – Binghamton, Binghamton, NY 13902, United States
Article info
Article history: Received 13 January 2011; Accepted 26 September 2011; Available online 6 October 2011
Keywords: Production function; Input distance function; Input requirement function; Cobb–Douglas; Translog
Abstract
In estimating production technology in a primal framework, production functions, input and output distance functions, and input requirement functions are widely used in the empirical literature. This paper shows that these popular primal-based models are algebraically equivalent in the sense that they can be derived from the same underlying transformation (production possibility) function. By assuming that producers maximize profit, we show that in all cases, except one, the use of ordinary least squares (OLS) gives inconsistent estimates irrespective of whether the production, input distance or input requirement function is used. Based on several specifications of the production and input distance function models, we conclude that one can estimate the input elasticities and returns to scale consistently using instruments on only one regressor. No instruments are needed if either it is assumed that producers know the technology entirely (including the so-called error term) or a system approach is used. We use Norwegian timber harvesting data to illustrate the workings of various model specifications. © 2011 Elsevier B.V. All rights reserved.
I would like to thank three anonymous referees for their constructive and encouraging comments.
Author contact: Tel.: +1 607 777 4762; fax: +1 607 777 2681. E-mail address: [email protected]
0377-2217/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2011.09.043

1. Introduction

Specification and estimation of the production function is important in production economics. In spite of many advances in the more than 80 years since the introduction of the Cobb–Douglas production function (Cobb and Douglas, 1928), some of the fundamental issues are still debated. The two main issues of concern are specification and estimation of the underlying technology. The specification issue is important because there are many different ways in which one can specify the underlying technology. Although these alternative specifications are algebraically the same, they are not the same from an econometric estimation point of view. These specifications use different econometric assumptions, and their data requirements are often different. Needless to say, the empirical results differ, and this creates a big problem for applied researchers who want to know which approach is appropriate to use. The choice is often dictated by what is endogenous (choice/decision variables) to the producers, and what is the objective of the producers. Cost minimization and profit maximization behaviors are widely discussed in microeconomics. Here we assume that the producers maximize profit in deciding their optimal output and input quantities (which are endogenous). Although the endogeneity issue was first addressed by Marschak and Andrews in 1944, it is still debated whether it is necessary to use a system approach to handle endogeneity problems.1

1 See Nerlove (1965) for a comprehensive treatment of the issue from a system approach. Recently Levinsohn and Petrin (2003) addressed the issue from a panel data point of view. They proposed use of a single equation approach in which investment is used as a control for correlation between input levels and firm-specific effects.

In this paper we address these issues both theoretically and empirically. Our focus is on the primal specifications. We address the endogeneity issue primarily from an economic theory (producer behavior) point of view as in Hoch (1958, 1962), Mundlak and Hoch (1965), Mundlak (1961) and Zellner et al. (1966). Furthermore, we discuss both single equation and system approaches for estimating the technology using cross-sectional data under a variety of cases. Since the endogeneity problem comes from what are decision variables to the producers and what is the objective (economic behavior) of the producers, it is likely that one method cannot handle all situations. Zellner et al. (1966) showed that if producers maximize expected profit, the use of OLS in estimating the production function representation of the technology is appropriate in the sense that the OLS estimators are consistent. However, if producers know the so-called 'unobserved' (to the researchers) managerial input or management (Mundlak, 1961), the OLS estimators of the production function will be inconsistent even under the expected profit maximization behavioral assumption. Similarly, if producers minimize cost and output is exogenously given, applying OLS to the input distance function (not the production function) is appropriate in the sense that the OLS estimators are consistent. In a multiple output case (not considered here), applying OLS to the output distance function formulation is appropriate under the assumption that producers maximize revenue and inputs are exogenously given (Coelli, 2000). Kumbhakar (2011) shows that applying OLS to either the input or the output distance function will give inconsistent parameter estimates if producers maximize return to the outlay (Färe et al., 2002) and both inputs and outputs are endogenous (choice variables). The use of OLS in the input requirement function formulation (Diewert, 1974) gives inconsistent parameter estimates if there is more than one endogenous input variable. This is the case even if outputs are exogenous. Various formulations of the underlying production technology are routinely used in the literature without discussing the endogeneity issue that we focus on here.2

The rest of the paper is organized as follows. Section 2 introduces both the Cobb–Douglas and translog transformation functions and shows how one can derive the production, input distance and input requirement functions by using different normalizations. Section 3 deals with estimation of the Cobb–Douglas and translog specifications in many different forms using a single equation framework. We do the same in Section 4 using the system approach. Section 5 describes the data and Section 6 reports results from both Cobb–Douglas and translog models from both the single equation and system approaches. Section 7 concludes the paper.

2. Representations of the transformation function
2.1. Cobb–Douglas transformation function

Assume that a producer uses J inputs x to produce a single output y. The functional relationship between x and y is usually described by a production function $f: R^J_+ \to R_+$ with $y = A f(x)$, where A is the efficiency parameter (function). We write the relationship in a more general form, viz., $A f(y, x) = 1$, and call it a transformation function instead of a production function, which becomes a special case. Suppose that $f(\cdot)$ is Cobb–Douglas (CD) so that we can write it as

CD transformation function:
$$A\, y^{a} \prod_{j} x_j^{b_j} = 1 \tag{1}$$

We now show that various primal functions representing the above transformation function can be derived simply by using different identifying (normalizing) constraints. Note that for the CD specification in (1) we need to normalize one parameter. If we normalize $a = -1$ then we get the standard CD production function specification, viz.,

Production function:
$$y = A \prod_{j} x_j^{b_j} \tag{2}$$

If we rewrite (1) as

$$A\, x_1^{\sum_j b_j}\, y^{a} \prod_{j=2} \{x_j/x_1\}^{b_j} = 1 \tag{3}$$

and use the normalization $\sum_j b_j = -1$, then we get the input distance function (IDF) formulation (Shephard, 1953, 1970), viz.,

Input distance function:
$$x_1 = A\, y^{a} \prod_{j=2} \{x_j/x_1\}^{b_j} \tag{4}$$

Finally, if we normalize $b_1 = -1$ in (1) it can be rewritten as

Input requirement function:
$$x_1 = A\, y^{a} \prod_{j=2} x_j^{b_j} \tag{5}$$

which is the input requirement function (IRF) introduced by Diewert (1974). It is clear from the above that starting from the transformation function in (1) one can obtain the production function, the input distance function, and the input requirement function simply by using different normalizations. No additional assumptions are necessary for this. Theoretically, the transformation function in (1) can be traced back starting from any of these functions. Thus, the production function, input distance function and the input requirement function are algebraically equivalent in the sense that they are all derived from the same transformation function but use different normalizations. This is also true for flexible functional forms such as the translog, which is shown next.

2 There are too many papers which use the various specifications we refer to in this paper and in the next section. Some of the recent papers in the operations research literature are: Boussemart et al. (2009), Parelman and Santín (2009), among many others.

2.2. Translog transformation function

As before we assume that a producer uses J inputs x to produce a single output y. The functional relationship between x and y is expressed as $A f(y, x) = 1$, where $f(y, x)$ is assumed to be translog (TL), i.e.,

TL transformation function:
$$\ln f(y, x) = a_y \ln y + \tfrac{1}{2} a_{yy} (\ln y)^2 + \sum_j b_j \ln x_j + \tfrac{1}{2} \sum_j \sum_k b_{jk} \ln x_j \ln x_k + \sum_j d_{jy} \ln x_j \ln y \tag{6}$$

where $b_{jk} = b_{kj}$. Note that we need to impose (J + 2) identifying/normalizing constraints for the model in (6). If one uses the following normalizations $a_y = -1$, $a_{yy} = 0$, $d_{jy} = 0 \;\forall j = 1, \ldots, J$ in (6) the standard translog production function is obtained, which is

TL production function:
$$\ln y = a_0 + \sum_j b_j \ln x_j + \tfrac{1}{2} \sum_j \sum_k b_{jk} \ln x_j \ln x_k + u \tag{7}$$

where $\ln A = a_0 + u$. If we rewrite (6) as

$$\ln f(y, x) = a_y \ln y + \tfrac{1}{2} a_{yy} (\ln y)^2 + \sum_j b_j \ln(x_j/x_1) + \tfrac{1}{2} \sum_j \sum_k b_{jk} \ln(x_j/x_1) \ln(x_k/x_1) + \sum_j d_{jy} \ln(x_j/x_1) \ln y + \Big[\sum_j b_j\Big] \ln x_1 + \Big[\sum_j \Big\{\sum_k b_{jk}\Big\} \ln x_j\Big] \ln x_1 + \Big[\sum_j d_{jy}\Big] \ln y \ln x_1$$

and use the following normalizations $\sum_j b_j = -1$; $\sum_j b_{jk} = 0 \;\forall k$; $\sum_j d_{jy} = 0$, the input distance function representation is obtained, which is,

TL IDF:
$$\ln x_1 = a_0 + \sum_{j=2} b_j \ln \hat{x}_j + \tfrac{1}{2} \sum_{j=2} \sum_{k=2} b_{jk} \ln \hat{x}_j \ln \hat{x}_k + a_y \ln y + \tfrac{1}{2} a_{yy} (\ln y)^2 + \sum_{j=2} d_{jy} \ln \hat{x}_j \ln y + u, \tag{8}$$

where $\hat{x}_j = x_j/x_1$, $j = 2, \ldots, J$. Finally, if we use the following normalizations in (6): $b_1 = -1$, $b_{1j} = 0 \;\forall j$, $d_{1y} = 0$, the input requirement function is obtained, which can be written as

TL IRF:
$$\ln x_1 = a_0 + \sum_{j=2} b_j \ln x_j + \tfrac{1}{2} \sum_{j=2} \sum_{k=2} b_{jk} \ln x_j \ln x_k + a_y \ln y + \tfrac{1}{2} a_{yy} (\ln y)^2 + \sum_{j=2} d_{jy} \ln x_j \ln y + u. \tag{9}$$
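The normalization argument is easiest to check numerically in the CD case of Section 2.1. The sketch below is only an illustration under assumed values: the parameters `lnA`, `a`, `b` and the input bundle `x` are hypothetical, and the helper `normalize` is not from the paper. It raises the transformation function (1) to a power c, which rescales every exponent, and confirms that the production function, IDF and IRF normalizations all describe the same point (y, x).

```python
import math


def normalize(lnA, a, b, c):
    """Raise A * y**a * prod(x_j**b_j) = 1 to the power c (hypothetical helper)."""
    return lnA * c, a * c, [bj * c for bj in b]


# Hypothetical parameters of the CD transformation function (1) and an input bundle.
lnA, a, b = 0.3, -1.4, [0.5, 0.35]
x = [2.0, 3.0]

# Output implied by (1): ln y = -(ln A + sum_j b_j ln x_j) / a
ln_y = -(lnA + sum(bj * math.log(xj) for bj, xj in zip(b, x))) / a
y = math.exp(ln_y)

# Production function (2): normalization a = -1.
lnA_pf, a_pf, b_pf = normalize(lnA, a, b, -1.0 / a)
y_pf = math.exp(lnA_pf + sum(bj * math.log(xj) for bj, xj in zip(b_pf, x)))

# Input distance function (4): normalization sum_j b_j = -1.
lnA_idf, a_idf, b_idf = normalize(lnA, a, b, -1.0 / sum(b))
x1_idf = math.exp(
    lnA_idf
    + a_idf * ln_y
    + sum(bj * math.log(xj / x[0]) for bj, xj in zip(b_idf[1:], x[1:]))
)

# Input requirement function (5): normalization b_1 = -1.
lnA_irf, a_irf, b_irf = normalize(lnA, a, b, -1.0 / b[0])
x1_irf = math.exp(
    lnA_irf
    + a_irf * ln_y
    + sum(bj * math.log(xj) for bj, xj in zip(b_irf[1:], x[1:]))
)

print(y, y_pf)          # the two output values should coincide
print(x[0], x1_idf, x1_irf)  # all three should equal the chosen x_1
```

Each normalization only rescales the parameters, so the three forms carry the same information about the technology. What differs across them is which variable sits on the left hand side, which is what matters for estimation.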
Thus, similar to the CD case, the translog production function, input distance function and input requirement function are algebraically equivalent in the sense that they are all derived from the same translog transformation function using different normalizations.

3. Estimation: a single equation approach

Although the production function, input distance function and the input requirement function are algebraically equivalent, the question is whether they also give the same results empirically. This issue is related to endogeneity of the input and output variables. Since the right hand side variables differ depending on the specification one uses, the estimation procedure will depend on endogeneity of the right hand side variables. We will now discuss the endogeneity issue of the regressors in each specification and suggest possible solutions to avoid the inconsistency that arises due to correlation of the regressors with the error term.

Before proceeding further it is necessary to discuss the source of this endogeneity a bit more. Hoch (1958, 1962), Mundlak and Hoch (1965) and Mundlak (1961) argued that u captures 'management' and other unobserved variables that are either fully or partially known to the producer (which in our analysis is a micro production unit such as a manufacturing firm/plant or agricultural farm) but not to the analyst/researcher.3 If u is known to the producer then it would affect input demand and output supply if the producer maximizes profit conditional on u. That is, output and input quantities will be correlated with the error term (which in the present case is simply u). This will make OLS estimators inconsistent (Hoch, 1958; Mundlak and Hoch, 1965; Mundlak, 1961). Hoch (1962) and Mundlak (1961) suggested solutions assuming that panel data is available and u is time-invariant. Our primary focus here is cross-sectional data. Furthermore, like Hoch, Mundlak and others we will address the endogeneity problem from the economic behavior of the producers and look for a possible solution of this problem with as little outside information (for instrumental variables) as possible. This is because it is difficult to obtain outside instrumental variables in a micro setting.4

3 McElroy (1987) and Kumbhakar and Tsionas (2011) considered a framework in which u is input-specific and assumed to be fully known to the producers. On the other hand, Zellner et al. (1966) took the opposite view and assumed that u is unknown to both producers and analysts. In such a situation inputs will not be correlated with u, especially if producers maximize expected profit. Under this scenario, there is no endogeneity problem in estimating the production function by OLS. This paper is cited by many to claim that OLS is appropriate to estimate the production function.
4 Prices are natural instruments for inputs but prices are often either not available or do not vary much if they are available.

3.1. The CD technology

3.1.1. Estimation of the CD production function I

Since the endogeneity problem is associated with correlation between regressors (which vary with specifications) and u, we consider the source of this correlation in detail. Here we follow Mundlak and Hoch (1965) and consider the following cases.

3.1.2. Case A: u is fully known and there are no optimization errors

Assume that producers maximize profit, $\pi = p y - w'x$, subject to the technology $y = A f(x) = A \prod_j x_j^{b_j}$, where w is the vector of input prices and p is output price. The first-order conditions (FOCs) of profit maximization are: $p\,\partial y/\partial x_j = A p\,\partial f(x)/\partial x_j = w_j$. We can rewrite these FOCs as

$$b_j = \frac{\partial \ln y}{\partial \ln x_j} = \frac{w_j x_j}{p\, y} \;\Rightarrow\; \frac{x_j}{y} = b_j \frac{p}{w_j} \tag{10}$$

which show that $x_j/y$ are independent of u (note that $\ln A = u + a_0$). Consequently, $x_j/x_1$ will also be independent of u. Thus all we need to do is to rewrite (2) in such a way that the right hand side variables are expressed as $x_j/y$, $j = 1, \ldots, J$. That is,

$$\ln y = l_0 + \sum_j l_j \ln(x_j/y) + u^* \tag{11}$$

where $l_0 = a_0/(1 - r)$, $l_j = b_j/(1 - r)$, $u^* = u/(1 - r)$ and $r = \sum_j b_j$. Since $x_j/y$ does not depend on u and therefore on $u^*$, OLS applied to (11) will give consistent estimators of $l_0$ and $l_j$. One can obtain consistent estimators of $a_0$ and $b_j$ from the estimators of $l_0$ and $l_j$. Mundlak and Hoch (1965) call this an IV (indirect least squares) estimator. Thus, it is not necessary to have any instruments from outside. The endogeneity problem associated with the inputs can be eliminated by expressing the production function in terms of ratios of inputs to output.

3.1.3. Case B: u is fully known but there are optimization errors

With optimization errors the first-order conditions (FOCs) of profit maximization are: $p\,\partial f(x)/\partial x_j\, A = p\,\partial y/\partial x_j = w_j e^{f_j}$, where $f_j \gtrless 0$ represents the optimization error for input $x_j$. We can rewrite these FOCs as

$$\frac{x_j}{y} = b_j \frac{p}{w_j e^{f_j}} \tag{12}$$

Since $x_j/y$ does not depend on u and therefore on $u^*$, OLS applied to (11) will give consistent estimators of $l_0$ and $l_j$ provided that $\mathrm{cov}(f_j, u) = 0 \;\forall j$. This is a standard assumption in the literature. However, if this assumption does not hold the endogeneity problem remains and OLS applied to (11) will be inconsistent.

3.1.4. Case C: u is partially known and there are no optimization errors

Now we assume that u has two components $u_1$ and $u_2$. The $u_1$ part is known to the producers (something like managerial ability) and $u_2$ is unknown (outside the producer's control, such as weather, machine breakdown, etc.). For input decisions producers maximize expected profit conditional on $u_1$. Write $y = E(y)\, e^{u_2}/h \equiv y^e e^{u_2}/h$ where $h = E(e^{u_2})$. The resulting FOCs are

$$\frac{x_j}{y^e} = b_j \frac{p}{w_j} \;\Rightarrow\; \frac{x_j}{y} = b_j \frac{p}{w_j}\, h\, e^{-u_2} \tag{13}$$

Since $\ln(x_j/y) = \ln(x_j/y^e) - u_2 + \ln h$ it is clearly correlated with the error term in (11), which now contains $(u_1 + u_2)$. Thus, OLS applied to (11) will give inconsistent estimators of $l_0$ and $l_j$.

3.1.5. Case D: u is partially known and there are optimization errors

The FOCs in this case are

$$\frac{x_j}{y^e} = b_j \frac{p}{w_j e^{f_j}} \;\Rightarrow\; \frac{x_j}{y} = b_j \frac{p}{w_j e^{f_j}}\, h\, e^{-u_2} \tag{14}$$

Solutions of $\ln(x_j/y)$ from the above FOCs will depend on $u_2$ and the $f$s. Therefore, $\ln(x_j/y)$ will be correlated with the error term in (11), which will include $u_1$ and $u_2$, even if $\mathrm{cov}(f_j, u_1) = \mathrm{cov}(f_j, u_2) = 0 \;\forall j$. Thus, OLS applied to (11) will give inconsistent estimators of $l_0$ and $l_j$.

3.1.6. Case E: u is completely unknown and there are optimization errors

This is the Zellner et al. (1966) case. If producers maximize the expected value of profit, input demand functions will be independent of u. If there are optimization errors and they are assumed to be independent of u, OLS can still be used to estimate the standard production function in which the inputs (x) are regressors.

One possible solution to the endogeneity problem associated with $\ln(x_j/y)$ is to use Klein's approach (1953). To illustrate this approach, we consider the most general case (Case D). The FOCs in Case D are

$$\frac{x_j}{y^e} = b_j \frac{p}{w_j e^{f_j}} \;\Rightarrow\; \ln \frac{w_j x_j}{p\, y} = \ln b_j + \ln h - f_j - u_2 \tag{15}$$

One can get consistent estimates of $b_j$ (provided that $\ln h = 0$) by running OLS on each of the above FOCs. Alternatively, one can use the predicted values of $(x_j/y)$ (obtained from the predicted values of $w_j x_j/(p y)$ in (15)) as regressors in (11). Note that this approach requires information on prices, although it is not necessary for these prices to vary across observations. If prices are available and they do vary across firms one can simply use them as instruments for inputs. Note that if $\ln h \neq 0$ Klein's approach will give inconsistent estimates of $b_j$ from (15).
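The contrast between direct OLS on (2) and OLS on the ratio form (11) in Case A can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not in the paper: a single input (J = 1), hypothetical values for `a0` and `b`, standard normal draws for u and for the log price ratio ln(w/p), and price variation across firms so that ln(x/y) varies in the sample.

```python
import math
import random


def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs (with an intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mx) ** 2 for xi in xs)
    return sxy / sxx


random.seed(0)
a0, b = 1.0, 0.6          # hypothetical technology parameters (b < 1)
n = 20000

ln_x, ln_y, ln_ratio = [], [], []
for _ in range(n):
    u = random.gauss(0.0, 1.0)      # known to the producer, not to the analyst
    q = random.gauss(0.0, 1.0)      # assumed firm-specific ln(w/p)
    r_i = math.log(b) - q           # FOC (10): ln(x/y) = ln b - ln(w/p)
    y_i = (a0 + b * r_i + u) / (1.0 - b)   # ratio form (11) solved for ln y
    ln_ratio.append(r_i)
    ln_y.append(y_i)
    ln_x.append(r_i + y_i)          # ln x = ln(x/y) + ln y

# Direct OLS on (2): ln y on ln x, where ln x depends on u.
b_direct = ols_slope(ln_x, ln_y)

# OLS on (11): ln y on ln(x/y), then recover b from l = b / (1 - b).
l_hat = ols_slope(ln_ratio, ln_y)
b_ratio = l_hat / (1.0 + l_hat)

print("direct OLS estimate of b:", round(b_direct, 3))
print("ratio-form estimate of b:", round(b_ratio, 3))
```

With these assumed variances the direct estimate converges to a weighted average of b and 1, (b + 1)/2 = 0.8, while the ratio-form estimate stays close to the assumed b = 0.6. The design is deliberately stylized; it does not cover Cases C and D, where ln(x/y) itself becomes correlated with the error through u2.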
3.1.7. Estimation of the CD production function II

In this section we follow the production function approach but express it differently to avoid endogeneity of some of the regressors. For this we rewrite (11) as

$$\ln y = l_0 + \sum_j l_j \ln(x_j/x_1) + l_y \ln(x_1/y) + u^* \tag{16}$$

where $l_y = \sum_j l_j$. Going back to Cases A–D we find the following:
(i) In Case A input demand and output supply functions will be affected by u in exactly the same way so that their ratios are independent of u. That is, the regressors in (16) will be independent of the error term.
(ii) In Case B input demand and output supply functions will be affected by u and all $f$ but their ratios will depend only on the $f$s. Therefore, if $f_j$ are independent of u, the regressors in (16) will be independent of the error term.
(iii) In Case C input demand functions will be affected by $u_1$ whereas the output supply function will depend on $u_1$ and $u_2$. Furthermore, the input ratios will be independent of errors. However, $\ln(x_1/y)$ will depend on $u_2$, which cannot be independent of the error term in (16), which contains $u_1 + u_2$. Thus, we will have an endogeneity problem with one regressor, viz., $\ln(x_1/y)$.
(iv) In Case D input demand functions will be affected by $u_1$ and $f$ whereas the output supply function will depend on $u_1$, $u_2$ and $f$. Furthermore, the input ratios will depend only on $f$ and can be independent of the error term in (16) if $f_j$ are independent of $u_1$ and $u_2$. However, $\ln(x_1/y)$ will depend on $f_1 + u_2$, which cannot be independent of the error term in (16), which contains $u_1 + u_2$. Thus, we will have an endogeneity problem with one regressor, viz., $\ln(x_1/y)$.

In summary, the advantage of rewriting the production function in the form of (16) is that the $(J - 1)$ regressors are logs of input ratios, which are independent of the error term, and therefore one needs to find instrument(s) for only one regressor. If prices are available, one can use Klein's approach to construct an instrument for $\ln(x_1/y)$ from its predicted value from the following regression

$$\ln \frac{w_1 x_1}{p\, y} = \ln b_1 - f_1 - u_2 \tag{17}$$

Alternatively, one can use the prices and/or other exogenous variables as instruments for $\ln(x_1/y)$. Thus the main conclusion from this section is that endogeneity of the regressors is driven by whether u is known to the producers or not. If inputs are independent of u it is good news for estimating the production function but bad news for the input distance function, which we discuss next.

3.2. The IDF function

Now we consider estimation of the IDF in (4), which is expressed as

$$\ln x_1 = a_0 + \sum_j b_j \ln(x_j/x_1) + a \ln y + u \tag{18}$$

where u can be either fully or partially known (in which case we write $u = u_1 + u_2$) to the producer.5 As argued above, the input ratios $(x_j/x_1)$ are likely to be independent of the error term. If so, the only regressor that is correlated with the error term is $\ln y$, which is clearly correlated with u in all Cases A–D. In other words, OLS cannot be used to estimate the IDF when y is endogenous. We need instrument(s) for only $\ln y$ while estimating (18). This is much better than direct estimation of the production function, which requires instruments for all the input variables that appear on the right hand side of (2).

5 To estimate it by OLS, the dependent variable $\ln x_1$ cannot be independent of u, which happens in Case E (Zellner et al., 1966). So we have to rule out Case E to discuss estimation of the IDF.

3.3. The IRF function

To estimate the IRF in (5) we rewrite it as

$$\ln x_1 = a_0 + \sum_j b_j \ln x_j + a \ln y + u \tag{19}$$

Similar to the production function, all the regressors in the above IRF are endogenous (correlated with u) and therefore one needs to use instruments for all of them. In this sense estimation of the IRF is no different from estimation of the production function. The only time when the IRF will be different from the production function and IDF (and therefore appropriate to use econometrically) is if all the regressors are exogenous. That is, only one input is variable and all other inputs and output are exogenously given (not decision variables). If not, there are no special advantages of the IRF.

Discussions of this section can be summarized as follows: since different model formulations use different regressors, the endogeneity problem (and therefore the requirement of instruments) in these specifications is not the same, although the underlying transformation function is exactly the same. That is, whether all or some of the regressors are endogenous or not will depend on which formulation/specification is used for estimation. In arriving at this conclusion we assumed that producers maximize profit.6

6 Note that the profit maximizing conditions are not explicitly used here. These conditions are used only in system estimation, which is discussed later.

3.4. The TL transformation function

3.4.1. Estimation of the TL production function

Similar to the CD case, direct estimation of the translog production function in (7) using OLS will give inconsistent parameter estimates because of endogeneity of the right hand side variables (inputs are correlated with u). This will be true in all cases, except for Case E. However, if we start from the transformation function specification in (6) and use a different normalization along with the profit maximizing behavioral assumption it might be possible to use OLS and yet get consistent estimators of the parameters in some cases. Rewrite the transformation function in (6) as

$$\ln f(y, x) = \sum_j b_j \ln(x_j/y) + \tfrac{1}{2} \sum_j \sum_k b_{jk} \ln(x_j/y) \ln(x_k/y) + \Big\{a_y + \sum_j b_j\Big\} \ln y + \sum_j \Big\{d_{jy} + \sum_k b_{jk}\Big\} \ln x_j \ln y + \tfrac{1}{2} \Big\{a_{yy} - \sum_j \sum_k b_{jk}\Big\} (\ln y)^2 \tag{20}$$

and use the following (J + 2) normalizations7: $a_y + \sum_j b_j = -1$, $d_{jy} + \sum_k b_{jk} = 0 \;\forall j$, $a_{yy} - \sum_j \sum_k b_{jk} = 0$, to express the technological relationship as

$$\ln y = a_0 + \sum_j b_j \ln(x_j/y) + \tfrac{1}{2} \sum_j \sum_k b_{jk} \ln(x_j/y) \ln(x_k/y) + u \tag{21}$$
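The rearrangement of (6) into (20) is a pure algebraic identity, which can be verified numerically before any normalization is imposed. The sketch below is an illustrative check only: the function names `ln_f_eq6` and `ln_f_eq20` and the randomly drawn parameter values are hypothetical, and the matrix `bb` of second-order coefficients is forced to be symmetric as required by (6).

```python
import random


def ln_f_eq6(ln_y, ln_x, a_y, a_yy, b, bb, d):
    """Translog transformation function as written in (6)."""
    J = len(ln_x)
    return (
        a_y * ln_y
        + 0.5 * a_yy * ln_y ** 2
        + sum(b[j] * ln_x[j] for j in range(J))
        + 0.5 * sum(bb[j][k] * ln_x[j] * ln_x[k] for j in range(J) for k in range(J))
        + sum(d[j] * ln_x[j] * ln_y for j in range(J))
    )


def ln_f_eq20(ln_y, ln_x, a_y, a_yy, b, bb, d):
    """Same function rearranged in terms of ln(x_j / y), as in (20)."""
    J = len(ln_x)
    r = [ln_x[j] - ln_y for j in range(J)]          # ln(x_j / y)
    sum_bb = sum(sum(row) for row in bb)            # sum_j sum_k b_jk
    return (
        sum(b[j] * r[j] for j in range(J))
        + 0.5 * sum(bb[j][k] * r[j] * r[k] for j in range(J) for k in range(J))
        + (a_y + sum(b)) * ln_y
        + sum((d[j] + sum(bb[j])) * ln_x[j] * ln_y for j in range(J))
        + 0.5 * (a_yy - sum_bb) * ln_y ** 2
    )


random.seed(1)
J = 3
max_gap = 0.0
for _ in range(100):
    a_y, a_yy = random.uniform(-2, 2), random.uniform(-1, 1)
    b = [random.uniform(-1, 1) for _ in range(J)]
    d = [random.uniform(-1, 1) for _ in range(J)]
    raw = [[random.uniform(-1, 1) for _ in range(J)] for _ in range(J)]
    bb = [[0.5 * (raw[j][k] + raw[k][j]) for k in range(J)] for j in range(J)]
    ln_y = random.uniform(-1, 2)
    ln_x = [random.uniform(-1, 2) for _ in range(J)]
    gap = abs(ln_f_eq6(ln_y, ln_x, a_y, a_yy, b, bb, d)
              - ln_f_eq20(ln_y, ln_x, a_y, a_yy, b, bb, d))
    max_gap = max(max_gap, gap)

print("largest discrepancy between (6) and (20):", max_gap)
```

Because the two expressions agree for arbitrary parameter values, setting the three bracketed coefficients in (20) to −1, 0 and 0 is a choice of normalization rather than a restriction on the technology, which is what yields (21).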
Consider first Case A where u is fully known to the producer and there are no optimization errors. If we assume that producers maximize profit subject to the transformation function in (6), the FOCs are: $p + k f_y(y, x) A = 0$ and $-w_j + k f_j(y, x) A = 0$, where $f_y(\cdot)$ and $f_j(\cdot)$ are partial derivatives of $f(y, x)$ with respect to y and $x_j$, respectively, and k is the Lagrange multiplier. These FOCs can be expressed as $w_j x_j/(p y) = -\,[\partial \ln f(y, x)/\partial \ln x_j] \,/\, [\partial \ln f(y, x)/\partial \ln y]$. Using the translog function in (6) these derivatives are

$$\begin{aligned}
\partial \ln f(y, x)/\partial \ln x_j &= b_j + \sum_k b_{jk} \ln x_k + d_{jy} \ln y = b_j + \sum_k b_{jk} \ln(x_k/y) + \Big\{d_{jy} + \sum_k b_{jk}\Big\} \ln y \\
\partial \ln f(y, x)/\partial \ln y &= a_y + a_{yy} \ln y + \sum_j d_{jy} \ln x_j = a_y + \Big\{a_{yy} + \sum_j d_{jy}\Big\} \ln y + \sum_j d_{jy} \ln(x_j/y)
\end{aligned} \tag{22}$$

If we apply the above normalizations in (22), the FOCs become8

$$\frac{w_j x_j}{p\, y} = -\,\frac{b_j + \sum_k b_{jk} \ln(x_k/y)}{a_y + \sum_j d_{jy} \ln(x_j/y)} \tag{23}$$

It is clear from these FOCs that one can solve for $x_j/y$, which will be independent of u. Thus, there is no need to use any instrument for them and the relationship in (21) can be consistently estimated using OLS. The trick is to express the right hand side variables in ratio form and make use of the profit maximizing behavior to argue that these ratios are not correlated with the u term. This result will also hold in Case B where the FOCs are

$$\frac{w_j e^{f_j} x_j}{p\, y} = -\,\frac{b_j + \sum_k b_{jk} \ln(x_k/y)}{a_y + \sum_j d_{jy} \ln(x_j/y)} \tag{24}$$

which can be solved for $x_j/y$. These solutions will be independent of u and therefore, if $f_j$ are assumed to be independent of u, OLS estimates in (21) will be consistent.

In Case C, $u = u_1 + u_2$ and $u_1$ is known. As before we assume that producers maximize expected profit (profit based on E(y) conditional on $u_1$, i.e., $y^e$). This will give FOCs that are the same as those in Case A, except that y will be replaced by $y^e$. Furthermore, $x_j/y^e$ will be independent of the error term in (21), but when the unobserved $x_j/y^e$ are replaced by $x_j/y$ they will be correlated with the error term in (21). Thus one needs instruments for each and every regressor in (21). The same story holds in Case D. Here the FOCs will be similar to those in (24) with the exception that y is replaced by $y^e$. Under the assumption that $f_j$ are independent of $u_1$ and $u_2$, $x_j/y^e$ will be independent of the errors in (21). But since $x_j/y$ are the regressors to be used in estimation and they are correlated with the error term in (21) (via the common error term $u_2$), we need instruments for all the regressors. In Case E inputs will be independent of the errors in (21) but not $x_j/y$. In such a case one should use OLS to estimate (7), not (21).

7 Note that these normalizations are different from those used to get the translog production function in (7).
8 The same FOCs can be derived from maximizing profit subject to the production function in (21).

Thus, the conclusion is that in the more realistic cases (Cases C and D) the OLS estimator in (21) will be inconsistent. Now we explore one possibility to reduce the number of endogenous regressors. For this we rewrite (21) as

$$\ln y = a_0 + \sum_j b_j \ln \hat{x}_j + \tfrac{1}{2} \sum_j \sum_k b_{jk} \ln \hat{x}_j \ln \hat{x}_k + (1 + a_y) \ln(y/x_1) + \tfrac{1}{2} a_{yy} \big(\ln(y/x_1)\big)^2 + \sum_j d_{jy} \ln \hat{x}_j \ln(y/x_1) + u, \tag{25}$$

where $\hat{x}_j = x_j/x_1$. To check endogeneity of the regressors, we consider Case D only because it is the most general and more realistic case. If we rewrite the FOCs of profit maximization based on $y^e$ as $(w_j x_j/w_1 x_1)\exp(f_j - f_1) = [\partial \ln f(y^e, x)/\partial \ln x_j] \,/\, [\partial \ln f(y^e, x)/\partial \ln x_1]$ and $(p\, y^e/w_1 x_1)\exp(-f_1) = -\,[\partial \ln f(y^e, x)/\partial \ln y^e] \,/\, [\partial \ln f(y^e, x)/\partial \ln x_1]$, the FOCs in (23) can be rewritten as

$$\begin{aligned}
\frac{w_j x_j}{w_1 x_1} \exp(f_j - f_1) &= \frac{b_j + \sum_k b_{jk} \ln(x_k/x_1) + d_{jy} \ln(y^e/x_1)}{b_1 + \sum_k b_{1k} \ln(x_k/x_1) + d_{1y} \ln(y^e/x_1)} \\
\frac{p\, y^e}{w_1 x_1} \exp(-f_1) &= -\,\frac{a_y + \sum_k d_{ky} \ln(x_k/x_1) + a_{yy} \ln(y^e/x_1)}{b_1 + \sum_k b_{1k} \ln(x_k/x_1) + d_{1y} \ln(y^e/x_1)}
\end{aligned} \tag{26}$$

These FOCs can be solved for $x_j/x_1$ and $y^e/x_1$, which will depend on the $f$s, which are assumed to be independent of $u_1$ and $u_2$. However, when the unobserved $y^e/x_1$ is replaced by $y/x_1$ it will be correlated with the error term in (25), and we need instruments for this regressor only. Thus, by estimating (25) we reduce the number of correlated regressors from $(J - 1)$ to 1.

3.4.2. Estimation of the TL IDF

We now consider estimation of the TL IDF in (8). It is clear from the discussion above that all the regressors (especially $\ln y$) in (8) cannot be independent of the error term. To show this we consider Case A for simplicity. Using the normalizations $\sum_j b_j = -1$; $\sum_k b_{jk} = 0 \;\forall j$; $\sum_j d_{jy} = 0$ that are used in (8), the FOCs of profit maximization are

$$\begin{aligned}
\frac{w_j x_j}{w_1 x_1} &= \frac{b_j + \sum_k b_{jk} \ln(x_k/x_1) + d_{jy} \ln y}{b_1 + \sum_k b_{1k} \ln(x_k/x_1) + d_{1y} \ln y} \\
\frac{p\, y}{w_1 x_1} &= -\,\frac{a_y + \sum_k d_{ky} \ln(x_k/x_1) + a_{yy} \ln y}{b_1 + \sum_k b_{1k} \ln(x_k/x_1) + d_{1y} \ln y}
\end{aligned} \tag{27}$$

It is clear from the first $(J - 1)$ FOCs above that the solution of input ratios will depend on $\ln y$, which depends on u.9 So the input ratios as well as the output variable in the standard input distance function in (8) are endogenous (correlated with u), and therefore use of OLS to estimate (8) will give inconsistent parameter estimates.

9 If the translog function is separable, i.e., $d_{jy} = 0 \;\forall j$, then the solutions of input ratios will be independent of y and these ratios will be independent of u. In such a case one can treat the input ratios in (8) as exogenous and needs to use an instrument only for $\ln y$ to estimate (8). This is mentioned in Coelli (2000).

This inconsistency can be avoided if we use the normalizations: $a_y + \sum_j b_j = -1$, $d_{jy} + \sum_k b_{jk} = 0 \;\forall j$, $a_{yy} - \sum_j \sum_k b_{jk} = 0$, and rewrite the transformation function in (6) as

$$\ln x_1 = a_0 + \sum_j b_j \ln \hat{x}_j + \tfrac{1}{2} \sum_j \sum_k b_{jk} \ln \hat{x}_j \ln \hat{x}_k + a_y \ln(y/x_1) + \tfrac{1}{2} a_{yy} \big(\ln(y/x_1)\big)^2 + \sum_j d_{jy} \ln \hat{x}_j \ln(y/x_1) + u, \tag{28}$$

where $\hat{x}_j = x_j/x_1$. The corresponding FOCs can be written as

$$\begin{aligned}
\frac{w_j x_j}{w_1 x_1} &= \frac{b_j + \sum_k b_{jk} \ln(x_k/x_1) + d_{jy} \ln(y/x_1)}{b_1 + \sum_k b_{1k} \ln(x_k/x_1) + d_{1y} \ln(y/x_1)} \\
\frac{p\, y}{w_1 x_1} &= -\,\frac{a_y + \sum_k d_{ky} \ln(x_k/x_1) + a_{yy} \ln(y/x_1)}{b_1 + \sum_k b_{1k} \ln(x_k/x_1) + d_{1y} \ln(y/x_1)}
\end{aligned} \tag{29}$$
These FOCs can be solved for $x_j/x_1$ and $y/x_1$, which will be independent of u. Note that the specification of the technology commonly used in (8) is not the same as the one specified in (25). That is, the specification in (28) is not an input distance function. However, since all the right hand side variables in (28) are uncorrelated with u (which follows from the FOCs in (29)), applying OLS to (28) will give consistent parameter estimates. This argument will also hold in Case D, for which the FOCs will be the same as those in (26). These FOCs can be solved for $x_j/x_1$ and $y^e/x_1$, which will depend on the $f$s, which are assumed to be independent of $u_1$ and $u_2$. However, when the unobserved $y^e/x_1$ is replaced by $y/x_1$ it will be correlated with the error term in (28), and we need instruments for this regressor only.

The other alternative to avoid inconsistency is to assume that output is exogenously given and the producer minimizes cost given the technology and the exogenously given output quantity. Under this assumption the FOCs will be given by the first $(J - 1)$ equations in (27), which can be used to solve for the input ratios (which will be functions of exogenously given input prices and output), which will be uncorrelated with u (by assumption). In such a case one can use OLS to estimate the IDF in (8) consistently. On the contrary, if $\ln y$ is exogenous then applying OLS to (25) will give inconsistent parameter estimates, because one of the right hand side variables, viz., $\ln(y/x_1)$, will be endogenous since $\ln x_1$ is endogenous. This clearly shows the importance of the behavioral assumption in deciding the appropriate function to be used in estimation, although all these functions are algebraically the same since all of them are derived from the same transformation function. In summary, unless output is exogenous the use of OLS to estimate the IDF is problematic.
If output is endogenous one cannot treat the input ratios as exogenous (an assumption that is routinely made in IDF estimation), unless the technology is homothetic (output is separable from the inputs in the transformation function). For the non-homothetic case, the input ratios can be exogenous if the IDF is written differently, so that the ln y variable on the right-hand side is expressed as ln(y/x1), which is then the only regressor that needs instruments.
Given the discussion of endogeneity above, there is no need for a separate discussion of the estimation of the IRF. So far as the IRF in (9) is concerned, all the right-hand-side variables are endogenous. Thus, the use of OLS will give inconsistent parameter estimates. This is true even if output is exogenously given. Furthermore, the OLS estimates are not invariant to the choice of the input used as the dependent variable.
4. Estimation: a system approach
4.1. Estimation of CD functions
Here we consider only two cases (Cases C and D) to avoid repetition. These are also the more realistic cases.
4.1.1. u is fully known to the producer
The estimation procedure discussed earlier was based on a single equation method without using the price information explicitly (other than perhaps as instruments). In this section we assume price data are available and the prices are not the same for all producers.10
10 Even if prices are the same for all producers (which is consistent with competitive markets) the system approach will work. In such a case we do not even need data on prices because ln(wj/p) in (30) will be constant parameters that can be estimated along with the other parameters.
We use the FOCs that involve prices, along with the transformation function, to define the system. This system consists of the production function in (2) and the FOCs of profit maximization in (12), viz.,
\[
\begin{aligned}
\ln y &= \alpha_0 + \sum_j \beta_j \ln x_j + u, \\
\ln x_j - \ln y &= \ln \beta_j - \ln(w_j/p) - \zeta_j, \qquad j = 1, \dots, J,
\end{aligned}
\tag{30}
\]
which can be estimated using system approaches such as nonlinear 3SLS, FIML, etc. The ζj terms in (30) are added to accommodate optimization errors. The exogenous variables in the above system are the price ratios (wj/p, j = 1, . . . , J). Since the above system accounts for the endogeneity of y and all x variables, there is no need to define a system based on the IDF and/or IRF. These systems will be identical (both algebraically and econometrically) to the system in (30).
4.1.2. u is partially known to the producer
We now consider the case when output is also affected by factors unknown to both producer and analyst (u2). The u1 part is known to the producer. If we assume that the objective of the producer is to maximize expected profit (i.e., profit based on y^e) the FOCs can be expressed as
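\[
\begin{aligned}
\ln y &= \alpha_0 + \sum_j \beta_j \ln x_j + u_1 + u_2, \\
\ln x_j - \ln y &= \ln \beta_j + \ln \theta - \ln(w_j/p) - \zeta_j - u_2, \qquad j = 1, \dots, J.
\end{aligned}
\tag{31}
\]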
Note that the u2 term now appears in each of the equations, and it cannot be separated from the other error components. That is, the error terms in the above system will be u = u1 + u2 and vj = −ζj − u2, which are assumed to be multivariate normal with zero mean and constant variance if FIML is used. It is not necessary to assume that the ζj are independent of u1 and u2. Furthermore, θ can be identified since there are no separate intercept terms in the FOCs.11 Since the above system accounts for the endogeneity of y and all x variables explicitly, there is no need to define a system based on the IDF and/or IRF. These systems will be identical (both algebraically and econometrically) to the system in (31).
4.2. Estimation of translog functions
4.2.1. u is fully known to the producer
Similar to the CD system, the TL system consists of the TL production function in (7) and the FOCs in (23), using the appropriate normalization for the production function specification in (7), viz.,
\[
\begin{aligned}
\ln y &= \alpha_0 + \sum_j \beta_j \ln x_j + \frac{1}{2} \sum_j \sum_k \beta_{jk} \ln x_j \ln x_k + u, \\
\ln(w_j/p) + \ln x_j - \ln y &= \ln\Big[\beta_j + \sum_k \beta_{jk} \ln x_k\Big] - \zeta_j, \qquad j = 1, \dots, J,
\end{aligned}
\tag{32}
\]
where, as before, the ζj terms in (32) are added to accommodate optimization errors. The above system can be estimated using FIML (assuming the error vector in the system is normally distributed) or 3SLS. The systems based on the IDF and/or IRF will be identical to the production system above. Note that since the endogeneity of the output and input variables is explicitly recognized in the system, there is no need to write the system in such a way that the inputs appear in ratio form. The system in (32) has (J + 1) equations with (J + 1) endogenous variables.
4.2.2. u is partially known to the producer
Using similar arguments we can express the system for the TL production function when producers decide on input allocation based on y^e as
11 If prices do not vary because the markets are competitive, ln(wj/p) will be a constant that can be combined with ln θ. It is not necessary to identify θ.
\[
\begin{aligned}
\ln y &= \alpha_0 + \sum_j \beta_j \ln x_j + \frac{1}{2} \sum_j \sum_k \beta_{jk} \ln x_j \ln x_k + u_1 + u_2, \\
\ln(w_j/p) + \ln x_j - \ln y^e &= \ln\Big[\beta_j + \sum_k \beta_{jk} \ln x_k\Big] - \zeta_j, \qquad j = 1, \dots, J,
\end{aligned}
\tag{33}
\]
where ln y^e = ln y − u2 + ln θ. The above system can be estimated using any of the system approaches after replacing ln y^e by ln y − u2 + ln θ. As before, u2 cannot be separated from the error vector (u1 + u2, −ζj − u2).
5. Data
In this study we used cross-sectional data (for the year 2003) on 3249 active forest owners (i.e., owners who harvest trees) extracted from the 2004 Sample Survey of Agriculture and Forestry, compiled by Statistics Norway. The output variable (y) consists of annual timber sales from the forest. The labor variable (x1) is calculated as the sum of hours worked by contractors and hours worked by the owner, his family or hired labor in 2003. The forest area cut variable (x2), in hectares, is the area of various types of final fellings in 2003. The capital input variable (x3) is an estimate of the value of the increment from the forest. This is calculated for each property as the mean weighted price of various timber qualities sold in 2003 multiplied by the maximum sustainable yield (MSY) of the forest resources on the property (i.e., the amount of timber that can be cut without decreasing opportunities for future harvesting). Variation in timber quality on forest land (and thus variation in valuation) between properties is accounted for using the weighted price.
There are some other control variables that we used as instruments (in various combinations). These are: z1 = age of the forest owner (in years); z2 = income from outfield-related production (Norwegian kroner, NOK); z3 = income from agriculture (NOK); z4 = wage income (NOK); z5 = debt (NOK); z6 is a dummy variable for management plan; z7 is a dummy variable for education (1 for bachelor degree or higher education, 0 otherwise); z8 is a dummy variable for centrality (1 for properties close to urban areas, 0 otherwise). See Lien et al. (2007) for details about the data (part of their Table 1 on p. 71 is reproduced in Table 1).
6. Empirical results
6.1. Single equation models: CD specifications
6.1.1. Production function results
First we report results based on the production function estimated in the following forms:
\[
\text{Model 1:}\quad \ln y = \alpha_0 + \sum_j \beta_j \ln x_j + \varepsilon \tag{34}
\]
\[
\text{Model 2:}\quad \ln y = \mu_0 + \sum_j \mu_j \ln(x_j/x_1) + \mu_y \ln(x_1/y) + \varepsilon_1 \tag{35}
\]
\[
\text{Model 3:}\quad \ln y = \delta_0 + \sum_j \delta_j \ln(x_j/x_1) + \delta_y \ln x_1 + \varepsilon_2 \tag{36}
\]
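Because the three forms are reparameterizations of one Cobb–Douglas function, the input elasticities and RTS can be recovered from any of them. The worked mapping below is a sketch derived here by rearranging (35) and (36), assuming the sums over input ratios run over j = 2, ..., J (the ratio for j = 1 is identically zero); the paper itself does not spell these formulas out.

```latex
% From (36): collect the coefficient on each ln x_j.
\ln y = \delta_0 + \sum_{j\ge 2}\delta_j \ln x_j
        + \Big(\delta_y - \sum_{j\ge 2}\delta_j\Big)\ln x_1 + \varepsilon_2
\;\Rightarrow\;
\beta_j = \delta_j \ (j \ge 2), \quad
\beta_1 = \delta_y - \sum_{j\ge 2}\delta_j, \quad
\mathrm{RTS} = \sum_j \beta_j = \delta_y .

% From (35): move \mu_y \ln y to the left-hand side and divide by (1 + \mu_y).
(1+\mu_y)\ln y = \mu_0 + \sum_{j\ge 2}\mu_j \ln x_j
        + \Big(\mu_y - \sum_{j\ge 2}\mu_j\Big)\ln x_1 + \varepsilon_1
\;\Rightarrow\;
\beta_j = \frac{\mu_j}{1+\mu_y} \ (j \ge 2), \quad
\beta_1 = \frac{\mu_y - \sum_{j\ge 2}\mu_j}{1+\mu_y}, \quad
\mathrm{RTS} = \frac{\mu_y}{1+\mu_y} .
```

Under this mapping the coefficient on ln x1 in Model 3 is itself the RTS, which is why instrumenting that single regressor is enough to recover returns to scale.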
Table 1
Summary statistics of the main variables.

Symbol  Variable name        Mean      Min.    Max.
y       Output (m3)          997.4     2       46,068
x1      Labor (hours)        58.04     0.126   2632.4
x2      Material (hectares)  5.97      0.012   229
x3      Capital (NOK)        313,740   1476    16,186,055

Table 2
Input elasticities and returns to scale: the CD models (standard errors in parentheses).

Model                          x1             x2             x3             RTS
Production function
Model 1                        0.936 (0.005)  0.062 (0.005)  0.042 (0.004)  1.041 (0.003)
Model 1 (IV for all inputs)    0.940 (0.005)  0.059 (0.005)  0.035 (0.006)  1.034 (0.005)
Model 2 (IV for ln(x1/y))      0.860 (0.014)  0.139 (0.014)  0.182 (0.022)  1.182 (0.022)
Model 3 (IV for ln x1)         0.944 (0.005)  0.056 (0.005)  0.032 (0.006)  1.032 (0.005)
Input distance function
Model 4                        0.928 (0.005)  0.073 (0.005)  0.067 (0.004)  1.068 (0.003)
Model 4 (IV for ln y)          0.942 (0.005)  0.058 (0.006)  0.036 (0.006)  1.036 (0.005)
System approach
Model 5                        0.891 (0.000)  0.063 (0.000)  0.031 (0.000)  0.985 (0.000)

Note: The instruments used in Models 2–4 are z1–z8 and their interactions with ln(x2/x1) and ln(x3/x1). Note that ln(x2/x1) and ln(x3/x1) are exogenous for both profit maximization and cost minimization behaviors.

The production function in (34) is estimated with and without instruments for the input variables. As argued before, if inputs are
correlated with the error term the OLS estimators will be inconsistent. One can avoid the inconsistency problem by using instrumental variables for the inputs in (34). Note that one needs at least J instruments for J inputs.12 However, if we rewrite the production function in (34) as (35), in which all the inputs (except for x1) are written in ratio form, it is necessary to use an instrument for only ln(x1/y). This is because the input ratios are independent of the error term irrespective of whether producers know u. The formulation in (36) is similar to (35) except that the ln(x1/y) term is replaced by ln x1, for which we need to use an instrument. Thus in both specifications (35) and (36) we have one endogenous regressor. In contrast, (34) has J endogenous regressors. Because of this, specifications (35) and (36) are preferred.
To make the results from different models directly comparable we report (in Table 2) input elasticities and returns to scale (sum of the input elasticities) from the CD model specifications in (34)–(36). The input elasticities and returns to scale (RTS) estimates from Model 1 (which applies OLS to the production function in (34)) look reasonable from an economic point of view. The labor elasticity is the largest, as expected since labor is the most important input in harvesting trees. RTS slightly exceeds unity,13 meaning that the forest owners are operating slightly below their most efficient scale. However, if inputs are endogenous and correlated with the error term the OLS estimators will be inconsistent. This correlation comes from the fact that the solution for ln xj from the FOCs will depend on u, which is also the error term in the production function. Hoch (1958) derived the asymptotic bias of the OLS estimators and showed that the estimated RTS (based on OLS) is biased (upwards) towards unity. This is also confirmed in Monte Carlo studies by Kmenta and Joseph (1963). This parameter inconsistency can be avoided by using IV for all the inputs.
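The bias of OLS towards unity, and its removal by instrumenting all inputs, can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not from the paper: a two-input Cobb–Douglas technology with hypothetical parameters (true RTS = 0.8), u fully known to the producer, and the price ratios ln(wj/p) used as instruments (the data application instead uses the z variables and input ratios).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
a0, b1, b2 = 1.0, 0.5, 0.3          # hypothetical technology, true RTS = 0.8

lw1 = rng.normal(0.0, 0.5, n)       # ln(w1/p), exogenous price ratio
lw2 = rng.normal(0.0, 0.5, n)       # ln(w2/p), exogenous price ratio
u = rng.normal(0.0, 0.3, n)         # production shock, known to the producer
zeta1 = rng.normal(0.0, 0.1, n)     # optimization errors
zeta2 = rng.normal(0.0, 0.1, n)

# FOCs: ln x_j - ln y = c_j, with c_j = ln b_j - ln(w_j/p) - zeta_j
c1 = np.log(b1) - lw1 - zeta1
c2 = np.log(b2) - lw2 - zeta2
# Solve the production function and the FOCs jointly for ln y, then ln x_j
ly = (a0 + b1 * c1 + b2 * c2 + u) / (1.0 - b1 - b2)
lx1, lx2 = ly + c1, ly + c2


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def tsls(X, Z, y):
    """Two-stage least squares: project X on instruments Z, then regress y."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


const = np.ones(n)
X = np.column_stack([const, lx1, lx2])   # Model 1 regressors
Z = np.column_stack([const, lw1, lw2])   # assumed instruments: price ratios

rts_ols = ols(X, ly)[1:].sum()           # sum of input elasticities, OLS
rts_iv = tsls(X, Z, ly)[1:].sum()        # sum of input elasticities, IV
print(f"true RTS = {b1 + b2:.3f}, OLS RTS = {rts_ols:.3f}, IV RTS = {rts_iv:.3f}")
```

In this design both ln x1 and ln x2 load on u through ln y, so the OLS estimate of RTS is pulled from 0.8 towards one, while the IV estimate stays close to the true value.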
We used various combinations of the z variables and input ratios as instruments. The results for this model (reported under Model 1 (IV for all inputs)) show that RTS and input elasticities are quite close to the OLS estimates. Instead of using IV for all the inputs, we consider another variant of the production function (Model 3) in which only one
12 If some of the inputs are quasi-fixed so that they can be treated as exogenous, there is no need to use instruments for them.
13 Note that the second order condition for profit maximization requires RTS to be less than unity, and therefore RTS being greater than unity might be viewed as a violation of profit maximization behavior. However, since the profit maximization behavior is not imposed in this model explicitly, it might not be treated as a violation. Based on this result one can argue for using the first-order conditions of profit maximization explicitly (as done in the system model).
regressor (ln x1) is endogenous. Since all the inputs are affected by u in the same way,14 the input ratios are independent of u irrespective of whether u is fully or partially known. Thus in Model 3 we need to use IV for only ln x1. The resulting estimates of input elasticities are not too different from those in Model 1 with IV for all inputs. Thus, although the input elasticities and RTS from Model 1 without IV are quite similar to those from Model 3, we prefer the latter model because it takes into account the endogeneity of inputs. Model 3 might be preferred to Model 1 with IV on the ground that the former requires IV only on ln x1. The Hausman test in Model 3 shows that ln x1 is endogenous. Since RTS exceeds unity, the second order condition of profit maximization is not satisfied in these models. Note that here we are using the profit maximization behavior to determine endogeneity of regressors. The FOCs for profit maximization are not used in estimating these models.
The regressors in Model 2 are in ratio form, and under the assumption that u is fully known it will affect all the inputs and output in exactly the same way. If so, applying OLS to Model 2 will give consistent estimators of input elasticities. Although all the input elasticities are found to be positive, these are substantially different from those in other models. RTS is found to be quite high (1.707) and the elasticities associated with x2 and x3 are 6 to 22 times bigger (compared to Model 3). We take this as evidence that there is something wrong with this model. One possible problem (discussed before) might be that producers know u only partially, in which case ln(x1/y) will be correlated with the error term. Therefore, the OLS estimators of Model 2 will be inconsistent. When Model 2 is re-estimated using IV for ln(x1/y) the input elasticities are still different and RTS is slightly bigger (1.182) compared to other models.
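The single-instrumented-regressor logic of Model 3, and the consistency of OLS on Model 2 when u is fully known, can be checked on simulated data. The sketch below rests on assumptions of its own: hypothetical Cobb–Douglas parameters (true RTS = 0.8), u fully known, optimization errors independent of u, and price ratios as the excluded instruments. The RTS formula used for Model 2, μy/(1 + μy), follows from rearranging (35) and is not quoted from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
a0, b1, b2 = 1.0, 0.5, 0.3          # hypothetical technology, true RTS = 0.8

lw1 = rng.normal(0.0, 0.5, n)       # ln(w1/p)
lw2 = rng.normal(0.0, 0.5, n)       # ln(w2/p)
u = rng.normal(0.0, 0.3, n)         # shock fully known to the producer
c1 = np.log(b1) - lw1 - rng.normal(0.0, 0.1, n)   # FOC gap ln x1 - ln y
c2 = np.log(b2) - lw2 - rng.normal(0.0, 0.1, n)   # FOC gap ln x2 - ln y
ly = (a0 + b1 * c1 + b2 * c2 + u) / (1.0 - b1 - b2)
lx1, lx2 = ly + c1, ly + c2


def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]


def tsls(X, Z, y):
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


const = np.ones(n)
ratio = lx2 - lx1                    # ln(x2/x1): free of u by construction

# Model 3: ln y on ln(x2/x1) and ln x1; only ln x1 is instrumented.
X3 = np.column_stack([const, ratio, lx1])
Z3 = np.column_stack([const, ratio, lw1, lw2])
rts_m3_ols = ols(X3, ly)[2]          # coefficient on ln x1 equals RTS
rts_m3_iv = tsls(X3, Z3, ly)[2]

# Model 2: ln y on ln(x2/x1) and ln(x1/y); OLS, valid here because u is known.
X2 = np.column_stack([const, ratio, lx1 - ly])
mu_y = ols(X2, ly)[2]
rts_m2_ols = mu_y / (1.0 + mu_y)

print(f"Model 3 OLS {rts_m3_ols:.3f}, Model 3 IV {rts_m3_iv:.3f}, "
      f"Model 2 OLS {rts_m2_ols:.3f}")
```

If u2 were added to output after the input decision, ln(x1/y) would pick up u2 and the Model 2 OLS result above would no longer hold, which is the situation described in the text.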
One possible explanation for the difference might be a weak IV problem. That is, although the same variables are used as instruments for all models, they are better instruments for ln x1 than for ln(x1/y). This is confirmed by the low R2 value in the regression of ln(x1/y) (compared to ln x1) on the list of instruments.
Based on the three different production models, we conclude that Model 3 might be preferred since it recognizes endogeneity of inputs and requires instruments for only one regressor. Model 3 (with IV for ln x1) will give consistent parameter estimates even if producers know u partially and there are optimization errors, provided that these optimization errors are uncorrelated with the errors in the production function (u1 and u2). If not, IV estimates from Models 2 and 3 will be inconsistent because the input ratios will be correlated with the errors in the production function. In such a case we need IV for all the regressors. Although Models 2 and 3 use IV for the endogenous variables, these models (similar to Model 1) do not use the profit maximizing behavior explicitly. Thus, although RTS exceeds unity in these models, one may not argue that these models are inconsistent with profit maximizing behavior.
6.1.2. Input distance function results
The estimated model for IDF is
\[
\text{Model 4:}\quad \ln x_1 = \alpha_0 + \sum_j \beta_j \ln(x_j/x_1) + \alpha \ln y + \varepsilon_3 \tag{37}
\]
which is first estimated without IV for ln y and then with IV. Based on our arguments above, ln y will depend on either u or both u1 and u2, and therefore it will be correlated with the error term in Model 4 (which will contain u1 and ζ in Case D, and u1 in Case C). Thus, the OLS estimators of Model 4 will be inconsistent irrespective of whether there are optimization errors and whether u is only partially known.
14 This can be easily shown by deriving the input demand and output supply functions.
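The endogeneity of ln y in an IDF of the form (37) can be illustrated in the same simulated setting. This is a sketch under stated assumptions: hypothetical Cobb–Douglas parameters (true RTS = 0.8), price ratios as excluded instruments, and a regression-based Durbin–Wu–Hausman check (first-stage residual added as a regressor) standing in for the Hausman test mentioned in the paper. For a Cobb–Douglas technology the coefficient on ln y in this form is 1/RTS.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
a0, b1, b2 = 1.0, 0.5, 0.3          # hypothetical technology, true RTS = 0.8

lw1 = rng.normal(0.0, 0.5, n)       # ln(w1/p)
lw2 = rng.normal(0.0, 0.5, n)       # ln(w2/p)
u = rng.normal(0.0, 0.3, n)
c1 = np.log(b1) - lw1 - rng.normal(0.0, 0.1, n)
c2 = np.log(b2) - lw2 - rng.normal(0.0, 0.1, n)
ly = (a0 + b1 * c1 + b2 * c2 + u) / (1.0 - b1 - b2)
lx1, lx2 = ly + c1, ly + c2


def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)
ratio = lx2 - lx1
X4 = np.column_stack([const, ratio, ly])          # IDF regressors
Z4 = np.column_stack([const, ratio, lw1, lw2])    # assumed instruments

# OLS on the IDF: the coefficient on ln y is 1/RTS in the CD case.
rts_idf_ols = 1.0 / ols(X4, lx1)[2]

# 2SLS: replace ln y by its projection on the instruments.
ly_hat = Z4 @ ols(Z4, ly)
X4_hat = np.column_stack([const, ratio, ly_hat])
rts_idf_iv = 1.0 / ols(X4_hat, lx1)[2]

# Regression-based endogeneity check: add the first-stage residual and
# compute the t-statistic on its coefficient.
v = ly - ly_hat
Xa = np.column_stack([X4, v])
ba = ols(Xa, lx1)
resid = lx1 - Xa @ ba
s2 = resid @ resid / (n - Xa.shape[1])
cov = s2 * np.linalg.inv(Xa.T @ Xa)
t_dwh = ba[3] / np.sqrt(cov[3, 3])

print(f"IDF OLS RTS {rts_idf_ols:.3f}, IDF IV RTS {rts_idf_iv:.3f}, "
      f"endogeneity t-stat {t_dwh:.1f}")
```

A large absolute t-statistic on the first-stage residual indicates that ln y is correlated with the IDF error, in line with the argument above.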
Given that ln y is endogenous, the results from the IV model will be preferred. The Hausman test also confirms that ln y in the above IDF is endogenous. RTS from Model 4 with IV is slightly lower than the one from Model 4 without IV (1.036 vs. 1.068).
6.1.3. Input requirement function results
We argued above that all inputs and output are related to u, the error term in the production function. Thus, as in the production function models, all the regressors in the IRF are endogenous and correlated with the error term. Consequently, the OLS estimators from the IRF will be inconsistent. Because of this we have not reported any results from the IRF. Furthermore, the IRF has no advantage over the production and input distance function models when it comes to using instrumental variables, because all the regressors in the IRF are endogenous. By contrast, only one regressor (ln y) in the IDF needs to be instrumented. Also, the production function specification in Model 3 requires instruments for one regressor (ln x1). So the conclusion is that there is no point in using the IRF, especially when all the inputs and output are endogenous.
6.1.4. Cobb–Douglas system results
In a system approach we explicitly take into account the endogeneity of inputs and outputs by including the first-order conditions of profit maximization in the estimation process. Because of this, it does not matter whether the system consists of the production function and the FOCs, the IDF and the FOCs, or the IRF and the FOCs. These are identical both algebraically and econometrically because the system takes into account the endogeneity of the input and output variables. This is, however, not the case in the single equation set-up because the endogenous variables are not the same in the production, input distance and input requirement functions. It is worth noting that the systems with u fully and partially known to the producer are essentially the same.
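To see why bringing the FOCs into estimation removes the need for instruments, note that each FOC in a system like (30) identifies ln βj directly, since ln xj − ln y + ln(wj/p) = ln βj − ζj. The sketch below uses that moment alone on simulated data. It is a deliberately simplified illustration with hypothetical parameters, assuming zero-mean optimization errors and observed price ratios; it is not the FIML or 3SLS estimator used for Model 5.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
a0, b1, b2 = 1.0, 0.5, 0.3          # hypothetical technology, true RTS = 0.8

lw1 = rng.normal(0.0, 0.5, n)       # ln(w1/p)
lw2 = rng.normal(0.0, 0.5, n)       # ln(w2/p)
u = rng.normal(0.0, 0.3, n)
c1 = np.log(b1) - lw1 - rng.normal(0.0, 0.1, n)   # ln x1 - ln y from the FOC
c2 = np.log(b2) - lw2 - rng.normal(0.0, 0.1, n)   # ln x2 - ln y from the FOC
ly = (a0 + b1 * c1 + b2 * c2 + u) / (1.0 - b1 - b2)
lx1, lx2 = ly + c1, ly + c2

# FOC moment: E[ln x_j - ln y + ln(w_j/p)] = ln b_j when E[zeta_j] = 0.
b1_hat = np.exp(np.mean(lx1 - ly + lw1))
b2_hat = np.exp(np.mean(lx2 - ly + lw2))
rts_foc = b1_hat + b2_hat

# Intercept from the production function, given the FOC-based elasticities.
a0_hat = np.mean(ly - b1_hat * lx1 - b2_hat * lx2)

print(f"b1 = {b1_hat:.3f}, b2 = {b2_hat:.3f}, RTS = {rts_foc:.3f}, "
      f"a0 = {a0_hat:.3f}")
```

The elasticities are recovered without any instrument because u never enters the FOC moments. A full system estimator would additionally exploit the production function and the cross-equation error covariance.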
In the system with output uncertainty the extra u2 term is appended to each equation. Since u2 is not identified, all we can say is that the variance of the error terms will be higher if u is not fully observed. The system approach (Model 5) shows that labor elasticity is somewhat higher and the other two elasticities are smaller compared to all other models that use a single equation model. The standard errors in Model 5 are much smaller because of the higher degrees of freedom that come from using more than one equation while estimating the same number of parameters. Note that RTS in this model is less than unity (no violations of the second order condition). This is because the FOCs of profit maximization are explicitly used in estimating the model.
6.2. Results from the translog models: single equation and system approaches
The advantage of the translog model is that it is flexible, and therefore input elasticities and RTS are observation specific. Because of its flexibility some of the input elasticities might be of the wrong sign (negative). We discarded these observations and counted them as violations of profit maximizing conditions (not satisfying the theoretical properties, namely, positive marginal products/elasticities). In Table 3 we report RTS results from all the models considered for the translog case. Given that RTS are observation specific, we report quartile, mean, minimum and maximum values of RTS for each model. We also report the number of observations (q) used to compute these measures. Note that we have a total of 3249 observations and thus (3249 − q) is the number of violations. None of the models satisfies the theoretical properties at every data point. The number of violations differs across models.
In terms of RTS there is remarkable similarity across the models. Input elasticities and therefore RTS from Model 1 (which assumes
the inputs to be uncorrelated with the error term in the production function) are supposed to be inconsistent. However, this model has only 90 violations (out of 3249) and the mean RTS is quite close to other models. The variation in RTS is also quite small in this model (the range is 0.10). This model predicts that more than 99% of the forest owners were operating under increasing RTS. When the model is estimated using GMM, which treats all the inputs as endogenous (input ratios and the z variables listed in the data section are used as instruments, in various combinations), results do not change much. More than 95% of the forest owners are now found to be operating under increasing RTS. However, the number of violations increases to 178. The mean and median RTS decrease from 1.04 to 1.02.
To minimize the number of endogenous regressors we used Model 2, in which only one regressor, ln(x1/y), is endogenous. The other regressors are input ratios, which can be treated as exogenous in all the cases (as discussed in Section 3.4.1). Similar to the CD case, results from this model are very different from those in Model 1, which uses instruments for all regressors (inputs). The number of violations in this model is quite high (815 out of 3249). This might be due to the fact that the instruments we used are better suited for inputs and are not that good for ln(x1/y). This is confirmed by the low correlation of the IV and ln(x1/y). Additional confirmation comes from Model 3, in which we used IV for ln x1 since the other regressors are ratios of inputs (ln(xj/x1)). In Model 3 we have 439 violations. Estimated RTS are very close to those from Model 1 with IV. Although Model 3 has more violations compared to Model 1 with IV, one might prefer Model 3 because it requires IV on only one regressor.
Next we consider results from IDFs with and without IVs. We argued earlier that for nonhomothetic functions all the regressors in the IDF are endogenous (similar to Model 1) and therefore the estimated input elasticities and RTS will be inconsistent. However, if the IDF is rewritten (with different normalizations) so that the regressors are ln(xj/x1) and ln(y/x1) (as in (28)), consistent estimates can be obtained by using IV on only ln(y/x1). Estimated RTS from the IDF without IV (Model 4) show that almost all the forest owners were operating under increasing RTS. When the model is re-estimated using IV, estimated RTS is somewhat reduced. We failed to reject endogeneity of ln y in the IDF. We also found that ln(y/x1) is endogenous (using the Hausman test) in (28). Thus the results from the IV model in (28) should be consistent.

Table 3
Returns to scale for translog models.

Model                        Mean    Median  Quartile 1  Quartile 3  Minimum  Maximum  Std dev  No. of violations
Model 1                      1.039   1.04    1.033       1.046       .985     1.089    .0101    90
Model 1 (IV for all)         1.023   1.023   1.015       1.032       .9692    1.087    .0136    178
Model 2 (IV for ln(x1/y))    1.127   1.101   1.077       1.143       1.036    1.666    .0848    439
Model 3 (IV for ln x1)       1.025   1.023   1.011       1.035       .9584    1.108    .0194    815
Model 4 (IDF)                1.07    1.07    1.065       1.076       1.042    1.096    .0079    67
Model 4 (IV for ln y)        1.031   1.030   1.025       1.037       .973     1.079    .0101    109
Model 5                      .8400   .8422   .8251       .8551       .7340    .9999    .0267    335

Note: The instruments used in Models 2–4 are z1–z8 and their interactions with ln(x2/x1) and ln(x3/x1). Note that ln(x2/x1) and ln(x3/x1) are exogenous for both profit maximization and cost minimization behaviors.
[Figure 1 image not reproduced: seven kernel density panels (Epanechnikov kernel) of estimated RTS, labeled PF_NOIV, PF_IVALL, PF_IVX1, PF_IVX1Y, IDF, IDF_IV and SYSTEM; horizontal axis: RTS, vertical axis: density.]
Fig. 1. Density plot of RTS. Notes: PF_NOIV, PF_IVALL, PF_IVX1 and PF_IVX1Y are densities of RTS based on production function estimates without IV, with IV for all of x1, x2, x3, with IV for only x1, and with IV for x1/y, respectively. IDF and IDF_IV are RTS densities without any IV and with IV for y, respectively. Finally, the RTS density associated with the system model is labeled SYSTEM.
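The observation-specific RTS summarized in Table 3 and Fig. 1 follow from the translog elasticity formula, e_j = β_j + Σ_k β_jk ln x_k, with RTS as the sum over inputs and a violation recorded whenever any elasticity is negative. The sketch below shows that computation on made-up numbers: the coefficients and data points are hypothetical and chosen only so the arithmetic can be checked by hand, and the matrix of second-order coefficients is assumed symmetric.

```python
import numpy as np


def translog_elasticities(lnx, b, B):
    """Input elasticities e[i, j] = b[j] + sum_k B[j, k] * lnx[i, k].

    lnx is an (n, J) array of log inputs, b the (J,) first-order
    coefficients, and B the symmetric (J, J) second-order coefficients.
    """
    return b + lnx @ B


# Hypothetical coefficients and three hypothetical observations.
b = np.array([0.5, 0.3])
B = np.array([[0.1, -0.1],
              [-0.1, 0.1]])
lnx = np.array([[0.0, 0.0],
                [1.0, 0.0],
                [0.0, 10.0]])

elas = translog_elasticities(lnx, b, B)
rts = elas.sum(axis=1)                        # observation-specific RTS
violation = np.less(elas, 0.0).any(axis=1)    # any negative elasticity
q = int((~violation).sum())                   # observations kept

kept = rts[~violation]
summary = {
    "mean": kept.mean(),
    "median": np.median(kept),
    "q1": np.quantile(kept, 0.25),
    "q3": np.quantile(kept, 0.75),
    "min": kept.min(),
    "max": kept.max(),
}
print(f"kept q = {q}, violations = {len(rts) - q}")
print(summary)
```

The summary dictionary mirrors the rows of Table 3; with real estimates, `b` and `B` would be replaced by the fitted translog coefficients of each model.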
Finally, we report the estimated RTS from the system approach (Model 5) in the last column. All the forest operators are now found to be operating under decreasing RTS, and this result is consistent with profit maximizing behavior. In other words, the explicit use of the first-order conditions of profit maximization in estimating the model takes care of endogeneity and gives results that are consistent with profit maximizing behavior.
To give a full picture of the estimates of RTS from all the models, we report plots of them in Fig. 1. It can be seen that although there are some variations across models, results are quite robust. The exception is the production function model with IV for ln(x1/y) (Model 2). Barring this model, RTS varied from 0.95 to 1.25. Thus, although we find some evidence of increasing RTS, the departure from constant returns is small. Note that RTS in the system model is less than unity and is consistent with profit maximizing behavior.
7. Conclusion
In this paper we addressed two issues. First, we showed that one can derive the production function, input distance function and input requirement function from the transformation (production possibility) function by using appropriate normalizations. This result holds for a restrictive transformation function such as the Cobb–Douglas as well as for flexible functions like the translog, which is widely used in the empirical literature. Thus, algebraically all these functions are equivalent in the sense that they are all derived from the same transformation function. This is, however, not the case empirically. By assuming that producers maximize profit, we show that in many cases OLS gives inconsistent estimates irrespective of whether the production, input distance or input requirement function is used. However, this result might not hold when one uses a different economic behavior such as cost minimization or maximization of return to the outlay.
Based on several specifications of the production and input distance function models, we concluded that one can estimate the input elasticities and returns to scale consistently using instruments on only one regressor (which can be one of the inputs or the output depending on the specifications used). Again this result holds for both the Cobb–Douglas and translog specifications. The other alternative to the single equation approach is to use a system approach which takes into account endogeneity of inputs and output variables explicitly by taking the first-order conditions into account while estimating the parameters of the model. Because of this there is only one system irrespective of whether one thinks of a production function or input distance function or input requirement function. The use of system approach, however, requires information on input and output prices. Since a variety of models are proposed in the literature their appropriateness, especially in using OLS to estimate the parameters consistently, will depend on whether the regressors can be treated as exogenous or not. This, in turn, depends on economic
behavior of the producers. We used profit maximization behavior, in which both output and input variables are treated as endogenous. This modeling exercise shows that in none of the econometric specifications (except one) are all the regressors exogenous. However, if a different behavioral assumption is used, endogeneity of the regressors might change. Thus, although the underlying technology (specified by the transformation function) is the same, the choice of the appropriate econometric model will depend on which variables are the choice variables and what the economic behavior of the producers is.
References
Boussemart, J.-P., Briec, W., Peypoch, N., Tavéra, C., 2009. α-Returns to scale and multi-output production technologies. European Journal of Operational Research 197, 332–339.
Cobb, C.W., Douglas, P.H., 1928. A theory of production. American Economic Review 18, 139–165.
Coelli, T.J., 2000. On the econometric estimation of the distance function representation of a production technology. Discussion Paper 2000/42, Center for Operations Research and Econometrics, Université Catholique de Louvain.
Diewert, W.E., 1974. Functional forms for revenue and factor requirements functions. International Economic Review 15, 119–130.
Färe, R., Grosskopf, S., Zaman, O., 2002. Hyperbolic efficiency and return to the dollar. European Journal of Operational Research 136, 671–679.
Hoch, I., 1958. Simultaneous equation bias in the context of the Cobb–Douglas production function. Econometrica 26, 566–578.
Hoch, I., 1962. Estimation of production function parameters combining time-series and cross-section data. Econometrica 30, 34–53.
Klein, L.R., 1953. A Textbook of Econometrics. Row, Peterson and Company, New York.
Kmenta, J., Joseph, J.E., 1963. A Monte Carlo study of alternative estimates of the Cobb–Douglas production function. Econometrica 31, 363–385.
Kumbhakar, S.C., 2011. Estimation of production technology when the objective is to maximize return to the outlay. European Journal of Operational Research 208, 170–176.
Kumbhakar, S.C., Tsionas, E.G., 2011. Stochastic error specification in primal and dual production systems. Journal of Applied Econometrics 26, 270–297.
Levinsohn, J., Petrin, A., 2003. Estimating production functions using inputs to control for unobservables. The Review of Economic Studies 70, 317–341.
Lien, G., Størdal, S., Baardsen, S., 2007. Technical efficiency in timber production and effects of other income sources. Small-scale Forestry 6, 65–78.
Marschak, J., Andrews, W.H., 1944. Random simultaneous equations and the theory of production. Econometrica 12, 143–203.
McElroy, M., 1987. Additive general error models for production, cost, and derived demand or share system. Journal of Political Economy 95, 738–757.
Mundlak, Y., 1961. Empirical production function free of management bias. Journal of Farm Economics 43, 44–56.
Mundlak, Y., Hoch, I., 1965. Consequences of alternative specifications in estimation of Cobb–Douglas production functions. Econometrica 33, 814–828.
Nerlove, M., 1965. Estimation and Identification of Cobb–Douglas Production Functions. Rand McNally, Chicago.
Perelman, S., Santín, D., 2009. How to generate regularly behaved production data? A Monte Carlo experimentation on DEA scale efficiency measurement. European Journal of Operational Research 199, 303–310.
Shephard, R.W., 1953. Theory of Cost and Production Functions. Princeton University Press, Princeton, NJ.
Shephard, R.W., 1970. The Theory of Cost and Production Functions. Princeton University Press, Princeton, NJ.
Zellner, A., Kmenta, J., Dreze, J., 1966. Specification and estimation of Cobb–Douglas production function models. Econometrica 34, 784–795.