Journal of Econometrics 183 (2014) 22–30
Journal of Econometrics journal homepage: www.elsevier.com/locate/jeconom
Likelihood-based inference for regular functions with fractional polynomial approximations
John Geweke a,b,∗, Lea Petrella c
a University of Technology Sydney, Australia
b Erasmus University, The Netherlands
c Sapienza University of Rome, Italy
Article history: Available online 26 June 2014
Abstract
This paper shows that regular fractional polynomials can approximate regular cost, production and utility functions and their first two derivatives on closed compact subsets of the strictly positive orthant of Euclidean space arbitrarily well. These functions therefore can provide reliable approximations to demand functions and other economically relevant characteristics of tastes and technology. Using canonical cost function data, it shows that full Bayesian inference for these approximations can be implemented using standard Markov chain Monte Carlo methods. © 2014 Elsevier B.V. All rights reserved.
∗ Correspondence to: University of Technology Sydney, CenSoC - Centre for the Study of Choice, 645 Harris St Ultimo, PO Box 123, NSW 2007 Broadway, Australia. E-mail address: [email protected] (J. Geweke).
http://dx.doi.org/10.1016/j.jeconom.2014.06.007
0304-4076/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Neoclassical producer and consumer theory provides a structure for the analysis of important private and public sector economic policy problems that is both elegant and practical. The theory stipulates restrictions on the functional form of production and utility functions that are inherently nonparametric: monotonicity and weak concavity for production functions and monotonicity and quasi-concavity for utility functions. For the development of the theory it is convenient for the domain of a production or consumption function to be the entire positive orthant of Euclidean space whose dimension is the number of inputs or consumer goods.

In practical applications requiring an econometric model, imposing exactly these restrictions has proven to be challenging. There is a rich mathematical approximation theory providing expansions of continuously differentiable functions. For econometric work it is desirable that the approximations themselves be regular while still being flexible, a demanding criterion. Gallant (1981) and Terrell (1996) are examples of this approach. Alternatively there are parametric forms that easily enforce appropriate shape restrictions globally, but in so doing they exclude most production and utility functions consistent with the theory. Barnett and Serletis
(2008) provide a discussion of these trade-offs and a survey of the literature for consumption.

Economic characteristics of demand and supply functions, for example elasticities of substitution and demand, depend on the first two derivatives of cost, production or utility functions as well as their levels. Therefore an important characteristic of any approximation of these functions is that it be able to approximate levels and the first two derivatives simultaneously. This point was emphasized in Gallant (1981) and has generally been recognized since then.

An important theme in this literature has been the development of fractional polynomial approximations of cost, production and utility functions. To the best of our knowledge this term was first used in Royston and Altman (1994) and Sauerbrei and Royston (1999) to describe polynomials of the form
\[ f(x_1, \ldots, x_n) = \sum_{i=1}^{k} a_i \prod_{\ell=1}^{n} x_\ell^{s_{i\ell}} \]
where the coefficients $a_i$ and exponents $s_{i\ell}$ are real numbers and the domain of the polynomial is the strictly positive real orthant. The asymptotically ideal model, set forth by Barnett and Jonas (1983) and extended in Barnett et al. (1991a,b), is the leading example of fractional polynomial approximations in this literature.

This study has two objectives. The first is to state carefully the sense in which fractional polynomials can approximate regular functions. The pioneering work of Barnett and coauthors relied mainly on the univariate Müntz–Szász Theorem as conveyed in Rudin (1996). Extending this theorem to the multivariate case
proved to be nontrivial and did not take place until the work of Bloom (1989, 1990, 1992). Even then, Bloom’s results were for levels and did not provide the desired simultaneous approximation of levels and the first two derivatives. Evard and Jafari (1994) achieved exactly this result, but for ordinary rather than fractional polynomials. Section 2 extends their result to fractional polynomials. It describes how the approach in Terrell (1996) can then be used to enforce regularity. A colloquial statement of the result is that there exists a regular fractional polynomial expansion that approximates any regular function on a closed compact subset of the positive orthant. The precise statements are in Theorem 2 and Corollary 2.1.

The second objective of this study, pursued in Section 3, is to illustrate how these ideas can be implemented in a practical way. That section sets up a fully Bayesian approach to inference for the fractional polynomial approximation of a cost function in which regularity and first-degree homogeneity in prices can either be imposed or cast as testable hypotheses. It applies this approach in the context of the canonical cost function study of Christensen and Greene (1976), comparing alternative fractional polynomial expansions, some with scores of terms. The work concludes by casting the substantive results in terms of familiar elasticities of substitution and demand and examining their sensitivity to alternative fractional polynomials.

2. Fractional polynomial approximations

This section begins with a straightforward extension of the multivariate Weierstrass approximation of Evard and Jafari (1994) to fractional polynomials. It then turns to the approximation of regular functions by fractional polynomials that are themselves regular and the constraints on likelihood functions imposed by regularity.

2.1. Notation

We begin by establishing some notation for expressing and extending the Weierstrass approximation.
The definitions implicit in some of the notation are all conventional.
1. For $n \times 1$ vector $\alpha$, $|\alpha| = |\alpha_1| + \cdots + |\alpha_n|$.
2. The $n \times 1$ vector $e_n = (1, \ldots, 1)'$.
3. The $n \times 1$ vector $0_n = (0, \ldots, 0)'$.
4. If $x = (x_1, \ldots, x_n)'$ with $x_i > 0$ $(i = 1, \ldots, n)$ then $x > 0_n$.
5. If $A$ is negative definite then $A < 0$; if $A$ is nonpositive definite then $A \le 0$.
6. The positive orthant of Euclidean $n$-space is $\mathbb{R}^n_{++} = \{ (x_1, \ldots, x_n)' : x_i > 0 \ (i = 1, \ldots, n) \}$.
7. If $x = (x_1, \ldots, x_n)' \in \mathbb{R}^n_{++}$ and $\alpha \in \mathbb{R}^n$ then $x^{\alpha} = \prod_{i=1}^{n} x_i^{\alpha_i}$.
8. $C^m(S_1, S_2)$ denotes the set of all mappings $f$ from an open neighborhood $D(f) \subseteq \mathbb{R}^n$ of $S_1 \subseteq \mathbb{R}^n$ into $S_2 \subseteq \mathbb{R}$ that are $m$ times continuously differentiable.
9. A single index is an element of the collection of non-negative integers $N = \{0, 1, \ldots\}$.
10. A multi-index of order $n$ is an element of the collection of $n$-tuples $N^n = \{ (m_1, \ldots, m_n) : m_1, \ldots, m_n \in N \}$.
11. For $\alpha \in N^n$ the mixed partial derivative is $f^{(\alpha)}(x) = \partial^{|\alpha|} f(x) / \partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}$.
12. For $b > 0$ a single index (base $b$) is an element of the collection of real numbers $N_b = \{0, b, 2b, \ldots\}$.
13. A multi-index (base $b$) is an element of the collection of $n$-tuples $N_b^n = \{ (r_1, \ldots, r_n) : r_1, \ldots, r_n \in N_b \}$.
14. If $\alpha_i \in N_b^n$ and $a_i \in \mathbb{R}$ $(i = 1, \ldots, k)$ then $p_b(x; a) = \sum_{i=1}^{k} a_i x^{\alpha_i}$ is a fractional polynomial of base $b$ defined on $x \in \mathbb{R}^n_{++}$.
15. The Kronecker delta function is $\delta(j, r) = 1$ if $j = r$ and $\delta(j, r) = 0$ if $j \ne r$.
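The notation in items 7 and 12 to 14 can be made concrete with a minimal Python sketch. The function names and the numerical example below are illustrative assumptions, not part of the paper; exponents are held as exact fractions so that membership in $N_b$ can be tested without rounding error.

```python
from fractions import Fraction
from math import prod


def monomial(x, alpha):
    """Return x^alpha = prod_i x_i ** alpha_i for x in the strictly positive orthant."""
    if any(xi <= 0 for xi in x):
        raise ValueError("x must lie in the strictly positive orthant")
    return prod(xi ** float(ai) for xi, ai in zip(x, alpha))


def fractional_polynomial(x, coeffs, exponents):
    """Evaluate p_b(x; a) = sum_i a_i * x^(alpha_i)."""
    return sum(a * monomial(x, alpha) for a, alpha in zip(coeffs, exponents))


def is_multi_index_base_b(alpha, b):
    """True if every element of alpha belongs to N_b = {0, b, 2b, ...}."""
    b = Fraction(b)
    return all(Fraction(r) >= 0 and (Fraction(r) / b).denominator == 1 for r in alpha)


# Hypothetical example with n = 2 and base b = 1/2.
half = Fraction(1, 2)
exponents = [(Fraction(0), Fraction(0)), (half, half), (Fraction(1), Fraction(0))]
coeffs = [1.0, 2.0, 3.0]
value = fractional_polynomial((4.0, 9.0), coeffs, exponents)
# 1 + 2 * sqrt(4) * sqrt(9) + 3 * 4 = 25
```

Because the polynomial is linear in the coefficients, the same routine serves for any base once the exponent list is fixed.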
2.2. Main results

The following result, due to Evard and Jafari (1994, Corollary 3), extends the classic Weierstrass approximation of a function to its derivatives.

Theorem 1 (Weierstrass Approximation). Let $f \in C^m(S_1, S_2)$ where $m \in N$, $S_1 \subseteq \mathbb{R}^n$ is closed and compact, $S_2 \subseteq E$, and $E$ is an $\mathbb{R}$-Banach space. Then given any $\varepsilon > 0$, there exists a polynomial $p : \mathbb{R}^n \to E$ such that $\| f^{(\alpha)}(x) - p^{(\alpha)}(x) \| < \varepsilon$ $\forall x \in S_1$ and $\forall \alpha \in N^n$ with $|\alpha| \le m$.
Suppose that $j^{(i)} = (j_1^{(i)}, \ldots, j_n^{(i)})'$ is a sequence of multi-indexes with the property that if $m \in N^n$ then there exists $i$ such that $j^{(i)} = m$. Then there exists $k$ such that the polynomial $p(x)$ in Theorem 1 may be expressed as
\[ p(x; a) = \sum_{i=1}^{k} a_i x^{j^{(i)}} = \sum_{i=1}^{k} a_i \prod_{\ell=1}^{n} x_\ell^{j_\ell^{(i)}}. \]
The sequence $j^{(i)}$ defines a corresponding sequence of approximations for $f$. In most substantive applications the specification $f \in C^m(S_1, S_2)$ is associated with a belief that $f$ is smooth. Therefore it is natural to choose $j^{(i)}$ so that smaller values of $j^{(i)}$ appear before larger ones. In neoclassical producer and consumer theory $f$ is further restricted.

Definition 1. Suppose that $G \subseteq \mathbb{R}^n_{++}$ and $f \in C^2(G, \mathbb{R})$. If for all $x \in G$, $f(x) \ge 0$, $\partial f(x)/\partial x \ge 0$, and $\partial^2 f(x)/\partial x \partial x' \le 0$ then $f$ is regular on $G$. If for all $x \in G$, $f(x) > 0$, $\partial f(x)/\partial x > 0$, and $\partial^2 f(x)/\partial x \partial x' < 0$ then $f$ is strictly regular on $G$.

If $f$ is strictly regular on $G$ then for any closed compact set $G^* \subseteq G$ and any $\varepsilon > 0$ there exists a polynomial $p$ such that for all $x \in G^*$ and $i, j \le n$,
\[ |f(x) - p(x)| < \varepsilon, \qquad \left| \partial f(x)/\partial x_j - \partial p(x)/\partial x_j \right| < \varepsilon, \]
and
\[ \left| \partial^2 f(x)/\partial x_i \partial x_j - \partial^2 p(x)/\partial x_i \partial x_j \right| < \varepsilon. \]
Moreover if $f$ is strictly regular on $G$ then we may restrict $p \in A(p, G^*)$, the set of polynomials regular on $G^*$. However, because monomials cannot be both monotone increasing and concave everywhere in $\mathbb{R}^n_{++}$, the approximation may well require many terms.

It is straightforward to extend Theorem 1 to fractional polynomials.

Theorem 2 (Fractional Polynomial Approximation). Let $f \in C^m(S_1, S_2)$ where $m \in N$, $S_1 \subseteq \mathbb{R}^n_{++}$ is closed and compact, $S_2 \subseteq E$, and $E$ is an $\mathbb{R}$-Banach space. Then given any $\varepsilon > 0$ and $b > 0$, there exists a fractional polynomial $p_b : \mathbb{R}^n \to E$ such that $\| f^{(\alpha)}(x) - p_b^{(\alpha)}(x) \| < \varepsilon$ $\forall x \in S_1$ and $\forall \alpha \in N^n$ with $|\alpha| \le m$.

Proof. Let $\mathbf{b} = e_n b$, $\mathbf{c} = e_n b^{-1}$, $z = \mathbf{b}(x) = x^{\mathbf{b}}$, $S_1^{(b)} = \{ z : z^{\mathbf{c}} \in S_1 \}$, and define $g(z) = f(z^{\mathbf{c}})$ on $S_1^{(b)}$. Then Theorem 1 applies directly to $g(z)$ on $S_1^{(b)}$. Since $|\partial^{\alpha} \mathbf{b}(x)/\partial x^{\alpha}|$ is bounded above uniformly on $S_1$ for all $\alpha : |\alpha| \le m$, the result follows.

Suppose that $s^{(i)} = (s_1^{(i)}, \ldots, s_n^{(i)})'$ is a sequence of multi-indexes (base $b$) with the property that if $r \in N_b^n$, then there exists $i$ such that $s^{(i)} = r$. Then there exists $k$ such that the polynomial $p_b$ in Theorem 2 may be expressed as
\[ p_b(x; a) = \sum_{i=1}^{k} a_i x^{s^{(i)}} = \sum_{i=1}^{k} a_i \prod_{\ell=1}^{n} x_\ell^{s_\ell^{(i)}}. \]
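The change of variables $z = x^{\mathbf{b}}$ used in the proof of Theorem 2 turns a fractional polynomial of base $b$ in $x$ into an ordinary polynomial in $z$. The following Python sketch checks this identity numerically; the coefficients, integer powers and evaluation point are hypothetical values chosen only for illustration.

```python
from math import isclose, prod


def ordinary_polynomial(z, coeffs, powers):
    """sum_i a_i * prod_l z_l ** j_il with non-negative integer powers j_il."""
    return sum(a * prod(zl ** j for zl, j in zip(z, js)) for a, js in zip(coeffs, powers))


def fractional_polynomial(x, coeffs, powers, b):
    """sum_i a_i * prod_l x_l ** (b * j_il): exponents are multi-indexes of base b."""
    return sum(a * prod(xl ** (b * j) for xl, j in zip(x, js)) for a, js in zip(coeffs, powers))


b = 0.5                               # base of the fractional polynomial
coeffs = [1.0, -2.0, 0.5]             # hypothetical coefficients
powers = [(0, 0), (1, 2), (3, 1)]     # integer multi-indexes j; exponents are b * j
x = (4.0, 9.0)
z = tuple(xl ** b for xl in x)        # z = x^b = (2.0, 3.0)

lhs = fractional_polynomial(x, coeffs, powers, b)
rhs = ordinary_polynomial(z, coeffs, powers)
```

Both evaluations give the same number, which is why the ordinary Weierstrass result can be applied to $g(z) = f(z^{\mathbf{c}})$.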
Corollary 2.1. Suppose $f$ is strictly regular on $G \subseteq \mathbb{R}^n_{++}$ and $G^* \subseteq G$ is closed and compact. Given any $b > 0$, any $n \times 1$ vectors $c$ and $d$, and any positive constant $\varepsilon$, there exists a fractional polynomial $p_b \in A(p_b, G^*)$, the set of fractional polynomials regular on $G^*$, such that for all $x \in G^*$
1. $|p_b(x; a) - f(x)| < \varepsilon$,
2. $c' \left[ \partial f(x)/\partial x - \partial p_b(x)/\partial x \right] < \varepsilon$,
3. $d' \left[ \partial^2 f(x)/\partial x \partial x' - \partial^2 p_b(x; a)/\partial x \partial x' \right] d < \varepsilon$.

In applied econometric work that entails exploration of a likelihood function (e.g. maximum likelihood or posterior simulation) the following result simplifies the work.

Theorem 3. In Corollary 2.1 the set $A = \{ a : p_b(x; a) \in A(p_b, G^*) \}$ is convex.
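Proof. For all $a \in \mathbb{R}^k$, $p_b(x; a) \in C^2(G^*, \mathbb{R})$. Given $a_1, a_2 \in A$ and any $w \in (0, 1)$, let $a = a_1 w + a_2 (1 - w)$. Then $p_b(x; a) > 0$ and $\partial p_b(x; a)/\partial x > 0$. For any vector $d \in \mathbb{R}^n$, $d \ne 0$, and all $x \in G^*$,
\[ d' \left[ \partial^2 p_b(x; a)/\partial x \partial x' \right] d = w \, d' \left[ \partial^2 p_b(x; a_1)/\partial x \partial x' \right] d + (1 - w) \, d' \left[ \partial^2 p_b(x; a_2)/\partial x \partial x' \right] d < 0. \]
Hence $p_b(x; a)$ is regular on $G^*$. Therefore $A$ is convex.

2.3. Computation

Because the fractional polynomial $p_b$ is linear in the parameter vector $a$, $p_b$ and its first two derivatives can be evaluated using linear operations that are standard in mathematical applications software like Matlab and R. For efficiency it is important to minimize the number of computations in the evaluation of $p_b(x; a)$ and its derivatives. Moreover one may choose to enforce regularity of $p_b$ at points $x$ that do not correspond to observations. Taken together these tasks present the problem of evaluating levels, first and second derivatives at a set of points $x_t \in \mathbb{R}^n$ $(t = 1, \ldots, T)$ arranged in a $T \times n$ matrix $X = [x_1 \cdots x_T]'$ with typical element $x_{tj}$. Corresponding to the fractional polynomial
\[ p_b(x_t; a) = \sum_{i=1}^{k} a_i \prod_{\ell=1}^{n} x_{t\ell}^{s_\ell^{(i)}} \tag{1} \]
arrange the $k$ multi-indexes in the rows of the $k \times n$ matrix
\[ S = \begin{bmatrix} s_1^{(1)} & \cdots & s_n^{(1)} \\ \vdots & & \vdots \\ s_1^{(k)} & \cdots & s_n^{(k)} \end{bmatrix}. \tag{2} \]
The level function evaluated at $x_t$ is
\[ p_b(x_t; a) = \begin{bmatrix} x_{t1}^{s_1^{(1)}} \cdot \ldots \cdot x_{tn}^{s_n^{(1)}} & \cdots & x_{t1}^{s_1^{(k)}} \cdot \ldots \cdot x_{tn}^{s_n^{(k)}} \end{bmatrix} \begin{bmatrix} a_1 \\ \vdots \\ a_k \end{bmatrix} = z_t' a, \tag{3} \]
which implicitly defines the $k \times 1$ vectors $z_t$ $(t = 1, \ldots, T)$. Then define the $T \times k$ matrix $Z = [z_1 \cdots z_T]'$ so that
\[ \begin{bmatrix} p_b(x_1; a) & \cdots & p_b(x_T; a) \end{bmatrix}' = Za \tag{4} \]
with $z_{ti} = \prod_{\ell=1}^{n} x_{t\ell}^{s_\ell^{(i)}}$.

The first derivative evaluated at $x_t = (x_{t1}, \ldots, x_{tn})'$ has elements
\[ \partial p_b(x_t; a)/\partial x_{tj} = \sum_{i=1}^{k} a_i s_j^{(i)} x_{tj}^{-1} \prod_{\ell=1}^{n} x_{t\ell}^{s_\ell^{(i)}} = \sum_{i=1}^{k} a_i s_{ij} x_{tj}^{-1} z_{ti} = \sum_{i=1}^{k} a_i w_{tij} \qquad (j = 1, \ldots, n) \]
where $w_{tij} = s_{ij} x_{tj}^{-1} z_{ti}$. Hence the $n \times 1$ gradient vector at $x_t$ is
\[ \partial p_b(x_t; a)/\partial x_t = W_t' a \tag{5} \]
where $W_t = [w_{tij}]$. Then define the $nT \times k$ matrix $W = [W_1 \cdots W_T]'$ so that
\[ \begin{bmatrix} (\partial p_b(x_1; a)/\partial x_1)' & \cdots & (\partial p_b(x_T; a)/\partial x_T)' \end{bmatrix}' = Wa. \tag{6} \]

For $r \ne j$ the second derivative at $x_t$ is
\[ \partial^2 p_b(x_t; a)/\partial x_{tj} \partial x_{tr} = \sum_{i=1}^{k} a_i s_j^{(i)} x_{tj}^{-1} s_r^{(i)} x_{tr}^{-1} \prod_{\ell=1}^{n} x_{t\ell}^{s_\ell^{(i)}} = \sum_{i=1}^{k} a_i s_{ij} x_{tj}^{-1} s_{ir} x_{tr}^{-1} z_{ti} = \sum_{i=1}^{k} a_i c_{tijr}, \]
where $c_{tijr} = x_{tj}^{-1} x_{tr}^{-1} s_{ij} s_{ir} z_{ti}$. For $r = j$,
\[ \partial^2 p_b(x_t; a)/\partial x_{tj}^2 = \sum_{i=1}^{k} a_i s_j^{(i)} \left( s_j^{(i)} - 1 \right) x_{tj}^{-2} \prod_{\ell=1}^{n} x_{t\ell}^{s_\ell^{(i)}} = \sum_{i=1}^{k} a_i s_{ij}^2 x_{tj}^{-2} z_{ti} - \sum_{i=1}^{k} a_i s_{ij} x_{tj}^{-2} z_{ti} = \sum_{i=1}^{k} a_i c_{tijj} \]
where $c_{tijj} = x_{tj}^{-1} x_{tj}^{-1} s_{ij} s_{ij} z_{ti} - s_{ij} x_{tj}^{-2} z_{ti}$. Hence the $n \times n$ Hessian matrix at $x_t$ is
\[ \partial^2 p_b(x_t; a)/\partial x_t \partial x_t' = \sum_{i=1}^{k} a_i C_{ti} = \left( a' \otimes I_n \right) \begin{bmatrix} C_{t1} \\ \vdots \\ C_{tk} \end{bmatrix} \tag{7} \]
where the $n \times n$ matrix $C_{ti}$ has $(j, r)$ entry
\[ c_{tijr} = x_{tj}^{-1} x_{tr}^{-1} s_{ij} s_{ir} z_{ti} - \delta(j, r) s_{ij} x_{tj}^{-2} z_{ti} \qquad (j = 1, \ldots, n;\ r = 1, \ldots, n;\ i = 1, \ldots, k;\ t = 1, \ldots, T). \tag{8} \]
Then define the $nk \times nT$ matrix
\[ C = \begin{bmatrix} C_{11} & \cdots & C_{T1} \\ \vdots & & \vdots \\ C_{1k} & \cdots & C_{Tk} \end{bmatrix} \tag{9} \]
so that
\[ \begin{bmatrix} \partial^2 p_b(x_1; a)/\partial x_1 \partial x_1' & \cdots & \partial^2 p_b(x_T; a)/\partial x_T \partial x_T' \end{bmatrix} = \left( a' \otimes I_n \right) C. \]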
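The expressions for $z_{ti}$, $w_{tij}$ and $c_{tijr}$ translate directly into code. The sketch below is a plain Python illustration at a single point $x_t$, not the authors' Matlab implementation; the exponent matrix `S` is the base $b = 1/2$ expansion with exponents summing to one, and the coefficient vector `a` and the point `x_t` are hypothetical.

```python
from math import prod


def z_row(x, S):
    """z_ti = prod_l x_tl ** s_il, one entry per term i."""
    return [prod(xl ** s for xl, s in zip(x, row)) for row in S]


def level(x, S, a):
    """p_b(x_t; a) = z_t' a."""
    return sum(ai * zi for ai, zi in zip(a, z_row(x, S)))


def gradient(x, S, a):
    """Gradient W_t' a, with w_tij = s_ij * z_ti / x_tj."""
    n, z = len(x), z_row(x, S)
    W = [[row[j] * zi / x[j] for j in range(n)] for row, zi in zip(S, z)]
    return [sum(ai * W[i][j] for i, ai in enumerate(a)) for j in range(n)]


def hessian(x, S, a):
    """Hessian sum_i a_i C_ti, with c_tijr as in the (j, r) entry formula."""
    n, z = len(x), z_row(x, S)
    H = [[0.0] * n for _ in range(n)]
    for ai, row, zi in zip(a, S, z):
        for j in range(n):
            for r in range(n):
                c = row[j] * row[r] * zi / (x[j] * x[r])
                if j == r:                      # Kronecker delta term
                    c -= row[j] * zi / x[j] ** 2
                H[j][r] += ai * c
    return H


# Base b = 1/2, n = 3, exponents summing to one (first-degree homogeneous).
S = [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5),
     (0.0, 1.0, 0.0), (0.0, 0.5, 0.5), (0.0, 0.0, 1.0)]
a = [1.0, 0.5, 0.8, 2.0, 0.3, 1.5]   # hypothetical coefficients
x_t = (2.0, 3.0, 5.0)                # hypothetical evaluation point

p = level(x_t, S, a)
g = gradient(x_t, S, a)
H = hessian(x_t, S, a)
# With these exponents Euler's theorem implies x'g = p and H x = 0.
```

A finite-difference comparison of `gradient` against `level` is a cheap guard against indexing mistakes when the formulas are vectorized over all $T$ points.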
2.4. Enforcing shape restrictions

Neoclassical consumer and producer theory implies restrictions on objective functions. In the particular case of direct utility functions the restrictions coincide with strict regularity (Definition 1). In the case of indirect utility functions, cost functions and profit functions they do not. It is useful to first take up the case of direct utility functions and strict regularity, and then consider the complications in other objective functions.

Consider the implications of strict regularity for a set of points $x_t \in G$ $(t = 1, \ldots, T)$. The positivity conditions
\[ p_b(x_t; a) > 0 \qquad (t = 1, \ldots, T) \]
are equivalent to the $T$ linear inequalities $Za > 0$, $Z$ being defined in (4). The monotonicity conditions
\[ \partial p_b(x_t; a)/\partial x_t > 0 \qquad (t = 1, \ldots, T) \]
are equivalent to the $nT$ inequalities $Wa > 0$, $W$ being defined in (6). Thus it is straightforward to examine positivity and monotonicity at specified points $x_t \in G$ $(t = 1, \ldots, T)$. The strict concavity conditions constitute nonlinear restrictions on the coefficient vector $a$ because the eigenvalues of the $n \times n$ matrices (7) are nonlinear functions of $a$. Given the modest size of $n$ in most applications of neoclassical producer and consumer theory these computations are not particularly demanding, but some attention to rounding error is prudent.

There is no practical approach to establishing strict regularity of a fractional polynomial on a closed compact set, to the best of our knowledge. The most effective procedure seems to be that of Terrell (1996), which checks these conditions at a manageable number of points representative of $G$. The application in Section 3 utilizes this approach, using quasi-random numbers to represent $G$. In the context of Bayesian inference by means of posterior simulation this amounts to a prior distribution that assigns probability zero to fractional polynomials that are not strictly regular at the points selected to represent $G$, and this is easy to incorporate in these algorithms. In the context of an unrestricted fractional polynomial the plausibility of strict regularity can be examined in the usual way by computing an appropriate Bayes factor as illustrated in Section 3.3.

Indirect utility functions, profit functions and cost functions involve the further complication of homogeneity restrictions. Such functions are no longer strictly regular: they lie at limit points of the space $G$ of strictly regular functions. The argument used in Corollary 2.1 in the strictly regular case no longer applies, and it seems to us that the results of Evard and Jafari (1994) cannot be used to establish a parallel result, at least with any finesse.
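The point-wise check of strict regularity described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used in the paper: negative definiteness is tested with Sylvester's criterion on leading principal minors instead of an eigenvalue computation, the hyper-rectangle bounds and coefficient vectors are hypothetical, and the Halton bases (3, 5, 7) are the ones reported for the application in Section 3.2.

```python
from math import prod


def halton(index, base):
    """Radical-inverse (van der Corput) value in (0, 1) for index >= 1."""
    f, r = 1.0, 0.0
    while index > 0:
        f /= base
        r += f * (index % base)
        index //= base
    return r


def halton_points(m, lower, upper, bases):
    """First m Halton points, scaled to the hyper-rectangle [lower, upper]."""
    return [[lo + (hi - lo) * halton(t, b) for lo, hi, b in zip(lower, upper, bases)]
            for t in range(1, m + 1)]


def det(M):
    """Determinant by cofactor expansion; adequate for the small n used here."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))


def strictly_regular_at(x, S, a):
    """Check p > 0, gradient > 0 and a negative definite Hessian at the point x."""
    n = len(x)
    z = [prod(xl ** s for xl, s in zip(x, row)) for row in S]
    if sum(ai * zi for ai, zi in zip(a, z)) <= 0:
        return False
    grad = [sum(ai * row[j] * zi / x[j] for ai, row, zi in zip(a, S, z)) for j in range(n)]
    if min(grad) <= 0:
        return False
    H = [[sum(ai * zi * (row[j] * row[r] / (x[j] * x[r])
                         - (row[j] / x[j] ** 2 if j == r else 0.0))
              for ai, row, zi in zip(a, S, z))
          for r in range(n)] for j in range(n)]
    # Sylvester: H is negative definite iff (-1)^m * det(leading m x m block) > 0.
    return all((-1) ** m * det([row[:m] for row in H[:m]]) > 0 for m in range(1, n + 1))


def strictly_regular_on(points, S, a):
    """True only if the check passes at every representative point."""
    return all(strictly_regular_at(x, S, a) for x in points)


# Hypothetical example: base b = 1/2, n = 3, terms x1^(1/2), x2^(1/2), x3^(1/2).
S = [(0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5)]
points = halton_points(200, lower=(0.5, 0.5, 0.5), upper=(2.0, 2.0, 2.0), bases=(3, 5, 7))
ok = strictly_regular_on(points, S, a=[1.0, 1.0, 1.0])        # increasing and concave
not_ok = strictly_regular_on(points, S, a=[1.0, 1.0, -1.0])   # decreasing in x3
```

In a posterior simulator a function like `strictly_regular_on` would act as the indicator that sets the prior density to zero for a proposed coefficient vector. Sylvester's criterion applies only to the strict case; a homogeneous cost function has a singular Hessian and needs a tolerance-based test instead.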
A practical approach, used in the application in Section 3, is to consider models in pairs, identical except that one imposes homogeneity while the other does not, and then test the restriction. For a fractional polynomial of base $b$ with $b^{-1} \in N$, imposing homogeneity of degree 1 amounts to retaining only those terms $i$ for which $e_n' s^{(i)} = 1$.

3. An application

This section begins with some practical details specific to Bayesian inference for cost functions using fractional polynomials. It then describes specifics of the prior distribution in the application to a canonical data set. It continues by summarizing the evidence in these data for the hypotheses of regularity and homogeneity and the base of the fractional polynomial expansion. It concludes with a presentation of posterior distributions of elasticities of substitution and demand.

3.1. Methods

The econometrician observes the $n$ inputs, $n$ input prices and output of a unit of production like a plant or a firm. Make the standard specification that production is constant returns to scale, implying that the cost function is first-degree homogeneous in output. Input prices and output are exogenous. Management seeks to minimize costs but either (1) actual and cost minimizing inputs differ by a vector of random variables $\varepsilon$ independent of input prices and output; or (2) actual inputs are cost minimizing but the econometrician observes these levels corrupted by an additive random measurement error $\varepsilon$ that is again independent of input prices and output; or (3) both. This specification is also standard in the literature. The econometrician computes cost of production in the obvious way from observed input prices and quantities.
Table 1
Some aspects of the data.

           Prices                        Shares
Factor     Min      Mean     Max         Min      Mean     Max
Labor      5063     8002     13 044      0.0459   0.1390   0.3291
Capital    31.73    71.42    92.65       0.0924   0.2364   0.4521
Fuel       9.00     30.75    51.46       0.2435   0.6324   0.8136
Applying the notation of the previous section, let $x$ be the $n \times 1$ vector of input prices and let $p_b(x; a)$ be the fractional polynomial approximation of production cost per unit of output. Then the econometrician observes the $n \times 1$ vector of factor demands per unit of output $\partial p_b(x; a)/\partial x + \varepsilon$ where $\varepsilon$ is the vector of random terms described in the previous paragraph. If $p_b(x; a)$ is first-degree homogeneous in $x$ then computed cost is $x' [\partial p_b(x; a)/\partial x + \varepsilon] = p_b(x; a) + x' \varepsilon$. This is a system of $n + 1$ equations linear in the parameter vector $a$ with a singular distribution of errors. The most convenient way to manage this situation is to eliminate one equation. Likelihood-based inference is invariant with respect to the equation that is eliminated. The work here eliminates the cost equation and retains the $n$ factor demand equations. Imposing homogeneity of degree one of cost in output, in factor demand equation $i$ the dependent variable is the level of factor $i$ divided by output, and from Shephard’s lemma the covariates are the terms in the first derivative of the fractional polynomial with respect to $x_i$ $(i = 1, \ldots, n)$. Following the usual practice, the $n \times 1$ disturbance vector $\varepsilon \sim N(0, \Sigma)$ is independent of factor prices.

The example taken up here considers a sequence of fractional polynomial representations of the cost function in which the base $b \in \{1, 1/2, 1/3, 1/4, 1/5, 1/6\}$. For each $b$, the polynomial expansion includes those terms for which $\sum_{\ell=1}^{n} s_\ell^{(i)} \in [0, 1]$ when homogeneity of degree one in prices is not imposed. With regard to the theory in Section 2.2 these are the first terms in an expansion that continues indefinitely; with regard to the empirical work, including any terms for which $\sum_{\ell=1}^{n} s_\ell^{(i)} > 1$ did not improve fit. When homogeneity of degree 1 in prices is imposed then the polynomial expansion includes only those terms for which $\sum_{\ell=1}^{n} s_\ell^{(i)} = 1$.
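The selection of terms just described can be enumerated mechanically. The Python sketch below is an illustration with hypothetical function names; it assumes that $1/b$ is an integer, as it is for every base considered in the application, and uses exact fractions so that the sum conditions are tested without rounding error.

```python
from fractions import Fraction
from itertools import product


def exponent_set(n, b, homogeneous):
    """Multi-indexes (base b) whose elements sum to at most 1, or exactly 1 if homogeneous."""
    b = Fraction(b)
    steps = int(1 / b)
    if steps * b != 1:
        raise ValueError("this sketch assumes that 1/b is an integer")
    grid = [j * b for j in range(steps + 1)]      # 0, b, 2b, ..., 1
    keep = []
    for s in product(grid, repeat=n):
        total = sum(s)
        if (total == 1) if homogeneous else (total <= 1):
            keep.append(s)
    return keep


# Three factor prices and base b = 1/2.
unrestricted = exponent_set(3, Fraction(1, 2), homogeneous=False)
restricted = exponent_set(3, Fraction(1, 2), homogeneous=True)
k_unrestricted = len(unrestricted)   # number of terms without homogeneity
k_restricted = len(restricted)       # number of terms with homogeneity imposed
```

For $n = 3$ and $b = 1/2$ this yields ten exponent vectors without homogeneity and six with it, matching the ten-term and six-term expansions written out in Section 3.2.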
The coefficients $a_i$ are independent normal and the variance matrix of $\varepsilon$ is inverse Wishart in the prior distribution; Section 3.2 details construction of the hyperparameters of these distributions. A conventional two-block Gibbs sampling algorithm (Geweke, 2005, Section 5.2) is used to access the posterior distribution. Imposition of regularity in this algorithm entails the introduction of a random walk Metropolis-within-Gibbs step for the coefficient vector $a$ in which the variance is the usual conditional variance of $a$ scaled by a step-size parameter $s$. This draw is retained if the regularity conditions are satisfied and rejected in favor of the existing value if not. Marginal likelihood approximation is based on the procedure of Chib (1995) for Gibbs sampling algorithms.

3.2. Data and prior distribution

The application is to the electricity cost data of Christensen and Greene (1976), also described in Greene (1997, Section 14.3.1), one of the canonical examples in the neoclassical cost function literature. The data pertain to production of electricity using the inputs labor, capital and fuel, at 158 plants in the US in the year 1970. (Christensen and Greene, 1976 use 123 of the 158 plants; we use all 158.) The variation in factor prices and their shares of cost is considerable in this data set, as indicated in Table 1. Fig. 1 shows more details of the joint distribution of labor and capital factor prices.

Fig. 1. Sample distribution of labor and capital prices (horizontal axis: labor price; vertical axis: capital price). The points are the 158 pairs of labor price and capital price in the sample. The open circle is the sample mean. The dotted lines denote extreme prices in the sample.

Table 2
Technical details of posterior simulation.

           $\bar{R}, \bar{H}$    $\bar{R}, H$     $R, \bar{H}$                      $R, H$
Base       k     Secs     k     Secs     Secs     N      s      α        Secs    N      s      α
b = 1      4     71       3     71       981      200    0.90   0.499    880     200    0.90   0.480
b = 1/2    10    76       6     74       1118     1000   0.60   0.232    1033    1000   0.90   0.250
b = 1/3    20    80       10    76       1832     3500   0.15   0.207    1082    2500   0.30   0.228
b = 1/4    29    91       15    81       2177     4000   0.15   0.086    1657    3000   0.20   0.366
b = 1/5    56    100      21    79       5524a    5000   0.10   0.113    1747    3000   0.20   0.226
b = 1/6    84    124      28    81       8404a    5000   0.10   0.117    1718    5000   0.20   0.108

a Results are based on 20,000 successive iterations of the algorithm, and in the other 22 cases they are based on 10,000 successive iterations.

Let $i = 1, 2, 3$ index the factors labor, capital and fuel. In the econometric model the dependent variable $y_i$ in factor demand equation $i$ is factor demand per unit of output. Consistent with the
notation of Section 2 let $x_j$ denote the price of factor $j$. Then, for example, if $b = 1/2$ and first-degree homogeneity of cost in prices is not imposed the fractional polynomial expansion of the cost function is
\[ a_1 + a_2 x_1 + a_3 x_1^{1/2} x_2^{1/2} + a_4 x_1^{1/2} x_3^{1/2} + a_5 x_1^{1/2} + a_6 x_2 + a_7 x_2^{1/2} x_3^{1/2} + a_8 x_2^{1/2} + a_9 x_3 + a_{10} x_3^{1/2}. \]
The first factor demand equation is
\[ y_1 = a_2 + a_3 (1/2) x_1^{-1/2} x_2^{1/2} + a_4 (1/2) x_1^{-1/2} x_3^{1/2} + a_5 (1/2) x_1^{-1/2} + \varepsilon_1, \]
and similarly for the other factor demand equations. The theory imposes cross-equation restrictions on the coefficients, a hallmark of the empirical neoclassical producer and consumer literature. If first-degree homogeneity in prices is imposed then the fractional polynomial expansion of the cost function becomes
\[ a_1 x_1 + a_2 x_1^{1/2} x_2^{1/2} + a_3 x_1^{1/2} x_3^{1/2} + a_4 x_2 + a_5 x_2^{1/2} x_3^{1/2} + a_6 x_3 \tag{10} \]
and the first factor demand equation is
\[ y_1 = a_1 + a_2 (1/2) x_1^{-1/2} x_2^{1/2} + a_3 (1/2) x_1^{-1/2} x_3^{1/2} + \varepsilon_1. \]
Note from (10) that the generalized Leontief, or Diewert, cost function is the fractional polynomial approximation in base $b = 1/2$ that imposes first-degree homogeneity of the cost function in prices.

The mean of the prior distribution for coefficient $a_i$ is
\[ k \bar{c} \Big/ \prod_{\ell=1}^{3} \bar{x}_\ell^{s_\ell^{(i)}} \]
where $\bar{c}$ is the sample mean ratio of cost to output and $\bar{x}_\ell$ is the sample mean of factor price $\ell$. The constant $k$ is the total number of coefficients in the fractional polynomial expansion. The standard deviation is the same as the mean. The constant $k$ is included to allow for the fact that as fractional polynomial expansions include more terms there is a tendency for coefficients to increase in absolute value, with sign cancellation appropriate to achieving good fit. We experimented with more elaborate kinds of prior distributions on the coefficients $a_j$, but the more complicated priors did not substantively affect the results.

The prior distribution of the inverse of $\Sigma = \mathrm{var}(\varepsilon)$ is Wishart. The mean of this distribution is the inverse of half the sample variance matrix of the vector of dependent variables $y$ across the 158 firms. The degrees of freedom parameter is 8. Reasonable modifications of this prior distribution had little impact on the results.

Regularity is imposed at 1000 points in the hyper-rectangle bounded below by 90% of the minimum factor prices and above by 110% of the maximum factor prices displayed in Table 1. For labor and capital prices the region of regularity is the area of Fig. 1. The points in the hyper-rectangle were chosen using a Halton sequence with base (3, 5, 7). To be regular at a point the level and all three first derivatives of the fractional polynomial must be positive. It is important that checks for concavity account for rounding error in eigenvalue computation, especially when homogeneity of degree one in prices is imposed and therefore one eigenvalue of the second derivative matrix is always zero. Some experimentation showed that this could be done reliably by normalizing the Hessian so that its diagonal elements were all $-1$, and then rejecting concavity if any eigenvalue of that matrix exceeded $10^{-8}$. Imposing regularity at 1000 points seems to be conservative. If a regularity check failed, it almost always did so at one of the first 200 points in the Halton sequence.

3.3. Results

Consider 24 variants of the fractional polynomial approximation as indicated in Table 2.
1. There are six bases of the fractional polynomial, displayed in the first column. Regularity is imposed ($R$) or not ($\bar{R}$), and first-degree homogeneity is imposed ($H$) or not ($\bar{H}$), as indicated in the first row.
2. In the second row of Table 2, $k$ labels columns indicating the number of terms in the polynomial expansion (1); this depends on the base $b$ and whether or not homogeneity is imposed, but not on whether regularity is imposed.
3. The columns headed ‘‘Secs’’ provide total execution time in seconds. For the two cases denoted with superscript a results are based on 20,000 successive iterations of the algorithm, and in the other 22 cases they are based on 10,000 successive iterations.
4. When regularity is not imposed ($\bar{R}$) there is no Metropolis step in the simulation of the coefficient vector $a$ from the posterior distribution, simulations are nearly independent, and there are always $N = 200$ burn-in iterations.
5. When regularity is imposed ($R$) the number of burn-in iterations is indicated in the column headed $N$, the step size factor for the variance matrix in the Gaussian random-walk proposal distribution is $s$, and the acceptance rate is $\alpha$, as indicated by the columns headed with these symbols. Conventional rules of thumb for Metropolis–Markov chains specify the desirability of acceptance rates roughly in the range (0.20, 0.40). Step size $s$ was chosen with this target in mind, but in view of the results described below we did not spend additional time experimenting with $s$ so as to move all acceptance rates into this range.

Code was written in Matlab and executed on standard laptop machines. It is likely that faster execution could be realized with compiled code.

Table 3
Log marginal likelihoods of alternative models.

           $\bar{R}, \bar{H}$    $\bar{R}, H$    $R, \bar{H}$       $R, H$
b = 1      5140.9      5141.9      5141.4 (0.06)    5142.5 (0.07)
b = 1/2    5147.8      5153.5      5146.9 (0.11)    5153.7 (0.08)
b = 1/3    5125.3      5142.9      5115.3 (1.41)    5136.3 (0.29)
b = 1/4    5102.9      5134.4      5092.1 (1.41)    5127.0 (*.**)
b = 1/5    5078.5      5126.6      5065.0 (1.41)    5117.9 (0.58)
b = 1/6    5051.6      5117.8      5034.5 (1.41)    5107.7 (*.**)

Numerical standard errors in parentheses if greater than 0.02.

Table 3 provides the log marginal likelihoods of the 24 cases. In all cases the approximations use the algorithm of Chib (1995), augmented with an auxiliary simulation approximation of the normalizing constant of the prior distribution in those models that impose regularity. These approximations could be improved with more simulation to provide a better approximation of the normalizing constant of the prior distribution, but this would not change the main implications of the results in Table 3.

Regardless of the imposition of regularity or homogeneity, the log Bayes factor in favor of $b = 1/2$ over any other specification is at least 5.5. The distinction between $b = 1$ and $b = 1/3$ is not always clear-cut. The successively more flexible expansions $b = 1/4$, $b = 1/5$, $b = 1/6$ then follow in order supported by large log Bayes factors. The latter comparisons provide a classic illustration of the penalization of additional parameters in Bayesian model comparison.

For all bases of polynomial expansion, homogeneity is favored over non-homogeneity, regardless of whether regularity is imposed or not. These comparisons become especially strong as $b$ decreases and the number of model parameters increases, and this again can be interpreted as a penalization for excessive parameterization.

For more flexible expansions ($b \le 1/3$) log Bayes factors favor non-regularity regardless of the specification of homogeneity. When $b = 1$ or $b = 1/2$ comparisons are mixed and log Bayes factors never exceed 1 in absolute value. Thus the evidence from the data clearly favors the expansion $b = 1/2$ with a specification of homogeneity, but is less clear-cut about the regularity specification.
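The comparisons in the preceding paragraphs are differences of the log marginal likelihoods in Table 3, and they can be reproduced with a few lines of Python. The sketch below transcribes the table; the column order (no regularity and no homogeneity, no regularity with homogeneity, regularity without homogeneity, regularity with homogeneity) is an assumption inferred from the layout of Table 2, and the variable names are illustrative.

```python
# Log marginal likelihoods from Table 3, one row per base b.
# Assumed column order: (not R, not H), (not R, H), (R, not H), (R, H).
LOG_ML = {
    "1":   (5140.9, 5141.9, 5141.4, 5142.5),
    "1/2": (5147.8, 5153.5, 5146.9, 5153.7),
    "1/3": (5125.3, 5142.9, 5115.3, 5136.3),
    "1/4": (5102.9, 5134.4, 5092.1, 5127.0),
    "1/5": (5078.5, 5126.6, 5065.0, 5117.9),
    "1/6": (5051.6, 5117.8, 5034.5, 5107.7),
}


def log_bayes_factor(log_ml_1, log_ml_2):
    """Log Bayes factor in favor of model 1 over model 2."""
    return log_ml_1 - log_ml_2


# Base 1/2 against every other base, holding the R and H specification fixed.
half_vs_rest = [log_bayes_factor(LOG_ML["1/2"][c], LOG_ML[b][c])
                for c in range(4) for b in LOG_ML if b != "1/2"]
smallest = min(half_vs_rest)

# Homogeneity against non-homogeneity, without and with regularity.
homogeneity = {b: (v[1] - v[0], v[3] - v[2]) for b, v in LOG_ML.items()}

# Regularity against non-regularity, without and with homogeneity.
regularity = {b: (v[2] - v[0], v[3] - v[1]) for b, v in LOG_ML.items()}
```

Under the assumed column order, `smallest` rounds to 5.5, every entry of `homogeneity` is positive, and the entries of `regularity` are below 1 in absolute value for the two coarsest bases and negative for the remaining four.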
On the other hand, economic theory has strong prior probabilities in favor of regularity and homogeneity but has very little to say about an appropriate base b in a fractional polynomial approximation of a cost function. Conditional on this assignment of prior probability across models, posterior probability clearly favors the specification with b = 1/2 that imposes regularity and homogeneity, given the summary of the evidence from the data in Table 3. Interestingly, this is precisely the generalized Leontief cost function first proposed by Diewert (1971). So long as the cost function C (x) is regular and first-degree homogeneous in prices x, the familiar characterizations in terms of elasticities follow in the usual way. Denoting C (x) normalized by output by C ∗ (x), the elasticity of substitution between factors i and j is
\[ \sigma_{ij} = \frac{C^*(x) \cdot \partial^2 C^*(x)/\partial x_i \partial x_j}{\partial C^*(x)/\partial x_i \cdot \partial C^*(x)/\partial x_j}. \]
Since C ∗ is represented by the fractional polynomial pb (x; a) in our model, the mapping from a to σij for a given vector of factor prices x follows from the expressions developed in Section 2.3. In particular, these mappings from the posterior simulations of a to σij provide the simulation representation of the posterior distribution of σij .
By Shephard's lemma the vector of factor demands normalized by output is ∂C∗(x)/∂x, and the elasticity of demand for factor i with respect to the price of factor j is
\[
\eta_{ij} = \frac{\partial^{2} C^{*}(x) / \partial x_i \partial x_j \cdot x_j}{\partial C^{*}(x) / \partial x_i}.
\]
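When analytic derivatives are not at hand, ηij can be approximated for any smooth cost function by central finite differences. The Python sketch below is one such approximation, not the analytic route of Section 2.3; the helper names, the relative step size, and the demonstration cost function `demo_cost` (a generalized Leontief form with a hypothetical coefficient matrix `A_demo`) are all assumptions made for illustration.

```python
import numpy as np

def gradient_fd(cost, x, rel_step=1e-3):
    """Central-difference gradient with steps proportional to each price."""
    x = np.asarray(x, dtype=float)
    h = rel_step * x
    g = np.zeros(x.size)
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = h[i]
        g[i] = (cost(x + e) - cost(x - e)) / (2.0 * h[i])
    return g

def hessian_fd(cost, x, rel_step=1e-3):
    """Central-difference Hessian (four-point stencil for every entry)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = rel_step * x
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h[i]
            ej[j] = h[j]
            H[i, j] = (cost(x + ei + ej) - cost(x + ei - ej)
                       - cost(x - ei + ej) + cost(x - ei - ej)) / (4.0 * h[i] * h[j])
    return H

def elasticity_of_demand(cost, x):
    """eta_ij = C_ij * x_j / C_i, returned as a matrix (row i, column j)."""
    x = np.asarray(x, dtype=float)
    g = gradient_fd(cost, x)
    H = hessian_fd(cost, x)
    return H * x[np.newaxis, :] / g[:, np.newaxis]

# Hypothetical generalized Leontief cost used only to exercise the functions.
A_demo = np.array([[1.0, 0.5, 0.1],
                   [0.5, 1.0, 0.1],
                   [0.1, 0.1, 1.0]])

def demo_cost(x):
    s = np.sqrt(x)
    return s @ A_demo @ s

x_mean = np.array([8002.0, 71.42, 30.75])   # labor, capital, fuel sample means
eta = elasticity_of_demand(demo_cost, x_mean)
```

For a cost function that is first-degree homogeneous in prices, each row of `eta` should sum to approximately zero, which gives a quick check on both the cost function and the numerical derivatives.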
Simulations from the posterior distribution of ηij can again be constructed using the posterior simulations of a and the expressions developed in Section 2.3. Figs. 2 through 5 show the 0.25, 0.50 and 0.75 quantiles of the posterior distributions of these elasticities for the five expansions b = 1/2 through b = 1/6 when homogeneity and regularity are imposed. (Elasticities are trivial when b = 1.) In each figure, each of the three rows of panels corresponds to a different elasticity. The vertical scaling is the same in each row and selected to include all quantiles for all polynomial bases shown. Scaling differs from one row to the next. Each of the three columns of panels corresponds to a different configuration of factor prices. In all cases the price of fuel is the sample mean 30.75 indicated in Table 1. In the left column the prices of labor and capital are their sample means (8002 and 71.42, respectively) indicated in Table 1 and by the circle in Fig. 1. In the center column the price of labor is its lowest and capital is its highest in the sample (5063 and 92.65 respectively) indicated in Table 1 and by the intersection of the dotted lines in the upper left corner of Fig. 1. In the right column the price of labor is its highest and capital is its lowest in the sample (13 044 and 31.33 respectively) indicated in Table 1 and by the intersection of the dotted lines in the lower right corner of Fig. 1. We turn first to the substantive implications of the posterior distributions of elasticities, and then to the econometrics of their sensitivity to the base of the fractional polynomial approximation b and the vectors of factor prices examined in Figs. 2–5. The pattern of elasticities of substitution is broadly the same at all three factor price vectors and for all bases of the fractional polynomial examined (Fig. 2). 
Elasticities of substitution σKL between labor and capital are high, usually exceeding 2, while elasticities of substitution between fuel and the other two inputs are lower, usually less than 0.5. At the sample mean (left column of panels) elasticities of substitution are less sensitive to the base of the polynomial expansion, and interquartile ranges are shorter, than at the extreme factor prices (center and right columns of panels). Consistent with these patterns, the elasticity of demand for labor (Fig. 3) is higher (in absolute value) with respect to the prices of labor and capital than with respect to the price of fuel. The own elasticity of demand is generally below −0.5, and for the models that receive overwhelming support from the data (Table 3) it is between −0.5 and −1.0 at all three factor price vectors examined. For these models the elasticity of demand for labor with respect to the price of capital is about 0.5. Demand for labor is more responsive to labor and capital factor prices when labor factor prices are high and capital factor prices are low (right column of panels) than in the other two cases. Demand for labor has little response to the price of fuel, reflecting the low elasticity of substitution σLF. As in Fig. 2 there is less sensitivity to the base of the polynomial expansion and greater precision in the posterior distribution at the mean of the vector of factor prices x (left column of panels) than at the more extreme prices (center and right columns of panels). The evidence about the elasticity of demand for capital (Fig. 4) is similar. For the polynomial bases b = 1/2 and b = 1/3 that are most strongly supported by the data (Table 3) the elasticity of demand is about −0.5 with respect to own factor price (center row of panels) and about 0.5 with respect to labor factor price (top row of panels). The elasticity of demand for capital with respect to the
Fig. 2. Elasticities of substitution. Horizontal axis indicates b^{-1}. Dots connected with solid lines indicate posterior medians. Circles connected with dotted lines denote posterior first and third quartiles.
Fig. 3. Elasticities of demand for labor. Horizontal axis indicates b^{-1}. Dots connected with solid lines indicate posterior medians. Circles connected with dotted lines denote posterior first and third quartiles.
price of fuel, ηKF, is below 0.2 except for models with lower bases of expansion in the case when labor factor price is high and capital price is low (lower right panel of Fig. 4). Elasticity of demand for fuel with respect to all factor prices (Fig. 5) is very low. At the sample mean it is less than 0.15 (in absolute value) for all fractional polynomial bases b. The same is true for the other two vectors of factor prices for the bases b = 1/2 and b = 1/3 that receive almost all of the posterior probability (Table 3). Since the posterior distributions of the elasticities in Figs. 2–5 are all driven by the posterior distribution of the coefficient vector a, they have many features in common.
Fig. 4. Elasticities of demand for capital. Horizontal axis indicates b^{-1}. Dots connected with solid lines indicate posterior medians. Circles connected with dotted lines denote posterior first and third quartiles.
Fig. 5. Elasticities of demand for fuel. Horizontal axis indicates b^{-1}. Dots connected with solid lines indicate posterior medians. Circles connected with dotted lines denote posterior first and third quartiles.
1. At the sample mean of factor prices x the location of the posterior distribution of an elasticity, as indicated by its median, is insensitive to the polynomial base b. If the data required greater flexibility of the cost function than that inherent in the fractional polynomial base b = 1/2, then the other bases would lead to different elasticities of substitution and would have higher log marginal likelihoods. Thus the insensitivity of posterior medians of elasticities to b, at the sample mean, is consistent with the evidence supporting the expansion b = 1/2, compared with more flexible specifications, documented in Table 3.
2. For the extreme factor price vectors x (center and right columns of panels) the location of the posterior distribution can be sensitive to the fractional polynomial base b; in some cases interquartile intervals for the same elasticity but different bases b do not intersect. This is to be expected as the vector of factor prices x moves away from the main support of the data, the limiting case being one of pure extrapolation. Different models extrapolate in different ways.
3. For any given base b of the fractional polynomial, posterior distributions are more concentrated at the sample mean of factor prices x (left column of panels) than for the extreme cases. This reflects the fact that the observed factor prices are more heavily concentrated near their sample mean than near the extreme prices, so the data provide more information at the mean.
4. As the base of the fractional polynomial b decreases, and therefore its flexibility increases, the ways in which elasticities can respond to changes in factor prices increase. At factor price vectors x where there is less information in the data, this greater variability in the prior will be more strongly reflected in the posterior distribution. This can be seen in the center and right columns of panels of Figs. 2–5. At factor price vectors x where there is more information in the data, the likelihood function dominates the posterior distribution and the less informative priors inherent in lower bases b have less influence on the posterior. This is evident in the left columns of panels of these figures.

4. Conclusion
Neoclassical producer and consumer theory presents an important and well defined problem for econometrics: imposing exactly the regularity conditions inherent in the theory, no less and no more. Parametric approaches to the problem, for example translog and generalized Leontief parametric forms, have been the workhorses of neoclassical applied econometrics for over four decades. These approaches exclude most regular functions and, indeed, are themselves regular only on compact subsets of the positive orthant. The functional approximation literature strongly suggests that the problem can only be solved on such compact subsets.
That requirement is an important mathematical constraint. This paper has shown how fractional polynomial expansions can provide such a solution, using results in the functional approximation literature that extend the classical Weierstrass approximation theory to Sobolev spaces for functions of several variables. The extension is critical to reliable applied econometrics because lower order derivatives of production, cost and utility functions drive the comparative static exercises that are ubiquitous in the applied neoclassical literature. The application shows that the approach is practical, even for fractional polynomials with many terms, in the case of a cost function with three factors of production. Full Bayesian representation of functions of interest, like elasticities of substitution and demand, is practical using standard Markov chain Monte Carlo methods.
The application utilized a data set with 158 observations, which is of the same order of magnitude as most of the canonical data sets in the neoclassical production and consumption literature. For these particular data the evidence favors a fractional polynomial expansion that corresponds to the generalized Leontief model of Diewert (1971). An interesting question for ongoing empirical work is whether this turns out to be the case with other cost function data as well.

Acknowledgments
Geweke acknowledges partial financial support from Australian Research Council grant DP110104372 and useful discussions with William J. McCausland.

References
Barnett, W.A., Geweke, J., Yue, P., 1991a. Semiparametric Bayesian estimation of the asymptotically ideal model: the AIM demand system. In: Barnett, W.A., Powell, J., Tauchen, G. (Eds.), Nonparametric and Seminonparametric Methods in Econometrics and Statistics. Cambridge University Press, Cambridge, pp. 127–174.
Barnett, W.A., Geweke, J., Wolfe, M., 1991b. Seminonparametric Bayesian estimation of the asymptotically ideal production model. J. Econometrics 49, 5–50.
Barnett, W.A., Jonas, , 1983. The Müntz-Szász demand system: an application of a globally well behaved series expansion. Econom. Lett. 11, 337–342.
Barnett, W.A., Serletis, A., 2008. Consumer preferences and demand systems. J. Econometrics 147, 210–224.
Bloom, T., 1989. On the multivariable Müntz-Szatz problem. In: Chui, C.K., Schumaker, L.L., Ward, J.D. (Eds.), Approximation Theory VI, vol. 1. Academic Press, New York, pp. 89–91.
Bloom, T., 1990. A spanning set for C(I^n). Trans. Amer. Math. Soc. 321, 741–759.
Bloom, T., 1992. A multivariable version of the Müntz-Szatz theorem. Contemp. Math. 137, 85–92.
Chib, S., 1995. Marginal likelihood from the Gibbs output. J. Am. Stat. Assoc. 90, 1313–1321.
Christensen, L.R., Greene, W.H., 1976. Economies of scale in United States electric power generation. J. Polit. Econ. 84, 655–676.
Diewert, E., 1971. A generalized Leontief production function: an application of the Shephard duality theorem. Econometrica 39, 206.
Evard, J.C., Jafari, F., 1994. Direct computation of the simultaneous Stone–Weierstrass approximation of a function and its partial derivatives in Banach spaces, and combination with Hermite interpolation. J. Approx. Theory 78, 351–363.
Gallant, R.A., 1981. On the bias in flexible functional forms and an essentially unbiased form: the Fourier functional form. J. Econometrics 15, 211–245.
Geweke, J., 2005. Contemporary Bayesian Econometrics and Statistics. Wiley, Englewood Cliffs, NJ.
Greene, W.H., 1997. Econometric Analysis, third ed. Prentice Hall, Upper Saddle River, NJ.
Royston, P., Altman, D.G., 1994. Regression using fractional polynomials of continuous covariates: parsimonious parametric modeling. J. R. Stat. Soc. Ser. C 43, 429–467.
Rudin, W., 1996. Real and Complex Analysis. McGraw-Hill, New York.
Sauerbrei, W., Royston, P., 1999. Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. J. R. Stat. Soc. Ser. A 197, 71–94.
Terrell, D., 1996. Incorporating monotonicity and concavity conditions in flexible functional forms. J. Appl. Econometrics 11, 179–194.