Economics of Education Review, Vol. 14, No. 3, pp. 301-314, 1995
Copyright © 1995 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0272-7757/95 $9.50+0.00

Research Funding and Performance in U.K. University Departments of Economics: A Frontier Analysis

JILL JOHNES and GERAINT JOHNES

Department of Economics, The Management School, Lancaster University, Lancaster LA1 4YX, U.K.

[Manuscript received 10 September 1991; final revision accepted for publication 5 December 1994.]

Abstract: Data envelopment analysis (DEA) is used to investigate the technical efficiency of U.K. university departments of economics as producers of research. Particular attention is paid to the role of external funding of research as an input into the research process. The data set used is an extended version of the one which informed the 1989 Universities Funding Council peer review, and the results obtained here are compared with those obtained by the Council. We conclude that DEA has a positive contribution to make in the development of meaningful indicators of university performance.

1. INTRODUCTION

In recent years both pragmatic and philosophical considerations have intensified the drive towards increased efficiency of universities in the United Kingdom. The increased demand for (and supply of) post-compulsory education has coincided with a reduction in income tax rates, thus making more acute the search for both alternative sources of funding and efficiency gains. In line with the Conservative government's free market philosophy, measures have been introduced to make the provision of tertiary education more competitive in nature. This has been the case on both the teaching and research fronts. In the case of teaching there has been a move away from block grant funding of institutions, in favour of the principle that funding should go to those schools which are efficient and which can attract student applications for enrolment. Initially this involved the introduction of bidding systems, whereby institutions bid against each other for the right to receive government support for the tuition of specified numbers of students (Cave, 1990; Johnes, 1992a). Following the failure of the Universities Funding Council (UFC) bidding scheme, however, this mechanism has been replaced by a system of dual pricing. In the current system, universities receive a basic fee for each student taught, and receive an additional sum based upon the numbers of students taught in the previous year. This is intended to provide an incentive for institutions facing relatively low costs to expand more rapidly than others.

Likewise the research function has been subject to more intense competition. Periodic peer reviews conducted by the funding councils assess the relative research contribution of departments across universities, and allow the research element of funding from the central government's Department for Education to be allocated accordingly. Naturally enough, the move to greater efficiency has led to a clamour for performance indicators (PIs) of various sorts (DES, 1987; Johnes and Taylor, 1990; Cave et al., 1991; Johnes, 1992b). These PIs include measures of unit costs (Johnes, 1990b), student attrition rates (Johnes and Taylor, 1989; Johnes, 1990a), degree results (Bee and Dolton, 1985; Johnes and Taylor, 1987), and research performance (Johnes, 1988a, 1988b, 1989, 1990).



These measures all suffer from a number of common problems. In particular, appropriate adjustment needs to be made in order to prevent confusion between output and efficiency. For example, crude measures of research output can be obtained by a straightforward publications or citations count; but this makes no allowance for the vast differences in resources faced by the various institutions in the sample: faculty in some departments may have PCs in their offices, good library facilities and low teaching loads, while those elsewhere may enjoy none of these luxuries. A meaningful measure of performance ought therefore to allow for differences in inputs in constructing a PI based on efficiency. In practice, it is rarely possible to make full allowance for differences in input levels; in the case of research output the only inputs considered in most analyses are those of faculty (distinguishing between those employed full-time on research projects and those who also teach), and external financial support for research projects.

Focusing now more clearly on measures of research performance, a further problem concerns the tools with which research output is measured. Citations analysis is a commonly advocated method, but is fraught with difficulties. Particularly serious are the time lags involved. Since the citation half-lives associated with many of the leading economics journals are long, any rigorous citations analysis would inevitably rely on data going back 10 or 15 years. The validity of such a practice in the context of a planning model designed to help allocate resources for the next 5 years is dubious indeed. Moreover, there are gaps in the Social Science Citations Index which, in the case of British economics at least, are quite serious: at least two major British journals (Fiscal Studies and the Bulletin of Economic Research) are not used as sources for the SSCI; neither are books.

Bearing in mind these problems, an alternative method might be to use a publications count. Here, though, one is faced with the problem of adding together apples and oranges; how should publications of different types (for instance, articles and books) be weighted in order to arrive at an overall measure? The seriousness of this problem is underlined by the results obtained by Johnes (1990), who shows that departmental rankings are extremely sensitive to the choice of loss function. This is particularly worrying in the context of the "informed peer review" system currently used to allocate (non-project-specific) resources for research funding across universities in the United Kingdom;¹ the peers are given very extensive information about the numbers of various types of publication produced by each unit of assessment, a limited amount of information upon which to base subjective assessments of quality, and no data on citations.

The problem addressed in the present paper, then, is that of how publications data can best be used to inform the peer review process. Few would argue in favour of replacing peer judgement by purely mechanistic methods; this is true a fortiori in cases where, owing to the absence of a functioning market, prices cannot inform the form of the loss function defined across multiple inputs and outputs. Peers should use their judgement, but it is far from clear that they should do so in a mechanistic way which penalises diversity. Recent developments in linear programming and frontier analysis enable some light to be shed on the relative technical efficiency of various departments, without needing to make arbitrary assumptions about the weighting scheme to be attached to various types of output. The use of data envelopment analysis (DEA) offers planners both at national and local level a mine of information which can be used to further the cause of efficiency in higher education.

The following sequence is pursued in the remainder of this paper. Section 2 introduces briefly the method of DEA. Section 3 describes the data set used in the present study, and the following section reports our results. Conclusions are drawn in section 5.

2. DATA ENVELOPMENT ANALYSIS

Several methods have been proposed in recent years as means of estimating the position of the production possibility frontier. Many of these are statistical in nature (see, for instance, Barrow, 1991). The frontier estimated by such methods must be dominated by observations which occur in life (since a regression line passes through a scatter of data rather than enveloping the data points): a clear disadvantage of the statistical approach when used in this context. An alternative technique for establishing the shape of the frontier, and one which has particular appeal in applications with multiple inputs and outputs and where market prices are either absent or severely distorted, is that of DEA. The method of DEA has its origin in the work of Dantzig (1951) and Farrell (1957). The recent advances which have led to the more widespread use of the technique are due to Charnes et al. (1978).


An excellent introduction to DEA is provided by Sexton (1986), and an early application of the method to the problem of constructing PIs in higher education may be found in Tomkins and Green (1988). In the present section the main features of DEA will be discussed only briefly; readers interested in a more comprehensive exposition are referred to the sources listed above.

In situations where a market operates, the assignment of weights to the various outputs of a firm or any other decision making unit (DMU) is straightforward. Prices are observable, so that the worth of one type of output relative to that of another is readily assessed. The same is not true in situations either where markets are absent or where their operation is substantially impeded. Hence, for example, we cannot easily define the contribution which a typical book makes to the research output of a university department, relative to that of a typical journal article. It is on situations such as these that DEA can throw light. Although prices do not exist, and so meaningful aggregation across various output types is (in the absence of a dictatorially prescribed set of weights) impossible, some assessment of the technical efficiency of DMUs is often feasible. For instance, consider two DMUs, each of which produces two types of output using an identical vector of inputs; if the first DMU produces more of both outputs than does the second, then clearly the latter DMU is technically inefficient. DEA extends this simple principle by using mathematical programming methods to define a piecewise linear production possibility frontier, so that DMUs whose output vectors lie within the frontier must be inefficient, while those with output vectors on the frontier are technically efficient. Note that no reference is made here to allocative efficiency.

To formalize the model, consider a problem in which each of $n$ DMUs uses $m$ inputs in order to produce $s$ types of output. The $j$th DMU uses $x_{ij}$ units of input $i$ in the production of $y_{rj}$ units of output $r$. For each of the $n$ DMUs under consideration, a linear programming problem is set up which aims to choose the vectors of output weights, $u_{rk}$, and input weights, $v_{ik}$, where $k = 1, \ldots, n$, in such a way as to maximize the ratio of weighted output to weighted input subject to the appropriate constraints. (Note that implicit in this program is the assumption that returns to scale are constant.) It is, however, more convenient to flip this over and express it as a minimization problem; so, in the formal set-up of the linear program, we minimize inefficiency (equation 1 below) with respect to the choice of weights, subject to the constraints that the ratio of weighted outputs to weighted inputs should not exceed unity for any DMU (equation 2), that the sum of weighted outputs is normalized to unity (equation 3), and that all weights are non-negative (equation 4). Hence, for the $k$th DMU,

$$\min_{u_{rk},\,v_{ik}} \; g_k = \sum_{i=1}^{m} v_{ik} x_{ik} \tag{1}$$

subject to

$$\sum_{i=1}^{m} v_{ik} x_{ij} - \sum_{r=1}^{s} u_{rk} y_{rj} \ge 0 \qquad \forall j \tag{2}$$

$$\sum_{r=1}^{s} u_{rk} y_{rk} = 1 \tag{3}$$

$$u_{rk},\, v_{ik} \ge 0. \tag{4}$$

Equation (1) defines a set of $n$ linear programming problems, one for each DMU, and each of these must be solved subject to the $(n + 1)$ constraints defined by equations (2) and (3). The control variables, $u$ and $v$, are $(s + m)$ in number. Since each linear program involves $(n + 1)$ constraints, there are $(n + 1)$ shadow prices associated with each DMU. These shadow prices are, in effect, the value of a marginal relaxation of the constraint. For each DMU, the first $n$ shadow prices, those associated with equation (2), have an important economic interpretation. For notational convenience we shall refer to the first $n$ shadow prices relating to the linear program of the $k$th DMU as $z_j$, where $j = 1, \ldots, n$. We may interpret $z_j$ as follows. If $z_k$ equals unity, weighted inputs in the $k$th DMU are as low as constraint (2) allows, so the $k$th DMU is technically efficient. In other words, that DMU lies along the production possibility frontier. Otherwise, constraint (2) where $j = k$ is ineffective, so that $z_k$ equals zero; in this situation $z_j$ must exceed zero for at least some values of $j \neq k$. The latter DMUs form the set of units which (a) perform better than the $k$th DMU according to the $k$th DMU's optimal weighting vectors, and (b) are themselves technically efficient. If the $k$th DMU wishes to follow the cheapest route to technical efficiency, therefore, it should seek to emulate this group of DMUs. For conciseness, this latter group is referred to as the efficient reference set of the $k$th DMU. The $z_j$ for $j = 1, \ldots, n$ may be interpreted as weights used in order to derive the linear combination of efficient DMUs which lies nearest the $k$th DMU's current position.

The $(n + 1)$th shadow price attached to the $k$th linear program, which, for convenience, we shall denote by $z_0$, also has a useful economic interpretation. It represents the output response of the $k$th DMU to a marginal relaxation of the output constraint on a DMU which lies at the closest point of the production possibility frontier to the $k$th DMU. If $z_0 = 1$, then the $k$th DMU is itself efficient. Otherwise $z_0$ will lie somewhere within the unit interval, and may be interpreted as a measure of the efficiency of the $k$th DMU. Henceforth we shall refer to $z_0$ as the efficiency score.
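To make the mechanics concrete, the program (1)-(4) can be solved with any off-the-shelf linear programming routine. The following is a minimal sketch in Python using SciPy's linprog, applied to the four-department example of Table 1 below; it is our illustration rather than the authors' code (their results were produced with the iDEA package described later). The efficiency score is recovered here as the reciprocal of the optimal objective $g_k$, since constraint (2) with $j = k$ forces $g_k \ge 1$.

```python
import numpy as np
from scipy.optimize import linprog

# Data from Table 1 below: one input (staff) and two outputs (articles, books).
X = np.array([[15.0], [20.0], [25.0], [30.0]])        # inputs x_ij, shape (n, m)
Y = np.array([[45.0, 3.0], [80.0, 2.0],
              [25.0, 10.0], [90.0, 9.0]])             # outputs y_rj, shape (n, s)
n, m = X.shape
s = Y.shape[1]

def efficiency_score(k):
    """Solve the weight-choice LP (1)-(4) for DMU k under constant returns."""
    c = np.concatenate([np.zeros(s), X[k]])           # (1): minimise g_k = sum_i v_i x_ik
    A_ub = np.hstack([Y, -X])                         # (2): sum_r u_r y_rj <= sum_i v_i x_ij
    b_ub = np.zeros(n)
    A_eq = np.concatenate([Y[k], np.zeros(m)])[None]  # (3): sum_r u_r y_rk = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (s + m),       # (4): u, v >= 0
                  method="highs")
    return 1.0 / res.fun                              # g_k >= 1, so the score is 1/g_k

for name, k in zip("ABCD", range(n)):
    print(name, round(efficiency_score(k), 3))        # A: 0.889 (= 8/9); B, C, D: 1.0
```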


It is useful at this stage to illustrate the above by means of a simple example. Suppose that there exist four university departments of economics, and that these may be called A, B, C and D. The resources used and the publications produced by these DMUs are tabulated in Table 1.

Table 1. A simple example of DEA

DMU   Staff   Articles   Books   Articles per capita   Books per capita
A      15       45         3             3                   0.2
B      20       80         2             4                   0.1
C      25       25        10             1                   0.4
D      30       90         9             3                   0.3

It is clear from this table that there are marked differences between the four departments in terms of the priorities which they attach to various types of research output. Department B emphasises the production of articles in academic journals, while department C assigns more weight to writing books. Department D occupies the middle ground. This means that any weighting scheme that stressed the value of books would favour department C, while a scheme that attached heavy weight to articles would favour B. It is easily demonstrated that no matter what weighting scheme is chosen, department A cannot appear efficient in this example. Only if books are assigned zero weight can A match the per capita productivity of D; in this case, though, A's productivity is inferior to that of department B. So A is a technically inefficient DMU. Note that the role of the constant returns to scale assumption is made clear in this example: if returns to scale were increasing, the apparent inefficiency of department A could be attributed to its small size rather than to technical or X-inefficiencies.

Further light can be thrown on this argument by appeal to a graphical analysis. The production possibility frontier defined by the envelope around the data in Table 1 is shown in Fig. 1. The axes measure the per capita output of articles and books. Departments B, C and D all lie along the frontier. The input-output combinations used by these departments are known to be feasible because they occur in life. (They might conceivably be improved upon, but we have no evidence to support such a view.) Combinations of inputs and outputs which are observed along the lines joining the three points representing departments B, C and D must also be feasible so long as the standard concavity assumption holds. Since the input-output combination used by department A is strictly dominated by the frontier, this department must be technically inefficient. This is true irrespective of the weights vector applied to the two per capita output measures in the problem. Now suppose that weights are assigned to the two outputs produced by the departments in the sample, so that a weighted total of per capita output can be calculated for each department. The set of weights which would bring department A closest to the production possibility frontier in this instance involves assigning five times as much weight to books as to articles. For this reason we have drawn Figure 1 using different scales for the two axes.
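This five-to-one weighting, and the efficiency score derived from the diagram below, can be checked arithmetically from the per capita columns of Table 1. With weights $(u_{\text{art}}, u_{\text{bk}}) = (1, 5)$, weighted output per member of staff is

$$A\!: \; 3 + 5(0.2) = 4.0, \qquad B\!: \; 4 + 5(0.1) = 4.5, \qquad C\!: \; 1 + 5(0.4) = 3.0, \qquad D\!: \; 3 + 5(0.3) = 4.5,$$

so even under these weights, the most favourable available to it, department A attains only $4.0/4.5 = 8/9$ of the best weighted output per capita (that achieved by departments B and D).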

[Figure 1. Per capita output of articles (vertical axis, 0 to 4) plotted against per capita output of books (horizontal axis, 0 to 0.4). Departments B, C and D lie on the piecewise linear frontier; department A lies inside it, with A* the corresponding point on the frontier along the ray from the origin through A.]

From this diagram it is clear that a useful measure of department A's technical efficiency is the ratio OA/OA*. This ratio is the efficiency score ($z_0$) referred to earlier, and in this example A's efficiency score equals 8/9. Put simply, this means that, even if the weighting scheme most favourable to A is assumed, department A is only 88.9% as efficient as it could be.

In essence, then, each DMU chooses the set of input and output weights which is best for itself; that is, the weights which minimize that DMU's ratio of (weighted) input to (weighted) output. This is done subject to the constraint that the vector of weights chosen by the $k$th DMU does not give any DMU a weighted output to weighted input ratio above one. Hence the efficiency of each DMU is assessed according to standards set by that DMU itself; it will be deemed inefficient only if it is outperformed by other departments which share the same strengths (and are given the same resources). This means that no DMU can be judged inefficient on the grounds that it produces output of a type which a group of peers might consider to be of low worth or unfashionable. Thus departments of economics which make a substantial research contribution by writing chapters in edited books (arguably an unfashionable activity in the eyes of UFC peers) are not discriminated against in a DEA which includes such publications as an output.

This last point raises an important issue. By suitably choosing the inputs and outputs which are to be included in a DEA, it is often possible to make virtually any DMU appear efficient. In this respect, a poorly conducted DEA exercise would be a freak's charter. A technically inefficient DMU could apparently become 'efficient' merely by producing (however wastefully) an unusual type of output, or by forgoing the use of one type of input employed by all other DMUs. It is not at all clear that we would wish to deem technically efficient a university simply because it has no computer facilities! While DEA allows each DMU to choose its own loss function, careful analysis requires limits to be set to this generosity: only the analyst can decide which inputs and outputs are to be included in the analysis (and so assigned non-zero weight) in the first place. DEA removes the need for a dictator to impose precise weights, but the analyst must nevertheless dictate which variables are worthy of inclusion.

The above has three implications.


First, the analysis should be based on as complete as possible a set of DMUs, and great care needs to be taken in choosing the set of inputs and outputs suitable for inclusion in the DEA. The DEA results should ideally be robust with respect to the inputs and outputs chosen, and (in order to avoid bunching of DMUs on the frontier) the inputs and outputs included in a DEA should be limited to those whose relevance can be clearly argued. We have accordingly tested for robustness by running around 200 separate DEA trials, each using a different vector of inputs and outputs; we find that the results are in general robust, but, as we note below, the inclusion or otherwise of one of our variables (research grants) as an input does have a substantial effect on the pattern of efficiency scores across institutions. The ramifications of this observation will be discussed later.

The second implication is that, since DEA does not allow a unique set of input and output weights to be defined, it is not possible to evaluate the marginal impact of each input on each output. For example, an extra £10,000 of research grant finance might, ceteris paribus, produce an extra article in an academic journal if awarded to a specific department in the sample; in another DMU, research funding might be assigned zero weight as an input and the extra finance might have no effect on output. Since each DMU defines its own loss function, then, it would violate the spirit of DEA to attempt to put numbers on trade-offs of this kind. This is regarded by some observers as a disadvantage of the DEA technique, but we believe such a view to be misplaced. In the absence of both a market and a dictator, we have no information about input and output weights at all, and we cannot therefore evaluate the marginal impact on various outputs of changing input levels. This situation does not change simply by introducing DEA, but at least we can thereby gain information about the technical efficiency of each DMU.

The third implication is that efficiency scores should not naively be taken at face value; the analyst should seek to understand the reasons why a certain DMU is deemed efficient or otherwise. Only thus can we ensure that a unit which achieves an efficiency score of unity does not do so merely by virtue of idiosyncrasy. For this reason, the DMUs which our analyses deem technically efficient are each individually discussed in the analysis which follows.

It should be clear from the preceding discussion that DEA provides a great deal of managerial information.


This is of a kind that is useful to planners at national level, because it enables departments to be ranked according to their relative technical efficiency (using each DMU's value of $z_0$), and also at the local level of the DMU, since it allows heads of departments to identify the strengths and weaknesses of their unit, and also to identify which departments elsewhere in the country they should seek to emulate (by means of the efficient reference set).

The computational burden of DEA is quite substantial, since $n$ linear programs must be solved, each of which has $(s + m)$ control variables and $(n + 1)$ constraints. Where, as in the present case, $n$ is large (with 36 DMUs), solution can be time consuming. Fortunately recent software developments alleviate the burden of programming and allow the DEA to be run with reasonable economy of computing time. The package used to derive the results which follow is iDEA, developed by Rod Green at the University of Bath.

It is worth at this stage taking stock of the major assumptions which underpin the DEA method. First, we have assumed that the frontier is concave and made up of distinct linear segments; the first of these assumptions is consistent with received economic theory, but the second is not. The data set used in the sequel contains some 36 DMUs, the positions of which are bounded by a concave envelope. The piecewise linear nature of the DEA envelope approximates the smooth frontier of economic theory more closely the greater the number of DMUs on the frontier. Where few DMUs are deemed technically efficient by DEA, the piecewise linear approximation to the frontier puts an upward bias on the efficiency score of inefficient units.

Secondly, DEA assumes that the frontier is defined by the most technically efficient units that are actually observed. If in reality all DMUs are technically inefficient, and if some DMUs on the measured frontier are in fact less efficient than others, the results of the analysis can mislead. This appears to be a strong assumption, though it is not clear how it could be tested; the heroism of the assumption lessens as the number of DMUs in the data set increases, since the probability of observing units close to perfect technical efficiency rises with sample size.

Thirdly, DEA assumes constant returns to scale. The implication of DEA is that a DMU with an efficiency score of one half (say) could double its output (given input) in order to become technically efficient; but clearly, if returns to scale were variable, this would not be the case. In the present instance we have tested the returns to scale assumption by means of an analysis of covariance between faculty numbers and efficiency score; we found no relationship between measured performance and size.

While this is reassuring in the sense that it lends support to our approach in this paper, it is also a little surprising in view of other work which has detected a size effect on research performance (Johnes, 1988a). In view of this latter work, we followed the method advocated by Farrell (1957): that is, we split the data into groups based on department size, and repeated the DEA within each group. This led to results which were virtually identical to those obtained from the analysis on the full sample, thus confirming that scale effects are absent from the data under consideration here.

3. THE DATA

The UFC has hitherto conducted three major Research Assessment Exercises: in 1986, 1989 and 1992. All these have involved a system of informed peer review. Since the information from the most recent exercise is not yet available, the data used in the present paper concern the period covered by the 1989 appraisal. The 1989 Research Selectivity Exercise covered all university departments in the U.K. (with the exception of those in the small private sector University of Buckingham, and the distance learning oriented Open University). The universities created in 1992 as a consequence of the abolition of the binary divide (between the old universities and the polytechnics) were not included in the 1986 or 1989 exercises.²

A vast quantity of data was collected from departments, including names, age, duration of employment, and rank of faculty, and the total numbers of books, articles and chapters published over the previous 5 years (Johnes et al., 1993). In addition, the best two publications by each faculty member were listed so that these could be read by the peers. (The choice of publications was left to the DMU, and typically delegated to the individual researcher.) Information was also collected on research grants obtained from external funding agencies, and on both undergraduate and postgraduate teaching loads. On the basis of their informed peer review, the UFC classified all departments (including departments of economics) into five groups. Group 5 represented departments where all faculty have a national reputation for excellence, and some have a significant international reputation; departments of economics at Birkbeck, Essex, LSE, Oxford, Southampton, Warwick and York were all allocated to group 5.

Group 1 contained departments where no faculty had a significant national or international research reputation.

We are extremely fortunate to have obtained, for most departments, a data set which includes not only the information available to the UFC, but also full bibliographic details for the years 1984-88. This, we believe, provides us with the most comprehensive set of data hitherto available for the assessment of research performance across departments. The data were kindly made available to us by heads of departments in order to enable the Royal Economic Society (RES) to provide an input into the UFC's Research Selectivity Exercise and to monitor the same. Details of the RES role in this exercise are given in Flemming (1991), and information about the data set is provided by Johnes (1990). The comprehensive nature of the bibliographic information contained in the data set enables us to classify publications according to a much more detailed typology than that used by the UFC. To be specific, we define eight categories for use later in this paper. These are:

(i) papers in academic journals
(ii) letters in academic journals
(iii) articles in professional journals
(iv) articles in popular journals
(v) authored books
(vi) edited books
(vii) published official reports
(viii) contributions to edited works.

Heads of departments were invited to classify their units' output into these categories; we checked the results for consistency and also classified the output of those departments whose heads declined this invitation. In addition to these eight classes, a further category, representing papers and communications published in serials identified by Diamond (1989) as the 'core' journals of economics, was defined. This allows extra weight to be attached to publications in these journals,³ and in this way allows the publications count to approximate the measurement of impact which is achieved by citations counts. While it is important to distinguish between the 'impact' (thus measured) and the 'quality' of a piece of research (see, for instance, Johnes, 1988b), it is difficult to see how else an objective and uncontentious operational definition of research quality could be constructed.⁴


To the extent that authored books serve the function of textbooks, this category reflects the contribution made by academics to the development of instructional tools; a similar function is performed by articles in popular journals.⁵ Articles in professional journals represent a contribution to business, while published official reports reflect a contribution made by academics to government policy making. Articles in academic journals are primarily addressed to the academic profession itself; the extra weight which should attach to high quality research of this type is allowed for by the inclusion as a separate variable of articles in core academic journals. Departments with different research orientations or missions are therefore likely to publish in outlets of different types.

The UFC data enable two distinct types of faculty to be distinguished. Those whose responsibilities are limited exclusively to research activity form the first group, while the second consists of faculty who both do research and teach. We follow the UFC's lead in taking separate account of these two groups. While the former group has more time available to devote to research, it consists largely of fixed contract research assistants with relatively limited experience; accordingly we have no a priori expectation as to the relative research productivity of the two types of faculty, and therefore prefer not to impose unnecessary restrictions on the analysis. Inputs of both types of faculty are measured as person-months over the 5 year period.

4. THE ANALYSIS

It is well established that "efficiency scores cannot go down when additional variables, either outputs or inputs, are added to the model" in a DEA setting (Sexton et al., 1986, p. 86). This is analogous to the impact of variable addition on the coefficient of determination in a regression analysis. Unfortunately, though, while statistical inference can be used as a means of judging whether or not a variable should be included as a regressor in a statistical analysis, DEA is not a statistical technique, and no such guide is available. For this reason we adopt in this paper a parsimonious approach to the measurement of research output. The only outputs which are considered are papers and letters in academic journals, with extra weight allowed for those published in the core journals of economics. This extra weight allows some crude 'quality' adjustment to be made in our measure of output. In the first instance only one input is considered, namely the person-months of teaching and research faculty employed over the 5 year period.⁶
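The mechanics of this extra weight can be made explicit (a sketch of one natural implementation, consistent with notes 3 and 9, but our reading rather than the authors' stated construction). Let $y_{1j}$ count all papers and letters in academic journals, and let $y_{2j}$ count the core-journal subset again as a separate output. A core item then carries total weight $u_1 + u_2$ while a non-core item carries $u_1$, so the non-negativity constraint (4) automatically delivers

$$u_1 + u_2 \ge u_1,$$

that is, the weight attached to core journal publications can never fall below the weight attached to other articles in academic journals.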


The results obtained by applying DEA to these data may be seen in the first column of Table 2. These are the efficiency scores (the shadow prices associated with equation (3), $z_0$) for the economics departments of the 36 universities under consideration. As can be seen from this table, Liverpool and Birkbeck appear to be technically efficient departments; Aberdeen, Bristol, UCL, Reading, Warwick and York are not far behind. Casual inspection of the data confirms that Liverpool and Birkbeck indeed appear to be relatively prolific in their output, but it is instructive to delve deeper in order to establish the criteria which DEA has used to label these, rather than other departments, technically efficient.

Table 2. DEA efficiency scores achieved by university economics departments in the U.K.

                              Efficiencies obtained in DEA run
     University or college       1       2       3
 1   Aberdeen                  0.83    1.00    0.83
 2   Aberystwyth               0.62    0.95    0.63
 3   Bangor                    0.66    0.66    0.66
 4   Bath                      0.79    1.00    0.82
 5   Belfast                   0.39    0.68    0.41
 6   Birmingham                0.37    0.82    0.44
 7   Bristol                   0.84    1.00    1.00
 8   Cambridge                 0.40    0.60    1.00
 9   Cardiff                   0.23    0.44    0.26
10   City                      0.23    1.00    0.23
11   Dundee                    0.62    0.71    0.62
12   Durham                    0.44    0.64    0.45
13   Edinburgh                 0.33    0.42    0.34
14   Exeter                    0.26    0.35    0.28
15   Glasgow                   0.18    0.24    0.19
16   Heriot-Watt               0.38    0.42    0.40
17   Hull                      0.76    0.98    0.82
18   Keele                     0.60    0.87    0.60
19   Kent                      0.48    0.58    0.49
20   Lancaster                 0.72    0.95    0.75
21   Liverpool                 1.00    1.00    1.00
22   London (Birkbeck)         1.00    1.00    1.00
23   London (QMC)              0.47    0.62    0.49
24   London (UCL)              0.87    1.00    0.96
25   Loughborough              0.71    0.80    0.72
26   Nottingham                0.58    0.88    0.61
27   Reading                   0.83    1.00    1.00
28   Salford                   0.20    0.27    0.20
29   Sheffield                 0.43    0.76    0.47
30   St Andrews                0.58    0.68    0.58
31   Stirling                  0.40    0.53    0.40
32   Strathclyde               0.23    0.27    0.26
33   Surrey                    0.57    0.62    0.57
34   Sussex                    0.38    0.65    0.42
35   Warwick                   0.82    0.82    1.00
36   York                      0.88    1.00    1.00
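The comparisons between runs discussed below can be reproduced directly from Table 2. The following small sketch is our code, not the authors'; the text below reports $r = 0.919$ for the correlation between the run 1 and run 3 scores.

```python
import numpy as np

# Efficiency scores transcribed from Table 2 (36 departments, in table order).
run1 = np.array([0.83, 0.62, 0.66, 0.79, 0.39, 0.37, 0.84, 0.40, 0.23, 0.23,
                 0.62, 0.44, 0.33, 0.26, 0.18, 0.38, 0.76, 0.60, 0.48, 0.72,
                 1.00, 1.00, 0.47, 0.87, 0.71, 0.58, 0.83, 0.20, 0.43, 0.58,
                 0.40, 0.23, 0.57, 0.38, 0.82, 0.88])
run3 = np.array([0.83, 0.63, 0.66, 0.82, 0.41, 0.44, 1.00, 1.00, 0.26, 0.23,
                 0.62, 0.45, 0.34, 0.28, 0.19, 0.40, 0.82, 0.60, 0.49, 0.75,
                 1.00, 1.00, 0.49, 0.96, 0.72, 0.61, 1.00, 0.20, 0.47, 0.58,
                 0.40, 0.26, 0.57, 0.42, 1.00, 1.00])

# Pearson correlation coefficient; the paper reports 0.919 for this pair of runs.
print(round(float(np.corrcoef(run1, run3)[0, 1]), 3))
```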

Liverpool publishes a large number of letters and communications per member of staff. Birkbeck is an unusual department in that, over the time period concerned, it was not involved in undergraduate teaching. Its faculty were therefore able to devote an unusually large proportion of their time to work at the frontiers of knowledge in general, and to research in particular.

The only department given a rating of '4' or above by the UFC which is not in the above group of eight departments is Cambridge. Two reasons for the strong showing of Cambridge in the UFC peer review suggest themselves. First, it is possible that a halo effect operates, whereby the good reputation of Cambridge in fields other than economics influences the peers' judgements. Secondly, many Cambridge economists choose to publish their work in the form of books or book chapters; such publications are not included in our vector of outputs. Departments which DEA deems relatively inefficient include City, Heriot-Watt and Salford, all of which received ratings of '1' by the UFC.

A more formal comparison between the efficiency scores reported in column 1 of Table 2 and the UFC ratings obtained by informed peer review indicates that the correspondence is tolerably close, with a correlation coefficient of 0.646. This suggests that a department's performance, as viewed by the peers, is determined largely by the per capita rates of publication in academic journals, especially the core journals.

It should be clear from the discussion in section 2 above that the results of a DEA can be very sensitive to the choice of inputs and outputs included in the respective vectors. In order to check for this we have experimented with over 190 different specifications of the input and output vectors, and report the detailed results elsewhere (Johnes and Johnes, 1993). In addition to the input of staff employed on teaching/research contracts, the inputs considered in these DEA runs include various combinations of the following: the per capita value of external research grants awarded, the time available for research (proxied by undergraduate teaching commitments) and the person-months of faculty employed on research only contracts. The outputs considered include all eight publication types identified in the last section, plus the ninth category for articles and letters published in core journals.

In sum, the results are, in general, remarkably robust with respect to the choice of input and output vectors.⁷ This suggests that there is much to recommend the parsimonious specification adopted here.

There is, however, one important caveat. If the value of external research grants per faculty member is introduced as a second input into the research production function, several departments which were previously far from the efficiency frontier now come to be considered as technically efficient. This can be seen by reference to column 2 of Table 2. Several departments which appear to perform poorly in column 1 are seen to do much better if grants are regarded as an input. These departments include Birmingham, City, Sheffield and Sussex. All of these had relatively low levels of grant funding per capita over the 5 year period. Indeed, City received none at all, and as such provides a good example of the care which needs to be taken in interpreting the output of a DEA; it is not at all clear that City can truly be deemed technically efficient, since the reason for its apparent efficiency is that it receives no grant funding: in effect, it produces an infinite number of publications per pound of research finance. Nevertheless, while it may not be grounds for unambiguous classification of City as a technically efficient unit, this does indicate that the efficiency score achieved by City in column 1 of Table 2 is downwardly biased by failure to take into account the beneficial impact of research funding on output. It is worth noting that the correlation coefficient obtained when the column 2 efficiencies are compared with the UFC ratings is, at 0.497, lower than that which referred to column 1.

This raises an important issue. Research income was, it is widely believed, a major factor in determining the rankings obtained in the first of the Research Selectivity Exercises in the U.K. (Gillett, 1987). But there are severe problems associated with this interpretation of research grants as a measure of research effectiveness. Grants are an input into the research process, not an output. If peers consider both grants and publications to be indicative of high research productivity, then they are guilty of double counting, for some of the publications would not have been possible without the favourable research environment created by the grants. Many of the departments which score highly in the second column of Table 2, therefore, succeed in producing a lot of published work relatively inexpensively. Failure to consider the role of external research funding as an input can therefore lead to misleading estimates of the research efficiency of departments.


It would appear, from the results in Table 2, that the UFC still tends to confuse research output with more economically meaningful concepts of efficiency.

Multiple outputs considered in the analysis so far include only the various types of published research. Another function of the universities, indeed one which society at large might deem more important than research, is teaching. Despite synergy effects, faculty time spent teaching might be expected to reduce the time available for research. This suggests that the time available for research (once teaching commitments have been deducted) might potentially represent a further input into the research production process. In order to allow for this, the per capita annual number of hours spent on each of three activities (undergraduate teaching, postgraduate teaching, and postgraduate research supervision) is subtracted from an arbitrary constant⁸ in order to arrive at a measure of time available for research by each faculty member. The efficiency scores obtained from an analysis which includes this variable as an input, and where the remaining inputs and outputs are identical to those used in DEA run 1, are reported in Table 2 as DEA run 3. It is easily seen that there is a close correlation (r = 0.919) between these results and those of DEA run 1.

The above observations raise an intriguing conceptual problem: if all possible inputs into the production process were included in the analysis, then all DMUs would likely appear to be technically efficient. To make this point clearer, consider the situation where we include faculty ability and managerial efficiency as inputs alongside staffing levels, funding variables and information about teaching commitments. In such a situation there would appear to be little reason for any one department to be more or less efficient (in the technical sense) than any other. However, it would make little sense to include all conceivable determinants of output in the analysis in such a 'kitchen sink' approach. This is because some inputs could easily be transferred across institutions, while others are (in the short term at least) fixed. So we may regard teaching loads and grants as control variables amenable to redistribution across the system, while the intellectual capabilities of faculty and the managerial efficiency of departmental heads may vary only over the longer run. In assessing the relative efficiency of departments, it would seem reasonable to control for inter-institutional differences in inputs which could easily be varied in the short run (grants, computing facilities, teaching loads and the like), but not for those which could not.


Thus we derive a measure of technical efficiency which provides information about the standards a given department could expect to sustain given that it has the same levels of transferable resources as every other department. This appears to us to be the measure of efficiency most useful to those involved in allocating resources across the system: it is a measure which acknowledges the constraints (both short and long term) within which each department necessarily operates.

This helps us to answer the question: what factors determine the efficiency scores obtained by individual DMUs? Consider the situation where all inputs which are variable in the short run are included in the analysis; in the present exercise these include faculty numbers, undergraduate and postgraduate teaching loads, and external finance for research projects. In this case the efficiency scores are determined only by factors which are not variable in the short run, the factors which represent the true potential of the department. These include primarily the intellectual ability of faculty and the managerial efficiency of departmental heads.

Having discussed the efficiency ratings which result from the DEA, let us now turn to consider the managerial information obtained from the analysis which may be of use to heads of department in the individual DMUs. Table 3 shows the efficient reference sets and the associated weights which attach to each department; these correspond to the DEA runs reported in the earlier table. These make a good deal of intuitive sense, as the following examples illustrate. The efficient reference set of Warwick (in both DEA runs where Warwick is not itself technically efficient) contains Birkbeck College: both are departments which have been especially successful in placing articles with core journals. Birmingham, a department in which much high quality theoretical research is conducted, often unsponsored by external funding bodies, has in its efficient reference set City University (in DEA run 2), which likewise produces research output with little support in the form of research grants.

5. CONCLUSIONS

Publications counts inevitably beg questions about the appropriateness of the weights attached to each of the input and output types considered. In previous work a large variety of alternative weighting schemes has been employed (see, for instance, Meltzer, 1949; Crane, 1965; Schubert and Braun, 1981), and the sensitivity of the results to the choice of loss function is by now well established (Johnes, 1990).

One solution to this problem would be for central planners arbitrarily to impose upon the system their own priorities. Another possible solution, advocated in this paper, is to let the data themselves throw light on the issue. This can be done by appeal to DEA.⁹

Some adherents of DEA, in common with those of recently developed econometric techniques (such as Bayesian vector autoregressive analyses), claim that the method is atheoretical. That is, the method calls upon the data themselves to decide what importance should be attached to each variable used in the analysis, and does not afford interference by the researcher. We do not support this view. Since a human decides which variables should be included in the analysis in the first place, he or she inevitably influences the outcome. It is for this reason that checks for the robustness of the results of a DEA are essential. In choosing which results to report in this paper we have therefore conducted extensive experiments in order to ensure that these results are representative. We urge others using similar methods to take similar care.

Clearly the performance of an institution, or of a department within that institution, is a function of much more than research achievement. Teaching, pastoral care, and keeping an eye on costs are also important roles played by academics. Aggregating across measures of performance derived for these various roles is likely to be just as problematic as summing across different types of publication in order to derive an efficiency index for research output. DEA could therefore be of help to educational planners in this wider context too.

It is appropriate at this stage to emphasise once more why correcting for different input allocations should be necessary. It should by now be clear that our aim is to establish the optimum distribution of resources across departments. In this context, allowance should be made for inter-departmental differences in the allocation of variable inputs when evaluating research performance. Only when such adjustments are made can financial resources be allocated in a manner which, subject to a financial constraint and to the short run immobility of faculty, will maximise output across the university system. To a limited extent, similar adjustments are made in other fields, such as sporting activities: participants in a motor sports event must race vehicles of a similar class; boxers are divided into weight divisions; the draft system of the National Football League is devised so that (over time) teams have reasonably equitable access to the best players.

Table 3. Efficient reference sets and associated weights (DEA runs 1, 2 and 3)

[For each of the 36 departments listed in Table 2, the table reports the members of its efficient reference set under each DEA run, together with the associated weights.]

Note: To establish the efficient reference set of a given DMU, refer to the row relating to that DMU. The columns then define the set of DMUs which should be emulated. For instance, consider the first run of DEA, in which two departments, Liverpool (21) and Birkbeck (22), are efficient. Suppose that we are interested in the department of economics at Lancaster. Since there are numbers in both the columns headed 21 and 22, we may conclude that both Liverpool and Birkbeck are in Lancaster's efficient reference set. The numbers are the weights. Suppose now that we are interested in Sussex; now only Birkbeck appears in the efficient reference set, so it is only Birkbeck which Sussex should seek to emulate.

If resource allocation is not co-ordinated by a central agency such as the UFC, the aim of any exercise designed to evaluate research performance would differ from the goals of the present article. One possible aim might simply be to establish which department is, in absolute terms, the most productive. If this were indeed the objective, there would be no need to control for input variations. Thus the argument in favour of controlling for differences in inputs may not apply when evaluating the performance of the private sector of higher education in countries such as the United States. An appropriate sporting analogy in this case would be the winner of a motor race in which design of the car as well as quality of the driving became part of the competition. Nevertheless, in a more partial context, U.S. state governments seeking guidance on how to allocate public monies amongst higher education institutions might find the DEA approach useful.

The successful execution of a DEA requires, however, that appropriate and consistent data (such as those used in the analysis reported in this paper) are available at the desired level of analysis (for example, by department).

The construction of sensible PIs is a prerequisite for the efficient operation of the higher education system. The problems of aggregation involved in their construction are far from trivial. DEA is not a panacea (it cannot answer impossible questions), but it nonetheless offers much as a means of alleviating these difficulties.

Acknowledgements: The authors wish to express their thanks to Stephen Hoenack and to two unusually perceptive referees, whose comments have enabled a substantial improvement in the content and exposition of this paper. Responsibility for errors remains, of course, with the authors.

NOTES

1. The funding of the research and teaching functions is distinct. Research into the multi-product technology of universities is still in its infancy, and has thus far failed to influence the means by which research and teaching resources are allocated across British institutions.
2. In comparison with the collective of universities of many other countries (notably the U.S.A.) the sample used here is therefore highly homogeneous in terms of the mix of resources devoted to teaching, research and other activities (Sloane, 1993). With the exception of Cambridge, the universities in the sample also enjoy approximate parity of popular esteem.
3. Indeed it constrains the additional weight assigned to such journals (over and above the weight attached to other journals) to be non-negative.
4. The quality of individual publications could be assessed by peer review (indeed a sample of publications is assessed in this way by the British funding council peer groups), but this inevitably introduces a degree of subjectivity and controversy.
5. There is evidence to suggest that the funding council peer review groups for economics have attached little importance to books in the evaluation of research output. This is because of the belief that research results are first reported in the journals, and are only later reproduced in book form.
6. A referee suggested that we might use faculty salaries or the degree granting status of the university as a proxy for the quality of inputs. In the U.K., however, all universities have similar degree awarding powers, and the structure of salaries is nationally determined. Other possible measures of university inputs or outputs include costs data and measures of graduates' earnings, but these data are not available at the required level of disaggregation. Indeed, it is the unavailability of 'price' variables of this type which makes DEA an attractive method to use in this context.
7. For reasons of space we cannot report all the results here. In Johnes and Johnes (1993) we conduct a hierarchical cluster analysis which allows us to identify two distinct clusters of efficiency score vectors resulting from the DEA runs. Those in the first cluster may be distinguished from those in the second by the presence or otherwise of the per capita value of grants as an input. Within clusters, the correlation coefficient obtained by comparing any pair of efficiency score vectors is at least 0.52. This being so, the two vectors reported in Table 2 of the present paper represent a useful summary of the mass of results which we have accumulated.
8. The constant used is 1665, based on the assumption that British academics work an average of 37 hours per week for 45 weeks of the year. This assumption is often made for administrative convenience, although recent survey evidence suggests that the average academic in Britain works over 50 hours per week. In addition to teaching and research, academics spend a non-negligible proportion of their working time on administration.

9. DEA does not permit the analyst to express a view about the relative efficiency of two DMUs which lie at different points along the efficiency frontier. The frontier is, in this respect, analogous to a Paretian contract curve. Social welfare considerations can be accommodated by imposing restrictions on the weights attached to some or all of the variables used in the analysis. In the present paper, we have imposed just such a restriction on the weight attached to core journals: it must not be less than the weight attached to other articles in academic journals. To go further, by introducing an explicit social welfare function, would violate the spirit of DEA, penalise diversity, and risk running foul of Arrow's impossibility theorem.

REFERENCES

Barrow, M. (1991) Measuring local education authority performance: a frontier approach. Economics of Education Review 10, 19-27.
Bee, M. and Dolton, P.J. (1985) Degree class and pass rates: an inter-university comparison. Higher Education Review 17, 45-52.
Cave, M. (1990) Tendering trials. Public Money and Management 10(2), 4-5.
Cave, M., Hanney, S., Kogan, M. and Trevett, G. (1991) The Use of Performance Indicators in Higher Education. London: Jessica Kingsley.
Charnes, A., Cooper, W.W. and Rhodes, E. (1978) Measuring the efficiency of decision making units. European Journal of Operational Research 2, 429-444.
Crane, D. (1965) Scientists at major and minor universities: a study of productivity and recognition. American Sociological Review 30, 699-714.
Dantzig, G.B. (1951) Maximization of a linear function of variables subject to linear inequalities. In Activity Analysis of Production and Allocation (Edited by Koopmans, T.C.). New York: Wiley.
DES (1987) Higher Education: Meeting the Challenge, Cm 114. London: Her Majesty's Stationery Office.
Diamond, A.M. (1989) The core journals of economics. Current Contents 21(1), 4-11.
Farrell, M.J. (1957) The measurement of productive efficiency. Journal of the Royal Statistical Society, Series A 120, 253-290.
Flemming, J. (1991) The use of assessments of British university teaching, and especially research, for the allocation of resources: a personal view. European Economic Review 35, 612-618.
Gillett, R. (1987) Serious anomalies in the UGC comparative evaluation of the research performance of psychology departments. Bulletin of the British Psychological Society 40, 42-49.
Johnes, G. (1988a) Determinants of research output in economics departments in British universities. Research Policy 17, 171-178.
Johnes, G. (1988b) Research performance indicators in the university sector. Higher Education Quarterly 42, 54-71.
Johnes, G. (1989) Ranking university departments: problems and opportunities. Politics 9(2), 16-22.
Johnes, G. (1990) Measures of research output: university departments of economics in the UK, 1983-88. Economic Journal 100, 556-560.
Johnes, G. (1992a) Bidding for students in Britain: why the UFC auction failed. Higher Education 23, 173-182.
Johnes, G. (1992b) Performance indicators in higher education: a survey of recent work. Oxford Review of Economic Policy 8, 19-34.
Johnes, G. and Johnes, J. (1993) Measuring the research performance of UK economics departments: an application of data envelopment analysis. Oxford Economic Papers 45, 332-347.
Johnes, G., Taylor, J. and Ferguson, G. (1987) The employability of new graduates: a study of differences between UK universities. Applied Economics 19, 695-710.
Johnes, J. (1990a) Determinants of student wastage in higher education. Studies in Higher Education 15, 87-99.
Johnes, J. (1990b) Unit costs: some explanations of the differences between UK universities. Applied Economics 22, 853-862.
Johnes, J. and Taylor, J. (1987) Degree quality: an investigation into differences between UK universities. Higher Education 16, 581-602.
Johnes, J. and Taylor, J. (1989) Undergraduate non-completion rates: differences between UK universities. Higher Education 18, 209-225.
Johnes, J. and Taylor, J. (1990) Performance Indicators in Higher Education. Buckingham: Open University Press.
Johnes, J., Taylor, J. and Francis, B. (1993) The research performance of UK universities: a statistical analysis of the results of the 1989 Research Selectivity Exercise. Journal of the Royal Statistical Society, Series A 156, 271-286.
Meltzer, B. (1949) The productivity of social scientists. American Journal of Sociology 55, 25-29.



Schubert, A. and Braun, T. (1981) Some scientometric measures of publishing performance for 85 Hungarian research institutes. Scientometrics 3, 379-388.
Sexton, T.R. (1986) The methodology of data envelopment analysis. In Measuring Efficiency: an Assessment of Data Envelopment Analysis (Edited by Silkman, R.H.). San Francisco: Jossey-Bass.
Sexton, T.R., Silkman, R.H. and Hogan, A.J. (1986) Data envelopment analysis: critique and extensions. In Measuring Efficiency: an Assessment of Data Envelopment Analysis (Edited by Silkman, R.H.). San Francisco: Jossey-Bass.
Sloane, P.J. (1993) CHUDE survey of teaching loads. Royal Economic Society Newsletter 83, 4-5.
Tomkins, C. and Green, R.H. (1988) An experiment in the use of data envelopment analysis for evaluating the efficiency of UK university departments of accounting. Financial Accountability and Management 4, 147-164.