Viewpoint
The universities’ research assessment exercise 1992: a second opinion
We compared assessments of university departments of general practice by the heads of department with those of the UK University Funding Council. There was little difference between them. The departmental heads' assessment was quicker and cheaper. Current means of assessing academic departments may need to be reconsidered.

Lancet 1993; 342: 665-66
Introduction
Finding equitable methods for the allocation of state resources between and within universities is an international problem for which no widely applicable solution has been found. The idea of auditing academic activity has been tried at national level in only a few European countries (including Scandinavia). Since the mid-1980s, the principles of value-for-money and competition have affected thinking about the funding of UK universities. It has proved easier to audit activity and redistribute resources in research than in other aspects of university work, and government funding strategies have now prompted an "internal market". In this market, research departments bid for money from the Higher Education Funding Council, some Research Councils, and the National Health Service (NHS). The funding of UK medical research in the future is also likely to be affected by the Central Research and Development Initiatives in England,1 parallel policy initiatives in Scotland and Wales,2 and discussions over the future of the NHS Services Increment for Teaching and Research.3

The 1992 UK University Funding Council (UFC) research assessment exercise was the third in a series of reviews of university research activity,4 and differed from those in 1986 and 1989 in several ways. Measures of input, process, and output were more tightly defined, partly to increase the detail available to assessors and partly to enable quality judgments to be based on routinely available management information. There were 72 units of assessment, with 3 covering clinical medicine, which had previously been a single "constituency". Membership of the assessment panel was decided only after the data for the assessment had been collected. The criteria used by the panels to make their judgments were published, but the relation of the criteria to consequent funding was kept secret until the assessment was over and the results available.

Despite the increasing complexity of the assessment process, questions remain. Do the judgments of panel members match those of the constituents whose funding is affected? Are strong and weak departments inappropriately handicapped or promoted by being in larger assessment groups? Is information on grants gained and papers published a good measure of research quality? Are the weeks spent preparing for and carrying out assessments the most economic way of rating constituencies? We describe a prospective study of the general practice part of the "Community-based clinical subjects" assessment.
Department of General Practice, University of Edinburgh, Edinburgh, UK (Prof J G R Howie FRCGP); and Department of General Practice, University of Wales College of Medicine, Llanedyrn Health Centre, Cardiff CF3 8PN, UK (Prof N C H Stott FRCGP).

Correspondence to: Prof N C H Stott.
Method

The "Community-based clinical subjects" unit included subunits of psychiatry, general practice, and public health sciences. The panel had 6 voting assessors (2 representing each subject) and 5 non-voting members. We were the two general practice assessors.
Table: Universities Funding Council reference scale
34 universities made submissions, of which 26 included a department or unit of general practice. After the UFC research assessment exercise panel had made their assessments in November, 1992, and before the results were published, all heads of department (except ourselves) were invited to rate all 26 departments on the criterion-referenced scale used in the research assessment exercise (table). No information was available to them other than their own knowledge of the departments gleaned from academic activity. The ratings made by the heads of departments were described as "peer ratings"; we agreed a "lead rating" based on the information provided to the panel; and the "constituency rating" was that published by the UFC.5
Results

18 of the 24 heads of departments returned assessments. Most rated all 26 departments, but heads of departments had been advised to leave blank any department they felt unable to comment on. The only department to be assessed by fewer than 14 assessors was one in London (assessed by 7), apparently because of confusion over its recent change of name. One head of department found the assessment too difficult, and 5 did not reply. The averaged peer ratings for each department were adjusted upward to the nearest whole mark, as was done in the main UK exercise for this constituency. 18 of the 26 departments received the same score by both methods; the 7 departments rated lowest by peer ratings were included in the 9 scored lowest by the lead raters, and the 7 departments rated highest by peer ratings were included in the 8 scored highest by the lead raters (figure). Although the correlation between the general practice ratings and the final published constituency ratings was low, no department was constituency rated 2 or less if it had scored 4 or 5 on either peer or lead scoring, and 1 department had a constituency score of 4 or 5 with a lead or peer rating of 2 or less. Within the whole constituency, star ratings were awarded by the panel to 6 departments of psychiatry, 6 departments of public health medicine, and 7 departments of general practice. The stars were given only when a subunit scored 4 within a constituency scoring 2, or a subunit scored 5 within a constituency scored at 3 or 4.

Figure: Lead vs peer ratings (no of assessors in brackets).
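As an illustration of the scoring procedure described above, the following minimal sketch (in Python) averages each department's returned peer ratings, adjusts the average upward to a whole mark on the 1-5 scale, and counts how often the result matches the lead rating. The ratings and department names are invented, and we have assumed that "adjusted upward to the nearest whole mark" means rounding the average up; the sketch is illustrative only, not the UFC's procedure.

import math

# Hypothetical peer ratings (1-5) for three departments; None marks a
# department that a rater felt unable to comment on and so left blank.
peer_ratings = {
    "dept_A": [4, 5, 4, 4, None, 5],
    "dept_B": [2, 3, 2, None, 2, 3],
    "dept_C": [3, 3, 4, 3, 3, 3],
}

# Hypothetical lead ratings agreed by the two panel assessors.
lead_ratings = {"dept_A": 5, "dept_B": 2, "dept_C": 3}

def peer_score(ratings):
    """Average the returned ratings and adjust upward to the nearest whole mark."""
    valid = [r for r in ratings if r is not None]
    return math.ceil(sum(valid) / len(valid))

same_score = 0
for dept, ratings in peer_ratings.items():
    score = peer_score(ratings)
    n = sum(r is not None for r in ratings)
    same_score += score == lead_ratings[dept]
    print(f"{dept}: peer score {score} (n={n}), lead score {lead_ratings[dept]}")

print(f"{same_score} of {len(peer_ratings)} departments scored the same by both methods")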
Discussion

This study took advantage of the short interval between the completion of the official UFC research assessments and the publication of the results in one area of activity (general practice) within one area of assessment (community-based clinical subjects).

Are specialty-appointed panel members' assessments valid? The results show substantial concordance between the rankings of the lead assessors and the peer groups they represented. We were more generous in our ratings than were the peers, with the lead assessors giving more top ratings and the peers giving more bottom ratings. This difference was probably the result of the information available to us, but may also reflect differences in attitudes within the profession.

Are departments handicapped or protected within larger constituencies? There were weak correlations between lead or peer ratings and the overall constituency rating when psychiatry, general practice, and public health sciences were combined. However, no department of general practice seemed substantially handicapped by being in the larger grouping, and only one was assisted.
Are grant income and publications a good guide to quality? The peer and lead assessors ranked equally despite the peer raters having no access to information on input (grants gained), process (research students and assistantships), or output (higher degrees and published output, of which published papers was the largest element). The quality judgment on "best papers published" used material potentially available to both lead and peer assessors, but the latter had no easy way to obtain the information, and all were so prompt in returning their sheets that they could not have done so.

What are the economics of the assessment exercise? Our study took 18 heads of departments less than an hour each to complete, and it took us about 14 hours to analyse the results and prepare this paper. The preparation for the UFC assessment took 26 heads of departments at least one day each, and preparing for and undertaking the assessment took us about 10 days each. This gives a total cost of about 32 hours for the experiment compared with 46 days for the formal UFC exercise.

Our comparison between national peer rating and two lead raters can be interpreted in two ways. The peer ratings provide some validation of the rankings in the UFC system and show the vulnerability of any criterion-referenced system. On the other hand, the time and cost of the UFC selectivity exercise did not justify an output that could have been achieved much more cheaply. "Efficiency savings" and "value for money" are now familiar concepts in health services and universities. Perhaps the next research selectivity exercise in the UK could test our hypothesis on a wider constituency?

We thank Prof Michael Marmot, Chairman of the UFC Community Panel, for his support in publishing these data, and Dr Roisin Pill, Dr Tim Peters, Mrs S Clee, Mrs P Moore, and Mrs P V Syme for secretarial support.
References

1. Research for health: a research and development strategy for the NHS in England. London: Department of Health, 1991.
2. NHS Directorate. Sharpening the focus: research and development framework for NHS Wales. Cardiff: Welsh Office, 1992.
3. Undergraduate medical and dental education and research: third (interim) report of the steering group. London: Department of Health and Education, 1993.
4. Universities Funding Council. Research assessment exercise 1992. Circular 5/92.
5. Universities Funding Council. Research assessment exercise 1992: the outcome. Circular 26/92.