International Journal of Information Management (1989), 9 (179-185)
Database Evaluation: A Case Study from the Netherlands
H.C. KOOIJMAN-TIBBLES
In the course of 1988 the Ministry of Education and Science in the Netherlands undertook an extensive evaluation of its database, ADION. ADION is the Dutch acronym for Automated Documentation and Information System for Educational Literature in the Netherlands. The evaluation included an internal technical and organizational review, a survey by means of questionnaire among the external users (end-users) and a quality assessment of the database using the precision-recall technique. In this article the methods are described, associated problems and shortcomings discussed and the results given.
Helen Kooijman-Tibbles graduated from Exeter University in 1968, and followed postgraduate training in Library and Information Sciences in the Netherlands. She obtained her MA at the University of Utrecht in 1979, and has held various positions as an information scientist in both the profit and non-profit sectors. Since March 1989 she has been Chief Librarian at the Nijenrode University of Management and Business Administration, Straatweg 25, 3621 BG Breukelen, the Netherlands. Prior to this she was Head of the Documentation and Information Department at the Dutch Ministry of Education and Science in Zoetermeer.
0268-4012/89/03 0179-07 $03.00 © 1989 Butterworth & Co (Publishers) Ltd
Introduction
The Dutch Ministry of Education and Science maintains a documentary database (ADION) which is made publicly available for online use, with the government computer centre (RCC) acting as host organization. The database, which has been operational since 1982, was one of the first of its kind in the Netherlands, although many followed soon afterwards. It was designed to support the activities of the Library and Information Department at the ministry. The department has a long (pre-automation) tradition of providing literature information both to those working at the ministry and to other educational organizations, universities and the general public.
The policies for educational innovation in the 1970s generated so much work for the Library and Information Department that it effectively ground to a halt. There was a considerable backlog in descriptive cataloguing, and index cards once completed had to be inserted in the traditional card catalogue. At its worst the card catalogue had a three-year delay. Obviously this was a considerable hindrance to the information officers trying to provide up-to-date and relevant literature information. This picture will be familiar to anyone who was working in this field at about that time.
Computerization offered a feasible solution and the information needs of policy makers provided sufficient pressure for the project to be authorized. As with all automation projects this one too had consequences for the organization. Internal procedures and allocation of personnel had to be reviewed. It was envisaged that the new organization supported by ADION would be able to keep abreast of the work load and improve the services offered, while at the same time reducing personnel.
A first evaluation in 1985 concentrated on technical and organizational aspects and led to some limited adjustments. Basically we could have stopped there as it was concluded that the aims of the operation were being met.
In the meantime, however, other developments both in the ministry and in the country justified a second and more extensive evaluation. In 1988 ADION was no longer the only database for educational literature information and a policy decision to integrate these different systems - which were directly or indirectly
funded by the ministry - had been taken. Besides this, the library at the ministry was in the process of being automated. The question of how ADION should be developed further remained. To come to some long-term policy for ADION there was a need for objective information on its quality, information on its users, their assessments and additional needs. The second evaluation of ADION should provide answers to these questions.
This article describes ADION and indicates its place in the library and information network in the Netherlands. It discusses the evaluation, the methods used and the results.
Introducing ADION
ADION stands for Automatisch Documentatie- en Informatiesysteem voor Onderwijsliteratuur in Nederland (Automated Documentation and Information System for Education Literature in the Netherlands). The system is built up of a number of separate physical databases each with a defined product:
1. Online STAIRS database.
2. Monthly reference journal.
3. Monthly SDI service.
4. Quarterly journal of legislative documentation.
5. Irregular bibliographies.
Each of the above products is made available on subscription. There are also a number of products specifically for managing the database, such as frequency listings of words and descriptors and frequency listings of journal citations.
Currently (January 1989) there are 44 000 items in ADION. It is updated every two weeks; the yearly addition is now on average 7500 items. The average time taken to process the documents from the moment they arrive at the library to availability online varies from two weeks for legislative documents to two months for journal articles and three months for books. As each item in ADION is in the library at the ministry, document delivery is possible and provided through the national library lending scheme.
The database is loaded on to one of the large IBM mainframe computers at the RCC host. At local level the system is operated on a Philips P9070 configuration with 32 peripherals (disc drives, printers, terminals). There are sufficient terminals for all information specialists. This forms the Library and Information Department’s own network which is linked up to the ministry’s LAN. There is a dedicated datacommunication link between the ministry and RCC.
ADION is one of 12 databases hosted by RCC. Eight other ministries, the Parliamentary information centre, Eurodata and a second database for the educational field make up the rest. The other educational database, called DION, is a technical copy of ADION and built to meet the specific needs of the education community. In 1990 a form of integration between these two databases is planned. One of the aims of the second evaluation of ADION was to define areas of information not yet covered and to assess user problems. Any necessary changes can then be tried and tested before integration takes place.
Another development which will influence the course of ADION is the current automation project at the library of the Ministry of
Education and Science. The library has recently joined the automated central catalogue of library holdings hosted by PICA (Project for Integrated Catalogue Automation). Most of the university libraries and all of the public libraries take part in this project. A local system for library administration designed by PICA has just been acquired. As PICA is completely separate from RCC and uses different hardware and software, an interface between PICA and ADION will have to be built. This will probably involve some changes in the structure of the records in ADION. It was felt that this should be done in conjunction with any other changes arising out of the evaluation.
The evaluation falls into three interrelated parts:
1. An internal technical and organizational evaluation, carried out by ourselves.
2. An end-user survey by means of a questionnaire; this was undertaken by an external bureau.
3. A research project on the quality and consistency of the indexing by means of the precision-recall technique. This was carried out by the department of Library and Information Sciences at the University of Amsterdam.
The evaluation covered the period from October 1986 to October 1988. The evaluation itself was carried out from March to December 1988. The final report was presented to the ADION management group in January 1989.
The internal technical-organizational evaluation
The main aim of this part of the evaluation was to see how changes in the system had affected its operation and whether ADION was still meeting its original purpose. Operating or production problems were systematically analysed on the basis of log-books and system-generated records, to see how they could be solved by adaptations to the system. Library staff were asked to complete a short questionnaire to assess the effects on the organization and their well-being. Using their insight into the needs of the client, library staff also indicated how the various products could be improved to meet current requests. A cross-section of the subscriber population was also directly contacted and asked to grade the value of the products they were receiving.
As a result of the earlier evaluation some system adaptations had already been made and the organizational structure of the management group radically changed. Library staff were specifically asked to comment on the effects (if any) these technical adaptations had had on their work and whether internal communication had improved with the new form of management structure.
Following a number of marketing activities whereby the supportive services of the library became better known within the ministry, the use of the system had grown by 40 per cent in two years. ADION had definitely become indispensable: no one could conceive of working without it. It was still meeting its aims, although open to some minor improvements. The most irritating problems concerned the frequent mechanical breakdowns of the printers, the delays at administrative level in getting products to the clients and the high frequency of the data communication problems. These were usually caused by LAN deficiencies rather than breakdowns in Dutch Telecommunications. It was
generally felt that the service could still be improved and that the system could provide more support than it already did. Expectations were higher than at the start of operations in 1982 and seemed to increase the more the staff had grown accustomed to it.
The various published products generated by the system still met a demand. The SDI service is by far the most important, followed by the quarterly overview of legislative documentation and specific bibliographies. The evaluation brought to light a number of possible improvements which could be met through minor adjustments to the system. For example,
1. Increasing the frequency of the overview of legislative documents from quarterly to monthly and combining distribution with the general monthly reference journal.
2. Streamlining the mailing of SDIs.
3. Improvements in layout and indexes to the bibliographies.
There is also an increasing interest in receiving documentary products on floppy discs so that clients within the ministry can build up their own databases using a personal computer. There is as yet little interest in direct online access to the system. In 1985 there were only five end-users within the ministry (i.e., users who are not library staff members). By 1988 this had grown to eight. Most of our clients prefer to use the department’s intermediary services.
Internal communications had improved. If anything we were overcompensating for the previous situation, although still inclined to underestimate the time needed for our staff to participate in managing the system. All suggestions which were made to improve the system and its products were considered technically and budgetarily feasible and now form the basis of a plan of action for this year.
The external users
The external users account for 20 per cent of the actual use of the database; the other 80 per cent is the use by library and information department staff. While the use of the database has increased, the proportion of external to internal users has remained constant. An external bureau was commissioned to carry out this part of the evaluation, partly because we did not have the resources, but more importantly to ensure objectivity. There was close cooperation in the compilation of the questionnaire.
The main purpose of the questionnaire was to find out more about the end-users and whether their profile matched our general impression. We also wanted to know whether ADION met their expectations. Were they finding the information they expected and if not what was missing? A number of questions were also constructed about the thesaurus, to assess usability and problems. Although it was outside the scope of our influence, we were also interested in assessing the acceptability of STAIRS and user experience of the host organization. This information would help us to direct more adequate support to end-users and increase their use of the database. The questionnaire was tested by interviewing a small number of end-users and adapted accordingly.
One of the problems with this survey was that there are more authorized users than genuine users. Currently there are some 300 authorizations. About a third of these
applied for authorization to one of the other RCC databases and ticked ADION as well, as it did not cost anything extra. These are not genuine users. On the basis of addresses supplied by the host, however, it was not possible to identify this group before sending out the questionnaires. They could only be identified by comparing actual use of the database (number of actual STAIRS sessions) with the assessments given by the respondents. With one or two exceptions this non-user group did not fill in the questionnaire. The response was reasonable, certainly as a percentage of the genuine users (24 per cent).
The results, however, were not spectacular. In general the users knew what to expect from ADION. They were reasonably satisfied, but some improvements would be welcomed. The coverage met expectations; the currency could be higher and the abstracting more informative. The thesaurus was adequate but more frequent updating was requested. As we had expected, the end-users of our system were mostly intermediaries with training and experience in using online databases. Despite this, a need was expressed for initial ADION-specific training which would have helped the intermediaries to become more quickly acquainted with ADION. Our public relations also needed improvement as nobody was aware of the existence of our help-desk.
Problems were indicated concerning data communication, and some criticisms were made about the host organization. More support was requested from the host in making hardware choices; their help-desk was not always available and for more serious problems it was difficult to reach the right person. However, RCC had recognized these problems and was already working on solutions. It is hoped that the current reorganization at RCC will bring more customer satisfaction.
It was difficult to draw any far-reaching conclusions from the survey, partly because of the limited numbers but mainly because the answers were not very surprising.
It does, however, give us support for our plans to extend our training and publicity programmes. The criticism of the thesaurus was not new and in general milder than our own. New software for thesaurus management should be looked into and we are following the current developments of a thesaurus interface for public libraries with considerable interest. Our problems may then also to a large extent be solved. We were not able to identify a particular group on which to target future marketing activities. With hindsight we can conclude that a more direct technique of interviews with a cross-section of the end-users would probably have been more effective.
The precision-recall project
The department of Library and Information Science at the University of Amsterdam was commissioned to evaluate the quality and consistency of the indexing and abstracting in ADION. Literature research indicated that while the ‘precision-recall’ technique does have its shortcomings as an evaluation method, an alternative was not readily available. The technique had also not yet been applied on any scale within the Netherlands. As far as could be judged from the literature there were few comparable cases of this technique being used on an operational database. Most applications measured specially constructed test databases. The reason was also evident: it is difficult to make an accurate assessment of the recall on an operational database.
Recall is defined as the percentage of documents found of all relevant documents in the database. Precision is the percentage of relevant documents of all those found. A high recall is usually accompanied by a low precision and vice versa. The problem of assessing the recall was solved for this project by having a small control group. In this project the total of relevant documents, against which recall was measured, was defined as the sum total of documents found by the researcher and the control group minus the overlap. It should be noted, however, that this probably means that the recall percentage is relatively high. (It would be interesting to see what the results would have been with a larger control group.) Despite these limits to the methodology, it did turn out to be a viable model for measuring the internal consistency of the database.
The researcher, a postgraduate student, while having general knowledge of databases and search methods, was new to ADION and not an expert in the educational field. She could be regarded as an intelligent end-user. The control group comprised two experienced intermediaries at the department. We expected the results of the intermediaries to be better than those of the researcher. The research method and results are reported in the final evaluation report¹ and are to be published in the Dutch journal of librarianship Open.² The description here will be limited to the main results and conclusions.
Sixty test questions were chosen randomly from real questions that had been handled by the information department over the last two years. The researcher handled all of the questions; the intermediaries handled 30 each. As STAIRS allows for searching with controlled vocabulary, free-text and a combination of both, the precision and recall were measured three times for each question:
¹ KOOIJMAN-TIBBLES, H.C. AND BUITELAAR, J.C. (1989). (In Dutch). Rapport van de tweede evaluatie van het ADION 1986-1988. Zoetermeer: Ministry of Education and Science.
² ONGERING, M.H.A. AND RIESTHUIS, G.J.A. (1989). (In Dutch). ADION: een kwaliteitsonderzoek. Open, 21 (in press).
1. On the basis of a search using just descriptors.
2. On the basis of a free-text search.
3. For a ‘normal’ combined search.
Both a user-related average and a system-related average were then obtained from the results. They showed considerable consistency (see Table 1). This consistency is due to the fact that the documentalists take the possibilities of STAIRS into account at the indexing phase.
Table 1. Average precision-recall (per cent)
User average precision-recall per search session
              Descriptors         Free-text           Total
              Precision  Recall   Precision  Recall   Precision  Recall
Researcher    73.5       56.4     75.2       41.3     72.7       71.7
Intermediary  71.3       64.3     78.1       50.9     72.4       73.6
System average precision-recall per search session
              Descriptors         Free-text           Total
              Precision  Recall   Precision  Recall   Precision  Recall
Researcher    72.6       51.3     75.7       37.3     69.9       69.3
Intermediary  65.4       44.5     77.1       18.5     69.5       76.4
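The measures behind Table 1 can be made concrete with a minimal sketch. It assumes that each search result is held as a set of document numbers and that relevance judgements are available; the function names and the sample numbers are illustrative assumptions and are not taken from the study. As in the project, the relevant documents found by the researcher and by the control group are pooled (sum minus overlap) to stand in for the unknown total of relevant documents in the database.

```python
def pooled_relevant(*relevant_found):
    """Pool the relevant documents found by all searchers (sum minus overlap)."""
    pool = set()
    for found in relevant_found:
        pool |= set(found)
    return pool


def precision(retrieved, relevant):
    """Percentage of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return 100.0 * len(retrieved & set(relevant)) / len(retrieved)


def recall(retrieved, relevant):
    """Percentage of the relevant documents that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return 100.0 * len(set(retrieved) & relevant) / len(relevant)


# Invented document numbers for one test question (illustration only).
researcher_retrieved = {1, 2, 3, 4, 5}
researcher_relevant = {1, 2, 3}        # judged relevant among her results
intermediary_retrieved = {2, 3, 6, 7}
intermediary_relevant = {2, 3, 6}      # judged relevant among their results

# The pool approximates "all relevant documents in the database".
pool = pooled_relevant(researcher_relevant, intermediary_relevant)

researcher_precision = precision(researcher_retrieved, pool)
researcher_recall = recall(researcher_retrieved, pool)
intermediary_precision = precision(intermediary_retrieved, pool)
intermediary_recall = recall(intermediary_retrieved, pool)

print(researcher_precision, researcher_recall)
print(intermediary_precision, intermediary_recall)
```

Because the pool can only contain relevant documents that one of the searchers actually found, recall computed in this way tends to err on the high side, and a larger control group would be expected to lower it.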
The abstracts are short but informative. In the choice of words reflecting the content, abstracts and descriptors are complementary to one another. The descriptors which most closely represent the document content are chosen from the ADION thesaurus. Synonyms and new terminology are used in the abstracts. This means that ADION is relatively user-friendly and explains the unexpected similarity in the results between the researcher and the intermediaries.
Conclusions and recommendations could be drawn particularly with regard to the thesaurus. The thesaurus is revised every three to four years. This means that there is a considerable time-lag before new terminology is included in the thesaurus. The complementary nature of current indexing and abstracting practice described above, however, limits the search problems caused. It also masks the relatively low proportion of non-descriptors to descriptors in the thesaurus.
The practice of using strings of descriptors to describe types of schools and some categories of legislative documentation gives a relatively low precision for searches on those topics. The end-user gets respectable results but improving the support given by the thesaurus is recommended. The strings used are listed in a separate publication, basically for internal use but available to end-users on request. The strings take the form of ‘use combined term’ relationships which cannot be dealt with by our current thesaurus software. Nonetheless some solution must be found whereby these separate listings can be included in the thesaurus.
The relationship between synonyms in the abstracts and the descriptors makes it feasible to build up a terminology bank of synonyms and their relationship with the thesaurus. At its simplest level this will lead to a higher proportion of non-descriptors in the thesaurus. Theoretically it can form the basis for automatic indexing.
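The proposed terminology bank can be pictured as a lookup table from non-descriptors to thesaurus descriptors. The sketch below is only an illustration under stated assumptions: the English terms are invented and are not taken from the ADION thesaurus, and the names `DESCRIPTORS`, `TERMINOLOGY_BANK`, `to_descriptors` and `non_descriptor_ratio` are hypothetical. A plain synonym points to one descriptor, while a ‘use combined term’ entry points to a string of several descriptors.

```python
# Hypothetical thesaurus descriptors (invented for illustration only).
DESCRIPTORS = {
    "primary education",
    "secondary education",
    "adult education",
    "legislation",
}

# Hypothetical terminology bank: non-descriptor -> descriptor(s) to use instead.
# A single descriptor models a plain "use" reference; several descriptors
# model a "use combined term" reference (a string of descriptors).
TERMINOLOGY_BANK = {
    "elementary school": ("primary education",),
    "evening secondary school": ("secondary education", "adult education"),
}


def to_descriptors(term):
    """Return the descriptor(s) for a search term, or () if the term is unknown."""
    key = " ".join(term.lower().split())  # normalise case and spacing
    if key in DESCRIPTORS:
        return (key,)
    return TERMINOLOGY_BANK.get(key, ())


def non_descriptor_ratio():
    """Proportion of non-descriptors to descriptors in this toy thesaurus."""
    return len(TERMINOLOGY_BANK) / len(DESCRIPTORS)


print(to_descriptors("Evening secondary school"))
print(non_descriptor_ratio())
```

An unknown term yields an empty result here, in which case a searcher would fall back on free-text searching. Each synonym added to the bank raises the proportion of non-descriptors without changing the set of descriptors itself.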
Conclusions
Being the newest element in the evaluation, the precision-recall project was perhaps the most exciting part and certainly provided the most interesting results. These will lead to adaptations to the thesaurus and to some in-house training for the documentalists, particularly with regard to abstracting. While abstracting is at a generally acceptable level, some bad habits have slipped in over the years and should be eliminated. The quality and consistency of the database are good and should be maintained regardless of the direction ADION takes in the future.
The internal organizational review and the end-user survey provided very little new information. Inasmuch as they confirmed our impressions they were nonetheless valuable. The expectations regarding ADION are high and the support it gives to the work of the Library and Information Department has become indispensable. As for indications for the future, the evaluation has shown us that no concessions on that score can be made. A possible move to PICA and/or the integration with DION must provide at least the same level of support and the same high standard of consistency and quality.