The International Journal of Museum Management and Curatorship (1989), 8, 57-62

Data Entry: The ‘Bugaboo’ of Museum Computerization

KAROL A. SCHMIEGEL

Concern for data entry often prevents a museum from obtaining a comprehensive computerized collections information management system. The project planner, top management, the board of trustees or other oversight group may believe that data entry will not be completed and the system will not be useful, that data entry will cost more than hardware, software and storage space, and that if regular staff do the data entry, they will not be able to carry out their normal responsibilities. These fears are realistic: data entry is a ‘bugaboo’. A ‘bugaboo’ is defined as a cause of ongoing concern, and while there are many options to reduce concerns about data entry, the ‘bugaboo’ remains.

The effects of such fears can be, first and probably worst, no system at all; second, a system which includes fewer data than are needed. This second result can be achieved in two ways. The first is to reduce the number of types of data so far that the system is of little use: for example, recording each object’s number and name but not its location for an inventory record. The second is to reduce the amount of data entered in the first phase or first pass, but to make provision for all that is needed for an effective system and add it at a later date after the system has proven its worth. For example, for an inventory, record initially each object’s number, name, primary material and location but allow fields and space for measurements and secondary materials which can be filled in after one has recorded locations for the entire collection.

The third and best result of data entry fear is to plan efficient ways to enter all the data needed by taking advantage of services and technology that are available. The corollary to this approach is to plan to fund the data entry project adequately. The end product is a computer system which works efficiently to meet the goals set for it.
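The phased approach described above, in which a first pass records only number, name, primary material and location while space is reserved for measurements and secondary materials, can be illustrated with a minimal sketch. It is written in Python purely for illustration and does not represent any system discussed in this article; the class name, function names and sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InventoryRecord:
    # First pass: entered for every object during the initial inventory.
    number: str
    name: str
    primary_material: str
    location: str
    # Later pass: the fields exist from the start but stay empty (None)
    # until the whole collection has been located.
    measurements: Optional[str] = None
    secondary_materials: Optional[List[str]] = None

    def needs_second_pass(self) -> bool:
        """True while any later-pass field is still unrecorded."""
        return self.measurements is None or self.secondary_materials is None


def second_pass_worklist(records: List[InventoryRecord]) -> List[str]:
    """Object numbers that still await measurements or secondary materials."""
    return [r.number for r in records if r.needs_second_pass()]


# Hypothetical sample data, invented for illustration only.
records = [
    InventoryRecord("X.1", "side chair", "walnut", "Gallery A"),
    InventoryRecord("X.2", "teaspoon", "silver", "Store B",
                    measurements="L 14 cm", secondary_materials=[]),
]
print(second_pass_worklist(records))
```

In this sketch an empty list means "no secondary materials", while `None` means "not yet recorded", so the first-pass inventory is usable at once and the remaining work can be listed at any time.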
The goal may be very simple, such as an inventory of the collection with minimal information about the objects and locations, or it may be a complex system for both catalogue and collections management data. There are two parts of the end product: the system and the data. Proper selection and use of the former are essential to managing the latter, but they can also play an important role in the data entry process.

In planning a system, one of the first steps is to review the data. How much is there and how complex is it? The size of the museum’s collection is one factor. A collection of several million specimens is rather daunting; however, the information to be recorded about these may be only a few fields. On the other hand, a smaller collection of thousands of objects may be very complex and contain many different types of objects for which there is much information. More knowledge will be needed on the part of the data entry planners to enter data about these different types of objects wisely. The quality of the original data is an important factor, in terms of both accuracy and legibility.

0260-4779/89/01 0057-06 $03.00 © 1989 Butterworth & Co (Publishers) Ltd

Vocabulary control is a desirable feature and may already exist in the manual system. However, where it does not, one must consider whether it is important to implement these controls before data are entered into an automated system or whether the system can be used to determine the most suitable vocabulary for enforcing consistency. In an earlier part of the computer era, it was essential to have all of the data corrected before they were entered, as changing them could be very difficult and time-consuming, and it still can be in an unsophisticated computer system. However, more sophisticated and complex systems have a global search-and-replace capability, so changing incorrect terms to the more desirable ones is an easy matter. It is important not to succumb to the idea that a collection must be recatalogued before any of its data can be computerized. Indeed, computerization of part of the data may facilitate a review and updating of the rest of the documentation.

In planning the system, selection of the number and types of data fields can be very important for managing the data entry process. Using the same fields as the manual system will speed data entry; adding fields which require records to be searched extensively for the data will slow data entry. Maybe one should not enter all these data, at least not into separate fields. Perhaps large areas of text which have been used in a manual or catalogue card system should be retained. If one may search on the text, it is much simpler to enter the record as it exists than to divide it into many separate fields. For example, one may choose to search the text for the term ‘cabriole leg’ rather than assign a field for each part of the object. Alternatively, one may use only those fields that are needed to locate the objects which may meet researchers’ usual criteria and then go to the paper files for the additional data. The computer may be used to produce a list of all the furniture likely to have cabriole legs based on dates and type of object.
The researcher would then go to photographs or descriptions to find those with cabriole legs. In short, the computer system can serve as a finding aid but not as a repository for all the information about the collection. However, the project manager must decide on what data fields must be included and stick to the decision.

Data entry methods have come a long way since the key-punch cards of the 1950s and 1960s. There are now at least four major possibilities. The first is to key in using a CRT terminal with screens formatted to accept the data and enter them directly into the database. This on-line process is one of the simplest ways to enter data but has one disadvantage in that it can reduce the efficiency of the computer system’s other functions. The second way is to use a personal computer for data entry, with the data being stored on floppy disks or tape and being uploaded into the main system’s database in batches, often after normal working hours. The third way is to transfer data electronically from another automated database, which may be on a word-processing system, a mainframe or another computer database. The data, once entered electronically into one form of database, should not have to be rekeyed in order to enter the system you want to use. Good programmers can write software which will transfer the data for you with minimal effort on your part. The fourth and most recent use of technology is an optical character reader which can scan existing typed records, display these on the screen for editing, then store them and permit a batch dump into the database at a time convenient for the system’s users.

A well-designed computer system should have many features which will speed data entry. A copy or repeat feature for identical or very similar objects will copy the data once entered for each subsequent object but permit the editor to change the items which differ.
For example, for the individual pieces of a porcelain dinner service, one would enter the data for the first object, which would include a full description of the pattern
and the shape of that particular object. All the data are repeated for every subsequent entry, but the name of the object, its measurements, its shape and its condition can be changed. Data such as date, origin, maker, material, history of ownership, donor and the description of the actual pattern for the entire set are entered only once.

Using codes or abbreviations can also save key strokes and storage space. However, for these to be useful, they must represent a lot of data; otherwise trying to remember the codes and keep track of them can be more trouble than it is worth. Ideally, software is designed to take data transferred from another database and put them into the correct fields in the new database. Optical character recognition is currently suitable only for consistently formatted, typed records. However, it can be a lot of fun to use and, at least at Winterthur, has made it easier to attract workers who do not need to be superb typists but who do need to know and understand the data.

The problem of who is going to do the data entry is another major concern. First of all, one may look at one’s existing staff and believe that the experts should review the old data and make any alterations that are necessary before changing over to the new system. However, the senior staff among the keepers and the curators usually will not do this. They will delegate the task to junior professional staff who would prefer to be doing something else and who often do not have the expertise to spot all the errors which should be corrected. Those clerical staff who have worked with typed and manual records would be a good choice; however, their time is usually regularly committed elsewhere, as is that of other clerks in the institution.
New and temporary staff, such as interns and students or non-specialists, can be used, often with great success, especially interns who are seeking to learn about the collection and students for whom data entry can be a relatively high-paying job. There are agencies which provide specialized data entry staff, who can either come to your site and use your equipment or do your work at the agency. Most of these people are extremely accurate typists, but they enter the data exactly as given to them. They can make no decisions about the accuracy of data or about including any additional information which is not obvious. They will not know your vocabulary and may not be able to check for spelling errors in specialized terms. One company in the United States, Willoughby Associates, will enter data of museum records for those who purchase its software. This company has on its staff art historians who can indeed decipher your almost impossible data and read between the lines.

The training of data entry staff must be very thorough. A written procedures manual is necessary. The system designers and project manager should compile one and if possible have on-line help screens. New staff should have ready access to experienced and knowledgeable regular staff members who can answer their questions.

Data entry is part of the total project and not the project itself. However, successful data entry which permits a museum to use a new computer system relatively quickly and with minimal disruption of normal work is viewed as a success, even if the quantity of data about each object is not extensive. Being able to track an object’s location or to do some searches which aid researchers and staff members are major steps which are recognized by the museum’s management and oversight board.

Winterthur Museum is a case study in combining three different data entry techniques to achieve control over the information available for approximately 85 000 objects of various types, comprising a decorative arts collection that ranges from 1640 to approximately 1850 and includes objects made or used in the northeastern USA. The primary goal of computerization was inventory control. However, we developed the specifications for a very extensive system that would permit us to eliminate almost all
paperwork other than images and primary documents. The system was seen both as a major aid to collections management activities such as object movement, loans, accessioning and photography, and as a research tool for historical and descriptive data about our objects. Eight application modules were specified in addition to the design for the database. These were: the basic data; location data; confidential data including appraisal and insurance values and donor information; a loan processing module; a collections care module; photographic holdings; photographic order processing; and the remaining documentation and descriptions of the objects in the collections.

However, the cost of designing the software for the entire system, to say nothing of data entry, exceeded the funds available. We therefore readjusted the initial scope of the project so that only the first three modules would be implemented, but the database design allows for the other parts to be added when possible. The scope of the first three modules was somewhat expanded. The plan was to include the basic facts, such as the object’s name, number, maker, origin, dates, materials and donor, as well as the location information and data concerning source and value as in our existing catalogue. The fields, however, were selected according to our existing catalogue cards. The objects’ descriptions and histories would be added at a later date.

The data for approximately 15 000 objects (number, name, material, permanent and temporary locations) were to be transferred electronically from another database already resident on our computer, which had been created by transferring data from a word-processing system. Our vendor, Delaware Computing Services, suggested that we investigate using an optical character reader, as our existing catalogue cards for approximately 50 000 objects were in consistent format and good condition.
The first optical character readers which we investigated would not accept the 4 x 6 inch cards. These would have had to be xeroxed before they could be fed into the optical character reader. The small readers in the approximately US $8000 price range would have accepted the information on standard 8 x 11 inch sheets of paper. The data would then have gone to our Wang OIS 140 Word-Processing System for editing prior to being transferred to our Hewlett Packard 3000/70 computer with an Image database management system. However, a combination of a Palantir Optical Character Reader and a Sun Workstation had just been released as a package for hire. A demonstration using our own catalogue cards proved successful, and we arranged to lease these two machines. We believed that we could enter both the basic data and the descriptions in the same amount of time as keying in directly just the basic data. The catalogue cards had been proofread prior to being printed, and we thought them to be quite accurate.

The staff at Delaware Computing Services planned very carefully so that the existing identifiers or flags on the cards would be recognized as field delimiters and would be used to indicate which data went into which field in the new database. Alternatives were available for each so that upper and lower case did not matter and a number would serve as well as the word. As we began the data entry before all of the software and screens were completed, some pieces of data were flagged with a number but put in the large text field. Later these were retrieved and put into their proper places in the database.

The data entry staff were recruited from a number of sources: first, from our guiding staff. Our guides work part-time. At least ten of them had worked previously in the Registrar’s Office to assist with inventory. Three of these decided to join the data entry team.
One has worked consistently on the optical character reader and two have been checking the results in the database on the main computer system. Another guide has also
joined the staff, carrying out both tasks. Because the guides are familiar with the vocabulary used and with the record format, they are extremely valuable workers. A number of students in our Master’s degree program have also been working on the optical character reader. They are delighted to learn more about the collection and enjoy using the new technology.

We contracted for experienced data entry people through an agency which specializes in computer temporaries. They have also proved valuable as they work at night and on weekends. Although they were not at all familiar with our vocabulary, they had excellent typing skills and were used to adjusting to the demands of different systems. We have utilized these people both on the optical character reader and for keying in data from handwritten worksheets for one large collection of silver flatware.

Additional students, both at the graduate and undergraduate levels, and recent graduates were hired to work during the summer vacation. They were eager to work in a museum and learn about collections, and they were enthusiastic and productive. However, at the end of the summer vacation, when they returned to school, they resigned or reduced their hours to accommodate their studies. Another source has been museum staff, both from our institution and others, who are looking for extra work to supplement their incomes and who like to do something which relates to their professional interests. Former museum employees who have retired, but who like to work on a part-time basis, often for only part of the year, are disciplined professionals who know something about the collections and have proved in most instances to be very good workers.

Recruiting people to work on the optical character reader was easier than for keying, and the Sun is especially user-friendly. The Sun workstation has a very large screen and uses a mouse both for editing and for activating its various functions.
It is more fun to use than a standard computer with function and cursor keys. Our data entry staff has a higher turnover rate than regular staff. However, they have not been too difficult to replace.

The optical character reader was scheduled for staffing 15 hours per day and often on weekends. When all worked well, 1500 records per week could be input. The size of each record ranged from one to three cards. The procedure was that the operator would feed cards into the optical character reader, check each one on the screen, key in any handwritten information (such as the object’s location within each exhibition area), and proofread and, if necessary, correct what the machine had ‘read’. Each evening the main computer system’s back-up operator would create a tape of these data for security and transfer it into the database for our applications software.

We used keying for the several thousand records where we had no typed cards. These were for reserve and research collections which are being catalogued retrospectively and for several large collections that have been given to us quite recently.

The Sun workstation has features which permit copying of records so that the same data do not have to be read or keyed for like objects: for example, a set of six nearly identical silver teaspoons, where a separate record could be copied from the first for each of the subsequent items. However, we found that sets of more than ten identical or similar objects simply did not work well on the optical character reader. It was too easy for the operator to lose his place when making many copies of the same record. It also proved to be faster, when large sets were involved, to key in the data once and use the copy feature on CIMS-Hewlett Packard systems than to use the copy feature on the Sun. Some cards which had extensive handwritten revisions or which were simply smudged and dirty were difficult to edit on the Sun and these were assigned to people to key in directly.

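The record-copying feature described above, in which the first record of a set is entered once and each later item is a copy with only the differing fields changed, can be sketched as follows. This is an illustration in Python under stated assumptions, not the software used on the Sun workstation or the Hewlett Packard system; the record fields, the function name `copy_for_set` and all sample values are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class CatalogueRecord:
    number: str
    name: str
    maker: str
    material: str
    donor: str
    measurements: str
    condition: str


def copy_for_set(first: CatalogueRecord,
                 changes_per_item: List[Dict[str, str]]) -> List[CatalogueRecord]:
    """Return the first record followed by one edited copy per later item.

    Each dict names only the fields that differ from the first record;
    every other field is carried over unchanged.
    """
    return [first] + [replace(first, **changes) for changes in changes_per_item]


# Hypothetical set of six teaspoons; all values are invented for illustration.
first_spoon = CatalogueRecord(
    number="S.1", name="teaspoon", maker="(maker)", material="silver",
    donor="(donor)", measurements="L 14.0 cm", condition="good",
)
spoons = copy_for_set(
    first_spoon,
    [{"number": f"S.{i}"} for i in range(2, 7)],
)
```

Because the shared data (maker, material, donor) are typed only once, an error in them needs correcting in one place before copying; the operator's remaining task is to keep track of which item in the set each copy represents, which is the difficulty reported above for sets of more than ten.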
The data entry project was massive. It has required over a year to enter data for our
collection. Utilizing the optical character reader has made the process faster for the quantity of data which we entered. When the optical character reader was working well, it was a real pleasure to use. However, our particular machine was plagued by recurring hardware problems. The Sun workstation and of course the Hewlett Packard computer system have proved to be extremely reliable. We knew at the beginning that data entry was a major area of concern. We have had an excellent manual system with very detailed records. Expectations for our computer system were therefore extremely high. The accuracy of the manual system was very good. While our computer system’s first goal was to allow us to gain inventory control of our collection, providing assistance to staff and other researchers, and in the future to the general public, was also a main goal. Being able to enter the descriptive information about each object as well as the discrete pieces of basic data means that our educational mission has been even further enhanced. What would we have done differently, with the benefit of hindsight, to facilitate the data entry even further? Our main problem has been that we have not been able to have one staff member as a full-time project manager for data entry. It would have been to our advantage to have hired on a temporary basis one person to manage the entire data entry project literally from start to finish: someone who would have instantly been on top of scheduling problems, technical problems, training and proofreading. We would undoubtedly have saved some time on data entry and have had a better-quality product as we went along rather than postponing so much of the cleanup until nearer the end of the project. Hindsight is, of course, always clearer than foresight. We have utilized three methods of data entry. All three techniques have proved beneficial with each having its own peculiar advantages. 
Our data entry ‘bugaboo’ is less apparent as we now can enjoy our system and only occasionally find an omission or an error which has slipped through the cracks. However, there is still a great deal of data which we would like to have in our computer system. Our photographic records of collection objects remain, as do the detailed compositional analyses for the materials used in creating the objects. Having overcome the initial challenge of the data entry project, we can plan the subsequent ones with more confidence.