Computers I: A very boring problem


Professional Notes

vices to museums and collectors. The site of the new Centre will be on the University College campus in Bloomsbury, near the British Museum, National Gallery and National Portrait Gallery, and the new British Library, and within easy reach of the main railway stations for visiting scholars. As ‘conservation’ develops into ‘materials science’, dealing not only with the external aspects of preservation, but with the physical and chemical nature and behaviour of materials ancient and modern, the Institute of Archaeology is preparing to face the challenge of preserving a fast-vanishing past for our descendants. For further information contact Cathy Giangrande, Development Director at the Institute of Archaeology, 31-34 Gordon Square, London WC1H 0PY, UK.

Photo credit: Institute of Archaeology, University College London.

CATHY GIANGRANDE

Computers I
A Very Boring Problem

Writers of these notes are granted few words, an ideal condition for topics of such numbing tedium as museums’ numbering systems and their impact upon computers’ digestion. These numbers are inescapable, infecting every object we beg, steal or borrow. They are justly viewed as a nuisance inflicted upon great minds by small ones, and yet they are necessary. It is the process of listing, counting, numbering and marking that makes a collection. It is the unique number, common to an object and its body of information, that binds them together; and it is by finding a number without an object that we know a thing is lost. For the organization of information and collections nothing is more vital; and so, in the jargon of database management, the ‘object number’ field is rightly designated a primary key.

The requirements for a numbering system are simple. Firstly, no two objects may have the same number. Secondly, the numbers must form an unbroken series so that, where there is a number 5, numbers 1, 2, 3 and 4 must also be present or accounted for; and, as a corollary, the beginning and current end of the series must be known. Thirdly, there must be no reason ever to change an object’s number, as, for example, when the object is reclassified or transferred from one administrative category to another. The series of positive integers, starting with 1, satisfies these criteria and also suits a computer as fish suit a gull. The author knows of one venerable institution that does, in fact, number its holdings consecutively from 1.

The wonderful simplicity of this idea is spoiled, to some extent, by the nature of objects themselves, which cluster to form sets and also divide into parts, and parts of parts. Parts of objects and members of sets, such as the panels of an elaborate altarpiece, demand individual records as to size, condition, subject-matter and photographic references. Hence each must have its individual number in addition to the number that ties the whole entity together and relates it to its parts. Thus museums are forced to use hierarchical numbering systems where, for example, ‘1234.6/B’ might designate the reverse side (‘B’) of the sixth part of the 1234th object to be catalogued.

Our professional forebears, too clever by far for our own good, have compounded the difficulties by: (1) instituting multiple systems within individual museums; (2) adding coded information (such as object classification, source, year of acquisition, etc.) to what ought to be a pure identification code; and (3) mixing digits with letters, abbreviations, roman numerals and miscellaneous punctuation. Sometimes even the sequence of elements is juggled: at The Museum of Modern Art, New York, ‘56.37’ designates both the 37th object borrowed in 1956 and the 56th acquisition of 1937, and every year one number (e.g. ‘89.89’) is assigned twice.

Every institution has its own, more or less eccentric, numbering system. Indeed most have several. Often a number of distinct collections, each with its own numbers, are housed in one place, stored and displayed in the same rooms. Often a single institution’s collection is divided along lines of discipline or source, the parts numbered according to differing rules. Many a museum has had as many as three successive systems, with up to three numbers, painted in various colours, on each of its older possessions.

The system of The Metropolitan Museum of Art, New York, is spreading among museums not otherwise committed, and thus becoming something of an industry standard. Its elements, in order, are: year of acquisition, ‘lot’ number, item serial number and, where necessary, a part number within the item. All the elements are numeric and they are separated by periods (full stops). The first two elements (year and lot) constitute the accession number and relate the object to the transaction through which it was acquired. Thus ‘69.1.1’ would designate the first (and perhaps the only) object in the first lot acquired in 1969. The first accession of the following year at the Metropolitan Museum of Art had to be numbered ‘1970.1’ because the museum was then 100 years old.

Some museums have been known to substitute ‘XX’ or some other non-numeric code for the year of a forgotten acquisition. Sometimes upper- or lower-case letters are used as ordinal numbers, so that the presence of ‘c’ implies the existence of an ‘a’ and a ‘b’ in the same series. Letters may also classify, as when ‘D’ indicates a ‘donated’ object without at all implying that there is a corresponding ‘C’. Some classifying characters determine how the rest of the number is to be interpreted: at The Museum of Modern Art ‘E.L.39.40’ would be the fortieth ‘extended loan’ of 1939 (year before item number), while ‘39.40’ would be the thirty-ninth acquisition of 1940 (item before year).

As a museum prepares to automate its collection records the numbering muddle becomes, quite unexpectedly, a thorn in the flesh, all the more irritating because such a trivial formality should not, by all rights, present a problem at all.
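The structure of the Metropolitan-style number described above (year, lot, item serial and optional part, all numeric and separated by periods) can be made concrete with a minimal Python sketch. The names `ObjectNumber` and `parse_object_number` are hypothetical, invented for this illustration and not taken from any museum system; the sketch assumes between two and four elements and makes no attempt to normalize two-digit years.

```python
from typing import NamedTuple, Optional, Tuple


class ObjectNumber(NamedTuple):
    """Hypothetical container for a Metropolitan-style object number."""
    year: int                    # year of acquisition, as written (69 or 1970)
    lot: int                     # 'lot' number within the year
    item: Optional[int] = None   # item serial number within the lot
    part: Optional[int] = None   # part number within the item, where needed

    def accession(self) -> Tuple[int, int]:
        """The first two elements identify the acquisition transaction."""
        return (self.year, self.lot)


def parse_object_number(text: str) -> ObjectNumber:
    """Split a period-separated, all-numeric number into its elements."""
    pieces = text.strip().split(".")
    if not 2 <= len(pieces) <= 4:
        raise ValueError(f"expected 2 to 4 elements in {text!r}")
    for piece in pieces:
        # Codes such as 'XX' for a forgotten year are rejected here.
        if not (piece.isascii() and piece.isdigit()):
            raise ValueError(f"non-numeric element {piece!r} in {text!r}")
    return ObjectNumber(*(int(piece) for piece in pieces))


print(parse_object_number("69.1.1"))
print(parse_object_number("1970.1").accession())
```

A number such as ‘XX.1.1’ raises an error in this sketch, which is one way of seeing why non-numeric substitutes for the year cause trouble once the records are automated.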
One of the essential preliminary steps is the design of record formats; and, whatever else may or may not be present in an object record, there is always an object number field. For renumbered collections there may also be one or more ‘old number’ fields to define. Defining a field involves assigning it a data type, if only by default. The assigned type determines how the field content will be represented (by bits in storage), how it will be interpreted when retrieved, and what editing or mathematical operations can be done with the data. Alas, there is no data type that suits a museum object number.
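The mismatch between common data types and object numbers can be shown in a few lines of Python. This is only a rough sketch: Python's `str`, `int` and `float` stand in for the ‘text’, ‘integer’ and ‘decimal’ field types of a database package, and the helper `accepts` is a hypothetical name introduced for the illustration.

```python
def accepts(convert, text):
    """Return True if the conversion function can read the text."""
    try:
        convert(text)
    except ValueError:
        return False
    return True


# Text: anything can be stored, but comparison is character by character.
text_order = sorted(["2", "1999", "39.5", "39.40"])
print(text_order)

# Integer: digits only, so a multi-part number is rejected outright.
print(accepts(int, "1999"), accepts(int, "69.1.1"))

# Decimal (float stands in here): a single point is read as a fraction,
# and a second point makes the input unreadable.
print(accepts(float, "39.40"), accepts(float, "69.1.1"))
print(float("39.40") < float("39.5"))
```

The text sort places ‘1999’ before ‘2’, the integer type refuses ‘69.1.1’, and the decimal type reads ‘39.40’ as a value smaller than ‘39.5’.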

The type called ‘text’ or ‘character’ will hold almost anything that can be typed but its left-to-right ‘alphanumeric’ sorting produces chaos, with ‘1999’ before ‘2’ and all punctuation taken nonsensically into account (Note 1). This is serious because if one cannot sort by object number one cannot see what’s missing. The ‘integer’ type (for numbers without decimal points) will tolerate nothing but digits. The ‘decimal’ type accepts digits and a single decimal point (or period) but rejects letters, other characters or a second decimal point: in short, any input that cannot be read as a decimal number. This type would, of course, accept ‘39.40’ but sort it as 39 and 40 hundredths, well ahead of ‘39.5’.

Assuming that a numbering scheme has any logic to it, a solution can always be found; but there is a price to be paid, not once but in perpetuity.

If object numbers are written with the most significant part first (as at the Metropolitan Museum of Art) then a ‘brute force’ solution may work. This involves calculating the maximum possible size of each part of the object number and ‘padding’ each part with leading zeros or blanks. Thus ‘1970.1.1’ would be entered in a text field as, perhaps, ‘1970.0001.00001.0000’, which would sort properly and still correspond, in a way, to the number painted on the object. Obviously this procedure would be a burden and a source of error forever. Where the most significant part of the number is not first (as at The Museum of Modern Art) a brute force solution is even more outlandish, entailing a transposition of parts.

Other solutions involve special programming at an initial cost plus perpetual operating overhead. It is possible, though not as simple as it may seem, to automate the brute force transformation so that, when a standard object number is entered, an expanded form is stored; and, when the record is retrieved, the original input is restored. No off-the-shelf software can do this for a museum.

A more logical approach (requiring rather careful input) is exemplified by the system at the Indian Pueblo Cultural Center, Albuquerque, designed by R. G. Chenhall (Note 2). This numbering system is similar but not identical to that of The Metropolitan Museum of Art. Each number includes a prefix, ‘L’ for extended loan, ‘D’ for donation or ‘P’ for purchase. Since the L, D and P series all start afresh each year with ‘1’, the letter prefix is necessary to identify an object. Each of the four parts of the object number is entered as a separate data category in its own field. The numeric parts are stored as integers (so that ‘99’ sorts properly before ‘100’), while the letter prefix occupies a one-character text field. Separate data fields give a user the option of sorting retrieved records by letter, then year (e.g. as ‘P.85.1.99’) or by year, then letter (e.g. as ‘85.P.1.99’). A major drawback to this technique is that retrieval of an individual item by number is no longer a simple query (FIND (number = ‘P.85.1.99’)) but a boolean query (FIND (letter = ‘P’) & (year = 85) & (accession = 1) & (item = 99)).

Some more elaborate and expensive database systems would allow the object number to be recorded in a ‘composite field’ (or ‘data structure’) containing the desired number of subfields. Then, at least in principle, the object number might be entered and retrieved as a unit, stored as a set of independent parts and sorted by any desired combination of parts taken in any useful order.

Notes
1. See ‘The A B Cs of Computing’ in this column, Vol. 7 (December 1988), pp. 389-391.
2. R. G. Chenhall and D. Vance, Museum Collections and Today’s Computers (Westport, Greenwood Press, 1988), pp. 111-118.
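As a rough illustration of two ideas discussed in this note, the ‘brute force’ padding and the treatment of an object number as a set of independent numeric parts, the following Python sketch may help. The function names `expand` and `composite_key` are hypothetical, and the element widths in `WIDTHS` are an assumption read off the padded example ‘1970.0001.00001.0000’; a real collection would have to work out its own maxima.

```python
WIDTHS = (4, 4, 5, 4)  # assumed widths: year, lot, item, part


def expand(number: str) -> str:
    """Pad each element with leading zeros; absent elements become zeros."""
    parts = number.split(".")
    if len(parts) > len(WIDTHS):
        raise ValueError(f"too many elements in {number!r}")
    parts += ["0"] * (len(WIDTHS) - len(parts))
    for part, width in zip(parts, WIDTHS):
        if not (part.isascii() and part.isdigit()) or len(part) > width:
            raise ValueError(f"cannot pad element {part!r} of {number!r}")
    return ".".join(part.zfill(width) for part, width in zip(parts, WIDTHS))


def composite_key(number: str) -> tuple:
    """Treat the number as a tuple of integers, one per element."""
    return tuple(int(part) for part in number.split("."))


numbers = ["1970.10.1", "1970.2.1", "1970.1.1"]
print(sorted(numbers))                     # plain text order
print(sorted(numbers, key=expand))         # order of the padded form
print(sorted(numbers, key=composite_key))  # order of the integer tuples
```

In this sketch the number is still entered and displayed in its painted form, while the padded string or the integer tuple serves only as a sort key. That is one possible reading of the automated transformation; it sidesteps the harder problem of restoring the original input from a stored expanded form.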

DAVID VANCE

Computers II
Cologne Computer Conference, 7-10 September 1988

The recent growth in the computing activities of historians and social scientists, which had lagged behind those of other disciplines only in organizational respects, has now reached a pace which cannot fail to strike observers. Computers are rapidly becoming ubiquitous in lecture-rooms, classrooms and libraries, and will soon be as natural a research and teaching tool as any other. It was as a natural response to this situation that three organizations closely involved with aspects of this development came together to hold a joint international conference.

The International Conference on Data Bases in the Humanities and Social Sciences (ICDBHSS) has regular biennial conferences, to date all held in the USA, on all aspects of database technology as applied to the humanities and social sciences. The International Federation of Data Organizations for the Social Sciences (IFDO), a worldwide association of major social science data archives and services, has for a decade been organizing topically centred conferences on methodological and technological aspects of social research. The Association for History and Computing, a forum for all aspects of computing in the historical disciplines, is the newcomer to this area, having been founded only in 1986 but already with a membership of over 600.

The collaboration of these three bodies produced a major international event, with over 500 participants. It was fitting that this event should have been held as part of the celebrations of the 600th anniversary of the University of Cologne. Over 200 papers and demonstrations were arranged for the conference, spread over four days. In order to make this possible, the programme was organized in parallel sessions (up to seven at a time, excluding the demonstrations).

It is therefore impossible for a single participant to report adequately on the whole event, though the capacious volume of abstracts that was supplied can help, and the planned publications based on the presentations themselves (not surprisingly this will run into several volumes) should provide a fuller record. The papers and demonstrations covered all the main areas of current interest. The programme was organized into several main themes: ‘New Data Bases’ included sessions on archaeology, regions and communities, individuals, texts, economic activities, juri-