ELECTRONIC INFORMATION RESOURCES


4 ELECTRONIC INFORMATION RESOURCES DAVID F. FARR AND ELLEN R. FARR

DATABASE DESIGN
    Fields and Tables
    Table Relationships
    Relational Tables in Systematics
    Recursive Relationships
    Standardization of Data
GETTING STARTED
    Application Software
    User Interface
    Project Goals
CONCLUSIONS
APPENDIX
    Database Details
    Table Design for Specimen Data
    Core Data Structure for a Specimen Database

In recent years efforts of the world’s scientists, industrialists, and government policymakers to improve understanding, use, and conservation of biological diversity have exploded. Increased understanding is based largely on biodiversity studies that generate great amounts of data concerned with, among other things, the systematics and distribution of organisms. Many of the most useful analyses of that information will involve pooled data from many sources. At this time, much of the available information on biodiversity remains with the individual scientists who collected it. Providing a mechanism by which the broader community of biodiversity researchers can consolidate and then easily access those data to answer its questions is a challenge for data managers. Clearly computers will play a major role in that data management.

The needs of the biodiversity community could be met most easily either by having data entered at a central location or by providing scientists with a standard software application for recording their observations. Both are viable options that have been used successfully. In our experience, however, many of the researchers who generate valuable biodiversity information have not been amenable to those approaches, often because they view the programs as irrelevant to their needs and of little value in promoting their individual research goals. Given that situation, it is perhaps more appropriate to stress that centralized collection and management of information is not as important as ensuring that information is available, compatible, and comparable.

Virtually everyone at a modern research facility uses a personal computer for word processing; this is not the
case with data management. Until recently, PC database software was not very user friendly, and many scientists used it only for simple applications. As a result, many data that have remained in notebooks or appeared in research publications must be rekeyed to make them available electronically. Over the last few years, however, several manufacturers have introduced new programs that have improved the situation significantly, providing the scientist with new opportunities to manage his or her data using database software. Few mycologists have advanced skills in manipulating computer databases. Many have only an elementary knowledge of database design and data manipulation tools and lack readily available support. Our goal in this paper is to stimulate scientists to include computer databases in their data management strategies. We provide general guidelines that should help to ensure that the information in the databases they develop will be comparable as well as compatible with that in databases developed by others (see “Database Details,” in the Appendix, this chapter). Thus, more information will be integrated into large and readily available datasets necessary to meet the needs of the biodiversity community. In this chapter we provide a general overview of the way current software packages manage data and how they may be helpful to the museum curator or researcher. We also introduce some general concepts important in building computer databases, using a sample application to develop the concepts of database design. In the Appendix, we provide a detailed list of the entities that form the core of a specimen database—that is, the information that documents the occurrence of an organism in a given place at a given time (see “Core Data Structure for a Specimen Database,” in the Appendix, this chapter). The goals of individual scientists and those of the consumers of biodiversity information are compatible. 
A well-thought-out personal database management application can serve the needs of the researcher and provide valuable data for comprehensive biodiversity studies. Olivieri and colleagues (1995) have provided an excellent treatment of the vast array of issues involved in managing the information developed in biodiversity studies. Anyone interested in the broader aspects of this subject should consult that publication.

DATABASE DESIGN

FIELDS AND TABLES

Designing a database is best done from the top down. First the purpose of the database is determined, then the subjects that will be covered, and finally the specific bits of data that will be entered. Although a database can have broad coverage (e.g., maintaining all information available about a specimen), we recommend that initial attempts at design be more restricted. For this discussion we will assume that the purpose of the database is to manage a collection of photographs.

Among the best sources of information for the development of a database are the reports or notes that the investigator currently is generating. For example, a notebook entry might contain the following information:

Collection no. 3456, May 10 1996, roll 33-10, Pholiota terrestris, 1/30 sec

or

Specimen no. 3333, Septoria florida, conidia, KOH and phloxine, ×100, exposure reading 0, roll 34-11, tech pan, compound scope

The data in those entries appear to fall into logical groupings. For example, we can assign the information to three categories: name, negative, and roll. We will create a separate table for each of those subjects. We next need to decide which data elements should be assigned to each table (see “Table Design for Specimen Data,” in the Appendix, this chapter). For the roll table we will use fields for type of film and type of camera or microscope. The negative table will have fields for exposure, magnification, mounting medium, roll number, negative number, and structure observed. The name table will contain information on the name of the organism (in this case, a fungus). To simplify this discussion, we are putting the full name in one field, although in most cases one likely would place each element of a name (e.g., genus, specific epithet, family) in a separate field.

The entry in each table also will need a record number. That can be a single field or a combination of fields that uniquely identify each record. For the name table that would be an arbitrary number, and in the roll table the roll number could be the record number. There are a couple of possibilities for the negative table. One possibility would be to use an arbitrary number; another would be to use the combination of the roll number and negative number. We will use an arbitrary number.

General points important when selecting fields and building tables follow.

For fields:

1. The data in the fields should describe the subject of the table.
2. The data should be broken down into the smallest logical unit.
3. In general it is difficult to parse the data into too many fields.
4. The designated fields should be checked against field notes and reports to ensure that they cover all of the necessary information.

For tables:

1. Tables should only include fields that pertain to the same subject.
2. Tables should not contain fields that are intentionally left blank in many records. That condition often suggests that those fields should be put in a different table.
3. Different tables should not duplicate data except as needed to establish relationships (see the next section).

TABLE RELATIONSHIPS

We now have the basics of our application. The next step is to establish relationships between these tables, so that an application (e.g., production of a report) can use data from all of the tables. Relational databases for personal computers are now widely available and increasingly popular. We assume that readers are familiar with the concept of the relational database. Those who are not should consult one of the many books on this subject in a library or bookstore. The important point for this discussion is that those products have improved greatly the ease with which two or more different tables can be related to produce output containing data from the related tables. What once required significant programming to achieve now can be accomplished with essentially no effort. Different sets of data are stored in separate tables, the relationships among the tables are defined, and then the software uses the relationships to find the data requested.

To be related, two tables must share a data element. In addition, the data type (e.g., numeric, text) in that element must be the same. The data element (field) common to the negative and roll tables is the roll number. The negative and name tables do not share a data element and cannot be related. Therefore, we need to add a field to the negative table that will refer back to the name table. That field, SpecimenName, will be a number field to match the ID number field of the name table. With that adjustment, the name table can be related to the negative table. The resulting data display includes the name data.

We now have the ability to generate the types of reports we used as the starting point of this example,
which is one of the goals of a database project. We can do much more with the data, however. We quickly can determine what pictures we have of a particular structure, species, or collection or of a particular structure from a particular species. Reports can be produced in order by roll, by name, by structure, or by collection number. A primary goal of the table design for the photographs (or for any set of data) is to reduce the keyboarding of redundant information. If the roll and negative fields were in the same table, then the information about the roll would have to be entered for each negative. That is time consuming and increases the possibility of error. The same holds for the name table, but the name table also illustrates another goal of relational database design—reusability. A name table can be used in other applications. Two obvious possibilities are in a specimen label table and in a nomenclature table. A name is entered only once but is used in many places, providing for consistent use of the name through all applications.
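The three-table photograph design can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration under stated assumptions, not the authors' implementation: apart from SpecimenName, every table and column name here (name, roll, negative, roll_number, and so on) is hypothetical, the negative table is abbreviated to a few of its fields, and the two rows come from the notebook entries quoted earlier.

```python
import sqlite3

# Hypothetical schema: only SpecimenName is a field name taken from the text.
SCHEMA = """
CREATE TABLE name (
    id        INTEGER PRIMARY KEY,          -- arbitrary record number
    full_name TEXT NOT NULL                 -- full name kept in one field
);
CREATE TABLE roll (
    roll_number INTEGER PRIMARY KEY,        -- roll number doubles as record number
    film        TEXT,
    instrument  TEXT                        -- camera or microscope
);
CREATE TABLE negative (
    id              INTEGER PRIMARY KEY,    -- arbitrary record number
    roll_number     INTEGER REFERENCES roll(roll_number),
    negative_number INTEGER,
    structure       TEXT,
    SpecimenName    INTEGER REFERENCES name(id)  -- numeric link to the name table
);
"""


def build_photo_db():
    """Create the three tables in memory and load the two notebook entries."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany("INSERT INTO name VALUES (?, ?)",
                    [(1, "Pholiota terrestris"), (2, "Septoria florida")])
    # Roll details are entered once per roll, not once per negative.
    con.executemany("INSERT INTO roll VALUES (?, ?, ?)",
                    [(33, None, None), (34, "tech pan", "compound scope")])
    con.executemany("INSERT INTO negative VALUES (?, ?, ?, ?, ?)",
                    [(1, 33, 10, None, 1), (2, 34, 11, "conidia", 2)])
    con.commit()
    return con


def pictures_of(con, full_name):
    """List the negatives for one name, pulling data from all three tables."""
    cursor = con.execute(
        """
        SELECT n.full_name, g.structure, g.roll_number, g.negative_number, r.film
        FROM negative AS g
        JOIN name AS n ON n.id = g.SpecimenName
        JOIN roll AS r ON r.roll_number = g.roll_number
        WHERE n.full_name = ?
        ORDER BY g.roll_number, g.negative_number
        """,
        (full_name,),
    )
    return cursor.fetchall()
```

Calling `pictures_of(build_photo_db(), "Septoria florida")` returns the one matching negative together with its roll's film type, even though the film was typed only once, in the roll table.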

RELATIONAL TABLES IN SYSTEMATICS

Now that we have built an application, we can review other areas where the methodology may be helpful. A specimen table should be restricted to information relating to the collecting event: who, what, where, and when. Information dealing with morphological features is best placed in a separate table. An obvious component of nomenclature information is the original citation. Although we can assume that every name will have a literature citation, we put the citation in a separate table rather than in the name table to facilitate reusability. Literature citations often are maintained in a bibliographic database. By establishing a relationship between the bibliographic table and the name table, it is easy to use a record from the bibliographic database in the name table. Sometimes multiple taxa are described in a single publication. With a bibliographic table, a full citation need only be entered once and then can be linked to all appropriate records in the name table. A separate table can be used to determine the class and order for the names in the specimen table.

RECURSIVE RELATIONSHIPS

A nomenclature database is often of interest to systematists. A recursive relationship, in which a table is related to itself, can be useful in this regard. To illustrate, consider a simple table that contains only three fields: an ID
number (“NameId”), a name, and a synonym. Note that the synonym field is a number, whereas the name field is a text field. The synonym field actually refers to an already-existing record in this same table. To display the actual name in the synonym field you need to relate the table to itself. With this design you can list only the accepted names in the table, the accepted names and synonyms, or only the synonyms. An additional field can be added to the table to control the sorting of the synonyms. Let us call this field “SynonymType.” It will have three values, or text designations: b for basionym, o for obligate synonym, and t for taxonomic. Using that field to sort the synonyms will put them in the standard order used in publications. By adding a fourth field to the table, which contains the number that is the link to a literature citation table, it is possible to maintain a nomenclature application. Recalling the theme of reusability of data, that name table now can be used to update the names in the specimen table. Table design is complex, and, as mentioned previously, it is also somewhat subjective. For more information, consult one of the many books on the subject (see also “Table Design for Specimen Data,” in the Appendix, this chapter). In addition, the data models from the Natural Science Collections Alliance (formerly The Association of Systematic Collections) and the Common Data Structure for European Floristic Databases (Biodiversity and Biological Collections Web Server, Appendix III) provide examples based on biological attributes. As with documents from any specialized field, those examples may be rather opaque in the absence of previous experience in information modeling.
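The recursive relationship can be sketched as a self-join, again using Python's sqlite3 module. The sketch rests on assumptions that the text does not spell out: that the synonym field of a synonym's record holds the NameId of its accepted name, and that accepted names leave the field empty. The names themselves ("Aus bus" and so on) are placeholders, not real taxa.

```python
import sqlite3


def build_name_table():
    """Create a self-referencing name table with placeholder names."""
    con = sqlite3.connect(":memory:")
    con.execute(
        """
        CREATE TABLE name (
            NameId      INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            synonym     INTEGER REFERENCES name(NameId),  -- assumed: NULL = accepted name
            SynonymType TEXT CHECK (SynonymType IN ('b', 'o', 't'))
        )
        """
    )
    con.executemany(
        "INSERT INTO name VALUES (?, ?, ?, ?)",
        [
            (1, "Aus bus", None, None),   # accepted name
            (2, "Aus dus", 1, "t"),       # taxonomic synonym of record 1
            (3, "Yus bus", 1, "o"),       # obligate synonym of record 1
            (4, "Xus bus", 1, "b"),       # basionym of record 1
        ],
    )
    con.commit()
    return con


def accepted_names(con):
    """Names whose synonym field is empty."""
    rows = con.execute(
        "SELECT name FROM name WHERE synonym IS NULL ORDER BY name")
    return [row[0] for row in rows]


def synonyms_of(con, accepted):
    """Relate the table to itself to list the synonyms of one accepted name."""
    rows = con.execute(
        """
        SELECT s.name, s.SynonymType
        FROM name AS s
        JOIN name AS a ON a.NameId = s.synonym   -- same table under two aliases
        WHERE a.name = ?
        ORDER BY s.SynonymType                   -- b, then o, then t
        """,
        (accepted,),
    )
    return rows.fetchall()
```

Because the codes b, o, and t happen to sort alphabetically in the intended order, a plain ORDER BY on SynonymType is enough here; a different coding scheme would need an explicit sort key.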

STANDARDIZATION OF DATA

Although field structure may be the most important aspect of a database project, failure to standardize both the format and the content of the data entered can greatly reduce the value of the database. In fact, when converting data from one application to another, a lack of standardization causes the most trouble. Entering data into a database does not confer on them a higher level of precision than they actually have. To maintain the standards of data entered, one can edit the data routinely, or one can enforce standardization at the time of entry. The first approach requires only a minimal understanding of the software being used but depends on good intentions, which almost always fail. The second requires more computer expertise but also delivers the desired level of standardization more reliably. Standardizing data has both an immediate return and long-term benefits. For example, when a database is queried for a
particular piece of information, it should produce all of the records with those data. That will not happen, however, if the name of the item has not been spelled the same way in all records. Well-standardized data make feasible the inclusion of data in larger datasets or combinations of datasets. Both format and content should be standardized as data are entered. The format is controlled through the software application, which may check, for example, that the date always is entered in the same way (e.g., YYYY MM DD). Another major use of related tables is to verify the data being entered into the main table. Tables of that type often are called “lookup tables” or “authority tables.” A common example is the use of a table with standardized geographic or political names, such as states, provinces, cantons, districts, and so forth. In the United States, a state table can be used to verify the data added to a field in the specimen table. For example, if a user sets up a relationship between the state field in the specimen database and a table listing all of the states of the United States, most software only will allow the user to enter values for the state field that are in the state table. That is a strong tool for ensuring consistent data entry without the need for post–data entry editing. Many of the fields in a specimen database can benefit from the use of authority tables. Generic names, author names, and place names are obvious candidates. Authority tables can be obtained from a colleague or institution that already has built such a table available for outside use. The other option is to build the lookup table yourself by entering items from a standard reference into an authority table either all at once or as the data are entered. With the latter, procedures for adding new values to the authority table will have to be set up during the early stages of the project. 
On a project of any size that involves a continuing long-term effort, authority tables significantly increase the efficiency of data entry. It is difficult to overstate their importance.
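Enforcing standardization at the time of entry can be sketched with an authority table and a format check in SQLite. The table and column names are hypothetical, the state list is cut down to the two postal codes used as examples in the Appendix, and the YYYY-MM-DD pattern is just one of the date formats the chapter mentions.

```python
import sqlite3


def build_specimen_db():
    """Specimen table whose state field is checked against an authority table."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        PRAGMA foreign_keys = ON;  -- SQLite enforces foreign keys only when asked
        CREATE TABLE state (code TEXT PRIMARY KEY);   -- authority (lookup) table
        INSERT INTO state (code) VALUES ('MD'), ('VA');
        CREATE TABLE specimen (
            id        INTEGER PRIMARY KEY,
            state     TEXT REFERENCES state(code),
            collected TEXT CHECK (collected GLOB
                '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')  -- YYYY-MM-DD
        );
        """
    )
    return con


def add_specimen(con, state, collected):
    """Try to add a record; return False if the database rejects it."""
    try:
        with con:  # commits on success, rolls back on error
            con.execute(
                "INSERT INTO specimen (state, collected) VALUES (?, ?)",
                (state, collected),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this setup `add_specimen(con, "MD", "1996-05-10")` succeeds, while a record with the state spelled out as "Maryland" or the date written as "May 10 1996" is refused, so no editing after data entry is needed for those two fields.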

GETTING STARTED

APPLICATION SOFTWARE

We have talked about using a single database product to handle many data manipulation needs. In many cases, however, the literature citations may be in bibliographic software, specimen-label data in a database, and morphological data in a spreadsheet. The downside of using different products is the difficulty of reusing the data for different purposes. For example, it can be difficult to integrate literature citations in a specialized bibliographic database with a nomenclatural database. The downside
to using a single software package for different applications is that the applications are not likely to be as sophisticated as those developed for a single purpose. As we move toward electronic publication and real-time display of data over the Internet, curators and investigators whose data are already in a suitable format will be at an advantage: for them, those activities become an extension of existing procedures, with no need to reformat or rearrange the data.

USER INTERFACE

A user interface controls how data are edited, added to tables, and displayed. This is very much a function of the software and cannot be discussed adequately here. Most of the newer database packages simplify the creation of data-entry screens and reports. As in table design, the interface can be simple or complex. Data managers should start simple and add capabilities as needed.

PROJECT GOALS

Goals for a database project should be realistic, focusing closely on what really is needed for a research project as opposed to what would be “nice” for a project. Many projects get stalled after overzealous beginnings. The biggest culprit is the effort required to collate and enter the information. Even with data entry assistance, time demands for prescreening the information can be high. A small, well-defined database can be integrated into a work routine. Understanding how the database supports data maintenance will help the investigator determine how to expand it to include other activities.

CONCLUSIONS

Modern database software greatly increases the ease with which applications can be developed by end users. If fields are delimited carefully, applications can be developed that meet the researcher’s needs yet provide data that are generally useful and easily transferable to the wider community involved in biodiversity-related studies. Although the software has improved and is easier to use, it will not immediately meet all data management needs. If expectations are reasonable, chances of successful implementation are improved greatly. If designed properly, database tools will be useful throughout a career; thus, they are worth an investment of time and patience. Relational databases provide logical procedures for building an application table by table over time as needs arise and allow everyone to design an application that best meets the scientists’ needs on a daily basis.

APPENDIX

DATABASE DETAILS

In recent years many organizations and institutions have debated, developed, and implemented standards for storing and exchanging specimen data, including conversion of data for use in other databases. Those efforts are ongoing, and although some common principles appear to be emerging, it is difficult to promote a definitive guide for exchange of specimen data. In this appendix we discuss aspects of table design that can be varied to fit particular uses and that may facilitate data exchange. We also suggest a core data structure for mycological specimens. The list of fields is our design, but we have drawn inspiration from the work of many ongoing projects and from our interactions with experts in the field. This core data structure is intended to provide a model framework that can fulfill several functions:

1. Serve as a basic exchange format. We describe a flat file structure with field names, field definitions, and maximum field length. Both the recipient and the donor of data must be aware of the requirements.
2. Serve as a point of departure for communication about the exchange of data. The donor of the data can say, “The data that I am sending differ from the basic exchange format in the following ways ...”
3. Serve as a file structure for collecting and storing specimen information for various purposes.

A core specimen-data record includes the “what, who, when, and where” information (the identity of the specimen, the collector, the date, and the place) that enables a specimen to serve as a voucher for observations that support conclusions in biodiversity studies.

TABLE DESIGN FOR SPECIMEN DATA

Field structure is the most important aspect of database design for facilitating conversion to other databases. Table design is more subjective, and different designs may be better for particular purposes. In the text examples we based tables on different subjects. Tables also can be based on different events in an application. With specimen data two events can be distinguished: identification
and collecting. The fields associated with such events are assigned to the two right-hand tables. In that layout all of the data involved with the collecting event are treated as one record. The utility of that arrangement when field collecting involves many collections from a few localities is easily imagined. The data for each locality are entered only once, and each specimen record can be referred to that record easily. Consistency in data entry is assured. The identification table includes the scientific name as well as the name of the person who identified the specimen and the date of identification. In fact, that combination of events may not be common enough to justify grouping the fields—that is, the reusability of any given record in the table may be too low to justify the effort required to maintain the table. Another approach to the identification table would be to replace the “Rec. No.” field with the “Coll. No.” field from the specimen table. That is little different, however, from putting those fields directly in the specimen table. A third approach is to include both the Rec. No. and Coll. No. fields in that table. By changing the Rec. No. and keeping the Coll. No. the same, you can have multiple records with the same Coll. No. The table then would become the source of information not only for the original determination, but also for all of the subsequent determinations and annotations. The Rec. No. field in this case automatically increases with each new determination, and thus, the records in the table are reusable. If an additional table is inserted between the specimen table and identification table, then both the reusability and historical determination goals are achieved. The additional table would have two fields: the Coll. No. field from the specimen table and the Rec. No. field from the identification table. In that way reusable records in the identification table can be connected to one or more specimen records. 
In other words, one specimen record can be connected to one or more unique identification records.

Although the theoretic aspects of table design are quite interesting, practical aspects also must be considered. Simply put, the more complicated the table design, the more difficult it will be to construct a usable interface between the user and the tables. Current software is helpful in that area but has limited resolving power. The use of authority tables (see “Standardization of Data,” earlier) is incorporated easily into table design. Addition of intermediate tables, however, requires increased knowledge of the more esoteric parts of the software.

Another way of handling the name in a specimen table is to have fields for all of the components of the name, each of which is related to a table with names for that component. There, the components of the name can be reused but not the whole name. Although that design includes several tables, which are essentially authority tables, implementation would be easy. The locality data could be handled in a similar way by putting all of the locality fields in the specimen table and using authority tables for the actual data.

A well-designed database should, at a minimum, achieve the following:

1. Well-designed fields
2. Incorporation of authority tables
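The intermediate-table arrangement described in this section, in which one specimen can carry several determinations and one identification record can serve several specimens, can be sketched as follows. The column names coll_no and rec_no stand in for the Coll. No. and Rec. No. fields; all other names, the determiners, and the taxa are hypothetical placeholders.

```python
import sqlite3


def build_determination_db():
    """Specimen and identification tables joined through an intermediate table."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE specimen (coll_no INTEGER PRIMARY KEY);
        CREATE TABLE identification (
            rec_no     INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            determiner TEXT,
            det_date   TEXT
        );
        -- Intermediate table: just the two linking fields.
        CREATE TABLE specimen_identification (
            coll_no INTEGER REFERENCES specimen(coll_no),
            rec_no  INTEGER REFERENCES identification(rec_no),
            PRIMARY KEY (coll_no, rec_no)
        );
        INSERT INTO specimen VALUES (1), (2);
        INSERT INTO identification VALUES
            (1, 'Aus bus', 'Doe, J.', '1996-05-10'),
            (2, 'Aus cus', 'Roe, R.', '1998-03-02');
        -- Record 1 is reused by both specimens; specimen 1 was later redetermined.
        INSERT INTO specimen_identification VALUES (1, 1), (2, 1), (1, 2);
        """
    )
    return con


def determinations(con, coll_no):
    """Return the determination history of one specimen, oldest first."""
    cursor = con.execute(
        """
        SELECT i.name, i.determiner, i.det_date
        FROM specimen_identification AS si
        JOIN identification AS i ON i.rec_no = si.rec_no
        WHERE si.coll_no = ?
        ORDER BY i.det_date
        """,
        (coll_no,),
    )
    return cursor.fetchall()
```

The cost noted in the text shows up even in this small sketch: every report now needs a two-step join, which is the kind of added complexity a data-entry interface has to hide from the user.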

CORE DATA STRUCTURE FOR A SPECIMEN DATABASE

Each entry gives the field name, the short field name in parentheses, the maximum size, the data type, and whether the field is required. Format, Standard, Reference, and Comments are listed only where the source specifies them.

Taxonomic Name

Genus (Genus)
    Size: 26. Type: Text. Required: Yes.
    Standard: Farr et al. 1979, 1986; Index of Fungi; Greuter et al. 1993
    Reference: Bisby 1994; Greuter et al. 2000

Species (Species)
    Size: 32. Type: Text. Required: No.
    Standard: Index of Fungi; Saccardo 1882–1972; Reed and Farr 1993
    Reference: Bisby 1994; Greuter et al. 2000

Rank (Rank)
    Size: 6. Type: Text. Required: No.
    Format: var., f. subsp., subsp., etc.
    Reference: Bisby 1994; Greuter et al. 2000

Subspecific Epithet (SubSpEp)
    Size: 32. Type: Text. Required: No.
    Reference: Bisby 1994; Greuter et al. 2000
    Comments: See Species.

Author (Author)
    Size: 100. Type: Text. Required: No.
    Standard: Brummitt and Powell 1992
    Reference: Bisby 1994; Greuter et al. 2000
    Comments: This is the author at the lowest rank (binomial or trinomial). If there are more than two authors then “et al.” is used for all authors after the first. That recommendation comes from the International Code of Botanical Nomenclature (Greuter et al. 2000).

Determiner (Determ)
    Size: 55. Type: Text. Required: No.
    Comments: See Collector for additional details.

Determine Date (DeterDate)
    Size: 11. Type: Text. Required: No.
    Comments: See Collector Date1 for details.

Locality Data

Country (Country)
    Size: 25. Type: Text. Required: Yes.
    Reference: Times Atlas; ISO; Hollis and Brummitt 1992
    Comments: Because the field is required, use “unknown” for specimens lacking this information.

Country Subdivision1 (CntrSub1)
    Size: 35. Type: Text. Required: No.
    Standard: For states in the U.S., use postal code abbreviations (MD, VA, etc.).
    Comments: First political subdivision for a country. In the United States that would be State. It is not necessary to name the rank of subdivision (State, Province, Canton, District) if it is understood that it is the first major subdivision of a country.

Country Subdivision2 (CntrSub2)
    Size: 35. Type: Text. Required: No.
    Comments: Second political subdivision for a country. In the United States that would be County (or Parish in Louisiana).

Country Place Name (CntrPl)
    Size: 100. Type: Text. Required: No.
    Comments: The lowest identifiable named place not assignable to subdivisions 1 and 2. National Parks, Cities, etc.

Locality (Locality)
    Size: 250. Type: Text. Required: No.
    Comments: Specific information about collection site. “At the end of Fungus Lane,” or “3 miles from the intersection of Route 1 and Fungus Lane.”

Elevation1 (Elev1)
    Size: 5. Type: Numeric. Required: No.
    Format: Number without commas
    Comments: “Around 100 feet” would be expressed in meters as “around 30 meters” (if rounding is recommended). 0 (zero) implies sea level. Negative values indicate below sea level.

Elevation2 (Elev2)
    Size: 35. Type: Numeric. Required: No.
    Comments: This field is used to express a range of elevations. See Elevation1 for details.

Elevation Source (ElevSr)
    Size: 25. Type: Text. Required: No.
    Comments: Source of elevation data. Map, instrument, estimate, etc.

Elevation Unit (ElevUnit)
    Size: 1. Type: Text.
    Comments: Enter f for feet or m for meters.

Latitude (Latitude)
    Size: ?. Type: Numeric. Required: No.
    Format: Decimal degrees with north as positive and south as negative (e.g., -38.1234).
    Reference: Federal Geographic Data Committee 1998.
    Comments: Most mapping/GIS programs appear to use the decimal degree format. The biggest problem with decimal degrees is that converting from a value lacking seconds implies greater precision than was present in the original observation.

Longitude (Longitud)
    Size: ?. Type: Numeric. Required: No.
    Format: Decimal degrees with east as positive and west as negative.
    Reference: Federal Geographic Data Committee 1998.
    Comments: See Latitude for details.

Latitude Longitude Source (LatLonSr)
    Size: 25. Type: Text. Required: No.
    Comments: Source of latitude and longitude data. GPS, map, estimate, gazetteer, etc.
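The unit conversions behind the Elevation and Latitude/Longitude fields can be sketched in a few lines of Python. The function names are hypothetical, and the rounding rule (two decimal places when only minutes were recorded, four when seconds were recorded) is an assumption offered as one way to avoid implying more precision than the original observation had; it is not a rule stated by the authors.

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot


def dms_to_decimal(degrees, minutes=None, seconds=None, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    North and east are positive, south and west negative. Pass None for
    minutes or seconds that were not recorded in the original observation.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + (minutes or 0) / 60 + (seconds or 0) / 3600
    # Assumed rule of thumb: keep only as many decimals as the input supports.
    if seconds is not None:
        places = 4  # one second of arc is about 0.0003 degree
    elif minutes is not None:
        places = 2  # one minute of arc is about 0.017 degree
    else:
        places = 0
    value = round(value, places)
    return -value if hemisphere in ("S", "W") else value


def feet_to_meters(feet):
    """Convert an elevation in feet to the nearest whole meter."""
    return round(feet * FEET_TO_METERS)
```

For example, 38 degrees 7 minutes 24.2 seconds south converts to about -38.1234, whereas the same position recorded only to the minute is stored as -38.12 rather than with a misleading string of extra decimals.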

Collector

Collector (Coll)
    Size: 25. Type: Text. Required: Yes.
    Format: Last name, comma, first name. No spaces between initials.
    Comments: The name associated with the number series.

Team Collectors (TeamColl)
    Size: 55. Type: Text. Required: No.
    Format: Separate each name by semicolon. If there are more than three then use “et al.” for all following the first named collector. Example: Farr, D.F.; Farr, E.R.
    Comments: For additional individuals on collecting expedition. Our MSA Committee is divided about using a separate field for collector name(s) associated with number, but there is precedent in some other projects for doing this.

Collector Number (CollNum)
    Size: 20. Type: Alphanumeric. Required: No.

Submitter (Subm)
    Size: 25. Type: Text. Required: No.
    Format: Last name, comma, first name. No spaces between initials.
    Comments: Name of the person submitting the collection.

Collector Date1 (ColDate1)
    Size: 10. Type: Alphanumeric. Required: Yes.
    Format: YYYYMMDD or YYYY-MM-DD
    Comments: American National Standards Institute (McCallum 1986) standard.

Collector Date2 (ColDate2)
    Size: 11. Type: Alphanumeric. Required: No.
    Comments: See Collector Date1 for details. Used to express a range of dates.

Season (Season)
    Size: 10. Type: Text. Required: No.
    Comments: Use for nondate time periods (e.g., autumn, summer).

Details of Collection

Substratum (Substr)
    Size: 50. Type: Text. Required: Yes.
    Comments: Leaves. Decayed wood. Small branches. Use “unknown” if data are missing.

Habitat (Habitat)
    Size: 250. Type: Text. Required: Yes.
    Comments: Characterization of the collection spot. In mixed woods. On limestone outcropping. It would be useful for individuals or groups to agree on a standardized set of descriptive phrases for their particular needs. May need more space.

Host Genus (HGenus)
    Size: 26. Type: Text. Required: Yes.
    Comments: See Genus for details.

Host Genus Hybrid (HGenusHy)
    Size: 1. Type: Text. Required: No.
    Format: Lower case alphabetic x adjacent to the genus name. The alphabetic x substitutes in computers for the multiplication sign (×) specified by the International Code of Botanical Nomenclature. Printouts should use the multiplication sign when possible.

Host Species (HSpecies)
    Size: 32. Type: Text. Required: No.
    Reference: Index Kewensis (Royal Botanic Gardens 1997); Gray Herbarium Card Index.

Host Species Hybrid (HSpeciesHy)
    Size: 1. Type: Text. Required: No.
    Format: See Host Genus Hybrid.

Host Rank (HostRank)
    Size: 8. Type: Text. Required: No.
    Format: var., f. subsp., subsp., n-subsp. (nothosubspecies), etc.
    Reference: Bisby 1994; Greuter et al. 2000

Host Subspecific Epithet (HostSub)
    Size: 30. Type: Text. Required: No.

Host Common Name (HostCom)
    Size: 50. Type: Text. Required: No.
    Comments: Use to record the common name of the host if that is the only available name.

Housekeeping

Herbarium number (HerbNum)
    Size: 20. Type: Text. Required: Yes.
    Format: Acronym plus institutional number used to identify the specimen.
    Reference: Holmgren et al. 1990.
    Comments: This should be a unique number for the institution. Use Index Herbariorum codes.

Additional Fields

It has been suggested that a field to indicate the type status of the specimen and a field to indicate if a culture was made from the specimen should be included in the field list. Both of those categories extend the field list beyond the basic; the use of relational tables provides another mechanism for handling that information. If data about the type and the cultures are placed in separate tables, the specimen table can be related with either one or both, using the collection number to determine if the specimen is a type or has been cultured. If a separate field in the specimen table is used for those data, the information is not directly linked to the information in the culture table and, therefore, it is not automatically updated. For example, if a culture were removed from the culture table, then the data in the specimen table also would have to be changed.

Other fields have been suggested for inclusion but are clearly outside the scope of minimum required data. Individuals and institutions, however, should customize their databases to include all information relevant to their needs.

Type (Type)
    Size: 2. Type: Text. Required: No.
    Comments: This is used to indicate that the specimen has some status as a type. A two-letter abbreviation indicates which type (e.g., HO, IS).

Culture (Cult)
    Size: 1. Type: Text. Required: No.
    Comments: Used to indicate that a culture was made from the specimen.

ACKNOWLEDGMENTS. Considerable input regarding the core fields to be used in a specimen database was provided by Nancy Weber, Bob Fogel, Tom Volk, and Jim Ginns, our fellow members of a temporary committee of the Mycological Society of America (MSA) working on this project. The fields listed and associated commentary are a summary of our discussions. Scott Redhead reviewed a draft of the manuscript and also provided many helpful suggestions.