The Intersection of Virtual Organizations and the Library: A Case Study

by Jake Carlson and Jane Kinkus Yatcilla. Available online 19 March 2010

The proliferation of virtual organizations is changing the nature and practice of research. These changes present a challenge to Libraries, as their traditional roles and services do not translate well to virtual organizations. However, virtual organizations also offer opportunities for librarians to participate in shaping the next generation of information discovery and management systems. This case study describes a project conducted by the Purdue University Libraries to assist CAT-hub, a developing virtual organization, in creating a set of tags for information discovery and site navigation. The methodology and processes used to develop and implement these tags are described in detail.

Jake Carlson, Data Research Scientist, Purdue University Libraries, West Lafayette, IN, USA; Jane Kinkus Yatcilla, Mathematical Sciences Librarian, Purdue University Libraries, West Lafayette, IN, USA.

INTRODUCTION

This article describes a collaboration between a developing virtual organization, CAT-hub, and the Purdue University Libraries. The Center for Assistive Technologies (CAT), the developer of CAT-hub, is an interdisciplinary research center housed within the Regenstrief Center for Health Care Engineering at Purdue University. The CAT-hub site will be used to bring together diverse communities with varying interests and expertise so that they can share their information, resources, knowledge, and opinions in an online environment. The administrators of CAT-hub recognized early on the challenge of constructing mechanisms to help users navigate the CAT-hub platform effectively and discover content relevant to their needs. The Center for Assistive Technologies initiated a partnership with the Libraries to address this challenge by developing a foundational set of descriptive tags for use within CAT-hub. These tags were developed from a controlled vocabulary designed by the Libraries to reflect the types of content likely to be present in CAT-hub and the diverse nature of CAT-hub's users.

VIRTUAL ORGANIZATIONS AND LIBRARIES

The advent of high-performance computing, ready access to high-bandwidth networks, and the capacity to store massive amounts of data are driving dramatic changes in the nature and practice of scientific research. These technologies, collectively referred to as cyberinfrastructure, are providing the tools and capabilities that enable researchers to ask new questions and explore areas of inquiry that were previously impossible to pursue or, in some cases, even to conceive of. The proliferation of cyberinfrastructure in turn is fueling the development of virtual organizations. The National Science Foundation defines virtual organizations broadly as “a group of individuals whose members and resources may be dispersed geographically and institutionally, yet who function as a coherent unit through the use of cyberinfrastructure.”1 This definition of virtual organizations encompasses collaboratories, e-Science or e-Research, virtual research environments, distributed workgroups or virtual teams, virtual environments, and other online communities. Research-based virtual organizations tend to be formed around a shared infrastructure, data, software, and other resources that enable scientific inquiry and analysis to take place online.

The impact of virtual organizations on scientific practice goes beyond supporting new types of research explorations. Virtual organizations are also having a dramatic effect on the culture of research and how it is practiced. Under this new paradigm, the traditional model of a solitary practitioner in a physical laboratory gives way to a more open, collaborative approach. In addition to providing tools and resources for any of their members to use, virtual organizations typically offer the means to communicate and share one's work with other members of the organization. Moreover, many large-scale virtual organizations have arisen that are centered on addressing large and complex problems or on

192 The Journal of Academic Librarianship, Volume 36, Number 3, pages 192–201

supporting a broad field of study that spans more than a single discipline. Membership in these virtual organizations may be open to researchers from multiple disciplines to provide different perspectives on the issues under examination. Thus, in addition to enabling researchers across different geographies and time zones to collaborate, virtual organizations are enabling researchers from different disciplines to come together to form interdisciplinary research teams.

The nanoHUB, a virtual organization hosted by Purdue University, provides an example of how virtual organizations are affecting the nature and practice of scientific research.2 The nanoHUB supports research and education in nanotechnology, the study of the control of matter at a microscopic or atomic level, by providing simulations, tools, scholarly papers, lectures, learning modules, and other resources for users through a portal-like interface. However, nanoHUB is more than just a gateway to resources; it also functions as a centralized location in cyberspace where researchers, students, professionals, and others can interact and collaborate with one another. NanoHUB provides users with the ability to hold online meetings, to form their own groups, to upload their own content or resources, and to rate or review existing content. NanoHUB has proven to be very successful and boasts more than 90,000 users from all over the world.3

The proliferation of virtual organizations and the resulting changes in how research is being conceived, conducted, and conveyed are generating substantial challenges for academic libraries. Academic libraries were developed to support traditional models of scholarly practice and have been slow to adapt to changes. Collections offered by an academic library tend to be tied to the particular institution it serves, either because of the physical nature of the collection or because of licensing restrictions placed on electronic databases by vendors.
Staffing models in academic libraries tend to revolve around support for traditional disciplines, with librarians serving as subject specialists and as liaisons to academic departments. This model of library services and staffing breaks down when confronted by the trans-institutional and interdisciplinary nature of virtual organizations. For example, what if a team member's home institution does not subscribe to a needed electronic resource? Who in the libraries will be responsible for identifying and addressing the needs of an interdisciplinary research center?

However, the widespread arrival of virtual organizations also provides librarians with opportunities to develop more engaged relationships with the researchers they serve. Virtual organizations and their members are not only consumers of information; they are producers as well. The development of systems and tools that enable meaningful connections and linkages between disparate sets of information for the purposes of discovery, collection, and analysis is a common challenge for virtual organizations.4 Librarians have opportunities to address this challenge by applying their skills and experiences to the development of discovery and management tools for communities within virtual organizations.5 Technology professionals are designing solutions to address these issues; however, as they lack the perspectives and skills of librarians, the solutions they devise may be incomplete or incompatible with user needs. As Judith Wusteman wrote in a recent editorial, “…librarians find the process of determining user requirements intrinsically more important and interesting than do many traditional computer scientists.”6

Librarians have a long history of taking disparate sets of information, describing and organizing them in ways that unify them into a coherent collection, and building and maintaining systems to make them discoverable and accessible for the long term. Although the nature of the information resources used and generated by virtual organizations can be quite different from the materials that librarians have traditionally worked with, many of the underlying information needs of researchers working in this new environment (discovery, access, description, and organization, for example) remain the same. Furthermore, virtual organizations are still a developing phenomenon, so there is still time for librarians to influence their eventual forms and capabilities in ways that would facilitate the wider exchange and dissemination of information.

THE CAT-HUB PLATFORM

The purpose of the Center for Assistive Technologies is to improve the lives and help enable the independence of those with disabilities through the development and use of assistive technologies. Its stated goals include the following: improving communication and collaboration on assistive technologies between users, caregivers, vendors, developers, and researchers; identifying best practices for AT product development and evaluation; and increasing visibility for assistive technologies and related issues.7 To achieve these goals, the Center for Assistive Technologies is developing CAT-hub, a Web-based platform to house a virtual community composed of users, developers, providers, medical professionals, and others involved with or interested in assistive technologies. The CAT-hub platform is designed to provide a means for these different groups to connect, communicate, and collaborate with each other online and to enable these groups to discover, access, share, develop, and respond to content relating to assistive technologies.

The underlying software powering the CAT-hub platform is HubZero, which was developed by the Network for Computational Nanotechnology at Purdue University. The HubZero software serves as the foundation for several virtual organizations already in existence, including the nanoHUB, and is planned as the foundation for several new “hubs” in the near future. One of the features of the HubZero software is the ability for the users of a hub to use tags to describe content generated within the virtual community or external content uploaded into the hub. These tags are used as descriptors for the content within the hub and help facilitate site navigation and discovery of content. Tags may be assigned to any page, content, or resource within the hub. Users may also add tags to existing content when providing feedback about the content.
As with the other virtual communities running on HubZero software, CAT-hub would employ user tagging as a principal mechanism for content discovery and site navigation.
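The role tags play in discovery can be pictured as a simple inverted index from tag to tagged items. The following is a minimal sketch under stated assumptions, not a description of how HubZero stores or queries tags; the class name `TagIndex`, its methods, and the resource identifiers are all hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index mapping each tag to the resources it describes.

    Hypothetical illustration only; this is not HubZero's data model.
    """

    def __init__(self):
        self._by_tag = defaultdict(set)

    def assign(self, resource_id, tag):
        # Any page, content item, or resource may carry any number of tags.
        self._by_tag[tag].add(resource_id)

    def discover(self, tag):
        # Return the resources described by a tag, sorted for stable output.
        return sorted(self._by_tag.get(tag, set()))


# Made-up identifiers standing in for hub pages and resources.
index = TagIndex()
index.assign("page/screen-readers-overview", "vision")
index.assign("resource/hearing-aid-funding", "hearing")
index.assign("resource/hearing-aid-funding", "funding")
```

A lookup such as `index.discover("hearing")` then returns every item carrying that tag, which is the behavior that makes the quality and consistency of the tags themselves so important.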

THE STRENGTHS AND WEAKNESSES OF TAGS

Open or free end-user-generated tags and their resulting folksonomies, the bottom-up classification structures composed of all tags on a site, have been a feature of various social networking sites since around 2004,8 and tagging has many proponents in the information world. Clay Shirky9 argues that the internet has made the need for taxonomies and ontologies, the formal classification systems used to organize information, almost obsolete. Instead, user-produced tags offer a much more powerful and affordable means of organizing and finding information online. Hammond et al.10 described tagging as “noisy” but far more flexible, abundant, and much cheaper to generate than taxonomies. Hammond et al. also point out that unlike taxonomies generated by information professionals or field experts, tags represent the real-world terms that individuals seeking information would use to find content. Indeed, the flexibility of tags and the ability to use more colloquial terms for description and searching are often cited as the biggest advantage of tagging over more

May 2010 193

prescribed systems such as controlled vocabularies or taxonomies. The ability to tag documents, photos, and other items provides a deeper level of engagement for users and a vehicle for highly personalized information management.11 Further, folksonomies are more inclusive than taxonomies precisely because they reflect the vocabulary of the users and are able to incorporate new concepts and terms far more quickly.

Library science professionals have expressed some trepidation about user-driven tags and the resulting folksonomies as an effective mechanism for discovery. Because folksonomies are composed of tags created by users, they are inherently uncontrolled, imprecise, and chaotic. Tags are often highly personalized, reflecting the needs and perspectives of the individual user who created them, and therefore may not function well as a means of describing content to other users. Many of the tags appearing in folksonomy-driven Web sites are believed to be “single-use,” meaning they appear only once on the site, limiting their utility for information discovery.12 User-driven tags often use different words to mean the same thing (synonymy) and use words that have multiple meanings (polysemy). Further, folksonomic sites generally do not provide formal rules for the development and use of tags, so there is potential for a folksonomy to burgeon with misspellings, slight variations, multiple tags for describing the same concept, singular and plural forms of the same term, nonstandard usage, and foreign words.13 Finally, tags in a folksonomy lack the hierarchical structures present in a taxonomy; no term is broader, narrower, or even related to another.
A folksonomy is essentially a flat, bottom-up vocabulary, which hinders the identification of meaningful relationships between two or more tags.14 These structural weaknesses of user-driven tags lead to ambiguities within folksonomies, making it difficult for users to ascertain whether they have missed relevant information in their searching.

In an article comparing tags and taxonomies, Macgregor and McCulloch emphasize that the problems with tagging primarily pertain to collaborative tagging and affect resource discovery among communities of users; that is, if tags are used only for personal information management, then there really is no problem with uncontrolled, user-generated tagging. However, the very sites where tagging is widely used, such as the photo storing and sharing site Flickr, are socially oriented sites that encourage the sharing of documents and tags. And so while the limitations of collaborative tagging understandably make information professionals unwilling to abandon taxonomic-based systems of knowledge organization and discovery, the ease of use, flexibility, and widespread adoption of tagging by popular social networking sites ensure that both types of approaches will continue to coexist. Macgregor and McCulloch15 conclude by arguing that librarians, given their knowledge and expertise in information retrieval issues, should take a leading role in conducting research to influence the continued development of collaborative tagging systems.

Currently, there are relatively few discussions in the literature on the potential of integrating elements of tags and taxonomies. Peterson16 takes a pessimistic view and asserts that a database that truly integrates both subject cataloging and folksonomy does not seem feasible at this time. Her assertion stems from the observation that traditional cataloging is rule-bound and limiting whereas folksonomies are open-ended, which ultimately renders them incompatible with each other.
Sun17 states that taxonomies can happily coexist with folksonomies but does not suggest any practical way to merge the two schemes. She recommends that users who have greater subject expertise spend some time adding folksonomic tags to important Web sites, such as health-related sites, to improve discovery by other users. However, Sun also asserts that nonexpert end-user tagging helps valuable health information become more accessible to the public, since the terms used by nonexperts tend to be more intuitive or employ more commonly used terminology.

Rosenfeld18 is perhaps most hopeful in his supposition that “treating [controlled vocabularies and folksonomies] as major parts of a single metadata ecology might expose a useful symbiosis: encourage authors and users to generate folksonomies, and use those terms as candidates for inclusion in richer, more current controlled vocabularies that can evolve to best support findability.”
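The symbiosis Rosenfeld describes can be sketched in a few lines: free-form user tags are normalized, matched against a controlled vocabulary where possible, and the remaining tags that recur are surfaced as candidates for human review. This is only an illustration under stated assumptions; the terms, the variant table, the function names, and the `min_uses` threshold are all hypothetical and are not drawn from any of the systems discussed here.

```python
from collections import Counter

# Hypothetical controlled terms and known variants, for illustration only.
CONTROLLED = {"hearing aids", "wheelchairs", "screen readers"}
VARIANTS = {
    "hearing aid": "hearing aids",    # singular form of a plural term
    "hearing-aids": "hearing aids",   # punctuation variant
    "wheel chairs": "wheelchairs",    # spacing variant
}


def normalize(tag):
    """Lowercase, collapse whitespace, then map known variants."""
    cleaned = " ".join(tag.lower().split())
    return VARIANTS.get(cleaned, cleaned)


def split_tags(user_tags, min_uses=2):
    """Return (counts of matched controlled terms, candidate terms).

    Candidates are unmatched tags used at least `min_uses` times, so
    single-use tags are left out of the review queue.
    """
    counts = Counter(normalize(t) for t in user_tags)
    matched = {t: n for t, n in counts.items() if t in CONTROLLED}
    candidates = sorted(
        t for t, n in counts.items() if t not in CONTROLLED and n >= min_uses
    )
    return matched, candidates


matched, candidates = split_tags(
    ["Hearing Aid", "hearing-aids", "wheel  chairs",
     "captioning", "Captioning", "mytodo"]
)
```

In this sketch the variant table absorbs the singular, plural, and spelling variations noted earlier, while the candidate list is only a queue for review; deciding whether a recurring tag belongs in the vocabulary remains a human judgment.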

THE PROJECT

As a part of fulfilling its goals and purpose, CAT-hub is designed to serve as a centralized access point for information about assistive technologies and other topics of interest to the disabled and their related communities. Therefore, it is particularly important that providers of content to CAT-hub have the means to describe their content effectively and accurately to enable end-users of CAT-hub to discover it and determine its relevance to their particular information needs. Recognizing the inherent weaknesses of tags, the administrators of the Center for Assistive Technologies were wary of relying solely on user-generated tags as a navigation system and discovery mechanism for CAT-hub, especially in its early stages of development. Seeking ways to improve the functionality and usability of the tags within CAT-hub, the administration of the Center for Assistive Technologies approached the Purdue University Libraries for assistance. The Libraries had developed a good working relationship with the Regenstrief Center, the parent organization of the Center for Assistive Technologies, by assisting them in setting up collections of their papers in the Libraries' institutional repository and through other past interactions.

The Libraries responded to the Center for Assistive Technologies' request for help by drafting a project proposal to generate a foundational set of tags for use within CAT-hub through the development of a controlled vocabulary. Existing taxonomies relevant to the work of the Center for Assistive Technologies would be identified, reviewed, and used as a guide to develop a controlled vocabulary specifically for implementation as a foundational set of tags in CAT-hub. The utility of these tags would be tested by uploading a broad range of sample content relevant to assistive technologies into CAT-hub and assigning the controlled vocabulary tags to this content.
Based on the results of this testing, the tags would be reviewed and revised accordingly and then made openly available for use in CAT-hub. The deliverables from this project would include a set of controlled vocabularies, the implementation of these vocabularies as tags within CAT-hub, a description of how the controlled vocabularies and resulting tags could be used, and documentation of the work that was performed. The Center for Assistive Technologies accepted the proposal and provided the necessary funding for the project.

GETTING STARTED

Once the project was accepted, the Libraries formed a project team to carry out the proposal. The Libraries' Data Research Scientist, who had previous experience in designing and carrying out applied research projects, was enlisted as the project manager. The Mathematical Sciences Librarian, who has a background in medical librarianship as well as a familiarity with controlled vocabularies used in the medical field, was recruited to the project. Funding for the project included monies to hire two graduate student assistants. One graduate student from the linguistics department who had prior experience working with and developing ontologies was hired. The other graduate student hired for the project came from the education department and had

hands-on experience in working with assistive technologies and populations with disabilities.

After the project team had been assembled, the next step in launching the project was to meet with the senior administration of CAT-hub to work out the details of the project, to learn more about the mission and goals of CAT-hub, to learn where the CAT-hub platform was in the development process, and to identify the immediate priorities that would need to be addressed. From this initial meeting, the project team gathered the following information to guide its work:

• In developing the controlled vocabularies for CAT-hub, the team should focus on breadth rather than depth. Creating too many controlled vocabulary tags for CAT-hub would be counterproductive at this time.
• Initially, the language used in creating the controlled vocabulary should be geared toward assistive technology vocational professionals who have some knowledge, background, or experience with assistive technologies. The controlled vocabulary tags will help to address their need for discovering information on AT through CAT-hub to enable them to better counsel their clients.
• However, the controlled vocabulary terms are needed in part to bridge gaps between user groups (AT users, medical staff, vendors, researchers, etc.). One of the objectives of developing these controlled vocabulary based tags is to empower lay persons so that they will be able to easily identify what is available and what they can derive from CAT-hub. Therefore, although the tags will be geared for an audience with some assumed familiarity with disabled communities and assistive technology fields, the terminology used cannot be overly technical or obscure.
• Content types in CAT-hub will consist of more than just informational resources. Other types of expected content will include directories of government and nonprofit agencies that offer assistance or provide support and resources to the disabled. Content within CAT-hub will likely include information about agencies providing legal, health care, funding for assistive technologies, and other services.
• The CAT-hub administrators are seeking to create an environment where content is easy to discover and access. In addition to carrying out the project, the CAT-hub administrators asked the project team for their recommendations for improving the use and continuing the development of the controlled vocabulary based tags.

From this initial meeting with CAT-hub administration and from reviewing the documentation for the CAT-hub platform, the project team decided to focus their efforts on generating a controlled vocabulary for five broad types of disabilities. These five categories were vision, hearing, mobility, speech/language, and cognition/learning disabilities. Developing controlled vocabularies for tagging content within these five categories of disabilities would have the greatest potential impact and benefit for CAT-hub given its state of development at the time. The project team also recognized early on that some of the controlled vocabulary terms needed to describe and organize the types of content residing within CAT-hub would fall outside of these five categories, such as information about health insurance or disability law. Therefore, the project team created a sixth category titled cross-cutting as a means of including controlled vocabulary terms that would enable the description of important topics that were too broad to fit into one of the other five categories.

ENVIRONMENTAL SCANS

The next stage in this project was for the project team to acquire a base level of knowledge and familiarity in three areas: existing communities of support for populations with disabilities, existing assistive technologies, and relevant controlled vocabularies. Each member of the project team was assigned at least one of these three areas to explore, based on their background and expertise, and tasked with reporting back to the team their findings and recommendations for moving forward. All three environmental scans were conducted simultaneously during the first month of the project. The progress of the project team's explorations and the results of these environmental scans were documented and shared through an internal department wiki site that was accessible and editable by all team members. Given the limited amount of time available for this project, these environmental scans were meant to demonstrate a broad representation of the nature and types of existing resources, rather than a comprehensive review.

The first area of exploration was the existing communities of support for the five categories of disabilities that would be initially addressed through CAT-hub: vision, hearing, mobility, speech/language, and learning/cognition. Developing an understanding of these communities centered on identifying and comprehending the needs of these communities, along with the current environment of services and resources available for these populations. This information would be used to guide the work of the project team in developing the controlled vocabulary. Members of the project team methodically explored agency Web sites and other resource-based Web sites that were either disability-, service-, or socially oriented to learn more about the demographic groups that would likely be using and contributing to CAT-hub content.
These Web sites were analyzed by the following criteria or features: the nature of the organization or community, target audience, the services and features offered, and the types of information provided. Predictably, the content of these Web sites varied widely; nevertheless, the group became aware of broad patterns and overall trends in the nature of the content and offerings, as well as sources with potential for inclusion in CAT-hub.

The second area of exploration for the project team was to identify the existing assistive technology products that were currently available for each of these five communities. The aim of this environmental scan was not only to learn more about the products themselves but to get a better sense of the needs that the products were designed to address. The project team also sought to learn how assistive technology products had been categorized and described by their manufacturers or producers. Here, too, the types of products and purposes for the products varied widely, but the project team was able to gain a broad sense of the current assistive technologies market and the nature of the types of products being offered or developed. This information was invaluable in directing the subsequent work in creating controlled vocabularies.

The third area of exploration undertaken by the project team was to investigate existing controlled vocabularies that appeared to have some relation to this project and that could potentially be mined for useful terminology for the CAT-hub tags. To accomplish this task, members of the project team generated a list of existing controlled vocabularies, glossaries, thesauri, categorizations, descriptions, etc., that were identified as having some relation and relevance to CAT-hub.
The expectation that no single existing controlled vocabulary could be applied on its own to the CAT-hub platform, given the broad and diverse nature of CAT-hub's anticipated content and audience, was confirmed early in the course of conducting this environmental scan. Therefore, multiple vocabularies were closely reviewed and examined to determine whether they could be applied for the purposes of this project and, if so, how and to what extent.

CREATING CONTROLLED VOCABULARIES

The development of controlled vocabularies for each of the five categories of disabilities, namely, vision, hearing, mobility, speech/language, and learning/cognition, as well as the cross-cutting category, took place over several stages. The initial stage was to decide which of the existing controlled vocabularies that the project team had examined during the environmental scan phase of the project to employ as the primary references for the development of the controlled vocabulary for tags within CAT-hub. The four controlled vocabularies selected by the project team for this purpose were:

• Medical Subject Headings (MeSH): The MeSH controlled vocabulary is primarily geared toward serving the needs of medical professionals.19
• Library of Congress Subject Headings (LCSH): The LCSH controlled vocabulary is designed to be broadly accessible to the general public.20
• International Standard 9999:2007, Assistive products for persons with disability. Classification and terminology: This ISO standard provides an organizational structure for assistive technology products from the viewpoint of a product vendor or developer.21
• AbleData: AbleData is an online directory of services and resources relating to assistive technology. It is well known and acknowledged as a useful source of information among the disabled communities and the professionals who work with disabled populations.22

After deciding on these four controlled vocabularies as guides, the project team returned to the environmental scans of communities of support and assistive technology products to identify the concepts or items that would need to be included in the controlled vocabulary for CAT-hub. The team then searched through the four controlled vocabularies to investigate how a particular item or concept was identified, addressed, and described.
Using a combination of the four related but distinctive controlled vocabularies provided a broad perspective on possible terms and approaches to employ. Conflicting approaches to concepts and items between two or more of the existing controlled vocabularies were common. In such cases, the approach that appeared to best address the intended audiences of CAT-hub and the needs that they were likely to have was selected. In cases where the concept or item was not addressed adequately in any of the existing controlled vocabularies for the purposes of CAT-hub, the team developed its own approach.

In developing the terminology for the controlled vocabularies in each of the five disability categories, the project team sought to provide coverage of the medical conditions related to the disability and the types of assistive technologies designed to address these conditions. The guidelines for generating terms for use as tags were informed by both the team's environmental scans and the representation of the term in the existing controlled vocabularies. A particular challenge in developing the controlled vocabularies was the restriction placed on the number of terms that could be generated. In

negotiations with the CAT-hub administrators, the project team was asked to limit the number of terms to between 20 and 25 for each category. In attracting new users, the administrators wanted to strike a balance between offering a foundational set of tags to use and not overwhelming users with too many options before they became acclimated to CAT-hub. Given this restriction, and given that breadth was more important than depth for the tags, the project team placed emphasis on incorporating common terms into the controlled vocabulary while providing the greatest coverage with the least redundancy.

The sixth category, cross-cutting, was a critical addition to the structure of the controlled vocabularies because it provided a means to generate tags that had potential application across multiple disabilities and assistive technologies. Examples of tags that could apply to several or all of the disability categories mentioned include advocacy, government services, and health care. The tags chosen for this category primarily represented the most frequently identified types of services provided on the service-oriented Web sites as well as the topics most commonly seen on community support pages identified during the environmental scans.

Naturally, in the process of collecting potential terms and constructing their interrelationships, the project team generated many more terms than were feasible or advisable to implement as tags in CAT-hub at this early stage of its development. In striving to create a balance between addressing a broad list of topics and avoiding a long list of tags, the controlled vocabularies required ongoing review and revision as they were being developed. During that revision process, many potential tags were reconsidered and removed. The decision to remove a tag relied primarily upon three criteria: redundancy, technicality, and overspecificity.
A tag was deemed redundant if similar wording could be found elsewhere in the controlled vocabulary, or if similar content was already tagged using simpler, more effective means. Tags were also removed if they were overly technical, such as a medical term for a common condition. In choosing which tags to keep, a higher priority was placed on usability than on medical precision; that is, a tag is ‘better’ if it is commonly used and easily recognizable to CAT-hub users than if it is used primarily by medical professionals. A related but separate issue is overspecificity. Tags were removed if they described a concept or product at a level of detail too narrow to apply to multiple instances of content. An example of a tag deleted due to overspecificity might be a specific model of a hearing aid.

RELATIONSHIPS BETWEEN TERMS Early in the process of developing the terminology for tags, the project team recognized the need to develop cross-references between terms within a disability category as well as between terms in different categories. Developing cross-references between terms in the same disability category would address the connections between medical conditions and assistive technologies. Cross-references were desirable across categories to identify the connections and relationships between different types of medical conditions. The need to identify these relationships was particularly apparent in generating controlled vocabularies for the related categories of hearing and speech/language and, to a lesser extent, the mobility and cognition/learning disabilities categories. Although CAT-hub in its early phase of development cannot yet hyperlink the relationships between tags, the project team and the CAT-hub administration saw future potential for using these relationships between terms. As CAT-hub continues to evolve, these relationships could be used to connect or associate terms in ways that will assist information seekers in discovering information related to their particular interests, as well as to provide them with a more complete understanding of the universe of available resources. Therefore, the project team decided to identify the relationships perceived between terms and to include these relationships as a part of the controlled vocabularies for CAT-hub. The utility of the controlled vocabularies is not dependent upon these relationships, nor is the utility of CAT-hub in its current state adversely affected by including the relationships between tags. The types of relationships used by the project team were broader term (BT), narrower term (NT), related term (RT), and use for (UF), which comply with ANSI/NISO Z39.19-2005, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.23
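The four relationship types can be pictured as typed links between terms. The following is a minimal sketch in Python, not a description of how CAT-hub or HubZero stores tags. The `Thesaurus` class, its method names, and the example terms and links are all hypothetical and chosen only for illustration. The reciprocal pairs (BT with NT, RT with RT, UF with USE) follow the usual thesaurus convention.

```python
from collections import defaultdict

# Reciprocal relationship types: recording one direction implies the other.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE", "USE": "UF"}


class Thesaurus:
    """Hypothetical container for terms and typed relationships between them."""

    def __init__(self):
        # term -> relationship type -> set of target terms
        self._links = defaultdict(lambda: defaultdict(set))

    def relate(self, term, rel, target):
        """Record `term rel target` together with the reciprocal link."""
        if rel not in RECIPROCAL:
            raise ValueError(f"unknown relationship type: {rel}")
        self._links[term][rel].add(target)
        self._links[target][RECIPROCAL[rel]].add(term)

    def related(self, term, rel):
        """Return the sorted targets of `term` for one relationship type."""
        return sorted(self._links[term][rel])


# Illustrative entries only; these links are assumptions, not the project's vocabulary.
vocab = Thesaurus()
vocab.relate("low vision aids", "BT", "vision aids")       # broader term
vocab.relate("low vision aids", "RT", "vision loss")       # related term
vocab.relate("vision loss", "UF", "visual impairment")     # preferred term is used for a variant

print(vocab.related("vision aids", "NT"))
print(vocab.related("vision loss", "RT"))
print(vocab.related("visual impairment", "USE"))
```

Storing the reciprocal link at the time a relationship is recorded means a later interface could display or hyperlink related terms from either end without a second lookup table.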

TESTING CONTROLLED VOCABULARIES WITH CONTENT

After developing a controlled vocabulary for each of the six categories, the next stage of the project was to test their utility as tags for the CAT-hub platform. As the project team was building the controlled vocabulary, the project manager arranged a meeting with the lead developer for the CAT-hub platform to discuss how to coordinate the implementation of the tags into CAT-hub. This initial meeting led to a close collaboration between the project team and the lead developer, who started attending and actively participating in weekly meetings. The willingness of the lead developer to contribute her time and expertise was a key factor in the success of the project. Two rounds of testing the controlled vocabulary as tags for use in CAT-hub were conducted by the project team. The first round of testing was carried out using a set of 44 items chosen randomly from a collection of information relating to assistive technologies provided by CAT-hub's lead developer. These materials included product catalogs, information on organizations providing services to the disabled, professional training, and other types of information. Initially, the project team had planned on using some of the materials that had been gathered through the environmental scans; however, the team wanted the testing process to center on the types of content that would be representative of what would be uploaded into CAT-hub. The lead developer had collected these materials with the intention of adding them to CAT-hub, and so the project team viewed using these materials as a more realistic test of how the tags might be applied. Four scholarly journal articles with relevant content were added to the pool of tested materials to examine the effectiveness of the tags in describing research-based content. The first round of testing took place while the controlled vocabulary terms were still in draft form.
This iteration of testing was meant to serve as an initial indicator of the applicability of the controlled vocabulary terms the project team had developed for CAT-hub. It was conducted by a single member of the project team to give the team an early indication of the effectiveness of its efforts. Because the materials were being tagged either before or while the lists of controlled vocabulary terms were being compiled, the first test helped to inform the finalization of the tags for each of the six categories. A second round of testing was performed in which the project team member who had conducted the first round went back and reassigned tags from the completed versions of the controlled vocabularies. In addition, a second member of the project team assigned tags to the same materials independently of the first member. The second tester completed her test on paper rather than in CAT-hub so as not to be biased by the decisions made by the first tester. Once the tags had been assigned, the project team conducted a blind comparison of the two sets of tags. The purpose of the second round was to determine the reliability of the testing (would two people assign the same tags to the same content?) and to test the completeness of the vocabulary. The results of this test were used as the basis for discussion among the project team on the utility of some of the tags, enabling the identification and elimination of redundant or marginal tags.
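The article does not state how the two testers' tag assignments were compared beyond a blind review by the team. As one possible way to quantify such a comparison, the sketch below computes a Jaccard overlap per item and lists the tags chosen by only one tester. The function names, item names, and tag assignments are hypothetical, and the Jaccard measure is an illustrative choice rather than the project team's method.

```python
def jaccard(tags_a, tags_b):
    """Tags chosen by both testers divided by tags chosen by either tester."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 1.0  # neither tester assigned a tag: treated here as full agreement
    return len(a & b) / len(a | b)


def compare_testers(first, second):
    """Per-item agreement, plus the tags that only one tester assigned."""
    report = {}
    for item in sorted(first.keys() & second.keys()):
        a, b = set(first[item]), set(second[item])
        report[item] = {
            "agreement": jaccard(a, b),
            "only_first": sorted(a - b),
            "only_second": sorted(b - a),
        }
    return report


# Hypothetical assignments for two test items (not the project's actual data).
tester_1 = {
    "product catalog": ["low vision aids", "vision loss"],
    "training course": ["advocacy", "health care"],
}
tester_2 = {
    "product catalog": ["low vision aids", "vision screening"],
    "training course": ["advocacy", "health care"],
}

report = compare_testers(tester_1, tester_2)
for item, row in report.items():
    print(item, row)
```

Tags that repeatedly show up under `only_first` or `only_second` across many items are natural candidates for the kind of group discussion described above, since they may be redundant, ambiguous, or marginal.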

The final component of the testing process was for the project team to review the test results and the complete list of tags as a group. The team then discussed the relevance and appropriateness of the tags in each of the six categories in light of the test results. A few extra tags that were not part of the initial list were suggested and adopted as a result of the testing. As with the environmental scan, the results of tag testing were documented on the Libraries' internal wiki. The wiki enabled the team to update the controlled vocabularies quickly and easily while providing the means to document the changes that had been made. After several iterative discussions, the project team reached consensus about which terms would be included in the final draft of the controlled vocabularies.

IMPLEMENTATION Once the testing was completed, a final draft of the controlled vocabularies was crafted. The controlled vocabularies, along with their identified relationships, were then extracted from the wiki and uploaded into the CAT-hub platform for use. Electronic copies of the controlled vocabularies from the project, along with a final report, were given to the CAT-hub administration. The final report contained a detailed description of the decisions, procedures, and work that had been done by the project team, a list of recommendations for continued development of tags within CAT-hub, and a “user's guide” to help users of the CAT-hub system design tags of their own. The implementation of these controlled vocabularies as tags can be viewed on the CAT-hub portal (see Fig. 1). The HubZero software upon which the CAT-hub portal is built offers several different ways to organize content. CAT-hub has chosen to group content by resource type (tutorials, professional training, products, reviews, services and organizations, and research) and by user type (consumers, professionals, innovators, providers, and researchers). Tags are designed to cut across these categories and can be assigned to any resource uploaded or created in CAT-hub. The CAT-hub administration decided to make the five categories of content (vision, hearing, speech, learning and cognition, and mobility) tags themselves. These five tags appear on the CAT-hub home page so that users can access relevant content quickly. The main page of CAT-hub also enables users to browse all of the tags within CAT-hub, or to view the most frequently used tags. In addition to using tags for browsing, tags are also incorporated into the search functionality of CAT-hub. A search box is provided in the upper right-hand corner of the home page. Results from searching are categorized and displayed according to their content type; tags are one of the categories listed (see Fig. 2).
Clicking on the “tags” link takes a user to a list of the relevant tags (see Fig. 3). As one would expect, clicking on a tag in the results list takes a user to a listing of all of the content that has been tagged with the particular word or phrase selected. Currently, the HubZero software is not capable of using the relationships between the controlled vocabulary terms (BT, NT, RT, UF) as searchable links. Instead, CAT-hub lists any such terms in text as a part of the search results, identifying them all as related terms. Although they are not hyperlinked, displaying these terms provides some guidance to users on other possible topics of interest. Clicking on a tag, either through browsing the list of available tags or from the results of a search, takes a user to a screen displaying items connected to that tag (see Fig. 4).


May 2010 197

Figure 1 A Screen Shot of the CAT-hub Portal—http://cathub.org.

Selecting tags is a part of the process of uploading content into CAT-hub. Contributors to CAT-hub are guided through the upload process by a series of screens that ask them to supply information about the content. The tags screen enables users to select one of the five overarching categories (vision, hearing, speech, learning and cognition, and mobility) as a focus area and then lets users add additional tags if they choose to do so. Existing tags are not displayed on this screen; however, as a user types a word or phrase into the “assigned tags” field, they are shown the existing tags that match the letters they have typed. For example, typing “vis” into the “assigned tags” field would bring up the existing tags “vision disorders,” “vision loss,” “vision screening,” and any other tags that

Figure 2 A Screen Shot of the Results from a Search for “Vision Aids.”


Figure 3 A Screen Shot of the Results from Clicking on the “Tags” Link.

begin with the letters “vis” (see Fig. 5). Users are encouraged to select existing tags but are able to enter their own terms and phrases for use as tags. Users are not required to select an overarching category or tag for the content they are uploading.
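The type-ahead behavior described above amounts to a prefix match against the list of existing tags. The following is a minimal sketch of that idea in Python. It is not the HubZero implementation, and the function name `suggest_tags` and the sample tag list are assumptions made for illustration (the three “vision” tags come from the example in the text).

```python
def suggest_tags(prefix, existing_tags):
    """Return existing tags that begin with the typed letters, ignoring case."""
    needle = prefix.strip().lower()
    if not needle:
        return []  # nothing typed yet, so nothing to suggest
    return sorted(tag for tag in existing_tags if tag.lower().startswith(needle))


# Sample tag list; the non-"vision" entries are illustrative.
existing = [
    "vision disorders",
    "vision loss",
    "vision screening",
    "low vision aids",
    "advocacy",
]

print(suggest_tags("vis", existing))
```

Note that a pure prefix match does not surface “low vision aids” for the input “vis”, because the tag does not start with those letters. Whether a production system should also match inside a tag is a design choice the article does not address.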

CONCLUSION The controlled vocabulary-based tags have been fully implemented into the CAT-hub portal. The tags are currently being used by CAT-hub community members to describe the content they upload and to discover content contributed by others. Unfortunately, the time allotted for developing the controlled vocabulary-based tags did not permit a formal user testing procedure. However, both the CAT-hub administration and the authors are interested in examining how the controlled vocabulary tags generated by the project team are used and what user-generated tags are created to describe the content.

The controlled vocabulary-based tags developed by the Libraries are meant to serve as a foundational navigational system to

Figure 4 A Screen Shot Displaying the Items Tagged as “Low Vision Aids.”


Figure 5 A Screen Shot Demonstrating How Tags are Assigned to Content as a Part of the Process of Uploading Material into CAT-hub.

encourage the contribution of content and the use of the CAT-hub portal. It is anticipated that users will create their own tags for the content they upload, or for existing content within CAT-hub, to describe that content in ways that are meaningful for their particular contexts and needs. As time passes and the quantity of tags generated by users grows, the Libraries are interested in examining how the controlled vocabulary tags they developed as a foundational base for describing content and navigating the CAT-hub portal compare to the tags generated by users. Will the CAT-hub community adopt the foundational set of tags generated by the Libraries for their own content and, if so, to what extent? How will the user-generated tags compare to the tags generated by the Libraries? Will the user-generated tags in CAT-hub be more specific and include product names of assistive technologies, or list medical conditions at a more granular level? Exploring these and other questions within CAT-hub will help the Purdue Libraries to develop a better practical understanding of how virtual organizations work and how librarians can contribute to their functionality and use. The rapid rise of virtual organizations is enabling people to interact and collaborate with one another in a variety of new ways. By reducing the barriers of time and distance, virtual organizations are having a profound effect on how research, teaching, and scholarly communication are being carried out. These changes present significant challenges to academic libraries, which tend to be centered on and organized around traditional structures and practices of teaching and research. However, individuals and communities within virtual organizations have their own set of information needs that must be addressed if the organization is to be successful.
As librarians strive to redefine themselves to address the needs of researchers in a digital era, we need to forge relationships with researchers practicing in virtual organizations to gain a greater understanding of the nature of their workflows, interactions, and environment overall. As illustrated in this case study, librarians possess a unique set of skills, knowledge, and perspectives that enable them to contribute to addressing the information needs of virtual organizations. Therefore, the question is not so much whether librarians can contribute to the development of virtual organizations, but when, how, and to what extent librarians will become involved.

NOTES AND REFERENCES

1. National Science Foundation. Beyond being there: A blueprint for advancing the design, development, and evaluation of virtual organizations. Washington, DC: National Science Foundation, 2008. http://www.ci.uchicago.edu/events/VirtOrg2008/VO_report.pdf (accessed August 28, 2009).


2. Windham, Carie. “nanoHub,” EDUCAUSE Learning Initiative Paper 7, (July 2007), http://net.educause.edu/ir/library/pdf/ELI3015.pdf (accessed August 28, 2009).
3. nanoHUB.org: Simulation, Education, and Community for Nanotechnology. http://nanohub.org/ (accessed August 28, 2009).
4. Fraser, Michael. “Virtual Research Environments: Overview and Activity,” Ariadne, 44, (July 2005). http://www.ariadne.ac.uk/issue44/fraser/ (accessed August 28, 2009).
5. Masson, Alan. “VRE library services: learning from supporting VLE users,” Library Hi Tech 27, no. 2 (2009): 217–227.
6. Wusterman, Judith. “Virtual research environments: What is the librarian's role?” Journal of Librarianship and Information Science 40, no. 2 (June 2008): 67–70.
7. Center for Assistive Technology. “About us,” Center for Assistive Technology, http://cathub.org/about (accessed August 28, 2009).
8. Vander Wal, Thomas. “Folksonomy coinage and definition,” vanderwal.net. (February 2, 2007), http://vanderwal.net/folksonomy.html (accessed August 28, 2009).
9. Shirky, Clay. “Ontology is Overrated: Categories, Links, and Tags,” Clay Shirky's Writings About the Internet, (January 25, 2006) http://www.shirky.com/writings/ontology_overrated.html (accessed August 28, 2009); Shirky, Clay. “Folksonomy,” Many2Many, (August 25, 2004) http://many.corante.com/archives/2004/08/25/folksonomy.php (accessed August 28, 2009).
10. Hammond, Tony, Timo Hannay, Ben Lund, and Joanna Scott. “Social Bookmarking Tools (I): A General Review,” D-Lib Magazine, (April 2005) http://www.dlib.org/dlib/april05/hammond/04hammond.html (accessed August 28, 2009).
11. Macgregor, George and Emma McCulloch. “Collaborative tagging as a knowledge organisation and resource discovery tool,” Library Review 55, no. 5 (2006): 294.
12. Guy, Marieke and Emma Tonkin. “Folksonomies: Tidying up tags?” D-Lib Magazine, (January 2006). http://www.dlib.org/dlib/january06/guy/01guy.html (accessed August 28, 2009).
13. Spiteri, Louise F. “The structure and form of folksonomy tags: The road to the public library catalog,” Information Technology and Libraries 26, no. 3 (September 2007): 13–25.
14. Dye, Jessica. “Folksonomy: A game of high-tech (and high stakes) tag,” EContent 29, no. 3 (April 2006): 38–43.
15. Macgregor and McCulloch, “Collaborative tagging,” 298.
16. Peterson, Elaine. “Parallel Systems: The coexistence of Subject Cataloging and folksonomy,” Library Philosophy and Practice, (April 2008): 1–5.
17. Sun, Beth DeFrancis. “Folksonomy and health information access: How can social bookmarking assist seekers of online medical information?” Journal of Hospital Librarianship 8, no. 1 (2008): 119–126.
18. Rosenfeld, Louis. “Folksonomies? How about metadata ecologies?” Louis Rosenfeld: Information Architecture & User Experience. (January 6, 2005). http://louisrosenfeld.com/home/bloug_archive/000330.html (accessed August 28, 2009).
19. National Library of Medicine. Medical Subject Headings. http://www.nlm.nih.gov/mesh/meshhome.html (accessed August 28, 2009).
20. Library of Congress. Library of Congress Authorities, http://authorities.loc.gov/; Library of Congress, Classification Web, http://www.loc.gov/cds/classweb/ (accessed August 28, 2009).
21. International Organization for Standardization. “Assistive products for person with disability: Classification and terminology,” ISO 9999. (2007).
22. Abledata. http://www.abledata.com/ (accessed August 28, 2009).
23. National Information Standards Organization (U.S.); American National Standards Institute. “Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies,” (July 2005).
