An Update on the NISO Metasearch Activity


Standards Update
Mark H. Needleman, Column Editor

In May 2003, the National Information Standards Organization (NISO) held a strategy meeting to look at issues surrounding standardization of metasearch systems. The goal of the workshop was to examine the state of the art in metasearch and ways of building consensus to move forward. This article describes the work that has been going on with NISO metasearch activity since that initial meeting. Serials Review 2006; 32:143–145.

Needleman can be reached at [email protected]. doi:10.1016/j.serrev.2006.02.006

Needleman / Serials Review 32 (2006) 143–145

Introduction

In May 2003, the National Information Standards Organization (NISO) held a strategy meeting to look at issues surrounding standardization of metasearch systems. The goal of the workshop was to examine the state of the art in metasearch and ways of building consensus to move forward. The NISO Web site defined the goals of this workshop to enable the following groups:

• metasearch service providers to offer more effective and responsive services
• content providers to deliver enhanced content and protect their intellectual property
• libraries to deliver services that distinguish their services from Google and other free Web services.

The strategy meeting was described in an earlier Standards Update (vol. 29, no. 3, 2003, p. 256–257). A follow-up workshop was held in October 2003 to inform librarians, content providers, and aggregators about metasearch issues. The following is an account of the activities since the May 2003 meeting.

NISO Metasearch Standardization Activity

Following the initial strategy meeting and workshop, NISO established three committees to work on various aspects of metasearch. It was not necessarily assumed that new standards would emerge from all of these committees. Some of the committees might develop new standards, while others might recommend the use of existing standards or develop best practices or other types of recommendations.

NISO Standards Committee BA: Access Management

This committee was not intended to develop any new authentication or access control protocols or mechanisms. Instead, it was charged with examining existing and emerging protocols and determining if one or more of them could be recommended as best practices for use in a metasearch environment. Deliverables from the committee were:

• a definitions document of access management and metasearch terms
• defined distinctions in access management between user access and agent access
• understanding basic requirements of constituents
• an inventory of methods and techniques in use today
• use cases describing authentication and access needs
• defined statistics that must be kept to satisfy access management systems.

The committee examined most known authentication protocols and produced a matrix, ranking them on a series of criteria including, but not limited to, usability, cost, suitability, ease of implementation, security, robustness, scalability, and existing implementation. The final version of the committee report was published on September 13, 2005, and is available at http://www.niso.org/committees/MS_initiative.html (accessed February 4, 2006). The committee found that in the current environment sites acquiring new electronic resources should implement either IP authentication with a proxy server (either traditional or rewriting) or username/password authentication. The committee also concluded that moving forward with Shibboleth held the most promise as the next-generation authentication method, and members of the Access Management Committee are working with the Shibboleth community to ensure that the next version of Shibboleth (Shibboleth 2) has all the necessary functionality to make it usable in a metasearch environment.

NISO Standards Committee BC: Search/Retrieve

Search and retrieval issues were combined in this committee since they are so interconnected. The charge to this committee was to produce:

• a description of the current practice in metasearching search and retrieval
• definitions of a standard vocabulary and terms
• a definition of a template for exchange of search and retrieval functionality
• an inventory of proprietary XML interfaces and best practices for metasearch search and retrieval
• recommendations for the data elements to describe a Result Set and a record within a Result Set
• a review of SRW/SRU and a recommendation for modifications for use as the basis of a metasearch search and retrieval standard.

NISO Standards Committee BB: Collection and Service Descriptions

The charge to this committee was to produce:

• a list of data elements needed to describe a collection
• a document containing guidelines for maintaining and exchanging collection information
• recommendations on further steps needed, including development of best practices, implementation guidelines, and formal standards.

So far, the Search/Retrieve committee (BC) has produced three documents:

NISO Metasearch XML Gateway Implementers Guide: This document describes the NISO Metasearch XML Gateway Protocol, which is based on the NISO-registered SRU protocol. It is intended as a low-barrier-to-entry mechanism that allows information service and content providers to expose their services and content to metasearch engines. It does not currently specify a standardized query mechanism; instead, it allows the exchange of arbitrary queries so that systems can send and receive whatever queries they already support. However, recognizing the benefits of a standardized query format, the long-term goal is to produce one, most likely based on CQL, the query language used by the SRW/SRU protocols.

Result Set Metadata: This document defines a core set of metadata elements that describe a result set at both the aggregate level and the individual record level, in order to improve the quality of the information returned and to ensure more standardized presentation of results to the end user. Content providers are expected to use these data elements to improve the quality of the information they return through a variety of methods. The elements may also be used to check that a given protocol meets the needs of metasearch products.

Citation Level Data Elements: A minimum set of required citation-level data elements has been identified to overcome the current lack of standardization in the way a citation is formatted in a record returned by a metasearch engine. Use of these data elements will allow citation information to be parsed for reuse in applications such as OpenURL linking and in metadata formats such as Dublin Core.
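The reuse of citation-level data elements for OpenURL linking, mentioned above, can be sketched as follows. This is a minimal illustration, not the committee's specification: the citation field names (`article_title`, `start_page`, and so on) and the resolver address are hypothetical. Only the key/encoded-value names (`url_ver`, `rft_val_fmt`, `rft.jtitle`, ...) come from the OpenURL 1.0 (Z39.88-2004) journal format.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; a real library would substitute its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

def citation_to_openurl(citation: dict, base: str = RESOLVER_BASE) -> str:
    """Map a parsed journal-article citation to an OpenURL 1.0 KEV link."""
    # Assumed citation field names (left) mapped to OpenURL journal keys (right).
    field_map = {
        "article_title": "rft.atitle",
        "journal_title": "rft.jtitle",
        "volume": "rft.volume",
        "start_page": "rft.spage",
        "end_page": "rft.epage",
        "year": "rft.date",
        "author_last": "rft.aulast",
    }
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    for field, key in field_map.items():
        value = citation.get(field)
        if value:  # skip elements the record did not supply
            params.append((key, str(value)))
    return f"{base}?{urlencode(params)}"

# Example record: this column's own citation.
citation = {
    "article_title": "An Update on the NISO Metasearch Activity",
    "journal_title": "Serials Review",
    "volume": "32",
    "start_page": "143",
    "end_page": "145",
    "year": "2006",
    "author_last": "Needleman",
}
openurl = citation_to_openurl(citation)
print(openurl)
```

The point of the sketch is that the mapping is a simple table lookup once a record arrives with separately labeled elements. Without an agreed minimum set, each metasearch engine would have to parse free-text citations per provider.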

The Collection and Service Descriptions committee (BB) has produced two documents, which are now NISO Draft Standards for Trial Use:

• Z39.91-200x, Collection Description Specification: This draft standard defines a means of describing collections, where a collection is defined as an aggregation of items. It takes the form of a Dublin Core Application Profile, which specifies how metadata terms from the Dublin Core metadata vocabularies and from other metadata vocabularies (some constructed for use in association with this Dublin Core application profile) are used to construct a description of a collection, in accordance with the Dublin Core Metadata Initiative Abstract Model. It also specifies an XML binding for serializing such descriptions for interchange between applications.
• Z39.92-200x, Information Retrieval Service Description Specification: This draft standard defines a method of describing information retrieval-oriented electronic services including, but not limited to, those made available via the Z39.50, SRU/SRW, and OAI protocols. The ZeeRex standard addresses the need for machine-readable descriptions of services in order to enable automatic discovery of and interaction with previously unknown systems. It specifies an abstract model for service description and a binding to XML for interchange.

SRU and SRW are modernized versions of the Z39.50 Information Retrieval Protocol. SRW (Search/Retrieve Web Service) implements the principles of Z39.50 using Web services and the Simple Object Access Protocol (SOAP). SRU (Search/Retrieve via URL) is a lightweight protocol in which the query is expressed as a series of parameters encoded in a URL. ZeeRex (Z39.50 Explain, Explained, and Re-engineered in XML) is an implementation of the Z39.50 Explain Service in XML.
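To make the description of SRU concrete, the following minimal sketch builds a searchRetrieve request as URL parameters. The endpoint address and the sample query are assumptions for illustration. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) are those defined by SRU 1.1, and the query string is written in CQL.

```python
from urllib.parse import urlencode

# Hypothetical SRU endpoint; real servers publish their own base URL.
SRU_BASE = "https://sru.example.org/catalog"

def build_sru_url(cql_query: str, start: int = 1, maximum: int = 10,
                  base: str = SRU_BASE) -> str:
    """Encode an SRU 1.1 searchRetrieve request as a single GET URL."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,         # the CQL query, percent-encoded by urlencode
        "startRecord": start,       # 1-based position of the first record wanted
        "maximumRecords": maximum,  # upper bound on records in the response
    }
    return f"{base}?{urlencode(params)}"

url = build_sru_url('dc.title = "metasearch"')
print(url)
```

Because the whole request is an ordinary URL, a metasearch engine can issue it with any HTTP client and parse the XML response. A request with `operation=explain` to the same base URL asks the server to describe itself, and the response carries a ZeeRex record of the kind discussed above.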

Conclusion

Metasearching is currently possible and is being implemented and used, but the lack of standard mechanisms for authentication and for search and retrieval puts a strain on both the metasearch systems and the content and data providers. Metasearch systems have to implement multiple authentication methods and search protocols, and they receive data back in multiple formats. Content providers have to deal with metasearch systems that send them multiple simultaneous queries with no real mechanisms for optimization. If the standards and protocols discussed above take hold and are implemented, metasearching will become more efficient, and metasearch systems and content providers will be able to interoperate in a manner that is more efficient for both of them. More information on the NISO Metasearch activity and the work of its committees is available at http://www.niso.org/committees/MS_initiative.html (accessed February 4, 2006).