Studying information seeking on the non-English Web: An experiment on a Spanish business Web portal


ARTICLE IN PRESS

Int. J. Human-Computer Studies 64 (2006) 811–829 www.elsevier.com/locate/ijhcs

Wingyan Chung
Department of Information and Decision Sciences, College of Business Administration, The University of Texas at El Paso, 500 W. University Avenue, El Paso, TX 79968, USA
Available online 19 June 2006

Abstract

The Internet is estimated to grow significantly as access to Web content in some non-English languages continues to increase. However, prior research in human–computer interaction (HCI) has implicitly assumed the primary language used on the Web to be English. This assumption is not true for many non-English-speaking regions where rapidly growing on-line populations access the Web in their native languages. For example, Latin America, where the majority of people speak Spanish, will have the fastest growing population in coming decades. However, existing Spanish search engines lack search, browse, and analysis capabilities. The research reported here studied human information seeking on the non-English Web. In it we developed a Spanish business Web portal that supports searching, browsing, summarization, categorization, and visualization of Spanish business Web pages. Using 42 Spanish speakers as subjects we conducted a two-phase experiment to evaluate this portal and found that, compared with a Spanish search engine and a Spanish Web directory, it achieved significantly better user ratings on information quality, cross-regional search capability, system performance attributes, and overall satisfaction. Subjects’ verbal comments strongly favored the search and browse functionality and user interface of our portal. As the Web becomes more international, this research makes three contributions: (1) an empirical evaluation of the performance level of a Spanish search portal; (2) an examination of the information quality, cross-regional search capability and usability of search engines for the non-English Web; and (3) a better understanding of non-English Web searching.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Internet; Web; Searching; Browsing; Spanish; Non-English Web content; Web portal; Information quality; Cross-regional search capability; User satisfaction; User study

Tel.: +1 915 747 5496; fax: +1 915 747 5126. E-mail address: [email protected].
1071-5819/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.ijhcs.2006.04.009

1. Introduction

It is estimated that a majority (64.8%) of the world’s online population consists of non-English speakers (Global Reach, 2004b), a population estimated to grow significantly to 820 million in the near future. In contrast, the English-speaking on-line population is anticipated to remain at 300 million (Global Reach, 2004a). For instance, the Spanish-speaking on-line population now exceeds 9 million and, in the coming decades, Latin America will have the fastest growing population in the world (Caramelli, 2003). (In the following, we consider the “non-English-speaking on-line population” to consist of people who for the most part use non-English languages to access the Web. We consider the “English-speaking on-line population” to be people who generally use the English language to access the Web. Although Web technology is language-independent, a large number of people in the world actually use their native languages to access the Web, and these languages differ across regions. Therefore, we define the “non-English Web” as content of the Web that is delivered primarily in non-English languages.)

Although the need for Web searching in some non-English languages is growing, previous research in human–computer interaction (HCI) has implicitly assumed English to be the primary language used on the Web. However, this is not the case for many non-English-speaking on-line users. For example, Spanish is the foremost language spoken in most Latin American countries as well as Spain and the second most popular language in the United States. Nevertheless, existing Spanish search engines lack search
and analysis capabilities. Moreover, the quality of Web collections of many Spanish search engines is variable, depending on the extent to which these engines provide information scattered in different regions. Cross-regional search capability of such search engines has become important because Spanish is used in so many regions. At the same time, research about information seeking on the non-English Web and cross-regional Web searching is scarce. Knowledge of these aspects, if available, would provide insights to HCI researchers and system developers to enhance the Web searching experience of a large number of on-line users in the world.

In this research, we addressed three questions: (1) How can the search and browse support of a Spanish business Web portal be evaluated? (2) How do user perceptions of the proposed portal’s cross-regional search capability, information quality, and overall performance compare to those of an existing search engine and Web directory? (3) To what extent do the empirical findings of this research have implications for non-English Web searching?

The Web portal studied in this research supports searching, browsing, summarization, categorization and visualization of Spanish business Web pages. In a two-phase experiment involving 42 Spanish-speaking subjects, we examined the way this portal supported human analysis in the Spanish business domain. Our goal was to achieve better understanding of how a Web portal with automated analysis capability and a comprehensive directory supports human information seeking on the non-English Web.

The remaining sections are organized as follows. Section 2 surveys HCI research in non-English Web searching and search engines in Spanish-speaking regions. Section 3 describes the portal we developed to support searching of Spanish business Web pages. Section 4 describes the methodology employed to evaluate the portal. Section 5 reports and discusses the experimental findings. Section 6 concludes the paper and discusses future directions.

2. Literature review

As Internet technologies are widely diffused around the world, use of many non-English languages has gained popularity among on-line users who rely on the Web to seek information and to conduct commercial activities. It therefore is useful to review HCI research in information seeking and to examine recent advances in Web searching and browsing. In particular, we review system-centered and user-centered approaches because much work has been done in these two streams. Because non-English search engines often need to provide information across different regions, we also review prior work on these engines’ regional impacts and information quality. Finally, we review existing search engines available in Spanish-speaking regions, the domain selected for our experimental study.

2.1. HCI research on the Web

Information seeking is a major activity in HCI. To study human information seeking, HCI researchers typically adopt a process model that consists of various stages of problem identification, problem definition, problem resolution and solution presentation (Wilson, 1999). Variations of the process model also can be found in the literature (Marchionini, 1995; Kuhlthau, 1998; Sutcliffe and Ennis, 1998). For example, Bates’s model for information search, called “berrypicking,” captures the idea of an evolving multi-step search process rather than a system that supports separately submitting single queries (Bates, 1989). Kuhlthau found that high school students began research assignments by conducting general browsing and proceeded by performing more narrowly directed search as their understanding of a subject increased (Kuhlthau, 1991). Ellis (1989) studied patterns of academic information-seeking behavior and found six features of social scientists’ individual information-seeking patterns: starting, chaining, browsing, differentiating, monitoring, and extracting.

Two major information-seeking activities are searching and browsing. “Searching” in prior research has been considered as a range of behaviors from goal-directed information searching, in which the user has a specific target in mind, to more serendipitous or exploratory information browsing when no goal more specific than the intention to explore the information repository is present (Sutcliffe and Ennis, 1998). In directed searching, the user first decomposes his goal into smaller problems, then expresses his needs as concepts and higher-level semantics, formulates queries using such supports as Boolean query languages and syntax-directed editors and finally evaluates the results by serial search or systematic sampling. In exploratory browsing, the user first transforms his general information need into a problem. He then articulates his needs as search terms or hyperlinks that appear on the system interface, searches using those terms or explores hyperlinks using such browse supports as automatic summarization, clustering and visualization tools, and Web directories, and finally evaluates the results by scanning through them.

As the Internet evolves as a major information-seeking platform, HCI has been frequently addressed in recent research that includes a system-centered approach and a user-centered approach.

2.1.1. System-centered vs. user-centered approaches

The system-centered approach aims to use information technologies to support human beings’ information-seeking process. Information retrieval systems (most notably search engines) are a major technology studied in this approach. Because different search engines have different methods of page collecting, indexing and ranking, they may include systematic bias in their search results (Mowshowitz and Kawaguchi, 2002). Meta-searching has
been proposed as a promising method to alleviate the problem (Chen et al., 2001). By sending queries to multiple search engines and collating the set of top-ranked results from each engine, meta-searching can greatly reduce bias in search results and improve coverage. In addition, post-retrieval analysis provides added value to results returned by search engines. Previews and overviews of retrieved Web pages are important elements in post-retrieval analysis. A preview is extracted from, and acts as a surrogate for, a single object of interest (Greene et al., 2000). Document summarization techniques provide previews of individual Web pages in such forms as indicative summaries (Firmin and Chrzanowski, 1999), query-biased summaries (Tombros and Sanderson, 1998), or generic summaries (McDonald and Chen, 2002). An overview is constructed from and represents a collection of objects of interest (Greene et al., 2000). Document categorization techniques such as the self-organizing map (SOM) algorithm (Kohonen, 1995) have been used to categorize and search Web pages (Chen et al., 1998). Document visualization techniques have also been used to amplify human cognition in browsing Internet search results (Lin, 1997; Marshall et al., 2004). Despite the potential advantages of meta-searching and of information previews and overviews, they are rarely applied to developing non-English search engines.

The user-centered approach to information seeking is concerned with the behavioral and cognitive aspects of information seekers. In this approach, human information-seeking has been described as a behavior that includes questions, dialogue, and social and cognitive situations, associated with a user’s interaction with an information retrieval system (Kuhlthau et al., 1992; Kuhlthau, 1993). The information-seeking process involves user judgments, search tactics or moves, interactive feedback loops and cycles (Spink and Saracevic, 1997). Previous research has dealt with issues relating to users’ cognitive structure (Ingwersen, 1992) and factors affecting the user–intermediary interaction process (Saracevic, 1996). However, relatively little research studying perception by information seekers has been done in the context of non-English Web searching. Considering the multiple cross-regional information sources that are typically used, two issues deserve more attention: regional impacts and the quality of information sources.

2.1.2. Regional impacts and information quality

As a language can be used in more than one region or country, regional impacts arise because cultural, social and economic environments differ. For example, Spanish is widely used in Europe, North America and South America. Spink et al. (2002) compared the searching behaviors of FAST search engine users (who are largely European) with those of Excite search engine users (who are largely American) and found that FAST users input queries more frequently while Excite users focused more on e-commerce topics. These results suggest regional differences on the
Web, arising from possible cultural and social differences. However, these studies focused only on query and topic differences and did not reveal differences in search-engine effectiveness. Chinese is another non-English language that is gaining popularity on the Web (Chung et al., 2004). However, because Chinese is mainly used in three closely located geographic regions (mainland China, Hong Kong, and Taiwan), its regional impact is much less than that of Spanish, which is used across continents and widely separated geographic regions. Unfortunately, there has been little research on the cross-regional impacts of Spanish search engines, although evaluation of them should improve understanding of optimal design of search engines and portals. Information quality, a multifaceted concept considered an important aspect of evaluating the quality of a Web site (Loiacono, 2002), has been explored in previous research (Wang and Strong, 1996). To evaluate information quality, a set of 16 dimensions was developed (Wang and Strong, 1996) and tested in Pipino et al. (2002). These dimensions were for the most part used in evaluating the quality of information about organizations or companies, not the quality of information obtained from search engines. Although Marsico and Levialdi (2004) have developed a Web site evaluation methodology that considers a site’s information quality, their methodology was designed for evaluating general Web sites (e.g., travel information Web sites) and does not consider the special requirements of non-English Web searching.

2.2. Search engines and Web directories for Spanish-speaking regions

To address the needs of non-English-speaking on-line users, major English search engines and some regional search engines have expanded their services and have begun to support more localized searching. These search engines typically accept queries in a user’s native language in addition to English and return pages from the regions being served. The following presents a survey of major search engines and Web directories in Spanish, a widely used language that is gaining popularity on the Web.

Spanish is the second most popular language in the United States and the primary language of Spain and some 22 Latin American countries. Major search engines in Spain are Terra and Wanadoo. Terra (http://www.terra.com/) offers services to more than 3.1 million Internet users in Europe and the Americas. According to a Gallup poll (2002), Terra was voted the most popular search engine in Spain; Wanadoo (http://www.wanadoo.com/), a subsidiary of France Telecom, was rated second. Terra serves more than 3 million Internet users in Spain, Latin America, the United States and many European countries. With 9.3 million customers in June 2004, Wanadoo is currently the leading Internet service provider in France and the United Kingdom.


Spanish search engines serving Latin America include Yahoo Español, Ahijuna, Auyantepui, Quepasa, Bacan, and Conexcol. Yahoo Español (Spain, http://espanol.yahoo.com/), the Spanish version of Yahoo, provides a human-compiled Web directory developed by about 150 editors who categorized more than one million listed sites. Yahoo Español (YahooES) also supplements its results with those from Inktomi and Google. Inktomi matches are shown to users only after all YahooES matches have been shown. Established in 1995, BIWE (http://www.biwe.com/) was one of the first search engines for searching Spanish information on the Web. It supports searching for news, products, images and other information and provides a variety of services including a Web directory, email, entertainment and market information for Hispanics. Headquartered in the United States, Quepasa (http://www.quepasa.com/) was launched in 1997 and is a bilingual Web portal (Spanish and English) serving Hispanic populations in the United States and Latin America. It uses proprietary Web search technologies to reduce the number of irrelevant results by utilizing terms most frequently used and documents most frequently viewed (Peterson, 2002). Quepasa also offers other services such as news, email, on-line radio, chat, on-line translation, forums, and Web hosting.

The following Spanish search engines primarily serve their own or adjacent regions. Launched in 1998, Ahijuna (Argentina, http://www.ahijuna.com.ar/) provides searching of Argentine Web sites and other Spanish Web sites. It contains a Web directory with 14 categories having a total of 7578 hyperlinks. Based in Venezuela, Auyantepui (http://www.auyantepui.com/) provides a searchable Web directory of Spanish sites. It grew from 14 categories listing 117 Web sites in 1996 to 550 categories with over 18,000 Web sites in 2002. Launched in 1998, Conexcol (Colombia, http://www.conexcol.com/) provides a searchable Web directory containing 14 categories with 400 subcategories and the URLs of 13,214 Web sites. With more than 150,000 unique visitors per month, it is one of the four most often visited sites in Colombia. Bacan (Ecuador, http://www.bacan.com/), which began its operations in 1996, provides services such as news, email, on-line chat, entertainment and shopping guides. Every month Bacan has 80,000 individual visitors and generates more than 2 million hits. Ascinsa Internet (http://www.ascinsa.com/) is widely used in Peru and contains Web sites from Latin American countries and the United States. It provides services such as Internet access, email, Web page design, domain registration, and Web hosting, among others. It also contains a directory listed by countries and then by domains.

Although providing different types of information, these search engines typically present results as a long textual list and lack post-retrieval analysis capabilities such as summarization, categorization, and visualization. Moreover, except for some large portals such as Yahoo Español, BIWE and Terra, most Spanish search engines serve a small number of regions rather than the entire
Spanish-speaking community. Table 1 provides comparisons of different Spanish search engines and Web directories along different dimensions.

Table 1. Comparing major Spanish search engines and Web directories

Search engine (base)          Size of collection    User interface
Terra (Spain)                 Very good             Very good
Wanadoo (France)              Very good             Very good
Auyantepui (Venezuela)        Fair                  Fair
Ascinsa (Peru)                Good                  Fair
Conexcol (Colombia)           Good                  Fair
Bacan (Ecuador)               Fair                  Good
Quepasa (Mexico and U.S.)     Very good             Very good
YahooES (Spain)               Very good             Very good
Ahijuna (Argentina)           Fair                  Fair
BIWE (Spain)                  Very good             Very good

Content rows (Web pages and news on): IT; Business; Government; Financial; Medical; Other Latin American countries; General.
Functionality rows: Links to related resources; Membership services; Newsgroup search; Web directory; Search for Web sites; Search stock prices; Filtering for adult content; Online translation tool; Search for news; Multimedia search (image, music, software, etc.).
(The per-engine check marks for the content and functionality rows are not legible in this copy.)

2.3. Summary

Because existing search engines in Spanish typically lack analysis capabilities, they offer a user limited ability to understand retrieved results. The collections they search are often region-specific and therefore do not provide comprehensive knowledge of the environment in which they operate. While major English search engines like Google make available searching of non-English resources, they fall short of covering domain- and region-specific information. Research in such areas as information-seeking, regional impacts of search engines, and information quality should improve HCI for non-English Web searching, which is expected to grow significantly in the future.

3. System development

In this section, we describe the Spanish Business Intelligence Portal (SBizPort) that provides domain-specific collections for searching and post-retrieval analysis of Spanish business information (see Fig. 1). Recent economic activities and agreements such as the North American Free Trade Agreement and the Central American Free Trade Agreement have made the Spanish business domain an increasingly important segment of the Web for individual users and multinational organizations. Members of the growing Spanish-speaking population in businesses throughout the United States, Spain, and Latin America are actively seeking information on the Web to expand their opportunities. As shown in our literature review (Section 2.2), existing Spanish search engines lack post-retrieval analysis capabilities. Designed to support post-retrieval analysis, SBizPort contains a summarizer, a categorizer and a visualizer that can dynamically process a large number of Web pages returned for a query and thereby alleviate the information overload resulting from a long textual list of Web pages.

3.1. Steps in system development

SBizPort was developed in two major steps, described as follows.

3.1.1. Step 1: searching and collection building

On the search page, a user can input keywords and choose whether to search, organize, or visualize the results. The user can input multiple keywords separated by line breaks and can choose among a number of carefully selected information sources by checking the boxes that are shown. The results page lists search results according to the information sources selected by the user.

To obtain high-quality Spanish business information, we manually browsed Web sites from key business categories such as e-commerce, international business, and competitive intelligence. These Web sites and business categories were obtained from searching and browsing major Spanish search engines and Web directories reviewed in Section 2.2. We found more than 183 seed URLs and used a Web crawler to follow these URLs to collect pages automatically. The pages were then automatically indexed and stored in our database. In addition to domain spidering, we performed meta-spidering of six major search engines using queries translated from English queries that previously had been used to build an English business intelligence search portal (Marshall et al., 2004). The translation was done by a native Spanish speaker who also reviewed the search results to ensure that the translation was correct. We chose the six search engines, namely, YahooES, Ahijuna, Conexcol, AMBDirecto, Auyantepui, and Teoma, for their rich Spanish business content. The resulting Spanish business collection contained more than 476,084 Web pages covering more than 22 countries. In addition, SBizPort supports meta-searching of two domain-specific databases (SBizPort collection and AMBDirecto) and six Spanish general search engines (Yahoo Español, Terra, Ahijuna, Auyantepui, Bacan, and Ascinsa).
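The collation step of meta-searching described above can be illustrated with a minimal sketch. It is not SBizPort's implementation, whose details the paper does not give. It assumes that each source has already returned a ranked list of URLs, interleaves the top results round-robin, and merges duplicates. The function names, the URL normalization rule, and the engine names in the usage note are hypothetical.

```python
from itertools import zip_longest
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Build a comparison key for a URL.

    Assumed rule (not from the paper): lower-cased host without a
    leading "www." plus the path without a trailing slash.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def collate(results_by_engine: dict[str, list[str]], top_k: int = 10):
    """Interleave each engine's top_k results and merge duplicate URLs.

    Returns a list of (first-seen URL, [engines that returned it]),
    ordered by rank first and by engine order second.
    """
    merged: dict[str, tuple[str, list[str]]] = {}
    ranked = [
        [(engine, url) for url in urls[:top_k]]
        for engine, urls in results_by_engine.items()
    ]
    # zip_longest yields one "rank slice" at a time: all rank-1 hits,
    # then all rank-2 hits, and so on; shorter lists are padded with None.
    for rank_slice in zip_longest(*ranked):
        for item in rank_slice:
            if item is None:
                continue
            engine, url = item
            key = normalize(url)
            if key in merged:
                merged[key][1].append(engine)
            else:
                merged[key] = (url, [engine])
    return list(merged.values())
```

For example, calling `collate({"engine_a": [...], "engine_b": [...]})` with two ranked lists that share a page yields that page once, tagged with both sources. Round-robin interleaving is only one possible merging policy; it is chosen here because it needs no relevance scores from the underlying engines.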

3.1.2. Step 2: post-retrieval analysis

SBizPort provides post-retrieval analysis capabilities in the form of Web page summarization, categorization, and visualization. The SBizPort summarizer was modified from an English summarizer that uses sentence-selection heuristics to rank text segments (McDonald and Chen, 2002). To summarize a Web page, the summarizer automatically performs sentence evaluation, segmentation or topic identification, and segment ranking and extraction. Users invoke the summarizer by choosing the number of sentences to include in the summary of each result. A new window then opens to display both the summary and the original Web page. The SBizPort categorizer organizes the Web pages related to the input query into twenty (or fewer) folders labeled by the key phrases appearing most frequently in the page summaries or titles. The categorizer relies on a Spanish phrase lexicon to extract phrases from Web page summaries obtained from meta-searching or searching our collections. The phrase lexicon was automatically created from a large number of Spanish Web pages using the mutual information approach (Ong and Chen, 1999). In addition, SBizPort supports visualization of retrieved Web pages, using a Kohonen SOM algorithm (Kohonen, 1995) to categorize and place Web pages onto a two-dimensional jigsaw map. The size of a region on the map varies according to the number of Web pages assigned to it. Users can click on a region to see a list of its pages displayed to the right of it and can open pages by clicking their link-embedded titles.

Fig. 1. Screen shots of SBizPort.
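The folder-building behavior of the categorizer can be illustrated with a short sketch. This is a simplified stand-in, not the SBizPort code. It assumes the phrase lexicon is a set of lower-case phrases, matches phrases by plain substring search, counts in how many pages each phrase occurs, and keeps the most frequent phrases as folder labels. The function name and the example data are hypothetical.

```python
from collections import Counter, defaultdict


def categorize(pages: dict[str, str], lexicon: set[str],
               max_folders: int = 20) -> dict[str, list[str]]:
    """Group pages into at most max_folders folders labeled by key phrases.

    pages maps a page identifier (e.g., its title) to its summary text.
    A page may appear in several folders if it contains several labels.
    """
    # Lexicon phrases found in each page (naive substring matching, an assumption).
    found_in = {
        page: {phrase for phrase in lexicon if phrase in text.lower()}
        for page, text in pages.items()
    }
    # Number of pages in which each phrase occurs.
    counts = Counter(p for found in found_in.values() for p in found)
    # Most frequent phrases first; ties broken alphabetically for determinism.
    labels = sorted(counts, key=lambda p: (-counts[p], p))[:max_folders]

    folders: dict[str, list[str]] = defaultdict(list)
    for page, found in found_in.items():
        for label in labels:
            if label in found:
                folders[label].append(page)
    return {label: folders[label] for label in labels}
```

With three short summaries mentioning "comercio electrónico" and "México", the function returns one folder per frequent phrase, each listing the pages that contain it. A production categorizer would need proper phrase tokenization rather than substring matching.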

In addition, SBizPort provides a Web directory of the resources in its specific domain. Organized in a hierarchy, the directory was built from a combination of human identification and meta-searching. The Spanish business directory contains 295 categories and has a depth of 5 levels.


4. System evaluation and experimental design

This section describes the evaluation methodology and design of an experiment conducted to evaluate the usability of SBizPort. Through the evaluation, we studied how human users perceived the information quality, cross-regional search capability, and overall performance (dependent variables) of the portal (independent variable). We invited 42 Spanish speakers to serve as subjects in a two-phase experiment, in which the subjects used SBizPort to search and browse the Spanish business domain. The subjects also used BIWE (in Phase 1) and Yahoo Español (in Phase 2) to perform search and browse tasks similar to those they had done using SBizPort.

4.1. A two-phase experimental design

The experiment consisted of two phases. Phase 1 focused on general search and post-retrieval analysis capabilities of the portal while Phase 2 focused on the browse support provided by the Web directory of the portal. Subjects performed assigned tasks using the systems in each phase. The experimental tasks were designed as scenario-based search and browse tasks based on Text Retrieval Conference (TREC) standards (Voorhees and Harman, 1997). Sponsored by the National Institute of Standards and Technology (NIST) and the Defense Advanced Research Projects Agency (DARPA), TREC strives to provide a common task evaluation that allows cross-system comparisons. There were a total of four scenarios, each having two search tasks and two browse tasks. A summary of the scenarios and tasks is provided in Table 2. To validate the relevance of these tasks, we conducted a pilot test with
three subjects before conducting the actual experiment. Based on the subjects’ responses in the pilot test, we revised the wording in the tasks.

Table 2. A summary of the scenarios and tasks used in the experiment

Scenario: AOL in Latin America
  Search tasks: (1) Find the stock symbol of AOL. (2) When was AOL Latin America launched in the United States?
  Browse tasks: (1) Find software companies similar to AOL. (2) Find financial portals with AOL’s stock quotes.

Scenario: MERCOSUR (Mercado Común del Sur, a trade association among South American countries)
  Search tasks: (1) Find the GDP of Argentina in 2002. (2) Find the percentage of economic flow to MERCOSUR from foreign investment.
  Browse tasks: (1) Find magazines reporting about MERCOSUR. (2) Find sites about transportation and logistics by trailer from Argentina to other MERCOSUR country members.

Scenario: Electronic commerce
  Search tasks: (1) How much does Mailics charge for Web hosting (in dollar amount)? (2) Who founded Salutia.com?
  Browse tasks: (1) Find Latin American entrepreneurship Web portals. (2) Find e-commerce sites from different Spanish-speaking regions.

Scenario: North American Free Trade Agreement (NAFTA)
  Search tasks: (1) What are the objectives of NAFTA? (2) Where are most Mexico’s factories located?
  Browse tasks: (1) Find associations and chambers of commerce from Mexico that are focused on international business and trade. (2) Find sites from private parties that offer market analysis services.

In Phase 1, nineteen Spanish-speaking students recruited from undergraduate and graduate courses at a university in the United States participated as volunteer subjects. Each subject used SBizPort and BIWE to perform the assigned search and browse tasks. BIWE (Buscador en Internet para la web en Español, http://www.biwe.com/), the benchmark search engine chosen for comparison with SBizPort in this phase, is a major Spanish search engine providing information for the Spanish-speaking community. Compared with other Spanish search engines reviewed in Section 2.2, BIWE’s services are more comprehensive because this search engine categorizes search results and supports meta-searching. These functions are typically not found in other Spanish search engines. Established in 1995 and headquartered in Spain, BIWE has been available to Spanish-speaking communities in Spain and Latin America much longer than other search engines and has been widely adopted by Hispanics.

In Phase 2, twenty-three Spanish-speaking students (different from those used in Phase 1) recruited from undergraduate courses in computer information systems at a university in the United States participated as volunteer subjects. Each subject used the Web directories of SBizPort and Yahoo Español to browse Spanish business information. Yahoo Español (http://espanol.yahoo.com/) was chosen as the benchmark because its directory is more comprehensive (in terms of the numbers of categories and items in these categories) than other directories in the search engines and Web directories reviewed in Section 2.2. To enable a fair comparison, we restricted subjects to browsing only within each directory. They were not allowed to perform keyword searching in either the SBizPort directory or the YahooES directory.

In each phase’s one-hour experiment, we introduced the two systems (SBizPort and the benchmark system) to each subject and randomly assigned two different scenarios (see Table 2) to evaluate the systems. Before a subject used a system, the experimenter showed him or her how to perform sample experimental tasks using that system. In Phase 1, we used two search tasks and one browse task from each scenario. Subjects took an average of three minutes to finish a search task and eight minutes to finish a browse task. In Phase 2, we used only two browse tasks in each scenario to study the browse performance of the Web directories. The order in which the systems were used was randomly assigned to avoid bias (e.g., recency or primacy effects) owing to sequence of use.

The experiment and the questionnaires about search and browse tasks were administered in Spanish. An English version of the questionnaire was used among our research team (consisting of Spanish and non-Spanish speakers) to evaluate the suitability of questions. Translation of the questionnaire was done by a native Spanish speaker to ensure its correctness. To further ensure no loss of meaning in the translation, we invited another native Spanish speaker to verify that the two versions were semantically equivalent. Post-session and post-study questionnaires (which contain all the scales for subject rating) were administered in English as used in previous studies in order to preserve the psychometric properties of the instruments (Davis, 1989; Lewis, 1995). Each subject provided ratings and comments on a system in a post-session questionnaire immediately after using it, so as to ensure a fresh memory of the system features. The experimenter recorded all verbal comments and behavioral observations, which were later analysed using protocol analysis (Ericsson and Simon, 1993).
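The random assignment of scenarios and system order described above can be sketched as follows. This is an illustration only: the paper does not say how the randomization was carried out, so the function name, the use of Python's random module, and the seed are assumptions.

```python
import random

# Scenario names taken from Table 2.
SCENARIOS = [
    "AOL in Latin America",
    "MERCOSUR",
    "Electronic commerce",
    "NAFTA",
]


def assign_session(systems: tuple[str, str],
                   rng: random.Random) -> list[tuple[str, str]]:
    """Plan one subject's session as [(system, scenario), (system, scenario)].

    The order of the two systems is shuffled to counter sequence effects,
    and two different scenarios are drawn, one per system.
    """
    order = list(systems)
    rng.shuffle(order)                      # which system is used first
    scenarios = rng.sample(SCENARIOS, k=2)  # two distinct scenarios
    return list(zip(order, scenarios))


if __name__ == "__main__":
    rng = random.Random(2006)  # fixed seed only to make the demo repeatable
    for subject in range(1, 4):
        print(subject, assign_session(("SBizPort", "BIWE"), rng))
```

Pure random assignment of this kind does not guarantee that each system-scenario pairing occurs equally often; a counterbalanced (Latin square) design would, at the cost of fixing the sample size in advance.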

4.2. Performance measures

Upon finishing the study, a subject filled in a post-study questionnaire to compare the two systems and to rate each system in terms of information quality, cross-regional search capability, and overall satisfaction. A seven-point Likert scale was used in these ratings. To measure information quality, we modified the 16-dimension construct developed in Wang and Strong (1996) by dropping the “security” dimension, which was not relevant because the information provided by the systems is already public. Because the remaining 15 dimensions may have different impact on information quality in our chosen domain, we invited a Spanish business expert to provide ratings on the relative importance of different dimensions (see Table 3). These ratings were used to distinguish the different levels of importance of the 15 dimensions. A native Spanish speaker, the expert was a senior management consultant in Mexico and had 24 years of experience in business development, raising capital, negotiations, finance, and strategic planning. He also had worked as the Vice President of Business Development for the Gallup Organization in Mexico. As had been done by Marshall et al. (2004), we summarized the 15 dimensions of information

Table 3
Definitions of 15 dimensions of information quality and expert rating

| Dimension | Definition | Expert rating(a) |
|---|---|---|
| Presentation quality and clarity | | |
| Accessibility | The extent to which information is available, or easily and quickly retrievable | 3 |
| Concise representation | The extent to which information is compactly represented | 3 |
| Consistent representation | The extent to which information is presented in the same format | 3 |
| Ease of manipulation | The extent to which information is easy to manipulate and apply to different tasks | 3 |
| Interpretability | The extent to which information is in appropriate languages, symbols, and units, and the definitions are clear | 2 |
| Coverage and reliability | | |
| Appropriate amount of information | The extent to which the volume of information is appropriate for the task at hand | 2 |
| Believability | The extent to which information is regarded as true and credible | 2 |
| Completeness | The extent to which information is not missing and is of sufficient breadth and depth for the task at hand | 3 |
| Free-of-error | The extent to which information is correct and reliable | 2 |
| Objectivity | The extent to which information is unbiased, unprejudiced, and impartial | 2 |
| Usability and analysis quality | | |
| Relevancy | The extent to which information is applicable and helpful for the task at hand | 3 |
| Reputation | The extent to which information is highly regarded in terms of its source or content | 3 |
| Timeliness | The extent to which information is sufficiently up-to-date for the task at hand | 3 |
| Understandability | The extent to which information is easily comprehended | 3 |
| Value-added | The extent to which information is beneficial and provides advantages from its use | 3 |

(a) Expert rating: 3 = extremely important, 2 = very important, 1 = important.
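The expert ratings in Table 3 act as importance weights when the dimension scores of a category are combined. The sketch below shows one plausible reading of that aggregation, a weighted average normalized by the sum of the weights. The exact formula is not spelled out in the text, so the normalization is an assumption, and the subject scores are invented for illustration. Only the weights for the “presentation quality and clarity” category come from Table 3.

```python
# Expert ratings (weights) for "presentation quality and clarity", from Table 3.
PRESENTATION_WEIGHTS = {
    "Accessibility": 3,
    "Concise representation": 3,
    "Consistent representation": 3,
    "Ease of manipulation": 3,
    "Interpretability": 2,
}


def weighted_category_score(scores, weights):
    """Weighted average of dimension scores (assumed aggregation rule).

    scores  -- dimension name -> rating on the 7-point scale (1 = best)
    weights -- dimension name -> expert importance rating
    """
    total_weight = sum(weights[d] for d in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(weights[d] * scores[d] for d in scores) / total_weight


# Hypothetical ratings from a single subject (not data from the study).
example_scores = {
    "Accessibility": 2,
    "Concise representation": 3,
    "Consistent representation": 2,
    "Ease of manipulation": 2,
    "Interpretability": 1,
}
category_score = weighted_category_score(example_scores, PRESENTATION_WEIGHTS)
print(round(category_score, 2))
```

Normalizing by the total weight keeps the result on the original 1 to 7 scale, which is consistent with the range of the category means reported later, although other weighting schemes would also fit the description.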


quality into 3 categories: presentation quality and clarity, coverage and reliability, and usability and analysis quality. The mean rating for each category was obtained by multiplying the weighted expert rating with the average score of that category. We also asked each subject to provide ratings on several attributes of each system, including system usefulness, ease of use, and information display and interface design, based on the items in the questionnaires developed in Davis (1989) and Lewis (1995). The subjects also provided demographic information, which was kept confidential in accordance with the Institutional Review Board Guidebook (Penslar, 2001).

4.3. Hypothesis testing

Because SBizPort encompasses Web resources from different Spanish regions, we anticipated that it would provide richer content and higher usability than those of the benchmark systems and that users could find relevant results more quickly from our portal and provide better ratings in different dimensions. Therefore, we established the following hypotheses:

H1. SBizPort provides higher information quality (in terms of presentation quality and clarity, coverage and reliability and usability and analysis capability) than a benchmark system.

H2. SBizPort provides better cross-regional search capability than a benchmark system.

H3. SBizPort achieves better user ratings in system usefulness, ease of use and information display and interface design than a benchmark system.


H4. SBizPort users experience a higher overall satisfaction than users of a benchmark system.

As each subject was asked to perform similar tasks using our portal and the benchmark system, we used a one-factor repeated-measures design, which gives greater precision than designs that employ only between-subjects factors (Myers and Well, 1995). The unit of analysis was task.

5. Experimental results and implications

This section reports and discusses the results of our user evaluation study. Table 4 summarizes the statistical results of testing various hypotheses using pairwise t-tests on the sample means. Table 5 summarizes subjects’ demographic profiles. Appendix A in Section 9 provides frequencies and percentages of subjects’ choices among different Likert-scale categories. In general, we found that subjects rated SBizPort more favorably than BIWE and YahooES in most dimensions. We explain the results in the following paragraphs (throughout our discussion, we use the terms “significant” or “significantly” to mean “statistically significant” or “statistically significantly”, respectively).

5.1. Information quality

In both phases of the experiment, subjects rated the information quality of SBizPort to be significantly higher than that of the benchmark systems, showing that the information provided by SBizPort enabled them to perform the tasks more effectively. Among all three categories of dimensions on information quality (see Table 3 for the three categories), SBizPort got higher mean ratings. In Phase 1, the mean ratings on “coverage and reliability” and “usability and analysis capability” were significantly

Table 4
Statistical results of hypothesis testing (each cell gives Mean(a) with S.D. in parentheses)

| Hypothesis | Measure | Phase 1: SBizPort | Phase 1: BIWE | Phase 1: p | Phase 2: SBizPort | Phase 2: Yahoo Español | Phase 2: p | Result |
|---|---|---|---|---|---|---|---|---|
| H1 | Information quality (overall) | 2.1 (0.66) | 2.9 (1.1) | 0.01(b) | 2.6 (1.1) | 3.1 (1.0) | 0.03(b) | Partially supported |
| | Presentation quality and clarity | 2.3 (0.78) | 2.9 (1.3) | 0.08 | 2.4 (1.3) | 3.2 (1.0) | 0.01(b) | |
| | Coverage and reliability | 2.2 (0.63) | 3.0 (1.1) | 0.01(b) | 2.9 (1.2) | 3.2 (0.8) | 0.19 | |
| | Usability and analysis quality | 1.98 (0.76) | 2.9 (1.1) | 0.00(b) | 2.4 (1.1) | 3.0 (1.3) | 0.04(b) | |
| H2 | Cross-regional search capability | 1.7 (0.81) | 3.2 (1.6) | 0.00(b) | 3.0 (2.0) | 4.0 (1.8) | 0.05(b) | Supported |
| H3a | System usefulness | 2.5 (0.62) | 3.5 (1.7) | 0.00(b) | 2.9 (1.4) | 4.0 (1.6) | 0.01(b) | Supported |
| H3b | Ease of use | 1.97 (0.63) | 3.0 (1.3) | 0.00(b) | 2.6 (1.5) | 3.4 (1.6) | 0.02(b) | |
| H3c | Information display and interface design | 2.2 (0.65) | 3.1 (1.2) | 0.00(b) | 2.6 (1.4) | 3.6 (1.2) | 0.01(b) | |
| H4 | User satisfaction (overall) | 1.8 (0.76) | 3.1 (1.7) | 0.00(b) | 2.8 (1.7) | 4.0 (1.8) | 0.01(b) | Supported |

(a) The range of rating is from 1 to 7, with 1 being the best.
(b) The alpha error was 5%.
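The comparisons in Table 4 rest on paired (repeated-measures) t-tests, in which each subject rates both systems. As a minimal sketch of how such a statistic is formed, the function below computes the t value and degrees of freedom from two lists of paired ratings. The ratings are hypothetical, not data from the study, and the function name `paired_t` is an illustrative choice. Obtaining a p-value additionally requires the t distribution, which a statistics package would supply.

```python
import math
from statistics import mean, stdev


def paired_t(ratings_a, ratings_b):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    # Standard error of the mean difference (sample standard deviation).
    std_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_error, n - 1


# Hypothetical 7-point ratings from five subjects (1 = best).
portal = [2, 1, 3, 2, 2]
benchmark = [3, 3, 4, 2, 4]
t_value, dof = paired_t(portal, benchmark)
print(t_value, dof)
```

Because 1 is the best rating on this scale, a negative t value here means the first system received the more favorable ratings.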


Table 5
Subjects’ demographic profile

| Demographic information | Phase 1 (total: 19) | Phase 2 (total: 23) |
|---|---|---|
| Country of origin | Mexico (12), USA (3), Panama (1), Puerto Rico (1), Colombia (1), Peru (1) | Mexico (9), USA (14) |
| Education | Undergraduate (13), bachelor earned (2), master earned (3), doctorate earned (1) | Undergraduate (21), associate degree earned (1), bachelor earned (1) |
| Age range | 18–25 (14), 26–30 (2), 31–35 (2), 41–50 (1) | 18–25 (9), 26–30 (8), 31–35 (1), 36–40 (2), 41–50 (3) |
| Gender | Female (10), Male (9) | Female (7), Male (16) |
| Hours of using computer per week | <5 (1), 5–10 (2), 10–15 (1), 15–20 (3), 20–25 (9), 30–35 (1), >35 (2) | <5 (2), 10–15 (4), 15–20 (2), 20–25 (6), 30–35 (2), >35 (7) |
| Average working experience | 1.7 years | 7.8 years |

higher. We believed that SBizPort’s comprehensive domain collection contributed to better information coverage and that the various browse support tools (summarizer, categorizer and visualizer) contributed to the better rating on analysis capability in Phase 1. On the other hand, the difference between the mean ratings of the two systems on “presentation quality and clarity” was not significant, although SBizPort’s mean rating was higher. This was perhaps because subjects were not used to new forms of result presentation (e.g., folder, map) shown in SBizPort. In Phase 2, SBizPort’s mean ratings on “presentation quality and clarity” and “usability and analysis capability” were significantly higher than those of YahooES. We believe that the SBizPort directory categorized related Web sites clearly and hence achieved a high presentation quality. The many levels of the SBizPort directory (see Fig. 1(f)) and their wide coverage of different business topics also supported searching and browsing of comprehensive Web resources. On the other hand, the SBizPort directory needs to be refreshed because some links to external sites were broken, thus undermining its reliability. We conclude that H1 was partially supported.

5.2. Cross-regional search capability

SBizPort achieved a significantly higher mean rating than BIWE and YahooES on the cross-regional search capability, thus supporting H2. We believe that SBizPort and its directory provided a wide range of information sources for subjects to choose from and its domain-specific collection helped subjects more effectively find answers to the tasks. Because the Spanish business intelligence domain encompasses a wide array of industries, countries and companies, a comprehensive Web portal such as SBizPort would enable users to search more easily across different regions and obtain the right information.

5.3. System performance attributes

The ratings of SBizPort on system usefulness, ease of use and information display and interface design were all significantly higher than those of BIWE and YahooES. The mean differences ranged from 0.8 to 1.1. These encouraging results demonstrate the high usability of SBizPort in searching and browsing Spanish business information. In particular, SBizPort obtained a very favorable rating on “ease of use” in Phase 1, demonstrating the friendliness of the system and its intuitive search and browse features. We conclude that H3 was supported.

5.4. User satisfaction, behavioral observations and comments

In addition, subjects rated SBizPort significantly better than BIWE and YahooES in terms of overall satisfaction, thus supporting H4. Subjects were more satisfied with SBizPort than the benchmark systems. We believe that several aspects of SBizPort contributed to its good performance: the high-quality meta-searchers and domain-specific collection used in SBizPort, the useful browse support tools, the comprehensive cross-regional coverage, and the well-organized Web directory. Subjects’ preferences on different applications of the systems are shown in Table 6, which shows an overwhelming preference for SBizPort across the different functions.

The subjects in Phase 1 commented positively on SBizPort’s search and browse capabilities. Their written comments revealed that twelve subjects agreed that SBizPort was very useful for searching Spanish business information. For instance, a subject said that SBizPort “is very useful for searching,” and “(the information) is clear.” Another subject said “For specific topics (SBizPort) gave out specific results, making the searches better than other search engines.” The subjects also liked the browse support tools provided by SBizPort. A majority of seventeen subjects provided positive comments on them.
For example, a subject said that SBizPort was “really nice to have different functions and have a catalog.” Another subject said that the browse tools “made it easy to view retrieved data.” Regarding the cross-regional search performance, fifteen subjects commented that SBizPort did a good job or had a greater variety than the benchmark search engine. For example, a subject said: “(SBizPort) gives lots of pages related to what I look for from different countries.” Another subject said “(SBizPort) looks with more information and (is) able to provide in detail.” However, five subjects complained about the low speed of the system, especially when retrieving information from many meta-searchers.

In contrast, the subjects were dissatisfied with BIWE’s lack of relevance and clarity in searching and browsing. For example, a subject said that BIWE “gives irrelevant pages (of) other countries I’m not interested in.” Another subject said that it was “time-consuming” to use BIWE. Moreover, most users did not like the presence of pop-up advertisements when using BIWE. Nevertheless, six subjects said that BIWE was useful for searching Spanish business information. Three subjects commented that the system was easy to use and fast.

Table 6
Subjects’ preferences on different applications (number of subjects who preferred each system)

| Application | Phase 1: SBizPort | Phase 1: BIWE | Phase 1: Others | Phase 2: SBizPort | Phase 2: YahooES | Phase 2: Others |
|---|---|---|---|---|---|---|
| To search for specific Spanish company information | 16 | 3 | 0 | 15 | 4 | 4 |
| To search for information from various Spanish business information sources | 16 | 2 | 1 | 18 | 1 | 4 |
| To categorize a large number of Spanish business search results into meaningful groups | 15 | 3 | 1 | 16 | 5 | 2 |
| To visualize a large number of Spanish business search results | 14 | 4 | 1 | 12 | 6 | 5 |
| To search for cross-regional Spanish business information | 15 | 3 | 1 | 13 | 8 | 2 |

The subjects in Phase 2 provided favorable comments on the SBizPort directory. Sixteen subjects said that the organization of the categories was good and the directory provided sufficient information. For instance, a subject said: “It(SBizPort)’s very compact and straight-forward, each directory offers extensive information under each category. Then each category offers the precise page with a small description of that page.” Another subject said that the “links are clearly categorized for more efficient use.” The subjects also liked the user-friendliness and ease of navigation of SBizPort. Nine subjects commented positively on it. For instance, a subject said that SBizPort’s “categories make it easier to browse” and another subject said “not having too much (many) options (in SBizPort) is helpful to the user because it makes it easy to look for the information.” However, some subjects complained about the dead links (to external sites) provided by SBizPort. Some would like to be able to perform keyword searching within the directory.

In comparison, subjects were less satisfied with YahooES than SBizPort. They had difficulty obtaining precise information from YahooES. Ten subjects said that YahooES had too many options to choose from and hence distracted them from finding relevant information. For instance, a subject said it was “extremely difficult to obtain information required (in YahooES), not easy to navigate, easy to get lost, to a point that you have to start all over again.” Another subject said “it is very hard to search for specific information. It (YahooES) has lots of worthless menus ….” A subject said “I was not able to find the information requested” and another subject said “I was not able to locate the information quickly.” The subjects also complained about the dead links in YahooES and would like to perform keyword searching in the YahooES directory. However, some subjects commented positively on the familiar interface of YahooES and the quick response of the system.

5.5. Subject profiles

Although most results were favorable to SBizPort, readers should use caution in interpreting these results because of the background of the subjects and their profiles (see Table 5). Table 7 shows the mean self-ratings on subjects’ background. On average, the subjects in Phase 1 had 1.7 years of experience working in industries. Twelve (out of 19) had no industry experience at all. While these profiles imply a young subject group that may rely heavily on the Web for information, the lack of industry experience might suggest potential problems in their ability to analyze business information and in fully benefiting from the search portals. In contrast, the subjects in Phase 2 had an average of 7.8 years’ working experience. One subject had 37 years of working experience, another had 22 years and several others had more than 10 years of working experience. Only six (out of 23) subjects had no industry experience. These figures show that the subjects in Phase 2 had substantially richer industry experience than those in Phase 1, and thus might be more able to analyze business information. Moreover, nine subjects said that they spent more than thirty hours using computers per week (compared with only three subjects in Phase 1). This may indicate that the subjects in Phase 2 were more tech-savvy than those in Phase 1, thus more likely to benefit from the systems.

Table 7
Subjects’ self-rating on their profiles

| Profile | Phase 1: Mean | Phase 1: S.D. | Phase 2: Mean | Phase 2: S.D. |
|---|---|---|---|---|
| “I am good at searching for information on the Web” | 2.7 | 1.4 | 2.0 | 1.0 |
| “I strongly rely on the Web to search for information” | 2.0 | 1.4 | 1.9 | 1.4 |

Note: the rating was based on a 7-point Likert scale where “1” means “strongly agree.”

5.6. Implications of the results

The encouraging results from our experiment demonstrate the usability of the portal in supporting non-English Web searching and browsing. We believe that the portal’s high information quality, comprehensiveness in content coverage, useful functionality and user-friendly interface contributed to the results. These important components help users who need to search for information from widely scattered regions in a language used by a multitude of countries and places. Given that the Internet will likely become more and more international in the future (O’Neill et al., 2003), we believe this research has shed light on information seeking and human analysis on the non-English Web. Because previous research has implicitly assumed English to be the major language on the Web, this research provides new impetus for the study of HCI on the Web. Rapidly emerging issues related to non-English Web searching such as search engine development, information quality, cross-regional search capability and browse support have been explored in this study. While major languages like English and Chinese will still be important on the Web, the notion of a “multilingual Web” is expected to continue to draw attention from HCI researchers and practitioners in the future.

This research has also provided practical guidelines for system developers. Because the development of SBizPort has incorporated various advanced techniques for Web searching and analysis, it helps system developers easily customize the search engine development to a specific language. A system developer needs only to provide regional information such as seed URLs, major information sources for meta-spidering and meta-searching and interface requirements. To our knowledge, not many existing portals support searching and analysis in non-English domains (such as the Spanish business domain), where rapid future growth is expected. SBizPort helps to bridge the gap between the large amount of Spanish business information on the Web and the need for searching and analysis of this information.

6. Conclusions and future directions

The rapid growth of non-English on-line populations has suggested a need for better support of searching for non-English Web content. However, prior research implicitly assumed English to be the primary language for Web searching, which may not be true as many other languages gain popularity on the Web.

6.1. Conclusions

Unlike previous work, this research explored information seeking on the non-English Web by empirically studying a Spanish business Web portal’s support of non-English Web searching and browsing. Results of a two-phase experiment involving 42 Spanish speakers show that the portal significantly outperformed the benchmark systems in terms of cross-regional search capability, information quality, system performance attributes and user satisfaction. Subjects very much preferred our portal to the benchmark systems in all types of applications. This research thus contributes to (1) exploring information seeking on the non-English Web, a subject often ignored in prior work, (2) providing an example of supporting Web searching in a non-English domain, and (3) understanding the information quality, cross-regional search capability and usability of a Spanish business Web portal.

6.2. Limitations of the study

This research could have been improved in a number of ways. As our portal is a research prototype, the speed and stability are generally not as good as commercial search engines and Web directories such as the chosen benchmarks. This explains why several subjects complained about the slow responses of our system. Besides, we are limited by the scarce prior work on non-English Web searching, which prevented a more comprehensive review on this topic that possibly would offer better criteria for designing the system. As for the user study, we had difficulty in recruiting more Spanish speakers from different regions and experienced Spanish-speaking business professionals to serve as our subjects. Future work may consider expanding the sample size to establish a higher statistical confidence from the experimental results and to cover more Spanish-speaking regions.

6.3. Future directions

Several directions are being pursued to extend this research. As the notion of “multilingual Web” continues to draw attention, we are exploring new HCI issues on non-English Web searching to study how non-English on-line users can be better supported. For instance, multinational corporations (MNCs) typically provide information on their Web sites in different languages. Analyzing the relationships of MNCs with their multinational stakeholders (Chung, 2005) should clarify a holistic picture of how they stand in the international arena. The resulting business intelligence from stakeholders will serve to guide their global development strategies. Furthermore, information visualization has been shown to be promising in previous work (e.g., Chung et al., 2005a, b). We will study how new information visualization techniques can support browsing and comprehending massive non-English information on the Web.


Acknowledgments

This research was partly supported by funding from The University of Texas at El Paso. We thank all the contributors of system development and user study, and all the human subjects in the experiment.

Appendix A

Frequencies and percentages of subjects’ choices among different Likert-scale categories are shown in Table A1.
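The frequency and percentage entries in Table A1 can be derived from raw responses as in the following minimal sketch. The response list is reconstructed for illustration from the counts reported for the first usefulness statement for SBizPort in Phase 1 (3, 7 and 9 of the 19 subjects in categories 1, 2 and 3). The function name and the category labels are illustrative assumptions.

```python
from collections import Counter

# Likert categories as laid out in Table A1 (1 = strongly agree).
CATEGORIES = (1, 2, 3, 4, 5, 6, 7, "No comment")


def likert_summary(responses, categories=CATEGORIES):
    """Map each category to (frequency, percentage rounded to one decimal)."""
    if not responses:
        raise ValueError("no responses to summarize")
    counts = Counter(responses)
    total = len(responses)
    return {
        c: (counts.get(c, 0), round(100 * counts.get(c, 0) / total, 1))
        for c in categories
    }


# Individual responses rebuilt from the reported counts (illustrative).
responses = [1] * 3 + [2] * 7 + [3] * 9
summary = likert_summary(responses)
print(summary[1], summary[2], summary[3])
```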

Table A1
Frequencies and Percentages of Subjects’ Ratings for SBizPort in Phase 1
Each cell gives frequency (percentage). Likert-scale categories: 1 = strongly agree.

| Statement | 1 | 2 | 3 | 4 | 5 | 6 | 7 | No comment |
|---|---|---|---|---|---|---|---|---|
| Usefulness | | | | | | | | |
| Using the system in my job would enable me to accomplish tasks more quickly. | 3 (15.8) | 7 (36.8) | 9 (47.4) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Using the system would improve my job performance. | 2 (10.5) | 7 (36.8) | 7 (36.8) | 3 (15.8) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Using the system in my job would increase my productivity. | 1 (5.3) | 7 (36.8) | 10 (52.6) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Using the system would enhance my effectiveness on the job. | 1 (5.3) | 7 (36.8) | 10 (52.6) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Using the system would make it easier to do my job. | 3 (15.8) | 6 (31.6) | 8 (42.1) | 2 (10.5) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| I would find the system useful in my job. | 4 (21.1) | 8 (42.1) | 4 (21.1) | 2 (10.5) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Ease of use | | | | | | | | |
| Learning to operate the system would be easy for me. | 6 (31.6) | 10 (52.6) | 3 (15.8) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| I would find it easy to get the system to do what I want it to do. | 4 (21.1) | 13 (68.4) | 0 (0) | 2 (10.5) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| My interaction with the system would be clear and understandable. | 2 (10.5) | 11 (57.9) | 5 (26.3) | 0 (0) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| I would find the system to be flexible to interact with. | 5 (26.3) | 9 (47.4) | 3 (15.8) | 2 (10.5) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| It would be easy for me to become skillful at using the system. | 9 (47.4) | 5 (26.3) | 5 (26.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| I would find the system easy to use. | 9 (47.4) | 6 (31.6) | 3 (15.8) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Information display and interface design | | | | | | | | |
| The information provided for the system is easy to understand. | 4 (21.1) | 10 (52.6) | 4 (21.1) | 0 (0) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| The information is effective in helping me complete the tasks and scenarios. | 8 (42.1) | 6 (31.6) | 5 (26.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| The organization of information on the system screens is clear. | 5 (26.3) | 6 (31.6) | 6 (31.6) | 1 (5.3) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| The interface of this system is pleasant. | 6 (31.6) | 6 (31.6) | 5 (26.3) | 2 (10.5) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| I like using the interface of this system. | 4 (21.1) | 9 (47.4) | 5 (26.3) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| This system has all the functions and capabilities I expect it to have. | 1 (5.3) | 10 (52.6) | 6 (31.6) | 1 (5.3) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Information quality | | | | | | | | |
| Accessibility | 3 (15.8) | 10 (52.6) | 4 (21.1) | 1 (5.3) | 0 (0) | 1 (5.3) | 0 (0) | 0 (0) |
| Appropriate amount of information | 5 (26.3) | 9 (47.4) | 3 (15.8) | 1 (5.3) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Believability | 7 (36.8) | 6 (31.6) | 6 (31.6) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Completeness | 2 (10.5) | 8 (42.1) | 7 (36.8) | 2 (10.5) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Concise representation | 2 (10.5) | 6 (31.6) | 8 (42.1) | 2 (10.5) | 0 (0) | 1 (5.3) | 0 (0) | 0 (0) |
| Consistent representation | 4 (21.1) | 8 (42.1) | 6 (31.6) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |


Table A1 (continued)
Frequencies and Percentages of Subjects’ Ratings for SBizPort in Phase 1
Each cell gives frequency (percentage). Likert-scale categories: 1 = strongly agree.

| Statement | 1 | 2 | 3 | 4 | 5 | 6 | 7 | No comment |
|---|---|---|---|---|---|---|---|---|
| Information quality (continued) | | | | | | | | |
| Ease of manipulation | 6 (31.6) | 9 (47.4) | 2 (10.5) | 1 (5.3) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Free-of-error | 3 (15.8) | 6 (31.6) | 9 (47.4) | 0 (0) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Interpretability | 7 (36.8) | 8 (42.1) | 3 (15.8) | 0 (0) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Objectivity | 6 (31.6) | 8 (42.1) | 3 (15.8) | 2 (10.5) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Relevancy | 6 (31.6) | 9 (47.4) | 1 (5.3) | 2 (10.5) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Reputation | 5 (26.3) | 7 (36.8) | 4 (21.1) | 1 (5.3) | 1 (5.3) | 1 (5.3) | 0 (0) | 0 (0) |
| Timeliness | 8 (42.1) | 9 (47.4) | 1 (5.3) | 18 (94.7) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) |
| Understandability | 9 (47.4) | 9 (47.4) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (5.3) |
| Value-added | 6 (31.6) | 8 (42.1) | 4 (21.1) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Cross-regional search capability | | | | | | | | |
| I am satisfied with the cross-regional search capability of this system. | 8 (42.1) | 9 (47.4) | 1 (5.3) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Overall satisfaction | | | | | | | | |
| Overall, I am satisfied with this system. | 6 (31.6) | 11 (57.9) | 1 (5.3) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |

Frequencies and Percentages of Subjects’ Ratings for BIWE in Phase 1
Each cell gives frequency (percentage). Likert-scale categories: 1 = strongly agree.

| Statement | 1 | 2 | 3 | 4 | 5 | 6 | 7 | No comment |
|---|---|---|---|---|---|---|---|---|
| Usefulness | | | | | | | | |
| Using the system in my job would enable me to accomplish tasks more quickly. | 0 (0) | 6 (31.6) | 5 (26.3) | 3 (15.8) | 0 (0) | 3 (15.8) | 2 (10.5) | 0 (0) |
| Using the system would improve my job performance. | 0 (0) | 7 (36.8) | 7 (36.8) | 1 (5.3) | 1 (5.3) | 0 (0) | 3 (15.8) | 0 (0) |
| Using the system in my job would increase my productivity. | 1 (5.3) | 5 (26.3) | 4 (21.1) | 4 (21.1) | 2 (10.5) | 0 (0) | 3 (15.8) | 0 (0) |
| Using the system would enhance my effectiveness on the job. | 1 (5.3) | 4 (21.1) | 7 (36.8) | 3 (15.8) | 1 (5.3) | 1 (5.3) | 2 (10.5) | 0 (0) |
| Using the system would make it easier to do my job. | 1 (5.3) | 5 (26.3) | 7 (36.8) | 2 (10.5) | 1 (5.3) | 1 (5.3) | 2 (10.5) | 0 (0) |
| I would find the system useful in my job. | 3 (15.8) | 7 (36.8) | 1 (5.3) | 4 (21.1) | 1 (5.3) | 1 (5.3) | 2 (10.5) | 0 (0) |
| Ease of use | | | | | | | | |
| Learning to operate the system would be easy for me. | 5 (26.3) | 7 (36.8) | 5 (26.3) | 0 (0) | 0 (0) | 1 (5.3) | 1 (5.3) | 0 (0) |
| I would find it easy to get the system to do what I want it to do. | 2 (10.5) | 5 (26.3) | 6 (31.6) | 3 (15.8) | 0 (0) | 3 (15.8) | 0 (0) | 0 (0) |
| My interaction with the system would be clear and understandable. | 1 (5.3) | 8 (42.1) | 3 (15.8) | 3 (15.8) | 1 (5.3) | 3 (15.8) | 0 (0) | 0 (0) |
| I would find the system to be flexible to interact with. | 1 (5.3) | 4 (21.1) | 8 (42.1) | 2 (10.5) | 2 (10.5) | 2 (10.5) | 0 (0) | 0 (0) |
| It would be easy for me to become skillful at using the system. | 3 (15.8) | 5 (26.3) | 6 (31.6) | 2 (10.5) | 1 (5.3) | 2 (10.5) | 0 (0) | 0 (0) |
| I would find the system easy to use. | 1 (5.3) | 6 (31.6) | 9 (47.4) | 1 (5.3) | 0 (0) | 2 (10.5) | 0 (0) | 0 (0) |
| Information display and interface design | | | | | | | | |
| The information provided for the system is easy to understand. | 9 (47.4) | 6 (31.6) | 4 (21.1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |


Table A1 (continued)
Frequencies and Percentages of Subjects’ Ratings for BIWE in Phase 1
Each cell gives frequency (percentage). Likert-scale categories: 1 = strongly agree.

| Statement | 1 | 2 | 3 | 4 | 5 | 6 | 7 | No comment |
|---|---|---|---|---|---|---|---|---|
| Information display and interface design (continued) | | | | | | | | |
| The information is effective in helping me complete the tasks and scenarios. | 1 (5.3) | 8 (42.1) | 5 (26.3) | 1 (5.3) | 3 (15.8) | 0 (0) | 1 (5.3) | 0 (0) |
| The organization of information on the system screens is clear. | 3 (15.8) | 5 (26.3) | 4 (21.1) | 2 (10.5) | 4 (21.1) | 1 (5.3) | 0 (0) | 0 (0) |
| The interface of this system is pleasant. | 3 (15.8) | 5 (26.3) | 4 (21.1) | 4 (21.1) | 3 (15.8) | 0 (0) | 0 (0) | 0 (0) |
| I like using the interface of this system. | 1 (5.3) | 7 (36.8) | 3 (15.8) | 4 (21.1) | 2 (10.5) | 2 (10.5) | 0 (0) | 0 (0) |
| This system has all the functions and capabilities I expect it to have. | 3 (15.8) | 3 (15.8) | 3 (15.8) | 5 (26.3) | 3 (15.8) | 1 (5.3) | 1 (5.3) | 0 (0) |
| Information quality | | | | | | | | |
| Accessibility | 3 (15.8) | 4 (21.1) | 5 (26.3) | 2 (10.5) | 3 (15.8) | 1 (5.3) | 18 (94.7) | 1 (5.3) |
| Appropriate amount of information | 2 (10.5) | 4 (21.1) | 7 (36.8) | 1 (5.3) | 3 (15.8) | 0 (0) | 1 (5.3) | 1 (5.3) |
| Believability | 4 (21.1) | 5 (26.3) | 4 (21.1) | 1 (5.3) | 2 (10.5) | 0 (0) | 1 (5.3) | 2 (10.5) |
| Completeness | 3 (15.8) | 5 (26.3) | 7 (36.8) | 1 (5.3) | 1 (5.3) | 1 (5.3) | 1 (5.3) | 1 (5.3) |
| Concise representation | 1 (5.3) | 7 (36.8) | 3 (15.8) | 4 (21.1) | 1 (5.3) | 1 (5.3) | 1 (5.3) | 1 (5.3) |
| Consistent representation | 1 (5.3) | 10 (52.6) | 3 (15.8) | 2 (10.5) | 1 (5.3) | 0 (0) | 1 (5.3) | 1 (5.3) |
| Ease of manipulation | 1 (5.3) | 10 (52.6) | 3 (15.8) | 1 (5.3) | 2 (10.5) | 0 (0) | 1 (5.3) | 1 (5.3) |
| Free-of-error | 2 (10.5) | 3 (15.8) | 6 (31.6) | 2 (10.5) | 2 (10.5) | 2 (10.5) | 0 (0) | 2 (10.5) |
| Interpretability | 2 (10.5) | 10 (52.6) | 2 (10.5) | 2 (10.5) | 1 (5.3) | 0 (0) | 1 (5.3) | 1 (5.3) |
| Objectivity | 3 (15.8) | 4 (21.1) | 7 (36.8) | 3 (15.8) | 1 (5.3) | 0 (0) | 0 (0) | 1 (5.3) |
| Relevancy | 1 (5.3) | 7 (36.8) | 3 (15.8) | 2 (10.5) | 3 (15.8) | 2 (10.5) | 0 (0) | 1 (5.3) |
| Reputation | 3 (15.8) | 4 (21.1) | 7 (36.8) | 1 (5.3) | 2 (10.5) | 0 (0) | 0 (0) | 2 (10.5) |
| Timeliness | 2 (10.5) | 7 (36.8) | 3 (15.8) | 3 (15.8) | 1 (5.3) | 1 (5.3) | 0 (0) | 2 (10.5) |
| Understandability | 4 (21.1) | 4 (21.1) | 6 (31.6) | 3 (15.8) | 1 (5.3) | 0 (0) | 0 (0) | 1 (5.3) |
| Value-added | 0 (0) | 8 (42.1) | 7 (36.8) | 1 (5.3) | 0 (0) | 0 (0) | 0 (0) | 3 (15.8) |
| Cross-regional search capability | | | | | | | | |
| I am satisfied with the cross-regional search capability of this system. | 2 (10.5) | 6 (31.6) | 5 (26.3) | 2 (10.5) | 2 (10.5) | 1 (5.3) | 1 (5.3) | 0 (0) |
| Overall satisfaction | | | | | | | | |
| Overall, I am satisfied with this system. | 3 (15.8) | 6 (31.6) | 3 (15.8) | 2 (10.5) | 4 (21.1) | 0 (0) | 1 (5.3) | 0 (0) |

Table A1 (continued). Frequencies and Percentages of Subjects’ Ratings for SBizPort in Phase 2

Cells show frequency (percentage). Likert-scale categories: 1 = strongly agree.

| Statement | 1 | 2 | 3 | 4 | 5 | 6 | 7 | No comment |
|---|---|---|---|---|---|---|---|---|
| Usefulness | | | | | | | | |
| Using the system in my job would enable me to accomplish tasks more quickly. | 4 (17.4) | 7 (30.4) | 3 (13) | 6 (26.1) | 2 (8.7) | 1 (4.3) | 0 (0) | 0 (0) |
| Using the system would improve my job performance. | 4 (17.4) | 7 (30.4) | 4 (17.4) | 3 (13) | 5 (21.7) | 0 (0) | 0 (0) | 0 (0) |
| Using the system in my job would increase my productivity. | 4 (17.4) | 7 (30.4) | 3 (13) | 5 (21.7) | 4 (17.4) | 0 (0) | 0 (0) | 0 (0) |
| Using the system would enhance my effectiveness on the job. | 4 (17.4) | 8 (34.8) | 2 (8.7) | 7 (30.4) | 2 (8.7) | 0 (0) | 0 (0) | 0 (0) |
| Using the system would make it easier to do my job. | 6 (26.1) | 4 (17.4) | 5 (21.7) | 6 (26.1) | 2 (8.7) | 0 (0) | 0 (0) | 0 (0) |
| I would find the system useful in my job. | 5 (21.7) | 6 (26.1) | 3 (13) | 4 (17.4) | 3 (13) | 2 (8.7) | 0 (0) | 0 (0) |
| Ease of use | | | | | | | | |
| Learning to operate the system would be easy for me. | 10 (43.5) | 7 (30.4) | 3 (13) | 2 (8.7) | 0 (0) | 1 (4.3) | 0 (0) | 0 (0) |
| I would find it easy to get the system to do what I want it to do. | 7 (30.4) | 5 (21.7) | 6 (26.1) | 2 (8.7) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 0 (0) |
| My interaction with the system would be clear and understandable. | 4 (17.4) | 9 (39.1) | 4 (17.4) | 2 (8.7) | 1 (4.3) | 2 (8.7) | 1 (4.3) | 0 (0) |
| I would find the system to be flexible to interact with. | 9 (39.1) | 4 (17.4) | 2 (8.7) | 3 (13) | 2 (8.7) | 2 (8.7) | 1 (4.3) | 0 (0) |
| It would be easy for me to become skillful at using the system. | 10 (43.5) | 4 (17.4) | 3 (13) | 2 (8.7) | 1 (4.3) | 1 (4.3) | 2 (8.7) | 0 (0) |
| I would find the system easy to use. | 10 (43.5) | 5 (21.7) | 4 (17.4) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 0 (0) |
| Information display and interface design | | | | | | | | |
| The information provided for the system is easy to understand. | 6 (26.1) | 9 (39.1) | 4 (17.4) | 1 (4.3) | 0 (0) | 2 (8.7) | 1 (4.3) | 0 (0) |
| The information is effective in helping me complete the tasks and scenarios. | 5 (21.7) | 10 (43.5) | 2 (8.7) | 1 (4.3) | 2 (8.7) | 2 (8.7) | 1 (4.3) | 0 (0) |
| The organization of information on the system screens is clear. | 8 (34.8) | 8 (34.8) | 3 (13) | 3 (13) | 1 (4.3) | 0 (0) | 0 (0) | 0 (0) |
| The interface of this system is pleasant. | 8 (34.8) | 7 (30.4) | 2 (8.7) | 2 (8.7) | 3 (13) | 1 (4.3) | 0 (0) | 0 (0) |
| I like using the interface of this system. | 6 (26.1) | 9 (39.1) | 0 (0) | 4 (17.4) | 1 (4.3) | 2 (8.7) | 1 (4.3) | 0 (0) |
| This system has all the functions and capabilities I expect it to have. | 6 (26.1) | 5 (21.7) | 3 (13) | 4 (17.4) | 2 (8.7) | 1 (4.3) | 2 (8.7) | 0 (0) |
| Information quality | | | | | | | | |
| Accessibility | 7 (30.4) | 4 (17.4) | 6 (26.1) | 3 (13) | 2 (8.7) | 1 (4.3) | 0 (0) | 0 (0) |
| Appropriate amount of information | 4 (17.4) | 6 (26.1) | 6 (26.1) | 3 (13) | 3 (13) | 1 (4.3) | 0 (0) | 0 (0) |
| Believability | 6 (26.1) | 7 (30.4) | 4 (17.4) | 5 (21.7) | 0 (0) | 1 (4.3) | 0 (0) | 0 (0) |
| Completeness | 2 (8.7) | 6 (26.1) | 5 (21.7) | 5 (21.7) | 1 (4.3) | 3 (13) | 1 (4.3) | 0 (0) |
| Concise representation | 9 (39.1) | 5 (21.7) | 4 (17.4) | 4 (17.4) | 1 (4.3) | 0 (0) | 0 (0) | 0 (0) |
| Consistent representation | 11 (47.8) | 6 (26.1) | 1 (4.3) | 3 (13) | 2 (8.7) | 0 (0) | 0 (0) | 0 (0) |
| Ease of manipulation | 9 (39.1) | 2 (8.7) | 4 (17.4) | 5 (21.7) | 2 (8.7) | 0 (0) | 1 (4.3) | 0 (0) |
| Free-of-error | 4 (17.4) | 4 (17.4) | 8 (34.8) | 3 (13) | 3 (13) | 0 (0) | 0 (0) | 1 (4.3) |
| Interpretability | 9 (39.1) | 5 (21.7) | 4 (17.4) | 3 (13) | 2 (8.7) | 0 (0) | 0 (0) | 0 (0) |
| Objectivity | 7 (30.4) | 8 (34.8) | 4 (17.4) | 1 (4.3) | 0 (0) | 0 (0) | 0 (0) | 3 (13) |
| Relevancy | 7 (30.4) | 6 (26.1) | 4 (17.4) | 4 (17.4) | 1 (4.3) | 1 (4.3) | 0 (0) | 0 (0) |
| Reputation | 7 (30.4) | 8 (34.8) | 4 (17.4) | 1 (4.3) | 1 (4.3) | 0 (0) | 1 (4.3) | 1 (4.3) |
| Timeliness | 6 (26.1) | 5 (21.7) | 5 (21.7) | 6 (26.1) | 0 (0) | 0 (0) | 0 (0) | 1 (4.3) |
| Understandability | 9 (39.1) | 5 (21.7) | 6 (26.1) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 0 (0) | 0 (0) |
| Value-added | 5 (21.7) | 9 (39.1) | 6 (26.1) | 3 (13) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Cross-regional search capability | | | | | | | | |
| I am satisfied with the cross-regional search capability of this system. | 6 (26.1) | 7 (30.4) | 2 (8.7) | 2 (8.7) | 3 (13) | 1 (4.3) | 2 (8.7) | 0 (0) |
| Overall satisfaction | | | | | | | | |
| Overall, I am satisfied with this system. | 5 (21.7) | 8 (34.8) | 2 (8.7) | 5 (21.7) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 0 (0) |
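Each percentage in the Phase 2 ratings is the frequency divided by the number of responses in the row (the frequencies in every row sum to 23), rounded to one decimal place. The short Python sketch below is only an illustration of that arithmetic, using the SBizPort overall-satisfaction row (frequencies 5, 8, 2, 5, 1, 1, 1 for categories 1 to 7, and 0 for "No comment"). The function name `summarize_row` is hypothetical, and the mean rating it derives is an added illustrative statistic, not a value taken from this appendix.

```python
def summarize_row(freqs):
    """Summarize one table row.

    freqs: eight counts, for Likert categories 1..7 followed by "No comment".
    Returns (total responses, percentages per column, mean rating over 1..7).
    """
    if len(freqs) != 8:
        raise ValueError("expected 7 Likert counts plus a 'No comment' count")
    total = sum(freqs)
    # Percentage of all responses per column, one decimal place as in the table.
    percentages = [round(100 * f / total, 1) for f in freqs]
    # The mean uses only the seven rated categories; "No comment" is excluded.
    rated = freqs[:7]
    n_rated = sum(rated)
    if n_rated == 0:
        raise ValueError("row contains no ratings")
    mean = sum(cat * f for cat, f in enumerate(rated, start=1)) / n_rated
    return total, percentages, mean


# SBizPort, Phase 2, "Overall, I am satisfied with this system."
total, percentages, mean = summarize_row([5, 8, 2, 5, 1, 1, 1, 0])
print(total)
print(percentages)
print(round(mean, 2))
```

Because 1 denotes "strongly agree", a lower mean indicates stronger agreement. Recomputing percentages this way is also a quick consistency check on a row: the frequencies should sum to the number of subjects, and each percentage should match its frequency.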

Table A1 (continued). Frequencies and Percentages of Subjects’ Ratings for YahooES in Phase 2

Cells show frequency (percentage). Likert-scale categories: 1 = strongly agree.

| Statement | 1 | 2 | 3 | 4 | 5 | 6 | 7 | No comment |
|---|---|---|---|---|---|---|---|---|
| Usefulness | | | | | | | | |
| Using the system in my job would enable me to accomplish tasks more quickly. | 2 (8.7) | 3 (13) | 4 (17.4) | 4 (17.4) | 6 (26.1) | 1 (4.3) | 3 (13) | 0 (0) |
| Using the system would improve my job performance. | 3 (13) | 0 (0) | 5 (21.7) | 7 (30.4) | 4 (17.4) | 2 (8.7) | 2 (8.7) | 0 (0) |
| Using the system in my job would increase my productivity. | 2 (8.7) | 2 (8.7) | 5 (21.7) | 6 (26.1) | 3 (13) | 3 (13) | 2 (8.7) | 0 (0) |
| Using the system would enhance my effectiveness on the job. | 2 (8.7) | 1 (4.3) | 9 (39.1) | 4 (17.4) | 2 (8.7) | 4 (17.4) | 1 (4.3) | 0 (0) |
| Using the system would make it easier to do my job. | 2 (8.7) | 3 (13) | 6 (26.1) | 2 (8.7) | 4 (17.4) | 3 (13) | 3 (13) | 0 (0) |
| I would find the system useful in my job. | 2 (8.7) | 4 (17.4) | 5 (21.7) | 3 (13) | 5 (21.7) | 1 (4.3) | 3 (13) | 0 (0) |
| Ease of use | | | | | | | | |
| Learning to operate the system would be easy for me. | 10 (43.5) | 4 (17.4) | 2 (8.7) | 3 (13) | 2 (8.7) | 1 (4.3) | 1 (4.3) | 0 (0) |
| I would find it easy to get the system to do what I want it to do. | 4 (17.4) | 3 (13) | 2 (8.7) | 4 (17.4) | 5 (21.7) | 3 (13) | 2 (8.7) | 0 (0) |
| My interaction with the system would be clear and understandable. | 1 (4.3) | 6 (26.1) | 4 (17.4) | 2 (8.7) | 3 (13) | 5 (21.7) | 2 (8.7) | 0 (0) |
| I would find the system to be flexible to interact with. | 2 (8.7) | 3 (13) | 6 (26.1) | 3 (13) | 3 (13) | 3 (13) | 3 (13) | 0 (0) |
| It would be easy for me to become skillful at using the system. | 8 (34.8) | 4 (17.4) | 3 (13) | 2 (8.7) | 4 (17.4) | 1 (4.3) | 1 (4.3) | 0 (0) |
| I would find the system easy to use. | 3 (13) | 6 (26.1) | 5 (21.7) | 2 (8.7) | 4 (17.4) | 2 (8.7) | 1 (4.3) | 0 (0) |
| Information display and interface design | | | | | | | | |
| The information provided for the system is easy to understand. | 3 (13) | 6 (26.1) | 7 (30.4) | 2 (8.7) | 1 (4.3) | 3 (13) | 1 (4.3) | 0 (0) |
| The information is effective in helping me complete the tasks and scenarios. | 2 (8.7) | 3 (13) | 6 (26.1) | 4 (17.4) | 3 (13) | 4 (17.4) | 1 (4.3) | 0 (0) |
| The organization of information on the system screens is clear. | 2 (8.7) | 5 (21.7) | 5 (21.7) | 4 (17.4) | 6 (26.1) | 1 (4.3) | 0 (0) | 0 (0) |
| The interface of this system is pleasant. | 3 (13) | 5 (21.7) | 5 (21.7) | 5 (21.7) | 4 (17.4) | 1 (4.3) | 0 (0) | 0 (0) |
| I like using the interface of this system. | 1 (4.3) | 3 (13) | 8 (34.8) | 5 (21.7) | 4 (17.4) | 2 (8.7) | 0 (0) | 0 (0) |
| This system has all the functions and capabilities I expect it to have. | 2 (8.7) | 2 (8.7) | 5 (21.7) | 3 (13) | 4 (17.4) | 3 (13) | 4 (17.4) | 0 (0) |
| Information quality | | | | | | | | |
| Accessibility | 1 (4.3) | 6 (26.1) | 2 (8.7) | 7 (30.4) | 2 (8.7) | 4 (17.4) | 1 (4.3) | 0 (0) |
| Appropriate amount of information | 1 (4.3) | 3 (13) | 6 (26.1) | 7 (30.4) | 3 (13) | 1 (4.3) | 2 (8.7) | 0 (0) |
| Believability | 3 (13) | 4 (17.4) | 13 (56.5) | 1 (4.3) | 2 (8.7) | 0 (0) | 0 (0) | 0 (0) |
| Completeness | 1 (4.3) | 4 (17.4) | 9 (39.1) | 2 (8.7) | 3 (13) | 3 (13) | 1 (4.3) | 0 (0) |
| Concise representation | 3 (13) | 5 (21.7) | 8 (34.8) | 5 (21.7) | 1 (4.3) | 1 (4.3) | 0 (0) | 0 (0) |
| Consistent representation | 3 (13) | 8 (34.8) | 7 (30.4) | 3 (13) | 1 (4.3) | 1 (4.3) | 0 (0) | 0 (0) |
| Ease of manipulation | 3 (13) | 0 (0) | 10 (43.5) | 6 (26.1) | 4 (17.4) | 0 (0) | 0 (0) | 0 (0) |
| Free-of-error | 1 (4.3) | 6 (26.1) | 9 (39.1) | 4 (17.4) | 0 (0) | 1 (4.3) | 0 (0) | 2 (8.7) |
| Interpretability | 7 (30.4) | 4 (17.4) | 2 (8.7) | 7 (30.4) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 0 (0) |
| Objectivity | 6 (26.1) | 5 (21.7) | 6 (26.1) | 4 (17.4) | 1 (4.3) | 0 (0) | 0 (0) | 1 (4.3) |
| Relevancy | 3 (13) | 4 (17.4) | 4 (17.4) | 6 (26.1) | 3 (13) | 2 (8.7) | 1 (4.3) | 0 (0) |
| Reputation | 3 (13) | 9 (39.1) | 5 (21.7) | 4 (17.4) | 1 (4.3) | 1 (4.3) | 0 (0) | 0 (0) |
| Timeliness | 4 (17.4) | 7 (30.4) | 7 (30.4) | 2 (8.7) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 0 (0) |
| Understandability | 5 (21.7) | 7 (30.4) | 4 (17.4) | 4 (17.4) | 1 (4.3) | 1 (4.3) | 1 (4.3) | 0 (0) |
| Value-added | 3 (13) | 7 (30.4) | 5 (21.7) | 4 (17.4) | 2 (8.7) | 1 (4.3) | 1 (4.3) | 0 (0) |
| Cross-regional search capability | | | | | | | | |
| I am satisfied with the cross-regional search capability of this system. | 2 (8.7) | 3 (13) | 5 (21.7) | 3 (13) | 4 (17.4) | 4 (17.4) | 2 (8.7) | 0 (0) |
| Overall satisfaction | | | | | | | | |
| Overall, I am satisfied with this system. | 2 (8.7) | 3 (13) | 4 (17.4) | 4 (17.4) | 5 (21.7) | 3 (13) | 2 (8.7) | 0 (0) |
