Exploring the usability of web portals: A Croatian case study

International Journal of Information Management 31 (2011) 339–349

Andrina Granić a,∗, Ivica Mitrović b, Nikola Marangunić a

a Faculty of Science, University of Split, Nikole Tesle 12, 21000 Split, Croatia
b Arts Academy, University of Split, Croatia
∗ Corresponding author. Tel.: +385 21 385 133; fax: +385 21 384 086. E-mail addresses: [email protected] (A. Granić), [email protected] (I. Mitrović), [email protected] (N. Marangunić).
doi:10.1016/j.ijinfomgt.2010.11.001

Article history: Available online 27 November 2010

Keywords: Usability; User testing; Guideline inspection; Horizontal information web portals; Case study

Abstract: Web portals act as a single point of access to information and services relevant to a person's work or personal interests. Market research findings for the Croatian web report that horizontal information portals are nowadays the most visited sites. Whether, and to what extent, they reach their aim of facilitating users' access to diverse resources remains an open question. In this paper the issue is addressed by two case studies conducted for the summative assessment of Croatian horizontal information portals. The approach combined expert inspection with user assessment that integrated a number of empirical methods into laboratory-based testing. We report that the results of the inspection method were not in agreement with those obtained from the user test methods. Although differences of this kind have been reported elsewhere, they were not as evident as in these studies. What is very interesting, and represents the main contribution of the research, is that in both rounds of evaluation this outcome is sharp and clear. This suggests that both kinds of assessment should be conducted, as they appear to be complementary. The evaluation provided general findings and know-how from the experience, and we believe that many readers, both practitioners and researchers, can learn from it. © 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Usability evaluation plays a fundamental role in the human-centred design process, because it enables and facilitates design according to usability engineering principles. As usability is defined by the relationship between task, user and system purpose, there is no simple definition or meaningful single measure of usability. Usability, a key concept in human–computer interaction (HCI), is commonly related to ease of use and ease of learning. Most assessment methods are grouped into user-based methods and usability inspection methods. Current research on usability evaluation clearly searches for methods that produce beneficial results for users and developers at low cost, perhaps with the economics of assessment as the most important factor, cf. (Hvannberg, Law, & Larusdottir, 2007; Lewis, 2006).

The idea of a portal is to collect information from different sources and create a single point of access to information, functions and services that are relevant to a person's work or personal interests (Beringer, Lessmann, & Waloszek, 2001). Due to the emphasized specifics of portals as web sites, here primarily addressing their complex and hybrid structure, and media specificities with a diversity of user population, tasks and workflows, specific evaluation approaches should be employed.

Horizontal information (broad-reach and news) portals are the most visited Croatian web sites. Whether such portals do indeed reach their aim of facilitating users' access to diverse resources while targeting the entire Internet community and, if so, to what extent, remains an open question. In this paper, we address this question with a summative evaluation of horizontal information portals. To carry out the comparison and to evaluate how easy to use those portals are, we conducted a series of experiments that employed a range of assessment methods, both empirical and analytic. Our approach combined inspection methods with user assessments that embodied an integration of a number of empirical methods into laboratory-based testing. Because adequate resources were available, we took advantage of the good practice of triangulating different sources of data and taking both objective and subjective measures. The approach proved successful and provided general findings and know-how from the experience, and we believe that many readers, both practitioners and researchers, can learn from it.

The paper is organized as follows. Section 2 introduces basic concepts and the aim of the overall research. Section 3 addresses the first study, offering an insight into data collection, analysis and discussion of results. Section 4 deals with the second study, additionally bringing discussion and interpretation of overall findings. Section 5 concludes the paper.


Fig. 1. Approach to usability evaluation.

2. Background to the research

2.1. Web portals

Web portals are a special breed of web sites, providing a blend of information, applications and services (Waloszek, 2001). Since portals have more than just information to offer, they are based on more advanced web technologies that go beyond the simple interface of typical web pages. One could say that portal design is a hybrid of web and application design, with a flexibility requirement thrown in to keep things exciting (Blankenship, 2001). Portals can be divided into two broad categories: horizontal portals, covering many areas, and vertical portals, focused on one functional area (Winkler, 2001). Horizontal portals, often referred to as "megaportals", target the entire Internet community. They usually contain search engines and provide the ability for users to personalize the page. Vertical portals differ only in their more specific objects and contents, offering information and services customized to specific audiences; the technology employed remains the same.

Fig. 2. Interface screenshots (clockwise: Index-portal, Net-portal, Vip-portal and T-Portal).


Fig. 3. Excerpt from evaluation form.

Table 1. Distribution of age, gender and ICT experience.

            ICT experience                     Age
            Beginner   Advanced   Expert      17–25   26–45   46–60
Male            5          7         5           5       6       6
Female          5          4         4           5       4       4
Total          10         11         9          10      10      10

Additionally, two generations of portals can be differentiated: first-generation portals, which tended to present a monolithic software architecture compromising portal development and maintenance, and second-generation portals, which let users create one or more personal pages composed of personalizable portlets (interactive web mini-applications that render markup fragments, e.g. news, weather, sports and so on) that the portal can aggregate (Bellas, 2004).

Our research has focused on broad-reach and news portals: horizontal information portals characterized by the features of the first-generation ones. In the conducted studies we addressed three portal functionalities (according to Winkler (2001)): search and navigation, information integration and personalization. The rationale for our choice of portals is the fact that market research findings for the Croatian web report that horizontal information portals are nowadays the most visited Croatian web sites, cf. (GemiusAudience, 2008). Those portals collect information in one place and provide access to information, functions and services that are relevant to general user interests. Users do not benefit from publishing through such portals; they are just consumers of collected information such as news, employment, weather and travel information, on-line shopping, a search engine, comments, games, etc. Consequently, horizontal information portals are also called "general" (Tatnall, 2005) or "generic" (Van der Heijden, 2003). As gateways to the web, they represent a starting point in user browsing and/or serve as a user's anchor site. A number of such portals work as an engine that gathers information from the sites of news agencies and newspapers, organizing it in one place; these came to be called "multi-source news portals" (Can et al., 2008). Horizontal information portals are increasingly replacing paper-based newspapers, which are in constant decline, thus taking over the role of mainstream media and becoming the leading news media.
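To make the portlet model concrete, the following minimal sketch (in Python, with hypothetical portlet names and markup) illustrates how a second-generation portal might aggregate the markup fragments rendered by user-selected portlets into one personal page. It is an illustration of the concept described above, not the architecture of any portal evaluated here.

```python
# Minimal sketch of portlet aggregation: each portlet renders a markup
# fragment and the portal assembles the fragments into one personal page.
# Portlet names and markup below are hypothetical illustrations.
class Portlet:
    def __init__(self, title, render):
        self.title = title
        self.render = render  # zero-argument callable returning a markup fragment

def aggregate(page_title, portlets):
    # Concatenate each portlet's fragment into a single page.
    sections = "\n".join(
        f"<section><h2>{p.title}</h2>{p.render()}</section>" for p in portlets
    )
    return f"<html><body><h1>{page_title}</h1>\n{sections}\n</body></html>"

# A personalized page composed of two user-selected portlets.
page = aggregate("My portal", [
    Portlet("News", lambda: "<ul><li>Top headline...</li></ul>"),
    Portlet("Weather", lambda: "<p>Split: sunny</p>"),
])
print(page)
```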

2.2. Web portal usability evaluation

Information presented on each page of a web portal addresses a very large user group with highly diverse needs and interests, aspects that have to be reflected in the portal design. Thus an easy-to-use portal has to determine the steps or processes conducted by its users and then integrate utilities of services to improve a workflow (Carr, 2004). In that context, an effective portal interface assessment is essential because it identifies design problems to be corrected and provides guidance for the next iteration of the development process.

In general, usability is context dependent and is shaped by the interaction between users, tasks and system purpose. A variety of usability evaluation methods have been developed over the past few decades; the most used are grouped into user-based test methods, which involve end-users, and inspection methods, which engage usability experts. Research studies involving different kinds of applications, different user groups and evaluation techniques have been conducted, and the need for method combination is well understood in the usability field (see e.g. Rosenbaum, 2007; Uldall-Espersen, Frøkjær, & Hornbæk, 2008). Besides, a number of studies comparing user-based evaluation and inspection methods have been conducted (Cockton, Lavery, & Woolrych, 2003; Sears & Jacko, 2008). It is concluded that we should make use of complementary usability techniques.

Due to the emphasized specifics of portals as web sites, here primarily addressing their complex and hybrid structure, and media specificities with a diversity of user population, tasks and workflows, particular evaluation approaches should be employed. Portal usability is more than the usability and design of its parts, because a portal also has to care for the more general issues of packaging these offerings, structuring, integrating and organizing them, and tailoring a portal to a specific user group or role (Waloszek, 2001).

In the context of the global trend of portal specialization, recent research mostly addresses the evaluation of vertical portals such as enterprise, tourist, e-government, educational, etc. Shelat and Stewart (2004) conducted travel portal testing with eight users participating in end-user testing with twelve tasks. Testing included the thinking-aloud method, a questionnaire, an interview and a debriefing session. Another travel sites study employed task-based end-user testing along with open- and close-ended questionnaires on a sample of twenty users (Carstens & Patterson, 2005). Evaluation of the specialized text-analysis research portal TAPoR (Carr, 2004) was based on a questionnaire and semi-structured interviews with a small representative user group (fewer than ten). A tourist-service portal evaluation based on a sample of 172 users and usability questionnaires identified differences between user groups (Klausegger, 2005). In an assessment of a customizable library portal, a small group of participants (fewer than ten) conducted nineteen tasks (Brantley, Armstrong, & Lewis, 2006); the approach included task-based user testing with the thinking-aloud method.

Further, there are also studies situated in a specific cultural background, including specificities of the local context that cannot be seen from a global perspective. Theng and Soh's (2005) Asian study of healthcare portals was based on a usability web questionnaire on a sample of 127 participants, aiming at comparison with relevant studies carried out in "the West". In their research on cultural usability for news portals, Tsui and Paynter (2004) identified areas of web usability in news portals that may be culturally specific. Fang and Rau (2003) carried out questionnaire-based experiments to examine effects of cultural differences between Chinese and US users on the perceived usability and search performance of web portals.

Apparently, while there are a number of studies associated with usability evaluation of specialized vertical portals (some additionally taking into account a particular cultural context), research related to the evaluation of horizontal information portals has been rather scarce. In this context, a study aiming to explain individual acceptance and usage of a generic web portal can be highlighted, in which Van der Heijden (2003) examines perceived ease of use, usefulness and enjoyment; data was collected by means of an online survey. In the service quality evaluation of an "information presenting portal" by Yang, Cai, Zhou, and Zhou (2004), usability is just one of the assessment components. Concerning news portal evaluation, the study that aimed to identify areas of web usability in the news portal industry that may be culturally specific can be highlighted (Tsui & Paynter, 2004).

This paper offers insight into comprehensive research concerning the nowadays most visited Croatian web sites. It reports on two empirical studies evaluating broad-reach and news portals, respectively. Both analytic (expert review) and user-based evaluations have been employed, collecting a range of quantitative and qualitative data.

3. First study

The evaluation of horizontal information portals included an inspection method and user testing that embodies an integration of three empirical methods into laboratory-based usability evaluation (see Fig. 1).

3.1. Scenario-guided user evaluation

We conducted a controlled experiment that advocates scenario-guided user evaluations, used to collect both quantitative data and qualitative "remarks". Thirty participants with basic computer literacy skills were involved. Their age and gender distribution, along with their experience with information and communication technologies (ICT), is given in Table 1. End-user testing was based on criteria expressed in terms of objective performance measurement of effectiveness and efficiency, along with subjective user assessment (ISO/IEC, 2006). The System Usability Scale (SUS), a simple, standard, ten-item attitude questionnaire with a five-point Likert scale (Brooke, 1996), was used for the subjective evaluation. As additional subjective feedback, answers to a semi-structured interview were collected.


We assessed the most visited broad-reach portals: Index-portal (www.index.hr), Net-portal (www.net.hr), Vip-portal (www.vip.hr) and T-Portal (www.tportal.hr); see Fig. 2. Pilot testing was performed to check the assigned tasks, the time interval, the measuring instruments and the adequacy of hardware and software support. We described a work scenario and chose several typical tasks whose structure and location had not changed over time. The tasks were categorized into four categories (Kellar & Watters, 2006): fact-finding, information gathering, browsing and transactions. Initially, every participant had to browse for 5 min. Subsequently, they had to find a certain show in a TV schedule and gather information related to the traffic situation on a Croatian highway. For the transaction category, signing up as a new portal user was required. For each portal the undertaken tasks were the same and the probability of their completion was similar. The evaluation was carried out individually and lasted 45 min for each participant, who used a personal computer with Internet access. All portals were assessed, with the order of their evaluation controlled and changed for every participant to avoid order effects.

The evaluation procedure consisted of task-based end-user testing, a usability satisfaction questionnaire and a semi-structured interview. Task-based end-user testing involved a scenario-guided assessment with tasks selected to show the basic portal functionality, balanced in their complexity and expected completion time. User efficiency (time on task) and effectiveness (percentage of task completed) were measured. For each user, the time limit for all assigned tasks per portal was 20 min. To retain all participants' results in the analyses, we used an objective accomplishment measure called fulfilment. We defined fulfilment as the average time spent on all allocated tasks weighted by the successfulness of task completion; specifically, to obtain a participant's fulfilment measure, the average time spent on all allocated tasks was multiplied by the percentage of completed tasks. For the subjective assessment we used the SUS questionnaire, as it is argued to yield the most reliable results across sample sizes (Tullis & Stetson, 2004). Its questions address different aspects of the user's reaction to the portal as a whole, providing an indication of the level of statement agreement on a five-point Likert scale. The feedback was augmented with users' answers in the semi-structured interview.
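The two measures above are straightforward to compute. The sketch below (Python with NumPy; the array shapes and function names are our own assumptions, not the authors' scripts) implements one literal reading of the fulfilment definition, together with standard SUS scoring as defined by Brooke (1996).

```python
import numpy as np

def fulfilment(times_min, completed):
    """Objective accomplishment measure (lower M is better): the average
    time over all allocated tasks multiplied by the fraction of tasks
    completed, per the paper's definition."""
    times = np.asarray(times_min, dtype=float)  # time on each task, minutes
    done = np.asarray(completed, dtype=float)   # 1.0 = completed, 0.0 = not
    return times.mean() * done.mean()

def sus_score(responses):
    """Standard SUS scoring (Brooke, 1996): ten items rated 1-5;
    odd items contribute (r - 1), even items (5 - r), and the sum is
    scaled by 2.5 onto a 0-100 range (higher is better)."""
    r = np.asarray(responses, dtype=float)
    odd = r[0::2] - 1.0     # items 1, 3, 5, 7, 9
    even = 5.0 - r[1::2]    # items 2, 4, 6, 8, 10
    return 2.5 * (odd.sum() + even.sum())

# Example for one participant on one portal (illustrative numbers).
print(fulfilment([3.5, 4.0, 6.2], [1, 1, 0]))    # avg time x completion share
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0
```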

3.2. Guideline-based inspection

User-based testing was supplemented with a less strict heuristic evaluation (Nielsen, 1994), i.e. a guideline inspection conducted with "instant" specialists from the HCI field. To overcome the problem of not being able to engage a number of HCI experts, we had the inspection performed by ten "instant experts", cf. (Wright & Monk, 1991). They were computer scientists who had learnt the principles of user-centred design and evaluation (two computer science university professors, three web developers, one web journalist editor and four web practitioners with experience in web production). An evaluation form with a set of principles and auxiliary guidelines was prepared (example in Fig. 3). A document containing detailed instructions and an evaluation form was sent to the chosen "experts". Aiming to detect design problems, they had to mentally simulate the tasks to be performed and mark and comment on the evaluation form. Therefore, to supply all necessary information, the evaluation form had to be thorough and self-explanatory. Nielsen's usability heuristics, a set of ten key principles (Nielsen, 1994), were explained and adjusted to portal usage. As additional explanation of each principle, a series of auxiliary guidelines concerning portal design was also provided, cf. (MIT, 2004). To obtain quantitative relationships in the context of the conducted summative evaluation, the heuristics were "translated" into a scale and each portal's score was calculated as an average mark on a seven-point Likert scale. "Experts" had to specify a level of conformity with a principle and provide a comment to justify the assigned mark. They were encouraged to offer additional remarks related to advantages and disadvantages of the assessed portals. Observations concerning the overall inspection procedure were welcomed.

3.3. Data analysis

Descriptive statistics for the objective measure fulfilment, including arithmetic means (M) and standard deviations (SD), are presented in Fig. 4.

Fig. 4. Objective performance (lower M score indicates better result).

The distribution of fulfilment results on T-Portal differs significantly from a normal distribution (K-S = 0.008). Accordingly, Friedman's test was performed as a non-parametric procedure. A statistically significant value of chi-square (χ2 = 49.4, df = 3, p < 0.01) indicates the existence of differences in the objective measure among portals. Wilcoxon's post hoc test showed a significant difference for all portal combinations except between Index-portal and Vip-portal.

Descriptive statistics of the results for the subjective measure SUS are shown in Fig. 5.

Fig. 5. Subjective assessment (higher M score indicates better result).

No statistical difference in the distribution of the results from the expected normal distribution was found (K-S 1, 2, 3, 4 > 0.05). To test the difference among portals, the analysis of variance (one-way ANOVA) was applied as a parametric procedure. A significant F-ratio (F = 746.94, df = 29, p < 0.01) indicates the existence of differences among the portals in the results for SUS. A post hoc procedure revealed differences for all portal combinations except between Index-portal and Vip-portal, and between Net-portal and T-Portal. Pearson's correlation coefficients for the participants' results in SUS and fulfilment showed no significant correlation between overall SUS and overall fulfilment (r = 0.14).

The results of user testing showed statistically significant differences among portals according to the measures of objective and subjective achievement. This suggests that the portals could be ranked by mean values. The measures of users' objective achievement and subjective satisfaction were not significantly correlated.
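For readers who wish to reproduce this style of analysis, the sketch below shows the test sequence with SciPy. The data layout (a participants × portals matrix per measure) and the portal labels are our assumptions; the paper reports the tests, not code.

```python
from itertools import combinations
import numpy as np
from scipy import stats

PORTALS = ["Index", "Net", "Vip", "T-Portal"]  # illustrative labels

def first_study_analysis(fulfilment, sus):
    """fulfilment, sus: arrays of shape (n_participants, 4)."""
    # Friedman's non-parametric omnibus test on fulfilment (used because
    # one portal's distribution departed from normality), followed by
    # pairwise Wilcoxon signed-rank post hoc tests.
    chi2, p = stats.friedmanchisquare(*fulfilment.T)
    print(f"Friedman: chi2 = {chi2:.1f}, p = {p:.4f}")
    for i, j in combinations(range(len(PORTALS)), 2):
        _, pw = stats.wilcoxon(fulfilment[:, i], fulfilment[:, j])
        print(f"  {PORTALS[i]} vs {PORTALS[j]}: p = {pw:.3f}")

    # One-way ANOVA on the SUS results, as reported in the paper.
    f, pf = stats.f_oneway(*sus.T)
    print(f"ANOVA on SUS: F = {f:.2f}, p = {pf:.4f}")

    # Correlation between each participant's overall SUS and overall
    # fulfilment (averaged over portals; our reading of "overall").
    r, pr = stats.pearsonr(sus.mean(axis=1), fulfilment.mean(axis=1))
    print(f"Pearson r(SUS, fulfilment) = {r:.2f} (p = {pr:.3f})")
```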


The experimental results and findings acquired through the guideline inspection are addressed in what follows. Arithmetic means of the marks on the Likert scale showed that the highest mark was given to Vip-portal (mean = 5.38), followed by Net-portal (mean = 4.85), T-Portal (mean = 4.64) and Index-portal (mean = 4.01). The results can be further related to the "experts"' comments. For Vip-portal the well-adjusted and consistent layout, simple navigation and feeling of control were emphasized (it scored best for guidelines 8 and 2). Net-portal obtained the highest scores for guidelines 2 and 4, and the lowest for 10; it was described as consistent, with well-structured information, but with a poor and old-fashioned visual appearance. T-Portal complied very well with guideline 5, but did not comply with guideline 8; identified problems addressed overly diverse types of navigation and an initial page that was too lengthy. Lack of consistency and aggressive "visual noise" were the main reasons why Index-portal got mostly bad marks (the worst marks for guideline 4); identified problems included an ambiguous home page, lack of consistency and navigation overload.

The evaluation form analysis included the assessment of the adapted principles and a judgment of the quality of the "experts"' evaluations. Qualitative analysis criteria were expressed in terms of mark-span and value of comments (see Mark-span and Info in Fig. 6).

Fig. 6. Analysis of "instant experts"' inspection.

To eliminate the effect of extreme results, the lowest and highest marks were excluded (see the upper mark-spans). Those extreme marks widen the span (see the marks in brackets), possibly suggesting vague guideline formulation. The average mark-span of the marks in brackets (3.18) is considerably lower than the average mark-span of all marks (4.48). Each guideline was examined by analyzing all acquired comments and observations made by the "experts" ("horizontal" analysis), assigning low (L), medium (M) and high (H) values according to the quantity and the level of detail of the comments provided (Info column). For instance, remarks like "there is no mistake" or "not good at all" represent comments of low information quality. Conversely, detailed ones that listed specific observations concerning page layout, fonts, navigation/links and graphics were classified in the medium and high information quality categories. Remarks classified as medium quality include "information is not presented in a simple way"; "page layout is consistent"; "portal offers help while working". Examples of comments categorized as high quality are "navigation on the portal is supported horizontally and vertically, there is a visible breadcrumb trail, there is no feeling of lostness on the portal"; "the upper and left navigation bars are not consistent, the menus are different, thus providing diverse information". The range of marks expresses the lowest and highest marks given by the experts (Mark-span column). The same information quality criteria were used while analyzing the "experts"' work ("vertical" analysis): the number, percentage and quality of comments provided and the number of additional observations (see Fig. 6).

Analyses of the marks and comments for each guideline indicate that some "experts" provided extremely poor comments, and a wide mark-span implies constraints in their understanding or possibly vague formulation; moreover, suggestions for portal design improvement were not offered. Results suggest that a good guideline is one characterized by a narrow mark-span which "provokes" high-quality comments: criticisms that identify design problems and at the same time offer solutions (see guidelines 2, 3 and 7). Two "experts" provided only marks without any comment, while eight supported their marks with a number of comments, ranging from 8 (20%) to 40 (100%). All additional observations were registered and analyzed accordingly.
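The mark-span computation is simple enough to state precisely. A minimal sketch follows, assuming the marks are kept as a list per guideline; the trimming of one lowest and one highest mark mirrors the paper's handling of extreme results, while the function and variable names are ours.

```python
import numpy as np

def mark_span(marks, trim_extremes=False):
    """Span between the highest and lowest mark for one guideline on the
    seven-point scale; with trim_extremes=True, the single lowest and
    highest marks are dropped first, as done to damp extreme results."""
    m = sorted(marks)
    if trim_extremes and len(m) > 2:
        m = m[1:-1]
    return m[-1] - m[0]

def portal_inspection_score(marks_per_guideline):
    """A portal's overall inspection score: the mean of all experts'
    marks across all guidelines."""
    return float(np.mean([m for g in marks_per_guideline for m in g]))

# Ten hypothetical expert marks for one guideline.
marks = [2, 4, 4, 5, 5, 5, 6, 6, 6, 7]
print(mark_span(marks), mark_span(marks, trim_extremes=True))  # 5 2
```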

3.4. Discussion

Different types of raw usability data were collected using the empirical methods, while experts' reports were gathered via the guideline inspection conducted with both "instant" specialists and HCI experts. The reported experience indicates that the chosen research instruments, measures and methods for user-based evaluation were adequate for use in the context of this case study. In particular, the results of the end-user testing showed statistically significant differences among the portals according to the measures of objective achievement and subjective satisfaction. This suggests that the portals could be ranked by mean values. The measures of a user's objective achievement and her/his subjective satisfaction were not significantly correlated (coherent with the finding in Frokjaer, Hertzum, & Hornbaek, 2000), suggesting two variables which measure different aspects of interface design. The overall results can be further related to the most frequent interview comments, considered as qualitative data. Participants felt especially satisfied and comfortable working with portals where their objective success was high; they considered them sites with a good quality of information structure, respectable layout and straightforward navigation. An important finding is related to the choice of the sample size, in addition to the structure of engaged end-users, which is in accordance with the outcomes of related studies; specifically, in Hornbæk and Law's (2007) meta-analysis of usability measures the average number of participants involved per study was 32 (SD = 29, ranging from 6 to 181).

Conversely, although it showed considerable potential, the guideline-based evaluation raised a number of concerns. Quantitative and qualitative analysis of the evaluation form showed that it would benefit from a few revisions. Some of the principles adapted from Nielsen's heuristics showed poor applicability in the portal context, not providing useful information for improving the portals' usability. Therefore, a number of guidelines should be clarified, auxiliary guidelines need to be revised and redundant ones excluded. Furthermore, the issue of "instant" specialists should be carefully considered. Diversity in the quality and quantity of acquired information suggests a non-homogeneous group of "instant experts" concerning their HCI expertise. However, we believe that in the case of HCI experts' involvement every single obtained mark would be equally significant, and no reason for dropping any out would be justified. Furthermore, we assume that the quite low mark-span obtained for some of the guidelines implies their good quality, although the study involved a non-uniform group of specialists.

Overall, this first study was a valuable learning experience in terms of both identifying portal usability issues and gaining experience in usability evaluation. Nevertheless, it was assumed that user testing could be improved by engaging fewer resources with a reduced number of users. Additionally, particular aspects of the inspection method could be upgraded, here referring to the issues of the experts' number and selection along with the evaluation form. In conclusion, we can report that the results of the inspection method were not in agreement with the ones obtained from the user test methods. However, this observation is not surprising. Although differences of this kind have also been reported elsewhere (see e.g. Frøkjær & Hornbæk, 2008), they were not as clear as in this study. Consequently, it seemed reasonable and valuable to perform an additional empirical study, evaluating at the same time the improved assessment approach.

4. Second study

Our experience and the achieved results indicate that we can build up and improve our evaluation methodology further, as set out below.

4.1. Data collection


The methodological triangulation of three empirical methods involved sixteen participants: randomly chosen computer science students (consult Table 2).

Table 2. Distribution of age, gender and study group.

            Study group                              Age
            Graduate     Graduate     Total         20–21   22–23   Total
            3rd year     4th year
Male            3            7          10             4       6      10
Female          4            2           6             4       2       6
Total           7            9          16             8       8      16

Concerning the sample size, we point out findings from the first study, when we analyzed several randomly chosen samples extracted from the original 30 participants. A reduced sample of 20 users showed no or very little difference, while a sample with fewer than 10 skewed the original variable relations completely. We concluded that a sample of 13–16 participants could be considered in future studies, which is also in line with relevant studies (e.g. Faulkner, 2003). Additionally, homogenisation of the sample was carried out: diverse "categories" of participants were involved in the first study, while the second one involved those who, according to market research findings (GFK Croatia, 2008), represent the majority of knowledgeable Croatian Internet users.

The four most frequently visited news portals were included: Slobodnadalmacija-portal (www.slobodnadalmacija.hr), Jutarnji-portal (www.jutarnji.hr), Vecernji-portal (www.vecernji.hr) and 24sata-portal (www.24sata.hr); see Fig. 7.

Fig. 7. Interface screenshots (clockwise: Vecernji-portal, Slobodnadalmacija-portal, Jutarnji-portal and 24sata-portal).

The evaluation procedure was carried out in a manner equivalent to the first study, including task-based user testing with scenario-guided assessments determining user efficiency and effectiveness, along with an attitude questionnaire and a semi-structured interview for the evaluation of users' subjective satisfaction. Considering the experience from the previous study, a first guideline inspection was conducted with four "instant" specialists: a homogeneous group of computer science professionals whose level of expertise in user-centred design and whose motivation for engagement in inspection activities were high. A second round of inspection was performed with four experienced specialists from the HCI field. To conduct a thorough study and achieve valuable results, we were very keen to provide the resources needed for the experts' engagement. We believed that an outcome of better quality would be gathered, comparable to the one acquired through the "instant" specialists' assessment. Complete instructions were sent to both groups. Because some of Nielsen's (1994) principles proved inadequate in the portal context, an evaluation form with a reduced set of seven guidelines was provided. As additional explanation, reduced and more "valuable" sets of auxiliary guidelines were offered. Experts had to specify a level of agreement with a principle on a seven-point Likert scale and provide a comment to justify the assigned mark. Additional notes were encouraged and collected.

4.2. Data analysis

Descriptive statistics of the objective accomplishment measure fulfilment are presented in Fig. 8.

Fig. 8. Objective performance measure (lower M score indicates better result).

No statistical difference in the distribution of the results from the expected normal distribution was found (K-S 1, 2, 3, 4 > 0.05). To test the difference among portals, the analysis of variance (one-way ANOVA) was applied as a parametric procedure. A significant F-ratio (F = 412.88, df = 15, p < 0.01) indicates the existence of differences among the portals. Post hoc tests showed a significant difference for all portal combinations except between Jutarnji-portal and 24sata-portal.

Descriptive statistics of the results for subjective satisfaction are shown in Fig. 9.

Fig. 9. Subjective satisfaction measure (higher M score indicates better result).

Again no statistical difference in the distribution of the results from the expected normal distribution was found (K-S 1, 2, 3, 4 > 0.05), and the difference among portals was tested using one-way analysis of variance.
A significant F-ratio (F = 597.65, df = 15, p < 0.01) indicates the existence of differences among the portals in the results for the subjective measure. In the post hoc procedure those differences were found only between Slobodnadalmacija-portal and Vecernji-portal, and between Vecernji-portal and 24sata-portal. Pearson's correlation coefficients for the participants' results in the objective and subjective measures showed a significant correlation between SUS and fulfilment on Slobodnadalmacija-portal (r = −0.61, p < 0.05) and Jutarnji-portal (r = −0.57, p < 0.05). No significant correlation was revealed between the results on the other portals.
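The per-portal correlations can be computed as in the sketch below (SciPy; the array layout and portal labels are again our assumptions). A negative coefficient means that higher satisfaction went together with lower, i.e. better, fulfilment.

```python
import numpy as np
from scipy import stats

def per_portal_correlations(fulfilment, sus, portals):
    """fulfilment, sus: arrays of shape (n_participants, n_portals)."""
    for k, name in enumerate(portals):
        # Pearson correlation between subjective (SUS) and objective
        # (fulfilment) results for one portal at a time.
        r, p = stats.pearsonr(sus[:, k], fulfilment[:, k])
        mark = " (p < 0.05)" if p < 0.05 else ""
        print(f"{name}: r = {r:.2f}{mark}")
```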


Results and findings from the four "instant experts"' inspections are addressed in the following. Arithmetic means of the marks show that the highest mark was given to 24sata-portal (mean = 5.64), followed by Vecernji-portal (mean = 5.17), Jutarnji-portal (mean = 5.07) and Slobodnadalmacija-portal (mean = 4.96). The guidelines were "horizontally" examined through the experts' comments and observations (Fig. 10).

Fig. 10. Analysis of "instant experts"' feedback.

An average mark-span of all experts' marks for every guideline (and for every portal) was calculated. Its value (2.18) showed good applicability of the chosen set of guidelines in the context of web portals. Simultaneously, it offered a measure of the "specialists"' homogeneity in their expertise. The "vertical" analysis comprised an examination of each specialist's feedback (Fig. 10).

Finally, results from the four HCI specialists' inspections are offered. Arithmetic means of the marks reveal that the highest mark was given to 24sata-portal (mean = 5.39), followed by Slobodnadalmacija-portal (mean = 5.25), Vecernji-portal (mean = 4.89) and Jutarnji-portal (mean = 4.64). Again, the guidelines were "horizontally" and "vertically" examined (see Fig. 11). An average mark-span for all experts' marks was 2.29, indicating once more the rather high homogeneity of the sample.

Fig. 11. Analysis of HCI experts' feedback.

4.3. Discussion

The second empirical study again showed the adequacy of the methods employed in user testing, although the original sample of 30 participants was cut in half. Namely, we considered the outcome of the statistical post-sample analysis of the first study, which showed stability of measures with 13–16 participants. Statistically significant differences among portals according to the measures of users' objective achievement and subjective satisfaction suggest that the portals could be ranked by mean values. Again it was confirmed that participants felt notably pleased and comfortable working with portals where their objective success was high, usually emphasizing the portals' visual attractiveness.

An interesting finding of the study was the agreement in the feedback acquired from the "instant experts" and the HCI specialists. Analysis of both the HCI and the "instant" experts' data showed a high level of conformity in the marks assigned to the evaluated portals. Moreover, they all offered similar comments for the majority of guidelines. According to the evaluation criteria, both groups identified the same portals as the best and the worst ones. They all emphasized straightforward navigation, consistent layout, accordance with media standards, intuitive design and easy scanning for important information. For both groups the average mark-spans were relatively small and comparable, showing high homogeneity in their expertise.

Yet again, we find very interesting the clear result of this second study that the user testing results are quite different from the inspection results: the web portal ranked highest by users was the lowest one in the experts' evaluation. This finding can be further related to the most frequent statements from the users' interviews and the experts' comments.

5. Conclusion

This paper offers a thorough description of a case study of usability evaluations of horizontal information portals, the most visited Croatian web sites. A series of experiments that employed a range of user-based and inspection methods was conducted, obtaining useful data and producing helpful usability information.
It was shown that the chosen research instruments, measures and methods for user-based evaluation were adequate. It was proved that the assessment could be improved with a faster procedure that provides stability of measures with a reduced sample size. The achieved testing results were in accordance with the meta-analytic research report on correlations among usability measures (Hornbæk & Law, 2007). Although some minor correlation between objective and subjective measures was shown, it is concluded that to acquire a precise insight into the portals' usability, both measures should be obtained.

Regarding the experts' inspection, the engagement of a smaller number of specialists with higher HCI expertise, along with the employment of a simpler evaluation form, provided enough quantitative and qualitative informative feedback. Feedback from both the "instant experts" and the HCI specialists showed a high level of agreement in all aspects of the inspection procedure. Consequently, it appears that we could employ a "cost-effective" group of "instant experts", which provided evaluation feedback of a quality level similar to that of the HCI experts.

The most important finding of the studies is that the user-based and inspection methods disagree. Although differences of this kind have already been reported (Cockton et al., 2003; Frøkjær & Hornbæk, 2008), they were not as evident as in these studies. Both reported empirical studies show discrepancies between user testing and expert reviews. On the one hand, this observation is not surprising if we take into account only the first study; namely, the expert reviews could be questioned because so-called "instant experts" were employed. However, on the other hand, analysis of both the HCI specialists' and the "instant experts"' feedback in the second study showed a high level of agreement in their outcomes; the results of both groups were quite different from the user testing results. It seems that users' understanding of the "quality of use" concept is different from the specialists' conception. Users are oriented toward information finding and their subjective look-and-feel of the portal design, while experts go deeply into the portal structure trying to identify problems that influence its functions. What is very interesting, and represents the main contribution of this research, is that in both rounds of evaluation this outcome is very sharp and clear, suggesting that user testing results are quite different from inspection results. These findings support the assertion, already acknowledged by HCI research, that we should combine both assessment methods to get complete and reliable evaluation feedback.

The two case studies were conducted for the summative evaluation of Croatian horizontal information portals. However, to perform a formative assessment and, besides identifying usability problems, improve a particular web portal accordingly, an evaluation report along with measurable outcomes should be derived. In this context, the issue of structured usability problem reporting using a web tool or paper media should be considered, cf. (Cockton & Lavery, 1999; Hvannberg et al., 2007). This would make the approach applicable and useful for wider, "every day" use for evaluations of portals similar to the horizontal information ones. To further improve the applicability of the methodology in practice and to cope with the problem of its broad generalization, extensive collaboration within the usability community to conduct multi-site experiments, include a cross-cultural sample and support the exchange of ideas and experiences is essential, and will be considered in our future work.

Acknowledgments This work has been carried out within project 177-03619941998 Usability and Adaptivity of Interfaces for Intelligent Authoring Shells funded by the Ministry of Science and Technology of the Republic of Croatia.

References

Bellas, F. (2004). Standards for second-generation portals. IEEE Internet Computing, 8(2), 54–60.
Beringer, J., Lessmann, C., & Waloszek, G. (2001). Generic portal pages—what do most portals need? SAP Design Guild. http://www.sapdesignguild.org/editions/edition3/generic pages.asp
Blankenship, E. (2001). Portal design vs. web design. SAP Design Guild. http://www.sapdesignguild.org/editions/edition3/graphic.asp
Brantley, S., Armstrong, A., & Lewis, K. M. (2006). Usability testing of a customizable library web portal. College & Research Libraries, 67(2), 146–163.
Brooke, J. (1996). SUS: A "quick and dirty" usability scale. In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & A. L. McClelland (Eds.), Usability evaluation in industry. London: Taylor and Francis.
Can, F., Kocberber, S., Baglioglu, O., Kardas, S., Ocalan, H. C., & Uyar, E. (2008). Bilkent news portal: A personalizable system with new event detection and tracking capabilities. In SIGIR'08, July 20–24, Singapore.
Carr, A. (2004). TAPoR: A case study of web portal usability. http://tapor.ualberta.ca/News/TAPoR UI.pdf
Carstens, D. S., & Patterson, P. (2005). Usability study of travel websites. Journal of Usability Studies, 1(1), 47–61.
Cockton, G., & Lavery, D. (1999). A framework for usability problem extraction. In A. Sasse & C. Johnson (Eds.), INTERACT 1999 (pp. 347–355).
Cockton, G., Lavery, D., & Woolrych, A. (2003). Inspection-based evaluations. In J. A. Jacko & A. Sears (Eds.), The human–computer interaction handbook: Fundamentals, evolving technologies and emerging applications (pp. 1118–1138). Hillsdale, NJ: L. Erlbaum Associates.
Fang, X., & Rau, P. L. P. (2003). Culture differences in design of portal sites. Ergonomics, 46(1–3), 242–254.
Faulkner, L. (2003). Beyond the five-user assumption: Benefits of increased sample sizes in usability testing. Behavior Research Methods, Instruments, & Computers, 35(3), 379–383.
Frokjaer, E., Hertzum, M., & Hornbaek, K. (2000). Measuring usability: Are effectiveness, efficiency, and satisfaction really correlated? In Proceedings of the ACM CHI 2000 Conference on Human Factors in Computing Systems, The Hague, Netherlands, April 1–6 (pp. 345–352). New York: ACM Press.
Frøkjær, E., & Hornbæk, K. (2008). Metaphors of human thinking for usability inspection and design. ACM Transactions on Computer–Human Interaction (TOCHI), 14(4), 1–33.
GemiusAudience (2008). http://www.valicon.net/uploads/tablica za web aktualni podaci za kolovoz o8.pdf
GFK Croatia (2008). http://www.gfk.hr/press1/infopis.htm
Hornbæk, K., & Law, E. L.-C. (2007). Meta-analysis of correlations among usability measures. In CHI 2007 Proceedings, San Jose, CA, USA, April 28–May 3.
Hvannberg, E., Law, E. L.-C., & Larusdottir, M. (2007). Heuristic evaluation: Comparing ways of finding and reporting usability problems. Interacting with Computers, 19(2), 225–240.
ISO/IEC 25062:2006. (2006). Software engineering—Software product Quality Requirements and Evaluation (SQuaRE)—Common Industry Format (CIF) for usability test reports.
Kellar, M., & Watters, C. (2006). Using web browser interactions to predict task. In WWW 2006, May 23–26, Edinburgh, Scotland.
Klausegger, C. (2005). Evaluating internet portals—an empirical study of acceptance measurement based on the Austrian national tourist office's service portal. Journal of Quality Assurance in Hospitality & Tourism, 6(3–4), 163–184.
Lewis, J. R. (2006). Sample sizes for usability tests: Mostly math, not magic. Interactions, 13(6), 29–33.
MIT Usability Guidelines (2004). http://web.mit.edu/is/usability/selected.guidelines.pdf
Nielsen, J. (1994). Heuristic evaluation. In J. Nielsen & R. Mack (Eds.), Usability inspection methods (pp. 25–64). New York: John Wiley and Sons.
Rosenbaum, S. (2007). The future of usability evaluation: Increasing impact on value. In E. Law, E. Hvannberg, & G. Cockton (Eds.), Maturing usability: Quality in software, interaction and value. Springer.
Sears, A., & Jacko, J. (2008). The human–computer interaction handbook: Fundamentals, evolving technologies and emerging applications (2nd ed.). Human Factors and Ergonomics.
Shelat, B., & Stewart, T. (2004). Transport direct portal: Usability testing report. http://www.transportdirect.gov.uk/research/pdf/mr61r1.pdf
Tatnall, A. (Ed.). (2005). Web portals: The new gateways to Internet information and services. Hershey: Idea Group Publishing.
Theng, Y. L., & Soh, E. S. (2005). An Asian study of healthcare web portals: Implications for healthcare digital libraries. In Proceedings of the 8th International Conference on Asian Digital Libraries (ICADL), Bangkok.
Tsui, W. C., & Paynter, J. (2004). Cultural usability in the globalisation of news portal. In M. Masoodian, S. Jones, & B. Rogers (Eds.), Computer–Human Interaction: Proceedings of the 6th Asia Pacific Conference (APCHI), LNCS 3101. Springer.
Tullis, T. S., & Stetson, J. N. (2004). A comparison of questionnaires for assessing website usability. In Proceedings of the UPA Conference, Minneapolis. http://home.comcast.net/∼tomtullis/publications/UPA2004TullisStetson.pdf
Uldall-Espersen, T., Frøkjær, E., & Hornbæk, K. (2008). Tracing impact in a usability improvement process. Interacting with Computers, 20(1), 48–63.
Van der Heijden, H. (2003). Factors influencing the usage of websites: The case of a generic portal in The Netherlands. Information and Management, 40(6), 541–549.
Waloszek, G. (2001). Portal usability—is there such a thing? SAP Design Guild (3rd edition). http://www.sapdesignguild.org/editions/edition3/overview edition3.asp


Winkler, R. (2001). Portals—the all-in-one web supersites: Features, functions, definitions, taxonomy. SAP Design Guild (3rd edition). http://www.sapdesignguild.org/editions/edition3/overview edition3.asp
Wright, P., & Monk, A. (1991). A cost-effective evaluation method for use by designers. International Journal of Man-Machine Studies, 35(6), 891–912.
Yang, Z., Cai, Y., Zhou, K. Z., & Zhou, N. (2004). Development and validation of an instrument to measure user perceived service quality of information presenting web portals. Information & Management, 42(4), 575–589.

Andrina Granić is Associate Professor at the Department of Computer Science, where she teaches courses in the computer science curriculum. She is currently focused on both theoretical and application-oriented aspects of universal access, system usability, accessibility and smartness, user models and intelligent interaction. She has developed several undergraduate and graduate courses related to the human–computer interaction field. She has published over 60 papers as book chapters or in internationally refereed journals and conferences in her area of interest. She has been involved in, or is currently participating in, a number of internationally and nationally funded projects, as coordinator or as partner.

Nikola Marangunić is a junior researcher on the national project Usability and Adaptivity of Interfaces for Intelligent Authoring Shells, coordinated by Dr. Andrina Granić. He graduated in psychology at the Faculty of Philosophy, University of Zagreb, where he also obtained his M.Sc. degree in the field of human–computer interaction. His scientific interests and research are focused on the multidisciplinary field of human–computer interaction, to which he contributes through the perspective of cognitive psychology.

Ivica Mitrović holds an M.Sc. in artificial intelligence. He teaches new media design and conducts research in a multidisciplinary team of researchers, including social scientists, cultural theorists and cognitive psychologists. Since 2001 he has been promoting interaction design in Croatia. He was the local organizer of the Interaction Design Convivio Summer School in Split. In 2007 he started a series of workshops with international guest leaders as an introduction to interaction design. He is the author of the first interaction design course in Croatia.