Information Processing and Management 35 (1999) 819–837
www.elsevier.com/locate/infoproman
Task complexity, problem structure and information actions: Integrating studies on information seeking and retrieval

Pertti Vakkari*

Department of Information Studies, University of Tampere, P.O. Box 607, FIN-33101 Tampere, Finland
Abstract

Analyzing the actions to be supported by information and information retrieval (IR) systems is vital for understanding the needs for different types of information, search strategies and relevance assessments; in short, for understanding IR. A necessary condition for this understanding is to link results from information seeking studies to the body of knowledge produced by IR studies. The actions focused on in this paper are tasks, seen from the angle of problem solving. I will analyze certain features of work tasks and relate these features to the types of information people are looking for and using in their tasks, the patterning of search strategies for obtaining information, and relevance assessments in choosing retrieved documents. The major claim is that these information activities are systematically connected to task complexity and the structure of the problem at hand. The argumentation is based on both theoretical and empirical results from studies on information retrieval and seeking. © 1999 Elsevier Science Ltd. All rights reserved.
* Tel.: +358-3-2156-968; fax: +358-3-2156-560. E-mail address: pertti.vakkari@uta.fi (P. Vakkari)

0306-4573/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0306-4573(99)00028-X

1. Introduction

Research in information science aims to comprehend the facilitation of access to information for supporting purposeful action. The major themes to be addressed have been how information is organized for access, how it is retrieved from storage, and how it is sought out and used for various purposes. Two central research areas in the field are information retrieval (IR) and information seeking (IS) (Rochester & Vakkari, 1998). Although intuitively the fields
seem to be overlapping, their research communities have been active in their own enclosures. Few researchers have visited the neighboring side. However, there are researchers (Bates, 1989; Belkin, 1993; Belkin, Seeger & Wersig, 1983; Belkin & Vickery, 1986; Brooks, Daniels & Belkin, 1985; Ellis, 1989; Ingwersen, 1992; 1996; Järvelin, 1987; Kuhlthau, 1993; Marchionini, 1995; Saracevic & Kantor, 1988) who have stressed the need to connect results from both research traditions.

IR can be seen as a part of a broader process of information seeking. IS is understood here as a process of searching, obtaining and using information for a purpose (e.g. forming a solution for a task) when a person does not have sufficient prior knowledge. IR is understood as the use of an information system for obtaining relevant information for a purpose (e.g. a task). This implies that information systems are a specific means among other sources and channels for obtaining information. Thus, from the point of view of a person constructing a solution by using information from various sources, IR systems and retrieving information provide one means of getting in touch with useful information for that purpose. It is obvious that the role of IR systems and the information they provide varies depending on the task and the situation of the actor (Belkin, Seeger & Wersig, 1983; Checkland & Holwell, 1998; Ingwersen, 1996; Järvelin, 1987; Kuhlthau, 1993; Wersig, 1973).

The function of an IR system is to provide information that supports people taking purposeful action. According to Checkland and Holwell (1998, pp. 110–111) any information system can be described as a pair of systems, one a system which is served (the people taking the action), the other a system which does the serving (the IR system). The basic principle of systems thinking is that the necessary features of the system which serves can be worked out only on the basis of a prior account of the system served.
The nature of the system served (the way it is thought about) will dictate what counts as a `service', and hence what functions the system which provides that service must contain (Checkland & Holwell, 1998, p. 237; cf. Belkin, Oddy & Brooks, 1982; Belkin, Seeger & Wersig, 1983; Brooks, Daniels & Belkin, 1985; Dervin, 1992). Systems development should not start with the data and technology, but with a focus on the action served by the system. Thus, before any system which provides informational support can be constructed, there has to be a clear account of, firstly, the action supported and then the information relevant to people carrying out the action (Checkland & Holwell, 1998, p. 114; cf. Belkin, Seeger & Wersig, 1983; Ingwersen, 1992; Kunz, Rittel & Schwuchow, 1977; Wilson, 1981).

1.1. Types of IR and IS studies

Although boundary spanners in information studies have shown a common field of interest in IR and IS studies (understanding information needs that reflect tasks, for the benefit of information retrieval interaction and systems design), the integration of the research traditions is fragmentary. The field of IR has mostly been interested in representations of documents for their retrieval, search strategies, and assessment of the relevance of retrieved documents, and not so much in representations of information needs and actions to be supported (Bates, 1989; Belkin, 1993; Schamber, 1994). On the other hand, studies in IS have mainly concentrated on the use of documents and channels for supporting various actions, but not on connecting their results to the development of IR systems (Ingwersen, 1992; Rouse & Rouse, 1984; Vakkari, 1997). Fig. 1 shows in greater detail the relations between the various types of IR and IS studies.
We can differentiate between system oriented and non-system oriented studies. The former approach information searching from the system's perspective; the latter pay attention to the user's way of conceptualizing search activities. These categories can be further divided depending on whether the object of observation in the studies is documents or channels of information.

The second major division is between task and non-task oriented studies. In task oriented studies, the point of departure in studying information actions is the task (or, more generally, the purpose) to be supported by the information. The other category of studies does not take this into consideration. The task dimension can be divided into subcategories depending on the degree to which the studies analyze information activities as a process, taking changes in time into account. Thus, the studies can be called process or non-process oriented.

A traditional IR study, which concentrates on representations of documents for their retrieval, is system oriented and studies documents, not tasks and processes. A typical library user study is interested mainly in the frequency of use of a certain channel and its services (a library), but not as a process or as connected to some tasks. Studies on IR interaction between texts (documents) and searchers are typically designed to concentrate on a single search session. They are search process oriented, but do not relate the search to a task. Citation studies can be described as document oriented. Typically, they do not take into account the tasks or processes that lead to the use of citations. Information seeking research
Fig. 1. Types of information seeking and information retrieval studies.
can be divided into at least two types. The first analyzes tasks as aggregates at the level of jobs, where the unit of observation is a job and its information requirements, and normally takes into account the task performance process. The second may both analyze the task performance process and relate information actions to that process. Both types are interested in the use of documents and channels.

1.2. Aims of the study

The point of departure of this paper is that analyzing the actions to be supported by information and IR systems is vital for understanding the needs for different types of information, search strategies and relevance assessments; in short, for understanding IR. A necessary condition for this understanding is to link results from IS studies to the body of knowledge produced by IR studies. The actions focused on in this paper are tasks, seen from the angle of problem solving. Certain features of work tasks and problem solving will be discussed, and these features will be related to the types of information people are looking for and using in their tasks, their patterning of search strategies for obtaining information, and their relevance assessments in choosing retrieved documents. The aim is to develop a general model for explaining variance in the type of information needed, in the search strategies adopted, and in relevance assessments in IR in work tasks. The major claim is that these information activities are systematically connected to task complexity and the structure of the problem at hand. The argumentation is based on both theoretical and empirical results from studies on IR and IS.

The position of actions to be supported by information in explaining information seeking and retrieval will be discussed first. Then an introduction is given to the basic concepts for constructing a model that explains the chosen information activities. A partial validation of the suggested theory will be given by presenting empirical findings from studies in IR and IS.

2. IR and IS studying information in support of purposeful action

In IS studies a central methodological rule has been to start the analysis of information needs and seeking by scrutinizing the activity they are a part of. Information seeking is seen as embedded in the activity that generates it (Dervin & Nilan, 1987; Kunz, Rittel & Schwuchow, 1977; Vakkari, 1998; Wilson, 1981). General models describing this connection have been developed by, e.g. Dervin (1992), Kuhlthau (1993), Wersig and Windel (1985), and Wilson (1981; 1997). However, there are only a limited number of advanced attempts at analyzing empirically how information seeking is related to the various features of the activity (work) processes it supports (e.g. Allen, 1978; Byström & Järvelin, 1995; Dervin, 1992; Garvey, 1979; Kuhlthau, 1993; Kunz & Rittel, 1977; Streatfield & Wilson, 1982; Sutton, 1994). In these studies certain features of the activity have been used to explain variation in the use of information, source and channel types. These features include the nature and phase of the R&D process (Allen, 1978; Garvey, 1979; Kunz & Rittel, 1977), the work process of professionals (Streatfield & Wilson, 1982; Sutton, 1994) and task uncertainty or complexity (Byström & Järvelin, 1995; Kuhlthau, 1993). The scope of only a few IS studies has included analysis of search strategies or relevance assessment (e.g. Ellis, 1989; Ellis & Haugan, 1997;
Kuhlthau, 1993; Sutton, 1994). Thus, the traditional domain of IR has mainly been untouched by IS studies (cf. Ingwersen, 1996; Vakkari, 1997).

The classical IR research model has concentrated on the representation of documents and queries and on techniques for the comparison of document and query representations. Although it has been interested in problems of selecting documents from a database in response to a query, the model has treated information needs as prefixed and static (Bates, 1989; Belkin, 1993; Ingwersen, 1996; Schamber, 1994). Bates (1989, p. 409) states that the query has been treated as a single unitary, one-time conception of the problem. The classical model has not addressed the problems of connecting search strategies or relevance evaluation to the activities supported by IR.

Several researchers (Bates, 1989; Belkin, 1980; 1993; Belkin, Seeger & Wersig, 1983; Ellis, 1989; Ingwersen, 1992; 1996; Saracevic, 1996) have contributed to developing a more general model of IR based on the interaction of texts and users. This interactionist approach supposes that information searching is inherently an interactive process between humans and texts, intermediated by an IR system. Bates (1989, pp. 409–410) describes IR as evolving searches. Every new piece of information the searcher encounters gives him new ideas and directions to follow, and consequently, a new conception of the query. She called this process berrypicking. Belkin (1993) also remarks that if an information seeker's knowledge changes by virtue of engagement with the text, it will be reflected in some change in the anomalous state of knowledge that led to that information seeking. Thus, it is evident that this will lead to a change in the search tactics of the seeker as well as in his criteria for assessing the relevance of the information carried by the texts.

The interactionist approach emphasizes the changing nature of information needs during the search process.
However, it typically studies moves between search tactics without relating these changes to variance in information needs. The research concentrates more on identifying the steps of search strategies than on explaining patterns of the strategies by changes in information needs or by features of the activity that generated the search. This is reflected in the research setting of many of these studies: IR has been studied as the interaction between texts and a searcher within a single search session (Spink, Wilson, Ellis & Ford, 1997). The session might consist of several search strategies and resemble the berrypicking model presented by Bates (1989). The searcher might modify at least the query, and probably the request, if the process changes his problematic situation. It is crucial that the IR interaction is understood here as a process containing a single search session.

If we analyze IR as a tool for obtaining information for problem solving, it becomes obvious that limiting the interaction to a single session does not cover the whole process of problem solving or task performance. Consequently, the single session approach will not depict accurately the changes in information needs during the problem solving process, and thus, it does not reflect the changes in the conceptual structure of the actor. Hert (1996) has shown that searchers do not greatly modify their goals, i.e. what they intend to accomplish, during a particular search session. They are not able to assess reliably the usefulness of the retrieved documents with the bibliographic information typically available. This weak relevance does not provide searchers with sufficient insight to make useful changes in cognitive structures concerning the problematic situation. This typically implies that a single search session does not resolve a problematic situation (Harter, 1992; Hert, 1996). However, the ultimate goal of IR systems is to match the representations of the conceptual structure of an actor in a
problematic situation to the conceptual structure of the texts or their surrogates in the system and provide relevant texts in each phase of the problem solving. The single session approach has problems in answering this challenge. Robertson and Hancock-Beaulieau (1992) and Spink (1996) have also suggested that researchers should take into account the whole process of information searching.

The research put forward by the interactionist approach has deepened our understanding of the IR process. Many studies (Bates, 1989; Belkin, Marchetti & Cool, 1993; Ellis & Haugan, 1997; Xie, 1997) have contributed to our understanding of the nature and elements of the information search process. However, they have left open the question of the relationship between the task performance that has led to an anomalous state of knowledge and the choice of information search tactics found in the studies. They have not tried to explain variations in information search strategies by a person's goals in the information seeking activity or by the nature of the person's problematic situation, as Belkin, Marchetti and Cool (1993, p. 328) suggest doing in order to create a more comprehensive theory of information searching. Thus, the studies have been classificatory.

To conclude, it seems that studies on IS have not succeeded in contributing to how changes in the problematic situation and consequent information needs are reflected in the patterning of search strategies and relevance assessments. The interactionist perspective in IR has been able to demonstrate theoretical ideas for relating tasks and problematic situations to IR activities, but its empirical studies have been classificatory, finding elements for building theories that link IR interaction with the actions it supports.

3. Basic concepts

Informational support is sought in situations when an actor does not have sufficient prior knowledge to accomplish his purposeful action.
This situation is conceptualized as an anomalous state of knowledge (Belkin, 1980), problematic situation (Wersig, 1979), sense making (Dervin, 1992; Weick, 1995) or uncertainty reduction (Daft & Lengel, 1986; Kuhlthau, 1993). The lack of understanding generates information actions for solving the problematic situation in order to proceed in the task. The major elements in the situation are the actions to be supported by information, the insufficient prior knowledge of the actor and informational support mechanisms. These elements are conceptualized as perceived by the actors.

In this paper, the scope of actions will be restricted to work tasks, which will be analyzed from the point of view of problem solving. The relation between task complexity and problem structure has been left partly open. The evolving framework is applicable only to information actions in work environments. In this phase, the framework fails to take into account features of the communities where the actions are taking place, e.g. their structure and information provision.

3.1. Task complexity

A worker's job consists of tasks, which in turn consist of levels of progressively smaller subtasks. Tasks are given or identified by the actor. Each task has a recognizable beginning and end, the former containing recognizable stimuli and guidelines concerning goals or measures to be taken (Byström & Järvelin, 1995; Hackman, 1969). Seen in this way, both large
tasks as such or any of their subtasks may be considered as a task. In this study tasks are analyzed as perceived by the actors, because the actors' understanding of the task is the basis for interpreting information needs and the promising actions for satisfying them (Belkin, Seeger & Wersig, 1983; Byström & Järvelin, 1995, p. 193).

The complexity of a task is a central feature in determining its performance and consequent information needs. Task complexity can be understood in many ways. It has been associated with the predeterminability of, or uncertainty about, the task. This dimension is related to the following characteristics of a task: repetitivity, analyzability, the number of alternative paths of task performance, and outcome novelty (Campbell, 1988; Järvelin, 1987; March & Simon, 1967; MacMullin & Taylor, 1984; Van den Ven & Ferry, 1980). Its central dimensions can be reduced to the a priori determinability of tasks and the extent of tasks (Byström & Järvelin, 1995, p. 193). In IS research, a priori determinability is the feature of task complexity which has
Fig. 2. Task categories.
been mostly used in studies (Vakkari, 1998). It will be the point of departure in defining task for the target framework. By task complexity is meant the degree of predeterminability of task performance. The predeterminability of a task can be divided into the predeterminability of its information requirements, process, and outcome.

The determinability of the task is often associated with its structuredness (Byström & Järvelin, 1995, p. 194; Van den Ven & Ferry, 1980). By structure is meant the elements of the task and their interrelations (cf. Partridge & Hussain, 1995, p. 82). The more structured the task, the more determined in advance is its performance. Both definitions imply that the determinability of the task increases when knowledge about its information requirements, process and outputs increases. The more the actor knows about the dimensions of the task, the less complex it becomes, and the easier it is to accomplish. Thus, we can connect the degree of predeterminability of a task to the structuredness of the knowledge or conceptual space of the performer about the task. The structure of the conceptual space depends on a person's prior knowledge of the dimensions of the task. If there is a severe lack of knowledge about the task, we can say that the person is in a problematic situation and has an anomalous state of knowledge.

Byström and Järvelin (1995, pp. 194–195) have divided task complexity into five categories according to the predeterminability of the information requirements, process, and output of the task (Fig. 2). Simple tasks are routine information processing tasks where the elements of the task are predetermined, i.e. the actor knows them. Complex tasks are new and genuine decision tasks where the information required for their accomplishment cannot be determined in advance.

3.2. Problem structure

Problem formulation and problem solving are distinct phases in task performance. The structure of a problem is a central feature in problem solving.
A problem is structured if the variables involved and their relationships are well known, and unstructured if they are unknown or vague (Partridge & Hussain, 1995, p. 82). In problem treatment, formulation creates a solution space and determines the information requirements of the task (Byström & Järvelin, 1995, p. 194). After the formulation step, the actor has a problem that might be solved, and he knows more clearly what information is relevant. The problem formulation includes the choice of the central elements (concepts and their relations) of the task and guides the actor to focus on them. The elements might be only vaguely understood if the actor does not have much prior knowledge about them. However, the formulation helps the task performer to focus on these elements and search for information to solve the problem in question.

It is plausible that there is a connection between task complexity and problem structure. The more structured the problem of a task, i.e. the more is known about its central variables and their interrelations, the more determined are its information requirements, process, and outcome (cf. Campbell, 1988; Van den Ven & Ferry, 1980). The more clearly an actor knows the major elements of a task, the better he is capable of assessing what kind of information is needed and what processes are required for its accomplishment. We can also infer that simple tasks are typically tasks with structured problems (cf. Campbell, 1988).

Kuhlthau (1993) has shown that task performance can be differentiated into various phases
depending on how clearly the actors understand its information requirements and structure the process. Once the focus of the task has been identified, actors tend to employ different information seeking strategies. Prior to the focus identification their understanding of the problem is vague and they use general information sources. After finding a focus their conception of the task becomes clearer and they use more specific sources. As Byström and Järvelin (1995, p. 194) have pointed out, in terms of Kuhlthau's (1993) model, the phases of initiation, selection, exploration and formulation belong to the problem formulation step, and collection and presentation to the problem solving step. If a problem is formulated, it is structured, and thus, it has a focus. The stages can be called the prefocus and postfocus stages.

Focus formulation of the problem is a crucial step in task performance. It means that the actor has been able to choose and structure the central concepts and their relations in the problematic situation. This mental construct directs the actor to observe certain features of reality and also gives him insight about what information might be useful. When the construct becomes more elaborated during the task performance, the anomaly in the problematic situation diminishes and the person's ability to understand and express his/her information needs increases (cf. Belkin, 1980; Kuhlthau, 1993; Yang, 1997). Thus, the process of task performance is characterized by an increasing awareness of its information requirements. The structure of the task also becomes clearer. The progress in task completion and problem solution is connected to the growth of knowledge on the issue at hand as well as to the decrease in perceived task complexity. During its execution, the task becomes less complex to its performer.

3.3. Prior knowledge

Prior knowledge about a task by an actor is a major factor in determining what information is needed for its accomplishment.
Current research both in cognitive psychology and in organization research has shown that human perception and learning of new categories is dependent on a person's knowledge and theories about the world. Theory refers to a body of knowledge that may include scientific principles, stereotypes, and informal observations of past experiences (Checkland & Holwell, 1998, pp. 98–104; Choo, 1998; Hahn & Chater, 1997, pp. 49–50; Heit, 1997, pp. 10–15; Weick, 1995). In learning about new categories people act as if these categories will be consistent with previous knowledge. People seem to act economically so that previous knowledge structures are reused when possible.

According to Heit (1997, pp. 10–15), learning of new categories is affected by our prior knowledge in at least three respects: integration, selective weighting, and facilitation effects. One of the basic influences of prior knowledge on the learning of new categories is integration of prior knowledge with new observations. The initial representation of a new category is based on prior knowledge, and this representation is updated gradually as new observations are made. Selective weighting effects of prior knowledge are also critical in category learning (Heit, 1997, pp. 11–13). Previous knowledge leads us to selectively attend to certain features or certain observations during concept learning, thereby narrowing the space of hypotheses to be considered. Some effects of prior knowledge are best described as simply the overall facilitation of learning. It seems plausible that learning about certain kinds of category
structures might be more or less facilitated depending upon the prior knowledge that is accessed, e.g. depending on the kind of category structure expected. Learning is most efficient when the structure to be learned is compatible with the structure that was expected according to prior knowledge (Heit, 1997, pp. 11–15).

These findings in cognitive psychology resemble the ideas developed in theory of science as well as in systems thinking. In theory of science a central principle is that observations are theory laden. Our way of conceptualizing a theory influences the way we observe the phenomena. Theory provides us with lenses to observe certain objects and to restrict features of those objects (Outhwaite, 1983, pp. 12–15). Systems theory and organizational research also remind us that an individual selectively perceives his world, judges it and takes intentional action in the light of those perceptions and judgements. This selective perception is a result of the interest, previous experience and history of the actor (Checkland & Holwell, 1998; Choo, 1998; Weick, 1995).

The significance of prior knowledge to human action and information processing is also expressed in the root definition of the cognitive paradigm in IR. It implies that each act of information processing, whether perceptual or symbolic, is mediated by a system of categories and concepts which, for the information processing device, constitute a world model (De Mey, 1980). Although the cognitive point of view has expressed an interest in prior knowledge and its relations to the changes of cognitive structures of actors in information searching, we can find few studies that try to capture this theoretically or empirically. In IS this problem is tackled by, e.g. Dervin (1983; 1992), Kuhlthau (1993), Sutton (1994), Todd (1997), Wersig (1973) and Yang (1997).
IR studies by Belkin (1980; 1993), Belkin, Seeger and Wersig (1983), Brooks, Daniels and Belkin (1985), Harter (1992), Hert (1996), Ingwersen (1992; 1996) and Wang (1997) are indicative. One of the main difficulties in research has been how to describe changes in understanding, i.e. in cognitive structures, in a way that would connect the change process to changes in information actions.

We can conclude that prior knowledge is vital in determining what information is needed to accomplish a task. This implies that the degree of knowledge about the task is a major factor which determines what type of information is sought, how the search strategy is formulated, and how the information discovered is evaluated and utilized. Thus, changes in prior knowledge steer changes in information activities. Changes in prior knowledge are also closely related to the degree of predeterminability of the task. The more we know, i.e. the greater our prior knowledge, the more we can anticipate and predetermine the task performance and its dimensions (cf. Belkin, Seeger & Wersig, 1983; Heit, 1997; Kuhlthau, 1993).

3.4. Cognitive structures

IR research is interested in cognitive structures both in the mind and in literature, and especially in their interaction in IR. Both can be described as consisting of concepts and their relations. In cognitive psychology these structures are called mental models or semantic networks (Gavin, 1998, pp. 85–107). In theory of science a familiar term for concepts and their relations is theory. If an actor has insufficient knowledge and thus, an insufficient conceptual structure about a
task, it implies that he does not have the necessary concepts and links for the phenomena he intends to understand. We can say that insufficient knowledge refers to the degree to which a person is capable of connecting a task with his prior knowledge. By combining this with our definition of task complexity, we can infer that insufficient knowledge refers to the degree to which a person is incapable of relating his prior knowledge about the information requirements, process and outcome of the task to his knowledge structures (cf. Belkin, Oddy & Brooks, 1982). The person might be lacking sufficient concepts, relations, or if–then relations.

The process of problem solving moves from a vague conception of the problem, with a state of uncertainty, towards a clearer understanding of the problem, with a coherent conceptual structure. Once the task is accomplished, a person has a more developed conceptual structure concerning the task. We can borrow ideas for describing the growth of theories from the theory of science for modeling this growth of knowledge and understanding (Kuokkanen & Savolainen, 1994; Wagner & Berger, 1985; Vakkari & Kuokkanen, 1997). The scope of a person's conceptual structure refers to the domain covered by those concepts. Differentiation refers to the number of elements of the conceptual structure in the domain. Integration refers to the number of interrelations between the concepts in the domain. Thus, we have three elements for describing a conceptual structure: scope, differentiation, and integration. The more differentiated and integrated the conceptual structure of a task, the richer it is. A rich conceptual structure is equivalent to abundant knowledge about a domain, e.g. a task.

It is also evident that the more concepts and links an actor has in a problem domain, the more options, i.e. positions, he has for creating and linking new concepts in his knowledge structure. The more the system (e.g.
knowledge structure) is differentiated and integrated, the more the potential information in the input is utilized in the outputs (Driver & Streufert, 1969, p. 274; cf. Isenberg, 1986). The more an actor knows about the task at hand, the easier it is for him to come up with new solutions for its accomplishment. An expert can be described as a person who has a rich conceptual structure in his field. His comprehensive and differentiated knowledge model of the problem domain provides him with more options (positions) to infer from his knowledge resources hypotheses for problem solving than does the more limited knowledge of a novice (Isenberg, 1986). This explains why experts are able to infer more hypotheses for a problem solution based on a single cue than can novices (cf. Marchionini, 1995, p. 34). We can thus infer that an increase in expertise and learning leads to an increase in predeterminability of the task. The idea of a rich conceptual structure providing more positions for efficient problem solving implies that the inferences are made on the basis of the relevant conceptual network and are not based on only a single category. This is because properties of objects are not independent and thus cannot be independently assessed in categorization but are embedded within networks of inter-property relationships, which organize and link them. Not only relational aspects between features, but also their causal connections can play a crucial role in categorization and linking (Hahn & Chater, 1997, p. 50). The meaning of a concept reflects the meaning of those concepts it is related to. We have shown that insufficient prior knowledge refers to a lack of concepts and their relations. This is typical in the pre-focus phase of task accomplishment (cf. Byström & Järvelin, 1995). In these situations, new concepts or categorizations of concepts, and new links between them, are particularly required.
Links can connect concepts into an if–then relation or other relations (hierarchical, associative, etc.). When a task structure is constructed in the
form of an emerging mental model, different types of knowledge are needed in this post-focus phase. This knowledge consists of parallel cases for corroborating the model, and facts and knowledge items that will fit within the model (cf. Heit, 1997, pp. 12–15; Yang, 1997). We can now briefly summarize the relation of our basic concepts as follows (cf. Belkin, Oddy & Brooks, 1982): the more complex the task, the more ill-structured it is, and the less prior knowledge the actor has. The richer the conceptual structure of an actor about a task, the more clearly can be expressed (1) what type of information is useful, (2) concepts and relations in a request and a query, and (3) criteria for relevant documents; and the easier it is to select relevant documents on the basis of metadata and retrieved documents. Fig. 3 depicts elements of the theory on task complexity and information actions. The model presents the elements at a general level. For empirical research it has to be specified in more detail.
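The three descriptive elements introduced in this section (scope, differentiation, integration) can be illustrated with a small data structure. The following is only a minimal sketch under stated assumptions: the class name `ConceptualStructure`, its methods, and the example concepts are hypothetical and do not come from the studies cited. Differentiation is counted as the number of concepts and integration as the number of links between them, following the definitions given above; the "richer than" comparison is one possible reading of "the more differentiated and integrated, the richer".

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class ConceptualStructure:
    """Toy model of one actor's conceptual structure for a task (illustrative only)."""

    scope: str  # label for the domain the concepts cover
    concepts: Set[str] = field(default_factory=set)
    # Each relation is an unordered pair of distinct concepts.
    relations: Set[FrozenSet[str]] = field(default_factory=set)

    def add_relation(self, a: str, b: str) -> None:
        """Link two distinct concepts, adding either one if it is new."""
        if a == b:
            raise ValueError("a relation needs two distinct concepts")
        self.concepts.update((a, b))
        self.relations.add(frozenset((a, b)))

    def differentiation(self) -> int:
        """Number of concepts in the domain."""
        return len(self.concepts)

    def integration(self) -> int:
        """Number of interrelations between concepts in the domain."""
        return len(self.relations)

    def is_richer_than(self, other: "ConceptualStructure") -> bool:
        """Assumed reading of 'richer': at least as differentiated and as
        integrated as the other structure, and strictly more on one measure."""
        mine = (self.differentiation(), self.integration())
        theirs = (other.differentiation(), other.integration())
        return mine[0] >= theirs[0] and mine[1] >= theirs[1] and mine != theirs


# Hypothetical novice and expert views of the same task domain.
novice = ConceptualStructure(scope="grant application")
novice.add_relation("budget", "deadline")

expert = ConceptualStructure(scope="grant application")
expert.add_relation("budget", "deadline")
expert.add_relation("budget", "funding body")
expert.add_relation("funding body", "review criteria")

print(novice.differentiation(), novice.integration())  # 2 1
print(expert.differentiation(), expert.integration())  # 4 3
print(expert.is_richer_than(novice))                   # True
```

Counting nodes and edges is a deliberate simplification. It ignores the type of each link (if–then, hierarchical, associative), which the text treats as a further property of the structure.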
4. Empirical findings

The following empirical studies on the relations between task performance, information types, search strategies and relevance assessment will be analyzed especially from the point of view of stages in task accomplishment. The main supposition is that the construction of the focus (formulating the problem) in task performance is systematically connected to the patterning of these information activities. The major claim is that the formulation of the problem also differentiates types of information activities. Thus, the task performance process will be divided into pre- and post-focus construction stages, and the results from the empirical studies will be interpreted according to that differentiation. The aim is to find results from empirical studies that support the theoretical ideas presented.
Fig. 3. Elements of a model on task complexity and information actions.
4.1. Tasks, information types and searching

A handful of empirical studies have linked task performance, information types and search patterns. The results show that there is a systematic connection between these elements. Based on a series of studies on the search process during learning, Kuhlthau (1993) has differentiated the task completion process into six phases and has shown that the information sought and search types vary accordingly. To summarize, the finding of a focus is crucial in the search process. In the pre-focus phases thoughts are general, fragmented and vague, and actions involve seeking background information. The searcher is unable to construct the task and unable to express specifically what kind of information is needed for it. Browsing and discussions with other people are the most frequently used modes of information searching. General background sources such as encyclopedias are mostly used at this stage. After a focus has been construed, the search for information becomes more directed. Thoughts about the task become clearer and more structured. A clearer understanding guides the person to seek relevant, focused information using the whole spectrum of information resources. At the end of the process rechecking searches are made for possible additional information. The findings by Yang (1997, pp. 81–82) corroborate Kuhlthau's results. He identified three major search strategies during the information seeking process in hypertext by students. Each state of searching reflected the subjects' current mental state. They typically engaged in exploratory searching before coming up with a specific direction. At this stage, they aimed to establish a framework for their task. Databases were searched without specific criteria or a coordinated plan. Purposeful searching occurred once they could maintain more constant points of reference. At this stage they could search for specific information which they had identified as directly relevant to the current goals.
Finally, they demonstrated associative searching when they proactively looked for related and interconnected information to support arguments they had already established. Yang (1997, p. 81) also showed that as the task becomes clearer, the share of exploratory and purposeful searches decreases and that of associative searches increases. In a study of the work of higher civil servants, Byström and Järvelin (1995) showed that task complexity is systematically connected to the use of certain information types. They differentiate between problem information (PI), domain information (DI) and problem solving information (PSI). PI describes the structure, properties and requirements of the problem. DI consists of known facts, concepts, laws, and theories in the domain of the problem. PSI covers the methods of problem treatment. It describes how problems should be formulated, and what PI and DI should be used in order to solve the problem. According to Byström and Järvelin (1995), as task complexity increases (1) the number of different information types increases, (2) the need for DI and PSI increases, (3) the share of general purpose sources increases, (4) the share of problem and fact oriented sources decreases and (5) the number of sources used increases.

4.2. Tasks and search strategies

The studies by Kuhlthau (1993) and Yang (1997) already showed connections between search strategies and the stage of task performance. Next, results by Ellis and Haugan (1997) show the relation between the R&D process and search patterns. Then the categorization of search
strategies by Belkin, Marchetti and Cool (1993) will be introduced. Finally, a summary of the findings on search strategies will be given, and they will be connected to the problem structure. Ellis and Haugan (1997, p. 388) have modeled the information searching patterns of engineers and research scientists. They found eight generic search strategies, of which the five that relate to this study are introduced. Surveying is characterized by the initial search for information to obtain an overview of the literature. Chaining means following chains of different forms of referential connections. Monitoring is characterized by activities involved in maintaining awareness of developments in a field by regularly following particular sources. Browsing refers to the scanning of primary sources or metadata from searches. Ending means rechecking of sources in the final stage of accomplishing a task. Ellis and Haugan (1997, p. 388) leave open the interrelations of these activities. However, in some cases, they relate the stage of the research process to certain information seeking patterns. Ellis and Haugan (1997, p. 400) claim that when researchers progress through preliminary to advanced phases of the project and become more knowledgeable and specific about the problem, they are increasingly selective. The use of formal channels decreases as they progress in the project, and person to person communication becomes more dominant. Thus, the more familiar the topic, the more the use of formal channels decreases and informal, person to person communication, increases. Studies by Allen (1978) and Garvey (1979) also show that the type and source of information used by scientists and engineers vary depending on the stage of the R&D project. These studies also showed that problem formulation crucially differentiated source selection and information use. Belkin, Marchetti and Cool (1993) have categorized information seeking strategies by combining three dimensions.
Method of interaction differentiates between searching, which refers to looking for a specific item, and scanning, which means looking around for something interesting. The goal of interaction is either to learn about relevant issues by inspecting items and their contents or to select useful items by identification. Mode of retrieval differentiates between recognition, which refers to looking around in a group of items, and specification, which refers to searching for items on some identified topic.
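Combining the three dimensions just described can be sketched as a cross product. The snippet below is an illustrative sketch only: the enum and function names are hypothetical, and it assumes each dimension takes exactly the two values named in this passage. Whether every resulting combination is treated as a distinct strategy in the original categorization is not stated here, so the enumeration should be read as the space of possible combinations, not as a reproduction of that scheme.

```python
from enum import Enum
from itertools import product


class Method(Enum):
    SEARCHING = "looking for a specific item"
    SCANNING = "looking around for something interesting"


class Goal(Enum):
    LEARNING = "learning about relevant issues by inspecting items"
    SELECTING = "selecting useful items by identification"


class Mode(Enum):
    RECOGNITION = "looking around in a group of items"
    SPECIFICATION = "searching for items on some identified topic"


def all_strategies():
    """Return every (method, goal, mode) combination of the three dimensions."""
    return list(product(Method, Goal, Mode))


strategies = all_strategies()
print(len(strategies))  # 8, i.e. 2 x 2 x 2 combinations
for method, goal, mode in strategies:
    print(f"{method.name:<10} {goal.name:<10} {mode.name}")
```

Representing a strategy as a tuple of dimension values keeps each dimension independent, so a single strategy can be compared with another one dimension at a time.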
Fig. 4. Problem structure and search strategies.
As argued earlier, the structure of the problem influences the patterning of search strategies. It can be supposed that the more structured the problem, and thus the more focused a task is, the more structured is the respective search. By structure of a search is understood the degree of prior knowledge of the terms and their relations in a search. The suggestion that the structure of the problem differentiates search strategies is depicted in Fig. 4. Ill-structured problems lead to ill-structured searches. If concepts and their links in a problem domain are vague, it will not be possible to formulate an exact request and consequently a structured query (Marchionini, 1995; Taylor, 1986). Browsing is the generic search strategy in these cases. By browsing is understood information searching where the initial search criteria or goals are only partly defined or known in advance (Chang & Rice, 1993, p. 235). In several studies (Chang & Rice, 1993, p. 238; Marchionini, 1995, p. 100; O'Connor, 1993, p. 213) it is argued that browsing is a major mode of looking for ideas, approaches and general information when shaping the structure of a task. Surveying and chaining (Ellis & Haugan, 1997), and journal run (Bates, 1989) are typical strategies, as is browsing of references and abstracts from retrospective searches. In the latter case, a differentiation between browsing where the object is metadata and scanning where the object is text should be made. The goal of interaction in the current case is learning (cf. Belkin, Marchetti & Cool, 1993). In the pre-focus phase, when a person's cognitive structure is vague, it is more likely that he is browsing in the database in order to learn something about the problem domain rather than directly selecting texts, although this would be expected to be his ultimate goal. It is also likely that persons with ill-structured problems choose recognition as their mode of retrieval instead of specification.
Because their topic is unfocused they do not have an identified topic to search for. It is more typical in this situation to look around in a group of items. In the opposite case, when the problem, and thus also the request and query, are structured, the term for this generic search strategy would be querying. In the post-focus stage the typical search strategy is the formulation of exact queries. The goal of interaction is more often to select useful items, and the mode of retrieval is specification, where one is looking for items on some identified topic.

4.3. Tasks and relevance

There are a handful of empirical studies analyzing relevance assessments in relation to dealing with problems. Saracevic and Kantor (1988) found that if a user has a well-defined problem, then the probability of judging the retrieved items as relevant increases. If a problem is well defined, the user has been able to form a focus for his problem. In a post-focus situation, the actor has a structured mental model of the phenomena he is interested in, and thus, he is able to assess clearly the contribution of the sources to his task. Saracevic and Kantor (1988) also showed that if a request is low in clarity and specificity, then the possibility of judging the retrieved items as relevant increases; and if a question is high in complexity or presuppositions, it increases the odds that retrieved items would be assessed as relevant. Both of these cases refer to the pre-focus stage of task performance when the requester is unable to formulate his problem. In the pre-focus phase the conceptual structure of the problem is vague and the discrimination power of its concepts is low. The criteria for relevance are loosely defined, partial, and fuzzy. All references that indicate that the source could contribute in some way to
the problem solution are accepted. An ill-structured problem implies an ill-structured set of relevance criteria, which in turn has a low filtering capacity. Wang (1997) has studied how users' information needs change at different stages of a research process by analyzing their document selection from retrieved documents. She analyzed the vocabulary of users in request, document selection and post-project stages. She showed that the persons introduced narrower and related terms as the research process proceeded (Wang, 1997, pp. 312–313). Introduction of narrower terms refers to the specification of the research problem and construing a focus for the work. It is typical in the research process that a general topic transforms into a more focused research problem. Conceptual relations within the problem field also become more differentiated. This implies the possibility of using narrower and related terms. Wang (1997, p. 315) also showed that the actual vocabulary in each later search stage is substantially larger in size than in the previous one, broader and deeper in hierarchy, and wider in breadth. This also refers to the growth of understanding of the research problem and its solution. When the researcher is proceeding towards the end of the research process with a structured problem, he or she has a more detailed and structured image of the research theme. Central concepts and their relations can be expressed in greater detail. Thus, the vocabulary is broader and the terms more specific and precise. It can be concluded that the more structured the problem, the more precisely the relevance of a document can be assessed both in terms of content and metainformation.
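The pre- and post-focus patterns described in this section can be gathered into a small lookup. This is a hedged sketch, not a formalization proposed in the studies reviewed: the names `ExpectedActions`, `PRE_FOCUS`, `POST_FOCUS` and `expected_actions` are hypothetical, and treating problem structure as a yes/no flag is a simplifying assumption, since the argument speaks of degrees of structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedActions:
    """Information actions the text associates with one stage (illustrative only)."""

    search_strategy: str
    goal_of_interaction: str
    mode_of_retrieval: str
    relevance_criteria: str


# Ill-structured problem: vague concepts, low discrimination power.
PRE_FOCUS = ExpectedActions(
    search_strategy="browsing",
    goal_of_interaction="learning",
    mode_of_retrieval="recognition",
    relevance_criteria="loosely defined, partial and fuzzy",
)

# Structured problem: exact request and query can be formulated.
POST_FOCUS = ExpectedActions(
    search_strategy="querying",
    goal_of_interaction="selecting",
    mode_of_retrieval="specification",
    relevance_criteria="precise",
)


def expected_actions(problem_is_structured: bool) -> ExpectedActions:
    """Return the pattern expected for a structured or an ill-structured problem.

    Assumption: a binary flag stands in for the degree of problem structure.
    """
    return POST_FOCUS if problem_is_structured else PRE_FOCUS


print(expected_actions(False).search_strategy)  # browsing
print(expected_actions(True).mode_of_retrieval)  # specification
```

A table of this kind could serve as a set of hypotheses to check against observed search sessions. It does not capture the gradual vocabulary growth reported by Wang (1997).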
5. Conclusions

Task complexity and the related structure of the problem are crucial factors determining task performance. They are connected to the types of information people are looking for and using, to the patterning of search strategies, and to the choice of relevance criteria in tasks. Research in IS and IR has produced results which could be used as elements in pursuit of a deeper understanding of information actions and information seeking in general. By integrating results from both fields, we are able to create a more holistic view of the search process and its different stages. In this paper an outline of a model relating task complexity and information actions has been sketched. The model explains variance of these information actions in task performance. It broadens the scope of earlier studies in three respects. Firstly, most of the earlier studies have been classificatory in describing the various categories of the studied information activities. This study has proposed some systematic relations between categories of information activities, task complexity and problem structure. It provides a tentative model for explaining information activities. However, for empirical testing, the model should be specified further. Secondly, this model connects studies of information types, search strategies, and relevance assessments into one framework, whereas previously it has been common to study them separately. The model facilitates understanding of interrelations between these three information activities, and provides a more holistic understanding of them. Thirdly, the model contains concepts which can be linked to concepts and models in other fields of research. By elaborating it, we can connect studies on information types, search strategies, and relevance
assessments to ideas in cognitive psychology, artificial intelligence, organization research, and communication research. Task complexity and its systematic relations to central dimensions of the information actions are elements from which we can build a more comprehensive and precise theory of information searching. This presupposes two kinds of theoretical activities. First, we have to specify the working strategy (Wagner, Berger & Zeldith, 1992) of information searching, and incorporate task complexity into it. Studies elaborating Belkin's ASK working strategy in the 1980s are encouraging examples (e.g. Belkin, Oddy & Brooks, 1982; Belkin, Seeger & Wersig, 1983; Brooks, Daniels & Belkin, 1985). Ingwersen's (1996) attempt at this kind of working strategy might be another interesting point of departure. Second, we should sketch a research program for analyzing the various relations between task complexity and the major dimensions of information activities. A research program consists of a set of interrelated unit theories and the empirical research for testing them (Wagner & Berger, 1985). Thus, concerted efforts for theoretically sound empirical research would lead to a growing understanding of the phenomena of information seeking and use. This in turn could be utilized in the building of systems or other means of obtaining information.
References

Allen, T. J. (1978). Managing the flow of technology: technology transfer and dissemination of technological information within the R&D organization. Cambridge, MA: MIT Press.
Bates, M. (1989). The design of browsing and berrypicking techniques for the online search interface. Online Review, 13(5), 407–424.
Belkin, N., Oddy, J., & Brooks, F. (1982). ASK for Information retrieval I and II. Journal of Documentation, 38(2/3), 61–164.
Belkin, N., Seeger, H., & Wersig, G. (1983). Distributed expert problem treatment as a model for information system analysis and design. Journal of Information Science, 5(5), 153–167.
Belkin, N., Marchetti, P., & Cool, C. (1993). Braque: design of an interface to support user interaction in information retrieval. Information Processing and Management, 29(3), 325–344.
Belkin, N. (1980). Anomalous states of knowledge as basis for information retrieval. Canadian Journal of Information Science, 5, 133–143.
Belkin, N. (1993). Interaction with texts: information retrieval as information seeking behavior. In Information retrieval '93. Konstanz: Universitätsverlag Konstanz.
Belkin, N., & Vickery, A. (1985). Information in Information Systems. British Library LIRR 35. London: British Library.
Brooks, H., Daniels, P., & Belkin, N. (1985). Problem descriptions and user models. Informatics, 8, 191–214.
Byström, K., & Järvelin, K. (1995). Task complexity affects information seeking and use. Information Processing and Management, 31(2), 191–213.
Campbell, D. J. (1988). Task complexity: a review and analysis. Academy of Management Review, 13(1), 40–52.
Chang, S., & Rice, R. (1993). Browsing: a multidimensional framework. In M. Williams, ARIST 28 (pp. 231–276). White Plains, NY: Knowledge Publications.
Checkland, P., & Holwell, S. (1998). Information, systems and information systems. New York: John Wiley.
Choo, C. (1998). The knowing organization. Oxford: Oxford University Press.
Daft, R. L., & Lengel, R. H. (1986). Organizational information requirements: media richness and structural design. Management Science, 32(5), 554–571.
De Mey, M. (1980). The relevance of cognitive paradigm to information science. In O. Harbo, Theory and application of information research (pp. 48–61). London: Mansell.
Dervin, B. (1983). Information as a user construct: The relevance of perceived information needs to synthesis and interpretation. In S. A. Ward, & L. A. Reed, Knowledge structure and use: implications for synthesis and interpretation. Philadelphia, PA: Temple University Press.
Dervin, B. (1992). From the mind's eye of the user: the sense-making qualitative methodology. In J. D. Glazier, & R. R. Powell, Qualitative research in information management (pp. 61–84). Englewood, CO: Libraries Unlimited.
Dervin, B., & Nilan, M. (1986). Information needs and uses. In M. E. Williams, Annual Review of Information Science and Technology, 21 (pp. 3–33). White Plains, NY: Knowledge Industry Publications.
Driver, M., & Streufert, S. (1969). Integrative complexity: an approach to individuals and groups as information processing systems. Administrative Science Quarterly, 14, 272–285.
Ellis, D., & Haugan, M. (1997). Modelling the information seeking patterns of engineers and research scientists in an industrial environment. Journal of Documentation, 53(4), 384–403.
Ellis, D. (1989). A behavioural approach to information retrieval system design. Journal of Documentation, 45, 171–212.
Garvey, W. D. (1979). Communication: the essence of science. Oxford: Pergamon Press.
Gavin, H. (1998). The essence of cognitive psychology. London: Prentice Hall.
Hackman, J. R. (1969). Toward understanding the role of tasks in behavioral research. Acta Psychologica, 31, 97–128.
Hahn, U., & Chater, N. (1997). Concepts and similarity. In K. Lamberts, & D. Shanks, Knowledge, concepts and categories (pp. 43–92). Hove: Psychology Press.
Harter, S. (1992). Psychological relevance and information science. JASIS, 43(9), 602–615.
Heit, E. (1997). Knowledge and concept learning. In K. Lamberts, & D. Shanks, Knowledge, concepts and categories (pp. 7–41). Hove: Psychology Press.
Hert, C. (1996). User goals on an online public access catalog. JASIS, 47(7), 504–518.
Ingwersen, P. (1992). Information retrieval interaction. London: Taylor Graham.
Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: elements of a cognitive IR theory. Journal of Documentation, 51(1), 3–50.
Isenberg, D. (1986). Thinking and analyzing: a verbal protocol analysis of managerial problem solving. Academy of Management Journal, 29(4), 775–788.
Järvelin, K. (1987). On information, information technology and the development of society: an information science perspective. In P. Ingwersen, L. Kajberg, & A. M. Pejtersen, Information technology and information use (pp. 35–55). London: Taylor Graham.
Kuhlthau, C. (1993). Seeking meaning. Norwood, NJ: Ablex.
Kunz, W., & Rittel, H. (1977). A systems analysis of the logic of research and information processes. München: Verlag Dokumentation.
Kunz, W., Rittel, H., & Schwuchow, W. (1977). Methods of analysis and evaluation of information needs: a critical view. München: Verlag Dokumentation.
Kuokkanen, M., & Savolainen, J. (1994). The growth of sociological theories: A structuralist alternative to seeking theoretical continuity. Quality and Quantity, 28, 345–370.
MacMullin, S. E., & Taylor, R. S. (1984). Problem dimensions and information traits. The Information Society, 3, 91–111.
March, J., & Simon, H. (1967). Organizations (2). New York: John Wiley.
Marchionini, G. (1995). Information seeking in electronic environments. Cambridge: Cambridge University Press.
O'Connor, B. (1993). Browsing: a framework for seeking functional information. Knowledge: Creation, Diffusion, Utilization, 15(2), 211–232.
Outhwaite, W. (1983). Concept formation in social science. London: Routledge.
Partridge, D., & Hussain, K. (1995). Knowledge based information-systems. London: McGraw-Hill.
Robertson, S., & Hancock-Beaulieau, M. (1992). On the evaluation of IR systems. Information Processing and Management, 28(4), 457–466.
Rochester, M., & Vakkari, P. (1998). International LIS research. IFLA Journal, 24(3), 166–175.
Rouse, W., & Rouse, S. (1984). Human information seeking and design of information systems. Information Processing and Management, 20(1/2), 129–138.
Saracevic, T., & Kantor, P. (1988). A study of information seeking and retrieving. Part II. Users, questions and effectiveness. JASIS, 39(3), 176–177.
Saracevic, T. (1996). Relevance reconsidered '96. In P. Ingwersen, & N. Pors, Information science: integration in perspective (pp. 201–218). Copenhagen: Royal School of Librarianship.
Schamber, L. (1994). Relevance and information behavior. In M. Williams, ARIST 29 (pp. 3–48). White Plains, NY: Knowledge Publications.
Spink, A., Wilson, T., Ellis, D., & Ford, N. (1998). Modeling user's successive searches in digital environments. DLib Magazine, p. 1998.
Spink, A. (1996). Multiple search sessions model of end-user behavior: an exploratory study. JASIS, 47(8), 603–609.
Streatfield, D. R., & Wilson, T. D. (1982). Information needs in local authority social services departments: a third report on project INISS. Journal of Documentation, 38(4), 273–281.
Sutton, S. A. (1994). The role of attorney mental models of law in case relevance determinations: an exploratory analysis. JASIS, 45(3), 186–200.
Taylor, R. (1986). Value-added processes in information systems. Norwood, NJ: Ablex.
Todd, R. (1997). Information utilization: a cognitive analysis of how girls utilize drug information based on Brookses' fundamental equation. In P. Vakkari, R. Savolainen, & B. Dervin, Information seeking in context (pp. 351–370). London: Taylor Graham.
Vakkari, P., & Kuokkanen, M. (1997). Theoretical growth in information science. Applications of the theory of science to a theory of information seeking. Journal of Documentation, 53(5), 497–519.
Vakkari, P. (1997). Information seeking in context: a changing and challenging metatheory. In P. Vakkari, R. Savolainen, & B. Dervin, Information seeking in context (pp. 452–646). London: Taylor Graham.
Vakkari, P. (1998). Growth of theories on information seeking: an analysis of growth of a theoretical research program on relation between task complexity and information seeking. Information Processing and Management, 34(3/4), 361–382.
Van de Ven, A., & Ferry, D. (1980). Measuring and assessing organizations. New York: John Wiley.
Wagner, D., & Berger, J. (1985). Do sociological theories grow? American Journal of Sociology, 90, 697–728.
Wagner, D., Berger, J., & Zeldith, M. (1992). A working strategy for constructing theories. In G. Ritzer, Metatheorizing (pp. 107–123). Thousand Oaks, CA: Sage.
Wang, P. (1997). User's information needs at different stages of a research project: a cognitive view. In P. Vakkari, R. Savolainen, & B. Dervin, Information seeking in context (pp. 307–318). London: Taylor Graham.
Weick, K. (1995). Sensemaking in organizations. Thousand Oaks, CA: Sage.
Wersig, G., & Windel, G. (1985). Information science needs a theory of information action. Social Science Information Studies, 5(1), 11–23.
Wersig, G. (1973). Informationssoziologie: Hinweise zu einem informations-wissenschaftlichen Teilbereich. Frankfurt am Main: Athenäum.
Wersig, G. (1979). The problematic situation as basic concept of information science in the framework of the social sciences. In New trends in informatics and technology. Moscow: VINITI.
Wilson, T. (1997). Information behaviour: an interdisciplinary perspective. In P. Vakkari, R. Savolainen, & B. Dervin, Information seeking in context (pp. 39–52). London: Taylor Graham.
Wilson, T. (1981). On user studies and information needs. Journal of Documentation, 37(1), 3–15.
Xie, H. (1997). Planned and situated aspects in interactive IR: patterns of user interactive intentions and information seeking strategies. In ASIS '97, Proceedings of the Annual Meeting, 34. Medford, NJ: Information Today.
Yang, S. (1997). Information seeking as problem-solving using a qualitative approach to uncover the novice learners' information-seeking process in a Perseus hypertext system. Library and Information Science Research, 19(1), 71–92.