Supporting patent retrieval in the context of innovation-processes by means of information dialogue

Supporting patent retrieval in the context of innovation-processes by means of information dialogue

World Patent Information 31 (2009) 315–322 Contents lists available at ScienceDirect World Patent Information journal homepage: www.elsevier.com/loc...

509KB Sizes 1 Downloads 37 Views

World Patent Information 31 (2009) 315–322

Contents lists available at ScienceDirect

World Patent Information journal homepage: www.elsevier.com/locate/worpatin

Supporting patent retrieval in the context of innovation-processes by means of information dialogue Paul Landwich, Tobias Vogel, Claus-Peter Klas *, Matthias Hemmje FernUniversität Hagen, Universitätsstr. 1, 58097 Hagen, Germany

a r t i c l e

i n f o

Keywords: Information retrieval Innovation-process Dialogue systems Patent retrieval Result visualization Interactive search

a b s t r a c t Innovations are an essential factor of competition for manufacturing companies in technical industries. Patent information plays an important role for innovation-processes and innovators in the knowledge management. The combination of cross-organizational spread information and resources from patent databases and digital libraries is necessary in order to gain profit for innovation experts. The major challenge is to overcome the current information deficit and to fulfill the information need of the experts in the innovation-process. In this paper we first present in detail three innovation scenarios to derive challenges on advanced information systems which support an information dialogue but also the complete search process. Then we define essential conditions of a search task and from that we derive the elementary information sets and activities in the next step. An example shows the applicability and utility of the formalization described and shows how the activities fill up the user’s information dialogue context. During the formalization we apply a cognitive walk-through on a patent database. We will use the DAFFODIL-system as an experimental system for further development and evaluation of the above described framework. Ó 2009 Elsevier Ltd. All rights reserved.

1. Introduction and problem description Innovation seeking enterprises can no longer rely on products created by accidental creativity or by isolated ingenious inventors hunting for ideas. Novel technical solutions are a competition factor and essential for entrepreneurial success [1]. Therefore companies want to create new products systematically [1] considering restrictions of time and costs, while increasing product quality [2,3]. Systematic development of unique products demands innovation-processes, which are based on all available information and provide relevant information. This, for the firms appeared situation rescues challenge for three areas. We have an innovationprocess with the need for information. To reduce this information need information retrieval systems are used. A better cognitive processing of the information is improved by information visualization. In the following we want to introduce these three areas. Innovation-processes: Are defined sequences of activities with the goal to development new products [4]. To profit from ideas and unique solutions, patents are used to claim and protect intellectual property rights [2]. Applications for patents are the closing activities in an innovation-process. When innovators (effected is a plurality of experts of different domains, e.g. engineers, scientists, developers and patent attorneys) work on innovation-process activities (as patent research or analysis of available products, * Corresponding author. E-mail address: [email protected] (C.-P. Klas). 0172-2190/$ - see front matter Ó 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.wpi.2009.04.004

etc.), they need to extract valuable information within their company from distributed databases of imaginable different file types and formats. Innovation-processes become more and more crossorganizational, often with suppliers’ and customers’ experts as corporate partners supporting collaborative development and creation of novel products [5]. Völker et al. [3] provide an overview of knowledge sources for innovation-processes and state, that next to company internal knowledge, external knowledge sources adopt a significant role, e.g. knowledge of patent agencies, research facilities, federations and associations. Information retrieval: Innovating experts have to retrieve information from external information sources, e.g. external data memories and digital libraries, which are often only accessible by proprietary text-driven search services. Many times, affliction of experts in technical industries, are due to a lack of information caused by impenetrable information, difficult search mechanisms and incomprehensible result presentation. Unavailable and not utilizable information may cause hesitation and prolongation of development duration and time-to-market. Information as a result of information retrieval and patent research is a key-resource and also a determining factor of successful innovations. Information visualization: Card et al. [6] define information visualization, as ‘‘. . .the use of computer-supported, interactive, visual representations of abstract data to amplify cognition”. Therefore information visualization is more than display of data sets. It can be considered as a methodology for visual data mining. Users interactively manipulate the way data is presented until the resulting

316

P. Landwich et al. / World Patent Information 31 (2009) 315–322

visualization reveals the required information. In classical information retrieval we find mostly a result list. We know that a result list and text-based display are important for information retrieval [7]. Users mostly need experience to get visualization tools and decrease response time. Therefore one is apt to prefer a text-based display to get a quick result. But this is not the comprehension of [6] of information visualization. We need also visualization tools to support the perception and analysis of found information-objects and the information context in which the user moves during his search process. Goal of this visualization is the presentation of the current situation within the search process [8]. After this Introduction, in the second section innovation scenarios are used to point out information needs of experts working on innovations. From these needs challenges on advanced information systems like query reusability and result visualization are derived. Section three gives an overview on innovative search with related work in the fields of information and patent retrieval. In section four we present our understanding of an information dialogue and retrieval mechanism (with elementary information sets and activities) and offer a cognitive walk-through. In section five, we provide an outlook on future information retrieval systems.

2. Innovation scenarios and resulting challenges for patent retrieval Innovation experts complain about information deficit. Following scenarios within innovation-processes point out information needs of experts inside companies but also cross-organizational. Each scenario description explains goals to be fulfilled by process-oriented approaches and is used to derive challenges on information retrieval, especially challenges in patent retrieval, retrieval mechanisms, query entering, information access and information visualization. 2.1. Interfaces for information need description and query reusability Experts working on innovations will start to analyze existing patents to figure out, which claims already exist and which solutions are available, free of charge or even under license. On one hand, there is patent information available in patent databases; on the other hand there is a need of information by experts. Major objective is:  How to make experts’ lack of information and patent information overloads match? The challenge is to conciliate plurality of information and demand of information. Information seeking experts have to externalize and formulate questions or keywords. They have to describe their informational needs. Furthermore they have to formalize and construct a syntactically correct query to feed text-based user interfaces of patent search engines. In case of match, results will be presented in a result-list and must be examined manually. After a selection, valuable patent documents can be stored locally one by one, or in a company internal file system. A typical challenge occurs:  What happens if the search results are neither matching nor satisfying? Today in consequence, experts enter complete new queries or combinations of last tried keywords extended by additional ones to refine the search. But are text-based user interfaces useful to search for CAD files or technical drawings? How about using similar keywords with different syntax or abbreviations? How about synonyms and keywords in different languages? Text-driven approaches will not always lead to expected results.

Incremental innovation-processes [5,1] are used to force innovations in discrete time steps. These processes are for example used to enforce continuous development and improvement of technical products, e.g. (next generation) electronic control units for vehicles [9]. Therefore engineers will satisfy their information needs by seeking for patents in the fields of stateof-the-art electronics, advanced test equipment, advanced products of competitors and further more. As mentioned, incremental innovations will take place in discrete time. So in discrete time, engineers will restart text-based search by using same or similar queries. Experts will be confronted with the challenge of query reuse:  How to save, provide and reuse search queries for the next search? Innovation-process activities within innovation-process models can be used to structure and store information [9]. In WPIM (knowledge-based and process-oriented innovation-management) all available information, e.g. patents, lessons learned as well as search-queries, can be stored in context of innovationprocess activities [10,9,5]. Challenging for the advanced satisfaction of informational needs is to setup interactive user interfaces to reuse query and to provide auto-fill function when entering queries. Query reuse might reduce time to formulate queries as well as spelling mistakes. As shown, within innovation-processes experts with different technical background will search in patent databases. Therefore the user interface design should be simple, self-describing and interactive. A user interface that allows dialogues could be an approach to support patent retrieval. Retrieval dialogues could start with a set of keyword, which must be extendable, changeable and refine able without loosing search results of previous searches. For advanced informational retrieval systems it is a challenge to setup simple, interactive and self-describing user interfaces to reuse query. 2.2. Collaboration on innovation due to combination of external information In inter-organizational innovation-processes different companies can collaborate on innovations. Manufacturing companies in technical industries demand their business partners, e.g. customers and suppliers to contribute initial ideas to innovative products. But also cooperation and collaboration with research organizations, patent agencies and scientific communities are common. It is a challenge:  To exploit external knowledge for the own business model aiming at unique innovations. In such a collaborating environment, so called innovation-network [11] experts feel an information deficit and an information need to understand the field of work their cooperating partners are specialized in. If necessary, manufacturing companies are willing to pay a license fee to cooperation partners or patent owners. An example is the CAN-Bus technology (http://www.semiconductors.bosch.de/), a patent hold by Robert Bosch GmbH. Most car manufacturer pay for using CAN-Bus. A further example for collaborating on innovations is the Night-Vision (http:// www.bosch-nightvision.com) project, developed in cooperation between Robert Bosch GmbH and Daimler AG [10,9]. Future solutions in automotive bus technologies will be under FlexRay (http://www.FlexRay.de) licensing. Patents are already held by FlexRay Consortium GbR, an industrial partnership of General Motors, BMW, Daimler, VW, Freescale, NXP, and Bosch. Such collaborating partner share their ideas, exchange concepts and models in early stages of innovation development. Within open

P. Landwich et al. / World Patent Information 31 (2009) 315–322

innovation-process documents and patents are shared, to give partners a deep insight into state of research and providing knowledge. Focusing on information, it is challenging to access and retrieve data from external information sources, e.g. databases of cooperating organizations. Information is spread in different organizations and also on the web. Furthermore, organizations have own information storing systems like (partly) accessible servers, databases and digital libraries. Collaborating partners share their ideas, documents and patents. Thereby it is challenging to access and retrieve information from external databases of cooperating organizations.  How to extract information from (various) sources a single step? For example, how to extract FlexRay patents of patent agency database and FlexRay specification documents of FlexRay webpage in only one step? Information seeking experts demand to receive a single, combined result set of one query, containing information retrieved from spread external information sources as digital libraries, distributed patent collections and the web. Effective innovation-processes supporting collaborating experts demand for information retrieval using one query for several databases, combining, presenting or visualizing search results in a single result set. 2.3. Innovation experts expect advanced result visualization An activity within innovation-processes can be research inside a patent database before applying and claiming an own patent [9]. Intention is to figure out related fields of innovation and existing claims. Today’s patent retrieval services as DEPATISnet (http:// depatisnet.dpma.de) present retrieved results text-driven, as an index or a list. Text-based representations are important but can put a cognitive burden on human experts. Too many matches and a flood of results make humans feel lost in information. Information visualizations have the potential to reduce this cognitive overload for human spectators. Requirement for advanced information retrieval systems:  Innovating experts and engineers need a self-explaining presentation or visualization of retrieved search results. Also retrieval experts receive patent results text-based while using conventional expert search engines. When patent officers receive a new patent application they check if demanded claims are already reserved. In order to find patents from same applicants it can be helpful to search the patent database by (sets of) inventors’ names. Retrieval experts should be supported in storing, integrating and combining different retrieval sets by result visualization. Search algorithms should provide features to search keywords and combination of keywords, but also to search focused on relevant cluster of patents. Detected clusters could be marked by color or font-type. Challenging is to figure out which sensory and receptor skills of humans are the most sensitive, e.g. extracts, short and simple information, keywords colors or symbols. By excluding keywords result sets can be reduced. Setting up visualization of queries and excluded keywords a visual mapping between information requirements and retrieved results might be a first approach. Mapping systems should be able to provide visual dialogues on relevance of retrieved patents. Challenging is a representation of search results in a human-understandable way, on one hand for beginners, (e.g. innovators, scientists) on the other for retrieval experts, (e.g. patent attorneys or patent officers). Advanced visualization of retrieved results could support information seeking experts.

317

 Above shown scenarios in innovation-process demand for further development on existing information systems and retrieval mechanism.  Explicit worked out disadvantages of today’s patent retrieval help to describe existing challenges more precisely.  General approaches on improving query combination and query saving, but also on result visualization and storing are derived. These three scenarios supported us to derive challenges in patent retrieval within innovation-processes. To support experts on innovations advanced retrieval mechanisms, query reusability, interactive interfaces and result visualizations are demanded. 3. Innovative search The identified challenges for patent retrieval are incitement for an innovation of retrieval systems. From the experience one knows that information retrieval is a process which is not concluded with the first query. In the patent search it is not mostly about finding again a patent. Rather a working context is elaborated. With this understanding becomes clear that there must be an accompanying dialogue. This also corresponds to the human need for communication. For this we want to introduce a first basic model to catch these requirements. 3.1. Introduction The requirements definition of an interactive information retrieval system can be viewed from two perspectives: first by the existence of an information deficit corresponding to an information need with respect to the user [12]; and second by the mediated provision of stored information through the underlying access support system (i.e. database management, digital library or information retrieval systems). Under optimal conditions the information deficit of a user can be resolved through the use of the provided interactive information retrieval system. In classical information retrieval (IR) research, the system-oriented view dominated in the past. This view stipulates that a user formulates a query and then judges the found elements according to their relevance. This kind of static IR process was improved and refined in the past. Particularly within the fields of IR engines and result presentation many innovations were developed in this special setting. This static setting does not always correspond to the communication and interaction needs of humans. IR systems should explicitly support the cognitive abilities of users in order to realize a dynamic dialogue between the user and the system. An information dialogue which not only supports an individual query but also the complete search process is useful. Only this model will support the possibility of satisfying an information need. The model shown in Fig. 1 shows the key elements required for satisfactory information dialogue. A central part of this conceptual model is the interactive information dialogue cycle [13]. The interactive information visualization cycle supports an information dialogue with an information retrieval system starting with a first query and ending after n cycles with an information status on the users’ side that has either eliminated or at least reduced the information deficit. It is important that a dialogue is made available through the system, in order to stimulate the user to interact with the interface. The outcome is a new dialogue cycle (or a part of it) with changed parameters in query, navigation, relevance or information visualization. In this way, interactive information processes can be supported. Every information visualization cycle is driven by activities. The results of these activities are, e.g., sets of

318

P. Landwich et al. / World Patent Information 31 (2009) 315–322

Fig. 1. Conceptual model for visually direct-manipulative information retrieval dialogues.

information objects, visualizations or assessments. These activities and their results fill our information dialogue context. In order to fully understand the basis for modeling of interactive IR systems and with it patent retrieval, the complete information seeking process must be understood. For this essential activities and conditions of a search task must be identified and defined. What are application possibilities for a user to interact? What are possible responses of the system? Let us compare the retrieval-process with the often chosen example of searching literature in a library. First of all, the existence of relevant books for the satisfaction of an information need may not be known in advance and therefore their precise characterization, location, access and inspection may not be possible in a one step procedure. Therefore a more complex information behavior will occur. The visitor will walk through the library and the bookcases, study the back of serval books, thumb through books, ask the librarian and so on. Similar it is with a patent search process. An initial query will start a chain of activities and results. But every link of this chain brings the necessary information to speed up the search and to restrict the problem. We define this collected Information as our information dialogue context. This information dialogue context is for our view the central point of a search process. In the following Section we will discuss the related work in the area of user interfaces and information visualization in the context of the information dialogue model. In Section 4 the foundation of this model is presented including a walk-through to describe a typical information search process along with the inherent activities and a concluding discussion. In the summary we will have a look at the potential of the introduced model. 3.2. Related work Cooper [14] has established already distinguished measures for the quality of information systems based on relevance and usefulness. The aim of increasing not only the relevance but also the usefulness of results was presented in the so-called information behavior model by Wilson [15], which opened two related fields of research. The first went in the direction of user interfaces of information systems and the second in the direction of information visualization. Visual presentation, perception and cognitive interpretation of information were considered for a long time [16] as an efficient method for communicative situations. For this purpose tools were developed which visualize the exploration by means of query construction and (re)formulation [17] and the result presentation [18–20]. In consequence visualization is not only representation but also stimulation for interaction and dialogue. First approaches into this direction are already implemented and studied, e.g., by the LyberWorld [21], VisMeB [22] and prefuse [23] systems.

Hemmje et al. [24] were motivated by the leading thought for the support of dialogue when they introduced a cognitive model of the information dialogue and a model of an interactive information visualization cycle supporting an IR dialogue [13]. Landwich et al. [25] took up this approach and introduced a cognitive enhanced model of IR which led to a model for visually direct-manipulative information retrieval dialogues. In order to optimize information systems, models in which the human users are not only a part of the system (e.g. providing only input) but also become an important component of the system and even its centre, respectively, were developed. Kuhlthau [26] investigated how information seeking and corresponding access can be understood. In consequence cognitively oriented models and approaches to support the concept of information seeking were introduced. Information need often changes during a seeking process due to changes in user awareness, the understanding of the concept of information access was extended and with [27] the first so-called information strategies were identified and investigated. Many other works [28–30] showed the complexity of the information search and their activities. All introduced works flow into the design of user interfaces. Different works [31,32,18] investigate this problem and examine the challenges for the optimization of the human being–computer–interface.

4. Framework for an information dialogue Results of existing research have been used to design user interfaces. But for this, Thimbleby [33] wrote: ‘‘Most user interfaces are unknown, in that they have grown from adding features. . . Instead, build simple, explicit interaction frameworks to lay the foundation for clear interaction structures.” This basic principle concerns the core element of interfaces, the dialogue. Landwich et al. [25] showed, that the dialogue fills the information dialogue context, as the basis for managing and coupling the states of, e.g., the database management system or the retrieval engine in their consecutive operations. It supports users in reducing their expressed uncertain state of information need. In order to receive a first simple, explicit interaction framework we analyze the basic elements of the dialogue of IR-systems. In doing so, we have to ask ourselves the following questions:  What are the possible activities?  How do activities effect our information dialogue context? To answer these questions, we have to describe the different sets of information objects which are present, as well as those that could be created in sequence of activities. Furthermore we provide a formal description of the activities.

P. Landwich et al. / World Patent Information 31 (2009) 315–322

By the use of a cognitive walk-through we illustrate this formalization. With a series of representative examples on the basis of a usual retrieval system of a digital library a reference for the practice is established and will simulate a search. 4.1. Scenario and possible sets of information objects In order to clarify the definitions, we outline a scenario within a retrieval system which has access to different databases. How in Section 1 described, we need for the innovation process access to information of imaginable different file types and formats. Therefore we speak in the following any more from patents or articles, but from information objects. In these databases we find a large amount of information objects. So we define: Content set: Let O be an information object, which describes information of most different kind. Then the content set C of a data source a user has access to defines itself by

C ¼ fO1 ; O2 ; O3 ; . . . ; On g with n 2 N On the other hand a certain number of information objects exist that would help us to solve our information problem. But these do not necessarily have to be within the databases. Interest set: The interest set I exists as a set of concrete information objects which are able to reduce an information deficit. For the simplification we assume I is constant.



n X

Oj ¼ const: with n 2 N

319

In our simulated search the user is looking for a specific patent, but he is also interested in all information objects (articles, etc.) which deal with the same research field. He starts a first query with some keywords. For this beginning query, the user gets a first result with a set of information objects presented as a result list. This list is so big that it is distributed to several pages. To get an overview, the user will scroll, resort or change the page within the result list. Let us now define the first activities and resulting sets after this first step of the user. 1. EXPLORATION: The access to the content set C in the form of a query q and the visualization and realization of the produced result set rq defines the EXPLORATION. For this beginning query, the user gets a first result with a set of information objects. But only a certain part of rq will be a part of the interest set I. In classical information retrieval the term recall quantifies the fraction of known relevant information objects which were effectively retrieved. In dependence on it we define Explored recall set: The intersection of the result set rq of a query q and the interest set I defines the explored recall set ke,q

ke;q ¼ r q \ I Result sets can be of arbitrary size. On the other hand, display space for visualization is restricted. Hence, user can cognitively capture only a part of the articles at a time. For this we define: 2. FOCUS: The focus set fn,q represents the subset of information objects Oi of a result set rq which reach the field of vision of the user through a visualization and is the result of the activity FOCUS

j¼1

So the set of available and relevant information objects is defined as Relevance set: The relevance set R derives itself from the intersection of the contents set C and the interest set I

R ¼ C \ I ¼ const: In search for relevant information objects, we query the databases and a result set is returned as an answer. Result set: For a query q onto the data source we receive the result set r q which is a subset of the contents set C

rq # C Fig. 2 illustrates the targets of our formal definitions by means of a single query so far. 4.2. Cognitive walk-through and possible activities Beginning with the described scenario and the defined sets (see Section 4.1) we present a fictive seeking process. With the walkthrough we want to introduce the activities EXPLORATION, FOCUS, NAVIGATION, INSPECTION, EVALUATION and STORE.

fn;q # rq If there are relevant information objects within the focus set we can define: Focused recall set: The intersection of the relevance set R and the focus set fn,q defines the focused recall set kf,q

kfn;q ¼ R \ fn;q In order to identify elements of the explored recall set ke,q, the user will analyze different parts of the result list. For this, he must move within the result list. This activity is defined as: 3. NAVIGATION: The movement within a set of information objects (information space) or between different information spaces. This causes a change of the focus fn,q. Let us return to our scenario. During the movement within the set of information objects the user will find information objects of interest. He will try to get more detailed information about this information object. If possible, an abstract or the full information object can be displayed by clicking on the title. Now he gets additional information to estimate which value this object has for him. In a positive case he has now the possibility to give a feedback to the system or to store his information objects of interest. The continuation of the Scenario has brought us to new definitions of activities and resulting sets. 4. INSPECTION: INSPECTION is used for the cognitive determination of the state of an information object. Following the INSPECTION the user himself classifies the information object referring to his assessment of relevance. This activity is defined as: 5. EVALUATION: EVALUATION gives the system a feedback of the user’s understanding of relevance and appoints the verified recall set. Verified recall set: Within different INSPECTIONs the user identifies relevant information objects. This set of information objects is defined as the verified recall set kv for this query

kv # Fig. 2. Dialogue state after a first explorative query.

m [ n¼1

kfn;q with m 2 N

320

P. Landwich et al. / World Patent Information 31 (2009) 315–322

To be able to access later to relevant information objects, these must be stored. 6. STORE: This activity allows the user to store found documents. It either happens logically in form of a storage box on the user interface or physically when a document is downloaded or printed. Naturally the user can store only information objects which lay before in his focus. However, there is no necessary on inspecting or verifying these before. Therefore we define: Stored recall set: The stored recall set ks,q represents the subset of information objects Oi of the focused recall set kf,q

ksn;q # kfn;q We have derived from our scenario many definitions of activities and sets of information objects for a first query. It becomes clear that the interdependency of activities and (visualized) sets defines our dialogue. Each activity or created set is a step of this dialogue. But a search process will rarely be finished after one query. The user will improve his query on the basis of the won information. This does not result in that the new results are a total subset of the previous ones. Some quite new information objects will come out. Within Fig. 3 the result of three separate queries performed at the moment t1, t2, and t3 are displayed. It becomes clear, that every query causes a change of the explored recall set. If we display each query onto the projection plane the union sets becomes visible. Context: Context rtotal is defined as the union of all result sets rq and stretches the information space in which the user moves

kv #

m [

kfn;q with m 2 N

n¼1

So as the context rtotal increases, also the explored, the focused and the verified recall set (ke,total, kf,total, kv,total) will grow with every new exploration. So, with every new query, the information disappears from user’s field of vision. The user in our scenario has only access to the current information objects of the last query. So he will ask himself more often: Which information objects have I already inspected, actually? Or, which information objects in the search process were relevant to me? But with the presented definitions we are now able to analyze existing systems. This makes it possible for us, to improve existing systems or to define goals for new systems in that way, the user is able to capture and handle the whole context of a search process.

4.3. Analysis and challenge In Figs. 2 and 3, three interactive modes can be identified: (1) The mode of Access which causes a change of the informational context set of through access the database by means of a query. (2) The mode of Orientation which changes the view onto the informational context set and represents a movement between information-objects and spaces. (3) The mode of Assessment which identifies information objects of the interest set. Every mode is totally enclosed and has its own activities. The first mode is Access. Within this mode there is only one activity, EXPLORATION. After the first EXPLORATION the user changes into the second mode Orientation. Activities for this mode are NAVIGATION, FOCUS and INSPECTION. The user has now the ability to change the visual as well as the informational focal point in an information visualization of the dialogue context. The mode Assessment is reached, if the user finds objects of interest during his inspection. For this mode the activities EVALUATION and STORE are available. They help to express the user appreciation of relevance and to define the identified recall set. The projection displays the weak points of traditional information retrieval systems very well. Every exploration produces only a snapshot of the complete seeking process and navigation is only possible within a single exploration step. However, through such a sequence of interactions the users are only able to identify and manage the sum of all relevant information objects. Therefore, tools supporting this demand in the user interface as well as utilizing it within the underlying retrieval engine are of great potential. It is important to support the narrowing down of the focus (up to the single objects again) to be able to support a more detailed inspection, analysis and assessment of the information objects. To support all these types of activities the IR system has to provide appropriate interaction tools in order to navigate within a single exploration step but also in the projection plane. Aim is an optimal intersection between focus and recall set. Also, it has to be investigated in which way an optimized focus of an information dialogue step can be transferred to the next step, e.g. speed up the focusing process in the next step. Furthermore, the sequence of dialogue can also provide insight in the type of information behavior that users are performing and, in the ideal case, an information strategy can be identified and utilized to

Fig. 3. Sequence of separate queries.

P. Landwich et al. / World Patent Information 31 (2009) 315–322

derive successive dialogue steps from a given starting point. By applying corresponding visualization and interaction tools, the users are enabled to strategically optimize their information behavior and reach the satisfaction of their information need faster. To be able to actually support and evaluate our framework we need a system which meets the following demands: the system should    

fundamentally support the interaction model, map the described activities to support the user, enable the quantitative and qualitative evaluation of the model, be highly flexible and extensible to integrate new visualization techniques.

Following the formal description of the information dialogue and given the demands we will use the DAFFODIL-system as an experimental system for further development and evaluation of the above described framework. DAFFODIL is a virtual digital library system targeted at strategic support of users during the information seeking and retrieval process. It provides not only basic and but also sophisticated search functions for exploring and managing digital library objects including meta-data annotations over a federation of heterogeneous digital libraries. It already matches the above named demands to a high degree and can be instantly used to verify the new framework.

5. Summary and outlook The idea and motivation to combine visualization and information retrieval through the information dialogue in a new model is the focus of this paper. With the presented formal description of sets and activities – EXPLORATION, FOCUS, NAVIGATION, INSPECTION, EVALUATION and STORE – we are now in the position to illustrate, store and exploit the search process in a series of finite steps. The set-oriented description and the derivation of definite sets illustrates, how innovation experts proceed from information need to the satisfaction of this need through the process of one or more queries. An added value has been created by the appliance of the advanced retrieval system in the domain of innovation processes. As showed by the cognitive walk-through experts on innovation can be supported when searching patent databases and digital libraries. By the visualization of result sets and query reuse a benefit can probably be created for the patent seeking expert. The information dialogue described in the model can now be applied to capture and analyze the information search process and the elaborated context. This captured information represents the basis to further support the user through recommendation by the IR system as well as collaboration with other users in a similar situation. In Section 1 we pointed out the requirements to support an innovation process. The whole innovation-process is attended by the search for information especially for patent information. With our model we are able to capture the whole search process. We can use the elaborated context to reflect our process standing or to analyze it for further searches. With the DAFFODIL-system as a basis for further research we see enough challenges for implementations. These challenges can be differentiated between the system support and the visualizations to the user, which also overlap. 5.1. System On the system side we identified the following challenges, where relevance feedback plays the major role:

321

Search strategy: With the help of the user or by monitoring the activities the system must provide different search strategies to raise efficiency. Social database: By monitoring many searches in form of a set of activities, it is perhaps possible to support a user through recommendations. Analyzing a new search from the beginning, the system could be able to identify similar stored search processes. If this knowledge is visualized for the user, he could get benefit for his own search. Relevance feedback: The concept of relevance feedback is the most important challenge on the system site. The users relevance assessments must be captured in explicit and implicit form and need to be processed and analyzed to be used for recommendations. All this upcoming functionality needs to be presented to the user through new, innovative and user-friendly visualizations. 5.2. Visualization For visualization we can identify different areas for improvement. Generally, every visualization tool needs to be analyzed and alternatives need to be developed and evaluated, but our focus will be on the following: Result list: The visualization of results must go beyond the usual measure. The user needs a portfolio of visualization tools, which approach his cognitive abilities. These visualization tools should be highly interactive so the user can capture the results under different visual angles. Search history: The user must be able to get the full control of his search history and the developed information context. Relevance feedback: The user must have the ability to notify and review his assessment of relevance through the user interface in order to understand and change them. Search pattern/strategies: With a capable visualization the user can recognize that he follows a specific strategy. This awareness could lead to new directions in the further search. 5.3. Outlook We have created a simple framework in the sense of [33]. With this framework we have established well-founded basis for further research. The great number of challenges listed above show the high potential of this framework. Every development of the challenges improves the DAFFODIL-system under the points of view dialogue and search process. We see a special importance, however, in the implementation of relevance-feedback. And here in particular the question how activities (see Section 3.2) express relevance feedback explicitly as well as implicitly. The question as to how well we can map and develop search strategies with our defined activities is of great importance to us. The next steps in our research will focus on the relevance feedback issues (i.e. what is needed and what is possible) and to integrate this into DAFFODIL. On the theoretical side we will deepen the new framework and relate it to existing models on the information seeking process. Summative and formative evaluation on qualitative and quantitative basis will be executed to support our ideas. Acknowledgments This article has been developed from the presentation at the Retrieval Facility Symposium (IRFS) in November 2007 in Vienna, Austria. This work is funded via the German Science Foundation (DFG) in the project LACOSTIR.

322

P. Landwich et al. / World Patent Information 31 (2009) 315–322

References [1] Grabowski H, Paral T. Erfolgreich Produkte entwickeln Methoden. Prozesse. Wissen: LOG_X Verlag GmbH; 2004. [2] Bullinger H-J, Fokus innovation – Kräfte bündeln, Prozesse beschleunigen. Carl Hanser Verlag; 2006. [3] Völker R et al. Wissensmanagement im Innovationsprozess. Physica-Verlag; 2007. [4] Vogel T, Hemmje M. Wissensbasiertes und Prozessorientiertes Innovationsmanagement (WPIM) – Herausforderungen und Einsatz in der Domäne Automotiv. In: Mehr Wissen – Mehr Erfolg. CMP-WEKA Verlag; 2007. p. 485–93. [5] Meixner O. Entscheidungsunterstützung und Wissensmanagement in der Neuproduktentwicklung, NPD-X: Ein Expertensystem zum betrieblichen Innovationsmanagement. WiKu-Verlag; 2003. [6] Card SK, Mackinlay JD, Shneiderman B. Readings in information visualization, using vision to think. Los Altos, CA: Morgan Kaufmann; 1999. [7] Sebrechts M et al. Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces. In: SIGIR ’99: proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, New York, NY, USA. ACM Press; 1999. [8] Schumann H, Müller W. Visualisierung – Grundlagen und allgemeine Methoden. Heidelberg: Springer-Verlag; 2000. [9] Vogel T, Hemmje M. Heranführung an Wissens-basiertes und Prozessorientiertes Innovationsmanagement anhand von Herausforderungen in dynamischen Innovationsprozessen. In: Wissen wirkt! Aber wie?! AV + Astoria Druckzentrum; 2006. p. 171–8. [10] Vogel T, Hemmje M. Auf dem Weg zu einem ‘‘Wissens-basierten und Prozessorientierten Innovationsmanagement” (WPIM) – Innovationsszenarien, Anforderungen und Modellbildung. In: Mit Wissensmanagement besser im Wettbewerb; 2006. p. 287–94. [11] Queitsch M, Baier D. Computergestütztes Innovationsmanagement in Wertschöpfungsnetzwerken. Forum der Forschung; 2004. [12] Campbell I. Supporting information needs by ostensive definition in an adaptive information space; 1995. . [13] Hemmje M. Unterstützung von Information-Retrieval-Dialogen mit Informationssystemen durch interaktive Informationsvisualisierung. Darmstadt; 1999. [14] Cooper WS. A definition of relevance for information retrieval. Inf Storage Ret 1971;7:19–37. [15] Wilson T. Human information behavior; 2000. . [16] Larkin JH, Simon HA. Why a diagram is (sometimes) worth ten thousand words. Cogn Sci 1987;11:65–100. [17] Schaefer A et al. Active support for query formulation in virtual digital libraries: a case study with DAFFODIL. Research and advanced technology for digital libraries, 9th European conference, ECDL 2005, vol. 3652. Vienna, Austria: Springer; 2005. [18] Komlodi A et al. Search histories for user support in user interfaces. J Am Soc Inf Sci Technol 2006;57:803–7. [19] Davis L. Designing a search user interface for a digital library. J Am Soc Inf Sci Technol 2006;57:788–91. [20] Harper DJ, Kelly D. Contextual relevance feedback. In: IIiX: Proceedings of the 1st international conference on information interaction in context, New York, NY, USA. ACM Press; 2006. p. 129–37. [21] Hemmje M et al. LyberWorld a visualization user interface supporting fulltext retrieval. In: Proceedings of SIGIR ’94, New York, NY, USA. New York: SpringerVerlag, Inc.; 1994. p. 249–59. [22] Müller F et al. Visualization and interaction techniques of the visual metadata browser VisMeB. In: I-know; 2003. [23] Heer J et al. Prefuse: a toolkit for interactive information visualization. In: Proceedings of the 2005 conference on human factors in computing systems, CHI, Portland, Oregon, USA. ACM Press; 2005. p. 421–30. [24] Hemmje M et al. A multidimensional categorization of information activities for differential design and evaluation of information systems. In: GMDStudien, Sankt Augustin. GMD – Forschungszentrum Informationstechnik; 1996. [25] Landwich P et al. Ansatz zu einem konzeptionellen Modell für interaktive Information-Retrieval-Systeme mit Unterstützung von Informationsvisualisierung. In: Proceedings ISI; 2007. p. 327–32. [26] Kuhlthau CC. Longitudinal case studies of the information search process of users in libraries. Libr Inf Sci Res 1988;10:257–304. [27] Belkin NJ et al. Cases, scripts, and information-seeking strategies: on the design of interactive information retrieval systems. In: Arbeitspapiere der GMD. Sankt Augustin: GMD; 1994. [28] Pharo N. A new model of information behaviour based on the search situation transition schema. Inf Res 2004:10. [29] Rose DE. Reconciling information-seeking behavior with search user interfaces for the Web. J Am Soc Inf Sci Technol 2006;57:797–9. [30] Xu Y, Liu C. The dynamics of interactive information retrieval, Part II: An empirical study from the activity theory perspective. J Am Soc Inf Sci Technol 2007;58:987–98.

[31] Klas C-P et al. Das DAFFODIL system – Strategische Literaturrecherche in Digitale Bibliotheken. In: 4.ter HIER workshop. UVK; 2005. [32] Resnick ML, Vaughan MW. Best practices and future visions for search user interfaces. J Am Soc Inf Sci Technol 2006;57:781–7. [33] Thimbleby HW. Press on: the principles of interaction programming. Cambridge, Massachusetts: The MIT Press; 2007.

Paul Landwich is CIO for the head office of the Protestant Church of the Palatinate and Ph.D. student at the Distance University of Hagen, Germany. There he is part of the workgroup for multimedia and internet applications (Professor Dr.-Ing. Matthias Hemmje). He is currently working in the area of visual data-mining and information visualization. He received his degree in engineering from the University of Kaiserslautern in 1994 and his Master of Computer Science from the FernUniversität in Hagen in 2005.

Tobias Vogel was born in Munich in 1978, he finished his studies of Wirtschaftsinformatik at Otto-FriedrichUniversity in Bamberg in 2004. During his studies he was co-founder of an academic research consultancy. Besides academia he collected working experience at Linde AG and ProSieben AG. His diploma theses covered the fields of vehicle on-board diagnosis in cooperation with BMW AG. Since 2004 Tobias Vogel is working for BERATA GmbH member of the ALTRAN Group and started as Consultant for safety electronics, in 2007 he became Business Unit Manager for aerospace and since 2008 he is responsible for the ALTRAN Automotive Business Line USA. In 2006 he started working on his doctoral thesis at FernUniversität Hagen, guided by Prof. Dr.-Ing. L. Matthias Hemmje, in the fields of ‘‘Knowledge-based and Process-oriented Innovation Management” (WPIM).

Claus-Peter Klas received his Computer Science diploma at the University of Dortmund in 1998, where he started to work as research assistant. In 1999 he started to work as the principal person responsible for the Daffodil project, a cutting edge virtual digital library system to strategically support the user during the information search process. In 2007 he received a Dr. degree at the University of Duisburg-Essen and joined the chair of multimedia and internet-applications at the FernUniversität in Hagen in 2008. His research focuses on information retrieval, information systems, digital libraries and access in longterm preservation architectures.

Matthias Hemmje is full professor at the FernUniversität in Hagen, one of the leading distance universities in Germany. He is involved in research on Virtual Information and Knowledge Environments with special focus on distributed collaborative Digital Libraries, Multimedia Archives, Information Retrieval, Filtering, Linking, Enrichment, Personalization, and Information Visualization. He is building on research expertise from a long standing history in national and European projects. His primary research interests include information retrieval, multimedia databases, virtual environments, information visualization, human computer interaction, and multimedia. He also guide research in the area of content engineering, knowledge technologies, peer-to-peer based systems, collaboration and e-learning support systems as well as mobile and location based services.