An intelligent human-expert forum system based on fuzzy information retrieval technique




Expert Systems with Applications 34 (2008) 446–458, www.elsevier.com/locate/eswa

Yueh-Min Huang *, Juei-Nan Chen, Yen-Hung Kuo, Yu-Lin Jeng
Department of Engineering Science, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan 701, Taiwan, ROC

Abstract

The forum system is useful for sharing knowledge and seeking help. However, existing forums often have the problem that some questions remain unanswered. This study proposes an intelligent human-expert forum system to perform more efficient knowledge sharing. The system uses fuzzy information retrieval techniques to discover important discussion knowledge and actively invites human-experts who might be capable of answering a question to participate in the discussion. The selection of human-experts employs the Expert-Terms Correlation Matrix, which stores the knowledge strengths of human-experts. Moreover, the forgetting curve is adopted into our forum system to model the variations in memory strength. The experiment uses three different categories of discussions as the testing data, and the performance study shows acceptable results for discussion searching and expert discovery.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Fuzzy information retrieval; Human-expert forum system; Knowledge sharing

1. Introduction

Nowadays, online forums have become a useful tool for problem solving (Stein & Maier, 1995), learning discussion (Jeng, Huang, Kuo, Chen, & Chu, 2005), and knowledge building. Steehouder indicated the advantages of user forums to individual users, the most important benefit being that they can receive tailored answers from peers after formulating the problem in their own words, without using specific keywords to search online (Steehouder, 2002). Additionally, Cross, Rice, and Parker (2001) identified five categories of benefits of seeking information from other people. (1) Solutions: People can turn to other people to obtain answers that solve the given problems. Moreover, such solutions tend to be either declarative or procedural.

(2) Meta-knowledge: The help-seekers learn the relevant information to answer their questions. The relevant knowledge might be held in events or by other people, and learning the information could enhance the efficiency of future problem solving. (3) Problem reformulation: The help-seekers and other people often engage in discussions that lead them to consider different dimensions of the problem. This can also aid people to reformulate their own problems. (4) Validation of plans or solutions: The plans or solutions proposed by people can be validated by their peers. Moreover, validation in a forum allows people to modify their own opinions to express them more confidently and effectively. (5) Legitimation from contact with a respected person: In a forum, people consult with others and thus increase their credibility, leading to more respect.

* Corresponding author. Tel.: +886 6 2757575x63336; fax: +886 6 2766549.
E-mail addresses: [email protected] (Y.-M. Huang), [email protected] (J.-N. Chen), [email protected] (Y.-H. Kuo), [email protected] (Y.-L. Jeng).
0957-4174/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2006.09.037

Conversely, Steehouder argued that traditional user forums have the following disadvantages (Steehouder, 2002):


(1) If no participant is interested in discussing the issue, the question will remain unanswered. (2) Questions may remain in the discussion group for a long time before being answered. (3) The help-seeker usually has little guidance when given conflicting recommendations. (4) Apart from the discussions, the advice or solution offered might be written in a form that is unclear or unreadable.

Considering these four disadvantages, this paper proposes a novel forum system for more efficient knowledge sharing. To this end, the proposed system includes both a search engine (Abecker, Bernardi, Maus, Sintek, & Wenzel, 2000; Catarci, Chang, Liu, & Santucci, 1998; Jiang, Tseng, & Tsai, 1999; Lin, Chen, Ho, & Huang, 2002; Matsumura, Ohsawa, & Ishizuka, 2005; Weng & Lin, 2003) and a discussion forum. The help-seeker can pose a specific problem in natural language to the search engine. The search engine parses the user-defined query with stemming (Porter, 1980), stop word pruning, and n-gram (Brown, deSouza, Mercer, Pietra, & Lai, 1990) algorithms, and uses the parsed terms with a fuzzy information retrieval technique (Baeza-Yates & Ribeiro-Neto, 1999) to find the relevant discussions. If the search results do not meet the user's needs, the user-defined natural language query is automatically posted to a forum for further discussion. We call the posted problem an unsolved issue, and users can discuss such issues in the forum. Additionally, the forum actively invites users who might be capable of answering unsolved issues to participate in the discussion. The forum system analyzes the discussion behavior of each participant and models it in an Expert-Terms Correlation Matrix. In particular, the Expert-Terms Correlation Matrix records the participants' knowledge strength patterns, which is useful for finding the users who may provide the best answers to the questions.
Therefore, the selected users from this system are called human-experts. The participants in the discussion environment play two roles: help-seekers, who ask questions, and human-experts, who answer them. For each issue discussed, people can vote on the discussion or the replies, and if the voting result surpasses a predefined threshold the issue is presumed solved. Whenever a problem is solved, the system feeds the answer back into its repository to become part of the static knowledge for further help-seeking. This introduction shows the plan of the entire knowledge delivery process, which deals with the disadvantages of traditional forum systems in two ways. (1) The forum system actively invites human-experts to solve the given problem. Accordingly, questions are not ignored and should be solved in a short time via the responses of human-experts. (2) Information seekers can vote on discussions, so useful solutions would eventually surpass the predefined threshold score. These useful solutions have passed peer validation and thus become static knowledge in the forum's repository. Generally, help-seekers can trust these responses because they have passed peer review.

This study uses 3000 discussion threads as the testing data, collected from three different forums: thinkpads.com, englishforums.com and lonelyplanet.com. Moreover, the experiment uses 15 natural language queries to request information and help from human-experts. We use precision and recall to measure the query results and to evaluate the proposed system. The detailed discussion of the experimental results is given in Section 5.

The organization of this article is as follows. Section 2 introduces the human-expert forum architecture and describes the discussion knowledge building process. Section 3 gives our methodology and its definitions. Section 4 illustrates the dynamic maintenance mechanisms, and Section 5 discusses the experimental results. Finally, Section 6 draws the conclusions of this study and offers suggestions for further research.

2. Intelligent human-expert forum system

Fig. 1 shows the architecture of the discussion system. In this architecture, users play two different roles at the same time: the first as a user asking questions, and the second as an expert offering suggestions and solutions. The discussion system consists of four major parts, the details of which are introduced as follows.

2.1. Discussion Knowledge Search Engine

The Discussion Knowledge Search Engine is responsible for receiving the user's natural language query and generating the parsed query terms used to find relevant discussions in the Discussion Repository. To generate query terms, the search engine uses n-grams (Brown et al., 1990), with n = 1–4, to parse the natural language query. The parsed term set thus consists of terms composed of one to four words. Moreover, stop words are eliminated so that they do not belong to the query term set.
Based on fuzzy query term expansion, the remaining query term set is then expanded with additional term members, becoming an expanded query set (Berardi et al., 2004; Billerbeck, Scholer, Williams, & Zobel, 2003; Chen, Yu, Furuse, & Ohbo, 2001; Li & Agrawal, 2000). Finally, the search engine employs the expanded query set and fuzzy query technology to discover relevant discussions in the Discussion Repository, and the discovered discussions are returned to the user to answer the query. Two important techniques are introduced in the above discussion: 'fuzzy query term expansion' and 'fuzzy query'. This paper discusses fuzzy query term expansion in Section 3.4, and fuzzy query in Section 3.5.
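The n-gram parsing and stop-word pruning steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop-word list is a small stand-in, and the Porter stemming step is omitted for brevity.

```python
# Sketch of the query-parsing step of Section 2.1: split a natural-language
# query into 1- to 4-word terms (n-grams) and prune stop words.
import re

STOP_WORDS = {"a", "an", "the", "is", "to", "of", "in", "how", "do", "i"}  # illustrative subset

def parse_query(query: str, max_n: int = 4) -> set[str]:
    """Return the parsed term set: all n-grams (n = 1..max_n) of the query,
    excluding any n-gram made up entirely of stop words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    terms = set()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if all(w in STOP_WORDS for w in gram):
                continue  # prune pure stop-word grams
            terms.add(" ".join(gram))
    return terms

terms = parse_query("how do I reset the ThinkPad battery")
```

Here `terms` keeps grams such as "reset" and "thinkpad battery" while dropping pure stop-word grams like "the".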



Fig. 1. The architecture of the intelligent human-expert forum system.

2.2. Discussion Repository

The Discussion Repository consists of the entire discussion records, member data, and some analytic information. For the analytic information, the Terms Correlation Matrix and the Expert-Terms Correlation Matrix are the two fundamental structures that store the essential information. The Terms Correlation Matrix records the relationship between two terms, and this information is important for the fuzzy query. The relationship value is based on the co-occurrences of terms in discussions. The construction of the Terms Correlation Matrix is described in Section 3.2. The Expert-Terms Correlation Matrix stores the human-experts' strengths with regard to the domain knowledge. Based on the users' discussion behavior, the strengths of their domain knowledge are set to different degrees. Moreover, these strengths decay over time. More details about the Expert-Terms Correlation Matrix can be found in Section 3.6, and the maintenance of analytic information is introduced in Section 4.

2.3. Human-Expert Set Discovering Mechanism

The Human-Expert Set Discovering Mechanism is designed to discover the human-expert set by using the expanded query set. The mechanism compares the expanded query set with the Expert-Terms Correlation Matrix and accumulates the relevant strengths of each human-expert. Following that, the mechanism returns the high-strength human-experts as the candidate expert set. The candidate experts are the users who have the most potential to solve the given problem. A detailed introduction to the Human-Expert Set Discovering Mechanism is given in Section 3.7.

2.4. Knowledge-based Forum System

The Knowledge-based Forum System provides a way for the forum to share knowledge. In addition, the discussion board plays an important role in solving questions by means of user cooperation, the Discussion Knowledge Search Engine, the Discussion Repository, and the Human-Expert Set Discovering Mechanism. The cooperating scenario is introduced as follows. First, the user submits a natural language query to the Discussion Knowledge Search Engine, and the search engine then parses and expands the query into the expanded query term set. The expanded query term set, along with the fuzzy query, can be used to find information relevant to the discussion, which is then sent to the questioner. If the user is satisfied with the answer, the process terminates. If not, the natural language query is posted on the forum for further discussion. Whenever the forum receives a new post, it passes the expanded query set to the Human-Expert Set Discovering Mechanism to look for the candidate human-experts who are relevant to the query. The forum then actively sends alert messages to these experts so that they can participate in the discussion. Fig. 2 shows the user interface of the forum portal.

Fig. 2. The user interface of the human-expert forum system.

3. Definitions and methodology

In the following, we give some definitions and symbols to express the searching and matching mechanisms used in this study. Moreover, parts of the fuzzy information retrieval techniques are expanded from (Baeza-Yates & Ribeiro-Neto, 1999; Martin-Bautista, Sanches, Chamorro-Martinez, Serrano, & Vila, 2004). In order to make this paper self-contained, we first illustrate the fundamental terms of fuzzy information retrieval, and then express our methodology.

3.1. Definitions

Definition 1 (Discussion Set). Let D = {d1, d2, . . . , di} be the set of discussions, where ∀di ∈ D, i > 0.

Definition 2 (Index Term Set). Let k be an index term; then a discussion d = {k1, k2, . . . , ki} can be presented by a set of index terms, where i > 0, ∀ki ∈ d. Moreover, the index term set is defined as K = ∪i di, where i > 0, ∀di ∈ D.


Definition 3 (Query). Let query Q = {q1±, q2±, . . . , qi±} be the set of query terms, where ∀qi± ∈ Q, and i > 0. In addition, each query term qi± can be seen as a signed index term that belongs to a query discussion dq = {k1±, k2±, . . . , ki±}. The qi+ and qi− represent the positively and negatively related query terms respectively.

Definition 4 (Human-Expert Set). Let E = {e1, e2, . . . , ei} be the set of human-experts, where ∀ei ∈ E, i > 0.

Definition 5 (TF*IDF). Assume there are t index terms used to present a discussion dj; then dj can be expressed as a vector dj = (w1,j, w2,j, . . . , wt,j), in which wi,j is the ith weight of the vector of dj in the t-dimensional space. The wi,j can be seen as the importance of index term ki to discussion dj, and it can be estimated by Eq. (1). Eq. (1) is similar to the famous tf*idf equation: freqi,j is the number of times index term ki appears in discussion dj; maxl freql,j is the number of times the most frequent index term, kl, appears in discussion dj; N is the number of discussions in D; and ni is the number of discussions which contain index term ki. In order to fit our application, two parameters are added to the term frequency (tf) equation (Horng, Chen, & Lee, 2003). Details of the two parameters are as follows:

(1) The first parameter p presents the highest weight of a term's positions of occurrence in a discussion. The main idea is that index terms should have different weights with regard to where they appear. For example, Fig. 3 shows two index terms k1, k2 in a discussion d1: k1 appears in the title and main discussion area of d1, and k2 appears in the reply area of d1, so we assign p = max(1.5, 1.3) = 1.5 for k1 and p = 1.0 for k2.

(2) The second parameter r = count(replyi,j)/count(replyj) is a reply factor which indicates the additional importance of a term in a discussion. In the equation, count(replyj) is the number of replies in a discussion dj, and count(replyi,j) is the number of replies which contain the term ki in discussion dj. The idea behind the parameter is that important terms appear repeatedly in the replies. For example, see Fig. 3 again: the discussion d1 contains two reply areas, and only one reply contains the term k2. Hence, the parameter r of tf2,1 is count(reply2,1)/count(reply1) = 1/2.
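The position- and reply-weighted tf*idf weighting of Eq. (1) can be sketched as follows. The function arguments and the example counts are illustrative stand-ins, not taken from the authors' code; the position weights follow the paper's example (1.5 for title, 1.3 for main discussion, 1.0 for reply).

```python
# Sketch of Eq. (1): w_{i,j} = tf_{i,j} * idf_i, where tf carries the
# position weight p and reply factor r described in Definition 5.
import math

def term_weight(freq, max_freq, N, n_i, p, reply_count, reply_with_term):
    """Weight of index term k_i in discussion d_j (Eq. (1))."""
    # r = count(reply_{i,j}) / count(reply_j), or 0 when d_j has no replies
    r = reply_with_term / reply_count if reply_count > 0 else 0.0
    tf = p * (freq / max_freq) * (1 + r)
    idf = math.log(N / n_i)
    return tf * idf

# Term k2 of the running example: appears only in the reply area (p = 1.0)
# and in 1 of 2 replies (r = 1/2). freq, max_freq, N, n_i are invented.
w = term_weight(freq=1, max_freq=3, N=100, n_i=10, p=1.0,
                reply_count=2, reply_with_term=1)
```

Varying `p` and `reply_with_term` shows how the same raw frequency yields different weights depending on where the term occurs.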

Fig. 3. An example of discussion area presentation.



wi,j = tfi,j × idfi,    (1)

where

tfi,j = p × (freqi,j / maxl freql,j) × (1 + r),
idfi = log(N / ni),
r = count(replyi,j) / count(replyj), if dj has replies; r = 0, if dj has no replies.

3.2. Terms Correlation Matrix

First of all, a thesaurus can be constructed based on c (the Terms Correlation Matrix). Its rows and columns are composed of the index terms in K. The coefficient ci,l, which represents the interrelation between ki and kl, can be calculated by Eq. (2). Eq. (2) uses conditional probability to measure the interrelationship between ki and kl: ni stands for the number of discussions which include the index term ki, and ni,l is the number of discussions which include both ki and kl. Based on the matrix, we can go a step further and build the degree-of-membership model from a discussion dj to an index term ki.

ci,l = P(kl | ki) = ni,l / ni.    (2)

Making use of Definition 5, we can extract an index term set K whose index terms represent the entire discussions in discussion set D. Therefore, we can collect these index terms as the thesaurus for this topic, and construct the Fuzzy Query Term Expansion Model and the Expert-Terms Correlation Matrix.

3.3. Fuzzy term-discussion membership degree

The degree of membership μi,j can be computed by using Eq. (3):

μi,j = 1 − ∏ (1 − ci,l), over kl ∈ dj.    (3)

Let us examine an example to introduce the basic idea of Eq. (3) and the term expansion process. In some cases, the discussion dj contains the index term kl, but does not contain the index term ki. Therefore, we can ensure that μl,j ≥ tμ, where tμ stands for the threshold of membership degree; μl,j ≥ tμ means that index term kl is highly related to discussion dj, so kl should be an index term of dj. Suppose index terms ki and kl have a coefficient ci,l ≥ tc, where tc stands for the threshold of the coefficient; this means the index terms ki and kl are highly correlated. If μl,j ≥ tμ and ci,l ≥ tc cause the condition μi,j ≥ tμ to become true, the index term ki is related to the discussion dj even though dj does not contain ki. Following that, ki should be added into the dj index term list. The entire process, which expands the dj index term list to include ki, is called the term expansion process.
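A minimal sketch of Eqs. (2) and (3), assuming discussions are modelled as sets of index terms; the sample discussions are invented for illustration.

```python
# Sketch of the Terms Correlation Matrix (Eq. (2)) and the fuzzy
# term-discussion membership degree (Eq. (3)).

discussions = {  # illustrative data, not from the paper's corpus
    "d1": {"battery", "thinkpad", "charge"},
    "d2": {"battery", "charge"},
    "d3": {"thinkpad", "keyboard"},
}

def correlation(ki: str, kl: str) -> float:
    """c_{i,l} = P(k_l | k_i) = n_{i,l} / n_i  (Eq. (2))."""
    n_i = sum(1 for d in discussions.values() if ki in d)
    n_il = sum(1 for d in discussions.values() if ki in d and kl in d)
    return n_il / n_i if n_i else 0.0

def membership(ki: str, dj: set[str]) -> float:
    """mu_{i,j} = 1 - prod over k_l in d_j of (1 - c_{i,l})  (Eq. (3))."""
    prod = 1.0
    for kl in dj:
        prod *= 1.0 - correlation(ki, kl)
    return 1.0 - prod

# "charge" is not an index term of d3, yet its membership degree to d3
# is non-zero through the correlated term "thinkpad".
mu = membership("charge", discussions["d3"])  # → 0.5
```

If `mu` exceeds the threshold tμ, the term expansion process would add "charge" to d3's index term list.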

3.4. Fuzzy query term expansion

Based on Eq. (3), we can construct Eq. (4) to present the fuzzy query term expansion process. The fuzzy query term expansion process can improve the recall rate because it adds more information to the query set:

μi,q = 1 − ∏ (1 − csigned), over kl ∈ dq,    (4)

where csigned = ci,l if kl = kl+, and csigned = 1 − ci,l if kl = kl−.

Eq. (4) assumes the set of query terms Q can be seen as a query discussion dq = {k1±, k2±, . . . , kl±}, and the terms involved in dq are the members of Q. Given an index term ki ∈ K, ki ∉ dq, suppose it has coefficients ci,p ≥ tc and (1 − ci,n) ≥ tc with query terms kp+ ∈ dq, kn− ∈ dq. If ci,p ≥ tc and (1 − ci,n) ≥ tc support μi,q ≥ tμ, the index term ki should be added into dq as one of the query terms. Finally, we denote the expanded query set as Qexp.
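The expansion process of Eq. (4) can be sketched as follows. The correlation values, vocabulary, and the threshold t_mu = 0.5 are hypothetical, chosen only to make the example concrete.

```python
# Sketch of fuzzy query term expansion (Eq. (4)): add any index term whose
# membership degree to the signed query discussion d_q exceeds t_mu.

corr = {  # hypothetical Terms Correlation Matrix entries c_{i,l}
    ("battery", "charge"): 0.9,
    ("battery", "keyboard"): 0.1,
}

def c(ki, kl):
    return corr.get((ki, kl), 0.0)

def expansion_degree(ki, positive, negative):
    """mu_{i,q} over the signed query discussion d_q (Eq. (4))."""
    prod = 1.0
    for kl in positive:
        prod *= 1.0 - c(ki, kl)          # c_signed = c_{i,l} for k_l = k_l+
    for kl in negative:
        prod *= 1.0 - (1.0 - c(ki, kl))  # c_signed = 1 - c_{i,l} for k_l = k_l-
    return 1.0 - prod

def expand_query(positive, negative, vocabulary, t_mu=0.5):
    """Return Q_exp: the query terms plus every sufficiently related index term."""
    expanded = set(positive) | set(negative)
    for ki in vocabulary:
        if ki not in expanded and expansion_degree(ki, positive, negative) >= t_mu:
            expanded.add(ki)
    return expanded

q_exp = expand_query(positive={"charge"}, negative=set(),
                     vocabulary={"battery", "charge", "keyboard"})
# → {"charge", "battery"}: "battery" is pulled in via c = 0.9
```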

3.5. Fuzzy query

Given a query Q = {q1±, q2±, . . . , qi±} consisting of a set of positive and negative query terms. After the query term expansion process, the query Q expands to Qexp, and Q ⊆ Qexp. Following that, Qexp is used to find the set of discussions relevant to Q. In Eq. (5), each input discussion dj has its score for the query Qexp estimated by multiplying the membership degrees between each query term kq and discussion dj. The estimated score of each discussion can be seen as a membership degree between the discussion's index term set and the expanded query term set. The calculated scores can be used to rank the discussions, and the top-ranked discussions become the returned query answer set. Notice that the ε in Eq. (5) is a very small value, which is used to avoid the score value of a discussion becoming zero:

score(dj) = ∏ (μq,j + ε), over kq ∈ Qexp,    (5)

where

μq,j = 1 − ∏ (1 − csigned), over kl ∈ dj,

and csigned = cq,l if kq = kq+, csigned = 1 − cq,l if kq = kq−.
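The ranking of Eq. (5) can be sketched as follows, reusing the signed-correlation idea of Eq. (4); the correlation values and discussions are illustrative stand-ins for the Discussion Repository.

```python
# Sketch of fuzzy query ranking (Eq. (5)): each discussion's score is the
# product of its membership degrees to the expanded query terms, plus a
# small epsilon so no score collapses to exactly zero.

EPS = 1e-6

corr = {("battery", "battery"): 1.0, ("battery", "charge"): 0.9}  # hypothetical

def c(kq, kl):
    return corr.get((kq, kl), 0.0)

def mu(kq, dj, positive=True):
    """mu_{q,j}: membership of query term k_q to discussion d_j."""
    prod = 1.0
    for kl in dj:
        c_signed = c(kq, kl) if positive else 1.0 - c(kq, kl)
        prod *= 1.0 - c_signed
    return 1.0 - prod

def score(dj, q_exp):
    """score(d_j) = product over k_q in Q_exp of (mu_{q,j} + eps)  (Eq. (5))."""
    s = 1.0
    for kq in q_exp:
        s *= mu(kq, dj) + EPS
    return s

discussions = {"d1": {"battery", "charge"}, "d2": {"keyboard"}}
ranked = sorted(discussions,
                key=lambda d: score(discussions[d], {"battery"}),
                reverse=True)  # → ["d1", "d2"]
```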


3.6. Expert-Terms Correlation Matrix

This section defines an Expert-Terms Correlation Matrix for discovering a set of candidate human-experts who can solve the given query question. Fig. 4 shows a classic Expert-Terms Correlation Matrix whose rows are associated with a set of human-experts E = {e1, e2, . . . , ej}, and whose columns are represented by a collection of index terms K = {k1, k2, . . . , ki}. The matrix assumes each index term can be seen as a specific piece of domain knowledge, and an expert may have a diversity of domain knowledge. Following that, each correlation between an index term ki and a human-expert ej is defined as a memory strength di,j, where di,j is a floating-point value representing that the human-expert ej has strength di,j on the specific domain knowledge ki (index term). Therefore, an expert ej can be expressed by a vector ej = (d1,j, d2,j, . . . , di,j), where i is the number of index terms collected in K.

In the Expert-Terms Correlation Matrix, the strength di,j is a dynamic variable whose value depends on the discussion behavior of human-expert ej. For example, if an expert ej reads a solved discussion thread which is related to three pieces of domain knowledge k1, k2, and k3 (μi,l ≥ tμ, i = {1, 2, 3}), then we can increase the corresponding strengths d1,j, d2,j, and d3,j to characterize the behavior. Similarly, the strengths can be increased when the expert participates in discussion activities associated with specific domain knowledge. The detailed memory strength maintenance issue is illustrated in Section 4.2.

3.7. Candidate Human-Expert Discovery

Fig. 5 shows the Human-Expert Set Discovering Mechanism, which combines three main processes: (1) separating the positive expanded query term set (Qexp+) from the expanded query (Qexp), (2) calculating the strength of each human-expert with regard to the positive expanded query term set (∑ di,j), and (3) applying the TOP_N( ) function to select the N strongest human-experts as the returned candidate human-expert set (Ec). The details of the Human-Expert Set Discovering Mechanism are as follows.

In order to discover the candidate expert set by using the Expert-Terms Correlation Matrix, all the positive query terms have to be separated from an expanded query. The essential idea is that the mechanism only focuses on experts who can solve the question, so the negative query set becomes unnecessary information. For this reason, the mechanism first defines the positive expanded query set. An expanded query Qexp = {q1±, q2±, . . . , qi±} consists of a set of positive and negative query terms. Let Qexp+ ⊆ Qexp be the positive expanded query set, which contains all of the positive query terms of Qexp. Let Qexp− = Qexp − Qexp+ be the negative expanded query set, Qexp− ⊆ Qexp. Moreover, Qexp+ and Qexp− have the following characteristics: Qexp+ ∩ Qexp− = ∅, Qexp+ ∪ Qexp− = Qexp. The Qexp+ is the essential query information for generating the candidate human-expert set Ec. Eq. (6) is used to discover Ec: the equation sums up each expert's strengths associated with Qexp+, and selects the strongest experts to be part of the candidate human-expert set:

Ec = TOP_N(∑ di,j),    (6)

where di,j is the correlation strength between ki and ej, with ki ∈ Qexp+ and ej ∈ E; Ec ⊆ E is the candidate expert set; and TOP_N is the function which returns the N strongest experts.

4. Dynamic maintenance

Fig. 4. The Expert-Terms Correlation Matrix.

The Terms Correlation Matrix and the Expert-Terms Correlation Matrix are the two matrices that store analytical information for the intelligent forum system. The stored information helps to query discussion knowledge and discover the potential human-experts. However, the stored information dynamically changes its value as new discussion threads come into the system. Therefore, how to maintain the two matrices without redundant computing becomes a critical issue. This section introduces how to maintain the Terms Correlation Matrix and Expert-Terms Correlation Matrix in a reasonable fashion.

Fig. 5. The Human-Expert Set Discovering Mechanism.



4.1. Incremental Update of the Terms Correlation Matrix

The Terms Correlation Matrix stores the information about the interrelationships between index terms. If we updated all of the information at once, it would take a lot of time and be inefficient. Therefore, we need an update method that operates incrementally to maintain the matrix with minimum time cost. Recall that Eq. (2) constructs the Terms Correlation Matrix. From the construction process we can make two observations: (1) new terms are only added when a discussion occurs, and (2) the changes in the terms' interrelationships depend on which terms are included in the new discussion. Accordingly, Eq. (7) is used to incrementally update the Terms Correlation Matrix. In Eq. (7), ci,j(t + 1) is the changed interrelationship value, the conditional probability that term kj occurs given that term ki occurs. Furthermore, ni,j(t) stands for the number of discussions that contain both terms ki and kj before the new discussion dnew is added into the system. Similarly, ni(t) stands for the number of discussions that contain the term ki before the value changes. To update ci,j, the method checks whether the new discussion dnew contains both terms ki and kj, or contains only the term ki. Based on Eq. (2), the numerator is associated with ni,j(t) and the denominator is associated with ni(t). If dnew contains both ki and kj, it adds 1 to both the numerator and the denominator; if dnew contains only the term ki, it adds 1 to the denominator only:

ci,j(t + 1) = (ni,j(t) + 1) / (ni(t) + 1), if (ki ∈ dnew) AND (kj ∈ dnew);
ci,j(t + 1) = ni,j(t) / (ni(t) + 1), if (ki ∈ dnew) AND (kj ∉ dnew).    (7)

4.2. Maintenance of the Expert-Terms Correlation Matrix

The Expert-Terms Correlation Matrix stores the strengths of the human-experts with regard to the domain knowledge. Maintaining the matrix presents some difficult considerations: for instance, how long specific domain knowledge is kept in a person's memory, or how best to increase the strength of one specific piece of domain knowledge. Consequently, the forgetting curve (Ebbinghaus, 1913) is applied to model the phenomenon of strength decline. In Eq. (8), di,j(t0) stands for the strength of human-expert ej with regard to domain knowledge ki, and di,j(t0 + Δt) presents the remaining strength of human-expert ej with regard to domain knowledge ki after several days (Δt). Moreover, R is the memory retention expressed as a percentage, defined as e^(−t/S), where t is the time and S is the relative strength of memory:

di,j(t0 + Δt) = R · di,j(t0) = e^(−t/S) · di,j(t0).    (8)

In order to minimize the scope of the problem and to make the implementation easier, it is assumed that t and S are constants, with values t = Δt = 1 day and S = 100. Under this assumption, the strength di,j decreases by about 1% per day. The phenomenon of declining memory strength is illustrated in Fig. 6.

Fig. 6. The forgetting curve of parameters setting with t = 1 and S = 100.

Based on Eq. (8), Eq. (9) is designed to handle increases in the strength di,j. In Eq. (9), d′i,j is the enhanced strength and di,j is the original strength. Additionally, b is the discussion behavior factor, whose value depends on the user's discussion behavior. Table 1 illustrates five types of relationships between discussion behaviors and the corresponding behavior factors; the Increment Ratio column shows the increment ratio of the new strength (d′i,j). Table 1 defines three actions (propose, reply, and read) and two states (solved or unsolved) for discussion activities. For the defined actions and states, it is easy to understand what the actions mean, but it is difficult to determine whether a discussion question is solved. To solve this problem, we use a vote mechanism: the discussion system allows its members to vote on the discussions, and the results have to satisfy a predefined threshold for a discussion to be considered solved.

d′i,j = e^b · di,j.    (9)
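Eqs. (8) and (9) can be sketched together as follows, using the paper's constants (S = 100, time measured in days) and the behavior factor b = 0.1 from Table 1 for replying to and solving a question.

```python
# Sketch of memory-strength maintenance: forgetting (Eq. (8)) followed by
# behavior-based enhancement (Eq. (9)).
import math

S = 100.0  # relative strength of memory (constant, per the paper)

def decay(strength: float, days: float) -> float:
    """Eq. (8): remaining strength after `days` of forgetting."""
    return strength * math.exp(-days / S)

def enhance(strength: float, b: float) -> float:
    """Eq. (9): d'_{i,j} = e^b * d_{i,j}, with b taken from Table 1."""
    return strength * math.exp(b)

# The worked example of Section 4.3: 86% strength, 15 idle days, then the
# expert replies to and solves a related question (b = 0.1).
s = decay(0.86, 15)   # ≈ 0.7402 after forgetting
s = enhance(s, 0.1)   # ≈ 0.8179 after enhancement
```

Because both updates are simple multiplications, the decay for any number of elapsed days can be applied lazily, on demand, instead of every day.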

4.3. The example of Expert-Terms Correlation Matrix maintenance This section takes an example to describe the maintenance of the Expert-Terms Correlation Matrix. First of all, it assumes that a human-expert has 86% memory strength with a specific term. After 15 days passed, the human-expert replied to and solved a question related to the specific term. The initial memory strength would then decrease by the corresponding ratio of days, and the remaining memory strength would increase to 10.5% by referring Table 1. Fig. 7 shows the variation of the memory strength between human-expert and the index term. However, applying computation power to the daily decrease of the remaining human-expert memory strength is expensive. Therefore, an approach is proposed to maintain the remained memory strength on demand without daily updating. The core idea of the approach is that it needs

Y.-M. Huang et al. / Expert Systems with Applications 34 (2008) 446–458

two experiments will be conducted to evaluate the proposed methods and the given discussion set. The two adopted evaluation measures are the commonly used precision and recall measurements, as in Eqs. (10) and (11). This study uses the two measurements to evaluate the precision of the discussion searching. Following that, it employs the given natural language query to select the top 10 membership degrees of discussion knowledge, and thus judges the relevance of each discussion manually. Similarly, the precision rate of human-expert discovery can be measured in a similar manner. It chooses the top 10 human-experts by memory strength, and judges their relevance to the query. Additionally, the recall rate can be evaluated by measuring the ratio between numbers of relevant discussions discovered and the numbers of total relevant discussions in the database repository. Eventually, the recall rate of human-expert set discovery can be calculated by the number of selected relevant human-experts divided by the total number of relevant human-experts. The results of searching discussions and human-experts are shown in Figs. 8–11

Table 1 The discussion behaviors and the corresponding behavior factors Behavior factor (b)

Asking a question, or proposing a discussion Replying to a question but not solving it Replying to a question and solving it Reading a solved discussion Reading an unsolved discussion

0.01

1.0

0.03

3.0

0.1

10.5

0.08 0.01

8.3 1.0

Remaining strength (%)

Discussion behavior

Increment ration (eb  1) (%)

90 85 80 75 70 65 Initial state

After forgetting

After enhancing

found and relevant ; total found found and relevant ; recall rate ¼ total relevant

Updating states

precision rate ¼

Fig. 7. The variation of the strength between the human-expert and the index term.

5. Experimental results

This section first introduces the two measures used to evaluate the performance of the proposed retrieval mechanisms: the precision rate and the recall rate (Baeza-Yates & Ribeiro-Neto, 1999). The test discussion set and the sample query set are then described, and finally the experimental results are presented and discussed.
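Eqs. (10) and (11) amount to standard set-based precision and recall over a top-k result list. A small sketch (function and variable names are ours):

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k=10):
    """Precision over the top-k results and recall over all relevant items,
    following Eqs. (10) and (11)."""
    found = ranked_ids[:k]
    hits = sum(1 for doc in found if doc in relevant_ids)
    return hits / len(found), hits / len(relevant_ids)

# Q6-style case: only one relevant discussion exists and it is retrieved,
# so precision@10 = 1/10 = 0.1 while recall = 1/1 = 1.0.
p, r = precision_recall_at_k(ranked_ids=list(range(10)), relevant_ids={4})
```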

This study randomly samples 3000 discussions from three different forums: (1) thinkpads.com, (2) englishforums.com and (3) lonelyplanet.com; the features of each forum are introduced later. Each forum contributed 1000 discussions as the experimental testing data. We manually imported the testing data into our implemented forum system and constructed the human-experts' profiles from the corresponding discussions. Notice that since the testing discussions and experts were entered into the proposed system within a few days, the effect of human-experts' forgetting is not evident, which might introduce some bias into the results of human-expert discovery. Tables 2 and 3 show the properties of the discussion sets and human-expert sets respectively. The features of the three forums are as follows: (1) thinkpads.com is a professional ThinkPad notebook forum. We used 1000 discussion threads as our testing data

The system deals with the strength decrease only when the strength has to be increased or used. It can then calculate the remaining strength by considering both memory forgetting and remembering at once. For the given example, the remaining memory strength is 86% in the initial state, 74.02% after forgetting, and 81.79% in the final state. The strength is calculated as (86% * e^(-15/100)) * e^(10/100) = 81.79%, and the strength value at each updated state is shown in Fig. 7.
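The worked example can be reproduced numerically. A sketch under our reading of the printed formula: 15 and 10 are the exponent values from the example, and 100 is the decay constant in the denominator:

```python
import math

def decay(strength, elapsed, tau=100.0):
    """Exponential forgetting: strength * e^(-elapsed / tau)."""
    return strength * math.exp(-elapsed / tau)

def enhance(strength, amount, tau=100.0):
    """Strength enhancement, applied lazily once the decay is settled."""
    return strength * math.exp(amount / tau)

s0 = 0.86                    # initial remaining strength: 86%
s1 = decay(s0, elapsed=15)   # after forgetting: ~74.02%
s2 = enhance(s1, amount=10)  # after enhancing: ~81.8% (81.79% in the text, up to rounding)
```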

Fig. 8. The precision rate of the top 10 retrieved discussions with the natural language query set.

Y.-M. Huang et al. / Expert Systems with Applications 34 (2008) 446–458

Fig. 9. The recall rate of all the retrieved discussions with the natural language query set.

Fig. 10. The precision rate of the top 10 discovered human-experts with the natural language query set.

Fig. 11. The recall rate of all the discovered human-experts with the natural language query set.

Table 2
Summary of the test discussion sets

  Discussion set category   Number of discussions   Number solved   Avg. replies per discussion   Avg. participators per discussion
  thinkpads.com             1000                    863             5                             3
  englishforums.com         1000                    934             11                            3
  lonelyplanet.com          1000                    960             34                            11

from a subcategory titled Thinkpad – General HARDWARE/SOFTWARE questions. This subcategory consists of troubleshooting discussion threads and other general questions. After observation, we discovered that the collected testing data included a number of users who participate in only one discussion thread. Usually, these users are first-time ThinkPad notebook buyers, and they often ask a similar question, such as ''I am going to buy a ThinkPad

Table 3
Summary of the test human-expert sets

  Human-experts category   Number of human-experts   Avg. participations per expert   Max. participations per expert
  thinkpads.com            843                       5                                251
  englishforums.com        441                       73                               612
  lonelyplanet.com         1553                      53                               433

notebook. Please give me some more suggestions.'' This situation decreases the average number of times each human-expert participates, as can be observed in Table 3. The second observation is that discussion threads in Thinkpad – General HARDWARE/SOFTWARE questions often terminate once the question is answered. People in the forum often read a discussion but do not reply to it. We conclude the reason is that the answers in this forum are clear-cut, so it is not necessary to provide other alternatives. This keeps the average number of replies in each discussion small, as can be seen in Table 2.

(2) englishforums.com is an English language discussion site. The experiment used 1000 discussion threads as testing data, collected from the forum's subcategory General English grammar questions (EFL/ESL). The major issues discussed there concern English grammar and the usage of specific English terms. In addition, the general discussion process is more like online chatting: the help-seeker first asks a question and human-experts reply; the help-seeker then poses a more detailed question arising from the given answers, and the corresponding human-experts answer it. This question-and-answer process repeats until the original problem is solved and the final answer convinces both the help-seeker and the other participators. Each discussion thus tends to have many replies even though it has few participants, which is why each discussion from englishforums.com in Table 2 has a greater average number of replies than average number of participants. Notice that the average number of replies per discussion has not grown explosively, because a number of threads still have only one reply.
In Table 3, the number of human-experts on englishforums.com is much smaller than for the other two forums. We conjecture this is because the English forum demands more professional knowledge: in our view, the English forum needs more professional knowledge than the notebook forum, and the notebook forum requires more specific knowledge than the traveling forum. Moreover, a lower threshold of expert knowledge in a forum attracts more human-experts to participate, which can explain the different numbers of participants between forums, as seen in Table 3. Finally, Table 3 also shows that the participators of englishforums.com are the most loyal.


(3) lonelyplanet.com is a professional traveling site that includes the Thorn Tree Forum, which has a number of regional subcategories. Our experiment selected 1000 discussion threads from the Asia-North-East Asia forum as testing data. This sub-forum consists of travel information for North-East Asia, and the discussion issues cover a wide range of topics, such as shopping, traveling, and living in a specific city. As mentioned before, joining the traveling forum requires a lower threshold of domain knowledge, and therefore lonelyplanet.com has more participants than the other two forums examined (see Table 3). Similar to the English forum, the discussion process in the traveling forum is like an online conversation. Furthermore, most discussion threads in lonelyplanet.com only terminate when the topic is no longer relevant, e.g. a time-specific event has passed or the thread has been inactive for a long time. It is interesting to note that, even after a persuasive answer appears in a discussion, the human-experts still provide other alternatives for solving the original problem. We conjecture this may be because a definitive answer is not always possible for such issues. This could also explain why this forum has more participants and a higher average number of replies than the other two forums.

Table 4 shows the natural language query set, which was used as the testing query data in the experiments. The testing queries fall into three categories, corresponding to the three forums considered. Each category has five natural language queries, and all the queries are parsed into query term sets, which can be found in the fourth field of Table 4. Before measuring the precision and recall rates on the testing data, we examine the query sentences and their parsed query terms. As mentioned before, each query sentence has to undergo stemming, the elimination of stop words, and n-gram parsing.
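The parsing pipeline just described can be sketched end to end. The stop-word set and the stemming rules below are toy stand-ins (the paper uses a 319-word Wikipedia list and Porter's full algorithm), chosen only so that the Q1 example from Table 4 comes out right:

```python
import re

# Toy stop-word subset (illustrative stand-in for the 319-word list).
STOP_WORDS = {"how", "can", "i", "my", "what", "is", "a", "or", "of", "me", "to", "the"}

def toy_stem(word):
    """Two toy suffix rules mimicking Porter's output on these examples:
    'warranty' -> 'warranti', 'dogs' -> 'dog', 'BIOS' -> 'bio'."""
    word = word.lower()
    if word.endswith("y"):
        return word[:-1] + "i"
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def parse_query(sentence, n_max=3):
    """Split a query into 1..n_max-grams over adjacent tokens, drop any gram
    containing a stop word, and stem the surviving tokens."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    terms = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if any(t in STOP_WORDS for t in gram):
                continue
            terms.append(" ".join(toy_stem(t) for t in gram))
    return terms

# Q1 from Table 4: "How can I update my warranty?" -> ['update', 'warranti']
```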
Accordingly, each query sentence is split into word sequences of different lengths and each word is transformed into its base form. The issues that could affect the precision and recall rates are discussed as follows: (1) Stemming: In this phase, Porter's stemming algorithm (Porter, 1980) is adopted. The algorithm transforms a word into its base form, e.g. ''dogs'' becomes ''dog''. Intuitively, the stemming phase can increase the word matching accuracy, and indirectly enhance the precision and recall rates of discussion searching. However, stemming results still have some drawbacks. For instance, in Q3 of Table 4, the word ''BIOS'' has been transformed into ''bio''. The word ''bio'' is not a good index term in the category thinkpads.com, but it does make sense in another category, bioinformatics, even though that category is not included in our testing data. Therefore, although stemming can improve the performance of the proposed approach, it also introduces some bias into the results.


(2) Stop word pruning: The stop word pruning phase employs 319 English stop words from the stop word list of www.wikipedia.org to eliminate insignificant words in a sentence. Generally, the precision and recall rates are increased, but some fragments of phrases are also pruned. (3) The entity of discussion content: The proposed method considers the frequency of each term in discussions, so the words in a discussion thread are the most important factor affecting the search results. On the other hand, the word combination of the query sentence is also an important attribute affecting search accuracy. In the experiments, we discovered this is most obvious in englishforums.com. The reason is that the discussion threads in the grammar teaching forum include many terms that appear across multiple forum categories, and these terms do not belong to the category of englishforums.com. The help-seekers in the English grammar forum usually post a complete sentence and ask other human-experts to correct the grammar. Such a sentence usually contains a wide range of words, with the specific grammar issue remaining implicit, and this decreases the search precision and recall rates.

Table 4
The natural language query set

  Q1   thinkpads.com       "How can I update my warranty?"
       Parsed terms: update, warranti
  Q2   thinkpads.com       "Should I buy X or T series of ThinkPad?"
       Parsed terms: bui, x, t, seri, thinkpad, bui x, t seri
  Q3   thinkpads.com       "I can not update my BIOS to the latest one, help me."
       Parsed terms: update, bio, latest, help
  Q4   thinkpads.com       "Why does my notebook turn off by itself?"
       Parsed terms: notebook, turn
  Q5   thinkpads.com       "How can I set my Wi-Fi setting?"
       Parsed terms: set, wi, fi, set, fi set, wi fi, wi fi set
  Q6   englishforums.com   "I just want to know what CBBE means."
       Parsed terms: just, want, know, cbbe, mean, just want, cbbe mean
  Q7   englishforums.com   "What is a good writing style?"
       Parsed terms: good, write, style, write style, good write, good write style
  Q8   englishforums.com   "What is an easy way to learn English?"
       Parsed terms: easi, wai, learn, english, easi wai, learn english
  Q9   englishforums.com   "The government do/does a lot for us. Which one is the correct?"
       Parsed terms: govern, doe, lot, correct
  Q10  englishforums.com   "Tell me the differences between Business English and General English."
       Parsed terms: tell, differ, busi, english, gener, busi english, gener english
  Q11  lonelyplanet.com    "I am going to Shanghai in July with 5 adults and 3 kids. Where is better to stay in Shanghai?"
       Parsed terms: shanghai, juli, 5, adult, 3, kid, better, stai, 5 adult, 3 kid
  Q12  lonelyplanet.com    "Could you suggest a 5-days traveling schedule in Tokyo for this month?"
       Parsed terms: suggest, 5, dai, travel, schedule, tokyo, thi, month, 5 dai, dai travel, travel schedule, thi month, 5 dai travel, dai travel schedule, 5 dai travel schedul
  Q13  lonelyplanet.com    "How to best discover Xinjiang?"
       Parsed terms: best, discov, xinjiang, discov xinjiang, best discov, best discov xinjiang
  Q14  lonelyplanet.com    "What kinds of clothes should I bring to Taiwan in August?"
       Parsed terms: kind, cloth, bring, taiwan, august
  Q15  lonelyplanet.com    "On new year eve in Hong Kong, where has the celebration?"
       Parsed terms: new, year, ev, hong, kong, ha, celebr, new year, year ev, hong kong, new year ev

Figs. 8 and 9 show the precision and recall rates respectively of searching for discussion knowledge, while Figs. 10 and 11 show the precision and recall rates respectively of human-expert discovery. These results are produced by the measurements mentioned before (see Eqs. (10) and (11)). For the experimental parameters, the membership

degree threshold for searching discussions is set to 0.1 and the threshold of human-expert discovery to 3.0. Before discussing the results, we roughly set a criterion (>= 0.5) to classify the precision and recall rates into two classes: high and low. Since the early part of this section has explained some general results and the factors affecting the proposed method, the following discusses only the most interesting parts of the results in terms of these two classes. Generally, common queries produce either high-precision/low-recall or low-precision/high-recall results in the searching process. However, in Figs. 8 and 9, Q2 presents both low precision and low recall. The major reason is that the parsed terms of Q2 are not significant. In particular, the parsed term set of Q2 is too common in the forum category thinkpads.com, and almost all discussion threads contain the same terms, which explains why Q2 has low precision. Moreover, the low recall is caused by Q2 having no significant terms in it; Q2 therefore cannot construct enough membership relations to the correct discussions, and the related discussion threads are not found. The next interesting target is query sentence Q6. In Figs. 8 and 9, Q6 presents extreme results in its precision and recall rates: the precision rate of Q6 for searching discussion knowledge is 0.1 and the recall rate is 1.0. After examining the discussion set of englishforums.com, we discovered there is only one discussion thread in the database relevant to Q6. The parsed terms of Q6 include a significant term 'cbbe', and


the term can exactly locate the relevant information in the knowledge repository. Nevertheless, referring to Eq. (10), the precision rate is calculated as the number of relevant items among the top 10 discovered, divided by the number of measured discussions (= 10). Consequently, the precision rate of Q6 is (1/10) = 0.1 and the recall rate is (1/1) = 1.0. Q15 is the last query in our evaluation. Its parsed term set consists of several significant terms, e.g. 'celebr', 'hong kong', and 'new year ev'. From this significant term set, we can infer that the precision and recall rates of discussion searching should be high. As Figs. 8 and 9 show, the experimental results support this inference (precision rate = 0.9 and recall rate = 1.0), demonstrating how good query terms help discover critical knowledge. Fuzzy query term expansion expands the query term set and improves the recall rate of knowledge searching. Fig. 9 shows an acceptable recall rate for searching discussion threads, and in the type of system implemented the recall rate is more important than the precision rate: the help-seeker does not care how many solutions are found on the first result page, but rather whether there are any solutions on that page. Discovering solutions from the existing knowledge database is the most important objective of the query phase of our knowledge forum system.

Figs. 10 and 11 show the precision and recall rates of human-expert discovery, with good average precision and recall. The reason is that the human-experts were collected from three different forums, and each participant is assumed to be unique within the collected expert set. Therefore, low precision would occur only when the discovered human-experts belong to the incorrect forum, or when the experts do not have enough domain knowledge to answer the given question. Unfortunately, for the second situation it is difficult to judge whether a discovered expert has enough knowledge, so the first situation becomes the only criterion for examining the precision and recall rates. Hence, the great difference between the forums causes the human-expert discovery mechanism to give excellent results, as shown in Figs. 10 and 11. After careful observation, it can be seen that only Q9 has a low precision rate in Fig. 10. This is because the original sentence of Q9 consists of a number of stop words, which were all pruned from Q9: {'the', 'do', 'a', 'for', 'us', 'which', 'one', 'is'}. From the semantic view, the significant words of Q9 are 'do' and 'does', but the word 'do' was eliminated in the stop word pruning phase. Consequently, the precision rate of Q9 is poor. However, the discovery of human-experts focuses on the recall rate, and the average precision rate is acceptable.

6. Conclusions

We have studied the entire knowledge building and delivery processes of a help-seeking system, and it gives

457

the following approaches to counter the disadvantages of the traditional forum system. First, the forum actively invites human-experts to solve the given problem; accordingly, questions do not fall into neglect, and issues can be solved within a short time. Second, information seekers can vote on discussions until a useful solution's score exceeds a predefined threshold; such solutions have thus passed peer validation and become static knowledge in the forum repository. The methodology adopts common information retrieval techniques to parse the testing discussion knowledge and query sentences, including stemming, stop word pruning, and n-gram segmentation. Furthermore, we have modified tf*idf to fit the structure of the discussion knowledge. We have also introduced fuzzy information retrieval to improve the recall rate of knowledge and expert searching by applying fuzzy query expansion; the fuzzy query measures the similarity between the query sentence and the collected discussion set. The experimental results indicate the fuzzy query has acceptable performance, especially on the recall rate. With regard to human-expert discovery, this study has proposed the Expert-Terms Correlation Matrix to model the experts' knowledge strength. The incremental maintenance issue has been considered by employing the human forgetting curve to decrease the memory strength of experts. The human-experts' interactions on the forum are mapped to the corresponding behavior factors, which are used to enhance the domain knowledge strength in the Expert-Terms Correlation Matrix. Finally, the human-expert discovery mechanism was proposed, which evaluates the relationship between experts and the question posed by help-seekers. The experimental results also show the human-expert discovery mechanism performs its task efficiently.
In conclusion, the proposed human-expert forum system could improve the efficiency of knowledge building, and help-seekers can use their own words to find tailored solutions.

Acknowledgements

This work was supported in part by the National Science Council (NSC), Taiwan, ROC, under Grant NSC 94-2524-S-006-001.

References

Abecker, A., Bernardi, A., Maus, H., Sintek, M., & Wenzel, C. (2000). Information supply for business processes: coupling workflow with document analysis and information retrieval. Knowledge-Based Systems, 13(5), 271–284.

Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern information retrieval. Addison-Wesley.

Berardi, M., Lapi, M., Leo, P., Malerba, D., Marinelli, C., & Scioscia, G. (2004). A data mining approach to PubMed query refinement. In Proceedings of the 15th international workshop on database and expert systems applications, Zaragoza, Spain (pp. 401–405).


Billerbeck, B., Scholer, F., Williams, H. E., & Zobel, J. (2003). Query expansion using associated queries. In Proceedings of the 12th international conference on information and knowledge management, New Orleans, LA (pp. 2–9).

Brown, P. F., deSouza, P. V., Mercer, R. L., Pietra, V. J. D., & Lai, J. C. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18(4), 467–479.

Catarci, T., Chang, S. K., Liu, W., & Santucci, G. (1998). A light-weight Web-at-a-Glance system for intelligent information retrieval. Knowledge-Based Systems, 11(2), 115–124.

Chen, H., Yu, J. X., Furuse, K., & Ohbo, N. (2001). Support IR query refinement by partial keyword set. In Proceedings of the second international conference on web information systems engineering, Singapore (Vol. 1, pp. 245–253).

Cross, R., Rice, R. E., & Parker, A. (2001). Information seeking in social context: structural influences and receipt of information benefits. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, 31(4), 438–448.

Ebbinghaus, H. (1913). Memory: A contribution to experimental psychology. New York: Teachers College, Columbia University.

Horng, Y. J., Chen, S. M., & Lee, C. H. (2003). A new fuzzy information retrieval method based on document terms reweighting techniques. International Journal of Information and Management Sciences, 14(4), 63–82.

Jeng, Y. L., Huang, Y. H., Kuo, Y. H., Chen, J. N., & Chu, W. C. (2005). ANTS: agent-based navigational training system. Lecture Notes in Computer Science – Advances in Web-Based Learning, 3583, 320–325.

Jiang, M. F., Tseng, S. S., & Tsai, C. J. (1999). Intelligent query agent for structural document databases. Expert Systems with Applications, 17(2), 105–113.

Li, W. S., & Agrawal, D. (2000). Support web query expansion efficiently using multi-granularity indexing and query processing. Journal of Data and Knowledge Engineering, 35(3), 239–257.

Lin, S. H., Chen, M. C., Ho, J. M., & Huang, Y. M. (2002). ACIRD: intelligent Internet document organization and retrieval. IEEE Transactions on Knowledge and Data Engineering, 14(3), 599–614.

Martin-Bautista, M. J., Sanches, D., Chamorro-Martinez, J., Serrano, J. M., & Vila, M. A. (2004). Mining web documents to find additional query terms using fuzzy association rules. Fuzzy Sets and Systems, 148(1), 85–104.

Matsumura, N., Ohsawa, Y., & Ishizuka, M. (2005). Combination retrieval for creating knowledge from sparse document-collection. Knowledge-Based Systems, 18(7), 327–333.

Porter, M. F. (1980). An algorithm for suffix stripping. Program, 14(3), 130–137.

Steehouder, M. F. (2002). Beyond technical documentation: users helping each other. In Proceedings of the IEEE international professional communication conference, Netherlands (pp. 489–499).

Stein, A., & Maier, E. (1995). Structuring collaborative information-seeking dialogues. Knowledge-Based Systems, 8(2–3), 82–93.

Weng, S. S., & Lin, Y. J. (2003). A study on searching for similar documents based on multiple concepts and distribution of concepts. Expert Systems with Applications, 25(3), 355–368.