Developing an intelligent web information system for minimizing information gap in government agencies and public institutions



Available online at www.sciencedirect.com

Expert Systems with Applications 34 (2008) 1618–1629
www.elsevier.com/locate/eswa

Tae Hyun Kim a,*, Gye Hang Hong b,1, Sang Chan Park a,2

a Department of Industrial Engineering, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Republic of Korea
b Dongbu CNI, #28th Floor, Dongbu Financial Center, 891-10 Daechi-dong, Gangnam-gu, Seoul 135-523, Republic of Korea

Abstract

The purpose of a web information service in government agencies and public institutions is to provide various kinds of public information that support people’s decision-making. However, people differ in their information access environment, their ability to understand information, their desire to pursue information, and so on. These differences create an information gap, which in turn leads to profit gaps among them. We therefore propose an intelligent web information system for minimizing the information gap in government agencies and public institutions, so that disadvantaged people can understand web contents and make more profit in their economic behavior. To remove the information gap, varied and useful contents should be designed and managed so that disadvantaged people can understand them easily and use them beneficially. Government agencies and public institutions should provide disadvantaged people who are experiencing the information gap with the requisite web contents in a form they can understand. This paper discusses a desirable web information system for government agencies and public institutions, based on web intelligence technology such as web mining and web personalization tools, that provides the information-poor class with the requisite web contents and supports their successful practical use. We show an application of the web information system to the Ministry of Agriculture & Forestry (MAF) in Korea.
© 2007 Elsevier Ltd. All rights reserved.

Keywords: Electronic government; Information gap; Data mining; Personalization

* Corresponding author. Tel.: +82 42 869 2960; fax: +82 42 869 3110. E-mail addresses: [email protected] (T.H. Kim), kaistduck@dongbu.com (G.H. Hong), [email protected] (S.C. Park).
1 Tel.: +82 2 3011 5648; fax: +82 2 3011 5551.
2 Tel.: +82 42 869 2920; fax: +82 42 869 3110.
0957-4174/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2007.01.041

1. Introduction

In this paper, we propose an intelligent web information system for minimizing the information gap in government agencies and public institutions. The system delivers personalized web contents that disadvantaged people can understand and from which they can make more profit in their economic behavior. To develop the system, we identify disadvantaged people, namely those with large total losses and a high probability of loss per transaction, by analyzing the transaction data of all markets. We then identify the information gap between disadvantaged people and advantaged people, and redesign the contents of web pages for disadvantaged people so that the gap is closed and the contents are easy to understand.

As one means of realizing electronic government, the construction of websites for public sectors such as government agencies and public institutions/enterprises has been actively promoted. Electronic government aims at providing citizens with quick and accurate service, making government work more effective, innovating by redesigning work processes, and raising national competitiveness by improving productivity (Lee & Jung, 2004). Since e-government (electronic government) first appeared in the United States of America in 1993, many countries, including South Korea, have been promoting e-government projects that actively take advantage of information and communication technologies in government administration. These projects seek to raise the efficiency of government administration and to provide customer-oriented service through the use of those technologies.

To implement e-government, the construction of websites for government agencies and public institutions is actively promoted. Such websites provide citizens with various information and services and serve as the most important means of collecting public opinion. The website of a government agency or public institution fundamentally differs from a private-sector website in that it pursues public interest and equity. That is, it is constructed and operated to realize public benefit and equity. Such websites provide interactive electronic administration services and act as a channel for disseminating administrative information, for communication, and for gathering and reflecting citizens’ opinions. They also raise the transparency of administration and help stimulate the national and local economy.

As electronic administration and public information services expand with the implementation of e-government, an information gap arises when access to those services is difficult. In other words, government agencies and public institutions should provide public information so that all people can easily access electronic administration and public information services and take advantage of the necessary information, regardless of economic status, physical handicap, or region.
However, when universal access to, use of, and benefit from this public information service are not secured, an information gap arises. The main causes of the information gap are considered to be differences in information access environment, ability to understand information, desire to pursue information, and so on, which stem from differences in economic status, education level, physical handicap, and region. It has recently been pointed out that when the contents offered on a website are hard to understand and cannot be used beneficially, an even more serious information gap can result in combination with those causes. To remove the information gap, varied and useful contents should be designed and managed so that disadvantaged people can understand them easily and use them beneficially. Therefore, government agencies and public institutions should provide disadvantaged people who are experiencing the information gap with the requisite web contents in a form they can understand.

This paper discusses a desirable web information system for government agencies and public institutions, based on web intelligence technology such as web mining and web personalization tools, that provides the information-poor class with the requisite web contents and supports their successful practical use. We show an application of the web information system to the Ministry of Agriculture & Forestry (MAF) in Korea. The application is designed to support farmers’ decision-making, because the differences in information access environments among farmers are very large and the information gaps among them are therefore very serious. As a result of the information gap, their profits differ widely even though they sell products of the same quality.

2. Web information system in MAF and research background

In this section, we describe the research background of e-government and then the current web information system of a public institution under MAF. Lastly, we present the issues that we will address in developing an intelligent web information system in MAF.

2.1. Research background

E-government is defined as the application of IT to government service. E-government is a global phenomenon, and public servants around the world are adopting novel ways to leverage IT to better serve their constituents. E-government services can be divided into three categories: transaction, citizen participation, and access to information (Marchionini, Sanan, & Brabdt, 2003). Of these categories, access to information is the most common e-government application, and our e-government project falls into this category. The purpose of such a service is to provide various kinds of public information that support the decision-making of various users, such as researchers, public servants in the agency operating the web information system and in other government agencies, and citizens.
Because the people who access public information differ in education level, desire to pursue information, and so on, government agencies should collect various data, transform the data into various kinds of information, and deliver the information to them. Therefore, it is very important for government agencies to consider how to integrate vast volumes of data managed in heterogeneous systems and how to manage the data effectively (Bouguettaya, Ouzzani, & Cameron, 2001). They should identify users’ needs and determine what kinds of web contents to serve to satisfy those needs. Lastly, they should consider how to design and deliver web contents that users want and can understand.

Two technologies can be considered for solving these problems: web page design and personalization. Marchionini and Levi (2003) discussed design issues in their BLS (Bureau of Labor Statistics) project. They introduced four design philosophies: use a highly structured, information-intensive display; use dynamic representations and control mechanisms; use end-user vocabulary; and use multiple representations and views. That is, they emphasized a design method that packs maximum information into a minimum number of pages and presents one kind of information in several different forms so that users can understand it.

Personalization uses information about a specific user to develop an electronic profile based on different types of user-specific information, which benefits users by reducing the time they spend searching for information. Personalization is mainly used by private-sector information and service providers such as Yahoo and Amazon.com. Public-sector organizations such as government agencies and public institutes, however, do not apply the technology to their web systems because of legal restrictions regarding the use of personal information and privacy. Increasing user demand may overcome these legal restrictions and allow the technology to be applied to their systems (Hinnant & O’Looney, 2003; Medjahed, Rezgui, Bouquettaya, & Ouzzani, 2003). Therefore, the public-sector system must deliver beneficial information that takes the user’s level into account, so that users make more profit by using the information in their decision-making and come to trust the public-sector system.

2.2. Web information system in MAF

The current web information system of a public institution under MAF was built to supply useful agricultural information, such as shipping information connected with the distribution of farm products and cultural information on farm village life, in order to contribute to the development of agriculture and farm villages and to farmers’ welfare.

As shown in Fig. 1, MAF collects transaction data in markets and transmits the data to the main transaction data server every day. The transaction data consist of market name, buyer name, supplier name, product name, quality, quantity and price. The data are summarized into the many kinds of information in which people are interested. For example, some people may be interested in information summarized along the two dimensions of month and city, while others may be interested in information summarized along the three dimensions of week, market and item. We therefore find feasible dimensions and summarize the data according to feasible combinations of those dimensions. The information is built into a data warehouse with On-Line Analytical Processing (OLAP) and delivered to farmers through web pages.

2.3. Defining issues

We considered the following three issues for an intelligent web information system that can improve the overall profit of users such as farmers.

First, people use only a few kinds of information even though seventy or more kinds are delivered. The reason is that many kinds of information are either not useful for their decision-making or not designed to be easily understood. Because this is a web design issue, we should design web contents that contain maximum information in a minimum number of pages, and we should provide a method for presenting one kind of information in several different forms that take the user’s ability to understand into account.

Fig. 1. Public institution’s web system in MAF subsidiary. (Flow: MAF investigates transaction data such as price, quantity and quality in all markets, Market 1 to Market n; the transaction data server summarizes the data by region, market and time and builds a data warehouse; summarized price and quantity data are delivered through web pages to farmers, merchants and customers.)


Second, users differ in their ability to understand the same information because of differences in education, experience, and so on. As a result, each person consults different kinds of information for decision-making and ends up with a different profit or loss. We propose a method for delivering valuable web information that improves the profits of disadvantaged users.

Third, the value of information changes over time because the market environment changes. However, many users are unaware of the change in value. They may fail to adapt to the changed market environment and suffer many losses in the market. As a result, they may stop trusting the system and refuse to allow the use of their personal information. We therefore propose a method that monitors changes in the market environment, identifies newly important information, and delivers that valuable information.

3. Intelligent web information system

In this section, we describe the framework of an intelligent web information system (IWIS) for government and all the modules of the IWIS. First, we define the main terms, following Han and Kamber (2001):

– Cuboid: a combination of dimensions. It represents a different degree of summarization.
– Number of dimensions of a cuboid: it represents the complexity level of a kind of information.
– Concept level of a dimension: the summary level of a dimension in a cuboid.
– Summarization schema: a cuboid and the concept levels of the dimensions of the cuboid.
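The terms defined above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the records, the field names (`day`, `week`, `market`, `item`) and the helper functions `summarize` and `cuboids` are hypothetical and are not taken from the MAF system. A summarization schema is represented as a tuple of field names, where the choice of `week` rather than `day` fixes the concept level of the time dimension.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical transaction records (field names are assumptions for illustration).
transactions = [
    {"day": "2007-03-05", "week": "2007-W10", "market": "A", "item": "apple", "price": 100.0, "quantity": 10},
    {"day": "2007-03-06", "week": "2007-W10", "market": "A", "item": "apple", "price": 120.0, "quantity": 20},
    {"day": "2007-03-06", "week": "2007-W10", "market": "B", "item": "apple", "price": 90.0, "quantity": 30},
    {"day": "2007-03-12", "week": "2007-W11", "market": "B", "item": "pear", "price": 200.0, "quantity": 5},
]


def summarize(records, schema):
    """Average price and total quantity for each cell of one summarization schema.

    `schema` is a tuple of field names such as ("week", "market"): the set of
    dimensions is the cuboid, and each field name fixes a concept level.
    """
    cells = defaultdict(lambda: {"price_sum": 0.0, "quantity": 0, "n": 0})
    for record in records:
        cell = cells[tuple(record[field] for field in schema)]
        cell["price_sum"] += record["price"]
        cell["quantity"] += record["quantity"]
        cell["n"] += 1
    return {
        key: {"avg_price": c["price_sum"] / c["n"], "quantity": c["quantity"]}
        for key, c in cells.items()
    }


def cuboids(dimensions):
    """Every non-empty combination of dimensions, one cuboid per combination."""
    return [c for k in range(1, len(dimensions) + 1) for c in combinations(dimensions, k)]


weekly_by_market = summarize(transactions, ("week", "market"))
all_cuboids = cuboids(("time", "market", "item"))
```

With three dimensions there are seven candidate cuboids, which shows how quickly the number of deliverable information kinds grows as dimensions and concept levels are added.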


As shown in Fig. 2, users consult the various kinds of web information delivered by the IWIS and then decide on an item, its quantity, and the market in which to sell it. We collect users’ transactions in markets, evaluate users’ decision-making in terms of profit, and then segment users into an advantaged group and a disadvantaged group. We define the advantaged group as the group that makes more profit than the average of users’ profits in the markets. Next, we identify the difference in the web information used by the two groups. That is, we find which web pages the users of each group mainly consult, what features the information on those pages has, and how well the users are able to understand the information. Lastly, we deliver redesigned web pages to improve the profit of the disadvantaged group. The intelligent web information system is thus proposed to help the disadvantaged group improve their profits.

As shown in Fig. 3, the IWIS consists of two modules: the Web-information creation module (WCM) and the personalization module (PM). The WCM identifies the summarization schemas and decision variables of all clusters, finds the best summarization schema and decision variables among them, and redesigns the web information accordingly. The WCM consists of three sub-modules: Monitoring change of market environment, Identifying key summarization schema and decision variable, and Redesigning web pages.

Monitoring change of market environment is the sub-module that finds important market environment variables from the behavior of the main customers in the market, compares them with the environment variables of the previous period, and triggers the next sub-module to rerun if the two environments differ.
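The segmentation and comparison steps described above can be sketched as follows. This is a minimal illustration, assuming a single profit figure per user and a web log given as (user, page) pairs; the page names, the data and the functions `split_by_profit` and `page_usage` are hypothetical.

```python
from collections import Counter


def split_by_profit(profits):
    """Users with profit above the mean are 'advantaged'; the rest are 'disadvantaged'."""
    mean = sum(profits.values()) / len(profits)
    advantaged = {user for user, profit in profits.items() if profit > mean}
    return advantaged, set(profits) - advantaged


def page_usage(web_log, users):
    """Count page views in `web_log`, a list of (user, page), restricted to `users`."""
    return Counter(page for user, page in web_log if user in users)


# Hypothetical profits and web log.
profits = {"u1": 500.0, "u2": -200.0, "u3": 50.0, "u4": -350.0}
web_log = [
    ("u1", "weekly_quantity_trend"), ("u1", "weekly_quantity_trend"),
    ("u3", "weekly_quantity_trend"), ("u3", "daily_price"),
    ("u2", "daily_price"), ("u4", "daily_price"),
]

advantaged, disadvantaged = split_by_profit(profits)
adv_usage = page_usage(web_log, advantaged)
dis_usage = page_usage(web_log, disadvantaged)

# Positive values mark pages consulted more often by the advantaged group.
gap = {page: adv_usage[page] - dis_usage[page] for page in set(adv_usage) | set(dis_usage)}
```

Pages with a large positive gap are candidates for redesign and delivery to the disadvantaged group; the paper itself refines this idea with clustering and classification rather than a simple mean split.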

Fig. 2. Framework of intelligent web information system. (Cycle: users consult various kinds of web information from the web service and make a decision; transaction data are collected in markets and users are segmented by profit into an advantaged group and a disadvantaged group; the difference in web information used by the two groups is identified from web logs and web pages; web information is redesigned to improve the profit of the disadvantaged group.)


Fig. 3. Main modules of intelligent web information system. (Web-information Creation Module (WCM): monitoring change of market environment from market data; identifying key decision variables and summarization schema of each cluster from users’ behavior data and web logs; redesigning web pages and storing them in the web pages repository. Personalization Module (PM): after user log-in, a newcomer receives all designed web pages; for a registered user whose profit has not improved, new web pages are searched; personalized web pages are then packaged and delivered through the web service.)
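The decision flow of the personalization module in Fig. 3 can be sketched as a single function. This is only an illustration under assumptions: the data structures (`profiles`, `repository`), the page names and the function `pages_for_user` are hypothetical, and the real PM evaluates profit from on-line/off-line transaction data rather than from a stored flag.

```python
def pages_for_user(user_id, profiles, all_pages, repository):
    """Choose the pages to serve, following the newcomer / profit-improved checks.

    profiles:   user_id -> {"pages": [...], "profit_improved": bool}
    repository: user_id -> list of alternative page packages not yet tried
    """
    profile = profiles.get(user_id)
    if profile is None:
        # Newcomer: deliver every designed page so that usage can be observed.
        return list(all_pages)
    if not profile["profit_improved"] and repository.get(user_id):
        # Profit did not improve: switch to the next package from the repository.
        profile["pages"] = repository[user_id].pop(0)
    return profile["pages"]


# Hypothetical data.
all_pages = ["daily_price", "weekly_price_by_market", "weekly_quantity_trend"]
profiles = {
    "farmer_a": {"pages": ["daily_price"], "profit_improved": False},
    "farmer_b": {"pages": ["weekly_quantity_trend"], "profit_improved": True},
}
repository = {"farmer_a": [["weekly_price_by_market"]]}

newcomer_pages = pages_for_user("farmer_new", profiles, all_pages, repository)
farmer_a_pages = pages_for_user("farmer_a", profiles, all_pages, repository)
farmer_b_pages = pages_for_user("farmer_b", profiles, all_pages, repository)
```

Keeping untried packages in a per-user queue is one simple design choice; it guarantees that a user whose profit does not improve is never served the same package twice.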

Identifying key summarization schema and decision variable is the sub-module that identifies the key summarization schema and decision variables of every cluster by analyzing web-log data and users’ behavior data. It then finds the best summarization schema and decision variables among them for profitable decision-making.

Redesigning web pages is the sub-module that assesses users’ ability to use information. It then transforms the information summarized with the best summarization schema and decision variables into a different form of information that users can understand.

The PM first determines whether a user is registered. If the user is a new visitor, it delivers all web information in order to observe the user’s web usage behavior and ability to understand. If the user is a registered visitor, it delivers the user’s personalized web pages. The PM monitors the user’s profit by analyzing on-line/off-line transaction data after the web information has been used. If the user’s profit has not improved, the PM searches the web page repository for a new type of web page and delivers the newly packaged web pages to the user.

So far, we have described the two main modules of the intelligent web information system in government. In the following subsections, we describe the sub-modules of the WCM and the methods designed for them in detail.

3.1. Identifying key summarization schema and decision variable

As noted in the previous section, the value of information differs according to users’ ability to understand it. To measure the value of information for a user, we should evaluate the user’s behavior in the market after the information has been consulted. To do so, we distinguish advantaged users from the other users. We transform users’ behavior data into three factors:

– Total profit (TP): Profit is defined as the difference between the user’s transaction price per item and the average market transaction price per item. Total profit is obtained by summing the profit from all of the user’s transactions during a specific period.
– Probability for profit (PFP): the number of transactions in which the user made a profit divided by the total number of transactions during a specific period.
– Probability for loss (PFL): the number of transactions in which the user made a loss divided by the total number of transactions during a specific period.

Then we segment users into several clusters with similar features using a self-organizing map (SOM), which is one of the clustering methods (Kohonen, 1982). The SOM is an unsupervised learning scheme for training a neural network. Unsupervised learning comprises those techniques for which the resulting actions or desired outputs for the training sequences are not known. The network is given only the input vectors and self-organizes these inputs into categories.

Table 1
Summary of cluster characteristics

Cluster        Fraction of users (%)  Total profit  Probability for profit  Probability for loss
1              12.5                   38,725        0.86                    0.04
2              28.1                   7082          0.63                    0.33
3              34.4                   25,213        0.63                    0.39
4              25.0                   33,404        0.33                    0.94
Total average  100.0                  3203          0.61                    0.43

As shown in Table 1, users of clusters 1 and 2 make more total profit than users of clusters 3 and 4. In particular, users of cluster 1 have a much higher probability for profit per transaction than users of the other clusters. On the other hand, users of cluster 4 have a much lower probability for profit per transaction and a much lower total profit. Users of cluster 3 have almost the same probability for profit as users of cluster 2, but a lower total profit. They could improve their total profit by reducing the average amount of loss per transaction. These differences among clusters are caused by users’ abilities to understand information and by the kinds of information they consult for decision-making. We must therefore find which web pages each cluster mainly uses and with which summarization schema and decision variables those pages are summarized.

The WCM identifies important web pages that distinguish clusters of advantaged users from clusters of disadvantaged users. We use C4.5, which is one of the classification methods (Quinlan, 1993). Classification is the process that finds the features of a newly presented object in a database and assigns it to one of a set of predefined classes (Fig. 4). The method discovers the key summarization schema and decision variables of the best of the advantaged clusters (e.g. cluster 1) and those of the other clusters, as shown in Fig. 5.

First, we build a ‘user to web page’ matrix by analyzing web logs. It records access counts by user and web page.

Second, we transform the user to web page matrix into a matrix indexed by user and schema. The new matrix takes values in five categories (High, Above Average, Average, Below Average, and Low) obtained by transforming the access counts. To do this, we classify web pages by schema and then measure the degree of usage of each schema. Each web page consists of a summarization schema and decision variables, as shown in Fig. 6. For example, ‘webpage 05’ is made by summarizing transaction data according to the two decision variables of price and quantity and the two dimensions of time and market. If we find the web pages that characterize the best cluster and understand the schema and decision variables of those pages, we can identify the important decision variables and how to summarize their data for good decision-making.

Third, we identify the schema and decision variables of the web information characterizing each cluster. We use the C4.5 classification method and obtain a decision tree as shown in Fig. 5. We can read the results from the nodes of the decision tree and the conditions for

Fig. 4. Sub-processes of web information creation module. (Identifying key decision variables and summarization schema: users’ behavior data are transformed into TP, PFP and PFL and clustered; web logs are summarized into each user’s access degree by web page; the decision variables and summarization schema of each cluster are identified. Monitoring change of market environment: main customers are found from market data in terms of RFM and their characteristics are identified; a change of market environment triggers re-identification. Redesigning web pages: web pages that compare the alternatives of each cluster with those of the most valuable cluster are created and stored in the web pages repository.)
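The first sub-process in Fig. 4, transforming users’ behavior data into TP, PFP and PFL, can be sketched as below. This is a minimal illustration assuming that each transaction is an (item, unit price) pair and that market averages are given per item; the function name `profit_factors` and the data are hypothetical. Following the definition of profit in Section 3.1, profit is taken per item and is not multiplied by quantity, and a transaction exactly at the market average counts as neither profit nor loss.

```python
def profit_factors(transactions, market_avg_price):
    """Return (TP, PFP, PFL) for one user over one period.

    transactions:     list of (item, unit_price) pairs for the user
    market_avg_price: item -> average market transaction price per item
    """
    profits = [price - market_avg_price[item] for item, price in transactions]
    total = len(profits)
    tp = sum(profits)                                # total profit
    pfp = sum(1 for p in profits if p > 0) / total   # probability for profit
    pfl = sum(1 for p in profits if p < 0) / total   # probability for loss
    return tp, pfp, pfl


# Hypothetical data: per-transaction profits are 20, -10, 30 and 0.
market_avg_price = {"apple": 100.0, "pear": 200.0}
user_transactions = [("apple", 120.0), ("apple", 90.0), ("pear", 230.0), ("pear", 200.0)]

tp, pfp, pfl = profit_factors(user_transactions, market_avg_price)
```

The resulting three-component vectors, one per user, are the inputs that the paper clusters with a SOM; the clustering step itself is not shown here.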


Fig. 5. Extracting summarization schema and decision variable. (Steps: summarize access counts of web pages in a user-by-page matrix, User01 to User k by Web01 to Web m; transform the counts into five-level values for summarization schemas A01 to Am, where H: High, AA: Above Average, A: Average, BA: Below Average, L: Low, and label each user with a cluster number; identify the decision variables and summarization schema of each cluster with a decision tree that splits on A01, A04, A15 and A06 and ends in Classes 1 to 4.)
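The first two steps in Fig. 5, building the user-to-page matrix and transforming it into five-level schema usage, can be sketched as follows. The source does not state how access counts are mapped to the five categories, so the rule used here (the share of each schema’s maximum count, cut into five equal bands) is purely an assumption, as are the page-to-schema mapping and the function names.

```python
from collections import defaultdict

LEVELS = ["L", "BA", "A", "AA", "H"]  # Low ... High, as in the legend of Fig. 5


def to_schema_counts(page_counts, page_schema):
    """Aggregate a user-to-page count matrix into a user-to-schema count matrix."""
    result = {}
    for user, counts in page_counts.items():
        totals = defaultdict(int)
        for page, n in counts.items():
            totals[page_schema[page]] += n
        result[user] = dict(totals)
    return result


def categorize(schema_counts):
    """Map counts to five ordinal levels (assumed rule: fifths of the per-schema maximum)."""
    schemas = sorted({s for counts in schema_counts.values() for s in counts})
    peak = {s: max(c.get(s, 0) for c in schema_counts.values()) for s in schemas}
    table = {}
    for user, counts in schema_counts.items():
        row = {}
        for s in schemas:
            # Integer arithmetic avoids floating-point edge cases at band limits.
            band = counts.get(s, 0) * 5 // peak[s] if peak[s] else 0
            row[s] = LEVELS[min(band, 4)]
        table[user] = row
    return table


# Hypothetical web-log summary and page-to-schema mapping.
page_counts = {
    "User01": {"Web01": 3, "Web02": 7},
    "User02": {"Web01": 10, "Web03": 2},
}
page_schema = {"Web01": "A01", "Web02": "A02", "Web03": "A02"}

schema_counts = to_schema_counts(page_counts, page_schema)
usage_levels = categorize(schema_counts)
```

Each row of `usage_levels`, together with the user’s cluster number, would form one training example for a decision-tree learner such as C4.5; the tree induction itself is not shown.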

Fig. 6. Schema and decision variables of web pages. (Each web page, Webpage 01 to Webpage m, combines a summarization schema with decision variables for decision-making. The summarization schema presents cuboids of 1 to d dimensions, such as time, market, supplier and item, with concept levels such as Daily (A1), Weekly (A2), Monthly (A3), Market A (A4) and Market B (A5). The decision variables include price, quantity, price trend and quantity trend.)

their branches, as shown in Table 2. Users of cluster 1, the most advantaged cluster, can be characterized as follows: they are able to understand information summarized along four kinds of dimensions, and they consider the two decision variables of quantity and its trend. The values of the decision variables are summarized into weekly averages. On the other hand, users of cluster 4, the lower-profit cluster, are able to understand information summarized along three kinds of dimensions. So, if we deliver information summarized along four or more kinds of dimensions to these users, they may not understand the meaning of the information and may be misled into errors in decision-making. And the users


Table 2
Summary of schema and decision variables

Cluster 1
  No. of schema  Degree of reference  Cuboid                               Decision variables
  A01            ≤ AA                 time (daily), item, market           price and quality
  A04            > AA                 time (week), market, item            price and quality
  A15            > A                  time (week), market, supplier, item  quantities and its trend
  Characteristics: Maximum degree of dimension: 4. Key summarization: (1) summarizing data of quantities and its trend according to week, market, supplier and item; (2) summarizing data of price and quality according to week, market, and item.

Cluster 2
  No. of schema  Degree of reference  Cuboid                               Decision variables
  A01            ≤ AA                 daily, item, market                  price and quality
  A04            ≤ AA                 week, market, item                   price and quality
  A06            > AA                 week, market, customer, item         price and quantity
  Characteristics: Maximum degree of dimension: 4. Key summarization: summarizing data of price and quantity according to week, market, customer, and item.

Cluster 3
  No. of schema  Degree of reference  Cuboid                               Decision variables
  A01            ≤ AA                 daily, item, market                  price and quality
  A04            ≤ AA                 week, market, item                   price and quality
  A06            ≤ AA                 week, market, supplier, item         price and quantity
  A03            > AA                 daily, market, supplier, item        price and quality
  Characteristics: Maximum degree of dimension: 4. Key decision variable: summarizing data of price and quality according to daily, market, supplier and item.

Cluster 4
  No. of schema  Degree of reference  Cuboid                               Decision variables
  A01            > AA                 daily, item, market                  price and quality
  Characteristics: Maximum degree of dimension: 3. Key decision variable: summarizing data of price and quality according to daily, market and item.

consider the decision variables of price and quality, summarized into daily averages of transactions. From these results, we can see that advantaged users attach importance to the weekly transacted quantity of an item and its trend when making decisions, while disadvantaged users attach importance to the daily price and daily quantity of an item. Advantaged users also compare their transactions with those of other suppliers, while disadvantaged users do not consider this important. Therefore, if disadvantaged users pay close attention to information that compares other suppliers’ weekly quantity and its trend, they may improve their total profit and probability for profit. We need to redesign the information into a form that they can understand.

3.2. Monitoring change of market environment

As described in the previous section, the environment variables of the market change over time. However, users may not be aware of the changed market environment and may keep considering the same kinds of information that they used for decision-making in the previous period. This is why they suffer many losses and a lower probability for profit per transaction. We should therefore monitor the market environment and inform users of the new environment variables. To monitor the market environment, we summarize the market’s transaction data into recency, frequency, and monetary (RFM) data. Using the summarized RFM data, we can find the main customers of the market and identify their consumption pattern. RFM clustering is one of the analysis methods for discovering customer patterns (Bult & Wansbeek, 1995; Woo, Bae, & Park, 2005). We define RFM as follows.

– Recency: the last period of time in which purchases were made during an analysis period.
– Frequency: the number of purchases during an analysis period.
– Monetary: the amount of money spent during an analysis period.

After summarizing the data into RFM, we segment customers into several groups with similar consumption patterns using the SOM. The segmentation yields the results shown in Table 3. In period P1, the main customer group is cluster 2, because of its recency and its relative importance in total sales. We can see that these customers buy small quantities of items of below-average quality at an average price in period P1. In period P2, however, the characteristics of the main customer group change to high quality, high price, and large transaction quantities. We should therefore monitor changes in the profits of advantaged users, because the web information that is important for good decision-making may have changed. If their profits decrease, we should find new advantaged users


Table 3
Identifying change of market environment

Period  Cluster        Recency  Frequency  Monetary  Relative importance of total sales (%)
P1      Cluster 1      1.0      0.23       0.72      6
P1      Cluster 2      1.0      0.04       0.01      47
P1      Cluster 3      1.0      1.0        0.53      12
P1      Cluster 4      0.33     0.64       0.39      35
P1      Total average  0.83     0.48       0.41
P2      Cluster 1      0.66     0.30       0.54      51.95
P2      Cluster 2      0.38     0.98       0.72      13.68
P2      Cluster 3      0.22     0.17       0.08      16.62
P2      Cluster 4      0.76     0.09       0.05      17.75
P2      Total average  0.51     0.39       0.35

Market environment (= consuming pattern of main customers in each period):
P1: Cluster 2. Quality: 5th–7th level; Frequency: average; Price: average; Quantity: below average.
P2: Cluster 1. Quality: 1st level; Frequency: above average; Price: high price; Quantity: many quantity.

and their key summarization schema and decision variables. The module for monitoring changes in the market environment allows users to keep making a sufficient profit.
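The RFM summarization used in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy transactions and the min-max scaling to [0, 1] are assumptions made for illustration (the paper does not state how the RFM values are normalized), and the cluster labels, which the paper obtains from the SOM, are simply assigned by hand here.

```python
from collections import defaultdict


def min_max(values):
    """Scale a dict of numbers to [0, 1]; a constant column maps to 1.0."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 1.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


def summarize_rfm(transactions):
    """Summarize (customer, period, amount) records into normalized RFM.

    period is an integer index inside the analyzing period (larger = later).
    Returns {customer: (recency, frequency, monetary)}, each value in [0, 1].
    """
    last_period = {}
    count = defaultdict(int)
    spent = defaultdict(float)
    for customer, period, amount in transactions:
        last_period[customer] = max(last_period.get(customer, period), period)
        count[customer] += 1
        spent[customer] += amount
    r, f, m = min_max(last_period), min_max(count), min_max(spent)
    return {c: (r[c], f[c], m[c]) for c in last_period}


def sales_share(transactions, cluster_of):
    """Percentage of total sales contributed by each cluster."""
    totals = defaultdict(float)
    for customer, _period, amount in transactions:
        totals[cluster_of[customer]] += amount
    grand_total = sum(totals.values())
    return {k: 100.0 * v / grand_total for k, v in totals.items()}


# Hypothetical toy data: (customer, week index, amount spent).
transactions = [
    ("a", 0, 10.0), ("a", 3, 30.0),
    ("b", 1, 20.0),
    ("c", 3, 40.0), ("c", 2, 60.0), ("c", 1, 40.0),
]
rfm = summarize_rfm(transactions)

# Cluster labels would come from the SOM step; here they are assigned by hand.
shares = sales_share(transactions, {"a": 1, "b": 1, "c": 2})
# Simplification: treat the cluster with the largest sales share as the main one.
main_cluster = max(shares, key=shares.get)
```

Recomputing `rfm` and `shares` for each period and comparing the profile of `main_cluster` between periods corresponds to the kind of change between P1 and P2 reported in Table 3.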

3.3. Redesigning web pages

After extracting the summarization schema and decision variables of each cluster, we can see how those of the advantaged cluster differ from those of the other clusters. Disadvantaged users may improve their profit if we deliver to them information summarized with the summarization schema and decision variables of the advantaged cluster. However, users may not understand the information if we deliver it in the same form, because users differ in their ability to understand. Therefore, we must redesign the information to take each user's ability to understand into account. As shown in Fig. 7, we redesign the web information to improve the profits of disadvantaged users. We retrieve data on the alternatives related to the decision variables of the best advantaged cluster and summarize them according to the summarization schema of that cluster. Then, we segment the alternatives into several groups of similar alternatives and measure the value of each group. The value of a group is calculated with a linear weighting model (de Boer, Labro, & Morlacchi, 2001) as follows:

$$TV_i = w_{i1}\mu_{i1} + w_{i2}\mu_{i2} + \cdots + w_{ij}\mu_{ij}$$

where $i$ is the label of the group, $j$ is the decision variable, $w_{ij}$ is the weight of decision variable $j$ of group $i$, and $\mu_{ij}$ is the average

Fig. 7. Searching the best alternatives for decision-making. [Flow diagram: for the best advantaged cluster and for each cluster, the values (0 or 1) of the decision variables frequency, price, price trend, quantity, quantity trend and quality, together with the cluster's summarization schema, feed a search for the best alternatives; the information on each user's best alternatives is converted into information summarized with the summarization schema of the best advantaged cluster; the two sets of best alternatives are then combined for comparison and the web information is redesigned.]


Fig. 8. Redesigning a web page. (a) Web information and its schema that users of cluster 1 refer to for decision-making. (b) Delivering the modified web information of (a) to users of cluster 4 so that they can select a better alternative.

value of decision variable $j$ of group $i$. We measure the values of all groups in order to compare them and then recommend the alternatives of the best group for decision-making. Because this comparison involves several variables, we use a linear weighting model, which takes the importance of every variable into account in a single value. The weight of each decision variable is set according to the market environment. We compare the values of all groups and choose the alternatives of the best group. These alternatives are the optimal decision that the best advantaged users can choose. In the example of Fig. 8a, the best advantaged users refer to a web page summarized with the decision variables of quantity and its trend and with a schema of four dimensions: item, market, week and supplier. The alternative (a market, in this case) with a larger quantity and a stronger trend than the others is circled in the graph of Fig. 8a. Using the same method, we search for the alternatives that users of each cluster regard as the optimal decision. A user of a disadvantaged cluster may refer to web information summarized with the decision variables of price and quantity and with a schema of three dimensions (day, item and market), as in the left web page of Fig. 8b. However, the optimal alternative in that web information differs from that of the best advantaged users. We should inform disadvantaged users of this difference and guide them to reconsider their decision. Therefore, we convert the web information on the alternative that a disadvantaged user regards as optimal into a different type of web information, summarized with the decision variables and schema of the best cluster. When converting the web information, we take the disadvantaged user's ability to understand into account. As shown on the right of Fig. 8b, we find the information on the optimal alternative using the decision variables and schema of the best cluster, and then convert it into a different form presented with three dimensions (item, market and week) and one fixed dimension (supplier). The new information is combined with the information on the alternatives found from the best cluster, showing on the right of Fig. 8b that the alternative is not the optimal decision even though it appears optimal on the left of Fig. 8b. As a result, disadvantaged users may change their decision and come to understand other important decision variables and schemas.

4. Evaluation

To analyze the effectiveness of the intelligent web information system in government, we used three measures: total profit, probability for profit and probability for


Fig. 9. Comparing control group with treatment group. (a) Relative profit of clusters over time. (b) Comparing total profits among clusters. (c) Comparing probability for profit/loss among clusters.

loss. We randomly selected twenty users from cluster 4 and assigned half of them to a control group and the other half to a treatment group. We then redesigned the information summarized with the schema and decision variables of cluster 1 into a form that users of the treatment group could understand easily, and served it to the users of the treatment group. After testing the two groups for 10 weeks, we obtained the following results. As shown in Fig. 9a, users of the best cluster had higher profits than both groups during almost all periods. Users of the treatment group showed no difference in profit from the control group in the first period, but they had higher profits than the control group during the remaining periods. Fig. 9b compares the total profit of the treatment group with that of the control group: the profits of the treatment group improved sharply. In terms of the probability for profit/loss in Fig. 9c, the probability for profit of the treatment group increased and its probability for loss decreased. The probability for profit of cluster 1 and of the treatment group is 70%, because they made a profit in seven of the 10 weekly periods. The control group, however, made a profit in only one period during the 10 weeks. These results show that differences in users' ability to understand information, and differences in the information they use for decision-making, strongly affect the success or failure of their economic behavior. Therefore, an intelligent web information system in government should be designed to deliver effective information to people who have a low ability to understand it.

5. Conclusion

The purpose of an intelligent web information system in government agencies and public institutions is to provide various kinds of public information that support people's decision-making. Because people differ in education, experience and so on, they differ in their ability to understand information. Even when the same information is delivered to them for decision-making, they make different decisions and therefore obtain different profits or losses. Therefore, we suggested an intelligent web information system in government to help disadvantaged users make


more profit in their economic behaviors. We defined the important issues in developing an intelligent web information system in government effectively: design of web contents, personalization, and responding to changes in the market environment. To develop the system, we collect users' transactions in markets, evaluate users' decision-making in terms of profit, and then segment users into advantaged groups and a disadvantaged group. Then, we identify how the web information used differs between the two groups; that is, we find which pages the users of each group mainly refer to, which features of information those pages contain, and each user's ability to understand information. Lastly, we redesign the web information to remedy the information weakness of the disadvantaged group. We then used three measures (total profit, probability for profit and probability for loss) to analyze the effectiveness of the intelligent web information system in government. The evaluation showed that differences in users' ability to understand information, and differences in the information they use for decision-making, strongly affect the success or failure of their economic behavior. If a government web system helps disadvantaged users understand the information that advantaged users refer to for decision-making, users may come to trust the web system because it helps them make more profit.
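Two of the computations summarized above, the linear weighting model of Section 3.3 and the three evaluation measures of Section 4, can be illustrated with a short Python sketch. This is only a sketch under stated assumptions, not the authors' code: the function names, weights and data are hypothetical, and the probability for profit is taken to be the fraction of periods with a positive profit, which is consistent with the reported 70% for seven profitable periods out of 10 weeks.

```python
def group_value(weights, averages):
    """Linear weighting model: TV_i = sum over j of w_ij * mu_ij."""
    if len(weights) != len(averages):
        raise ValueError("one weight per decision variable is required")
    return sum(w * mu for w, mu in zip(weights, averages))


def best_group(groups):
    """Return the label of the group with the highest total value.

    groups maps a label to (weights, averages) over the same decision variables.
    """
    return max(groups, key=lambda label: group_value(*groups[label]))


def evaluate(profits):
    """Return (total profit, probability for profit, probability for loss).

    profits holds one profit figure per period; a period with exactly zero
    profit counts as neither profit nor loss (an assumption of this sketch).
    """
    n = len(profits)
    total = sum(profits)
    p_profit = sum(1 for p in profits if p > 0) / n
    p_loss = sum(1 for p in profits if p < 0) / n
    return total, p_profit, p_loss


# Hypothetical groups of alternatives scored on (quantity, quantity trend).
groups = {
    "market A": ([0.5, 0.5], [0.8, 0.4]),
    "market B": ([0.5, 0.5], [0.6, 0.9]),
}
recommended = best_group(groups)

# Hypothetical weekly profits of one user group over a 10-week test.
weekly_profits = [5, -2, 3, 4, -1, 6, 2, -3, 1, 7]
total, p_profit, p_loss = evaluate(weekly_profits)
```

Applying `evaluate` to the control group and to the treatment group separately gives the kind of comparison shown in Fig. 9b and Fig. 9c.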


References

de Boer, L., Labro, E., & Morlacchi, P. (2001). A review of methods supporting supplier selection. European Journal of Purchasing and Supply Management, 7, 75–89.
Bouguettaya, A., Ouzzani, M., & Cameron, J. (2001). Managing government databases. IEEE Computer, 34(2), 56–64.
Bult, J. R., & Wansbeek, T. J. (1995). Optimal selection for direct mail. Marketing Science, 14(4), 378–394.
Han, J., & Kamber, M. (2001). Data mining – Concepts and techniques. San Francisco: Morgan Kaufmann.
Hinnant, C. C., & O'Looney, J. A. (2003). Examining pre-adoption interest in on-line innovations: An exploratory study of e-service personalization in public sector. IEEE Transactions on Engineering Management, 50(4), 436–447.
Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59–69.
Lee, J. A., & Jung, J. W. (2004). Strategy for implementing high level e-government based on customer relationship management. Korean National Computerization Agency.
Marchionini, G., & Levi, M. (2003). Digital government information services: The bureau of labor statistics case. Interactions, 10, 18–27.
Marchionini, G., Samet, H., & Brandt, L. (2003). Digital government. Communications of the ACM, 46(1), 25–27.
Medjahed, B., Rezgui, A., Bouguettaya, A., & Ouzzani, M. (2003). Infrastructure for e-government web services. IEEE Internet Computing, 7(1), 58–65.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann.
Woo, J. Y., Bae, S. M., & Park, S. C. (2005). Visualization method for customer targeting using customer map. Expert Systems with Applications, 28(4), 763–772.
