Mining association rules procedure to support on-line recommendation by customers and products fragmentation


Expert Systems with Applications 20 (2001) 325–335

www.elsevier.com/locate/eswa

S. Wesley Changchien*, Tzu-Chuen Lu

Department of Information Management, Chaoyang University of Technology, 168 GiFeng E. Road, WuFeng, Taichung County, Taiwan, China

Abstract

Electronic Commerce (EC) has offered a new channel for instant on-line shopping. However, so many products are available from so many virtual stores on the Internet that shoppers find it hard to choose among them. On-line one-to-one marketing therefore becomes a great assistance to Internet shoppers. One of the most important marketing resources is the record of prior daily transactions in the database. This large amount of data not only provides statistics, but also serves as a resource of experience and knowledge. It is natural for marketing managers to perform data mining on the daily transactions and to treat shoppers the way they prefer. However, data mining on a significant amount of transaction records requires efficient tools. Data mining, through automatic or semi-automatic exploration and analysis of a large number of data items in a database, can discover significant patterns and rules underlying the database. This knowledge can be built into the on-line marketing system to promote Internet sales. The purpose of this paper is to develop a procedure for mining association rules from a database to support on-line recommendation. Through customer and product fragmentation, products can be recommended according to the hidden habits of customers recorded in the database. The proposed data mining procedure consists of two essential modules. One is a clustering module based on a neural network, the Self-Organization Map (SOM), which performs affinity grouping tasks on a large number of database records. The other is a rule extraction module employing rough set theory, which can extract association rules for each homogeneous cluster of data records and the relationships between different clusters. The implemented system was applied to a sample of sales records from a database for illustration. © 2001 Elsevier Science Ltd. All rights reserved.

Keywords: Data mining; SOM; Rough set; Association rules; On-line marketing

1. Background and motivation

According to the investigations of the Angus Reid Group in March 2000, more than three hundred million people in the world were using the Internet, and there will be three billion users by 2005 (E-Business Implementation, 1999). All of the investigations indicate a significant increase in Internet users and point to a network-oriented century. The development of the Internet has caused a new wave of business revolutions. Most Electronic Commerce (EC) businesses endeavor to survive and become leaders at the frontier of the new wave. The key factors of success include learning customers' purchasing behavior, developing marketing strategies to create new consumer markets, and discovering latent loyal customers. Therefore, domain expertise to support better decisions and new IT techniques to promote EC marketing are essential.

* Corresponding author. Fax: +886-4-3742337. E-mail address: [email protected] (S.W. Changchien).

Data mining is one of the most popular techniques for finding potential business knowledge in enterprise databases in support of better decisions. Through automatic or semi-automatic exploration and analysis of large amounts of data items in a database (e.g., a transactions database), data mining can discover potentially significant patterns and rules underlying the database. The patterns and rules mostly concern purchasing habits and other consumer behavior. Data mining usually takes one of two approaches: verification-driven data mining, a top-down approach that attempts to substantiate or disprove preconceived ideas, and discovery-driven data mining, which extracts information automatically from data (Basu, 1998; Bayardo & Agrawal, 1999; Berry & Linoff, 1997; Lin & Cercone, 1997; Kleissner, 1998). Both approaches aim to accomplish data mining tasks such as association rules extraction, clustering, classification, and estimation.

0957-4174/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0957-4174(01)00017-3

An association rule is represented by X ⇒ Y, where X and Y are sets of items. The rule means that the transaction records in the database that contain X tend to contain Y. A good number of efficient algorithms for mining association rules have been proposed (Agrawal, Imielinski & Swami, 1993; Agrawal & Srikant, 1995; Anand, Patrick, Hughes & Bell, 1998; Bayardo, 1998; Goulbourne, Coenen & Leng, 2000; Han & Fu, 1999; Houtsma & Swami, 1993; Matsuzawa & Fukuda, 2000). Clustering is the task of segmenting a heterogeneous population into a number of more homogeneous subgroups or clusters. Classification consists of examining the features of a newly presented object and assigning it to one of a predefined set of classes. Estimation deals with continuously valued outcomes: it comes up with a predicted value for some unknown continuous variable such as income, height, or credit card balance. In order to achieve those tasks, various data mining techniques have been proposed (Berry & Linoff, 1997; Dhond, Gupta & Vadhavkar, 2000; Kingsman, Hendry, Mercer & de Souza, 1996; Lin & Cercone, 1997; Shaw, Gardner & Thomas, 1997; Smith & Gupta, 2000).

Some research focuses on EC data mining, for instance market segmentation, customer targeting, and marketing decision support. Shaw et al. (1997) laid out the dimensions of EC and developed a framework for research issues associated with EC. Vellido, Lisboa and Meehan (1999) used SOM to cluster data and find the potential interest for on-line marketers. Yao, Teng and Poh (1998) incorporated an Artificial Neural Network (ANN) into a marketing decision support system that influences the sales performance of color televisions. Brijs, Goethals, Swinnen, Vanhoof and Wets (2000) introduced a model for selecting the most interesting products based on their cross-selling. Bhattacharyya (2000) used evolutionary computation-based procedures for obtaining a set of non-dominated models with respect to multiple stated objectives.

To learn more about their large number of different customers, EC managers intuitively separate the customers into smaller groups, each of which can be interpreted more specifically.
In marketing terms, subdividing the population according to variables already known to be good discriminators is called clustering (Berry & Linoff, 1997). Clustering is an unsupervised partitioning of data, and there are numerous clustering algorithms such as automatic cluster detection, the K-means method, agglomerative algorithms, divisive methods, and self-organizing maps (Berry & Linoff, 1997; Flexer, 1999; Ha & Park, 1998; Kaski, Honkela, Lagus & Kohonen, 1998). One of them is the Self-Organizing Map (SOM), a neural network that can recognize unknown patterns in the data. The basic SOM network has an input layer and an output layer. When training data are fed into the network, SOM computes a winner node, and the network adjusts the weights of the winner node and its neighborhood accordingly. Training continues until the network converges. When a new input is fed into the trained network, SOM can then determine which cluster the new input belongs to. A clustering technique like SOM can separate data items into clusters, but its drawback is that it cannot explain the clustering results specifically; other methods are needed to figure out the underlying features of each cluster. Ha and Park (1998) used a decision tree based on the C4.5 classification algorithm to show the various possible sequences of classifications. The tree, however, may be very complex for enterprise managers. Therefore, this paper uses association rules to explain the meaning of each cluster.

Knowledge is usually represented in the form of rules. Rules are used for deducing the degree of association among variables, mapping data into predefined classes, identifying a finite set of categories or clusters to describe the data, etc. Mining association rules has attracted a great number of researchers (Deogun, Raghavan, Sarkar & Sever, 1997; Matsuzawa & Fukuda, 2000; Pasquier, Bastide, Taouil & Lakhal, 1999; Tsechansky, Pliskin, Rabinowitz & Porath, 1999; Tsur, Ullman, Abiteboul, Clifton, Motwani, Nestorov et al., 1998; Zhang, 2000). For example, mining associations among sales transaction items may reveal that some specific items are usually purchased at the same time. Here we use rough sets to derive the association rules.

Rough set theory is used for the approximation of a concept. It uses the concepts of lower and upper approximation sets. When data mining queries are inspected with respect to rough set theory, dependency analysis and classification of data items are well investigated. The associations between values of an attribute can easily be solved by rough set theory (Chan, 1998; Griffin & Chen, 1998; Lin & Cercone, 1997). Pawlak proposed rough set theory in 1982 (Lin & Cercone, 1997). The theory is an extension of set theory for the study of intelligent systems with incomplete information. Let U be a finite, non-empty set called the universe, and let I be an equivalence relation on U, called an indiscernibility relation. I(x) is the equivalence class of the relation I containing element x. The indiscernibility relation is meant to capture the consequence of the inability to discern objects in view of the available information.
There are two basic operations on sets in rough set theory, the I-lower and the I-upper approximations, defined respectively as follows:

$$I_*(X) = \{x \in U : I(x) \subseteq X\}, \qquad (1)$$

$$I^*(X) = \{x \in U : I(x) \cap X \neq \emptyset\}. \qquad (2)$$

Usually, in order to define a set, we use the confidence function, defined as follows:

$$\mathrm{CF}(x) = \frac{\mathrm{Num}(X \cap I(x))}{\mathrm{Num}(I(x))}, \qquad (3)$$

where $\mathrm{CF}(x) \in [0, 1]$, $\mathrm{Num}(X \cap I(x))$ is the number of objects that occur simultaneously in X and I(x), and $\mathrm{Num}(I(x))$ is the number of objects in I(x). The confidence function can be used to redefine Eqs. (1) and (2) as follows:

$$I_*(X) = \{x \in U : \mathrm{CF}(x) = 1\}, \qquad (4)$$

$$I^*(X) = \{x \in U : \mathrm{CF}(x) > 0\}. \qquad (5)$$
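The two approximations and the confidence function can be illustrated with a small sketch. The toy universe, the `colour` attribute, and all function names below are hypothetical and chosen only for illustration; the code follows the confidence-based definitions, in which an object is in the lower approximation when its confidence is 1 and in the upper approximation when its confidence is positive.

```python
from fractions import Fraction


def equivalence_class(x, universe, key):
    """I(x): every object that cannot be told apart from x under `key`."""
    return {y for y in universe if key(y) == key(x)}


def confidence(x, target, universe, key):
    """CF(x) = Num(X ∩ I(x)) / Num(I(x))."""
    block = equivalence_class(x, universe, key)
    return Fraction(len(target & block), len(block))


def lower_approximation(target, universe, key):
    """Objects whose whole equivalence class lies inside X (CF = 1)."""
    return {x for x in universe if confidence(x, target, universe, key) == 1}


def upper_approximation(target, universe, key):
    """Objects whose equivalence class overlaps X (CF > 0)."""
    return {x for x in universe if confidence(x, target, universe, key) > 0}


# Hypothetical universe: six objects described by a single attribute.
colour = {1: "red", 2: "red", 3: "blue", 4: "blue", 5: "green", 6: "green"}
U = set(colour)
X = {1, 2, 3}  # the concept to approximate

lower = lower_approximation(X, U, colour.get)
upper = upper_approximation(X, U, colour.get)
print(sorted(lower))  # [1, 2]
print(sorted(upper))  # [1, 2, 3, 4]
```

In this toy case the "blue" class {3, 4} only partly overlaps X, so its members have confidence 1/2 and appear in the upper approximation but not in the lower one.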

The value of the confidence function denotes the degree to which the element x belongs to the set X in view of the indiscernibility relation I.

The purpose of this paper is to mine association rules from a database with a large number of transaction records and to apply the rules to EC on-line marketing. The data mining procedure consists of two modules. The first is a clustering module based on a neural network, the Self-Organization Map (SOM), which performs affinity grouping tasks on database records. The other is a rule extraction module based on rough set theory, which can extract association rules for each homogeneous cluster of data records and can also find the relationships between clusters.

Fig. 1. Data mining steps.

2. A proposed data mining procedure

Data mining is an interactive and iterative process involving several steps. This article presents a procedure of detailed data mining steps, outlined in Fig. 1. First, customers browse the electronic store and place orders. At the same time the customers' information and transactions are recorded in the database. The proposed data mining procedure is then applied to analyze customers' purchase records. After the data mining procedure is performed, the system automatically creates association rules from the database. The rules are significant facts, patterns or knowledge about customers' data and purchase orders. The rules can be utilized in support of marketing strategy, on-line promotion strategy, target market selection, etc. Equipped with this knowledge in the form of rules, the system can provide one-to-one recommendations based on the characteristics and habits of each customer's specific market segment. The proposed procedure for mining a database for on-line recommendation is as follows.

2.1. Step 1: selection and sampling

A database consists of current and historical detailed data items, summarized data items, metadata, etc. Step 1 selects the target database tables, dimensions, attributes, and records for data mining. It consists of four activities: creating a fact table, selecting cause and result dimensions, selecting dimension attributes, and filtering data.

1. Creating a fact table. Generally speaking, there are many tables in an enterprise's databases. Only the tables that correlate closely with the mining purpose are taken into account. For example, there may be a vendor table, a product table, a sale table, and a member table in a database (Fig. 2). The fact table then contains four sets of attributes, i.e., vendor, product, member, and sale. Users next select dimensions from the fact table.
2. Selecting dimensions. To analyze the database records and extract association rules, important dimensions should be selected from the fact table. All dimensions of great interest to the analysts are selected, and the relationships among the attributes within a dimension or between dimensions are explored. For example, the manager can select the dimension member from the fact table and cluster the customers.
3. Selecting attributes. The selected tables usually have multiple attributes, but not all of them are considered in the analysis because attributes are of different levels of importance to the manager. Therefore, selected attributes, each with its own weight, are analyzed.
4. Filtering data. In order to get the appropriate database items for analysis, users can retrieve data with constraints such as the range of an attribute. This step also involves the removal of noise and the handling of missing data fields. The analysts can also take random samples or a selected part of the records from the database for analysis. For instance, if the dimension member is selected, attributes may include customers' education, job, gender, etc.

Fig. 2. Fact table of a database.

2.2. Step 2: transformation and normalization

Before data mining is performed on a data set, the raw data should be normalized and/or scaled if necessary. Since this paper uses a neural network for data clustering, we transform data into values between 0 and 1. Two different approaches can be used to transform data.

1. If the attribute is numerical, then use the following equation:

$$\mathit{response\_value}_{jk} = \frac{\mathit{value}_{jk} - \min(\mathit{att}_j)}{\max(\mathit{att}_j) - \min(\mathit{att}_j)}, \qquad (6)$$

where $\mathit{response\_value}_{jk}$ is the normalized value of the jth attribute of record k, $k \in [1, p]$, $\min(\mathit{att}_j)$ is the minimum value of the jth attribute, $\max(\mathit{att}_j)$ is the maximum value of the jth attribute, and $\mathit{value}_{jk}$ is the original value of the jth attribute of record k.
2. If the attribute is non-numerical, then a specific data transformation scaling can be designed. For instance, the data type of the attribute job is character. We thus need to transform it into a normalized numerical response value.

When all attributes have the same measurement and ranges, we then proceed to the association rules extraction process.
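The two transformations of Step 2 can be sketched as follows. The min-max function follows the normalization formula for numerical attributes; the coding table for the non-numerical attribute is purely hypothetical (the paper does not give its scaling), as are the function and variable names and the choice to map a constant attribute to 0.

```python
def min_max_normalize(values):
    """Scale numerical attribute values into [0, 1] by min-max normalization."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: a constant attribute carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical ordinal coding for a non-numerical attribute (L/N/H levels).
EDUCATION_SCALE = {"L": 0.0, "N": 0.5, "H": 1.0}

ages = [20, 35, 50]  # illustrative numerical attribute
education = ["N", "L", "H"]  # illustrative non-numerical attribute

print(min_max_normalize(ages))                 # [0.0, 0.5, 1.0]
print([EDUCATION_SCALE[e] for e in education])  # [0.5, 0.0, 1.0]
```

After both transformations every attribute lies in the same [0, 1] range, which is what the clustering network in the next step expects.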

2.3. Step 3: data mining of association rules

In this paper, we focus on clustering and association rules extraction among the data mining tasks. Clustering is conducted before association rules are extracted. We first use a neural network based clustering module; then a rough set theory based rule extraction module is employed to discover association rules that explain the characteristics of each cluster and the relationships of attributes among different clusters.

2.3.1. Clustering module

Kohonen proposed SOM in 1980 (Flexer, 1999). It is an unsupervised two-layer network that can recognize a topological map from a random starting point. The result of SOM shows the natural relationships among the input attributes. With SOM we can group an enterprise's customers, products, and suppliers into clusters. According to the characteristics of the different clusters, different marketing recommendation strategies may be adopted by making use of the corresponding discovered association rule sets. In a SOM network, input nodes and output nodes are fully connected with each other. Each input node contributes to each output node with a weight. Fig. 3(a) and (b) show the network structure and the flow chart for the training phase of the clustering module, respectively. In our system, analysts can set the number of output nodes (the cluster number), the learning rate, the radius rate, the convergence error rate, etc. For instance, we can choose two attributes, education and job, from the table member as the SOM input nodes. The output nodes are set to be nine clusters. The SOM network first determines the winning node using the same procedure as a competitive layer. The weight vectors of all neurons within a certain neighborhood of the winning neuron are then updated. After the SOM network converges, we use those weights to split the data set in the dimension table. According to the weights, data items can be assigned to their corresponding clusters.

Fig. 3. (a) Neural network structure of SOM and (b) the flow chart of the SOM training procedure.
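The training loop described above (find the winner, update the winner and its neighborhood, shrink the learning rate and radius, stop on convergence) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the output nodes are arranged on a line, the learning rate and radius decay multiplicatively with a floor, and training stops when the largest weight change in an epoch falls below a tolerance. The default values 1.0, 2.0, 0.98 and 0.01 echo the settings reported in Section 3; `tolerance`, `max_epochs`, `seed` and all names are hypothetical.

```python
import random


def winner(weights, record):
    """Return the index of the output node closest to the record."""
    def squared_distance(node):
        return sum((w - x) ** 2 for w, x in zip(weights[node], record))
    return min(range(len(weights)), key=squared_distance)


def train_som(records, n_nodes, learning_rate=1.0, radius=2.0,
              change_rate=0.98, floor=0.01, tolerance=1e-3,
              max_epochs=500, seed=0):
    """Train a one-dimensional SOM and return one weight vector per node."""
    rng = random.Random(seed)
    dim = len(records[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for _ in range(max_epochs):
        largest_change = 0.0
        for record in records:
            win = winner(weights, record)
            for node in range(n_nodes):
                if abs(node - win) > radius:
                    continue  # outside the winner's neighborhood
                for d in range(dim):
                    delta = learning_rate * (record[d] - weights[node][d])
                    weights[node][d] += delta
                    largest_change = max(largest_change, abs(delta))
        # Assumed schedule: multiplicative decay with a lower bound.
        learning_rate = max(floor, learning_rate * change_rate)
        radius = max(floor, radius * change_rate)
        if largest_change < tolerance:
            break  # assumed convergence test
    return weights


# Illustrative normalized records with two attributes each.
records = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
weights = train_som(records, n_nodes=3)
clusters = [winner(weights, r) for r in records]
print(clusters)
```

Once training ends, `winner` doubles as the assignment step: each record, old or new, is placed in the cluster of its closest output node.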

2.3.2. Rule extraction module

Rough set theory can extract association rules for each homogeneous cluster of data records and can also find the relationships of attributes among different clusters. For instance, Table 1 presents the clustering results of members. The attributes of the records were normalized and clustered, where GID is the cluster number and Mem(i) is the member's identification number. Fig. 4 shows the flow chart of implementing the rule extraction module. With the rule extraction module, we first use rough sets to extract the rules that describe the characteristics of each cluster.

Table 1
Records of members

Member ID   Education (E)   Job (J)   GID
Mem(1)      N               N         B
Mem(2)      N               H         A
Mem(3)      N               H         B
Mem(4)      L               L         C
Mem(5)      H               H         A
Mem(6)      H               H         A
Mem(7)      L               N         C

Table 2
Results equivalence classes

K   GID   Xk
1   A     {Mem(2), Mem(5), Mem(6)}
2   B     {Mem(1), Mem(3)}
3   C     {Mem(4), Mem(7)}

2.4. Characterization of each cluster

1. Generate result equivalence classes. Let $X_k$ denote the result equivalence class, a set of objects, for cluster k of a given dimension table. Table 2 lists the result equivalence classes for the dimension table member.
2. Generate cause equivalence classes. Let $Y_{ij}$ denote the cause equivalence class, a set of objects, for a specific attribute i with value j ($A_{ij}$). Table 3 gives the cause equivalence classes.

Table 3
Cause equivalence classes

Aij             Yij
i = E, j = N    YEN = {Mem(1), Mem(2), Mem(3)}
i = E, j = H    YEH = {Mem(5), Mem(6)}
i = E, j = L    YEL = {Mem(4), Mem(7)}
i = J, j = N    YJN = {Mem(1), Mem(7)}
i = J, j = H    YJH = {Mem(2), Mem(3), Mem(5), Mem(6)}
i = J, j = L    YJL = {Mem(4)}

Fig. 4. The flow chart of implementing rough set.

3. Create lower approximation rules. Let $A^{X_k}_{*ij}$ denote the set of objects that all have the same attribute value $A_{ij}$ and are all contained in cluster k (the result equivalence class). Here Eq. (1) is used to create lower approximation rules. For example, for $X_1$ = {Mem(2), Mem(5), Mem(6)}, where GID = A (Table 2), the objects of each cause equivalence class that fall in $X_1$ are $A^{X_1}_{EN}$ = {Mem(2)}, $A^{X_1}_{EH}$ = {Mem(5), Mem(6)}, $A^{X_1}_{EL} = \emptyset$, $A^{X_1}_{JN} = \emptyset$, $A^{X_1}_{JH}$ = {Mem(2), Mem(5), Mem(6)}, and $A^{X_1}_{JL} = \emptyset$. Since $A^{X_1}_{EH}$ is not an empty set and every object of $Y_{EH}$ is contained in $X_1$, we have found a lower approximation rule, which is `R1: If Education = H then GID = A'. All the lower approximation rules can be found in the same way. The confidence of every lower approximation rule is 100%.
4. Create upper approximation rules and compute confidences. Let $A^{*X_k}_{ij}$ denote the set of objects that have the same attribute value $A_{ij}$ and of which only some, but not all, belong to cluster k. Here Eqs. (2) and (3) are applied to create upper approximation rules and to compute the rules' confidences.
4.1. Create upper approximation rules. Continuing with the example above, for GID = A only some of the objects in $Y_{EN}$ and $Y_{JH}$ are contained in $X_1$. That is, the cause equivalence classes of Education = N and Job = H contain more objects than those they share with the result equivalence class $X_1$; not all the objects in these two cause equivalence classes exist in $X_1$. Hence we create two upper approximation rules for $X_1$, which are `R2: If Education = N then GID = A' and `R3: If Job = H then GID = A'.
4.2. Compute the confidence of each upper approximation rule. Eq. (3) is used to compute the confidence of an upper approximation rule. Analysts can assign a threshold value (minimum confidence) to determine which rules will be accepted. For instance, for the rule `R2: If Education = N then GID = A', the confidence is 1/3. In Eq. (3), $\mathrm{Num}(X \cap I(x))$ is the number of objects that occur both in $X_1$ and $Y_{EN}$, and $\mathrm{Num}(I(x))$ is the number of objects in $Y_{EN}$:

$$\mathrm{CF}(A^{*X_1}_{EN}) = \frac{1\ (\text{i.e., } Mem(2))}{3\ (\text{i.e., } Mem(1), Mem(2), Mem(3))} = \frac{1}{3} \approx 0.33$$

For the rule `R3: If Job = H then GID = A', the confidence is 3/4:

$$\mathrm{CF}(A^{*X_1}_{JH}) = \frac{3\ (\text{i.e., } Mem(2), Mem(5), Mem(6))}{4\ (\text{i.e., } Mem(2), Mem(3), Mem(5), Mem(6))} = \frac{3}{4} = 0.75$$

If the minimum confidence is set to 75%, then of the two upper approximation rules only R3 will be accepted.
5. Create new combinatorial rules. Following the above procedure, all the single-attribute association rules will be found. Nevertheless, relationships also exist among two or more attributes. In the following we consider two or more attributes to generate association rules, called combinatorial rules. Which attributes are suitable to combine, and how to combine them, is quite challenging. To answer those questions, we join each class in the cause equivalence classes. Fig. 5(a) displays the results of joins of any two attributes. For the example above, $A^{*X_1}_{EN}$ leads to rule R2, whose confidence is 1/3. We combine $A_{EN}$ with the other attribute, job, to find whether there are any combinatorial rules. Fig. 5(b) shows the result. Only one node, $A^{X_1}_{EN,JH}$, is not empty. The node $A^{X_1}_{EN,JH}$ combines the two classes $A^{X_1}_{EN}$ and $A^{X_1}_{JH}$. $X_1 \cap A_{EN,JH}$ contains only one object, Mem(2), so $\mathrm{Num}(X_1 \cap A_{EN,JH})$ is 1. The confidence of $A^{X_1}_{EN,JH}$ is calculated as follows:

$$\mathrm{CF}(A^{X_1}_{EN,JH}) = \mathrm{Min}\left(\frac{1}{3}, \frac{1}{4}\right) = \frac{1}{4} = 0.25$$

Here we define the confidence function of an n-attribute combinatorial rule:

$$\mathrm{CF}(x) = \mathrm{Min}\left(\frac{\mathrm{Num}(X_i \cap I(x))}{\mathrm{Num}(I(x))}, \frac{\mathrm{Num}(X_j \cap I(x))}{\mathrm{Num}(I(x))}, \ldots, \frac{\mathrm{Num}(X_n \cap I(x))}{\mathrm{Num}(I(x))}\right), \qquad (7)$$

where $\mathrm{CF}(x) \in [0, 1]$. So we create a new combinatorial rule `If Education = N and Job = H then GID = A (CF = 0.25)'. The confidence of $A^{X_1}_{EN,JH}$ is smaller than that of $A^{*X_1}_{EN}$.

Fig. 5. (a) Joins of any two single equivalence classes, (b) the combinatorial rules found for X1 by joining cluster EN and clusters JN, JH and JL, and (c) rules found for cluster A.

6. Explain the characteristics of each cluster. Lower approximation rules, upper approximation rules and combinatorial rules that have the same result equivalence class are combined to explain the meaning of each cluster. For example, rules R1 to R3 and the combinatorial rule characterize cluster A as follows: members with a high education level belong to cluster A with confidence 100%; members whose education level is normal and whose job position is high belong to cluster A with confidence of about 1/4; and members whose job position is high belong to cluster A with confidence of about 3/4. Fig. 5(c) shows the final characterization of cluster A in terms of association rules.
7. Go to (3), process the next result equivalence class $X_2$, and repeat until all the result equivalence classes are completed, then stop.
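The characterization steps can be replayed on the seven member records of Table 1 with a short sketch. The single-attribute part is a direct application of the confidence function: a rule with confidence 1 is a lower approximation rule, and a rule with confidence strictly between 0 and 1 is an upper approximation rule. For combinatorial rules, the sketch adopts one possible reading of the n-attribute confidence: for each joined class, divide the number of objects shared by the cluster and all joined classes by the size of that class, then take the minimum. This reading is an assumption, chosen because it reproduces the worked value Min(1/3, 1/4) = 0.25; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Table 1: member number -> (Education, Job, GID)
MEMBERS = {
    1: ("N", "N", "B"), 2: ("N", "H", "A"), 3: ("N", "H", "B"),
    4: ("L", "L", "C"), 5: ("H", "H", "A"), 6: ("H", "H", "A"),
    7: ("L", "N", "C"),
}
ATTRIBUTES = {"Education": 0, "Job": 1}


def result_class(gid):
    """X_k: the members assigned to cluster `gid`."""
    return {m for m, row in MEMBERS.items() if row[2] == gid}


def cause_classes():
    """Y_ij: the members sharing value j of attribute i."""
    classes = {}
    for name, column in ATTRIBUTES.items():
        for member, row in MEMBERS.items():
            classes.setdefault((name, row[column]), set()).add(member)
    return classes


def single_rules(gid):
    """Confidence of every single-attribute rule pointing to `gid`."""
    x_k = result_class(gid)
    rules = {}
    for condition, y_ij in cause_classes().items():
        cf = Fraction(len(x_k & y_ij), len(y_ij))
        if cf > 0:  # cf == 1: lower approximation rule, otherwise upper
            rules[condition] = cf
    return rules


def combinatorial_rules(gid):
    """Two-attribute rules under the assumed reading of the Min formula."""
    x_k = result_class(gid)
    rules = {}
    for (c1, y1), (c2, y2) in combinations(cause_classes().items(), 2):
        if c1[0] == c2[0]:
            continue  # join only classes of different attributes
        shared = x_k & y1 & y2
        if shared:
            rules[(c1, c2)] = min(Fraction(len(shared), len(y1)),
                                  Fraction(len(shared), len(y2)))
    return rules


# Rules for cluster A: Education=H -> 1, Education=N -> 1/3, Job=H -> 3/4
print(single_rules("A"))
print(combinatorial_rules("A")[(("Education", "N"), ("Job", "H"))])  # 1/4
```

Filtering the returned dictionaries by a minimum confidence, for example keeping only entries with a value of at least 3/4, corresponds to the analyst's threshold in step 4.2.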

2.5. Association of different clusters

Rough set theory can be applied to explain the characteristics of each cluster and can also be used to analyze the relationships between two different clusters. For instance, it may discover rules about which groups of products are preferred by which clusters of customers. Electronic stores can use such rules to build a one-to-one marketing strategy for database marketing: product catalogs, advertisement, promotion, customized service, on-line recommendation, and so on. The procedure for discovering the relationships between different clusters is similar to the procedure described in Section 2.4. Table 4 displays the records of the sales table. The member ID in the members' table can be mapped to the corresponding member cluster number. For example, if the buyer ID is `3', his member cluster number is `1' according to the member clustering table. Table 5 shows the members, who could be gift buyers or gift receivers, and their cluster numbers. Similarly, products can be clustered. Table 6 shows some of the records after buyer, receiver, and product are transformed into cluster numbers. According to Table 6, we obtain the result equivalence classes for receivers (Table 7), the cause equivalence classes for buyers (Table 8) and the cause equivalence classes for products (Table 9).

Table 4
Records of orders

OID   Buyer   Receiver   Product
1     3       5          10
2     3       2          25
3     1       6          13
4     2       4          24
5     2       1          15
6     2       2          26

Table 5
Members and their corresponding cluster number

Member ID   Cluster
1           2
2           2
3           1
4           3
5           3
6           2

Table 6
Records of orders after being clustered

OID   Buyer   Receiver   Product
1     1       3          7
2     1       2          3
3     2       2          6
4     2       3          2
5     2       2          7
6     2       2          6

Table 7
Results equivalence classes of gift receivers

Receiver   Xi
2          {Obj(2), Obj(3), Obj(5), Obj(6)}
3          {Obj(1), Obj(4)}

Table 8
Cause equivalence classes of buyers

Aij                  Yij
i = Buyer, j = 1     YBuyer,1 = {Obj(1), Obj(2)}
i = Buyer, j = 2     YBuyer,2 = {Obj(3), Obj(4), Obj(5), Obj(6)}

Table 9
Cause equivalence classes of products

Aij                    Yij
i = product, j = 2     Yproduct,2 = {Obj(4)}
i = product, j = 3     Yproduct,3 = {Obj(2)}
i = product, j = 6     Yproduct,6 = {Obj(3), Obj(6)}
i = product, j = 7     Yproduct,7 = {Obj(1), Obj(5)}

Take receiver = 2 as a result equivalence class for illustration: X2 = {Obj(2), Obj(3), Obj(5), Obj(6)}. Applying the rule extraction module, the resultant lower approximation rules are `R1: if product = 3 then receiver = 2 (CF(x) = 1)' and `R2: if product = 6 then receiver = 2 (CF(x) = 1)'. The resultant upper approximation rules are `R3: if buyer = 1 then receiver = 2 (CF(x) = 0.5)' and `R4: if buyer = 2 then receiver = 2 (CF(x) = 0.75)'. Rules R1 and R2 mean that product clusters 3 and 6, respectively, may be the favorites of gift receiver cluster 2. EC virtual stores can use those rules to recommend products on-line to their customers based on the corresponding customer market segments and characteristics. If customers want to purchase gifts for their friends, the store can recommend products based on the friends' preferences, so that both the customers' and their friends' needs are satisfied.

Fig. 6. (a) The user interface of the clustering module, (b) the user interface of the rule extraction module and (c) some of the rules found for a given result equivalence class `Friend ID cluster 1'.

3. Application in EC and discussions

To further illustrate the proposed data mining procedure, we conduct data mining on the purchase records of a store. There are 1120 records in the product table and 35 records in the customer table. Following the proposed procedure, we select dimensions, attributes and samples for clustering customers and products. To simplify, we take as a sample the records of one whole day in the sales table, 2000 records in total. We chose customers' job, education and gender for clustering customers, and chose products' sales price, import price and the sale price for VIP customers for clustering products. After the normalization step, the data mining step, consisting of the clustering module and the rule extraction module, is performed. Fig. 6(a) is the user interface showing the results of SOM clustering for products. Here the radius is set to 2.0 and the learning rate to 1.0. Initial weights are randomly assigned. The change rates of the radius and the learning rate are both set to 0.98. The lowest radius and learning rate are 0.01. When the error rate is lower than 0.1, the network has converged and the final weights are obtained; we then compute the final score of each record with the final weights. According to the final score, each record is assigned to its appropriate cluster.
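The cross-cluster rules of Section 2.5, which this application relies on, can be reproduced from Tables 4 to 6 with a short sketch. The order records and the member clusters are taken from Tables 4 and 5; the product-to-cluster mapping is not listed in the paper and is read off here by comparing Tables 4 and 6. The function names are hypothetical, and the confidence is the single-attribute confidence function of Eq. (3).

```python
from fractions import Fraction

# Table 4: OID -> (buyer ID, receiver ID, product ID)
ORDERS = {1: (3, 5, 10), 2: (3, 2, 25), 3: (1, 6, 13),
          4: (2, 4, 24), 5: (2, 1, 15), 6: (2, 2, 26)}
# Table 5: member ID -> member cluster
MEMBER_CLUSTER = {1: 2, 2: 2, 3: 1, 4: 3, 5: 3, 6: 2}
# Inferred by comparing Tables 4 and 6: product ID -> product cluster
PRODUCT_CLUSTER = {10: 7, 25: 3, 13: 6, 24: 2, 15: 7, 26: 6}


def clustered_orders():
    """Replace raw IDs by cluster numbers, giving the rows of Table 6."""
    return {
        oid: {"buyer": MEMBER_CLUSTER[buyer],
              "receiver": MEMBER_CLUSTER[receiver],
              "product": PRODUCT_CLUSTER[product]}
        for oid, (buyer, receiver, product) in ORDERS.items()
    }


def rules_for(result_attr, result_value, cause_attrs):
    """Confidence of each `cause = value` rule pointing to the result class."""
    rows = clustered_orders()
    x = {oid for oid, row in rows.items() if row[result_attr] == result_value}
    rules = {}
    for attr in cause_attrs:
        for value in sorted({row[attr] for row in rows.values()}):
            y = {oid for oid, row in rows.items() if row[attr] == value}
            cf = Fraction(len(x & y), len(y))
            if cf > 0:
                rules[(attr, value)] = cf
    return rules


rules = rules_for("receiver", 2, ["buyer", "product"])
for (attr, value), cf in rules.items():
    kind = "lower" if cf == 1 else "upper"
    print(f"if {attr} = {value} then receiver = 2 (CF = {float(cf):.2f}, {kind})")
```

Besides R1 to R4 of the text, this sketch also reports `product = 7` with confidence 1/2, since one of the two orders in product cluster 7 goes to receiver cluster 2; the text does not list that rule.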
Next we use the rough set based rule extraction module to generate association rules describing the relationships between different clusters. Fig. 6(b) shows the user interface that applies rough set theory to create lower and upper approximation rules. The total number of rules obtained is 99, and Fig. 6(c) shows only the portion of rules whose result equivalence class is `Friend ID' with cluster = `1', with the threshold (minimum confidence) set to 0.2. For example, one of the rules is `If MEM_ID = `7' and PRODUCT_ID = `9' then Friend = `1' (CF = 0.57)'. This rule can help make better decisions in on-line recommendation. For example, suppose a person wants to buy a present for a friend but has no idea what present would suit the friend. If he belongs to customer cluster `7' and his friend belongs to customer cluster `1', then the on-line one-to-one marketing system will recommend to him the top-selling products that belong to product cluster `9'.

4. Conclusions

In this paper our proposed data mining procedure integrates a neural network, SOM, and rough set theory into the clustering and rule extraction modules, respectively, to discover association rules. The clustering module separates the products and customers into groups prior to the data mining process. The rule extraction module characterizes each cluster and describes the relationships among different clusters in the form of association rules. These rules can help an electronic store to perform customer and product fragmentation, one-to-one on-line marketing by recommending products accordingly, analysis of customers' favorites, and so on. Furthermore, analysts can customize a questionnaire to select the important attributes for clustering their customers, such as astrology analysis, psychological tests or blood type. According to different combinations of multiple attributes, marketing managers can discover customers' behaviors and rules from a database with the proposed data mining procedure. The data mining system presented above can help discover many rules, but there is still room for improvement. A more intelligent system may be developed to dynamically include feedback and attributes for analysis.
Regarding the market segmentation, the proposed clustering method, although a natural way of clustering, may take into account more factors, such as customers' and products' pro®les, purchase purposes, etc. Besides,

S.W. Changchien, T.-C. Lu / Expert Systems with Applications 20 (2001) 325±335

more efficient association rules mining and filtering algorithms may also be the future work.

Acknowledgements

This research was supported by the National Science Council, Taiwan, R.O.C., under contract no. NSC-892416-E-324-023.

References

Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining Association between sets of items in massive database. International Proceedings of the ACM–SIGMOD International Conference on Management of Data, 207–216.
Agrawal, R., & Srikant, R. (1995). Fast algorithms for mining association rules. In Proceedings of the International Conference on Very Large Data Bases (pp. 407–419).
Anand, S. S., Patrick, A. R., Hughes, J. G., & Bell, D. A. (1998). A data mining methodology for cross-sales. Knowledge-Based Systems, 10, 449–461.
Basu, A. (1998). Theory and methodology perspectives on operations research in data and knowledge management. European Journal of Operational Research, 111, 1–14.
Bayardo, R. J., & Agrawal, R. (1999). Mining the most interesting rules. In Proceeding of the Fifth ACM–SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 145–154).
Bayardo, R. J. (1998). Efficiently mining long patterns from databases. In Proceeding of the ACM–SIGKDD International Conference on Management of Data (pp. 85–93).
Berry, M. J. A., & Linoff, G. (1997). Data mining techniques: for marketing, sales, and customer support. New York: Wiley.
Bhattacharyya, S. (2000). Evolutionary algorithms in data mining: multiobjective performance modeling for direct marketing. In Proceeding of the ACM–SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 465–473).
Brijs, T., Goethals, B., Swinnen, G., Vanhoof, K., & Wets, G. (2000). A data mining framework for optimal product selection in retail supermarket data: the generalized PROFSET model. In Proceeding of the ACM–SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 300–304).
Chan, C. C. (1998). A rough set approach to attribute generalization in data mining. Information Sciences, 107 (1-4), 169–176.
Deogun, S., Raghavan, V., Sarkar, A., & Sever, H. (1997). Data mining: trends in research and development. Rough sets and data mining: analysis of imprecise data, Kluwer Academic Publishers.
Dhond, A., Gupta, A., & Vadhavkar, S. (2000). Data mining techniques for optimizing inventories for electronic commerce. In Proceeding of the ACM–SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 480–486).
E-Business Implementation June 3, 1999. http://ec.mis.ccu.edu.tw/news/default_stats.htm. Whither B2C?, http://www.maxprofit.com/txtDB/index.asp?id=4505&TypeNo=9&Page=1&kw= http://www.find.org.tw/news/howmany_gb.asp
Flexer, A. (1999). On the use of Self-Organizing Maps for clustering and visualization. In Proceeding of the 3rd International European Conference PKDD 99, Prague, Czech Republic.
Goulbourne, G., Coenen, F., & Leng, P. (2000). Algorithms for computing association rules using a partial-support tree. Knowledge-Based Systems, 13 (2-3), 141–149.
Griffin, G., & Chen, Z. (1998). Rough set extension of Tcl for data mining. Knowledge-Based Systems, 11 (3-4), 249–253.
Ha, S. H., & Park, S. C. (1998). Application of data mining tools to hotel data mart on the Intranet for data base marketing. Expert Systems with Applications, 15, 1–31.
Han, J., & Fu, Y. (1999). Mining multiple-level association rules in large databases. IEEE Transactions on Knowledge and Data Engineering, 11 (5), 798–804.
Houtsma, M., & Swami, A. (1993). Set-oriented mining of association rules. Research report RJ 9567, San Jose, California: IBM Almaden Research Center.
Kaski, S., Honkela, T., Lagus, K., & Kohonen, T. (1998). WEBSOM – self-organizing maps of document collections. Neurocomputing, 21 (1-3), 101–117.
Kingsman, B., Hendry, L., Mercer, A., & de Souza, A. (1996). Responding to customer enquiries in make-to-order companies problems and solutions. International Journal of Production Economics, 219–231.
Kleissner, C. (1998). Data mining for the enterprise. In Proceeding of the 31st Annual Hawaii International Conference on System Science (pp. 295–304).
Lin, T. Y., & Cercone, N. (1997). Rough sets and data mining: analysis of imprecise data, Kluwer Academic Publishers.
Matsuzawa, H., & Fukuda, T. (2000). Mining structured association patterns from databases. In Proceeding of the 4th Pacific–Asia Conference, PAKDD 2000 (pp. 233–244).
Pasquier, N., Bastide, Y., Taouil, R., & Lakhal, L. (1999). Efficient mining of association rules using closed itemset lattices. Information Systems, 24 (1), 25–46.
Shaw, J., Gardner, M., & Thomas, H. (1997). Research opportunities in electronic commerce. Decision Support Systems, 21, 149–156.
Smith, A., & Gupta, N. D. (2000). Neural networks in business: techniques and applications for the operations researcher. Computers and Operations Research, 27 (11-12), 1023–1044.
Tsechansky, S., Pliskin, N., Rabinowitz, G., & Porath, A. (1999). Mining relational patterns from multiple relational tables. Decision Support Systems, 27, 177–195.
Tsur, D., Ullman, J. D., Abiteboul, S., Clifton, C., Motwani, R., Nestorov, S., & Rozenthal, A. (1998). Query flocks: a generalization of association-rule mining. In Proceeding of the ACM–SIGMOD (pp. 1–12).
Vellido, A., Lisboa, P. J., & Meehan, K. (1999). Segmentation of the online shopping market using neural networks. Expert Systems with Applications, 17, 303–314.
Yao, J., Teng, H., & Poh, H. L. (1998). Forecasting and analysis of marketing data using neural networks. Journal of Information Science and Engineering, 14, 843–862.
Zhang, T. (2000). Association rules. In Proceeding of the 4th Pacific–Asia Conference, PAKDD 2000 (pp. 233–244).