Computers ind. Engng Vol. 35, Nos 3-4, pp. 583-586, 1998. © 1998 Published by Elsevier Science Ltd. All rights reserved. Printed in Great Britain. PII: S0360-8352(98)00164-8 0360-8352/98 $19.00 + 0.00
Pergamon
MODELING USERS THROUGH AN EXPERT SYSTEM AND A NEURAL NETWORK (1)
C. Moghrabi & M. S. Eid
Computer Science & Industrial Engineering, Université de Moncton, Moncton, N.B., Canada, E1A 3E9
moghrac@umoncton.ca & eids@umoncton.ca
ABSTRACT
With the number of Internet and Web users increasing rapidly, electronic service providers are competing to satisfy and better serve customers looking for information or for advertising channels. A wide variety of browsers, specialized sites, custom-made software, etc. are being offered on a regular basis. However, the user has to filter through a large number of files before finding what he/she is really looking for. This paper presents a user modeling expert system, SIGMA, based on neural networks for encapsulating Internet and Web users' habits and preferences. SIGMA is an artificial intelligence application designed to answer an Internet client's needs and preferences. It analyses the user-supplied demographic data and the monitored transactions, then generates a tailored profile that is ultimately used to filter the information passed on to him/her, in an effort to reduce and hopefully eliminate the time and energy expended in sifting through raw and often unwanted data. © 1998 Published by Elsevier Science Ltd. All rights reserved.
KEYWORDS User modeling, artificial intelligence, expert systems, neural networks, multi-layer, error backpropagation, connectionist expert systems, supervised learning.
INTRODUCTION
A wide variety of Internet and Web browsers, specialized sites, custom-made software, etc., are appearing on the market on a regular basis to compete for the rapidly increasing number of Internet and Web users. These users could be individuals looking for information or companies looking for additional channels of advertising. To gain a competitive edge in this rapidly growing market, electronic service providers find themselves in a difficult position in trying to satisfy and better serve their customers. While using the Internet, an individual user faces the tedious task of filtering through a large number of files before finding what he/she is looking for. A tool that eases this task would benefit the users and lend a competitive edge to the Internet provider.
Selected papers from the 22nd ICC&IE Conference
(1) This research was funded and authorized for publication by Rod Mullins, Inc., an AI systems developer, Moncton, New Brunswick, Canada. To them we extend our appreciation.
Neural networks have been widely and successfully used in expert system applications for medical diagnosis (e.g. Baxt 1990), stock price prediction (e.g. Dutta & Shekhar 1988), automobile fault diagnosis, etc. (Lippman 1989; Miller, Sutton & Werbos 1990; Omidvar 1992; Pao 1989; Pomerlau 1989; Sejnowsky & Rosenberg 1987; Widrow 1989). Based on these successes, the same approaches are now being sought for Internet applications. This paper presents a user modeling expert system, SIGMA, based on neural networks for encapsulating Internet and Web users' habits and preferences. SIGMA is an artificial intelligence application designed to answer an Internet client's needs and preferences. It analyses the user-supplied demographic data and the monitored transactions, then generates a tailored profile that is ultimately used to filter the information passed on to him/her, in an effort to reduce and hopefully eliminate the time and energy expended in sifting through raw and often unwanted data.
EXPERT SYSTEMS VS. NEURAL NETWORKS
The conventional approach to building an expert system requires a human expert to formulate the rules needed for analyzing large and complex amounts of input data. The number of rules in a real system is usually large, and the rules are frequently interrelated. Complicating the situation further, system builders lack the specialized domain knowledge that would allow them to formulate the expert rules independently. These two facts create a need for intensive and lengthy coordination between the human expert and the system builder before the knowledge acquisition is complete, that is, before the expert rules can be extracted and adequately coded into the system.
Since layered neural network architectures can be trained to do the job without encapsulating the expert knowledge itself (e.g. Omidvar 1992), system builders can use neural networks in real-world applications of connectionist expert systems (Gallant 1988). This approach is expected to reduce the lengthy coordination between system builders and human experts, and to make the system more flexible.
SIGMA was developed to reduce, and hopefully eliminate, the time and energy an Internet user expends in sifting through raw and often unwanted data. It is an artificial intelligence application designed to answer an Internet client's needs and preferences: it analyses the user-supplied demographic data and the monitored transactions, then generates a tailored profile that is ultimately used to filter the information passed on to the user. The system was developed in two stages: first as an expert system, then experimentally as a neural network.
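The filtering role of the profile can be illustrated with a minimal sketch. The paper does not describe how SIGMA represents a profile or scores information, so everything here is an assumption made for illustration: the `Profile` class, the topic weights, the averaging rule in `relevance`, and the `threshold` value are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    # Hypothetical interest weights in [0, 1], keyed by topic name.
    interests: dict = field(default_factory=dict)

def relevance(profile, item_topics):
    """Average interest weight over the topics attached to one item."""
    if not item_topics:
        return 0.0
    total = sum(profile.interests.get(t, 0.0) for t in item_topics)
    return total / len(item_topics)

def filter_items(profile, items, threshold=0.5):
    """Keep only the items whose relevance reaches the threshold."""
    return [name for name, topics in items
            if relevance(profile, topics) >= threshold]

# Illustrative data only; none of it comes from the paper.
profile = Profile(interests={"sports": 0.9, "finance": 0.2, "travel": 0.7})
items = [
    ("ski resort guide", ["sports", "travel"]),
    ("bond market report", ["finance"]),
    ("stadium financing", ["sports", "finance"]),
]
kept = filter_items(profile, items)
```

In this toy setup the finance-only item falls below the threshold and is dropped, while the two items that touch a strong interest are passed on.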
SOLUTION 1: AN EXPERT SYSTEM
Because there was not enough statistical data to model the user, the traditional expert system approach was used first for the development of SIGMA. The kinds of information needed by the system and the inference rules to be applied to them were supplied by a human expert, who also examined the results for verification. The concept of an expert system "modeling" the user was thus established. To use the system, the client fills in a demographically oriented questionnaire that is used to produce his/her specific profile. The profile in turn determines the kind of information forwarded to the client.
Interaction between the system/concept developer and the system builder was, as is normally expected, a time-consuming element in the development process. Both sides learned what was feasible and what was wanted during the product's development phase. This is more evident in R&D work, which is the case here.
As the expert system evolved, it became clear that more information was required. However, it was not clear how much information, or which items, were needed. Further, due to time constraints and not knowing whether the profiler would be installed on the user's computer or on the server, it was decided to use a neural network approach in which the training is done on data obtained from the expert system solution. In the meantime, this would allow the service supplier for which the system was developed to collect the training examples that the expert system needs. The supplier can also train with different inputs and outputs according to its evolving needs.
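The path from questionnaire to profile, and from there to training data for the network, might be sketched as follows. The paper does not publish SIGMA's questionnaire fields or inference rules, so the fields (`age`, `occupation`, `has_children`), the rules in `RULES`, and the numeric encoding are all hypothetical.

```python
# Hypothetical rules: each pairs a condition on the answers with a topic.
RULES = [
    (lambda a: a["age"] < 25, "music"),
    (lambda a: a["age"] >= 50, "health"),
    (lambda a: a["occupation"] == "engineer", "technology"),
    (lambda a: a["has_children"], "education"),
]
TOPICS = ["music", "health", "technology", "education"]

def build_profile(answers):
    """Fire every rule whose condition holds and collect the topics."""
    return sorted({topic for cond, topic in RULES if cond(answers)})

def to_training_pair(answers):
    """Encode one questionnaire and its rule-derived profile as vectors.

    The input vector describes the client; the target vector has one
    entry per topic, 1.0 when the rules selected that topic.
    """
    x = [
        answers["age"] / 100.0,
        1.0 if answers["occupation"] == "engineer" else 0.0,
        1.0 if answers["has_children"] else 0.0,
    ]
    profile = build_profile(answers)
    y = [1.0 if t in profile else 0.0 for t in TOPICS]
    return x, y

# Illustrative client, not taken from the paper.
client = {"age": 52, "occupation": "engineer", "has_children": True}
profile = build_profile(client)
x, y = to_training_pair(client)
```

Pairs produced this way are one plausible reading of "training on data obtained from the expert system solution": the rules label the cases, and the network is then trained to reproduce those labels.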
SOLUTION 2: THE NEURAL NETWORK
The well-known supervised learning method of error back-propagation on a multi-layered feed-forward network was used for the development of this system. Building layered feed-forward neural networks is not new in itself. The originality of the present work, and its risk, lie in the fact that the system tries to model human behavior rather than biological processes, which is the more usual application recorded in the literature. To implement a neural network solution, the input cases and the corresponding output patterns need to be comprehensive enough to cover all the required decision regions in the search space. Uncertainties are handled through neurode outputs having more than two discrete values. One inconvenience, however, is the difficulty of achieving, or simply the lack of, the self-explanation components that one has become accustomed to in traditional expert systems.
At the present stage of development, only 10 "real" cases were available, which is certainly not enough to train any network. To experiment with the system, a random set of 1000 new cases was created and tried with a multitude of parameter values. The results obtained are as follows. The learning rate varied from 0.005 to 0.9 and the momentum varied between 0 and 0.9. Linear and/or sigmoid activation and/or output functions were tried. The initial weights and biases were tried with constant values between 0 and 1 as well as with random values drawn from various ranges between -15 and +15. The architectures used were two and three layers, ACONs and OCONs.
With two layers, the best convergence reached a mean square error of 54 to 87. The three-layer networks worked best with 4 cells in the middle layer, with an error varying between -0.5 and 0.5. It is still premature to state that the network learns properly.
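For readers unfamiliar with the method, the following is a minimal sketch of error back-propagation with a momentum term on a three-layer network with 4 cells in the middle layer. It is not the authors' implementation: the class `ThreeLayerNet`, the weight range, the learning rate and momentum values, and the toy training data (logical OR, standing in for the unpublished profile cases) are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class ThreeLayerNet:
    """Input layer, one hidden layer and one output layer of sigmoid units."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the incoming weights of one unit; last entry is the bias.
        self.w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                    for _ in range(n_hidden)]
        self.w_o = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
                    for _ in range(n_out)]
        # Previous weight changes, kept for the momentum term.
        self.dw_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw_o = [[0.0] * (n_hidden + 1) for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w_h]
        o = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0])))
             for row in self.w_o]
        return h, o

    def train_step(self, x, target, rate, momentum):
        h, o = self.forward(x)
        # Output deltas: error times the sigmoid derivative o * (1 - o).
        d_o = [(t - ok) * ok * (1 - ok) for t, ok in zip(target, o)]
        # Hidden deltas: output deltas sent back through the current weights.
        d_h = [hj * (1 - hj) * sum(d_o[k] * self.w_o[k][j]
                                   for k in range(len(d_o)))
               for j, hj in enumerate(h)]
        for k, row in enumerate(self.w_o):
            for j, v in enumerate(h + [1.0]):
                self.dw_o[k][j] = rate * d_o[k] * v + momentum * self.dw_o[k][j]
                row[j] += self.dw_o[k][j]
        for j, row in enumerate(self.w_h):
            for i, v in enumerate(x + [1.0]):
                self.dw_h[j][i] = rate * d_h[j] * v + momentum * self.dw_h[j][i]
                row[i] += self.dw_h[j][i]

def mean_square_error(net, cases):
    total = sum((t - o) ** 2
                for x, y in cases
                for t, o in zip(y, net.forward(x)[1]))
    return total / len(cases)

# Toy data (logical OR); an assumption, not the paper's training cases.
cases = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
         ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
net = ThreeLayerNet(n_in=2, n_hidden=4, n_out=1)
initial_error = mean_square_error(net, cases)
for _ in range(2000):
    for x, y in cases:
        net.train_step(x, y, rate=0.5, momentum=0.5)
final_error = mean_square_error(net, cases)
```

Comparing `initial_error` with `final_error` shows whether training reduced the mean square error on the toy cases. The `rate` and `momentum` arguments correspond to the two parameters varied in the experiments described above.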
More sophisticated methods are still under investigation:
* A classifier will be needed if the network has to "discover" on its own, as opposed to being taught by full example sets.
* The system will have to include a temporal architecture if a database of the "history" of user interactions is to be maintained.
* In an expert system approach, the time-consuming element in the development process is that all the rules and examples have to be defined in advance.
* In a neural network approach, it is the training that is time consuming: finding the right combination of all the parameters is largely a matter of experimentation.
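The experimentation over parameter combinations mentioned in the last point can be organized as a simple grid sweep. This is a sketch under stated assumptions: the end points of the learning rate and momentum ranges follow the text, but the intermediate grid values, the function names, and the placeholder `toy_score` (which does not train anything) are hypothetical.

```python
from itertools import product

# Hypothetical grid; only the range end points come from the text.
GRID = {
    "learning_rate": [0.005, 0.05, 0.5, 0.9],
    "momentum": [0.0, 0.5, 0.9],
    "activation": ["linear", "sigmoid"],
    "hidden_cells": [0, 4],  # 0 stands for the two-layer architecture
}

def configurations(grid):
    """Yield one dict per combination of parameter values."""
    names = list(grid)
    for values in product(*(grid[n] for n in names)):
        yield dict(zip(names, values))

def sweep(grid, evaluate):
    """Return the configuration with the lowest score from `evaluate`."""
    return min(configurations(grid), key=evaluate)

def toy_score(cfg):
    # Placeholder score, NOT a real training run: distance from an
    # arbitrary point, used only to make the sketch executable.
    return abs(cfg["learning_rate"] - 0.05) + abs(cfg["momentum"] - 0.5)

all_configs = list(configurations(GRID))
best = sweep(GRID, toy_score)
```

In practice `evaluate` would train a network with the given configuration and return its error on held-out cases, which is where the time goes: even this small grid requires 48 separate training runs.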
CONCLUSION
The new domains of expert systems and neural networks can help accelerate searches performed on the Internet, making them a valuable tool for Internet service providers. However, it is felt that these techniques can play a complementary role to, rather than replace, the usual computational simulations or other artificial intelligence techniques. One inconvenience experienced during this research, however, is the difficulty of achieving, or simply the lack of, the self-explanation components that one is accustomed to in traditional expert systems.
REFERENCES
Baxt, W. G. (1990) "Use of an Artificial Neural Network for Data Analysis in Clinical Decision-Making: The Diagnosis of Coronary Occlusion", Neural Computation, vol. 2, pp. 480-489.
Dutta, S. & Shekhar, S. (1988) "Bond Rating: A Non-Conservative Application of Neural Networks", Proc. IEEE Int. Conf. on Neural Networks, San Diego, CA.
Gallant, S. I. (1988) "Connectionist Expert Systems", Communications of the ACM, vol. 31, no. 2, pp. 152-169.
Kohonen, T. (1989) Self-Organization and Associative Memory, 3rd ed., Springer-Verlag, New York, NY.
Kraft, L. G. & Campagna, D. S. (1990) "A Comparison Between CMAC Neural Network Control and Two Traditional Adaptive Control Systems", IEEE Control Systems Magazine, April 1990, pp. 36-43.
Lippman, R. P. (1989) "Review of Neural Networks for Speech Recognition", Neural Computation, vol. 1, no. 1, pp. 1-38.
Miller, T. W., Sutton, R. S. & Werbos, P. J., eds. (1990) Neural Networks for Control, MIT Press, Cambridge, MA.
Omidvar, O. M. (1992) Progress in Neural Networks, Ablex Pub., Norwood, NJ.
Pao, Y. (1989) Adaptive Pattern Recognition and Neural Networks, Addison-Wesley, Reading, MA.
Pearl, J. (1988) Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann Pub., San Francisco, CA.
Pomerlau, D. A. (1989) "ALVINN: An Autonomous Land Vehicle in a Neural Network", in Advances in Neural Information Processing Systems, vol. 1, D. Touretzky ed., Morgan Kaufmann Pub., San Mateo, CA.
Sejnowsky, T. & Rosenberg, C. (1987) "Parallel Networks That Learn to Pronounce English Text", Complex Systems, vol. 1, pp. 145-168.
Specht, D. F. (1988) "Probabilistic Neural Networks for Classification, Mapping, or Associative Memory", Proc. Int. Conf. on Neural Networks, vol. 2, pp. 525-532.
Widrow, B., Director (1989) DARPA Neural Network Study, Final Report, MIT Lincoln Laboratory Technical Report 840, Lexington, MA.