Comparing generalization by humans and adaptive networks
M. Pavel, Mark A. Gluck, Van Henkle. Department of Psychology, Jordan Hall, Bldg. 420, Stanford University, Stanford, California 94305.

Inductive learning is one of the most difficult and least understood aspects of cognition. During supervised learning an organism is exposed to a few examples of stimulus-response pairs (the training set) from which the organism infers how to generate correct responses to many other stimuli. The theoretical problem arises from the fact that there usually are many rules that are consistent with the training set but which generate different responses to the novel stimuli. Unlike deduction, induction has no a priori effective procedure to decide which set of rules is the most appropriate. Thus, induction problems can be considered ill-posed problems in that there are too many very different solutions. Such problems can be solved by introducing additional constraints or objectives that are external to the original problem. One of the central problems for understanding induction in natural (human) or artificial systems is to determine useful constraints or regularization principles that convert the ill-posed problems into well-posed problems. In spite of the difficulties with defining "good" inductions, people appear to be very good at rapidly learning to induce useful rules. Investigation of how people perform induction or generalization is, therefore, interesting not only to students of cognition but also to builders of artificial learning machines. One goal of the study reported here was to examine how people generalize in a simple deterministic categorization task in which each pattern is characterized in terms of known binary features. While we expected certain similarities to emerge across human learners, we anticipated that the particular generalizations might be subject to considerable individual differences.
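To make the ill-posedness concrete, the following minimal sketch counts how many deterministic rules agree with a training set. It assumes four binary features (suggested by the four-input network discussed in this abstract) and uses a hypothetical six-pattern training set; the names `PATTERNS`, `TRAIN`, and `consistent_rules` are illustrative and are not the stimuli or procedure of the study.

```python
from itertools import product

# All 16 patterns over four binary features (an assumption for illustration).
PATTERNS = list(product((0, 1), repeat=4))

# Hypothetical training set mapping pattern -> category; not the study's stimuli.
TRAIN = {
    (0, 0, 0, 0): 0,
    (0, 0, 1, 1): 0,
    (1, 1, 0, 0): 1,
    (1, 1, 1, 1): 1,
    (0, 1, 0, 1): 0,
    (1, 0, 1, 0): 1,
}


def consistent_rules(train):
    """Yield every rule (a dict over all 16 patterns) that agrees with `train`."""
    novel = [p for p in PATTERNS if p not in train]
    # Each novel pattern can be labeled 0 or 1 independently.
    for labels in product((0, 1), repeat=len(novel)):
        rule = dict(train)
        rule.update(zip(novel, labels))
        yield rule


rules = list(consistent_rules(TRAIN))
print(len(rules))  # 2 ** (16 - 6) = 1024 rules fit the same six examples
```

Every one of these rules is equally correct on the training set, so choosing among them requires a constraint from outside the data.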
To test this idea, we used an experimental design that would permit us to observe individual subjects during the learning of a categorization task on a set of training patterns and then allow us to examine the types of categorizations they made on a set of novel test patterns. Later we compared human generalizations to those of a small adaptive network. An interesting class of models to consider for categorization are multi-layered adaptive networks. Because an unconstrained network can make many possible generalizations, some constraints must be imposed if an adaptive network is to predict human generalization performance. An important question to ask is whether or not a network with a specific set of constraints can predict a particular generalization. One possible constraint is to consider the minimal possible network that could solve the training task. Examining the generalization behavior of minimal networks requires a computational method of finding all the solutions to a categorization problem for a given number of hidden units. We used a technique developed by Pavel and Moore (1988), based on linear programming, to enumerate all the solutions for a given number of hidden units in small two-layer networks. The smallest network capable of performing the particular task used in our experiment is a two-layer network with four input units, two hidden units and one output unit. For such a network there are 18 different solutions to the problem, which result in 8 distinct generalizations. The resulting distribution differs from that observed for the human subjects. In fact, only three generalizations found in the results of the human learning experiment were generated by a two-layer network with two hidden units. The same analyses were performed on networks with larger numbers of hidden units. As the number of units increased, the number of human generalizations accounted for by the networks increased, but so did the number of generalizations not exhibited by human subjects.
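The idea of enumerating the solutions of a minimal network and collecting the generalizations they imply can be sketched as follows. This is a brute-force stand-in over a coarse weight grid, not the linear programming technique cited above, and it uses a hypothetical training set (labels follow the exclusive-or of the first two features), so its counts are not expected to match the 18 solutions and 8 generalizations reported here. The weight grid, bias grid, and all names are assumptions made for illustration.

```python
from itertools import product

PATTERNS = list(product((0, 1), repeat=4))  # four binary input features

# Hypothetical training set (XOR of the first two features); not the study's task.
TRAIN = {
    (0, 0, 0, 0): 0, (0, 1, 0, 0): 1, (1, 0, 0, 1): 1, (1, 1, 0, 1): 0,
    (0, 0, 1, 1): 0, (1, 1, 1, 0): 0, (0, 1, 1, 1): 1, (1, 0, 1, 0): 1,
}
NOVEL = [p for p in PATTERNS if p not in TRAIN]

WEIGHTS = (-1, 0, 1)                                    # assumed weight grid
BIASES = (-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5)   # half-integers avoid ties


def threshold_functions(inputs):
    """Return the distinct truth tables one threshold unit computes on `inputs`."""
    tables = set()
    for w in product(WEIGHTS, repeat=len(inputs[0])):
        for b in BIASES:
            tables.add(tuple(
                int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0) for x in inputs
            ))
    return tables


def enumerate_generalizations():
    """Count 4-2-1 networks that fit TRAIN and collect their novel-pattern labels."""
    hidden = sorted(threshold_functions(PATTERNS))
    # Output unit sees (h1, h2); its table is indexed by 2 * h1 + h2.
    outputs = sorted(threshold_functions(list(product((0, 1), repeat=2))))
    index = {p: i for i, p in enumerate(PATTERNS)}
    train_items = [(index[p], y) for p, y in TRAIN.items()]
    novel_idx = [index[p] for p in NOVEL]
    n_networks, generalizations = 0, set()
    for a in range(len(hidden)):
        for b in range(a, len(hidden)):  # unordered pairs of hidden units
            h1, h2 = hidden[a], hidden[b]
            for out in outputs:
                if all(out[2 * h1[i] + h2[i]] == y for i, y in train_items):
                    n_networks += 1
                    generalizations.add(
                        tuple(out[2 * h1[i] + h2[i]] for i in novel_idx)
                    )
    return n_networks, generalizations


n_networks, generalizations = enumerate_generalizations()
print(n_networks, "consistent networks,", len(generalizations), "generalizations")
```

Here a "network" is counted as a distinct pair of hidden-unit functions plus an output function, which is only one possible way to define a solution. Comparing the resulting set of generalizations with the responses of human subjects on the same novel patterns is the kind of analysis the abstract describes.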
We have demonstrated that subjects who learn the same pattern categorization may abstract different principles and therefore show large individual differences in their generalization behavior. Adaptive networks with the minimum number of hidden units exhibit a similar behavior but generalize differently. Thus, the constraint of using the minimum number of hidden units does not alone provide a sufficient constraint on the network models to allow them to model human categorization processes. Currently we are investigating the effects of other constraints.