Nondeterministic adaptive automata that play matching pennies

Richard C. Windecker
AT&T Bell Laboratories, Middletown, NJ 07748

This paper describes a class of nondeterministic adaptive automata that play the game Matching Pennies. Game theory says that the strategy that minimizes a player's maximum loss in this game is to play heads (H) and tails (T) with equal probability, 0.5. Here we describe how a nondeterministic (i.e., stochastic) adaptive automaton, as one player, A, can learn over a sequence of plays to take advantage of nonrandom patterns of play of its opponent, B. These patterns can be deterministic or nondeterministic. For example, if B plays the repeating sequence HHTTHHTT..., A can learn to match this sequence exactly. If B plays H with probability 0.6, A can learn to play H all the time to maximize expected gain at 60%. If B repeats his choice two plays past with probability 0.8, A can learn to duplicate B's choice two plays past with probability one. The automata that can learn to recognize these patterns are sequential networks of simpler parts, some or all of which are adaptive. In this sense, this can be regarded as a neural network model. The first part of the paper describes the model. The second part gives the results of computer simulation experiments illustrating the types of nonrandom patterns networks can learn, how fast they can learn, and how they sometimes fail to learn the appropriate pattern.
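The three opponent patterns mentioned in the abstract can be checked numerically. The following Python sketch is not the adaptive network described in the paper. It only simulates each pattern for B against the fixed response that the abstract says A can learn, and reports how often A matches B. It assumes that A is the player who gains when the two choices match. All function names, the round count, and the seed are hypothetical choices made for illustration.

```python
import random

def play(opponent, responder, rounds=20000, seed=0):
    """Return the fraction of rounds in which A's choice matches B's.

    Assumption: A gains when the choices match. A sees only B's past plays.
    """
    rng = random.Random(seed)
    b_hist = []
    matches = 0
    for t in range(rounds):
        a = responder(b_hist, t)       # A chooses before seeing B's current play
        b = opponent(b_hist, t, rng)   # B follows one of the patterns below
        matches += (a == b)
        b_hist.append(b)
    return matches / rounds

# --- Opponent (B) patterns taken from the abstract ---

def periodic(hist, t, rng):
    """Deterministic repeating sequence HHTTHHTT..."""
    return "HHTT"[t % 4]

def biased(hist, t, rng):
    """Plays H with probability 0.6."""
    return "H" if rng.random() < 0.6 else "T"

def repeat_two_back(hist, t, rng):
    """Repeats its own choice from two plays earlier with probability 0.8."""
    if len(hist) < 2:
        return rng.choice("HT")        # no two-back history yet
    prev = hist[-2]
    if rng.random() < 0.8:
        return prev
    return "T" if prev == "H" else "H"

# --- Fixed responses for A (the behavior the abstract says A can learn) ---

def follow_cycle(hist, t):
    return "HHTT"[t % 4]

def always_heads(hist, t):
    return "H"

def copy_two_back(hist, t):
    return hist[-2] if len(hist) >= 2 else "H"

if __name__ == "__main__":
    print("periodic B, A follows cycle:  ", play(periodic, follow_cycle))
    print("biased B, A always plays H:   ", play(biased, always_heads))
    print("two-back B, A copies two back:", play(repeat_two_back, copy_two_back))
```

Under these assumptions the match rate for the periodic opponent is exactly one, while the other two rates should lie close to 0.6 and 0.8, the values implied by the abstract. How an automaton arrives at these responses through learning is the subject of the paper itself and is not modeled here.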
