ARTIFICIAL INTELLIGENCE
RESEARCH NOTE
Heuristic Search in Restricted Memory

P.P. Chakrabarti, S. Ghose, A. Acharya and S.C. de Sarkar
Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, India 721302

ABSTRACT
This paper presents heuristic search algorithms which work within memory constraints. These algorithms, MA* (for ordinary graphs) and MAO* (for AND/OR graphs), guarantee admissible solutions within specified memory limitations (above the minimum required). The memory versus node expansions tradeoff is analyzed for the worst case. In the case of ordinary graphs, some experiments using the Fifteen Puzzle problem are carried out under various pruning conditions. These parameterized algorithms are found to encompass a wide class of best-first search algorithms.
1. Introduction

The space required by heuristic search algorithms like A* and AO* is enormous. This is due to the fact that all the nodes generated by the search algorithm are required to be saved prior to termination. The number of states generated has been shown to be exponential in the length of the solution path [10] and is very large for combinatorially hard problems. Thus it is necessary to devise strategies which terminate with admissible solutions even if enough memory to store all the states is not available. Several attempts have been made to devise algorithms which work within restricted memory. The IDA* algorithm due to Korf [6] is a depth-first iterative-deepening technique for ordinary graphs which guarantees admissibility using minimum memory. However, the number of nodes (or states) generated by it (though asymptotically optimal) is large, because it prunes away all nodes after one iteration and does not retain much path information from one iteration to the next. Moreover, it cannot fruitfully utilize more memory even if available. There is scope for a tradeoff between the number of nodes generated and the number of states saved. IDA* and A* fall on two extremes.
It would be worthwhile to develop algorithms which have the flexibility to use available memory in order to reduce the total number of states generated. Depth_m search due to Ibaraki [5] is another algorithm which works under restricted memory for ordinary graphs. But this algorithm once again tends to fall back upon an essentially depth-first search unless the available memory is very large. Moreover, for small values of m, the number of nodes generated prior to termination is large. In the domain of game tree search, Marsland et al. [7] have proposed a strategy which can be programmed to use only the space available, at some degradation in performance. Another strategy to search game trees in restricted memory is the algorithm ITERSSS* of Bhattacharyya and Bagchi [2].

This paper is concerned with the development of heuristic search algorithms for both ordinary and AND/OR graphs which not only terminate with admissible solutions in restricted memory, but also fruitfully utilize the available memory. These algorithms take as input the amount of additional memory MAX (over the minimum required) and prune away nodes whenever the memory size MAX is exceeded. First we deal with ordinary graphs. An algorithm MA* is presented and analyzed. MA* is studied under different pruning conditions, and modifications are suggested to decrease its selection overhead (which by itself is relatively small compared to that of A*). Next, an algorithm MAO* is presented for acyclic AND/OR graphs. It is shown that several interesting special cases arise from MAO*, one being an iterative-deepening algorithm for AND/OR graphs. Both MA* and MAO* are analyzed for the worst case to show how the variation of MAX affects the number of states generated by the algorithms.
2. Ordinary Graph Search in Restricted Memory

2.1. Notation

s                     start node;
g                     goal node;
n, m, p, q, r, t, u   other nodes;
S(n)                  set of immediate successors of n in the state space;
P, P'                 paths in the search tree rooted at s;
c(m, n)               cost of a path from m to n;
g(n)                  = c(s, n);
h*(n, P)              cost of the minimal cost path from n to g along path P (n is on P);
h(n, P)               heuristic estimate of h*(n, P);
h*(n, m)              = h*(n, P), where m is on P and m is in S(n);
h(n, m)               = h(n, P), where m is on P and m is in S(n);
f(n, P)               = g(n) + h(n, P);
f(P)                  = f(n, P), where n is the tip node of P;
h*(n)                 = min[h*(n, P')] for all P' such that n is on P';
C*                    = h*(s), the cost of the minimal cost path from s to g;
l(P)                  length of P (if P = (n_0, n_1, ..., n_k), l(P) = k);
d(n)                  length of the path from s to n;
|C*|                  = max[l(P)] for all P such that f(P) ≤ C*.
Assumption 2.1. The heuristic estimate is a lower bound, that is, for all nodes and paths h ≤ h*.

2.2. Algorithm MA*

Since the complete state space is huge, the algorithm works on an explicit search tree which initially consists of the start node s (like A* [9]). The algorithm proceeds by extension of the most promising path in the present tree. The path information is kept in the nodes of the tree. Every node n has the values g(n), d(n) and h(n, m) for all m ∈ S(n). At every stage of the algorithm, the tip node of the best path P is selected for expansion and its most promising successor generated. If after expansion n has no more successors to be generated, it no longer remains a tip node of any path in the generated tree. Thus there may be several nodes which are partially expanded, in the sense that some of their successors have been generated while others have not. These nodes still remain tip nodes of some paths in the tree and may be expanded further.

Definition 2.2. For every node n the following are defined:
    hf(n) = min[h(n, m)]   for all m ∈ S(n) which are not present in the explicit tree;
    f(n)  = g(n) + hf(n);
    hF(n) = min[h(n, m)]   for all m ∈ S(n);
    F(n)  = g(n) + hF(n).

The tip node with smallest f(n)-value, which is the tip node of the best path to be extended, is selected for expansion. Ties are resolved for "the deepest path" (or "the most recently generated node"). Since a single successor is generated at a time, the f-value of a node changes due to expansion. The successor m for which h(n, m) is equal to hf(n) is generated as the next child of n. The idea of pathmax is used during node generation: for each child m of n, for each q ∈ S(m),

    if h(m, q) < h(n, m) − c(n, m), then set h(m, q) ← h(n, m) − c(n, m).
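To make this bookkeeping concrete, a minimal Python sketch is given below. The Node class, its field names, and the helper generate_next are illustrative assumptions of ours, not part of the algorithm as specified; h-values are stored per successor, as in the paper.

    # Illustrative sketch only: per-successor h-tables and the pathmax update.
    class Node:
        def __init__(self, name, g, h_succ, cost, parent=None):
            self.name = name
            self.g = g                      # g(n): cost of the path from s to n
            self.h = dict(h_succ)           # h(n, m) for every m in S(n)
            self.cost = dict(cost)          # c(n, m) for every m in S(n)
            self.parent = parent
            self.generated_children = []    # children present in the tree
            self.ungenerated = set(h_succ)  # successors not yet generated
            self.d = 0 if parent is None else parent.d + 1   # depth d(n)

        def hf(self):                       # hf(n): best ungenerated successor
            return min(self.h[m] for m in self.ungenerated)

        def f(self):                        # f(n) = g(n) + hf(n)
            return self.g + self.hf()

        def hF(self):                       # hF(n): best successor overall
            return min(self.h.values())

        def F(self):                        # F(n) = g(n) + hF(n)
            return self.g + self.hF()

    def generate_next(n, h_estimates, costs):
        # Generate n's most promising ungenerated child m: h(n, m) == hf(n).
        m_name = min(n.ungenerated, key=lambda x: n.h[x])
        n.ungenerated.remove(m_name)
        bound = n.h[m_name] - n.cost[m_name]        # h(n, m) - c(n, m)
        # pathmax: raise each h(m, q) that falls below the inherited bound
        h_m = {q: max(hq, bound) for q, hq in h_estimates[m_name].items()}
        m = Node(m_name, n.g + n.cost[m_name], h_m, costs[m_name], parent=n)
        n.generated_children.append(m)
        return m

Here h_estimates and costs are assumed maps from node names to their successor tables. The same fields (parent, d, generated_children) are reused in the sketches that follow.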
This pathmax update effectively makes the heuristic estimate (which is a lower bound) monotonic in nature. Thus, if q is a successor of r on path P, we have h(q, P) + c(r, q) ≥ h(r, P).

The algorithm uses a bottom-up cost revision to revise the F-value of a node n to the smallest f(m)-value amongst all m which are tip nodes of some path P' via n. This helps to retain valuable cost information of pruned nodes in their ancestors. The algorithm MA* is now presented and proved to be admissible. MA* takes the amount of excess available memory MAX as input. It is also proved that MA* requires no more than MAX + |C*| states as memory.
Algorithm MA*(MAX).

(1) [Initialize] OPEN consists of the start node s; CLOSED is empty. Set d(s) ← 0; count ← 0; calculate
    (i) h(s, m) for all m ∈ S(s);
    (ii) hf(s), f(s), hF(s), F(s).

(2) [Select1] Select that node n in OPEN for which f(n) is smallest. [Resolve ties for the node with larger d(n).] Set GL ← f(n).

(3) [Termination] If n is the goal then terminate with success; trace the solution path and report f(n) as the solution cost.

(4) [Expand] Expand n, generating its next successor m.
    (i) Calculate
        (a) h(m, q) for all q ∈ S(m) (use pathmax);
        (b) d(m), hf(m), f(m), hF(m), F(m);
        and set the parent pointer from m to n.
    (ii) If n has no more possible successors then put n in CLOSED after removing it from OPEN, else recalculate hf(n), f(n).
    (iii) Put m in OPEN.
    (iv) Increment count.

(5) [Select2] If f(m) ≤ GL then select m as the next node (now call it n) to be expanded and goto (3).

(6) [Cost revision]
    (i) Put m in a set Z (initially empty).
    (ii) Repeat the following steps until Z is empty:
        (a) Remove a node p from Z. Let q be the parent of p.
        (b) If hF(p) + c(q, p) > h(q, p) then set h(q, p) ← hF(p) + c(q, p), recalculate hF(q), F(q), and put q in Z.

(7) [Continue] If count ≤ MAX then goto (2).

(8) [Pruning]
    (i) Select the leaf node r in OPEN with largest F(r). [Resolve ties for the node with smaller d(r).] (A leaf node has no successors either in OPEN or CLOSED.)
    (ii) If F(r) = F(s) then goto (2).
    (iii) Remove r from OPEN and decrement count.
    (iv) Let t be the parent of r. If t is in CLOSED then remove t from CLOSED and put it in OPEN.
    (v) Recalculate hf(t), f(t).
    (vi) Goto (8.i).
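The pruning loop of step (8) can be sketched as follows; this is our reading of the step under the illustrative Node fields introduced earlier, not code from the paper. It assumes cost revision (step (6)) has already backed the victim's hF-value up into its parent's h-table.

    # Illustrative sketch of step (8): discard costliest leaves, reopening
    # parents so that pruned subtrees can be regenerated later if needed.
    def prune(open_set, closed_set, count, s):
        while True:
            leaves = [x for x in open_set if not x.generated_children]
            # step (8.i): largest F(r), ties for the node with smaller d(r)
            r = max(leaves, key=lambda x: (x.F(), -x.d))
            if r.F() == s.F():             # step (8.ii): stop pruning
                return count
            open_set.remove(r)             # step (8.iii)
            count -= 1
            t = r.parent                   # step (8.iv): reopen the parent
            t.generated_children.remove(r)
            t.ungenerated.add(r.name)      # r may be regenerated later
            if t in closed_set:
                closed_set.remove(t)
                open_set.add(t)            # hf(t), f(t) recompute on demand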
The algorithm MA* presented above is for trees. Its extension to graphs is quite straightforward: both OPEN and CLOSED have to be searched before inserting a node in OPEN. Once pathmax is properly maintained, the heuristic estimate effectively becomes monotone and there is no need to bring a node from CLOSED back to OPEN (see Nilsson [9]). Before presenting some experimental results, the admissibility and correctness of MA* are proved.

Lemma 2.3. At any instant before MA* terminates, there is always a path P in the generated tree such that f(P) ≤ C*.

Proof. Similar to Nilsson [9].
[]
Lemma 2.4. For all nodes n in the generated tree (that is, in OPEN or CLOSED), after bottom-up cost revision (step (6)), we have F(n) = min[f(n, P)] for all P such that n is on P.

Proof. By induction on the steps of the bottom-up cost revision of MA*, taking leaf nodes as the basis. □
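As an illustration of the revision that Lemma 2.4 describes, step (6) can be sketched in the same style (again an assumption-laden sketch of ours, not the paper's code):

    # Illustrative sketch of step (6): back a child's hF-value up into the
    # parent's h-table so pruned subtrees leave their revised cost behind.
    def cost_revise(m):
        Z = [m]
        while Z:
            p = Z.pop()
            q = p.parent
            if q is None:                  # reached the root s
                continue
            backed_up = p.hF() + q.cost[p.name]
            if backed_up > q.h[p.name]:
                q.h[p.name] = backed_up    # hF(q) and F(q) change with it
                Z.append(q)                # keep propagating toward the root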
Lemma 2.5. When a node n is selected for expansion by MA*, we have f(n) = F(s).

Proof. The proof follows from Lemma 2.4 and the fact that the node with minimal f(n)-value in OPEN is always selected for expansion. □

Lemma 2.6. When a path P is extended in step (2) of MA* to a path P' by successive executions of step (5), until MA* reaches step (2) again (after pruning if required), we have f(P') > f(P).

Proof. Due to step (5), extension of a path continues until its cost increases. Now there may be two cases:
Case 1. The path is not pruned. Clearly the lemma holds.
Case 2. Some nodes on this path are pruned and t is the tip node of the remaining path P'. If u is the child of t that has been pruned, then clearly F(u) > F(s). By cost revision we have g(t) + h(t, u) > F(s), and this is the cost of the pruned path in the present tree. Since, by Lemma 2.5, F(s) was the cost of the path when it was selected for extension, clearly the cost of the path has increased. □

Theorem 2.7. Algorithm MA* is admissible, that is, it terminates with minimal cost solutions if h ≤ h*.
Proof. Since the cost of a path increases due to extension (Lemma 2.6), a contradiction similar to [9] can be used to prove that MA* terminates despite pruning (even for infinite graphs, provided the costs on the arcs are positive and finite). Using Lemma 2.3, it can now be proved that MA* is admissible. □

Lemma 2.8. For any node n selected for expansion by MA*, f(n) ≤ C*.

Proof. The proof follows directly from Lemma 2.3.
[]
Lemma 2.9. At any instant of MA*, when a node n is selected for expansion at step (2), we have d(n) ≥ D, where D = count − MAX and count is the value of the variable count at that instant.

Proof. By induction on the sequence of nodes selected for expansion at step (2).
Basis. When s is selected for expansion, count = 0. Since MAX ≥ 0, D ≤ 0. As d(n) ≥ 0 by definition, we have d(n) ≥ D.
Induction step.
Case 1. count ≤ MAX. Then D ≤ 0. Clearly the next node n selected for expansion has d(n) ≥ 0; thus d(n) ≥ D.
Case 2. count > MAX. Then D > 0. This can occur only when n is selected at step (2) of MA* just after pruning has occurred. Consider the previous instant when a node p was selected for expansion at step (2). Let D' = count' − MAX, where count' is the value of the variable count at that instant. Clearly, by the induction hypothesis, we have d(p) ≥ D'. By Lemma 2.5, we also have f(p) = F(s). Now expansion will continue (in step (5)) until, for some child q of p, f(q) > GL occurs. Let d(q) = d(p) + a. If count" is the new value of the variable count, then

    count" = MAX + D' + a = count' + a .

After pruning we have count = MAX + D, with D > 0.

Case 2(a). Cost revision occurs such that F(q) = F(s). Then, since q is a leaf node, f(q) = F(q). Now, because pruning has occurred, we have D ≤ D' + a. Therefore

    d(q) = d(p) + a ≥ D' + a ≥ D .

Also f(q) = min[f(m)] for all m ∈ OPEN (because f(q) = F(s)). So if any node n is selected, it must have d(n) ≥ d(q) due to the tie resolution. Thus d(n) ≥ D.

Case 2(b). F(q) = f(q) > F(s). Then q will surely be pruned or, in general, some nodes on the path from p to q will be pruned, until a leaf node r remains such that F(r) = f(r) = F(s). Now the number of nodes pruned is count" − count. Since at most all the pruned nodes can be from the path along r to q, we have

    d(r) ≥ d(q) − (count" − count)
         = d(p) + a − (MAX + D' + a − MAX − D)
         = d(p) − D' + D
         ≥ D' − D' + D
         = D .
Also f(r) = F(s) = min[f(m)] for all m ∈ OPEN. Again, since for any node n selected at step (2) we have d(n) ≥ d(r), it follows that d(n) ≥ D. □
Theorem 2.10. MA* never uses more memory than MAX + |C*|.

Proof. By Lemma 2.9, whenever a node n is selected at step (2) by MA*, d(n) ≥ D, where D = count − MAX and count is the value of the variable count at that instant. Now, since node selection along a path continues at step (5) until the f-value increases on that path, for any node m selected at step (5) we still have d(m) ≥ D, where D = count − MAX and count is the new value of the variable count. But by Lemma 2.8, for all nodes q selected by MA*, f(q) ≤ C*. Thus we can easily say that d(q) ≤ |C*|. Now, as d(q) ≥ D, we have D ≤ |C*|. Again, since D = count − MAX, we have count ≤ MAX + |C*|. Therefore at no instant does MA* use more memory than MAX + |C*|. □
Corollary 2.11. For arcs with equal cost, MA* requires no more than MAX + l(P*) states as memory, where P* is the minimal cost solution path and l(P*) is its length.

Before we examine the results of MA*, two modifications to the basic algorithm are suggested. Firstly, if, at generation time, nodes which are likely to be pruned soon can be detected and discarded, the pruning overhead will be reduced. This method of pre-emptive pruning is as follows:
Algorithm MA*-PP. Replace steps (4.iii) and (4.iv) of MA* by:

(4.iii) If f(m) ≤ GL then put m in OPEN and increment count, else put n in OPEN if it was in CLOSED.

In MA* the same node may be pruned and reexpanded several times. Therefore, instead of pruning away all possible candidate nodes (which may result in only a few nodes being retained and some pruned nodes immediately reexpanded), it may be useful to retain a certain fraction (θ). A method (termed theta pruning) is given below:
Algorithm MA*-θP. Modify step (8.ii) of MA* as follows:

(8.ii) If count ≤ MAX·θ (0 ≤ θ ≤ 1) or F(r) = F(s) then goto (2).

It may be noted that the admissibility and correctness of MA* are not affected by these modifications. We shall consider the influence of θ on MA* later.
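In terms of the prune() sketch given earlier, theta pruning only changes the exit test of the loop (an illustrative fragment of ours):

    # MA*-thetaP exit test: additionally stop once a fraction theta of MAX
    # is retained; theta = 1 keeps everything the memory allows.
    def theta_exit(count, MAX, theta, r, s):
        return count <= MAX * theta or r.F() == s.F()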
2.3. Variation of memory size

2.3.1. Special cases: IDA*, A*

Two special cases of MA* are interesting, one with MAX equal to a very large value and the other with MAX = 0. For a very large value of MAX pruning does not occur, because step (8) is never executed. In such a case MA* works in a manner similar to A*, except for the fact that one successor is generated at a time. When MAX = 0, MA* becomes an iterative-deepening algorithm (like IDA* [6]), because pruning (which can now be termed backtracking) occurs immediately after path extension (because the path cost increases). However, in MA* the h(n, m)-values which are backed up after revision are used to generate the successors in a best-first manner, unlike IDA*. It is therefore expected that MA* with MAX = 0 will perform better than IDA* in terms of the number of nodes expanded. Experimental results on the Fifteen Puzzle problem (using Korf's data set [6]) show that the average of the ratio of the number of nodes expanded by MA* (MAX = 0) to that expanded by IDA* is about 57%. Some of the results are shown in Table 1. (These examples have been interval sampled by taking every fifth problem from Korf's set.) However, it may be noted that this difference is due to node ordering, and IDA* can also be modified to achieve this improvement. The importance of MA*(MAX), however, is that it bridges the algorithmic gap between IDA* on the one hand and A* on the other. The memory versus node expansions tradeoff is discussed next.

2.3.2. Worst-case analysis

We now study the effect of varying MAX on the number of states generated. We expect that the larger the value of MAX, the fewer the number of reexpansions (and therefore the fewer the total number of states generated). A worst-case analysis is performed for this. A uniform b-ary tree of unit arc cost and depth C* is assumed to be the model. The goal is at a depth of C*. The heuristic estimates are assumed to be zero.

Let k = ⌊log_b MAX⌋. Therefore no pruning occurs until nodes at depth k are examined. The total number of states generated until then is b + b^2 + ... + b^(k−1).
In order to examine nodes at depth k, b^k new nodes have to be generated, and pruning will now occur. (Also, a node which is pruned when F(s) = i has F(n) > i and is not regenerated as long as F(s) = i.) After pruning, only MAX (= b^k) nodes are retained, leaving a shortfall of at most b^k nodes which have to be regenerated for the next depth. Therefore, for searching nodes at depth k + 1, b^k + b^(k+1) nodes have to be generated instead of b^(k+1). In general, for searching nodes at any depth j (k ≤ j ≤ C*),

    b^k + b^(k+1) + ... + b^j  nodes are generated instead of  b + b^2 + ... + b^j .    (1)

We can now state the following result.

Theorem 2.12. MA*, using MAX as the extra memory, provides a savings of an O(MAX) number of nodes over MA*(0).

Proof. Using (1) above, we calculate the savings for k = ⌊log_b MAX⌋:

    (C* − 1)b + (C* − 2)b^2 + ... + (C* − k + 1)b^(k−1)
        = b^k(C* − k + 1)/(b − 1) + b^2(b^(k−2) − 1)/(b − 1)^2 − (C* − 1)b/(b − 1)
        = O(b^k(C* − k + 1)) = O(MAX) in general.  □

For example, with C* = 20, b = 2 and MAX = 1024, the savings obtained is about 13,000 nodes.
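As a quick numerical check of this closed form (our addition, not part of the paper), the direct sum and the closed form can be compared in Python:

    # Worst-case savings of MA*(MAX) over MA*(0) on a uniform b-ary tree.
    def floor_log(MAX, b):
        k = 0
        while b ** (k + 1) <= MAX:         # k = floor(log_b MAX)
            k += 1
        return k

    def savings_direct(C, b, MAX):
        k = floor_log(MAX, b)
        return sum((C - i) * b ** i for i in range(1, k))

    def savings_closed(C, b, MAX):         # assumes k >= 2
        k = floor_log(MAX, b)
        return (b ** k * (C - k + 1) / (b - 1)
                + b ** 2 * (b ** (k - 2) - 1) / (b - 1) ** 2
                - (C - 1) * b / (b - 1))

    print(savings_direct(20, 2, 1024))     # 12246
    print(savings_closed(20, 2, 1024))     # 12246.0, i.e. about 13,000 nodes

Both agree with the figure quoted above.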
The use of heuristic information should increase this savings even further. It may be noted here that a savings of not more than O(MAX) is not surprising, since A* is an optimal algorithm and MA*(0) (like IDA*) is asymptotically optimal to it. This result shows that, though both A* and MA*(0) expand O(b^(C*)) nodes, A* also provides a savings of O(b^(C*)) over MA*(0).

2.3.3. Effect of varying MAX and θ

We now study the effect of variation of MAX and θ in MA*-θP using the Fifteen Puzzle problem. For the heuristic estimate we use a "modified Manhattan" distance defined as follows:

    if n = goal then h'(n, m) = 0 else h'(n, m) = h(n, m) − 0.5 ,

where h(n, m) denotes the Manhattan distance and h'(n, m) the modified Manhattan distance. This modification serves two purposes. Firstly, it reduces the study to the worst case, since all nongoal nodes with f(n) = C* now have f'(n) < C* and are surely expanded. (Since h and g are positive integers, it also ensures that no node with f(n) > C* has f'(n) ≤ C*.) Secondly, it reduces inconsistencies arising out of tie resolutions for nongoal nodes with f(n) = C*.

The results of varying MAX and θ using modified Manhattan distance are presented in Table 2. (As the experiment is forced into the worst case, a large number of nodes are generated, especially for very difficult problems. However, since the objective is merely to highlight the difference in node expansions, a simpler set of problem instances suffices.) It may be noted that the case of MAX = ∞ (which results in A*) is not possible to test for space reasons.

Table 2. Variation of MAX and θ in MA*-θP using modified Manhattan distance

                                                              θ = 0.0                 θ = 0.5
Sequence                                  C*   MAX = 0   MAX = 1000  MAX = 3000  MAX = 1000  MAX = 3000
14,1,9,6,4,8,12,5,0,7,2,3,10,11,13,15     42   112,006    111,882      81,036      80,006      77,976
1,7,6,3,4,8,11,15,5,0,13,9,12,14,2,10     31    37,302     30,077      22,707      24,113      22,268
9,5,7,0,8,14,1,2,10,15,13,6,12,4,11,3     39   177,212    175,581     178,924     153,714     149,428
1,12,2,6,5,15,4,8,0,14,3,7,10,13,9,11     40   143,576    138,674     115,745     113,860     111,605
1,3,15,13,8,0,5,4,12,6,2,11,9,7,14,10     40   125,013    123,148     121,003     113,889     110,874
6,1,0,14,13,10,3,5,8,15,2,7,4,9,11,12     40   139,385    138,093     126,090     121,898     117,785
13,8,14,3,9,1,4,7,15,2,5,10,12,0,6,11     38    70,893     69,480      66,149      68,924      65,924
13,4,6,11,14,8,3,9,0,15,7,1,2,10,12,5     48   182,892    181,412     181,832     179,902     177,686
1,9,5,7,11,13,4,3,14,12,0,2,8,6,10,15     38   133,462    132,096     119,363     114,874     111,789
4,2,7,10,1,3,13,6,8,0,9,5,12,11,14,15     33   101,014     99,106      96,058      95,322      91,839
6,5,9,10,0,11,4,2,1,12,7,3,14,8,13,15     39   332,796    330,524     327,722     330,216     326,431
3,14,9,11,5,4,8,2,13,1,12,6,0,10,15,7     41    74,588     73,264      73,334      72,494      71,611

The results show an expected general decrease in the number of states generated with increasing MAX. It may also be noted that for θ = 0.5 fewer nodes are generated than for θ = 0. This is expected, because if we retain more nodes, fewer reexpansions occur. But in certain cases this reduction is so large that it even outperforms the use of a much larger MAX with θ = 0. This may be explained as follows. Since tie resolution during pruning is done by selecting the node with smaller d(n)-value (step (8.i)), and during node selection by larger d(n) (step (2)), MA* expands one subtree at a time and therefore tends to maintain a "major subtree". It continues to expand this major subtree until its root cost is revised to a higher value. For θ < (some) θ*, this major subtree is pruned when the memory is full. These nodes are reexpanded just after pruning, as this most often remains the major subtree. This extra reexpansion and cost revision causes the total number of states to increase. For θ ≥ θ* this subtree is left intact during pruning, resulting in fewer reexpansions. An example highlighting this effect is shown in Table 3, where there is a sudden drop in the total number of states generated after a certain value of θ.

Table 3. An example of MAX versus θ variation with sudden drops; sequence is (14, 1, 9, 6, 4, 8, 12, 5, 0, 7, 2, 3, 10, 11, 13, 15), C* = 42

MAX      θ = 0.0    θ = 0.25   θ = 0.5    θ = 0.75
0        112,006
25       111,979    111,970    111,957    111,948
50       111,949    111,929    111,910     81,464
100      111,851     98,988     81,291     81,241
250      111,523     81,030     81,046     80,924
500      111,445     80,747     80,680     80,580
1000     111,822     80,166     80,006     79,705
3000      81,036     77,916     77,976     77,954

(For MAX = 0 the value of θ is immaterial.)
Table 4. MA*-φP (MAX = 1000) with φ pruning (φ = 0.8) using modified Manhattan distance

Sequence                                  C*   Nodes expanded
14,1,9,6,4,8,12,5,0,7,2,3,10,11,13,15     42       79,371
1,7,6,3,4,8,11,15,5,0,13,9,12,14,2,10     31       24,773
9,5,7,0,8,14,1,2,10,15,13,6,12,4,11,3     39      153,032
1,12,2,6,5,15,4,8,0,14,3,7,10,13,9,11     40      113,327
1,3,15,13,8,0,5,4,12,6,2,11,9,7,14,10     40      113,677
6,1,0,14,13,10,3,5,8,15,2,7,4,9,11,12     40      121,433
13,8,14,3,9,1,4,7,15,2,5,10,12,0,6,11     38       68,311
13,4,6,11,14,8,3,9,0,15,7,1,2,10,12,5     48      179,974
1,9,5,7,11,13,4,3,14,12,0,2,8,6,10,15     38      114,414
4,2,7,10,1,3,13,6,8,0,9,5,12,11,14,15     33       94,860
6,5,9,10,0,11,4,2,1,12,7,3,14,8,13,15     39      330,390
3,14,9,11,5,4,8,2,13,1,12,6,0,10,15,7     41       72,632
The reason why the value of θ* should decrease as MAX increases (as seen in Table 3) is also clear from the above argument: a smaller θ causes the major subtree to be pruned if MAX is smaller. A dynamic pruning strategy, based on the length of the deepest subpath, to keep θ > θ* is given below in Algorithm MA*-φP. Here, instead of exiting from pruning after a fixed percentage of nodes has been pruned, we prefer to retain those nodes whose depth is closer to the depth of the deepest subpath prior to pruning. This tries to ensure that the major subtree is not pruned. Results for MA*-φP with φ = 0.8 are shown in Table 4.
Algorithm MA*-φP. Step (8.ii) of MA* is modified as follows:

(8.ii) If d(r) > φ·d or F(r) = F(s) then goto (2),

where 0 ≤ φ ≤ 1 and d is the length of the deepest subpath prior to pruning.
2.4. Reducing overheads

The most important overhead in heuristic search algorithms like A* is the node selection time. In practice it is found that for many problems the time taken to handle a large OPEN predominates over the time taken for node expansions. Since the size of OPEN in MA* is bounded, and in general much smaller than that of A*, in many problems MA* would terminate much earlier. Therefore the choice of a small MAX in MA* would suffice. (It may be noted that, instead of MAX = 0, it is better to have a small MAX,
so as to prevent the pruning of the major subtree.) However, when the time taken for node expansions is also important, there is a need to devise some strategy which allows for efficient handling of a large OPEN (since the choice of a small MAX would result in more states being generated). Two strategies are outlined below.
Multiple OPEN (MA*-MO). If OPEN is an ordered list, then insertion time is O(MAX·N), where N is the number of nodes generated. If, instead of a single OPEN, several (say k) are kept and insertion is performed in these small lists in a circular manner, then the selection overhead becomes O(N·k) and the insertion overhead becomes O(N·MAX/k). Therefore the total overhead is O(N·k + N·MAX/k), which is minimized for k = √MAX, leading to an overhead of O(N·√MAX). Another efficient technique is to keep OPEN as a tree. However, the use of lists has the advantage of easy parallelization. The parallel version of MA*-MO is discussed and analyzed in [3].

Virtual OPEN (MA*-VO). In the second strategy, a small list OPEN1 (called "virtual OPEN") is maintained along with OPEN. Newly generated nodes are inserted into OPEN1 only. When it is full, OPEN1 is merged into OPEN and emptied. As in MA*-MO, it can be shown that an optimal choice of the size of OPEN1 (= √MAX) ensures that the total selection time (which now is the sum of selection and merging times) is O(√MAX·N).

An overhead of MA* which is not present in A* is that of pruning. Here again, the node to be pruned is required to be selected and its parent inserted into the ordered list. (This is similar to the selection, expansion and insertion into the list.) The previously mentioned strategies of multiple OPEN and virtual OPEN can also be used for decreasing the pruning overhead. It may be noted that this overhead is also greatly reduced due to the pre-emptive pruning (at generation time) mentioned earlier. In spite of these overheads, the choice of a small MAX is not to be preferred in all cases. In the Fifteen Puzzle problem expansion time is minimal, but in other problems, where both generation of children and calculation of a reasonably accurate heuristic require substantial computation, the choice of a larger MAX would be optimal in terms of actual execution time.
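A sketch of the virtual OPEN bookkeeping follows; the class name, the use of a plain sorted list, and the tie-breaking counter are our assumptions, and MA*'s depth-based tie resolution is omitted for brevity.

    # Illustrative MA*-VO buffer: an unsorted OPEN1 of size about sqrt(MAX)
    # is merged into the sorted OPEN only when full, so insertion cost
    # amortizes to roughly O(sqrt(MAX)) per generated node.
    from itertools import count as counter
    from math import isqrt

    class VirtualOpen:
        def __init__(self, MAX):
            self.open = []                   # sorted list of (f, tick, node)
            self.open1 = []                  # small unsorted buffer
            self.size1 = max(1, isqrt(MAX))  # optimal buffer size ~ sqrt(MAX)
            self.tick = counter()            # tie-breaker between equal f

        def insert(self, f_value, node):
            self.open1.append((f_value, next(self.tick), node))
            if len(self.open1) >= self.size1:
                self.merge()

        def merge(self):
            # one O(MAX)-time merge replaces sqrt(MAX) ordered insertions
            self.open1.sort()
            merged, i, j = [], 0, 0
            while i < len(self.open) and j < len(self.open1):
                if self.open[i] <= self.open1[j]:
                    merged.append(self.open[i]); i += 1
                else:
                    merged.append(self.open1[j]); j += 1
            merged += self.open[i:] + self.open1[j:]
            self.open, self.open1 = merged, []

        def pop_min(self):
            self.merge()                     # selection must see the buffer
            return self.open.pop(0)          # O(n) pop; a deque would avoid it

The same buffering applies unchanged to the insertions performed during pruning.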
3. AND/OR Graph Search under Memory Limitations

The idea of searching in restricted memory is now extended to AND/OR graph search algorithms like AO* [4, 9]. However, the strategy for ordinary graphs is not directly applicable, due to the presence of AND nodes. Firstly, a top-down cost revision cannot be used to make an underestimating heuristic monotone. Secondly, the pruning will now be on the arcs instead of at the nodes. In this section we present an algorithm called MAO* for handling AND/OR graphs.
These AND/OR graphs are assumed to be of the type discussed in [4], where only pure AND and OR nodes are allowed. (As mentioned in [4], this does not restrict the use of the algorithm.) Due to the presence of AND nodes, certain definitions need to be extended. We now have a partial solution graph (psg), which is selected for extension instead of a partial path, and the algorithm seeks to find a minimal cost solution graph (sg) instead of a minimal cost path. The definitions of a psg and an sg are the same as in [4]. The algorithm uses h(n)-values at the nodes (where h(n) is the heuristic estimate of the cost of the minimal cost sg rooted at n) and the backed-up costs from its children to calculate an estimate f(n) at the nodes. The f(n)-values are revised after node expansion. h*(n) is now the cost of the minimal cost sg rooted at n, and C* = h*(s) is the cost of the minimal cost sg rooted at s.

3.1. Algorithm MAO*

Algorithm MAO* takes as parameter the amount of available memory MAX. Like AO* [4, 9], at every node it marks its most promising successors. In addition to the f(n)-value, MAO* keeps in each node all f(n, n_i)-values for each n_i ∈ S(n) (where S(n) denotes the set of immediate successors of n in the implicit graph G). These f(n, n_i)-values are revised by backing up the [f(n_i) + c(n, n_i)]-values during cost revision. Moreover, unlike AO*, during expansion of an OR node MAO* generates only a single successor. (A partially expanded node continues to remain a tip node of the explicit graph G'.) The algorithm modifies the cost revision of AO* by backing up costs only if the new values are greater. This makes the cost function effectively monotone in nature. It also increases the cost of a psg to its "pathmax" value and decreases the worst-case set of nodes expanded (see [4, p. 112]). When the total space exceeds the available memory (MAX), some unmarked nodes and arcs are pruned so as to continue with expansion. In the algorithm, A is the space required by an arc and N the space required by a node.
Algorithm MAO*(MAX).

(1) [Initialize] Initially the explicit graph G' and the marked psg both consist solely of the start node s.
    (i) f(s) ← h(s).
    (ii) If s is an OR node then ∀n_i ∈ S(s), f(s, n_i) ← f(s) and mark some n_i in S(s) (arbitrarily), else if s is an AND node then ∀n_i ∈ S(s), f(s, n_i) ← 0.
    (iii) MEMORY ← N.
    (iv) If s is terminal then label s SOLVED.

(2) [Termination] If s is labelled SOLVED, terminate with success and the marked psg as the solution.

(3) [Expand]
    (3.1) Choose any nonterminal tip node n of the marked psg (rooted at s).
    (3.2) (i) If n is an OR node then generate the single successor arc (n, n'), where n' ∈ S(n) is the marked successor of n.
          (ii) If n is an AND node then generate all arcs (n, n'), n' ∈ S(n).
          (iii) If k arcs are generated then MEMORY ← MEMORY + kA.
          (iv) Create new nodes for those n' which are not already present in G'. If k' new nodes are created then MEMORY ← MEMORY + k'N.
          (v) For each newly created node n":
              (a) f(n") ← h(n").
              (b) If n" is an OR node then ∀m ∈ S(n"), f(n", m) ← f(n") and mark some m (arbitrarily), else if n" is an AND node then ∀m ∈ S(n"), f(n", m) ← 0.
              (c) If n" is terminal then label n" SOLVED.
          (vi) f(n, n') ← max[f(n, n'), f(n') + c(n, n')].

(4) [Cost revision]
    (4.1) Create a set Z of nodes containing only node n.
    (4.2) Repeat the following steps until Z is empty:
        (4.2.1) Remove from Z a node m such that no descendant of m in G' occurs in Z.
        (4.2.2) (i) If m is an OR node then
                    (a) e ← min[f(m, m')], ∀m' ∈ S(m);
                    (b) mark that m' in S(m) for which the above minimum occurs [resolve ties for a SOLVED child; in case of ties again, resolve in favor of the presently marked child];
                    (c) if the marked child is present in G' and is SOLVED then label m SOLVED.
                (ii) If m is an AND node then
                    (a) e ← φ[f(m, m')], ∀m' ∈ S(m) [where φ may be Σ, max, etc.];
                    (b) mark all m' in S(m);
                    (c) if each m' is present in G' and all are SOLVED then label m SOLVED.
                (iii) If f(m) < e then set f(m) ← e.
        (4.2.3) If f(m) changed value at (4.2.2) above, or m is labelled SOLVED, then put in Z all parents p of m such that m is marked in p.

(5) [Continue] If MEMORY ≤ MAX or s is labelled SOLVED then goto (2), else goto (6).

(6) [Prune]
    (6.1) Select some OR node r in G' such that there is a successor arc (r, r') in G' and r' is not marked in r.
    (6.2) If no such node r is found or MEMORY ≤ θ·MAX (0 ≤ θ ≤ 1) then goto (3).
    (6.3) (i) Remove (r, r') from G' and set MEMORY ← MEMORY − A.
          (ii) If r' has no other parents (excluding r) in G' then put r' in a set Z' (initially empty).
    (6.4) Repeat the following steps until Z' is empty:
        (6.4.1) Remove any node p from Z' and remove it from G'; set MEMORY ← MEMORY − N.
        (6.4.2) (i) If p has children p_1, p_2, ..., p_k which are present in G' then remove all arcs (p, p_i) and set MEMORY ← MEMORY − kA.
                (ii) If any p_i has no more parents in G' then put p_i in Z'.
    (6.5) Goto (6.1).

Example 3.1. We now illustrate the working of MAO* through an example with MAX = 0, taking the implicit graph G as shown in Fig. 1. In Fig. 2 the explicit graphs generated during the execution of the algorithm are shown. The f(n)- and f(n, n_i)-values and the marking are depicted inside each node. (The f(n)-value is at the top and the f(n, n_i)-values at the bottom of each node.) In Fig. 1, the values at the nodes are the h(n)-values. All arc costs are unity.
Fig. 1. An implicit graph.
From Figs. 2(a)-(q) we observe that, though 12 nodes are generated, at no stage are there more than 8 nodes in the explicit graph. The minimal cost solution is obtained after termination. Node B is generated twice and node G three times.
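Before analyzing MAO*, its cost revision at a single node (step (4.2.2)) can be rendered in the same sketch style; the node record (kind, f, f_arc, marked, solved) is our own framing, with φ fixed to summation and the SOLVED tie resolution omitted.

    # Illustrative revision of one node m in MAO*; in_graph maps the names
    # of children currently present in G' to their node records.
    def revise(m, in_graph):
        if m.kind == 'OR':
            best = min(m.f_arc, key=m.f_arc.get)   # arg min of f(m, m')
            e = m.f_arc[best]
            m.marked = {best}
            m.solved = best in in_graph and in_graph[best].solved
        else:                                      # AND node, phi = sum
            e = sum(m.f_arc.values())
            m.marked = set(m.f_arc)
            m.solved = all(c in in_graph and in_graph[c].solved
                           for c in m.f_arc)
        changed = e > m.f
        m.f = max(m.f, e)       # back up only if greater: keeps f monotone
        return changed or m.solved   # parents marking m must then be revised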
3.2. Analysis of MAO*

Lemma 3.2. If h(n) ≤ h*(n) for all nodes n in G, then f(n) ≤ h*(n) holds for each n in G'.

Proof. By induction on the steps of the algorithm. □

Fig. 2(a)-(e). Execution of MAO* on the graph of Fig. 1.
Lemma 3.3. Let f(s, (P, (m, n), i)) denote the f(s)-value when arc (m, n) is generated for the ith time with P as the marked path from s to m. Then f(s, (P, (m, n), j)) > f(s, (P, (m, n), i)) if j > i.

Proof. Consider the ith and (i + 1)th instances. The proof follows by induction. □

Theorem 3.4. MAO* is admissible provided h(n) ≤ h*(n).
Fig. 2(f)-(j).
Proof. By Lemma 3.2, f(s) ≤ h*(s) = C* at every instant before MAO* terminates. Since the costs of the arcs are finite and positive, the algorithm must terminate despite pruning, even for infinite graphs. It cannot continue expanding and pruning a psg (or a set of them) indefinitely because, by Lemma 3.3, the f(s)-value increases when the same path is regenerated; continuing indefinitely would violate Lemma 3.2, as the arc costs are finite and positive. That MAO* terminates with f(s) = C* follows directly from Lemma 3.2. However, if the arc costs are zero, MAO* guarantees termination with minimal cost solutions provided the implicit graph is finite. In game trees arc costs are zero, but since evaluation is done at a finite level the same result holds. □

Theorem 3.5. MAO* never uses more memory than MAX + |C*|, where |C*| = max[size(D')] over all psgs D' with cost ≤ C*.
Proof. Due to pruning at step (6), all arcs and nodes except those on the marked psg (rooted at s) may be pruned. Now, by Lemma 3.2, the marked psg must have f-value less than or equal to C*. So after pruning we have MEMORY ≤ max[MAX, |C*|]. Also, MEMORY can never exceed MAX + |C*| before pruning occurs. Hence the proof. □

Using an analysis similar to that of MA*, we can show the following:

Theorem 3.6. For AND/OR trees, (i) MAO*(0) is asymptotically optimal to AO* in terms of the number of nodes generated; (ii) using MAX amount of extra memory, MAO* provides a savings of an O(MAX) number of node generations over MAO*(0).
It may be noted here that MAO* does not generalize MA* in the case of ordinary graphs. Firstly, MAO* cannot be used for ordinary graphs with loops, while MA* can. Secondly, since there are no AND nodes, MA* can use a second cost function at the nodes for pruning. Thirdly, the marking technique of MAO* cannot be used to easily select the k best nodes. Therefore, while MA* can easily be modified to use the techniques of algorithms like Aε* [10], C and PropC [1], these versions are not directly derivable from MAO*. However, MAO* results in several other interesting special cases. For MAX = 0, it gives an iterative-deepening algorithm for AND/OR graphs; the example in Figs. 1 and 2 illustrates how such a strategy works.
For large MAX it becomes a variation of AO* (with single successor generation). For ordinary acyclic graphs, using a large MAX we get an algorithm similar to MarkA [1], which is shown to have a lower selection overhead. In the case of game trees, MAO* can be used as a restricted memory variation of SSS* and would specialize to some known strategies (see [2, 7]). It may be noted here that in game trees, since evaluation is done at a finite level, the value of |C*| is known a priori, and therefore the algorithm can be restricted to use no more than max[|C*|, MAX] as memory instead of MAX + |C*|.

Several improvements on algorithm MAO* can be made. The pruning strategy can be modified in a number of ways. In step (6.1) the arc to be pruned was selected arbitrarily among those which were not marked; a more appropriate strategy would be to choose the one with larger f(m, n)-value (if θ > 0). Again, if an OR node is SOLVED, its unmarked children can easily be pruned, since they will not be regenerated as long as the node is present in the explicit graph. If MAX = 0, then pruning can be effected immediately after marking shifts, eliminating the selection process of step (6.1). In step (6.4) all nodes with no parents in G' were removed. However, all such nodes need not be pruned if they are likely to be generated again. (But this will require the overhead of keeping track of such nodes.) A theoretical or empirical study of the effect of different pruning strategies using MAO* (especially for game trees) would be interesting. MAO* can be analyzed in a manner similar to AO* [4], where the cost has been decomposed into G + H to obtain weaker admissibility conditions than h ≤ h*. The strategies of weighting can also be used to maintain admissibility if heuristics overestimate (as in WAO*).

4. Conclusions

In MA* the technique of single node generation has been used, and the f-value of a node changes (never decreases) when a successor is generated. This has several advantages. It can be shown that if single node expansion is used, the total number of states generated is smaller. This is particularly important if the cost of node expansion is high. Moreover, for several problems one may have different types of heuristic estimates for different paths, and storing such path information in the nodes could be useful. In MAO* also, single successors are generated at the OR nodes; this can be done for the AND nodes as well.

MA* and MAO* also provide an insight into the working of a whole class of search algorithms like A*, AO*, IDA*, MarkA and SSS* (many of which are directly derivable). This strategy also throws light on the memory versus node expansions tradeoff in best-first search strategies.

There are several variations of these algorithms. One interesting variant is the modification of the virtual OPEN technique of MA*, keeping MAX very large, so as to obtain a more efficient version of A*. Here OPEN
can be kept in secondary storage and OPEN1 in the main store. An efficient disk-merge algorithm can be used to merge OPEN1 into OPEN. Both MA* and MAO* can easily be categorized in the general branch and bound framework of Nau, Kumar and Kanal [8] using the SPLIT and PRUNE paradigm.

ACKNOWLEDGEMENT

The authors thank the referees for their comments. Mr. T.K. Nayak is thanked for his comments on an earlier version of the manuscript.

REFERENCES

1. A. Bagchi and A. Mahanti, Three approaches to heuristic search in networks, J. ACM 32 (1985) 1-27.
2. S. Bhattacharyya and A. Bagchi, Making the best use of available memory when searching game trees, in: Proceedings AAAI-86, Philadelphia, PA (1986) 163-167.
3. P.P. Chakrabarti, Some studies in heuristic search: theory and algorithms, Ph.D. Thesis, Indian Institute of Technology, Kharagpur (1988).
4. P.P. Chakrabarti, S. Ghose and S.C. de Sarkar, Admissibility of AO* when heuristics overestimate, Artificial Intelligence 34 (1988) 97-113.
5. T. Ibaraki, Depth_m search in branch and bound algorithms, Int. J. Comput. Inf. Sci. 7 (1978) 315-343.
6. R.E. Korf, Depth-first iterative-deepening: An optimal admissible tree search, Artificial Intelligence 27 (1985) 97-109.
7. T.A. Marsland, A. Reinefeld and J. Schaeffer, Low overhead alternatives to SSS*, Artificial Intelligence 31 (1987) 185-199.
8. D.S. Nau, V. Kumar and L.N. Kanal, General branch and bound, and its relation to A* and AO*, Artificial Intelligence 23 (1984) 29-58.
9. N.J. Nilsson, Principles of Artificial Intelligence (Tioga, Palo Alto, CA, 1980).
10. J. Pearl, Heuristics (Addison-Wesley, Reading, MA, 1984).
Received July 1987; revised version received April 1989