North-Holland Microprocessing and Microprogramming 18 (1986) 617-622
QUANTITATIVE EVALUATION AND SELECTION CRITERIA FOR IMPLEMENTATION OF DECISION TABLES
Paolo Ercoli, Dipartimento di Informatica e Sistemistica, Università di Roma 1, via Eudossiana 18, 00184 Roma, Italy
Decision tables are being used in an increasing number of fields since they represent functions and relations in a general way, and in particular because they can represent rule-based systems when there are hundreds of rules in hundreds of variables or more. A new method of decision table implementation, called "hash-like", is evaluated and compared, both analytically and experimentally, with some of the known methods already available, from the points of view of ease of implementation, reliability, predictability, speed and memory occupation. This gives practical and quantitative criteria for choosing a method of implementation of decision tables.
1. INTRODUCTION
The progress in microelectronics has put on the market microprocessors of notable power at very low prices, but the cost and time needed for producing reliable software are still very high. Since reliable software means software performing as expected, such expectations should be stated formally and in a way which is convenient for the system designer. One type of formal specification which fulfills this condition is the tabular form, which is then easily implemented as a decision table or as a decision tree. A brief survey of methods for the representation of decision tables and a number of novel methods are given in (1), while (2) is a rather complete summary of classical methods. In this paper we evaluate the most interesting of the new methods proposed in (1), i.e. the hash-like method (called HL in the following), and we compare it with the branch method, which in general is the most convenient of the known methods. The comparison is done analytically and the results are then checked against tests performed both with an 8-bit processor (the 8080) and with a 36-bit processor with only one instruction per word (Sperry 1100). These two can be taken as representative of the two extremes of the range of processors available in terms of instruction structure, excluding supercomputers, vector processors and the like. The comparison will also include other points, such as ease of implementation (and therefore cost and reliability) and predictability of performance.

In what follows we shall call the variables in the table "conditions" and its entries "rules"; the value of the function corresponding to a rule will be called an "action". Furthermore the conditions are supposed to be binary, since if they are n-ary they can easily be binary-coded as far as the HL method is concerned, while in the branching method binary branching will be replaced by "case" instructions. It must also be noted that the comparison between methods can be done without too great difficulty when the number of conditions (and rules) is not larger than, say, 50, and therefore our analysis is directed towards higher values than these. The actions are supposed to be specified in the decision tables by addresses, which represent subroutine entry points. Thus the contribution of such addresses to the memory required will usually be ignored. Each rule will consist of a vector defined on the set {0, 1, d}, where d stands for "don't care" as usual. The parameters we will take into consideration are:
n: number of conditions;
r: number of rules;
f: average number of d's in the rules;
w: number of bits in one word of the processor;
s: smallest integer larger than log r;
q: smallest integer larger than 2n/w.
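As a small illustration of the two derived parameters, the following sketch computes s and q from n, r and w. It assumes that "log r" is the base-2 logarithm and reads "smallest integer larger than" strictly; the function name `derived_parameters` is hypothetical and not taken from the paper.

```python
def derived_parameters(n, r, w):
    """Return (s, q) for n conditions, r rules and a w-bit word.

    Assumption: "log r" means log2(r), and "larger than" is read strictly.
    If "not smaller than" was intended, ceilings would be used instead.
    """
    if n < 1 or r < 1 or w < 1:
        raise ValueError("n, r and w must be positive integers")
    s = r.bit_length()       # smallest integer s with 2**s > r, i.e. s > log2(r)
    q = (2 * n) // w + 1     # smallest integer strictly larger than 2n/w
    return s, q


# n = r = 100 on a 36-bit word and on an 8-bit word
print(derived_parameters(100, 100, 36))
print(derived_parameters(100, 100, 8))
```

Integer arithmetic is used on purpose, so that no floating-point rounding can affect the result when r is an exact power of two or 2n is a multiple of w.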
P. Ercoli / Implementations of Decision Tables

2. TIME AND MEMORY REQUIREMENTS FOR HL METHOD.
As explained in (1), the HL method consists in representing each rule with 2n bits (in particular d with 00, 0 with 10 and 1 with 01) and storing it at the address given by 2n/w multiplied by the number represented in binary by the s most significant digits of the rule when each d is replaced by a 1. Addresses are relative to the origin of the table. For best results the columns of the table, in which a rule is represented by a row, should be interchanged so that those with the larger number of d's stay on the right (i.e. occupy the less significant positions) and the more frequently accessed rules stay on top. However, this last requirement can be overlooked without much loss. Collisions can be resolved as in the usual hashing schemes, i.e. by storing a rule which finds its address already occupied either in an overflow area or in an address following the occupied one, etc. It must be noted that collisions may require the table to be stored in an area larger than the one specified above, which is given by the following number of words:

$(K+q)\,2^{s}$   (2.1)

This happens when collisions occur at the largest of the above addresses or near it. In (2.1) K is the number of words needed for an action, i.e. for an address.

Access to the table is performed by scanning it from the address given by the s most significant digits of the search key until a match between rule and key is found. No match corresponds to the "else" clause, which can thus be easily added to the table. If the key digits are represented with the complements of the representation used for the rules (i.e. 0 is represented by 01 and 1 by 10), a match is found when the boolean AND between rule and key gives all zeros.
Naturally, for a match q AND operations must be performed, since a rule is stored in q words, but for a mismatch only one unsuccessful AND (i.e. one with a non-zero result) may be sufficient. This fact will be recalled later on. If (s-1) is just smaller than log r, the storage area given by (2.1) will be almost twice that strictly necessary to store the r rules. Consequently the search steps will also include accesses to empty locations and their number could become unnecessarily large, although the number of conflicts will be relatively small.
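The storage and access scheme described above can be sketched as follows. This is only an illustration under stated assumptions, not the author's program: rules are strings over 0, 1 and d, slots of a Python list stand in for word addresses (so the factor 2n/w disappears), a Python integer stands in for the q words of a rule (so the early exit after one failed AND is not modelled), and collisions are resolved by moving to the following slot, extending the table at the end when needed. All names (`encode`, `home_slot`, `build_table`, `lookup`) are hypothetical.

```python
RULE_CODE = {"d": 0b00, "0": 0b10, "1": 0b01}   # coding of rule digits
KEY_CODE = {"0": 0b01, "1": 0b10}               # complemented coding for keys


def encode(digits, code):
    """Pack a string of digits into an integer, two bits per digit."""
    bits = 0
    for ch in digits:
        bits = (bits << 2) | code[ch]
    return bits


def home_slot(digits, s):
    """Number given by the s most significant digits, each d read as 1."""
    return int(digits[:s].replace("d", "1"), 2)


def build_table(rules, s):
    """rules: list of (rule_string, action) pairs, each rule at least s digits."""
    table = [None] * (2 ** s)
    for digits, action in rules:
        slot = home_slot(digits, s)
        while slot < len(table) and table[slot] is not None:
            slot += 1                      # collision: try the following slot
        if slot == len(table):
            table.append(None)             # overflow past the nominal 2**s slots
        table[slot] = (encode(digits, RULE_CODE), action)
    return table


def lookup(table, key, s, else_action="else"):
    """Scan forward from the key's slot; a zero AND means rule and key match."""
    key_bits = encode(key, KEY_CODE)
    for entry in table[home_slot(key, s):]:
        if entry is not None and entry[0] & key_bits == 0:
            return entry[1]
    return else_action


RULES = [("01dd", "A"), ("1d0d", "B"), ("d011", "C")]
TABLE = build_table(RULES, s=2)
print(lookup(TABLE, "1001", s=2))   # B
print(lookup(TABLE, "0000", s=2))   # else
```

The forward scan is sufficient because a key that matches a rule can differ from it only where the rule has a d, which is read as 1 when the rule is placed; the key's starting slot is therefore never beyond the slot where the rule is stored.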
A variant of the HL method can then be used, which consists in rearranging the rules (in a number of addresses given by qr) in ascending order, after having considered d=1 in each of them. Access to the rearranged rules is done through a vector of pointers to them, stored and accessed in a hash-like fashion. In case of conflict the vector component involved should contain the smallest of the pointers competing to be stored there. We shall refer to this method as the indirect hash-like (IHL) one. Other variants can be devised to take care of collisions and other factors, but they tend to make the implementation less simple or to make it difficult to modify the table during its use.

At this point we can recall (2.1) and, since on the average $2^{s} = 1.5r$, we can evaluate that the storage space needed is given in words by:

$M = (K+q)\,2^{s} \approx q\,2^{s} = 1.5\,qr \approx 3nr/w$   (2.2)

As regards the time taken to access a decision table, things are more complicated. In (1) it is shown that the average number of search steps in accessing a table with the HL method is:

$\frac{f}{2n}\,2^{s} = 0.75\,\frac{rf}{n}$

As pointed out before, for each step only one boolean AND is performed as long as the table is sparse, and therefore the probability of a match is ~exp w, independently of the results of the other steps. Therefore, if A is the time taken to perform a comparison between one word from the key and one from the table and to step on to the next comparison, then the average time to access the table with a given key is:

$T = A\left(q + \frac{f}{2n}\,2^{s}\right) = A\left(\frac{2n}{w} + 0.75\,\frac{rf}{n}\right)$   (2.3)

This result disregards the effect of collisions and of non-uniformity in the distribution of d's in the rules.
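The indirect (IHL) variant described above can be sketched along the following lines. The text does not say what an unused component of the pointer vector should contain; the sketch assumes that it points at the same rule as the next used component, which keeps the forward scan correct. As before, strings stand for rules, Python integers stand for the q words of a rule, and all names are hypothetical.

```python
RULE_CODE = {"d": 0b00, "0": 0b10, "1": 0b01}   # coding of rule digits
KEY_CODE = {"0": 0b01, "1": 0b10}               # complemented coding for keys


def encode(digits, code):
    """Pack a string of digits into an integer, two bits per digit."""
    bits = 0
    for ch in digits:
        bits = (bits << 2) | code[ch]
    return bits


def ihl_build(rules, s):
    """Return (pointer vector, rules sorted in ascending order with d read as 1)."""
    ordered = sorted(rules, key=lambda ra: int(ra[0].replace("d", "1"), 2))
    pointers = [None] * (2 ** s)
    for index, (digits, _action) in enumerate(ordered):
        slot = int(digits[:s].replace("d", "1"), 2)
        if pointers[slot] is None:         # keep the smallest competing pointer
            pointers[slot] = index
    # Assumption (not in the source): an unused component takes the pointer
    # of the next used component, or the end of the rule area if there is none.
    following = len(ordered)
    for slot in range(len(pointers) - 1, -1, -1):
        if pointers[slot] is None:
            pointers[slot] = following
        else:
            following = pointers[slot]
    encoded = [(encode(digits, RULE_CODE), action) for digits, action in ordered]
    return pointers, encoded


def ihl_lookup(pointers, encoded, key, s, else_action="else"):
    """Enter the sorted rules through the pointer vector, then scan forward."""
    key_bits = encode(key, KEY_CODE)
    start = pointers[int(key[:s], 2)]
    for rule_bits, action in encoded[start:]:
        if rule_bits & key_bits == 0:
            return action
    return else_action


RULES = [("01dd", "A"), ("1d0d", "B"), ("d011", "C")]
POINTERS, ENCODED = ihl_build(RULES, s=2)
print(POINTERS)                                     # [0, 0, 1, 2]
print(ihl_lookup(POINTERS, ENCODED, "1001", s=2))   # B
```

Compared with the direct scheme, the scan runs over the r stored rules only and never visits empty locations, at the price of one extra access through the pointer vector.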
Under the same hypotheses the minimum and maximum of T are given respectively by $Aq$ and by $A\left(q + \frac{f}{n}\,2^{s}\right)$, which correspond to the case when a match is found at the first comparison and to the case when it is found at the last possible one. It must be pointed out that (2.3) is valid if the table is sparse, i.e. $r \ll 2^{n}$; if the table is not sparse, methods other than those considered in this paper have to be used, as shown in (1).
Also, if $n \gg r$, the time complexity is O(n), as in the case of the branching methods (1). As regards the IHL method, (2.2) becomes

$M = (K+q)\,r + K\,2^{s} \approx 2nr/w$   (2.4)

where the second term accounts for the indexing vector, and (2.3) becomes:

$T = A'\left(q + \frac{fr}{2n}\right) = A'\left(\frac{2n}{w} + 0.5\,\frac{rf}{n}\right)$   (2.5)

where A' includes the time taken to access a rule by means of the indexing vector. For the IHL method the minimum and maximum of T become respectively $A'q$ and $A'\left(q + \frac{rf}{n}\right)$.
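To make the expressions of this section easier to compare, the following sketch evaluates (2.2) and (2.3) for the HL method and (2.4) and (2.5) for the IHL method, together with the stated extremes of T. It uses the same approximations as the text (q taken as 2n/w, 2^s taken as 1.5r, K ignored); times are returned in units of A for HL and of A' for IHL, which are different constants. The function names and the sample values are illustrative only.

```python
def hl_estimates(n, r, f, w):
    """Approximations (2.2) and (2.3); M in words, times in units of A."""
    q = 2 * n / w
    steps = 0.75 * r * f / n          # (f/2n) * 2**s with 2**s = 1.5 r
    return {"M": 3 * n * r / w,
            "T_min": q, "T_avg": q + steps, "T_max": q + 2 * steps}


def ihl_estimates(n, r, f, w):
    """Approximations (2.4) and (2.5); M in words, times in units of A'."""
    q = 2 * n / w
    steps = 0.5 * r * f / n           # f r / 2n
    return {"M": 2 * n * r / w,
            "T_min": q, "T_avg": q + steps, "T_max": q + 2 * steps}


# Sample values chosen only for illustration: n=100, r=200, f=50, w=8
print(hl_estimates(100, 200, 50, 8))
print(ihl_estimates(100, 200, 50, 8))
```

For these sample values the HL estimate is 7500 words with an average of 100 comparison steps, against 5000 words and 75 steps for IHL; the two averages can be compared directly only if A and A' are close.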
3. TIME AND MEMORY REQUIREMENTS FOR BRANCHING METHODS.

The access time for a complete table implemented with a branching method is proportional to n, since it is necessary to test each condition. When d's are present, practically the only general statement that can be made is that the average access time is approximately given by

$T = B(n-f) = Bn(1-f/n)$   (3.1)

where B is the time taken to examine one condition and branch on it. At this point, however, two remarks have to be made. The first is that there are several branching methods and several proposals for optimal algorithms for implementing them. The second is that the introduction of the "else" action has more effect on branching methods than on HL ones, as shown in (1) for instance. Theoretically one could do without the "else" rule, but in most circumstances it must be introduced as a safeguard against incorrect inputs and also against software errors or hardware faults. The introduction of the "else" rule has only a marginal effect on T, but it has a very considerable influence on the memory requirement and complicates the production of the program implementing a branching method. However, only a rather approximate analysis of the memory requirement can be made, and it is not affected by the particular method chosen. It must also be pointed out that some of the methods proposed to optimize the storage requirement are extremely complicated and often give only marginal gains.

The storage required for a complete table is br, if b is the amount needed to store the instructions necessary to test one condition. In most cases, however, one has only the r rules of an incomplete table, and it is during the implementation of the methods that the missing rules are found. It is reasonable, however, to state that the storage M required for an incomplete table with r rules and, on the average, f don't cares in each rule is given by:

$M \approx br(n-f) = bnr(1-f/n)$   (3.2)

This amounts to stating that there is one branch for each 1 or 0 in the table (1), but this formula will be tested with experimental data.

4. THE TESTS AND THE RESULTS.
From the previous sections it is quite apparent that one needs to test the expressions given by (2.2), (2.3), (3.1), (3.2) for the time and memory requirements of the methods considered, given the approximations made. In the present section we shall describe the aims of the tests, the procedures used and the reasons for choosing one particular branching method for comparison with the HL one. When using or on the point of implementing a decision table one is generally interested in:
- ease of implementation, which affects cost, time and reliability;
- ability to predict with reasonable accuracy memory occupation and average access time;
- minimum and maximum values of access time, if there are temporal constraints in the programs making use of the table.
Obvious parameters affecting memory requirement and average access time are r, n and f/n, but also the distribution of d's in the keys and in the rules. Therefore experiments have been devised in order to test:
- the approximation of the above expressions of M and T with random distribution of d's and for a reasonable range of values of the parameters of the expressions;
- the influence of some non-random distribution of d's on the performance of the HL method;
- the influence of the type of processor (word length, addressing, instruction repertoire).
Furthermore a number of tests have been performed in order to select one branching method among those available. In fact, after an examination of the various branch methods published in the literature,
Pollack's quick rule method (3) appeared a very likely candidate due to its widespread use and to the simplicity of its implementation. But the older and still easier to implement (although less cited) method of Egler (4) behaved better in tests performed both with small complete tables (n=8, 16) and with incomplete tables with r=100 and with f/n and n varying respectively from 90% to 50% and from 100 to 420. Therefore a set of programs has been devised to generate decision tables with n and r variable from 100 to 1000 and with variable f/n, to generate keys (contained or not in the tables under test) and to implement them with HL and Egler's methods. The processors used were an 8-bit microprocessor, the Intel 8080, and a Sperry 1100 (with 36-bit words and instructions, including a very wide spectrum of addressing modes). These two processors seemed to represent well the extremes in architecture (having excluded supercomputers and vector processors), and the 1100 had an 8080 simulator, which was necessary to manage an extensive set of tests on large tables. After having established the validity of the formulas in (2.2), (2.3), (3.1), (3.2), more tests were carried out on the 1100 with larger tables and with non-uniform distributions of d's.

As regards the memory requirements, the experimental results agreed very well with:

$M \approx 3nr/w$   (4.1)

as could be expected, and equally well with (2.4). Agreement with (3.2) was acceptable only with f/n=0.5 or more, and in these cases the maximum relative discrepancy was always smaller than 0.3. In any case (3.2) always gave values larger than the experimental results. At f/n=0.2 the relative error was about 0.6. It is interesting to point out, however, that the experimental values showed M(n) as being linear even for f/n=0.2.
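The paper does not describe the table and key generators used in the tests. Purely as an illustration of what such a generator might look like, the following sketch draws each digit of a rule as d with probability f/n and as 0 or 1 otherwise, and derives a key contained in the table by filling the d's of a rule at random. The function names, the seeding and the independence of the digits are all assumptions; no check is made that the generated rules are mutually exclusive.

```python
import random


def random_rule(n, d_fraction, rng):
    """One rule of n digits; each digit is d with probability d_fraction."""
    return "".join(
        "d" if rng.random() < d_fraction else rng.choice("01") for _ in range(n)
    )


def random_table(n, r, d_fraction, seed=0):
    """r random rules of n digits each (duplicates are not filtered out)."""
    rng = random.Random(seed)
    return [random_rule(n, d_fraction, rng) for _ in range(r)]


def key_contained_in(rule, rng):
    """A key matched by the given rule: every d is replaced by a random digit."""
    return "".join(rng.choice("01") if ch == "d" else ch for ch in rule)


def random_key(n, rng):
    """An arbitrary key, which may or may not be contained in a given table."""
    return "".join(rng.choice("01") for _ in range(n))


table = random_table(n=100, r=50, d_fraction=0.5, seed=1)
key = key_contained_in(table[0], random.Random(2))
measured = sum(rule.count("d") for rule in table) / len(table)
print(len(table), len(key), measured)   # measured average number of d's per rule
```

A non-uniform distribution of d's, such as the one with the d's concentrated on the left discussed later in this section, could be obtained by making `d_fraction` depend on the digit position.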
As regards access time, the experimental results confirmed, with rather insignificant deviations, the formula (3.1), although it was derived rather empirically. The results of tests carried out with tables both with randomly generated distributions of d's and with columns with the higher number of d's on the right were in accord with:

$T = A\left(\frac{2n}{w} + \frac{f}{2n}\,2^{s}\right)$   (4.2)

with maximum deviations of the order of ±10%. The same kind of agreement was found between experimental data and both (2.4) and (2.5). It must be pointed out that the agreement between the formulas of sections 2 and 3 and the experimental results was not significantly affected by the processor being an 1100 or an 8080, but it must be noted that the access programs were written in assembler language both for the branching and the HL methods. This was done in order to avoid differences due only to compilers.

After the above tests, which confirmed the validity of the formulas of sections 2 and 3 (apart from what has been said about memory occupation of Egler's method for low values of f/n), other tests were made. Decision tables from real applications have also been used, without significant departures from the previous results. However, only four such cases were tried. Then deliberate biases were introduced within the rules and among the rules of tables, again without significant discrepancies from what had already been found, both for the branch and for the HL methods. Large differences were found only for the access time of the HL method when the columns of the table were ordered with decreasing number of d's from left to right (i.e. with an order opposite to the recommended one).

An analysis of such a case can be made for the HL method, recalling from (1) that on the average the search for a rule having one d in the most significant digits consists of $(2^{s}-1)/2s$ steps. If there are f d's in the rule and they are uniformly distributed, then the number of d's in the s most significant digits of the rule is fs/n, and therefore the average number of search steps for one rule is $\frac{f}{2n}\,2^{s}$, from which (2.3) has been derived. When the d's are concentrated on the left side of each rule, i.e. in its s most significant positions, each rule can reach a maximum of (s-1) d's and therefore (2.3) tends towards:

$T = A\left(q + \frac{2^{s}}{2}\right) = A\left(\frac{2n}{w} + 0.75\,r\right)$   (4.3)

i.e. towards the values corresponding to f=n. However, it must be pointed out that this case is highly improbable and that it can be very easily circumvented by interchanging the table columns.

As regards the influence of collisions on (2.3), it was found difficult to analyze, but the execution of many tests with different keys with given n, r and f/n did not show a spread of results beyond the already mentioned limits. It was noticed, however, that the influence seems to appear essentially at values of f/n of about 0.85. Naturally, better ways to deal with collisions can be easily implemented, but experimental
results seem to show they are not necessary in general.

5. CONCLUSIONS.

A set of expressions has been obtained both for HL and for branch methods and their validity checked for numbers of conditions and of rules between 80 and 100 and for values of f/n between 0.2 and 0.9, which cover most of the cases of interest, except of course that of small tables. Certainly the expressions given for the branching method are rather general and do not take into account the possible improvement given by methods more efficient than Egler's or Pollack's. But it would not have been practical to carry out an extensive set of tests in which not only the keys were changed but also n, r and f, that is the whole table. In any case, in choosing between various methods one does not need great accuracy, and the expressions we have given in the previous sections are more accurate than necessary for the purpose. Furthermore we have also given the extremes of access time, particularly in the HL case, in order to show how predictable the performance of each method is.

From the expressions of M and T given in sections 2 and 3 we find that the ratio m between the storage space taken by the branch method and that taken by the HL methods considered is:

$m = b\,(w/x)\,(1-f/n)$   (5.1)

where x=3 for the HL and x=2 for the IHL method. Similarly, the ratio t between the access time of the HL method and that of the branch one is:

$t = \frac{A}{B\,(1-f/n)}\left(\frac{2}{w} + \frac{x}{4}\,\frac{r}{n}\,\frac{f}{n}\right)$   (5.2)

with A' in place of A for the IHL method. From such expressions one can evaluate which method is preferable with given n, r and f/n. A synthetic conclusion may be the following.

HL methods have to be used when:
- memory limitations do not allow the use of branch methods;
- the table is dynamic, i.e. changes during computation, because compiling another branching program each time the table changes is impractical or impossible.

HL methods may be preferred when:
- ease, and therefore time and cost, of implementation have to be considered;
- reliability is important, because with the HL method the implementation of the table is equivalent to transferring it into memory and writing a very elementary program, which can be easily proved correct;
- in (5.2) t<1, and particularly when w is large and r(f/n)<...
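As a rough aid to applying these criteria, the following sketch computes the two ratios directly from the expressions of sections 2 and 3: m as the branch storage (3.2) divided by the hash-like storage (2.2) or (2.4), and t as the hash-like access time (2.3) or (2.5) divided by the branch access time (3.1). The function name and the sample values of b, A and B are hypothetical; b must be expressed in words and A (or A') and B in the same time unit, and none of the numbers below are measurements from the paper.

```python
import math


def branch_vs_hash(n, r, f, w, b, A, B, method="HL"):
    """Return (m, t) for the HL or IHL method against the branch method.

    m > 1 means the branch method needs more storage;
    t < 1 means the hash-like method is faster on average.
    For method="IHL" the argument A plays the role of A'.
    """
    mem_coeff, step_coeff = {"HL": (3.0, 0.75), "IHL": (2.0, 0.5)}[method]
    d = f / n                                       # fraction of don't cares
    branch_memory = b * n * r * (1 - d)             # (3.2)
    hash_memory = mem_coeff * n * r / w             # (2.2) or (2.4)
    hash_time = A * (2 * n / w + step_coeff * r * d)   # (2.3) or (2.5)
    branch_time = B * n * (1 - d)                   # (3.1)
    return branch_memory / hash_memory, hash_time / branch_time


# Illustrative values only: n=100, r=200, f=50, w=8, b=2 words, A=B=1
m_hl, t_hl = branch_vs_hash(100, 200, 50, 8, b=2, A=1, B=1, method="HL")
m_ihl, t_ihl = branch_vs_hash(100, 200, 50, 8, b=2, A=1, B=1, method="IHL")
print(round(m_hl, 3), round(t_hl, 3))     # 2.667 2.0
print(round(m_ihl, 3), round(t_ihl, 3))   # 4.0 1.5
```

With these made-up values the hash-like methods save storage (m > 1) but are slower on average (t > 1), which corresponds to the case where they would be chosen for memory or maintainability reasons rather than for speed.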
6. ACKNOWLEDGEMENTS.
Thanks are due to M. Valeau for having implemented the first set of programs used for most of the tests, including the IHL method.
7. REFERENCES

(1) Ercoli P., Decision tables: applications to microprocessors and some new implementation methods, in Microprocessor systems (Sami M. et al. Eds), North-Holland 1980, pp. 135-145.
(2) Pooch U.W., Translation of decision tables, ACM Comp. Surv. 6, 2 (1974) pp. 125-151.
(3) Pollack S.L., Conversion of limited entry decision tables to computer programs, CACM 8, 11 (1965) pp. 677-82.
(4) Egler J.F., A procedure for converting logic table conditions into an efficient sequence of test instructions, CACM 6, 9 (1963) pp. 510-4.