Fast Algorithm to Generate Arithmetic Transform for Incompletely Specified Boolean Functions Using Block Matrix


Available online at www.sciencedirect.com


Procedia Engineering 29 (2012) 3722 – 3726 www.elsevier.com/locate/procedia

2012 International Workshop on Information and Electronics Engineering (IWIEE)

Yu Pang*

Dept. of Electronic Engineering, Chongqing University of Post and Telecommunication, Chongqing, 400065, China

Abstract

Arithmetic Transform (AT) is an important representation for digital circuits. The traditional method of calculating it relies on matrix multiplication, whose execution time is often so long that the calculation becomes infeasible. Given an incompletely specified Boolean function, which is common in practice, this paper gives a fast way to obtain the AT: it uses the structure of the transform matrix to divide it into small blocks for multiplication. Experiments show that the proposed solution speeds up the calculation by more than 10 times for large circuits.

© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of Harbin University of Science and Technology

Keywords: Spectral domain; Arithmetic Transform; Incompletely specified Boolean function; Block matrix

1. Introduction

The recent emergence of various verification methods has made it easier for engineers to check their designs. The mainstream approach uses decision diagrams. BDDs (Binary Decision Diagrams) [1] and their derived forms [2][3][4] have proved effective for verifying digital circuits; the derived diagrams are more compact because their word-level property gives better performance. However, all these decision diagrams are defined in the Boolean domain and have obvious disadvantages; for example, they do not easily handle imprecise circuits. Most existing methods work with the properties of Boolean functions in the Boolean domain and rely on a truth table in which each entry describes precisely the behavior of the function at a single point and bears no relation to its behavior at the other points of the domain. For some applications this is a satisfactory representation, but others, such as circuit verification, would benefit much more if partial information about the whole function were included in the function value at each point of its domain.

Arithmetic Transform (AT) is an alternative representation for digital circuits, defined in the spectral domain, that addresses this problem. It has been adopted in many applications. For example, [5] uses it to process imprecise circuits and to find imprecision between specifications and implementations, and [6] uses spectral techniques for fault detection.

Several methods have been proposed to obtain the transforms of Boolean functions. One way is to derive the AT from decision diagrams [7], since the two are related. The authors of [8] use the polarity property to obtain the AT quickly. Given the truth table of a Boolean function, a direct method is matrix multiplication, but its execution time grows exponentially with the size of the truth table, so the method is generally infeasible.
Furthermore, incompletely specified Boolean functions occur in many applications, and obtaining their transforms is more difficult than for completely specified functions, so a fast algorithm to calculate transforms for incompletely specified functions is urgently needed. In this paper, a new algorithm is proposed to generate the AT, focusing mainly on incompletely specified functions. The algorithm is fast and easy to implement.

* Corresponding author. Tel.: +86-13372653576 E-mail address: [email protected]

1877-7058 © 2011 Published by Elsevier Ltd. doi:10.1016/j.proeng.2012.01.560

2. Definition of Arithmetic Transform

Part of the problem with the definition in the Boolean domain is that each entry in the truth table of $f$ tells us precisely the behavior of the function at a single point but nothing about its behavior at any other point. The alternative representation is in the spectral domain, where a number of properties are much more easily deduced than in the Boolean one [10]. The Arithmetic Transform (AT), defined in the spectral domain, is a canonical polynomial that uniquely represents multi-input, multi-output Boolean functions $f: B^n \to B^m$ [9]. To obtain an AT description in the form of a single polynomial, the outputs can be grouped to form a word-level (integer) number $w$, leading to a pseudo-Boolean function $f: B^n \to w$. The AT representation therefore has Boolean inputs and a word-level output.

Definition 1: The Arithmetic Transform (AT) [9] is a polynomial representing a pseudo-Boolean function $f: B^n \to w$ with the arithmetic operation "+", word-level coefficients $c_{i_1 i_2 \ldots i_n}$, binary inputs $x_1, x_2, \ldots, x_n$ and binary exponents $i_1, i_2, \ldots, i_n$:

$$AT(f) = \sum_{i_1=0}^{1} \sum_{i_2=0}^{1} \cdots \sum_{i_n=0}^{1} c_{i_1 i_2 \ldots i_n} \, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \qquad (1)$$
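As a small worked instance of Definition 1 (an illustration added here, not taken from the source), consider the two-input OR function. Substituting the four input combinations into the polynomial below reproduces its truth table, and the coefficients can be read off term by term:

```latex
\begin{aligned}
f(x_1, x_2) &= x_1 \lor x_2 \\
AT(f) &= x_1 + x_2 - x_1 x_2 \\
c_{00} &= 0, \quad c_{10} = 1, \quad c_{01} = 1, \quad c_{11} = -1
\end{aligned}
```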

The AT of a given function can be determined using, among other techniques, decision diagrams, matrix multiplication and polynomial interpolation [11]. Of these three, matrix multiplication is the most frequently used. In this method, the set of AT coefficients $C = \{c_{i_1 i_2 \ldots i_n}\}$ is obtained by multiplying the $2^n \times 2^n$ matrix $T_n$ by the $2^n \times 1$ vector of function values (the truth table of $f$), $C = T_n \times f$, where the transform matrix $T_n$ is defined recursively:

$$T_n = \begin{bmatrix} T_{n-1} & 0 \\ -T_{n-1} & T_{n-1} \end{bmatrix}, \qquad T_0 = 1 \qquad (2)$$
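The direct method can be sketched in a few lines of Python. This is a minimal illustration of recursion (2) and of $C = T_n \times f$, not the author's implementation; the function names `transform_matrix` and `arithmetic_transform` are hypothetical, and the two-input OR table is an assumed example.

```python
def transform_matrix(n):
    """Build T_n from recursion (2) as a list of rows (2^n x 2^n)."""
    t = [[1]]  # T_0 = 1
    for _ in range(n):
        size = len(t)
        top = [row + [0] * size for row in t]            # [ T_{n-1}    0      ]
        bottom = [[-v for v in row] + row for row in t]  # [-T_{n-1}  T_{n-1}  ]
        t = top + bottom
    return t


def arithmetic_transform(f):
    """Return C = T_n x f for a truth table f of length 2^n (dense product)."""
    n = len(f).bit_length() - 1
    if n < 0 or len(f) != 1 << n:
        raise ValueError("truth table length must be a power of two")
    t = transform_matrix(n)
    return [sum(t_rc * f_c for t_rc, f_c in zip(row, f)) for row in t]


or_table = [0, 1, 1, 1]  # two-input OR, an assumed example
print(arithmetic_transform(or_table))  # [0, 1, 1, -1]
```

The dense product visits all $2^n \times 2^n$ entries, which is the cost the rest of the paper sets out to avoid.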

3. Discussion of Block Matrices

Matrix multiplication is important as a direct method of generating the AT. The method is easy to understand and $T_n$ is easy to generate, so given the truth table of a Boolean function it is a convenient way to get the AT. However, the large number of multiplications prevents wide use of the method, so a fast algorithm for the computation is necessary. The matrix multiplication is written as:

$$C = T_n \times f \qquad (3)$$

Here $f$ is the truth table of an incompletely specified Boolean function, with entries from the set {0, 1, *}. The symbol "*" denotes the don't-care value, which is often assigned 0.5 to distinguish it from true and false. To analyze the algorithm in detail, $f$ is first separated. Two matrices are introduced to express the truth table $f$ as:

$$f = M \times S + J \qquad (4)$$

$S$ represents the valid range of the truth table, $M$ retrieves the corresponding entries, and $J$ expresses the don't-care part of the truth table. The following example illustrates the idea.

Example 1: An incompletely specified function with three input bits has the valid range (000 – 010, 101 – 111) and the don't-care set (011 – 100). The truth table therefore has two contiguous sub-domains, (000 – 010) and (101 – 111). $S$ is the vector $(y_0, y_1, y_2, 0, 0, y_5, y_6, y_7)$ and $J$ is $(0, 0, 0, 0.5, 0.5, 0, 0, 0)$. Hence we obtain the matrices $M$ and $J$. With columns corresponding to $y_0, \ldots, y_7$:

$$M = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
= \begin{bmatrix} I & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & I \end{bmatrix}$$


$M$ is partitioned according to the position of the sub-domains of $S$ in the truth table. Since the valid range has two sub-domains, $M$ can be divided into two parts, $M = M_1 + M_2$, where

$$M_1 = \begin{bmatrix} I & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad M_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & I \end{bmatrix}$$

$I$ is the identity matrix, and $M$ is a sparse matrix consisting mostly of zero elements. The don't-care part is

$$J = \operatorname{diag}(0, 0, 0, 1, 1, 0, 0, 0) \times \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0.5 \\ 0.5 \\ 0 \\ 0 \\ 0 \end{bmatrix}
= 0.5 \times \begin{bmatrix} 0 & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & 0 \end{bmatrix} \times \begin{bmatrix} 0 \\ \mathbf{1} \\ 0 \end{bmatrix}$$

where $\mathbf{1}$ denotes a column of ones spanning the don't-care range.

$J$ is thus the product of two matrices. With $M$ and $J$ defined, the AT coefficients are calculated by substituting equation (4) for $f$ in (3):

$$C = T_n \times (M \times S + J) = T_n \times M \times S + T_n \times J \qquad (5)$$
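The decomposition in (4) and the split in (5) can be checked numerically on Example 1. The sketch below is only an illustration: the sample values chosen for $y_0, \ldots, y_7$, the `STAR` marker and the helper names are assumptions, and $M$ is stored as its diagonal rather than as a full matrix.

```python
def transform_matrix(n):
    """T_n from recursion (2), as a list of rows."""
    t = [[1]]
    for _ in range(n):
        size = len(t)
        t = [row + [0] * size for row in t] + [[-v for v in row] + row for row in t]
    return t


def mat_vec(t, v):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in t]


STAR = "*"  # hypothetical marker for a don't-care entry


def decompose(f, dc_value=0.5):
    """Split f into diag(M), S and J so that f = M x S + J, as in (4)."""
    m_diag = [0 if v == STAR else 1 for v in f]
    s = [0 if v == STAR else v for v in f]
    j = [dc_value if v == STAR else 0 for v in f]
    return m_diag, s, j


# Example 1 layout: valid 000-010 and 101-111, don't-care 011-100 (values assumed).
f = [1, 0, 1, STAR, STAR, 0, 1, 1]
m_diag, s, j = decompose(f)
ms = [m * x for m, x in zip(m_diag, s)]

t3 = transform_matrix(3)
whole = mat_vec(t3, [a + b for a, b in zip(ms, j)])   # T_n x (M x S + J)
p = mat_vec(t3, ms)                                   # T_n x M x S
q = mat_vec(t3, j)                                    # T_n x J
split = [a + b for a, b in zip(p, q)]
print(whole == split)  # True
```

Because every value involved is a multiple of 0.5, the floating-point comparison is exact here.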

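To facilitate the multiplication, $T_n$ is partitioned to match the blocks of $M$. For Example 1 ($n = 3$):

$$T_n = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & -1 & -1 & 1 & 0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 & -1 & 1 & 0 & 0 \\
1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 \\
-1 & 1 & 1 & -1 & 1 & -1 & -1 & 1
\end{bmatrix}
= \begin{bmatrix} A & 0 & 0 \\ B & C & 0 \\ D & E & F \end{bmatrix}$$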

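So we get:

$$T_n \times M \times S = T_n \times (M_1 + M_2) \times S = \begin{bmatrix} A & 0 & 0 \\ B & 0 & 0 \\ D & 0 & 0 \end{bmatrix} \times S + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & F \end{bmatrix} \times S = P_1 + P_2$$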

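This result shows that sparse matrix multiplication can be used to reduce the number of multiplications, which leads to fast execution. It is also clear that $T_n \times M$ need not be multiplied out directly: partitioning $T_n$ according to $M$ is enough to obtain the result, which is simply a part of $T_n$. $M$ determines which portion of $T_n$ is used and is never generated explicitly. Now we handle the matrix $J$:

$$T_n \times J = Q = 0.5 \times \begin{bmatrix} A & 0 & 0 \\ B & C & 0 \\ D & E & F \end{bmatrix} \times \begin{bmatrix} 0 & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & 0 \end{bmatrix} \times \begin{bmatrix} 0 \\ \mathbf{1} \\ 0 \end{bmatrix} = 0.5 \times \begin{bmatrix} 0 \\ C\mathbf{1} \\ E\mathbf{1} \end{bmatrix}$$

where $\mathbf{1}$ is a column of ones spanning the don't-care range, so $C\mathbf{1}$ and $E\mathbf{1}$ are the row sums of blocks $C$ and $E$.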
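The block view can be made concrete by multiplying only the column blocks of $T_n$ that each part actually touches. The following sketch assumes the layout of Example 1 with illustrative values for the specified entries; `column_block_product` is a hypothetical helper, and $T_3$ is built densely here only so that the blocks can be sliced out for demonstration.

```python
def transform_matrix(n):
    """T_n from recursion (2), as a list of rows."""
    t = [[1]]
    for _ in range(n):
        size = len(t)
        t = [row + [0] * size for row in t] + [[-v for v in row] + row for row in t]
    return t


def column_block_product(t, start, stop, values):
    """Multiply columns start..stop-1 of t by the matching slice of values."""
    return [sum(row[c] * values[c - start] for c in range(start, stop)) for row in t]


t3 = transform_matrix(3)

# Assumed sample values for the two sub-domains of Example 1.
y_low = [1, 0, 1]    # y0, y1, y2
y_high = [0, 1, 1]   # y5, y6, y7

p1 = column_block_product(t3, 0, 3, y_low)    # uses blocks A, B, D only
p2 = column_block_product(t3, 5, 8, y_high)   # uses block F only (rows above are zero)
q = [0.5 * v for v in column_block_product(t3, 3, 5, [1, 1])]  # 0.5 x row sums of C, E

c = [a + b + d for a, b, d in zip(p1, p2, q)]
print(c)
```

Neither $M$ nor $J$ is ever formed as a matrix: the sub-domain boundaries alone decide which columns of $T_n$ take part.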

This result also shows that no matrix multiplication is required to compute the don't-care part; the result follows from an appropriate partition of $T_n$. The two matrices $M$ and $J$ act like filters that retrieve the corresponding information from the truth table. Fig. 1 outlines the conceptual procedure.

[Figure: the filter matrices M1, M2 and J each select a part of matrix Tn; the selected parts are multiplied with the sub-domains S and the results are added to give the AT.]

Fig. 1. Calculation chart of Example 1

The advantage of introducing $M$ and $J$ is that they guide the generation of $T_n$. It is unnecessary to obtain all elements of $T_n$, because only a portion of $T_n$ takes part in the multiplication in each step. $M$ and $J$ allow the corresponding parts of $T_n$ to be generated separately and fix the positions of the non-zero elements in these parts, replacing the recursive equation (2). The transform of the circuit therefore follows from the decomposition in equation (4), which makes a fast algorithm possible because the auxiliary matrices simplify the multiplication procedure.

4. Fast Algorithm to Generate Transform

From the analysis above we obtain a fast algorithm that is easy to implement on a computer. Fig. 2 describes the algorithm in detail. The algorithm first partitions the truth table into sub-domains and don't-care sets, and obtains their numbers, n and m respectively. It then loops over all sub-domains and invokes a subroutine to obtain part of the AT coefficients. The subroutine Branch_Sub loops over the elements of the sub-domain to generate the non-zero elements in the corresponding rows of Tn. Here the variable "offset" indicates the start position of the sub-domain in the truth table. The Tn part obtained is then multiplied by the sub-domain to give the result matrix P, whose position is adjusted by the offset. The other subroutine, Branch_DC, which calculates the coefficients of the don't-care sets, is similar to Branch_Sub. The final AT coefficients are obtained by adding the two parts P and Q.

1. (S, D, n, m) = Partition (f);
2. for (i=0; i
Set_position (T[k], offset) ; }

Fig. 2. Fast algorithm to generate AT for an incompletely specified Boolean function
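The idea described above can be sketched without ever storing $T_n$. This is not a reproduction of the listing in Fig. 2, nor of Branch_Sub or Branch_DC; it is a minimal sketch that assumes the standard closed form of recursion (2), in which entry $(r, c)$ of $T_n$ is non-zero only when the set bits of $c$ are contained in those of $r$, with sign $(-1)^{|r| - |c|}$. The names `supersets` and `fast_at`, and the `(offset, values)` input format, are hypothetical.

```python
def supersets(j, n):
    """Yield every index r in [0, 2^n) whose set bits include those of j."""
    free = ((1 << n) - 1) & ~j   # bit positions that are 0 in j
    sub = free
    while True:
        yield j | sub
        if sub == 0:
            break
        sub = (sub - 1) & free   # next smaller subset of the free bits


def fast_at(n, sub_domains, dont_cares, dc_value=0.5):
    """AT coefficients from sub-domains [(offset, values)] and don't-care runs [(offset, length)]."""
    coeffs = [0] * (1 << n)

    def add_column(j, value):
        if value == 0:
            return  # a zero truth-table entry contributes nothing
        for r in supersets(j, n):
            # Non-zero entry of T_n at (r, j): sign depends on the parity of the extra bits.
            sign = -1 if bin(r ^ j).count("1") % 2 else 1
            coeffs[r] += sign * value

    for offset, values in sub_domains:      # specified part, P
        for k, v in enumerate(values):
            add_column(offset + k, v)
    for offset, length in dont_cares:       # don't-care part, Q
        for k in range(length):
            add_column(offset + k, dc_value)
    return coeffs


# Example 1 layout with assumed values for the specified entries.
result = fast_at(3, [(0, [1, 0, 1]), (5, [0, 1, 1])], [(3, 2)])
print(result)
```

Only the non-zero entries of the columns that are actually needed are visited, and nothing beyond the coefficient vector is kept in memory, which mirrors the savings the paper attributes to generating parts of $T_n$ on demand.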

We can observe that the auxiliary matrices M and J help generate the non-zero elements in the appropriate parts of Tn. The algorithm also takes advantage of sparse matrix multiplication, so the execution time is short. Note that only parts of the matrix Tn are generated in each loop, which reduces the complexity, and these parts are released immediately after the loop to save memory, so less memory is expected to be required. The time complexity of traditional matrix multiplication is $O(r^2)$, while that of the fast algorithm is around $O(t)$, where $r$ is the number of rows of $T_n$ and $t$ is the number of non-zero elements in $T_n$. Since $t$ is much smaller than $r^2$ (recursion (2) gives $T_n$ exactly $3^n$ non-zero elements, against $r^2 = 4^n$ entries in total), the execution time is cut down greatly.

5. Experimental Results

A) Adder

The first and second adders perform 6-bit and 8-bit addition respectively. The sub-domains of their truth tables are (0 – 3071) and (0 – 59999), so the don't-care sets are (3072 – 4095) and (60000 – 65535). The third and fourth adders carry out 12-bit and 16-bit addition, and the sub-domains of their truth tables are (0 – 10000000) and (1000000000 – 4294967296).

B) Multiplier

Four multipliers are used to perform 6-bit, 8-bit, 12-bit and 16-bit multiplication. The sub-domains cover (1000 – 4095), (10000 – 65536), (0 – 10000000) and (1000000000 – 4294967296) respectively.


Table 1 compares the performance of traditional multiplication and the new algorithm. It shows that the new algorithm is much faster when the function has a large number of input bits. As the number of inputs increases, the execution time of the traditional method rises exponentially while that of the new algorithm rises linearly, so the speed gap also widens.

Table 1. Performance of the fast algorithm for arithmetic circuits

Circuit        Input   AT Terms     Method in [10]       Method in [7]       Our method
                                    Time        Mem      Time       Mem      Time       Mem
Adder 1        12      857          0.8s        0.19M    0.17s      0.17M    0.13s      0.1M
Adder 2        16      6781         6.6s        0.79M    0.28s      0.68M    0.21s      0.72M
Adder 3        24      1278910      380s        8.2M     32s        7M       21s        7.3M
Adder 4        32      207400960    >1500min    980M     98min      730M     69min      760M
Multiplier 1   12      2435         2.6s        0.48M    0.4s       0.39M    0.32s      0.4M
Multiplier 2   16      25037        20.2s       2.8M     2.6s       2.2M     2s         2.5M
Multiplier 3   24      3219378      3912s       246M     261s       213M     192s       220M
Multiplier 4   32      430160758    >8000min    >2G      980min     >2G      780min     >2G

6. Conclusion

As an important representation for digital circuits, the Arithmetic Transform has inherent advantages. Given the truth table of a Boolean function, the traditional, direct way to obtain the circuit transform is matrix multiplication, which takes tremendous execution time. To overcome this limitation and handle incompletely specified circuits, this paper analyzes the structure of the matrix multiplication and proposes a new algorithm that generates the circuit transform efficiently. The complexity is distinctly reduced, and the experiments indicate its better performance.

Acknowledgments

The research reported herein was sponsored largely by the Ministry of Industry and Information Technology of the People's Republic of China under the grant of special projects for internet of things, by the National Natural Science Foundation of China under grant No. 61102075, and by the Natural Science Foundation of Chongqing under grant No. CSTC 2011BB2142.

References

[1] Rolf Drechsler and Bernd Becker, Binary Decision Diagrams: Theory and Implementation, Kluwer Academic Publishers, 1998.
[2] E. Clarke, M. Fujita, P. McGeer, K. L. McMillan, J. Yang and X. Zhao, Multi terminal binary decision diagrams: An efficient data structure for matrix representation, Int'l Workshop on Logic Synth., 1993, 1 – 15.
[3] R. E. Bryant and Y. A. Chen, Verification of Arithmetic circuits with Binary Moment Diagrams, Proc. of 32nd Design Automation Conference, 1995, 535 – 541.
[4] M. Ciesielski, P. Kalla, Z. Zeng and B. Rouzeyre, Taylor Expansion Diagrams: a Compact, Canonical Representation with Applications to Symbolic Verification, Proc. Design Automation & Test in Europe, 2002, 285 – 289.
[5] Yu Pang, Katarzyna Radecka and Zeljko Zilic, Arithmetic Transforms of Imprecise Datapaths by Taylor Series Conversion, 13th IEEE International Conference on Electronics, Circuits and Systems, 2006, 696 – 699.
[6] T. Damarla and M. G. Karpovsky, Reed-Muller spectral techniques for fault detection, IEEE Trans. Comput., 1989, 38, 788 – 797.
[7] B. J. Falkowski and Chip-Hong Chang, Efficient algorithms for the calculation of Walsh spectrum from OBDD and synthesis of OBDD from Walsh spectrum for incompletely specified Boolean functions, Proceedings of the 37th Midwest Symposium on Circuits and Systems, 1994, 1, 393 – 396.
[8] B. J. Falkowski, Calculation of Rademacher-Walsh spectral coefficients for systems of completely and incompletely specified Boolean functions, IEEE International Symposium on Circuits and Systems, 1993, 3, 1698 – 1701.
[9] K. Radecka and Z. Zilic, Arithmetic Transforms for Compositions of Sequential and Imprecise Datapath, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2006, 25, 1382 – 1391.
[10] S. L. Hurst, D. M. Miller and J. C. Muzio, Spectral Techniques in Digital Logic, Academic Press, 1985.
[11] B. J. Falkowski and Chip-Hong Chang, Efficient algorithms for the calculation of arithmetic spectrum from OBDD and synthesis of OBDD from arithmetic spectrum for incompletely specified Boolean functions, IEEE International Symposium on Circuits and Systems, 1994, 1, 197 – 200.