Available online at www.sciencedirect.com
ScienceDirect IFAC PapersOnLine 52-27 (2019) 257–264
Distributed Indication in LUT-Based Asynchronous Logic

Igor Lemberski, Marina Uhanova, Artjoms Suponenkovs

International Radio Astronomy Centre, Ventspils University College, Ventspils, Latvia
{igors.lemberskis, marina.uhanova, artjoms.suponenkovs}@venta.lv

Abstract: A method of designing indicating asynchronous logic targeting Look-Up-Tables (LUTs) is proposed. It produces a dual-rail multi-level network optimized for speed and area. The optimization is done using resubstitution. Initially, a single-rail multi-level network is created using an ABC script. Then each node is transformed into dual-rail logic. The conditions of distributed indication are formulated and a procedure is proposed. For compact network representation and optimization, an extended dual-rail PLA table is used. Resubstitution targeting multi-output logic is formulated and solved as a covering task: the output of the node whose inputs have been selected for resubstitution is split into a set of dichotomies. A set of inputs satisfying the formulated target (optimization for speed or area) is sought to cover the dichotomies. Nodes with zero fan-out are removed. A set of benchmarks is processed and the results are compared. The experiments show that the method significantly improves the results (on average, by 18.7% for speed and 39.3% for area).

© 2019, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. ISSN 2405-8963. DOI 10.1016/j.ifacol.2019.12.648

Keywords: logic synthesis, asynchronous logic, dual-rail function, look-up table (LUT), resubstitution

1. INTRODUCTION

The existing methods of multi-level LUT-oriented synthesis have been developed for synchronous logic and are based on algebraic and functional decompositions. In the algebraic decomposition, a Boolean network is represented as a directed acyclic graph (DAG) and various heuristics for extracting subgraphs to be mapped into LUTs are applied [1,2,3]. For a DAG where each node corresponds to an inverted two-input AND gate (And-Inverter Graph, AIG), rewriting/refactoring procedures based on K-cut enumeration have been proposed and incorporated into the ABC synthesis system [4]. The procedures are oriented on single-output logic and local transformation. In [5], multi-output optimization with a divisor of limited (two-literal) size is proposed.

The functional decomposition is based on the methods of [6,7]. Namely, in [6], the conditions for the simple disjoint decomposition are proposed. Since the decomposition of [7] is capable of producing node functions of a given number of variables, its use is rather attractive for LUT-based implementation.

Quasi-Delay-Insensitive (QDI) asynchronous logic is of special interest, since its correct behaviour is independent of gate and wire delays. The circuit should satisfy the assumption about isochronic forks. The most popular method of QDI logic implementation is based on the sum-of-minterms (SOM) dual-rail representation, where each minterm is implemented as a state-holding (C-) element (Delay-Insensitive Minterm System, DIMS) [8]. The DIMS cost is very high. In [9], it is shown that, in the case of a LUT, a function and a single C-element can be implemented as a whole.

To ensure stability under variable delays, the circuit should be indicative: after the inputs change, the outputs should be able to inform the environment that the circuit is in a stable state and is ready to accept a new input state. In strongly indicative logic [10], the outputs change their state once all inputs have changed their states. In weakly indicative logic, selected outputs may change their state once selected inputs have changed their state; however, all outputs change their state only once all inputs have changed their states.

Multi-level implementations of dual-rail asynchronous logic were proposed in [11,12]. These methods (called NCL-D and NCL-X) are based on circuit decomposition into simple (OR, AND, NOR, NAND, etc.) two-input gates. In NCL-D, each gate function as well as its complement is represented as DIMS. This ensures strongly indicative logic within each module and, therefore, within the whole circuit. The other solution is based on distributing indication between several outputs [13]. In NCL-X, a sum-of-products (SOP) function representation obtained after minimization is allowed. It results in a less expensive implementation. The drawback is that the circuit may be non-indicative. The problem can be solved by using an acknowledgement block [14], which indicates circuit stability.

The state-of-the-art design methods under strong indication are oriented on implementation using DIMS [8], NCL [15], reconfigurable logic [16], complex nodes [17] and direct logic [18].

In [14], two-level (NOR-NOR, NAND-NAND) dual-rail asynchronous logic is offered. Using this approach, a multi-level design method is proposed in [19]: firstly, the initial logic function is decomposed into a single-rail boolean network; secondly, each node is represented as two-level dual-rail logic. The method is based on the modified weakly indicative logic [14], also known as early output: all outputs change state even if not all inputs have changed state. The model consists of functional and acknowledgement blocks. To ensure stability,
disjoint SOP representation is required. Compared to DIMS, the implementation cost is lower since, to some extent, term minimization is allowed.

2. RESUBSTITUTION: PRELIMINARIES

Resubstitution is a procedure of replacing the set of variables a node depends on by another set of variables while preserving the node functionality. Such a replacement is accepted if a better network (for example, simpler in some sense than the given one) is produced. Resubstitution targeting minimization of the node function support is proposed in [20].

In [21], resubstitution based on the concept of permissible functions is presented. Functions are permissible if they do not change the node function value. They are designed based on don't care values. A new input can be added if the node output belongs to the set of permissible functions. Resubstitution targeting area minimization after technology mapping is given in [22].

The current methods are oriented on the network representation using AIG. New k nodes are added to the AIG once this implies removing at least (k+1) nodes [23]. The method is based on algebraic division. Since the procedure is oriented on implementation using simple gates, the number of candidates for resubstitution is rather large. For speed-up and scalability, local optimization based on windowing is applied.

Our method is intended for QDI logic. Logic is represented by a PLA (truth) table, decomposed into BLIF format (Sect.3,4) and an extended PLA (Sect.5). The initial network is close to NCL-X and its features are given in Sect.6. Resubstitution targeting area (number of nodes) and speed (number of nodes in a critical path) (Sect.7) as well as indication (Sect.8) are proposed.

3. PLA AND BLIF REPRESENTATIONS

Single-rail two-level multi-output logic can be represented by a multi-output PLA table of dimension p*(n+m), where p is the number of minterms for which the m functions are described and n is the number of primary variables (fig.1a). Don't care minterms are not listed, which reduces the size of the PLA table. Implicitly, don't cares are taken into account once the PLA table is processed. Multi-level logic is given in BLIF format [24] (fig.1b), where a set of cooperating functions is described. Each function depends on at most a given number of inputs and is represented as a single-output PLA table. The sets of primary inputs and outputs are given in the BLIF header using the keywords .inputs, .outputs. For each PLA table, the input names and the single output name are described using the keyword .names. The decomposition from the initial two-level PLA to multi-level BLIF is done using an existing tool (ABC). The details will follow.

4. SOP-BASED DUAL-RAIL REPRESENTATION TARGETING LUT

4.1. Single-rail logic.
Let f be a single-rail n-variable function: f = f(x0, x0', x1, x1', …, xi, xi', …, xn-1, xn-1'), depending on both positive and negative variables, where xi' is the negation of variable xi. Definition. The input vectors for which function f takes value 1 are called onset vectors, and those for which function f takes value 0 are called offset vectors. Each input vector may be either binary (components 1, 0) or three-valued (components 1, 0, don't care).

Analytically, each three-valued vector is represented as a product term and the function is described as a sum-of-products (SOP). If the function description is based on binary vectors, the representation is called a sum-of-minterms (SOM). For example, the SOP representation of function f0 (fig.1b) is as follows: f0 = x3 n2 ∨ x3' n2' n3.

4.2. Dual-rail logic.
In dual-rail logic, it is supposed that each single-rail variable xi, as well as function f, may be in one of three states: state 1 or 0 (the so-called working states) or undefined (the spacer state). To represent a three-state variable xi, i = 0, 1, …, n-1, two signals xi1 and xi0 are introduced, where xi1 = 1, xi0 = 0 if xi is in state 1; xi1 = 0, xi0 = 1 if xi is in state 0; xi1 = xi0 = 0 if xi is in the spacer state; the combination xi1 = xi0 = 1 is not allowed. Similarly, for a three-state output, two functions f1 and f0 should be introduced. As a result, functions f1, f0 depend on positive variables only: f1 = f1(x00, x01, x10, x11, …, xi0, xi1, …, xn-1,0, xn-1,1), f0 = f0(x00, x01, x10, x11, …, xi0, xi1, …, xn-1,0, xn-1,1). A four-phase behaviour discipline is supposed: to change an input state, the input first resets (changes to the spacer state). As a result, the output state resets too. After that, a new input state sets up, which implies setting a new output state.

4.3. SOP-based Hazard-free Representation
A typical k-input LUT (further, k-LUT) consists of a 2^k x 1 SRAM indexed by the 2^k possible input combinations. Any SOM function of at most k inputs can be implemented using a k-LUT. The LUT structure is carefully designed w.r.t. delays; therefore, 1) the propagation delay is the same independently of the function; 2) switching of an input signal and of its negation (produced inside the LUT) occurs at the same moment. In [25], a hazard-free SOP-based representation targeting LUTs has been proposed. Consider function f0 (fig.1b) and its negation: f0' = x3' n3' ∨ x3' n2 ∨ x3 n2'. In dual-rail, the above functions can be represented as follows: f01 = x31 n21 ∨ x30 n20 n31, f00 = x30 n30 ∨ x30 n21 ∨ x31 n20, and re-written as SOMs [24]: f01 = x31 n21 n31 ∨ x31 n21 n31' ∨ x30 n20 n31, f00 = x30 n21 n30 ∨ x30 n20 n30 ∨ x30 n21 n30' ∨ x31 n20 n30' ∨ x31 n20 n30, where n30', n31' are negations produced inside the LUT. The table representation is shown in fig.2. Additional lines (4-8 in the left part and 6-10 in the right part), where each input signal is ANDed with the feedback one, are introduced to implement state-holding (C-) elements; f-1, f-0 are the feedbacks of signals f01, f00. One can see that two 6-LUTs (including the inputs used to arrange feedbacks) are required to implement functions f01, f00. The SOP-based dual-rail BLIF is given in fig.3 (for simplicity, the additional lines for C-element implementation are omitted).
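The relation between the single-rail function f0, its dual-rail SOP form and the SOM form can be checked by enumeration. The following is only a minimal sketch under stated assumptions: the helper names (rails, f0_single, f01_sop, and so on) are hypothetical and not taken from the paper, and the in-LUT negations n31', n30' are modelled simply as 1 - n31 and 1 - n30. It confirms that, on all working states of x3, n2, n3, both forms produce the dual-rail code of f0, and that both outputs are 0 in the spacer state.

```python
from itertools import product


def rails(bit):
    """Dual-rail code of a working-state bit as (rail1, rail0)."""
    return (1, 0) if bit else (0, 1)


def f0_single(x3, n2, n3):
    # Single-rail f0 = x3 n2 v x3' n2' n3 (fig.1b)
    return (x3 & n2) | ((1 - x3) & (1 - n2) & n3)


def f01_sop(x31, x30, n21, n20, n31, n30):
    return (x31 & n21) | (x30 & n20 & n31)


def f00_sop(x31, x30, n21, n20, n31, n30):
    return (x30 & n30) | (x30 & n21) | (x31 & n20)


def f01_som(x31, x30, n21, n20, n31, n30):
    not_n31 = 1 - n31  # assumption: models the negation produced inside the LUT
    return (x31 & n21 & n31) | (x31 & n21 & not_n31) | (x30 & n20 & n31)


def f00_som(x31, x30, n21, n20, n31, n30):
    not_n30 = 1 - n30  # assumption: models the negation produced inside the LUT
    return ((x30 & n21 & n30) | (x30 & n20 & n30) | (x30 & n21 & not_n30)
            | (x31 & n20 & not_n30) | (x31 & n20 & n30))


def check_working_states():
    """True if SOP and SOM forms both encode f0 on every working state."""
    for x3, n2, n3 in product((0, 1), repeat=3):
        signals = rails(x3) + rails(n2) + rails(n3)
        expected = rails(f0_single(x3, n2, n3))
        if (f01_sop(*signals), f00_sop(*signals)) != expected:
            return False
        if (f01_som(*signals), f00_som(*signals)) != expected:
            return False
    return True


SPACER = (0, 0, 0, 0, 0, 0)  # all rails low

if __name__ == "__main__":
    print("working states consistent:", check_working_states())
    print("spacer outputs:", f01_som(*SPACER), f00_som(*SPACER))
```

The check is restricted to working states on purpose: when some input is still in the spacer state, the SOP and SOM forms of f00 need not agree (for example, x30 = n30 = 1 with n2 in the spacer state).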
a) PLA table:

    x0 x1 x2 x3 | f0
     0  0  0  0 |  0
     1  0  0  0 |  1
     1  0  1  0 |  1
     1  1  1  1 |  1
     0  1  1  1 |  0
     1  0  1  1 |  0
     1  0  0  1 |  0

b) Single-rail BLIF:

    .inputs x0 x1 x2 x3
    .outputs f0
    .names x3 n2 n3 f0
    11- 1
    001 1
    .names x0 x1 x2 n2
    111 1
    .names x0 x1 n3
    10 1

Fig.1. PLA table (a), single-rail BLIF (b)
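The relation between the two representations of fig.1 can be checked with a few lines of Python. This is only an illustrative sketch: the names NODES, ORDER, eval_node and eval_network are hypothetical, and the onset cube "10" of node n3 (n3 = x0 x1') is an assumption inferred from the dual-rail values in table 1.

```python
# Single-rail network of fig.1b: node -> (input names, onset cubes).
# '-' in a cube is a don't care. The n3 cube "10" is an assumption.
NODES = {
    "n2": (["x0", "x1", "x2"], ["111"]),
    "n3": (["x0", "x1"], ["10"]),
    "f0": (["x3", "n2", "n3"], ["11-", "001"]),
}
ORDER = ["n2", "n3", "f0"]  # topological order of the nodes


def eval_node(inputs, cubes, values):
    """Return 1 if the current input values match any onset cube."""
    bits = [values[name] for name in inputs]
    return int(any(
        all(c == "-" or int(c) == b for c, b in zip(cube, bits))
        for cube in cubes
    ))


def eval_network(minterm):
    """Evaluate f0 for a primary input minterm given as a string x0x1x2x3."""
    values = {f"x{i}": int(ch) for i, ch in enumerate(minterm)}
    for name in ORDER:
        inputs, cubes = NODES[name]
        values[name] = eval_node(inputs, cubes, values)
    return values["f0"]


# PLA table of fig.1a: minterm -> f0
PLA = {"0000": 0, "1000": 1, "1010": 1, "1111": 1,
       "0111": 0, "1011": 0, "1001": 0}

# Minterms on which the multi-level network disagrees with the PLA table
mismatches = [m for m, f in PLA.items() if eval_network(m) != f]
```

An empty mismatches list means that the decomposed network reproduces the PLA table on every listed minterm; don't care minterms are not checked.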
function is described by its onset and represented as a PLA table with one output and a set X' of inputs, X' ⊆ X ∪ G ∪ F. To take advantage of the PLA format, it is expedient to describe the boolean network by a so-called extended PLA table. The procedure of the extended PLA table design is as follows. Table 1. Dual-rail extended PLA
Note that the number of inputs supplying the LUTs is smaller than in DIMS [9], where 7-LUTs (including feedback inputs) are required to implement functions f00, f01. It will be shown experimentally that, compared to DIMS, our method produces logic of lower complexity.
    Line No | x01 x00 x11 x10 x21 x20 x31 x30 | n21 n20 n31 n30 | f01 f00
       1    |  1   0   0   1   0   1   0   1  |  0   1   1   0  |  1   0
       2    |  1   0   0   1   1   0   0   1  |  0   1   1   0  |  1   0
       3    |  1   0   1   0   1   0   1   0  |  1   0   0   1  |  1   0
       4    |  0   1   0   1   0   1   0   1  |  0   1   0   1  |  0   1
       5    |  0   1   1   0   1   0   1   0  |  0   1   0   1  |  0   1
       6    |  1   0   0   1   1   0   1   0  |  0   1   1   0  |  0   1
       7    |  1   0   0   1   0   1   1   0  |  0   1   1   0  |  0   1
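The rows of the dual-rail extended PLA can be reproduced with a short Python sketch. It is a simplification under stated assumptions: instead of matching embedded vectors against node onsets, the node functions are evaluated directly, which gives the same columns for this small example. The helper names dual and extended_row are hypothetical, and n3 = x0 x1' is assumed.

```python
def dual(bit):
    """Dual-rail code of a single-rail value: 1 -> (1, 0), 0 -> (0, 1)."""
    return (1, 0) if bit else (0, 1)


def extended_row(x0, x1, x2, x3):
    """One row of the extended PLA: x rails, then n2, n3 rails, then f0 rails."""
    n2 = x0 & x1 & x2                      # node n2 of fig.1b
    n3 = x0 & (1 - x1)                     # node n3 of fig.1b (assumed cube "10")
    f0 = (x3 & n2) | ((1 - x3) & (1 - n2) & n3)
    row = []
    for bit in (x0, x1, x2, x3, n2, n3, f0):
        row.extend(dual(bit))
    return row


# Minterms in the row order of table 1 (onset of f0 first, then offset)
minterms = ["1000", "1010", "1111", "0000", "0111", "1011", "1001"]
table = [extended_row(*map(int, m)) for m in minterms]
```

Each row has 14 entries in the column order x01 x00 ... x31 x30 n21 n20 n31 n30 f01 f00.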
4.4. Indication in the set and reset phases.

Once the inputs are set, the proposed SOP-based logic operates under the early output discipline. Therefore, it may be non-indicative. For example, once the inputs x31, n21 of function f01 are set ("not all" inputs, since n31 is excluded), function f01 sets up independently of the value of input n31: f01=1 ("all" outputs change the state). As a result, input n31 is not acknowledged by the function. Note that resetting all inputs is acknowledged by resetting the LUT output (since each input is supported by a state-holding element). Therefore, during the reset the logic operates as a strongly indicative one, and indication should be ensured in the set phases only.

    Line | x31 x30 n21 n20 n31 | f-1 | f01
      1  |  1   0   1   0   1  |  -  |  1
      2  |  1   0   1   0   0  |  -  |  1
      3  |  0   1   0   1   1  |  -  |  1
      4  |  1   -   -   -   -  |  1  |  1
      5  |  -   1   -   -   -  |  1  |  1
      6  |  -   -   1   -   -  |  1  |  1
      7  |  -   -   -   1   -  |  1  |  1
      8  |  -   -   -   -   1  |  1  |  1

    Line | x31 x30 n21 n20 n30 | f-0 | f00
      1  |  0   1   0   1   1  |  -  |  1
      2  |  0   1   1   0   1  |  -  |  1
      3  |  0   1   1   0   0  |  -  |  1
      4  |  1   0   0   1   1  |  -  |  1
      5  |  1   0   1   0   0  |  -  |  1
      6  |  1   -   -   -   -  |  1  |  1
      7  |  -   1   -   -   -  |  1  |  1
      8  |  -   -   1   -   -  |  1  |  1
      9  |  -   -   -   1   -  |  1  |  1
     10  |  -   -   -   -   1  |  1  |  1

Fig.2. Functions f01, f00 SOP-based implementation targeting LUT
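The state-holding behaviour of the left table of fig.2 can be illustrated by a small simulation. The sketch below is an assumption-laden model, not the authors' implementation: the LUT output is 1 if the inputs match one of the onset rows 1-3, or if the feedback is 1 and at least one input is still 1 (rows 4-8). The names ONSET_F01, lut_f01 and run are hypothetical.

```python
# Onset rows 1-3 of the f01 table in fig.2, columns (x31, x30, n21, n20, n31)
ONSET_F01 = {(1, 0, 1, 0, 1), (1, 0, 1, 0, 0), (0, 1, 0, 1, 1)}


def lut_f01(inputs, feedback):
    """LUT output: set on an onset row, hold while any input is 1 and feedback is 1."""
    if tuple(inputs) in ONSET_F01:
        return 1
    return int(feedback == 1 and any(inputs))


def run(sequence):
    """Apply input vectors one by one, feeding the output back each step."""
    out, trace = 0, []
    for vec in sequence:
        out = lut_f01(vec, out)
        trace.append(out)
    return trace


trace = run([
    (0, 0, 0, 0, 0),  # spacer
    (1, 0, 0, 0, 0),  # x31 arrives, no onset row matched yet
    (1, 0, 1, 0, 0),  # n21 arrives: row 2 matched, output sets before n31
    (1, 0, 1, 0, 1),  # n31 arrives, output stays set
    (0, 0, 1, 0, 1),  # reset starts: output is held by the feedback rows
    (0, 0, 0, 0, 1),
    (0, 0, 0, 0, 0),  # all inputs reset: output resets
])
```

The trace shows both effects discussed in the text: the output sets early (input n31 is not acknowledged in the set phase), while in the reset phase it falls only after every input has returned to 0.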
1) Nodes are arranged in topological order: primary inputs are assigned to level 0; nodes depending on signals of level (i-1) or less are assigned to level i.
2) Node functions of level i are designed based on the node outputs of levels 0,1,…, i-1 and added to the PLA table.
3) For each row of the extended PLA table, the embedded vector of X' variables is checked for equality with an onset vector of X' variables of the node function gi. If they are equal, value 1 is put on the cross point of this row and column gi. Otherwise, value 0 is put.

5.2. Example

Given the single-rail PLA (fig.1a). To create the extended PLA, a dual-rail PLA table is designed first (it results in additional columns x10-x30 in table 1). Let us create column n30. The embedded vectors of the extended PLA table that are equal to minterms of the node function are shown in bold. Value 1 is put in the proper rows of column n30. The remaining columns are created similarly.

6. LUT-ORIENTED MULTI-LEVEL IMPLEMENTATION AND OPTIMIZATION
5.1. Extended PLA table
The initial multi-level network is close to NCL-X. Its features are as follows: 1) in single-rail, each node is a complex (AND-OR) function [17] of a given number of variables, in contrast to [11,12], where each node is a simple gate (AND, OR, NAND etc.); 2) the initial (single-rail) network is produced using an ABC script; 3) a node function is transformed into a dual-rail SOP-based structure [25], which may not be indicative. Instead of an acknowledgement block, distributed indication is used. For a library of k-LUTs, the number of inputs of each node in dual-rail should not exceed (k-1) (one input is required to arrange the feedback). Therefore, the single-rail network should contain nodes of at most (k-1)/2 inputs. As a result of the decomposition, the LUT utilization ratio (the number of LUT inputs supplied by signals / the total number of LUT inputs) may be low, since 1) some nodes depend on fewer than (k-1)/2 variables; 2) in dual-rail, at least one LUT input will always be unused if (k-1) is an odd number. The utilization ratio is improved during the resubstitution.
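The input limits above reduce to simple arithmetic, sketched here in Python. The function names are hypothetical, and counting the feedback input as a used LUT input is an assumption made for illustration.

```python
def single_rail_fanin_limit(k):
    """Maximal single-rail node fan-in for a k-LUT: (k-1)/2, rounded down."""
    return (k - 1) // 2


def dual_rail_lut_inputs(n):
    """LUT inputs used by a dual-rail node with n single-rail inputs:
    2n rails plus one feedback input (assumed to count as used)."""
    return 2 * n + 1


def utilization(nodes_fanin, k):
    """Used LUT inputs divided by all LUT inputs, one k-LUT per node."""
    used = sum(dual_rail_lut_inputs(n) for n in nodes_fanin)
    return used / (k * len(nodes_fanin))


limit = single_rail_fanin_limit(6)      # 6-LUT library, as in the experiment
ratio = utilization([2, 2, 1], k=6)     # three hypothetical nodes
```

For k=6 the limit is 2 single-rail inputs, so a full node uses 5 of the 6 LUT inputs; the one unused input illustrates the case where (k-1) is odd.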
In dual-rail, multi-output function F, |F|=2m, depends on set X of inputs: |X|=2n. Given dual-rail BLIF, where each node
Since the complex nodes are supposed, the number of nodes in the boolean network and therefore, inputs-candidates for
    .inputs x01 x11 x21 x31 x00 x10 x20 x30 n21 n20 n31 n30
    .outputs f01 f00
    .names x31 x30 n21 n20 n31 f01
    10101 1
    10100 1
    01011 1
    .names x01 x11 x21 n21
    111 1
    .names x01 x10 n31
    11 1
    .names x31 x30 n21 n20 n30 f00
    01011 1
    01101 1
    01100 1
    10011 1
    10100 1
    .names x00 x10 x20 n20
    100 1
    110 1
    101 1
    010 1
    011 1
    001 1
    111 1
    .names x00 x11 n30
    10 1
    01 1
    11 1

Fig.3. SOP-based BLIF
5. EXTENDED DUAL-RAIL PLA TABLE
the resubstitution are smaller than in [23]. Therefore, more variants of resubstitution can be checked, which significantly improves the result. The initial single-rail multi-level network is produced using an ABC script and further optimized by applying resubstitution followed by indication and transformation targeting LUTs. In the next sections, the details of the resubstitution procedure are considered and pseudocode is given.

7. RESUBSTITUTION: OUR APPROACH

7.1. Dichotomies and a covering table

Suppose one needs to resubstitute input gi of node gj while preserving the functionality of node gj. The latter means that for each pair of onset and offset minterms there exists at least one input distinguishing this pair.

Definition. A dichotomy is a vector that takes 1) value 1 in one position; 2) value 0 in one other position; and 3) don't cares in the remaining positions.

Assume that the onset of function g has anum minterms and its offset contains bnum minterms. A set of anum x bnum dichotomies is created. Each dichotomy is associated with a pair (i,j) of the i-th onset and j-th offset minterms, 1 ≤ i ≤ anum, anum+1 ≤ j ≤ anum+bnum, and contains value 1 in the i-th position, value 0 in the j-th position, and don't care values in the remaining positions. A set of dichotomies can be presented as a table of size (anum x bnum) x (anum+bnum), where each column is associated with a dichotomy and each row with an on/offset vector. Function f0 (fig.2a) dichotomies are given in table 2. A covering table is a two-dimensional matrix, where each column is associated with a dichotomy and each row with an input. In dual-rail, the covering table is designed based on the dual-rail extended PLA (table 3). Note that in dual-rail logic, the function is represented as a set of onset minterms, where each minterm depends on positive variables.
This implies the following covering rule: if a dual-rail variable x, x ∈ {x1, x0}, distinguishes dichotomy (i,j), where i is the onset minterm and j the offset one, it should take value 1 in the i-th onset minterm and 0 in the j-th offset minterm. The negative function is represented as a set of offset minterms. Therefore, to distinguish the above-mentioned dichotomy (i,j), variable x should take value 1 in the j-th offset minterm and 0 in the i-th onset minterm. The covering table is given in table 3.

7.2. Optimization for speed

Let us express the delay between the inputs and any output as the number of nodes in a path.

Definition. A path between the inputs and an output on which a signal propagates through the maximal number of nodes is called a critical path.

Optimization for speed is based on minimizing the number of nodes in the critical path. Suppose node gj is on a critical path. It is supplied with input(-s) of maximal level L. The resubstitution replaces the input(-s) of level L by ones of maximal level L', L' < L, while preserving the node functionality.
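The notion of level used in the speed optimization can be sketched in Python for the single-rail network of fig.1b. The names FANIN, level and critical_inputs are hypothetical; the sketch only identifies which inputs of a node are candidates for replacement and does not perform the resubstitution itself.

```python
# Fan-in lists of the network in fig.1b; signals without an entry are primary inputs.
FANIN = {
    "n2": ["x0", "x1", "x2"],
    "n3": ["x0", "x1"],
    "f0": ["x3", "n2", "n3"],
}


def level(signal):
    """Primary inputs have level 0; a node is one level above its deepest input."""
    if signal not in FANIN:
        return 0
    return 1 + max(level(s) for s in FANIN[signal])


def critical_inputs(node):
    """Inputs of maximal level L: the candidates for resubstitution."""
    top = max(level(s) for s in FANIN[node])
    return [s for s in FANIN[node] if level(s) == top]


depth = level("f0")
candidates = critical_inputs("f0")
```

Here the critical path to f0 contains two nodes; replacing the level-1 inputs n2, n3 by level-0 signals would reduce it to one node, provided the node functionality can be preserved.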
7.3. Optimization for area

Area optimization is based on minimizing the number of nodes. Given node gj supplied by the output of node gi, whose fan-out is n0. Once node gi is replaced, its fan-out n0 decreases: n0=n0-1 (fig.3). The resubstitution is applied iteratively, either until n0=0 is achieved or until it is clear that fan-out n0 cannot be reduced further. If n0=0, node gi is removed, which decreases the fan-out of the node supplying node gi: n1=n1-1. This node is also removed if n1=0, and so on. Two strategies of selecting sets of inputs as candidates for the cover are applied. If n0≠1, such inputs are first sought among the set of primary inputs and the outputs of nodes which have already been considered for resubstitution, but whose fan-outs did not reach value 0. If the cover is not successful, the outputs of the remaining nodes are also considered as candidates for the cover. If priority is given to the number of levels, only those nodes which do not increase the critical path should be involved in the covering procedure.

Table 2. Function f01 dichotomies

    Line No | Dichotomy No
            |  1  2  3  4  5  6  7  8  9 10 11 12
       1    |  1  1  1  1  -  -  -  -  -  -  -  -
       2    |  -  -  -  -  1  1  1  1  -  -  -  -
       3    |  -  -  -  -  -  -  -  -  1  1  1  1
       4    |  0  -  -  -  0  -  -  -  0  -  -  -
       5    |  -  0  -  -  -  0  -  -  -  0  -  -
       6    |  -  -  0  -  -  -  0  -  -  -  0  -
       7    |  -  -  -  0  -  -  -  0  -  -  -  0
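The construction of a dichotomy table such as table 2 can be sketched as follows. The function name dichotomies and the string encoding of a column are illustrative assumptions.

```python
def dichotomies(anum, bnum):
    """List ((i, j), column) for every onset/offset pair.

    Lines 1..anum are onset minterms, lines anum+1..anum+bnum are offset
    minterms. A column holds '1' at line i, '0' at line j, '-' elsewhere.
    """
    table = []
    for i in range(1, anum + 1):
        for j in range(anum + 1, anum + bnum + 1):
            column = ["-"] * (anum + bnum)
            column[i - 1] = "1"
            column[j - 1] = "0"
            table.append(((i, j), "".join(column)))
    return table


# f01 in table 1: 3 onset lines and 4 offset lines give 12 dichotomies
d = dichotomies(3, 4)
```

With this ordering, dichotomy No 4 is the pair (1,7), i.e. d[3].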
Suppose that function gi is given and its input x is to be resubstituted. After removing input x, procedure FindMinCover(gj,x,k) (fig.5) checks for uncovered dichotomies and extracts the set of inputs which satisfy an optimization criterion (speed, see subsection 7.2, or area, see subsection 7.3) and therefore can potentially resubstitute input x. The input of this set which covers the maximal number of dichotomies is selected for covering. If no input is selected or the fan-in of node gj equals (k-1), the procedure returns FALSE. Otherwise, the covered dichotomies are marked; the procedure iterates until all dichotomies are covered, and function gi after the resubstitution is returned.

7.4. Example

Based on table 3, let us try to cover the dichotomies using the set of primary inputs. One can see that dichotomy 4 is covered by the single input x30. Dichotomies 1,10 are covered by inputs x01,x21,x31. Input x01 is extracted since it covers the maximal number of uncovered dichotomies. Finally, the remaining dichotomies are covered by input x11 (the cover is shown in bold).

8. DISTRIBUTED INDICATION

Definition. A multi-level circuit is called indicative if any change of a node input (both primary and internal) is acknowledged by a change of the node output (either primary or internal). As a result, all primary inputs are acknowledged by the primary outputs.
Given: 1) a node function gi(Xi), where Xi is a set of signals generating input states; 2) {A,B,C,…}, a set of input states; 3) transitions between them. Suppose the logic operates under the four-phase behaviour discipline, where state A is the spacer (all input signals are at zero: 00..0) and (B,C,…) are working states (at least one input signal is at 1). As shown in Sect. 4.4, indication should be checked and, if necessary, ensured in the set phases only. Consider a transition between states A and B.

Definition. The states which occur during the transition between states A and B are called transition states. Denote the set of transition states as (A,B).

Definition. The transitions are said to have mutually exclusive transition states if C ∉ (A,B) for each state C, C≠B, which can appear after state A.

Logic with mutually exclusive transition states is indicative.

Theorem 1. Logic has mutually exclusive transition states iff there is no pair of working states (say, B, C) which are in the relation B ⊃ C.
Bool FindMinCover(gj, x, k)
{ while uncovered dichotomies exist
  { if (fan_in = k-1) then
      return false
    else
    { extract the set of inputs satisfying an optimization criterion
      from the above set, select the input which covers the maximal number of dichotomies
      if no input is selected then return false
      else
      { mark_covered_dichotomies
        fan_in = fan_in + 1
      }
    }
  }
  return gi
}

Fig.5. Covering procedure
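A greedy cover in the spirit of fig.5 can be sketched in Python on the data of table 1. This is a simplified sketch under stated assumptions: the candidate set is passed in explicitly instead of being derived from a speed or area criterion, ties are broken by column order, and the names COLS, ROWS, covered_by and find_min_cover are hypothetical.

```python
COLS = ["x01", "x00", "x11", "x10", "x21", "x20",
        "x31", "x30", "n21", "n20", "n31", "n30"]
ROWS = [  # input part of table 1; lines 1-3 are the onset of f01, lines 4-7 the offset
    "100101010110", "100110010110", "101010101001",
    "010101010101", "011010100101", "100110100110", "100101100110",
]
ONSET, OFFSET = ROWS[:3], ROWS[3:]


def covered_by(col):
    """Dichotomies (i, j) where the input is 1 on onset line i and 0 on offset line j."""
    c = COLS.index(col)
    return {(i, j)
            for i, on in enumerate(ONSET, 1)
            for j, off in enumerate(OFFSET, len(ONSET) + 1)
            if on[c] == "1" and off[c] == "0"}


def find_min_cover(candidates, max_fanin):
    """Greedy cover; returns the chosen inputs or None if no cover is found."""
    uncovered = {(i, j) for i in range(1, 4) for j in range(4, 8)}
    chosen = []
    while uncovered:
        if len(chosen) == max_fanin:
            return None
        best = max(candidates, key=lambda c: len(covered_by(c) & uncovered))
        gain = covered_by(best) & uncovered
        if not gain:
            return None
        chosen.append(best)
        uncovered -= gain
    return chosen


primary = [c for c in COLS if c.startswith("x")]
cover = find_min_cover(primary, max_fanin=5)
```

On this data the sketch selects x01, then x30, then x11, the same three primary inputs as in the example of subsection 7.4, although the order of selection may differ from the one described there.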
    x1* | x1 x2 x3 x4 | gi
     0  |  1  0  1  0 |  1
     1  |  1  1  0  1 |  1
     0  |  0  0  1  0 |  1

Fig.6. Indication: example (a); distributed indication (b)

Fig.4. Resubstitution
Proof. Necessity. Suppose that states B, C are in the relation B ⊃ C. It means that the inputs which take value 1 in state C also take value 1 in state B, and there exists at least one input (say, x) which takes value 1 in state B and value 0 in state C. Therefore, between states A and B there exists a transition state which coincides with state C. As a result, input x is not indicated.

Table 3. Covering table (rows: inputs x01 x00 x11 x10 x21 x20 x31 x30 n21 n20 n31 n30; columns: dichotomies No 1-12; a cross marks each dichotomy covered by the input)
Sufficiency. Suppose, states B, C are not in the above relation. It means, that there are two variables (say, x, y), where x accepts value 1 in state B and value 0 in state C and y accepts value 1 in state C and value 0 in state B. Therefore, state C can’t be a transition one between states A and B, since input y never accepts value 1. ∎
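The condition of Theorem 1 can be checked mechanically. The sketch below uses the working states of function gi from fig.6a; the function names contains, non_exclusive_pairs and unindicated are hypothetical.

```python
def contains(b, c):
    """True if B ⊃ C: every 1 of state c is a 1 of state b, and b differs from c."""
    return b != c and all(bb == "1" for bb, cc in zip(b, c) if cc == "1")


def non_exclusive_pairs(states):
    """All ordered pairs (B, C) of working states with B ⊃ C."""
    return [(b, c) for b in states for c in states if contains(b, c)]


def unindicated(b, c):
    """Positions that are 1 in B and 0 in C: inputs not indicated in state B."""
    return [k for k, (bb, cc) in enumerate(zip(b, c)) if bb == "1" and cc == "0"]


working_states = ["1010", "1101", "0010"]   # rows of fig.6a, inputs x1 x2 x3 x4
pairs = non_exclusive_pairs(working_states)
bad_inputs = [unindicated(b, c) for b, c in pairs]
```

An empty pairs list would mean that the transition states are mutually exclusive. Here the single pair (1010, 0010) is found, and position 0 (input x1) is reported as not indicated.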
Example. Consider function gi(x1,x2,x3,x4) (fig.6a). The set (A,B), where A=0000 (spacer state) and B=1010 (1st row), is (A,B)={1000, 0010}. One can find state C, C=0010 (3rd row), where B ⊃ C, and therefore state C can appear after state A, C ∈ (A,B). As a result, input x1 is not indicated. Note that due to the relation B ⊃ C, the function SOP gi = x1x3 v x1x2x4 v x3 contains a redundant (1st) term, which will be removed during the transformation into SOM targeting LUT (Sect.4.3).

Corollary 1. If a node is supplied by input x and its negation y, or is supplied by input x only and x takes value 1 in all states, then input x is indicated.

Proof. Indeed, input x is not indicated if there are states B, C, where: 1) input x takes value 1 in state B and value 0 in state C, and 2) B ⊃ C. Suppose condition 1) holds; then input y takes value 0 in state B and value 1 in state C. Therefore, condition 2) does not hold. Now suppose that condition 2) holds. To ensure it, input x and its negation y should take the same values in both states B, C. This violates condition 1). Furthermore, if input x takes value 1 in all states, then condition 1) never holds and therefore the presence of its negation (input y) is not required. ∎

Once states B, C are in the above relation, vector B ⊕ C determines (by its ones) the inputs which are not indicated. In our example, B ⊕ C = 1000; therefore, input x1 in state B is not indicated. To keep the information whether or not any input x, x ∈ X', is indicated by output g, vector x*(g) is created, where the states of input x which are not indicated are switched to 0 (for input x1, see fig.6a). Usually, more than one node (a set G' of nodes) is supplied by input x (fig.6b). Therefore, distributed indication is assumed. In this case, vector x*(g) is created for each node g ∈ G'. The distributed indication is checked based on the following equality:

\bigvee_{g \in G'} \bigl( g \,\&\, x^{*}(g) \bigr) = x, \qquad (1)

where G' is the set of nodes supplied by x, G' ⊂ G. Note that condition x*(g)=x holds for a node g satisfying corollary 1, and the following equality holds:

x \,\&\, \Bigl( \bigvee_{g \in G'} g \Bigr) = x, \qquad (2)

where G' is the set of nodes satisfying corollary 1.
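Equalities (1) and (2) can be evaluated on vectors indexed by input states. The sketch below is an illustration under stated assumptions: vectors are Python lists of 0/1 values, the first node uses the columns x1, gi and x1* of fig.6a, and the second node g2 with its vector x1*(g2) is purely hypothetical. The function names are hypothetical as well.

```python
def equality_1(x, nodes):
    """Equality (1): OR over nodes of (g & x*(g)) equals x.

    nodes is a list of pairs (g, x_star), all vectors over the same input states.
    """
    acc = [0] * len(x)
    for g, x_star in nodes:
        acc = [a | (gv & sv) for a, gv, sv in zip(acc, g, x_star)]
    return acc == list(x)


def equality_2(x, gs):
    """Equality (2), the rough check: x & (OR of all g) equals x."""
    union = [0] * len(x)
    for g in gs:
        union = [u | gv for u, gv in zip(union, g)]
    return [xv & u for xv, u in zip(x, union)] == list(x)


x1 = [1, 1, 0]            # column x1 of fig.6a over its three rows
gi = [1, 1, 1]            # column gi of fig.6a
x1_star_gi = [0, 1, 0]    # column x1* of fig.6a
g2 = [1, 0, 0]            # hypothetical second node supplied by x1
x1_star_g2 = [1, 0, 0]    # hypothetical: x1 indicated by g2 in the first state

single = equality_1(x1, [(gi, x1_star_gi)])
distributed = equality_1(x1, [(gi, x1_star_gi), (g2, x1_star_g2)])
rough = equality_2(x1, [gi])
```

With node gi alone the rough check (2) passes but equality (1) fails, because x1 is not indicated in the first state; adding the hypothetical node g2 makes equality (1) hold.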
9. PROCEDURES DESCRIPTION

The indication procedure (fig.7) takes as input the circuit description in BLIF format as well as the LUT input limitation (k) (line 1). Firstly, indication is checked based on corollary 1 (lines 11-15) by extracting a pair of nodes generating a function g1 and its negation g0. Otherwise, the set of nodes (lines 4,5) satisfying the condition of corollary 1 is added and indication is checked again using equality (2) (line 17). If the indication of input x is not successful, the sets of nodes described in lines 6,7 and lines 8,9 are added (lines 20, 24) and indication is checked. If the negation of input x does not exist in BLIF (removed during optimization) (line 29), a rough check (to speed up the procedure) is done based on equality (2) (line 30). If it is successful, equality (1) is checked (line 31). Otherwise, the set of nodes of at most (k-2) inputs which are not supplied by input x (lines 33,34) is added and supplied with x, and indication is checked. If not successful, the procedure returns FALSE.

Procedure Resub_speed (fig.8) minimizes a critical path. Within a critical path, node gi is extracted and its inputs of the maximal level are resubstituted by inputs of lower levels (lines 7,8). The obtained circuit is checked for indication. If successful, it replaces the current BLIF (lines 9,10) and the procedure searches for a new critical path. Otherwise, the next node in the critical path is checked. If all nodes have been checked but the critical path level has not been reduced, the hazard-free representation targeting LUT (Sec.4.3) is produced and BLIF is returned.

Procedure Resub_area (fig.9) is similar to Resub_speed. Firstly, node gi of the minimal fan-out is extracted and the inputs implied by the output of gi are resubstituted. If it results in a zero fan-out, the node is removed. Secondly, the obtained circuit is checked for indication. If successful, it replaces the current BLIF and the procedure searches for a non-checked node of the minimal fan-out. Otherwise, the procedure continues searching for non-checked nodes. Once all nodes are checked, the hazard-free representation targeting LUT (Sec.4.3) is produced and BLIF is returned. Note that the initial BLIF obtained as a result of the ABC decomposition can always be transformed into indicating logic once each single-rail node is converted into a dual-rail one and represented as DIMS. Therefore, an indicating BLIF can always be obtained.
10. EXPERIMENT We processed a set of MCNC91 benchmarks. The library consists of single-output 6-LUTs. The summary of the experiment is given in table 4. In the 1st column, names of
1  Bool Indication(blif, k)
2  /* input y - negation of input x, */
3  /* node function g0 - negation of node function g1, */
4  /* {gi1,gi2,....} - nodes supplied by inputs x,y or
5     input x only, if it accepts value 1 in all input states, */
6  /* {gi3, gi4…} - nodes of (k-2) inputs at most and
7     supplied by input x and not supplied by its negation y, */
8  /* {gi5, gi6…} - nodes of (k-3) inputs at most and
9     not supplied by x,y */
10 {while there exists non-checked input x
11  {if input y exists then
12    if (both inputs x, y supply nodes g1, g0) OR
13       (input x only supplies node g1/g0 AND
14        accepts value 1 in all input states of function g1/g0 respectively)
15    then continue
16    else
17     if equality x&(gi1 v gi2 v …)= x holds, then continue
18     else
19      {find non-indicated states: t= x¬(x&(gi1 v gi2 v....))
20       search for indication by node functions gi3,gi4…
21       if successful, then supply nodes gi3,gi4… with input y
22        continue
23       else
24        {search for indication together with nodes gi5,gi6…
25         if successful, then supply nodes with inputs x,y
26          continue
27         else return false
28      }}
29   else
30    {search for nodes supplied by x and check equality (2)
31     if equality(2) holds, then check equality (1)
32     else
33      {search for nodes of (k-2) inputs at most and not
34       supplied by x,
35       supply nodes with input x and check equality (1)
36      }
37     if equality(1) holds, then continue else return false
38  }}return true
39 }

Fig.7. Indication procedure
benchmarks are listed. The decomposition is done using option n=(6-1)/2=2. The following ABC script is applied: fx, st. For further comparison, the NCL-D network is produced; the number of levels in the critical path and the number of nodes are listed in the next two columns. Then the NCL-X network is created. The resubstitution for speed, followed by the resubstitution for area while preserving the achieved critical path length, is done using option k=5. Next, indication is done. Unfortunately, it may increase the achieved critical path length. The number of levels and nodes in the optimized indicating networks are reported in the next two columns. The improvement w.r.t. NCL-D is given in the last two columns. One can see that a significant improvement (on average, 18.7% for the speed and 39.3% for the area) has been achieved. For big (more than 1000 nodes) benchmarks, a timeout (1 hour) is set for each optimization procedure (resubstitution for speed and for area) and the obtained result is reported. Note that after applying the indication procedure, the critical path length of benchmarks apex3 and pope increases and even exceeds the one in the initial BLIF.

1 Resub_speed(blif, extendedPLA)
2 {current_blif=blif
3  for a critical path
4  {
5   for each non-checked node gi within the critical path
6   supplied with inputs of maximal level L
7   {resubstitute inputs for ones of maximal level L'
Table 4. Experimental results

               | NCL-D: ABC (n=2) | NCL-X: resubstitution (k=5) and indication | Improvement (%)
    Benchmarks |  level    node   |        level          node                 | levels   nodes
    5xp1       |   10       214   |          7             104                 |  30.0     51.4
    9sym       |   32       422   |         24             277                 |  25.0     34.4
    apex3      |   16      2180   |         17            1818                 |  -6.3     16.6
    b11        |    8       148   |          3              81                 |  76.0     45.3
    bw         |    9       298   |          8             144                 |  11.1     51.7
    clip       |   14       358   |         11             216                 |  21.4     39.7
    dk17       |    8       120   |          5              66                 |  37.5     45.0
    ex1010     |   15      2338   |         13            2218                 |  13.3      5.1
    luc        |   12       308   |          8             180                 |  33.3     41.6
    m2         |   14       438   |         13             266                 |   7.1     33.3
    m4         |   16       936   |         13             538                 |  18.7     42.5
    max46      |   18       310   |         15             198                 |  16.7     36.1
    max128     |   15       768   |          9             422                 |  40.0     45.1
    max512     |   25       818   |         18             534                 |  28.0     34.7
    pope       |   13       938   |         25             534                 | -92.0     43.1
    rd84       |   16       252   |         11             141                 |  31.2     44.0
    tms        |   13       376   |         12             198                 |   7.7     47.3
    z5xp1      |   11       284   |          7             130                 |  36.4     54.2
    z9sym      |   30       370   |         24             244                 |  20.0     34.1
    AVERAGE:   |                  |                                            |  18.7     39.3

Resub_area(blif, extendedPLA)
{ current_blif = blif;
  for each non-checked node gi with minimal fan-out
  { resubstitute inputs implied by gi output, if possible;
    remove nodes with zero fan-out, if any;
    if (indication(blif_after_resub)) then current_blif = blif_after_resub and continue
    else continue;
  }
  hazard-free representation targeting LUT;
  return blif;
}

Fig.9. Resubstitution for area

11. CONCLUSION

The multi-level network is created by applying an ABC script. Further improvement is done using the resubstitution. To process the network, the extended PLA table is used. The resubstitution is formulated and solved as a covering task. Namely, the output of the node whose input has been selected for the resubstitution is split into a set of dichotomies. The inputs satisfying the formulated target (optimization for the speed and/or area) are sought to cover the dichotomies. The nodes of zero fan-out obtained as a result of the resubstitution are removed. The conditions of the distributed indication are formulated and the procedure is proposed. A set of MCNC91 benchmarks is processed and the results are compared with the state-of-the-art (NCL-D) approach. A significant improvement w.r.t. the speed and area has been achieved.

ACKNOWLEDGMENT

This work is supported by ERAF Project "Asynchronous Logic Circuits: Methods and Tools for the Design in Reconfigurable Environment", Nr. 1.1.1.1/16/A/234.

REFERENCES

[1] Francis, R., Rose, J. and Chung, K., Chortle: A technology mapping program for lookup table-based field programmable gate arrays, 27th ACM/IEEE Design Automation Conference, Orlando, FL, USA, 1990, pp. 613-619
[2] Karplus, K., Xtmap: Generate-and-test mapper for table-lookup gate arrays, Compcon Spring'93, San Francisco, CA, USA, 1993, pp. 391-399
[3] Murgai, R., Shenoy, N., Brayton, R., Sangiovanni-Vincentelli, A., Improved logic synthesis algorithms for table look up architectures, IEEE International Conference on Computer-Aided Design, ICCAD91, Santa Clara, CA, USA, pp. 564-567
[4] A. Mishchenko, S. Chatterjee, R. Brayton, "DAG-aware AIG rewriting: A fresh look at combinational logic synthesis", Proc. DAC '06, pp. 532-536
[5] Lucas Machado, Jordi Cortadella, Boolean Decomposition for AIG Optimization, GLSVLSI'17, In Proc. of the Great Lakes Symposium on VLSI 2017, Banff, Alberta, Canada, May 10-12, 2017, pp. 143-148
[6] Ashenhurst, R., The decomposition of switching functions, Proceedings of an International Symposium on the Theory of Switching, Cambridge, MA, USA, 1957, pp. 74-116
[7] J.P. Roth and R.M. Karp, Minimization over Boolean Graphs, IBM Journal of Research and Development, April 1982, pp. 227-238
[8] E.J. Sparsø, J. Staunstrup, M. Dantzer-Sørensen, Design of delay insensitive circuits using multi-ring structures, In Proc. of the European Design Automation Conference (EURO-DAC'92), 1992, pp. 15-20
[9] I. Lemberski, LUT-oriented dual-rail quasi-delay insensitive logic synthesis, Electronics Letters, vol. 50, Issue 7, March 2014, pp. 503-505
[10] C.L. Seitz, System Timing, In: Introduction to VLSI Systems, C. Mead, L. Conway, Addison-Wesley Publishing Company, 1980, pp. 218-262
[11] J. Cortadella, A. Kondratyev, L. Lavagno, C. Sotiriou, Coping with the Variability of Combinational Logic Delays, IEEE Int. Conf. on Computer Design (ICCD), October 2004, pp. 505-508
[12] M. Ligthart, K. Fant, R. Smith, A. Taubin, A. Kondratyev, Asynchronous Design Using Commercial HDL Synthesis Tools, 6th Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, pp. 114-125
[13] W. B. Toms, D. A. Edwards, M-of-N Code Decomposition for Indicating Combinational Logic, 2010 IEEE Symposium on Asynchronous Circuits and Systems, Grenoble, France, May 3-6, 2010, pp. 15-25
[14] I. Lemberski, P. Fišer, R. Suleimanov, Asynchronous Sum-of-Product Logic Minimization and Orthogonalization, International Journal of Circuit Theory and Applications, John Wiley & Sons Ltd, vol. 42, issue 6, June 2014, pp. 562-571
[15] S.C. Smith, J. Di, Designing Asynchronous Circuits using NULL Convention Logic (NCL), Morgan & Claypool, 2009, 96 p.
[16] Q.T. Ho, J.-B. Rigaud, L. Fesquet, M. Renaudin, R. Rolland, Implementing Asynchronous Circuits on LUT Based FPGAs, Proceedings of the 12th International Conference on Field-Programmable Logic and Applications (FPL2002), Montpellier, France, 2002, pp. 36-46
[17] I. Lemberski, P. Fišer, Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints, in Proc. of 13th Euromicro Conference on Digital Systems Design (DSD), Lille (France), September 1-3, 2010, pp. 155-162
[18] C.D. Nielsen, Evaluation of Function Block Designs, Technical Report 1994-135, Department of Computer Science, Technical University of Denmark, Denmark, 1994, 43 pp.
[19] I. Lemberski, P. Fišer, Multi-Level Implementation of Asynchronous Logic Using Two-Level Nodes, 4th IFAC Workshop on Discrete-Event System Design (DesDes'09), 6-8 October, 2009, Spain, pp. 90-96
[20] H. Sawada, T. Suyama, A. Nagoya, Logic Synthesis for Look-Up Table based FPGAs using Functional Decomposition and Support Minimization, IEEE International Conference on Computer Aided Design (ICCAD), 5-9 Nov, 1995, San Jose, CA, USA, pp. 353-358
[21] H. Sato, Y. Yasue, Y. Matsunaga, M. Fujita, Boolean Resubstitution With Permissible Functions and Binary Decision Diagrams, 27th ACM/IEEE Design Automation Conference (DAC90), 24-29 June, Orlando, Florida, USA, 1990, pp. 284-289
[22] T. Takata, Y. Matsunaga, Area Recovery under Depth Constraint for Technology Mapping for LUT-based FPGAs, IPSJ Transactions on System LSI Design Methodology, vol. 2, 2009, pp. 200-211
[23] A. Mishchenko, R. Brayton, Scalable Logic Synthesis using a Simple Circuit Structure, Proc. IWLS '06, pp. 15-22
[24] E. Sentovich et al., SIS: a System for Sequential Circuit Synthesis, Electronic Research Laboratory Memorandum, No UCB/ERL M92/41, 1992, 45 pp.
[25] I. Lemberski, A. Suponenkovs, Asynchronous Logic Design Targeting LUTs, The 7th Mediterranean Conference on Embedded Computing (MECO2018), June 2018, Budva, Montenegro, pp. 420-425