INTEGRATION, the VLSI journal 26 (1998) 79—99
High-level test synthesis: a survey

Indradeep Ghosh*, Niraj K. Jha

Fujitsu Labs. of America, 595 Lawrence Expressway, Sunnyvale, CA 94086-3922, USA
Department of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA
Abstract

This paper surveys the various high-level design for testability and synthesis for testability methods that have been proposed in the last decade. We begin with a description of high-level synthesis methods which target the ease of subsequent gate-level sequential test generation. Then we describe high-level synthesis methods which target built-in self-test (BIST) and hierarchical testability. Thereafter, we describe register-transfer level testability techniques that target gate-level test generation, BIST and hierarchical testability. Finally, we briefly describe some high-level test generation methods. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Design for testability; Digital system testing; High-level synthesis; Register-transfer level synthesis; Synthesis for testability
1. Introduction

High-level synthesis for testability (SFT) and design for testability (DFT) techniques have been the subject of intense research since the late 1980s [1], concurrent with research into synthesis to satisfy area, timing and, more recently, power constraints. Initially, synthesis techniques were limited to logic-level and circuit-level designs. Automatic test pattern generation (ATPG) techniques at the logic level require large amounts of computing time and resources for testing even moderately sized sequential circuits. With the pressure of time-to-market of integrated circuits (ICs) mounting on manufacturers, test solutions using just ATPG have become almost unacceptable. Hence, designers use DFT or test insertion techniques and modify the circuit to ease the task of test generation, at the expense of test overheads. Even DFT methods, such as scan design or BIST, can incur large area, performance, and power overheads if introduced at the lower levels of the design hierarchy. Due to the above disadvantages, and with the recent trend towards
Fig. 1. A high-level circuit description and its CDFG.
high-level design, the SFT and DFT efforts have also shifted towards the behavior and register-transfer level (RTL). This paper gives an overview of several behavioral (high-level) and RTL design and synthesis approaches that have been proposed to generate easily testable implementations. These approaches target sequential ATPG in a non-scan or partial scan environment, BIST, and hierarchical testing. It also provides an overview of high-level test generation techniques.

1.1. Definitions

Before going into the details, a few terms that are used extensively in this paper need to be defined. High-level or behavioral synthesis converts the behavioral description of a circuit into an equivalent RTL implementation. The behavioral description can be given in a high-level hardware description language, as shown in Fig. 1a. The description is compiled into a control/data flow graph (CDFG), as shown in Fig. 1b. This is followed by scheduling, where the operations in the CDFG are assigned to clock cycles, and allocation, where the operations are mapped to modules (module allocation) and variables are mapped to registers (register allocation). The lifetime of a variable in a scheduled CDFG consists of the clock cycles in which the variable is alive. Testability enhancing methods analyze the controllability of a node in the RTL circuit or CDFG, which signifies the ability to control the value at that node from primary inputs (PIs). Similarly, the observability of a node signifies the ability to observe the value at that node at a primary output (PO). The final quality of the test generated for the circuit is measured by fault coverage, which is the percentage of faults detected out of the total number of faults in the circuit, assuming a particular fault model. The test efficiency metric can also be used, where the number of undetectable faults is added to the number of detected faults before calculating the percentage.
2. High-level synthesis for sequential ATPG

SFT at the behavior level is complicated by the absence of a behavioral fault model that can be strongly correlated to silicon defects. Therefore, researchers have focused on innovative methods to include sequential ATPG objectives into behavioral synthesis.
Fig. 2. An RTL circuit and its register adjacency graph.
An S-graph can be used to represent a circuit structure at the logic level in which each node represents a flip-flop, and there is a directed edge from node A to node B in the graph if the output of flip-flop A is connected to the input of flip-flop B either directly or through combinational logic [2]. At the RTL, the corresponding structure is called a register adjacency graph, with flip-flops replaced by registers or latches. An example RTL data path and its register adjacency graph are shown in Fig. 2. The sequential depth is the maximum number of arcs between any two flip-flops in the S-graph, while the length of a cycle is the number of arcs in the cycle. It has been empirically observed [2,3] that the complexity of sequential ATPG grows exponentially with the length of cycles, and linearly with the sequential depth of flip-flops in the S-graph. Gate-level DFT techniques have been developed based on this topological analysis. These techniques attempt to break all loops, except self-loops, and minimize sequential depth. Behavioral synthesis for testability approaches can target similar measures, such as loop size and sequential depth in the register adjacency graph, to synthesize testable implementations while meeting the performance and area constraints of the design.

2.1. Improving register controllability and observability

One way to improve the controllability and observability of data path registers is to assign the variables of the CDFG so as to maximize the number of input/output registers, i.e., registers connected to PIs/POs of the circuit. Also, the sequential depth from an input register to an output register can be minimized during register allocation, thereby improving controllability and observability. One approach assigns each PO to a different output register, and then assigns as many intermediate variables as possible to the output registers [4].
Next, it assigns each PI to an input register, and as many of the remaining intermediate variables as possible to the input registers. When two variables cannot share a register because their lifetimes overlap, the operations of the CDFG can be re-scheduled such that the lifetime of an intermediate variable does not overlap with
Fig. 3. Reducing sequential depth during high-level synthesis.
the lifetime of an input/output variable, and the intermediate variable can then be assigned to an input/output register. A mobility path scheduling technique has been proposed to minimize the sequential depth between registers and maximize the number of intermediate variables that are allocated to input/output registers [5]. Consider the example in Fig. 3. It shows two possible register allocations for the scheduled CDFG of Fig. 3a. The sequential depth in Fig. 3b is two, while in Fig. 3c the sequential depth is reduced to one by performing register allocation based on testability considerations. When a logic-level ATPG tool is run on the corresponding logic-level implementations, the fault coverage and test generation time for the circuit in Fig. 3c are better than those for Fig. 3b.

2.2. Avoiding loops in the data path

We next discuss how loops are formed in a circuit generated by high-level synthesis, and ways to avoid their formation.

2.2.1. Loops in the behavioral description

Corresponding to each loop consisting of data-dependency edges present in the CDFG, a loop is formed in the data path. The CDFG loops can be broken by selecting a set of scan variables from
the variables in the CDFG such that each CDFG loop contains a scan variable, and assigning each scan variable to a scan register. Two measures, loop cutting effectiveness and hardware sharing effectiveness, have been developed [6]. These measures are used to select a set of scan variables that can be maximally shared (requiring a minimal number of scan registers) and that maximize the chances of sharing other variables to break loops formed during subsequent high-level synthesis steps. Variables which constitute the boundary of a loop are known as boundary variables. One way of breaking these loops is to select a set of boundary variables to be assigned to the available scan registers [7]. Though the boundary variables cannot share the same register, because they are alive simultaneously, other intermediate variables of the CDFG can share a register with boundary variables. To facilitate maximal sharing, boundary variables with shorter lifetimes are preferred for selection as scan variables. Next, the intermediate variables are assigned to both the available scan registers and the existing input/output registers to further minimize the number of loops. The technique can be extended to handle CDFGs with conditionals [8].

2.2.2. Loops formed by hardware sharing

Even when the CDFG has no loops, or all the CDFG loops have been effectively broken by scan variables, hardware sharing of registers and functional units can introduce further loops in the data path. An allocation algorithm has been developed that generates testable data path logic given a scheduled data path description [9,10]. It simultaneously performs module and register allocation, and allows the user to specify preferred allocations and an increased number of allocated resources in order to guide the algorithm to a more testable implementation.
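The loop-breaking formulation of Section 2.2.1 can be sketched on a register adjacency graph. The graph encoding and the greedy scan-register choice below are illustrative, not the exact algorithms of [6,7]:

```python
def simple_cycles(adj):
    """Enumerate the simple cycles of a small register adjacency graph,
    given as {register: [successor registers]}; exponential in general,
    adequate only for tiny illustrative graphs."""
    cycles = set()
    def dfs(start, node, path):
        for nxt in adj.get(node, []):
            if nxt == start:
                cycles.add(frozenset(path))
            elif nxt not in path and nxt > start:  # each cycle found once
                dfs(start, nxt, path + [nxt])
    for start in adj:
        dfs(start, start, [start])
    return cycles

def select_scan_registers(adj):
    """Greedy hitting set: repeatedly scan the register that breaks the
    most remaining loops; self-loops are tolerated, as in the text."""
    loops = [c for c in simple_cycles(adj) if len(c) > 1]
    scanned = set()
    while loops:
        best = max(set().union(*loops),
                   key=lambda r: sum(r in c for c in loops))
        scanned.add(best)
        loops = [c for c in loops if best not in c]
    return scanned

# Toy graph (not the one of Fig. 2): R1 has a self-loop, and R2 and R3
# form a two-register loop that must be broken by one scan register.
adj = {"R1": ["R1", "R2"], "R2": ["R3"], "R3": ["R2", "R4"], "R4": []}
scan = select_scan_registers(adj)
```

Scanning either R2 or R3 suffices here; a real tool would bias the choice by sharing potential and lifetime, as described above.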
The primary cost that the algorithm attempts to minimize is a function of the number of self-adjacent registers (registers with self-loops in the register adjacency graph), cycles, and input and output registers in the generated data path. An adjacency-breaking algorithm can be applied to the data path logic after synthesis to reduce the number of self-adjacent registers created during allocation. A simultaneous scheduling and allocation technique has also been proposed which avoids formation of loops in the implementation [6]. Consider the CDFG of Fig. 4a. Let the given performance constraint be four clock cycles and the resource constraint be two adders. A feasible schedule and allocation is shown in Fig. 4b. There is a loop in the RTL circuit, as shown by the bold lines. To create a loop-free circuit, register R2 needs to be converted into a scan register. However, if the scheduling and allocation are changed as shown in Fig. 4c, no register needs to be scanned, assuming self-loops can be tolerated. Results from high-level scan selection and loop-breaking indicate that loop-free, highly testable designs can be synthesized that require significantly fewer scan flip-flops than conventional gate-level schemes.

2.3. Using controllability/observability measures

An SFT system has been developed that addresses the testability of both data path and control logic, and generates partial scan designs targeting gate-level sequential ATPG [11-13]. The system takes as input a behavioral VHDL description of the design and converts it to an intermediate description called an extended, timed Petri net (ETPN). The ETPN contains both the control and data path specifications and represents a structural implementation of the data path logic. The
Fig. 4. Reducing loops in the RTL data path through high-level synthesis.
synthesis process consists of a series of transformations applied to the ETPN to improve testability. The testability analysis algorithm generates testability measures (controllability and observability) for each node in the ETPN based on the structure of the design, the length of sequential paths from PIs and POs to intermediate nodes, and the testability characteristics of the modules in the sequential paths. The testability measures represent costs associated with applying ATPG to the design, e.g., test generation time, test application time, and difficulty in achieving high fault coverage. The SFT algorithm begins with an initial worst-case implementation and successively merges two data path elements whenever functionally possible and whenever the merging leads to improved testability measures for the design. The testability measures are then recalculated and the iterative improvement continued. Controllability and observability measures can also be generated by observing the behavior of the variables in the behavioral description when random vectors are applied to the PIs [14]. These measures are used to guide the scheduling algorithm to select a schedule that can be implemented
with testable control logic, and to guide the allocation algorithm to generate more testable modules and registers. A method for analyzing testability has been presented that represents the relative difficulty of computing test data at the behavior level or RTL [15]. It determines the probabilities of justifying and propagating any test data to and from internal nodes. These probabilities are computed given the transparency coefficients of operations and the set of elementary data transfers performed between data path components. The transparency coefficient of an operation or module quantifies the difficulty of propagating a random test vector through it. For example, adders have high transparency coefficients and multipliers have low transparency coefficients. Since this testability analysis deals with different levels of description, it can be used at any step within the high-level synthesis process, and particularly during the hardware allocation steps [16].
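The transparency-coefficient idea admits a back-of-the-envelope estimate. The coefficient values below are assumptions for illustration, not the calibrated values of [15]:

```python
import math

# Assumed transparency coefficients: probability that a random vector
# propagates test data through each operation type (illustrative values).
TRANSPARENCY = {"add": 0.9, "sub": 0.9, "mul": 0.1, "and": 0.5}

def path_transparency(ops):
    """Estimated probability of propagating test data along a path of
    operations, assuming independence between stages."""
    p = 1.0
    for op in ops:
        p *= TRANSPARENCY[op]
    return p

easy = path_transparency(["add", "sub"])   # about 0.81: adders are transparent
hard = path_transparency(["mul", "add"])   # about 0.09: the multiplier dominates
# Pattern-count inflation: patterns needed at the source to push
# 100 useful patterns through a block of transparency 0.5.
patterns = math.ceil(100 / path_transparency(["and"]))   # 200
```

The product form makes the qualitative point of the text: a single low-transparency module (such as a multiplier) on a path dominates the overall difficulty of propagating test data.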
2.4. Modifying the behavioral description

A behavioral description of a circuit can be modified to make the resulting implementation more testable than the implementation generated from the original description. The behavioral description can be analyzed to detect hard-to-test areas, classifying variables as controllable, observable, partially controllable and partially observable [17]. Based on the testability analysis, test statements, which are executed only in the test mode, are added to improve the controllability and observability of all the variables in the description. The modified behaviors produce circuits with higher fault coverage and test efficiency than the original description, at modest area overheads. In hierarchical designs consisting of several modules, the top-level design constrains the controllability and observability of its modules' inputs/outputs. A technique has been developed to generate top-level test modes and the constraints required to realize a module's local test modes [18]. The process of generating global test modes may reveal that some constraints cannot be satisfied, in which case either the top-level description or the description of an individual module must be modified to satisfy the constraints [19]. The testability modifications made to the behavioral description are selected from a library of VHDL behavioral descriptions of testability techniques, such as test pins for added controllability and observability. The technique has also been applied to microprocessors [20]. CDFG transformations can also be used to make high-level synthesis more amenable to testability. One such technique adds operations which do not change the original computation, but enable more sharing of scan registers, so as to minimize the number of scan registers needed [21]. In this method, deflection operations, with the identity element as one of the operands (like add with 0), are inserted between CDFG operations.
Deflection operations are also added to avoid the formation of loops during the allocation phase by maximally reusing existing scan registers. Another technique has been developed to evaluate the controllability of functional loops in a high-level circuit description [22]. The proposed testability measure is first applied to identify hard-to-control nodes in the CDFG. The DFT technique then improves the controllability of these loops by adding test behavior with the help of a few extra test pins. After this, RTL and logic synthesis can be performed on the testable behavior to produce circuits that are more testable through logic-level sequential ATPG.
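A deflection operation leaves the computed value unchanged in normal mode. The sketch below inserts an add-with-0 between two CDFG operations, with the CDFG represented as a simple edge list; this encoding and the node naming are illustrative, not those of [21]:

```python
from itertools import count

_fresh = count()

def insert_deflection(edges, src, dst):
    """Replace the data-dependency edge (src, dst) with a path through a
    fresh add-with-0 node. The deflection computes the identity, so normal
    behaviour is unchanged, but its output variable is a new candidate for
    sharing an existing scan register."""
    node = f"defl{next(_fresh)}_add0"
    new_edges = [e for e in edges if e != (src, dst)] + [(src, node), (node, dst)]
    return new_edges, node

# Hypothetical CDFG fragment: multiply *1 feeds add +2, which feeds the output.
edges = [("*1", "+2"), ("+2", "out")]
edges2, d = insert_deflection(edges, "*1", "+2")
```

After the transformation, the value between *1 and +2 passes through the identity operation, whose result variable can be bound to a scan register already present in the design.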
3. High-level synthesis for BIST

To make a design self-testable using a pseudorandom BIST methodology, it needs to be reconfigured during the test mode into a set of logic blocks. Each logic block has the equivalent of a pseudorandom pattern generation (PRPG) register at each of its inputs, and a signature register (SR) at each of its outputs. In situ BIST schemes require reconfiguration of a functional register as a PRPG or an SR. In each test cycle, the PRPGs at the inputs of a block generate pseudorandom test patterns, and the test response of the block is captured and analyzed by the SRs at its outputs.

3.1. Minimizing extra test logic

A register cannot be configured both as a PRPG and an SR simultaneously, unless it is implemented as a concurrent built-in logic block observer (CBILBO) [23], which is very expensive in terms of area and delay penalties. Hence, a self-adjacent register, which serves as both an input and an output of a logic block, poses a problem, since it may have to be implemented as a CBILBO. An objective in generating self-testable data paths with low area overheads is to minimize the formation of self-adjacent registers. Given the allocation of operations to modules during behavioral synthesis, register allocation can be performed to minimize the number of self-adjacent registers, and hence the number of CBILBOs [24]. A conventional method of assigning a set of variables to a minimum number of registers is to color a conflict graph with a minimum number of colors. The nodes of a conflict graph represent the variables in a CDFG. An edge exists between two variables that cannot share a register due to overlapping lifetimes. To minimize the formation of self-adjacent registers, conflict edges are also added between two nodes if the corresponding variables are the input and output of the same module.
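The augmented conflict graph can be colored greedily. This is a minimal sketch of the idea in [24]; the variables and conflict edges are illustrative:

```python
def allocate_registers(variables, conflicts):
    """Greedy coloring of the conflict graph: variables joined by an edge
    (overlapping lifetimes, or input and output of the same module) must
    receive different registers (colors)."""
    color = {}
    for v in variables:  # fixed ordering; a real tool would order heuristically
        used = {color[u] for u in conflicts.get(v, ()) if u in color}
        color[v] = next(c for c in range(len(variables)) if c not in used)
    return color

# Hypothetical variables: a and b have overlapping lifetimes; the extra
# edge (a, c) marks a as an input and c as the output of the same module,
# so sharing them would create a self-adjacent register.
conflicts = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
regs = allocate_registers(["a", "b", "c"], conflicts)
```

Without the extra edge, a and c could share a register and create a self-adjacent register; with it, b and c may still share, so the register count need not grow.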
Experimental results show that the technique generates data paths with fewer self-adjacent registers, without increasing the register count, compared to conventional register allocation techniques. Formation of self-adjacent registers can be completely avoided by restricting the data path architecture [25,26]. The basic building block used to allocate a variable and the operation which generates the variable is a test functional block (TFB), which consists of an ALU, a multiplexer at each input of the ALU, and a test register (PRPG or SR) at each input/output. Instead of considering the mapping of variables and operations of the CDFG to individual registers and ALUs, as done conventionally, each (v, o(v)) pair, termed an action, where v is a variable and o(v) is the operation producing v, is considered for mapping to TFBs. Two actions, (v1, o(v1)) and (v2, o(v2)), are compatible and can be merged (assigned to the same TFB) if (i) the lifetimes of v1 and v2 do not overlap, and (ii) v1 and v2 are not inputs of o(v2) and o(v1), respectively. The second condition ensures that the output register of a TFB does not become an input of the TFB, so that no self-adjacent register is formed. The allocation technique first identifies sequences of compatible actions, each of which can be merged and mapped to a single TFB. A prime sequence is one that is not contained in any other sequence. Allocation to a minimal number of TFBs is then achieved by finding a minimal set of prime sequences which cover all the actions of the CDFG. The restriction of one output register per TFB prevents the sharing of operations whose output variables have overlapping lifetimes. Self-testable data paths with even fewer TFBs can be formed by using an extended TFB (XTFB), which contains an ALU with multiple input as well as output
Fig. 5. Reducing the number of SRs by using transparency.
registers [27]. During the test mode, while the two input registers are configured as PRPGs, only one of the multiple output registers needs to be configured as an SR, thus allowing the presence of self-adjacent registers which have to be configured as PRPGs but not SRs. By avoiding the use of CBILBOs, while still allowing some self-adjacent registers, the use of XTFBs enables the generation of self-testable data paths with less area overhead than previous approaches. The test area overhead can be further reduced by relaxing the requirement that the output register of every ALU has to be an SR. Instead, the test response is allowed to propagate through other ALUs and logic blocks before being captured in an SR, thus forming logic blocks in which the sequential depth between PRPGs and SRs is greater than one. Consider the example in Fig. 5. Suppose x random patterns are required to test block A in the figure. If the fault effects at the output of A need to be captured at register R3, the required number of patterns doubles to 2x, since half of the fault effects are expected to be killed while propagating through the AND block. This is because the side input of the AND block, which comes from register R2, is assumed to have value 1 with probability 0.5, and this value is essential for fault effect propagation through the AND block. This ability to propagate fault effects is known as the transparency of the logic block, and for an AND block it is 0.5. The above scheme results in fewer SRs but can reduce fault coverage, allowing a trade-off between test area overhead and fault coverage. The BIST overheads can also be reduced by reducing the number of PRPGs and SRs needed to test all the data path modules [28].
After the scheduling and module allocation phases have been completed, register allocation can be done to maximize the number of modules for which a register is an input register and hence can act as a PRPG, and the number of modules for which a register is an output register and hence can act as an SR. Also, exact conditions under which a self-adjacent register needs to be a CBILBO have been obtained [28]. A synthesis technique has been proposed that maximizes scan dependence in data path logic assuming an orthogonal scan path configuration in the BIST architecture [29]. Results show that using orthogonal scan paths in the data path logic allows greater sharing of functional and test logic when testability techniques such as scan and circular BIST are implemented. In the circular BIST technique, the PRPGs and SRs are unified to form a circular structure so that the output patterns of a test are fed back as the input patterns for the next test. This is shown in Fig. 6. The shift direction in the orthogonal scan path is orthogonal to the shift direction in traditional scan paths (shifting bits within registers). For example, z = x + y is a type of data flow equation that is commonly found in behavioral descriptions. When x is bound to register R2, y is bound to R3, and
Fig. 6. Circular BIST architecture: (a) Normal operation, (b) BIST configuration.
Fig. 7. Orthogonal scan path: (a) RTL data path, (b) bit-slice, and (c) modes of operation.
z is bound to R1, the data path logic function for each flip-flop i of register R1 has the form R1_i = R2_i ⊕ R3_i ⊕ C_i, where C_i is the carry function for flip-flop i. This is shown in Fig. 7. Traditional BIST schemes might arrange the elements in the scan path such that each flip-flop of register R2 feeds the next flip-flop of register R2 during scan operation. With an orthogonal scan path, however, each flip-flop of R2 feeds the corresponding flip-flop of register R1 during scan operation. This allows the use of the exclusive-OR operation during both test and functional modes. The synthesis procedure assumes that the data path logic has an orthogonal scan path structure, and then biases the register allocation algorithm such that the occurrence of the types of functions that allow for logic sharing is maximized in the data path logic. Results show that data path logic generated by this synthesis technique has fewer logic gates than traditional scan path designs.
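The shared bit-slice can be modeled directly. The scan-mode behaviour below (forcing the side input and carry to 0) is an illustrative model of how the adder's XOR network can double as the scan path, not a gate-level account of [29]:

```python
def adder_slice(r2_i, r3_i, c_i):
    """Sum bit of one ripple-carry adder slice: R1_i = R2_i xor R3_i xor C_i."""
    return r2_i ^ r3_i ^ c_i

def orthogonal_scan_shift(r2_i):
    """Assumed scan mode: with the side input R3_i and the carry C_i forced
    to 0, the same XOR network moves the R2 bit into the corresponding bit
    of R1, so the adder logic also serves as the scan path."""
    return adder_slice(r2_i, 0, 0)

bit = orthogonal_scan_shift(1)   # the R2 bit reaches R1 unchanged
```

Because XOR with 0 is the identity, no separate scan multiplexing is needed for this slice in either mode, which is the source of the gate savings reported above.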
3.2. Minimizing test sessions

In the most general BIST scheme, a test path through which test data can be transmitted from the PRPGs to the SR at the output of a logic block may pass through several ALUs. This leads to two or more test paths sharing the same hardware (registers, ALUs, multiplexers, buses), thus creating conflicts and forcing the need for multiple test sessions. Scheduling and allocation techniques have been presented which use test conflict estimates to generate data paths that require a minimal number of test sessions, thus achieving maximal test concurrency [30]. Experimental results show the ability to synthesize data paths that require only one test session.
3.3. Adding test behavior

A general BIST scheme has been proposed where only the input and output registers are configured as PRPGs and SRs, respectively [25]. Testability metrics are developed to measure the controllability/observability of signals in the original design behavior under the application of pseudorandom vectors at the PIs. A test behavior, executed only in the test mode, is obtained by inserting test points in the original behavior to enhance the testability of the required internal signals. The test points need extra PIs/POs, implemented by extra PRPGs/SRs. The combined design and test behavior are synthesized together using any high-level synthesis tool. The above technique can be modified so that a test behavior is generated from a given behavioral description for data-flow intensive circuits [31]. The test behavior describes the BIST insertion for the design. The normal and test behaviors are combined, and then synthesized by any general-purpose synthesis system to produce a testable design with inserted BIST structures. The technique has also been modified to handle conditionals and loops in control-flow intensive designs [32].
3.4. Using arithmetic units as test generators and compactors

Instead of using special BIST hardware like PRPGs and SRs, functional units can be used to perform test pattern generation and test response compaction [33]. A high-level synthesis methodology has been proposed to synthesize data paths in which high fault coverage can be obtained using arithmetic test generators and test compactors. A testability metric termed subspace state coverage is used to guide the synthesis process, both in characterizing the quality of the test vectors required to provide complete fault coverage of each functional unit, and in characterizing the quality of the test vectors seen at the inputs of each operation in the CDFG after the degradation suffered by the patterns due to propagation through various other operations. For each arithmetic unit in the module library, the input subspace state coverage needed to obtain complete structural coverage is characterized. Next, an additional generator is added at the inputs of the CDFG and the state coverage is measured at the inputs of the operations. If two operations, with S1 and S2 denoting the sets of states covered at their inputs, are mapped to the same arithmetic unit, the set of states covered at the inputs of the unit is the union of S1 and S2. During high-level synthesis, allocation of operations to functional units is done to maximize the state coverage obtained at the inputs of each functional unit.
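The state-coverage bookkeeping during binding can be sketched with plain sets; the subspaces and operation names below are illustrative:

```python
def unit_state_coverage(bound_ops, op_states):
    """States covered at the inputs of a functional unit: the union of the
    state sets covered at the inputs of the operations mapped to it."""
    states = set()
    for op in bound_ops:
        states |= op_states[op]
    return states

# Hypothetical input subspaces seen by two additions in the CDFG.
op_states = {"+1": {(0, 0), (1, 0)}, "+2": {(0, 1), (1, 1)}}
cov = unit_state_coverage(["+1", "+2"], op_states)
# Binding both operations to one adder covers all four subspaces.
```

An allocator guided by this metric would prefer bindings whose unions come closest to the coverage characterized for each unit in the module library.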
Fig. 8. An example CDFG.
4. High-level synthesis for hierarchical testability

A symbolic hierarchical testing approach has been proposed that uses the functional information of the modules in the circuit to speed up testability analysis and test generation [34,35]. Since the testability analysis and test generation are symbolic, they are independent of the bit-width of the data path. The hierarchical testability analysis method is embedded in the allocation stage of the behavioral synthesis system. It tries to find a symbolic justification path from the inputs of each module or register to the system PIs, and a propagation path from the output of each module or register to the system POs. Such a test path for a register or a module is called its test environment. A test environment for an operation or variable in a CDFG can be similarly defined in terms of the CDFG PIs and POs. A test environment for a variable (operation) is also a test environment for the register (module) that the variable (operation) gets mapped to after allocation. Consider the CDFG shown in Fig. 8. To test operation *1, its inputs b and f need to be controlled, and its output g observed. Input b can be controlled to any desired value, since it is a PI. Input f can be independently controlled to any desired value by controlling PIs c and d appropriately, since f = (c + d) mod 2^n (n is the bit-width of the data path). Output g can be observed indirectly by observing the PO o (since o = e - g), provided the value of e is known. In this example, e = a + b, where a and b are PIs. Thus, for example, if a is controlled to the negative of the value at b, e becomes 0. The test environment for operation *1 in that case would consist of the arcs {a, b, c, d, e, f, g, o}. The proposed synthesis system can easily handle large and complex circuit descriptions with loops and conditionals, and various scheduling styles, such as multicycling, pipelining and chaining.
The complete system-level test set is generated along with the synthesis of the circuit in very little time. Logic-level fault simulation of the system test set on the synthesized circuits confirms their high testability. The overheads are close to zero.
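The test environment derived for operation *1 can be checked numerically. This is an illustrative simulation of the Fig. 8 data flow as inferred from the text (g = b * f, with the bit-width chosen arbitrarily), not the symbolic algorithm of [34,35]:

```python
N = 8                     # assumed data-path bit-width
MASK = (1 << N) - 1

def cdfg(a, b, c, d):
    """Fig. 8 data flow as inferred from the text: e = a + b, f = c + d,
    g = b * f (operation *1), o = e - g, all modulo 2^n."""
    e = (a + b) & MASK
    f = (c + d) & MASK
    g = (b * f) & MASK
    return (e - g) & MASK

# Test environment for *1: choose a = -b so that e = 0, making o = -g;
# f is set directly through PIs c and d.
b_val, f_val = 5, 7
o = cdfg((-b_val) & MASK, b_val, f_val, 0)
g_observed = (-o) & MASK    # multiplier output recovered at the PO
```

With e forced to 0 the PO carries exactly the negated multiplier output, so any symbolic test vector for *1 translates directly into PI values and an expected PO value.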
5. RTL synthesis for ATPG

There are several approaches for enhancing RTL descriptions for testability. The description can be augmented to improve testability by rewiring internal signals to more controllable or observable
nodes when a test signal is active. With information regarding the connectivity of modules and the functionality of each module, transformations that restructure the data path and minimize control logic by using don't care conditions extracted from the data path can yield 100% single stuck-at fault testable designs [36]. Structural and functional knowledge embedded in an RTL description has been used for non-scan DFT schemes like test point insertion [37]. Instead of breaking loops through the conventional technique of making flip-flops scannable, loops are broken at functional units by inserting test points, implemented using register files and constants. It is shown that it suffices to make at least one node in each loop k-level (k > 0) controllable and observable to achieve high test efficiency. A node in the register adjacency graph is k-level controllable (observable) if any value at the output of the node can be justified (propagated) from (to) a system input (output) in at most k + 1 clock cycles. Another way of increasing the testability of RTL circuits is by modifying the controller. In the controller, various control signals can imply the conditions for various other control signals. For example, if control signal load1 is high, it might imply that control signal muxselect1 is low in all control states. One technique tries to eliminate control signal implications that create conflicts during sequential ATPG [38]. It adds a few extra control vectors to the outputs of the controller for this purpose. Recently, another controller resynthesis technique has been proposed that exploits the fact that many control signals in an RTL implementation are don't cares under certain states and conditions [39]. It respecifies the don't care signals in the controller so as to effect improvements in the testability (better fault coverage and shorter test generation time) of RTL controller/data path circuits.
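The respecification idea can be sketched on a small controller output table; the states, signal names, and don't-care markers below are illustrative, not the representation used in [39]:

```python
DC = None   # don't-care entry in the controller's output table

def respecify(table, signal, value):
    """Return a copy of the control table in which every don't-care entry
    of `signal` is filled with `value` chosen for testability, e.g. a load
    signal forced to 1 so an internal register is observed earlier."""
    return {state: {**sigs, signal: (value if sigs[signal] is DC else sigs[signal])}
            for state, sigs in table.items()}

# Hypothetical controller: the load signal is a don't care in the early
# control steps because no live variable resides in the register yet.
table = {1: {"load": DC}, 2: {"load": DC}, 3: {"load": DC}, 5: {"load": 1}}
testable = respecify(table, "load", 1)
```

Because only don't-care entries are rewritten, the functional behaviour of the controller is untouched while internal values become observable in more control steps.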
If the don’t care information in the controller specification leaves little scope for respecification, then control vectors are added to the controller for enhancing testability. Consider the RTL circuit shown in Fig. 9a. The CDFG executed on the circuit is shown in Fig. 9b. Observe that in the CDFG, REGz (which is observable at PO-port) loads only in control steps 5 and 6. Since there are no live variables in REGz in earlier control steps, the load signal Lz is essentially a don’t care in the initial four control steps. From the original specification of the controller, we see that Lz has value 0 in the initial four control steps. This means that the value of the register at the output of the multiplier is observable only in the last two control steps. Since the functionality of the circuit warrants the use of the multiplier in the first three cycles, the value at the output of the multiplier can be propagated to the PO only after all three multiplications have taken place. As the circuit specification is such that the third multiplication operation has a constant as one of its operands, the multiplier becomes hard to test because fault effects can be swept out by the constant k. If Lz is made 1 during the second and third control steps, the multiplier becomes easily testable during the first two clock cycles. This is because the value at the output of the multiplier after the second multiplication operation is observable at the PO, as the other input of the adder is connected to a PI register: one input comes from PI-port, while the other is the output of the previous multiplication operation, whose value is also verified at PO-port at the second control step. The required respecification is shown in bold in Fig. 9c.

A methodology has also been developed for selecting partial scan flip-flops using RTL information [40]. The method converts the RTL data and control descriptions into an execution flow graph (EFG).
The EFG formalizes the data flow and the interaction between data and control. A high-level testability measure based on estimates of the length of controllability and observability sequences is used to select partial scan flip-flops. The testability measure utilizes the data path
Fig. 9. Controller respecification for testability.
structural information and any available control flow information. Since testability is inserted at a high level, some of the limitations of gate-level methods are overcome.

An earlier work on RTL synthesis for sequential ATPG proposes a knowledge-based expert system to design testable chips [41]. An RTL design is given to the system together with the design goals and constraints. The circuit is then partitioned into subcircuits for individual processing. The concept of identity path (I-path) is used to transfer data from one place in the circuit to another without modification. While the I-path partitioning is conceptually clean, it is inapplicable in some practical circuits where the data path is complex and the control mechanism is complicated. This limitation can be alleviated by using the concept of functional path (F-path) [42].

A low-cost high-level scan technique named HSCAN has been introduced that utilizes existing paths between registers, through multiplexers, to connect registers in parallel scan chains [43—45]. Consider the example RTL circuit shown in Fig. 10, which shows 16-bit multiplexers and registers. Since a multiplexer path already exists between REG1 and REG2, in the HSCAN mode these
Fig. 10. Structures employed by HSCAN.
registers can be connected in 16 parallel scan chains by using just two extra logic gates, as shown in Fig. 10a. If the select-0 path of an existing multiplexer needs to be chosen during testing then a configuration like Fig. 10b can be used. If a direct connection exists between two registers, only an OR gate is required at the load signal of the destination register. If no path exists between two registers, or if there is a conflict with already created HSCAN paths, then a scan path is created by adding a test multiplexer, as shown in Fig. 10c. This technique typically has lower area overhead than full scan while retaining the usual advantages of full scan.
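The path-reuse decision at the heart of HSCAN can be illustrated with a small sketch. The chain order, path encoding, and action strings below are illustrative assumptions, not the published algorithm:

```python
# Sketch (illustrative, not the published HSCAN procedure): given the
# register-to-register mux/direct paths that already exist in a data path,
# chain registers for scan, adding a test multiplexer only where no
# reusable path exists (the Fig. 10c case).

def build_scan_chain(order, existing_paths):
    """order: registers in the desired scan order.
    existing_paths: set of (src, dst) pairs with an existing mux/direct path.
    Returns one action per chain link."""
    plan = []
    for src, dst in zip(order, order[1:]):
        if (src, dst) in existing_paths:
            plan.append((src, dst, "reuse existing mux path"))  # Fig. 10a/b
        else:
            plan.append((src, dst, "add test mux"))             # Fig. 10c
    return plan

paths = {("REG1", "REG2"), ("REG2", "REG3")}
for link in build_scan_chain(["REG1", "REG2", "REG3", "REG4"], paths):
    print(link)
```

The area saving comes from the first branch: every reused path costs only a couple of gates on the existing select and load logic instead of a full test multiplexer.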
6. RTL synthesis for BIST

The RTL description of a circuit can be analyzed and modified to become testable using BIST [46]. A testability analysis technique is proposed that identifies test points for BIST hardware insertion. The method makes use of testability metrics for the controllability and observability of the registers of a circuit, and provides a means of measuring the effect of test insertion decisions on test quality. The method has several features, such as an iterative technique for modeling indirect feedback and an extension to the circular BIST methodology. Additionally, a new preprocessing transformation enables the correct modeling of word-level correlation.

While the above work is limited to the RTL data path, a BIST scheme for integrated controller/data path testing has also been presented [47]. It advocates testing the controller/data path pair in an integrated way, rather than testing the data path and controller completely independently by separating each from the system environment during test. The scheme adds a small finite-state machine, known as a ‘‘piggyback FSM’’, to the system between the controller and data path. During the test mode, this FSM is used to enhance the observability of the controller outputs, so that controller faults can be observed at the outputs of the registers of the data path.
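BIST schemes such as these rely on pseudorandom pattern generators (PRPGs) and response compaction hardware. A minimal software model of an LFSR-based PRPG and a MISR-style signature register can be sketched as follows; the 4-bit width and tap mask are illustrative choices, not taken from the cited work:

```python
# Toy models of standard BIST hardware: a Fibonacci LFSR used as a PRPG,
# and a MISR-style register that compacts a response stream into a signature.

def lfsr(seed, taps, width, n):
    """Generate n successive states of a Fibonacci LFSR."""
    state, out = seed, []
    for _ in range(n):
        out.append(state)
        fb = bin(state & taps).count("1") & 1          # XOR of tapped bits
        state = ((state << 1) | fb) & ((1 << width) - 1)
    return out

def misr(responses, taps, width, seed=0):
    """Fold a stream of response words into a single signature."""
    state = seed
    for r in responses:
        fb = bin(state & taps).count("1") & 1
        state = (((state << 1) | fb) ^ r) & ((1 << width) - 1)
    return state

# This particular tap mask gives a maximal-length sequence: all 15 nonzero
# 4-bit states appear before the LFSR repeats.
patterns = lfsr(seed=0b1000, taps=0b1001, width=4, n=15)
signature = misr(patterns, taps=0b1001, width=4)
print(len(set(patterns)), hex(signature))
```

In circular BIST, the same registers play both roles, which is why the testability analysis above must weigh where in the data path to place them.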
7. RTL synthesis for hierarchical testability

A technique has been developed for extracting functional (control/data flow) information from RTL controller/data path circuits, and using it in design for hierarchical testability of these circuits [48]. It identifies a suitable test control/data flow (TCDF) from the RTL circuit and uses it to test
each embedded element in the circuit by symbolically justifying its precomputed test set from the system PIs to the element inputs and symbolically propagating the output response to the system POs. When symbolic justification and propagation become difficult, it inserts test multiplexers at suitable points to increase the symbolic controllability and observability of the circuit [49]. The data path test set is obtained as a by-product of this analysis without any further search; the method is hence very fast. The techniques have also been extended to handle circuits with mixed control/data flow and data paths of non-uniform bit-width. Experimental results demonstrate that the test generation process is 1000—10000 times faster than gate-level ATPG while keeping all test overheads within 4.5%. The fault coverage obtained is above 99%.

The above technique has been modified to facilitate the testing of programmable data paths like application-specific programmable processors (ASPPs) and application-specific instruction processors (ASIPs) [50]. The method utilizes the RTL circuit description of an ASPP or ASIP and derives a set of test microcode patterns which can be written into the instruction ROM of the processor. These lines of microcode dictate a new control/data flow in the circuit and can be used to test modules which are not easily testable. As before, the new control/data flow is used to justify precomputed test sets of a module from the system PIs to the module inputs and propagate output responses from the module outputs to the system POs. The test microcode patterns are a by-product of this analysis. Experimental results show that very high fault coverages can be obtained by this technique with negligible test overheads.

A BIST technique targeting hierarchical testability has also been proposed [51].
In this scheme, a TCDF is extracted as in [48] and used to derive a set of symbolic justification and propagation paths (i.e., a test environment) to test some of the operations and variables present in it. This test environment can then be used to exercise a module or register in the circuit with pseudorandom pattern generators (PRPGs) which are placed only at the PIs of the circuit. The test responses can be analyzed with signature registers (SRs) which are placed only at the POs of the circuit. Experimental results show that high fault coverages (>99%) can be obtained by this method with much lower test overheads than traditional BIST schemes, in a small number of test cycles.
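The idea of a symbolic test environment can be illustrated with a toy justification routine that threads an arbitrary symbol from a PI to a target module input through ‘transparent’ module settings. The netlist encoding, neutral-value rules, and names below are illustrative assumptions, not the exact procedure of [48,51]:

```python
# Sketch of symbolic justification: each module type has a setting that makes
# it transparent, i.e. passes one input to its output unchanged (add with the
# other operand held at 0, multiply with the other operand held at 1, mux
# select steered toward the source). These rules are illustrative.

TRANSPARENT = {
    "add": "hold other input at 0",
    "mul": "hold other input at 1",
    "mux": "set select toward source",
}

def justify(target, drivers, settings=None):
    """drivers: {node: (module_type, source_node)} walking back toward the PIs.
    Returns the list of module settings that carries a symbol from a PI to
    `target`, or None if no transparent path exists (a test mux candidate)."""
    if settings is None:
        settings = []
    if target == "PI":
        return settings
    if target not in drivers:
        return None
    mtype, src = drivers[target]
    return justify(src, drivers, settings + [(target, TRANSPARENT[mtype])])

drivers = {"mult.in1": ("mux", "REG1"), "REG1": ("add", "PI")}
print(justify("mult.in1", drivers))
```

Propagation to a PO is the mirror-image walk; points where both walks fail are exactly where [49] would insert a test multiplexer.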
8. High-level test generation

Hierarchical testing can be used to speed up test generation when circuit descriptions at different levels of the design hierarchy are available. Some of the high-level approaches to test generation are described in this section.

One method assumes that the stimulus/response package for each high-level primitive is predefined [52]. The test set for the module under test is then justified and propagated at the RTL. The existence of I-paths from one module port to another is assumed. The test generation algorithm used is similar to a branch-and-bound algorithm. The approach is restricted to acyclic RTL circuits.

Several techniques for high-level test generation, which exploit only structural information, have been presented [53—55]. These techniques use higher-level modules such as multiplexers, decoders, adders, and multipliers to speed up the justification and propagation phases during test generation. However, these techniques are also limited in that they can only be applied to combinational circuits.

Behavioral information has also been used to test high-level circuits [56]. This method presents the concept of assembling instruction sequences to create test sets. A functional fault model is
assumed. The approach primarily targets microprocessors. Data flow descriptions can be used to determine valid control signals in order to set up the propagation and justification paths [57]. The availability of precomputed test sets for modules is assumed.

RTL information has been used in addition to gate-level information to generate tests for large circuits [58]. The problem of sequential test generation is decomposed into the subproblems of combinational test generation, fault-free state justification, and fault-free state propagation. A combinational test generator generates a test for the fault by assuming full controllability at the pseudo-inputs (register outputs) and full observability at the pseudo-outputs (register inputs). The RTL description is used to justify the values at the pseudo-inputs during state justification and to propagate the fault effects at the pseudo-outputs during state differentiation. An equation-solving approach is coupled with a branch-and-bound algorithm.

A hierarchical test generation approach based on a functional technique to guide backward and forward signal propagation has also been presented [59]. The basic idea of this approach is to use the storage capabilities of the sequential blocks to modify the data propagation through the circuit and then resolve conflicts by delaying some data. Another method uses a high-level synthesized RTL description to compute test sets; a behavioral testability measure is utilized in path justification and propagation during test generation [60].

A high-level ATPG tool has also been developed that uses a functional equivalent model of sequential primitives, making it usable beyond the RTL [61,62]. A gate-level test generator is used to generate test vectors for the module under test. High-level knowledge is applied locally to speed up data implication at each module.
For each pattern to be justified at a module input, an instruction sequence is automatically assembled using a structural data-flow graph. The instruction sequence is obtained by first performing symbolic simulation to create a set of symbolic equations; the solution of these equations leads to the desired instruction sequence.

Global constraints that the design imposes on each module can be passed on to an ATPG tool to generate gate-level tests for each individual module [63]. Subsequently, the module test sets can be combined with the extracted global test modes to generate test vectors that can be applied at the PIs of the hierarchical design. The test generation time has also been improved using symbolic controllability and observability descriptions [64].

An RTL automatic test scheduling technique has been proposed that can be used in lieu of, or to complement, sequential gate-level ATPG [65]. It provides a system with 11 signal types to perform test scheduling at the RTL, which allows module-level precomputed test sets to be used directly for testing. The techniques have been further extended to perform RTL fault simulation [66].

A high-level fault modeling and test generation philosophy has been proposed which is aimed at ensuring full detection of low-level physical faults as well as stuck-at faults [67]. A new functional fault model has been developed that is derived from the circuit under test. A unique feature of this fault model is that detecting faults using this model guarantees the detection of low-level stuck-at faults.
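The equation-solving flavor of these approaches can be illustrated with a toy word-level solver that inverts a module’s function to justify a required output value. The 16-bit width, function name, and operation set are illustrative assumptions, not the machinery of any cited tool:

```python
# Toy word-level backward solving: given a required value at a module output
# and one known (e.g. constant-register) operand, solve for the free operand.
# Real equation-solving ATPG handles systems of such equations plus
# branch-and-bound over the remaining choices.

MASK = 0xFFFF  # illustrative 16-bit data path

def solve_backward(op, required, known):
    """Solve required = op(x, known) for x over 16-bit words."""
    if op == "add":                      # required = x + known
        return (required - known) & MASK
    if op == "sub":                      # required = x - known
        return (required + known) & MASK
    if op == "xor":                      # required = x ^ known
        return required ^ known
    raise ValueError("no word-level inverse for " + op)

# To see 0x1234 at an adder output whose other operand is the constant 0x00FF,
# drive the free input with the solved value and re-check forward:
x = solve_backward("add", 0x1234, 0x00FF)
print(hex(x), hex((x + 0x00FF) & MASK))
```

Operations without a word-level inverse (e.g. AND) are where such tools fall back on search rather than direct solving.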
9. Conclusions

This paper presented an overview of various high-level SFT, DFT and test generation techniques. Though these methods show great promise, there are three major issues which still need to
be addressed at the higher levels: (i) the existing techniques are mostly applicable to data-flow-intensive and arithmetic-intensive designs like those used in digital signal processing filters; some control-flow constructs like conditionals and loops have also been tackled, but heavily control-dominated designs like communicating FSMs remain to be addressed; (ii) the techniques need to be extended to handle today’s complex general-purpose microprocessors so that automatic test set and test program generation can be done at higher levels; (iii) the most common fault model used in the existing high-level approaches is the stuck-at fault model; delay faults and IDDQ-testable faults also need to be addressed.
References

[1] K.D. Wagner, S. Dey, High-level synthesis for testability: a survey and perspective, Proc. Design Automation Conf., June 1996, pp. 131—136.
[2] K.-T. Cheng, V.D. Agrawal, A partial scan method for sequential circuits with feedback, IEEE Trans. Comput. 39 (1990) 544—548.
[3] D.-H. Lee, S.M. Reddy, On determining scan flip-flops in partial scan designs, Proc. Int. Conf. Computer-Aided Design, November 1990, pp. 322—325.
[4] T.C. Lee, W.H. Wolf, N.K. Jha, J.M. Acken, Behavioral synthesis for easy testability in data path allocation, Proc. Int. Conf. Computer Design, October 1992, pp. 29—32.
[5] T.C. Lee, W.H. Wolf, N.K. Jha, Behavioral synthesis for easy testability in data path scheduling, Proc. Int. Conf. Computer-Aided Design, November 1992, pp. 616—619.
[6] M. Potkonjak, S. Dey, R. Roy, Behavioral synthesis of area-efficient testable designs using interaction between hardware sharing and partial scan, IEEE Trans. Comput. Aided Des. 14 (1995) 1141—1154.
[7] T.C. Lee, N.K. Jha, W.H. Wolf, Behavioral synthesis of highly-testable data paths under non-scan and partial scan environments, Proc. Design Automation Conf., June 1993, pp. 292—297.
[8] T.C. Lee, N.K. Jha, W.H. Wolf, A conditional resource sharing method for behavioral synthesis of highly-testable data paths, Proc. Int. Test Conf., October 1993, pp. 749—753.
[9] A. Mujumdar, K. Saluja, R. Jain, Incorporating testability considerations in high-level synthesis, Proc. Int. Symp. Fault-Tolerant Comput., July 1992, pp. 272—279.
[10] A. Mujumdar, R. Jain, K. Saluja, Behavioral synthesis of testable designs, Proc. Int. Symp. Fault-Tolerant Comput., June 1994, pp. 436—445.
[11] Z. Peng, K. Kuchcinski, Automated transformation of algorithms into register-transfer level implementations, IEEE Trans. Comput. Aided Des. 13 (1994) 436—445.
[12] X. Gu, K. Kuchcinski, Z. Peng, Testability analysis and improvement from VHDL behavioral specifications, Proc. European Conf. Design Automation, September 1994, pp.
644—649.
[13] Z. Peng, Testability-driven high-level synthesis, Proc. Int. Conf. ASIC, March 1994, pp. 123—126.
[14] V. Fernandez, P. Sanchez, E. Villar, High-level synthesis guided by testability measures, Int. Test Synthesis Workshop, May 1994.
[15] M.L. Flottes, R. Pires, B. Rouzeyre, Analyzing testability from behavioral to RT level, Proc. European Design and Test Conf., March 1997, pp. 158—165.
[16] M.L. Flottes, D. Hammad, B. Rouzeyre, High-level synthesis for easy testability, Proc. European Design and Test Conf., March 1995, pp. 198—206.
[17] C.H. Chen, T. Karnik, D.G. Saab, Structural and behavioral synthesis for testability techniques, IEEE Trans. Comput. Aided Des. 13 (1994) 777—785.
[18] P. Vishakantaiah, J. Abraham, M. Abadir, Automatic test knowledge extraction from VHDL (ATKET), Proc. Design Automation Conf., June 1992, pp. 273—278.
[19] P. Vishakantaiah, T. Thomas, J.A. Abraham, M. Abadir, AMBIANT: automatic generation of behavioral modifications for testability, Proc. Int. Conf. Computer Design, October 1993, pp. 63—66.
[20] T. Thomas, P. Vishakantaiah, J.A. Abraham, Impact of behavioral modifications for testability, Proc. VLSI Test Symp., April 1994, pp. 427—432.
[21] S. Dey, M. Potkonjak, Transforming behavioral specifications to facilitate synthesis for testable designs, Proc. Int. Test Conf., October 1994, pp. 184—193.
[22] F.F. Hsu, E.M. Rudnick, J.H. Patel, Enhancing high-level control-flow for improved testability, Proc. Int. Conf. Computer-Aided Design, November 1996, pp. 322—328.
[23] L.T. Wang, E.J. McCluskey, Concurrent built-in logic block observer (CBILBO), Proc. Int. Symp. Fault-Tolerant Comput., May 1986, pp. 1054—1057.
[24] L. Avra, Allocation and assignment in high-level synthesis for self-testable datapaths, Proc. Int. Test Conf., June 1991, pp. 463—471.
[25] C.A. Papachristou, S. Chiu, H. Harmanani, A data path synthesis method for self testable designs, Proc. Design Automation Conf., June 1991, pp. 378—384.
[26] H. Harmanani, C.A. Papachristou, S. Chiu, M. Nourani, SYNTEST: an environment for system-level design and test, Proc. European Conf. Design Automation, March 1992, pp. 402—407.
[27] H. Harmanani, C.A. Papachristou, An improved method for RTL synthesis with testability tradeoffs, Proc. Int. Conf. Computer-Aided Design, June 1993, pp. 30—35.
[28] I. Parulkar, S. Gupta, M. Breuer, Data path allocation for synthesizing RTL designs with low BIST area overhead, Proc. Design Automation Conf., June 1996, pp. 395—401.
[29] L. Avra, E.J. McCluskey, Synthesizing for scan dependence in built-in self-testable designs, Proc. Int. Test Conf., October 1993, pp. 734—734.
[30] A. Orailoglu, I.G. Harris, Microarchitectural synthesis for rapid BIST testing, IEEE Trans. Comput. Aided Des. 16 (1997) 573—586.
[31] C.A. Papachristou, J.E. Carletta, Test synthesis in the behavioral domain, Proc. Int. Test Conf., October 1995, pp. 693—702.
[32] K.A. Ockunzzi, C.A. Papachristou, Test enhancement for behavioral descriptions containing conditional statements, Proc. Int.
Test Conf., November 1997, pp. 236—245.
[33] N. Mukherjee, M. Kassab, J. Rajski, J. Tyszer, Arithmetic built-in self-test for high level synthesis, Proc. VLSI Test Symp., May 1995, pp. 132—139.
[34] S. Bhatia, N.K. Jha, Genesis: a behavioral synthesis system for hierarchical testability, Proc. European Design and Test Conf., February 1994, pp. 272—276.
[35] S. Bhatia, N.K. Jha, Behavioral synthesis for hierarchical testability of controller/data path circuits with conditional branches, Proc. Int. Conf. Computer Design, October 1994, pp. 91—96.
[36] S. Bhattacharya, F. Brglez, S. Dey, Transformations and resynthesis for testability of RTL control-datapath specifications, IEEE Trans. VLSI Systems 1 (1993) 304—318.
[37] S. Dey, M. Potkonjak, Non-scan design-for-testability technique of RT level circuits, Proc. Int. Conf. Computer-Aided Design, November 1994, pp. 640—645.
[38] S. Dey, V. Gangaram, M. Potkonjak, A controller-based design-for-testability technique for controller-datapath circuits, Proc. Int. Conf. Computer-Aided Design, November 1995, pp. 534—540.
[39] S. Ravi, I. Ghosh, R.K. Roy, S. Dey, Controller resynthesis for testability enhancement of RTL controller/data path circuits, Proc. Int. Conf. VLSI Design, January 1998, pp. 193—198.
[40] V. Chickermane, J. Lee, J.H. Patel, Addressing design for testability at the architectural level, IEEE Trans. Comput. Aided Des. 13 (1994) 920—934.
[41] M.S. Abadir, M. Breuer, A knowledge-based system for designing testable VLSI chips, IEEE Des. Test Comput. (1985) 56—58.
[42] S. Freeman, Test generation for data-path logic: the F-path method, IEEE J. Solid-State Circuits 23 (1988) 421—427.
[43] S. Bhattacharya, S. Dey, H-scan: a high level alternative to full-scan testing with reduced area and test application overheads, Proc. VLSI Test Symp., April 1996, pp. 74—80.
[44] S. Bhattacharya, S. Dey, B. Sengupta, An RTL methodology to enable low overhead combinational testing, Proc.
European Design and Test Conf., March 1997, pp. 146—152.
[45] T. Asaka, S. Bhattacharya, S. Dey, M. Yoshida, An efficient low-overhead approach using RTL design for testability technique with scan flip-flops, Proc. Int. Test Conf., November 1997, pp. 265—274.
[46] J.E. Carletta, C.A. Papachristou, Testability analysis and insertion for RTL circuits based on pseudorandom BIST, Proc. Int. Conf. Computer Design, November 1995, pp. 162—167.
[47] M. Nourani, J.E. Carletta, C.A. Papachristou, A scheme for integrated controller-datapath fault testing, Proc. Design Automation Conf., June 1997, pp. 546—551.
[48] I. Ghosh, A. Raghunathan, N.K. Jha, A design for testability technique for RTL circuits using control/data flow extraction, Proc. Int. Conf. Computer-Aided Design, November 1996, pp. 329—336.
[49] I. Ghosh, A. Raghunathan, N.K. Jha, Design for hierarchical testability of RTL circuits obtained by behavioral synthesis, IEEE Trans. Comput. Aided Des. 16 (1997) 1001—1014.
[50] I. Ghosh, A. Raghunathan, N.K. Jha, Hierarchical test generation and design for testability of ASPPs and ASIPs, Proc. Design Automation Conf., June 1997, pp. 534—539.
[51] I. Ghosh, N.K. Jha, S. Bhawmik, A BIST scheme for RTL controller/data paths based on symbolic testability analysis, Proc. Design Automation Conf., June 1998.
[52] B.T. Murray, J.P. Hayes, Hierarchical test generation using precomputed test sets for modules, IEEE Trans. Comput. Aided Des. 9 (1990) 594—603.
[53] S.J. Chandra, J.H. Patel, A hierarchical approach to test vector generation, Proc. Design Automation Conf., June 1987, pp. 495—501.
[54] R.P. Kunda, P. Narain, J.A. Abraham, B.D. Rathi, Speed up of test generation using high level primitives, Proc. Design Automation Conf., June 1990, pp. 594—599.
[55] D. Bhattacharya, J.P. Hayes, A hierarchical test generation methodology for digital circuits, J. Electron. Testing Theory Appl. 1 (1990) 103—123.
[56] D. Brahme, J. Abraham, Functional testing of microprocessors, IEEE Trans. Comput. 33 (1984) 475—485.
[57] K. Roy, J.A. Abraham, High-level test generation using data flow descriptions, Proc. European Conf. Design Automation, March 1990, pp. 480—484.
[58] A. Ghosh, S. Devadas, A.R.
Newton, Sequential test generation and synthesis for testability at the register transfer and logic levels, IEEE Trans. Comput. Aided Des. 12 (1993) 579—588.
[59] M. Karam, R. Leveugle, G. Saucier, Hierarchical test generation methodology based on delayed propagation, Proc. Int. Test Conf., October 1991, pp. 739—747.
[60] C. Chen, C. Wu, D.G. Saab, Beta: behavioral testability analysis, Proc. Int. Conf. Computer-Aided Design, November 1991, pp. 202—205.
[61] J. Lee, J.H. Patel, An architectural level test generator for a hierarchical design environment, Proc. Int. Symp. Fault-Tolerant Comput., June 1991, pp. 44—51.
[62] J. Lee, J.H. Patel, A signal-driven discrete relaxation technique for architectural level test generation, Proc. Int. Conf. Computer-Aided Design, November 1991, pp. 458—461.
[63] P. Vishakantaiah, J.A. Abraham, D. Saab, CHEETA: composition of hierarchical sequential tests using ATKET, Proc. Int. Test Conf., October 1993, pp. 606—615.
[64] J. Steensma, F. Catthoor, H. De Man, Test of high throughput data paths with symbolic controllability and observability descriptions, Proc. 6th Workshop New Directions for Testing, May 1992, pp. 67—76.
[65] S. Yadavalli, I. Pomeranz, S.M. Reddy, MUSTC-testing: multi-stage-combinational test scheduling at the register-transfer level, Proc. Int. Conf. VLSI Design, January 1995, pp. 110—115.
[66] S. Yadavalli, Register-transfer level test generation and test synthesis strategies for data-flow data-paths, Ph.D. Thesis, Dept. of Electrical Engr., Univ. of Iowa, June 1996.
[67] M.C. Hansen, Symbolic functional test generation with guaranteed low-level fault detection, Ph.D. Thesis, Dept. of Computer Science and Engr., Univ. of Michigan, June 1996.
Indradeep Ghosh (S’95-M’98) received the B.Tech. degree in Computer Science and Engineering from the Indian Institute of Technology, Kharagpur, India, in 1993, and the M.A. and Ph.D. degrees in Electrical Engineering from Princeton University, Princeton, NJ, in 1995 and 1998, respectively. Currently he is a member of the research staff in the Advanced CAD Research group at Fujitsu Laboratories of America, Sunnyvale, CA. He has co-authored a paper which won the Honorable Mention Award at the International Conference on VLSI Design (1998). His research interests include design for testability, high-level test generation, BIST, high-level synthesis and high-level design verification.
Niraj K. Jha (S’85-M’85-SM’93-F’98) received his B.Tech. degree in Electronics and Electrical Communication Engineering from the Indian Institute of Technology, Kharagpur, India in 1981, M.S. degree in Electrical Engineering from S.U.N.Y. at Stony Brook, NY in 1982, and Ph.D. degree in Electrical Engineering from the University of Illinois, Urbana, IL in 1985. He is an Associate Professor of Electrical Engineering at Princeton University. He has served as an Associate Editor of IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, and is currently serving as an Associate Editor of IEEE Transactions on VLSI Systems and of Journal of Electronic Testing: Theory and Applications. He has also served as the Program Chairman of the 1992 Workshop on Fault-Tolerant Parallel and Distributed Systems. He is the recipient of the AT&T Foundation award and the NEC Preceptorship award for research excellence. He has co-authored two books titled Testing and Reliable Design of CMOS Circuits (Kluwer) and High-Level Power Analysis and Optimization (Kluwer). He has authored or co-authored more than 140 technical papers. He has co-authored three papers which have won the Best Paper Award at the IEEE International Conference on Computer Design (1993), IEEE International Symposium on Fault-Tolerant Computing (1997), and International Conference on VLSI Design (1998). He has also received nominations for Best Paper Awards at three other conferences. His research interests include digital system testing, fault-tolerant computing, computer-aided design of integrated circuits, distributed computing and real-time computing.