Microprocessing and Microprogramming 37 (1993) 211-214, North-Holland
An Optimized Architecture for a Rapid-Prototype-Emulator
G. vom Bögel, K. Scherer, M. Bollerott
Fraunhofer Institute of Microelectronic Circuits and Systems, Finkenstr. 61, 4100 Duisburg 1, Germany
Abstract
An ASIC prototype based on field-programmable gate arrays (FPGAs) is a fast and cheap alternative to a silicon-based prototype for design verification. To meet the high performance and complex functionality of the digital part of ASICs, several FPGAs must be connected to form an array. The connection-structure determines the realtime capability and the variety of designs that can be implemented in the FPGA-array. A new architecture for the connection-structure of FPGAs is presented as the basis of a rapid-prototype emulation-system. Compared with conventional systems, this architecture leads to a better utilization of the FPGAs and, for most designs, to an emulation timing close to realtime.
Key-words: in-circuit-emulation, embedded circuit, rapid-prototyping, realtime-verification, FPGA-array, connection-structure
1. Introduction
Rapid access to a prototype is becoming more and more important from both the technical and the economic point of view, as it allows interactive design specification and extensive verification in the embedding environment of a system ASIC. Emulation is therefore an attempt to avoid the cost and time of redesigning fabricated silicon (silicon-based prototypes). The introduction of complex field-programmable gate arrays (FPGAs) allows a new kind of rapid prototyping for embedded circuits [1]. The Rapid-Prototype Emulator (RPE) presented in this paper improves performance by means of a special structure of inter-FPGA connections, described below.
2. The RPE-concept
A workstation is used for design capture (e.g. schematic, behaviour), presimulation, transformation and compilation to downloadable netlists. In addition, software for emulation control and emulation monitoring runs on the workstation. The embedding environment is connected to the RPE-system by a plug and an emulator cable (figure 1). To adapt well to the environments of a variety of designs, the RPE has a modular structure. The hardware functions to be implemented in the emulator are divided into blocks and allocated to specific modules such as logic-modules, processor-modules and analog-modules (figure 2).
Fig. 1 General view of the RPE-System (block labels: workstation, Rapid-Prototype Emulator, embedding environment)
3. Structure of the logic-module
One RPE-logic-module contains only four LCAs. This is a very small number compared with the well-known module of the Rapid Prototype Machine RPM (QUICKTURN SYSTEMS, CA, USA) [2], which is realised as a 6x6 LCA-array with an edge-to-edge connection structure between adjacent FPGAs. Figure 3 shows the hardwired structure (FPGA interconnects) of both modules. In both systems the user port (I/O) is the interface to the surroundings (embedding environment, logic analyzer etc.), while the module port (M) provides the connection to further modules. The architecture of the RPE-logic-module described in this paper is the result of an extensive investigation of a variety of multi-LCA-arrays with different numbers of LCAs (up to 36 LCAs per module) and different connection-structures. Most of the routing-channels of the chosen structure realize connections between two LCAs and the user- or module-port (figure 3). This particular structure allows a direct connection to be routed from any LCA to every other LCA on any module. It thereby minimizes the number of intra-LCA-bypasses, so that a drastic increase in the utilization of the LCAs is possible. The timing behaviour becomes determinable and the maximum emulation clock rate can be calculated.
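The bypass argument can be made concrete with a small sketch. It is not taken from the paper: it assumes that a 6x6 edge-to-edge array connects each LCA only to its four direct neighbours, that only the lower and upper edges wrap around (as the Fig. 3 label suggests), and that signals follow the shortest path. Under these assumptions it counts how many intermediate LCAs a connection has to pass through in the worst case. The function names are hypothetical.

```python
from collections import deque


def hop_counts(rows, cols, start, wrap_vertical=True):
    """Breadth-first search: hop distance from `start` to every LCA in the grid."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if wrap_vertical:
                nr %= rows  # assumption: lower edge connected with upper edge
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def worst_case_bypasses(rows, cols, wrap_vertical=True):
    """Largest number of intermediate LCAs on any shortest LCA-to-LCA path."""
    worst = 0
    for r in range(rows):
        for c in range(cols):
            farthest = max(hop_counts(rows, cols, (r, c), wrap_vertical).values())
            worst = max(worst, farthest - 1)  # hops minus one = LCAs passed through
    return worst


if __name__ == "__main__":
    print("6x6 edge-to-edge, vertical wrap:", worst_case_bypasses(6, 6))
    print("6x6 edge-to-edge, no wrap:      ", worst_case_bypasses(6, 6, False))
```

In the RPE structure as described above, a direct route between any two LCAs would need no such bypass, which is what frees LCA resources for the design itself.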
Fig. 2 Modular hardware-structure of the RPE (block labels: workstation, control processor board, logic-modules, configuration and control bus (VMEbus), module-bus)

The logic-modules consist of interconnected FPGAs (LCAs from XILINX). Depending on the complexity of a circuit, up to four logic-modules can be used. To increase the emulation power in hardware/software co-design, additional processor-modules can be used to run software-implemented functions of the design. The RPE can easily be integrated into the design-tool environment. The interface between the design tools and the RPE is a standard netlist format (e.g. EDIF). In this way, design capture on multiple levels is possible.
Fig. 3 Connection-structure of the QUICKTURN RPM-module and the FhG-IMS RPE-logic-module [2,3] (labels: LCA, user-port, module-port; RPM-module: lower edge connected with upper edge)
The concept of the RPE-logic-module aims at the following improvements compared with the RPM-module [2]:
- Utilization of LCAs of up to 50 % (RPM: less than 10 %), leading to approximately 15 000 usable gates per module (RPM: 25 000 usable gates).
- A high routing flexibility based on the special kind of routing-channels. Because of the edge-to-edge connection of the RPM-module, a large number of bypasses through other LCAs is required to connect those clusters of the implemented circuit that are not placed in neighbouring LCAs.
- A calculable emulation clock rate in ranges up to 10 MHz, depending on the design. Long routing-paths in the RPM-module cause additional delays, which results in a relatively low permissible clock rate (1 - 2 MHz).
- A short implementation time of the prototype. The large number of LCAs in the RPM-module requires a long time to partition, place and route the design.
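The gate figures in the first item can be put on a per-device basis with a few lines of arithmetic. The sketch below uses only the numbers quoted above (4 LCAs per RPE-logic-module, 36 LCAs per RPM-module). The "implied capacity" is a back-calculation, not a figure from the paper, and for the RPM it treats "less than 10 %" as exactly 10 %, so the result is only a lower bound. The function names are hypothetical.

```python
def usable_gates_per_lca(usable_gates: float, lca_count: int) -> float:
    """Usable gates contributed by each LCA of a module, on average."""
    if lca_count <= 0:
        raise ValueError("lca_count must be positive")
    return usable_gates / lca_count


def implied_capacity_per_lca(usable_gates: float, utilization: float,
                             lca_count: int) -> float:
    """Raw gate capacity per LCA implied by the quoted utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return usable_gates_per_lca(usable_gates, lca_count) / utilization


if __name__ == "__main__":
    # Figures quoted in the list above.
    rpe = {"usable_gates": 15_000, "utilization": 0.50, "lca_count": 4}
    # "less than 10 %" taken as 10 %: the implied capacity is a lower bound.
    rpm = {"usable_gates": 25_000, "utilization": 0.10, "lca_count": 36}

    for name, m in (("RPE", rpe), ("RPM", rpm)):
        per_lca = usable_gates_per_lca(m["usable_gates"], m["lca_count"])
        capacity = implied_capacity_per_lca(**m)
        print(f"{name}: {per_lca:.0f} usable gates per LCA, "
              f"implied capacity {capacity:.0f} gates per LCA")
```

Read this way, each LCA in the RPE-logic-module carries several times more usable logic than an LCA in the RPM-module, even though the RPM-module offers more usable gates in total.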
4. Estimation of the emulation clock rate
One of the main problems of in-circuit emulation is determining the clock rate at which the implemented prototype is able to operate. It must be higher than the desired clock rate of the target system. The following three items describe the preconditions for determining the maximum emulation clock rate:
- There is a technology constant k that describes, to a first approximation, the ratio of the LCA-internal timing behaviour ($t_{LCA}$) to the timing behaviour of the target technology ($t_{TT}$) (e.g. standard-cell ASIC):
  $k = t_{LCA} / t_{TT}$
  The technology constant k depends on the technology of the ASIC process and the speed class of the LCA devices. For the in-house 2.5 µm CMOS process [5] and the LCA type 3090-70, for example, $k \approx 1$.
- The theoretical maximum clock rate at which the designed ASIC can operate is the maximum design clock rate $f_D$. It depends on the maximum delay of a signal path between the inputs of two consecutive flip-flops. This delay determines the minimum period $T_{min}$ of the clock.
- The additional delay $t_{offset}$ is caused by buffers (LCA I/O buffers, user- and module-port buffers) inserted into the signal paths that connect LCAs with each other or with the embedding environment. This delay is added to the normal delay of the signal paths.

When implementing a design, three different occupation levels of the RPE are possible:
Level 1: All critical signal paths are routed inside the LCAs. The influence of the buffers is low.
Level 2: Critical signal paths can occur between LCAs on one module or between an LCA and the embedding environment.
Level 3: Critical signal paths can occur between the modules.

The final equation to estimate the maximum emulation clock rate $f_E$ is:

$f_E = \frac{f_D}{k + f_D \cdot t_{offset}}$

Figure 4 shows the relationship between $f_D$ and $f_E$ graphically, for a technology constant of k = 1. Table 1 lists some sample values of this relationship.
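The estimate can be read as an emulation clock period of k / f_D + t_offset, whose reciprocal gives the equation above. The following minimal sketch evaluates it. The paper does not state the offset delays behind Table 1, so the three values used here (36 ns, 72 ns and 88 ns) are assumptions back-calculated from the Table 1 entries with k = 1. The function names are hypothetical.

```python
def max_emulation_clock_hz(f_d_hz: float, k: float, t_offset_s: float) -> float:
    """Maximum emulation clock rate f_E = f_D / (k + f_D * t_offset)."""
    if f_d_hz <= 0 or k <= 0 or t_offset_s < 0:
        raise ValueError("f_d_hz and k must be positive, t_offset_s non-negative")
    return f_d_hz / (k + f_d_hz * t_offset_s)


def implied_offset_s(f_d_hz: float, f_e_hz: float, k: float = 1.0) -> float:
    """Solve the same equation for t_offset: t_offset = 1/f_E - k/f_D."""
    return 1.0 / f_e_hz - k / f_d_hz


# ASSUMPTION: offsets per occupation level, back-calculated from Table 1
# (k = 1). They are not given in the paper.
ASSUMED_OFFSET_S = {1: 36e-9, 2: 72e-9, 3: 88e-9}

if __name__ == "__main__":
    print("f_D [MHz]  Level 1  Level 2  Level 3")
    for f_d_mhz in (1, 2, 5, 10, 20):
        row = [
            max_emulation_clock_hz(f_d_mhz * 1e6, 1.0, t_off) / 1e6
            for t_off in ASSUMED_OFFSET_S.values()
        ]
        print(f"{f_d_mhz:9d}  " + "  ".join(f"{f_e:7.2f}" for f_e in row))
```

With these assumed offsets the computed values agree with the Table 1 entries to within a few hundredths of a MHz. The sketch also makes visible why the curves flatten: as f_D grows, f_E approaches 1 / t_offset regardless of the design.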
Fig. 4 The relationship between f_D and f_E (k = 1) (axes: maximum design clock rate f_D [MHz] and maximum emulation clock rate [MHz]; curve labels: ideal, emulation level)
f_D [MHz]    f_E [MHz] by occupation level
             Level 1    Level 2    Level 3
    1          0.97       0.93       0.92
    2          1.87       1.75       1.70
    5          4.23       3.68       3.45
   10          7.35       5.80       5.32
   20         11.6        8.20       7.25
Table 1 The relationship between f_D and f_E (k = 1)

For clock rates up to 1 MHz, the agreement between the timing of the emulated design and that of the realized design may be considered ideal. In the range of 1 to 5 MHz, agreement is achieved when the design has been optimized for this requirement (e.g. with respect to critical paths). Above 5 MHz it is hard to achieve agreement between the timing of the emulation and that of the realization. The difficulties in this case depend on the kind of implementation of the design (occupation level) and on the value of the technology constant k. With greater effort in the design implementation, however, it is possible to reach clock rates above 10 MHz.

5. Conclusion
With the RPE we have shown that it is possible to realize realtime-capable prototypes that operate in the clock ranges relevant to the application. Furthermore, the results of our investigation have shown that the use of a problem-oriented connection-structure increases the utilization of the LCAs used and drastically decreases the cost of the emulator. Further research and development activities aim at optimized automation of the design implementation, easier handling of the emulation-system, and more efficient control and monitoring of the emulation process. In addition, we will extend the RPE capabilities for hardware/software co-emulation.

References
1. K. Scherer, O. Rettig: Rapid Prototyping microelektronischer Hardware-Software-Systeme durch Emulation (Rapid prototyping of microelectronic hardware-software-systems by emulation). In: B. Reusch (ed.), Informatik Fachberichte 255, p. 285ff, Springer Verlag, 1990 (in German).
2. M. D'Amour: ASIC-emulation cuts design-risk. High Performance Systems, 10/89.
3. G. vom Bögel: Theoretische Untersuchungen zum Aufbau einer softwaremäßig konfigurierbaren Logikanordnung für einen ASIC-Emulator (Theoretical investigation of the construction of a configurable connecting-structure of FPGAs for an ASIC-emulator). Examination at University of Duisburg, 11/90 (in German).
4. G. vom Bögel: Entwurf und Aufbau eines VMEbus-kompatiblen, softwarekonfigurierbaren Logikmoduls für einen ASIC-Emulator (Development of a VMEbus-compatible, configurable logic-module for an ASIC-emulator). Examination at University of Duisburg, 7/91 (in German).
5. FhG-IMS: IMS-standard-cell catalog. Duisburg, 6/88.