Microelectronics Journal 33 (2002) 403–407
www.elsevier.com/locate/mejo
Improved structure for adiabatic CMOS circuits design
G. Hang a,*, X. Wu b,1
a Department of Information and Electronic Engineering, Zhejiang University, Hangzhou 310027, People's Republic of China
b Institute of Circuits and Systems, Ningbo University, Ningbo 315211, People's Republic of China
Received 7 September 2001; accepted 1 February 2002
Abstract
Two adiabatic circuits with complementary structure and operation are proposed in this paper. They employ a two-phase sinusoidal power clock. The power consumption of the proposed circuits is comparable to that of some previously reported circuits. The problem of floating output nodes is solved by connecting two MOS transistors to the power clock. In particular, with the proposed architecture more than one stage of gates can be computed simultaneously within a single clock-phase, whereas most other adiabatic logic families compute only one stage in every phase. With this feature, the latency of a complex logic circuit is greatly reduced and the number of buffers required for a pipelined circuit is also reduced. A 2:1 multiplexer and a full adder are illustrated and simulated. The PSPICE simulation results validate the effectiveness of the proposed approach and the low power characteristic of the designed circuits. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Low power design technique; Adiabatic circuits; Energy recovery
* Corresponding author. Fax: +86-574-87604591. E-mail addresses: [email protected] (G. Hang), [email protected] (X. Wu).
1 Also corresponding author.
0026-2692/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0026-2692(02)00011-3

1. Introduction

The power dissipation in CMOS circuits is related to the type of energy conversion. In static CMOS circuits, a DC power supply is used and the signal value is realized by charging (or discharging) the node capacitance. During this process, charge is drawn from the power supply $V_{dd}$, transported to the node capacitance, and then returned to the ground terminal, resulting in an irreversible conversion of electric energy into heat. As a result, each time a node capacitance is charged (or discharged), an energy of $\frac{1}{2}CV_{dd}^2$ is dissipated, where $C$ is the node capacitance. Reducing the energy dissipation has therefore been equated with reducing the switching activity, and low power design targeting minimum switching activity has received considerable attention. This approach is especially suitable for low power sequential circuits, where evident energy savings are observed [1–3]. However, the energy saving obtained is still limited.

Adiabatic computation has been widely studied as a low power design technique. In recent years, several adiabatic or energy recovery logic architectures have been proposed [4–7]. In Refs. [8,9], the adiabatic principle is extended to the design of adiabatic flip-flops and sequential circuits. These circuits achieve significant power savings compared to conventional CMOS circuits. Their outputs, however, are only valid during a particular phase of the power clock cycle. Hence, multiple-phase clocking is required to drive a chain of cascaded adiabatic logic circuits. The need for a multiple-phase power clock not only increases the power dissipation of the clocking network, but also adds complexity to both the logic and the required power clock generator. The 2N-2P and 2N-2N2P circuits [4] and the efficient charge recovery logic circuit (ECRL) [5] require four-phase clocking. The pass-transistor adiabatic logic (PAL) [6] requires only two-phase clocking and eliminates the path to ground, thereby achieving higher power saving. However, its floating low output affects the circuit performance. The PAL-2N circuit [7] uses two minimum-sized pull-down nMOS transistors to solve this problem, so the output waveform is significantly improved, but at the expense of four-phase clocking for cascading purposes.

Furthermore, a common feature of the above circuits is that the input signals and the output signals of a circuit do not appear in the same clock-phase [10], which gives combinational circuits a sequential characteristic. Therefore, the latency of a complex logic circuit with multiple cascaded stages increases with its logic depth. In the circuit proposed in Ref. [11], the output signal responds in the same clock-phase as the input signal (that is, the circuit is able to receive input signals in phase with its power clock and evaluate the
logic value concurrently rather than in the next phase). With this feature, more than one stage of gates can be computed simultaneously within the same clock-phase. This reduces the latency of complex logic circuits and the number of buffers required for a pipelined circuit [11].

In this article, we propose two adiabatic circuits with complementary structure and operation. They are suitable for adiabatic circuit design when base-0 signals and base-1 signals are required, respectively. All four types of clocked signals [12] can be obtained by incorporating both proposed complementary circuits in a circuit design. The power consumption of the proposed circuits is comparable to that of the above-mentioned circuits. The proposed circuits employ a two-phase sinusoidal power clock, and the output signals respond to the input signals in the same clock-phase (that is, several stages can be evaluated in every phase). Therefore, the latency of the overall circuit is greatly reduced and the number of buffers required for a pipelined structure is also reduced. The proposed circuits also eliminate the path to ground, just as PAL circuits do. However, floating nodes are avoided by connecting two MOS transistors to the power clock in each circuit, which clamps the clocked output signals to either the high level or the low level of the power clock during the output valid time. A 2:1 multiplexer and a full adder (FA) are constructed and simulated using the proposed technique to illustrate the effectiveness of this approach.

2. Basic structure and operation

The basic adiabatic computing circuits proposed in this paper are shown in Fig. 1. All the input signals and the output signals in the schematics are clocked signals, where both $x^0$ and $\bar{x}^0$ represent base-0 signals (that is, the clocked signal displays its true logic value at the peak of the power clock and is set to its `base' value `0' at the valley of the power clock), and both $x^1$ and $\bar{x}^1$ represent base-1 signals (that is, the clocked signal displays its true logic value at the valley of the power clock and is set to its `base' value `1' at the peak of the power clock). Fig. 1(a) converts base-1 signals into base-0 signals (henceforth referred to as the n-logic base-converting circuit), whereas Fig. 1(b) converts base-0 signals into base-1 signals (henceforth referred to as the p-logic base-converting circuit).

Fig. 1. (a) n-Logic base-converting circuit; (b) p-logic base-converting circuit; (c) inverter/buffer with base-0 input/output signals; (d) inverter/buffer with base-1 input/output signals.

In the following, we take the circuit shown in Fig. 1(a) as an example to explain the working principle. In comparison with the PAL circuit, the circuit shown in Fig. 1(a) has two additional low-level clamping transistors, MN3 and MN4, and an additional power clock $\overline{clk}$. In fact, $\overline{clk}$ is just the power clock of the subsequent stage; therefore, compared to the PAL circuit, the number of power clocks is not actually increased in circuit designs (the PAL circuit requires two-phase clocking too). If the input signals are base-0 signals, the function of the circuit is identical to that of the PAL circuit, while the floating output node is avoided owing to the additional transistors MN3 and MN4. Therefore, the output waveform is improved in comparison to the PAL circuit [13].

Now, we discuss the case of base-1 input signals. Initially, when $clk = 0$, both output nodes are discharged and therefore MN3 and MN4 are cut off. When $clk$ starts rising from 0, the circuit enters the evaluation period. Assume that $x^1$ is high and $\bar{x}^1$ is low; then MN1 is switched on and MN2 is cut off. Therefore, the output $x^0$ begins to ramp up following the power clock $clk$. Once the voltage difference between $x^0$ and $\bar{x}^0$ rises above $|V_{tp}|$ (where $V_{tp}$ is the threshold voltage of the pMOS), MP1 turns on and $x^0$ is charged through MP1 up to the peak of $clk$. Therefore, the high-level output is clamped to the peak of the power clock.

Without MN4, since MN2 and MP2 stay off, the low-level output node $\bar{x}^0$ would float. Adding the transistor MN4 solves this floating problem. When $clk$ ramps up and the voltage difference between $clk$ and $\bar{x}^0$ rises above $|V_{tp}|$, MP1 turns on, and MN4 is then also switched on if $|V_{tp}| \geq V_{tn}$ (where $V_{tn}$ is the threshold voltage of MN4). Thus, the output node $\bar{x}^0$ is connected to $\overline{clk}$ through the conducting transistor MN4. Since $clk$ and $\overline{clk}$ are inverted with respect to each other, $\overline{clk}$ goes down while $clk$ ramps up. Therefore, when $x^0$ reaches the peak of $clk$, $\overline{clk}$ has dropped to its valley, and the low-level output node $\bar{x}^0$ is clamped to the valley of the power clock. As $\bar{x}^0$ is low, MN3 always stays off. When $clk$ ramps down, $x^0$ follows $clk$ down through MP1, which realizes energy recovery. Therefore, the circuit converts base-1 signals into base-0 signals while the logic values are unchanged. The circuit shown in Fig. 1(b) can be analyzed similarly; there, the problem of the floating output node is solved by two additional pMOS transistors (MP3, MP4) connected to the power clock clk.

Note that the structure and the operation of the circuit shown in Fig. 1(a) are complementary to those of the circuit in Fig. 1(b). The combination of both circuits can be used to design adiabatic circuits whose input and output appear in the same clock-phase, as shown in Fig. 1(c) and (d). To simplify the circuit structure, the two clamping transistors in the first stage may be omitted, as is done in Fig. 1(c) and (d). The input signals and the output signals in Fig. 1(c) are all base-0 signals, whereas those in Fig. 1(d) are all base-1 signals. Therefore, both circuits perform the logic function of buffer and inverter while keeping the `base' unchanged. By expanding the input MOS evaluation tree, basic base-0 and base-1 adiabatic gates and combinational circuits can be designed. In Section 3, a 2:1 multiplexer and an FA are constructed to illustrate the effectiveness of the proposed approach.

3. Circuit designs

Base-0 signals are customarily adopted in the literature. Therefore, the circuit shown in Fig. 1(c) is taken as the main body of the logic circuits. Base-0 signals can be converted into base-1 signals by connecting the base-converting circuit shown in Fig. 1(b) to the output terminal. For simplicity, we omit the superscript 0, which denotes base-0 signals, from the signal names. The schematic of a 2:1 multiplexer is shown in Fig. 2(a). The evaluation trees are constructed from pMOS networks, where the OR-operation is realized by connecting pMOS transistors in series and the AND-operation by connecting pMOS transistors in parallel. Therefore, it performs the 2:1 multiplexer function $f$, a sum of two product terms in which the select signal $s$ and its complement $\bar{s}$ gate the data inputs $a$ and $b$.

Fig. 2. (a) 2:1 Multiplexer; (b) waveforms of multiplexer at 200 MHz clock frequency.

We have simulated the circuit shown in Fig. 2(a) with PSPICE. Using a sinusoidal power clock, the output waveform at 200 MHz is shown in Fig. 2(b), where the input signals are CMOS-compatible rectangular pulses and the output signals are clocked signals (sinusoidal pulses). The simulation result confirms that the designed circuit has the proper logic function.
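The series/parallel rule for the pMOS evaluation trees can be read as an instance of De Morgan's laws. The following is a minimal Boolean sketch of that reading, not a model of the circuit in Fig. 2(a): a pMOS is assumed to conduct when its gate is low, the helper names are hypothetical, and the select polarity chosen in `mux2` (s = 0 selects a) is an assumption. How a conducting or blocking network maps onto the clocked outputs is not modelled here.

```python
from itertools import product


def series_pmos_conducts(*gates):
    """A series chain of pMOS conducts only when every gate is low."""
    return all(g == 0 for g in gates)


def parallel_pmos_conducts(*gates):
    """Parallel pMOS branches conduct when at least one gate is low."""
    return any(g == 0 for g in gates)


def mux2(a, b, s):
    """Behavioural 2:1 multiplexer.

    ASSUMPTION: s = 0 selects a and s = 1 selects b; the opposite
    polarity would be an equally valid 2:1 multiplexer.
    """
    return (a & (1 - s)) | (b & s)


for x, y in product((0, 1), repeat=2):
    # A series chain is blocked exactly when x OR y is high ...
    assert (not series_pmos_conducts(x, y)) == bool(x | y)
    # ... and a parallel pair is blocked exactly when x AND y is high.
    assert (not parallel_pmos_conducts(x, y)) == bool(x & y)

# Print the multiplexer truth table under the assumed polarity.
for a, b, s in product((0, 1), repeat=3):
    print(a, b, s, mux2(a, b, s))
```

Under this reading, the series chain implements the OR term and the parallel pair implements the AND term when the network's blocking state is taken as the logic value, which is one way to account for the rule stated in the text.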
From the waveform, we find that the low level of the output signal is improved in comparison to the PAL circuit, owing to the two additional nMOS transistors connected to $\overline{clk}$ at the output terminal.

Several 2:1 multiplexers (PAL-2N, PAL, 2N-2N2P) were designed, and PSPICE simulations were performed based on a standard 1.2 μm CMOS technology. The power dissipation of these circuits is compared with that of the proposed multiplexer at different frequencies. The comparison results are given in Table 1. In simulation, we employ a sinusoidal power clock with 5 V peak-to-peak voltage. The transistor sizes are kept identical for the various designs: the W/L of the nMOS is 1.8 μm/1.2 μm, the W/L of the pMOS is 5.4 μm/1.2 μm, and for 2N-2N2P and PAL-2N the W/L of the two additional pull-down nMOS transistors is 1.2 μm/1.2 μm. A capacitive load of 20 fF is placed at each output node. The power dissipation of the proposed multiplexer ranges from 5.3 to 99.4 μW as the clock frequency varies from 40 to 300 MHz.

Table 1
Power dissipation (μW) for various MUXs at different clock frequencies

Frequency    Fig. 2(a)    PAL-2N    PAL     2N-2N2P
40 MHz       5.3          4.9       4.2     10.7
80 MHz       15.2         13.1      8.9     16.0
100 MHz      19.9         17.0      11.9    19.6
200 MHz      54.6         50.2      39.5    54.8
300 MHz      99.4         100.7     77.3    88.2

By replacing the input pMOS transistor in Fig. 1(c) with the appropriate pMOS logic evaluation blocks, the schematic of the adiabatic FA realizing the addition of base-0 signals is obtained, as shown in Fig. 3(a), where $s$ is the base-0 sum output signal and $c^{+}$ is the base-0 carry-out signal. Using a sinusoidal power clock (5 V peak-to-peak voltage) and a Gray-coded input stream, the PSPICE simulation result at 100 MHz clk frequency is shown in Fig. 3(b). In simulation, the frequency of the input signal $a$ is a quarter of the clk frequency, and the frequency of the input signals $b$ and $c$ is half the frequency of the signal $a$. The waveform verifies that the designed circuit has the correct logic function with an improved low-level output.

Fig. 3. (a) Adiabatic FA schematic; (b) waveforms of adiabatic FA at 100 MHz clock frequency.

The power dissipation of the static differential cascode voltage switch (DCVS) [14] FA, the differential current switch logic (DCSL) [15] FA (based on DCSL3) and the proposed adiabatic FA at different clock frequencies is compared in Fig. 4. Compared to the static DCVS FA and the clocked DCSL FA, the proposed adiabatic FA achieves a power saving of 87 and 83%, respectively, at 20 MHz, and of 51 and 48%, respectively, at 200 MHz.

In the above discussion, we take base-0 signals as inputs and outputs to illustrate the circuit designs. Similarly, based on the circuit structure shown in Fig. 1(d), the corresponding designs for base-1 signals can also be derived.

The main property of the FA shown in Fig. 3(a) is that the circuit is able to receive input signals in phase with its power clock and then evaluate the logic value concurrently. With this feature, more than one stage can be computed simultaneously within the same clock-phase. This helps to reduce the latency of a complex logic circuit and the number of buffers required for a pipelined structure. We take the ripple-carry adder (RCA) shown in Fig. 5(a) as an example to demonstrate this property, where the FA is the one given in Fig. 3(a). Using a 100 MHz clock frequency, we simulated a 2-bit RCA with PSPICE. The simulated output waveforms are shown in Fig. 5(b), where $s_1$ and $s_2$ are the sum signals, and $c_1^{+}$ and $c_2^{+}$ are the carry-out signals. The input signals of the first FA, $a_1$, $b_1$ and $c_1$, are identical to the input signals $a$, $b$ and $c$ shown in Fig. 3(b), respectively. Note that the carry-input signal $c_1^{+}$ of the second FA is a clocked signal. The simulated output waveforms verify that the results of every addition are valid within the same clock-phase and may be sampled at the peak of clk. Therefore, a 2-bit addition takes only a single phase, compared with the two phases required by most other adiabatic logic families for the same 2-bit RCA design. The advantage becomes more pronounced as the number of cascaded stages increases.
Fig. 4. Power consumption at different operating frequencies.
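The latency argument for the ripple-carry adder can be illustrated with a small behavioural sketch. It models only the Boolean function of the full adder and a simple phase count; it says nothing about waveforms or power. The function names are hypothetical, each FA is treated as one cascaded stage, and `stages_per_phase` is an illustrative parameter: the text does not quantify how many stages can be evaluated within one phase in practice.

```python
def full_adder(a, b, c):
    """Return (sum, carry_out) for three input bits."""
    s = a ^ b ^ c
    c_out = (a & b) | (b & c) | (a & c)
    return s, c_out


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two little-endian bit lists with a chain of full adders."""
    sums, carry = [], c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def phases_needed(stages, stages_per_phase):
    """Clock-phases needed for `stages` cascaded stages (ceiling division)."""
    return -(-stages // stages_per_phase)


# Exhaustive functional check of the 2-bit RCA against integer addition.
for x in range(4):
    for y in range(4):
        sums, carry = ripple_carry_add([x & 1, x >> 1], [y & 1, y >> 1])
        assert sums[0] + 2 * sums[1] + 4 * carry == x + y

print(phases_needed(2, 1))  # one stage per phase: prints 2
print(phases_needed(2, 2))  # both stages in the same phase: prints 1
```

With one stage per phase the count grows linearly with the number of cascaded adders, which is why the benefit of evaluating several stages in one phase increases with logic depth.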
Fig. 5. (a) Ripple-carry adder structure; (b) simulated output waveforms of 2-bit RCA at 100 MHz clock frequency.

4. Conclusion

In this article, we propose two adiabatic circuits with complementary structure and operation. They can be applied to adiabatic circuit design when base-0 signals and base-1 signals are required, respectively. The power consumption of the proposed circuits is comparable to that of some reported circuits. The proposed circuits use a two-phase sinusoidal power clock, which simplifies the power clock generator relative to four-phase schemes. To improve the output waveforms, two MOS transistors connected to the power clock are used in each circuit to solve the problem of floating nodes; they clamp the clocked output signals to either the high level or the low level of the power clock during the output valid time.

Adiabatic circuits in general achieve significant power savings over conventional CMOS circuits. However, because multiple-phase clocking is required for proper cascading and only one stage is computed in every phase, it is impractical to use the adiabatic switching technique to implement complex logic with many cascaded stages. In comparison with most reported adiabatic logic, an adiabatic circuit implemented with the proposed architecture is able to receive input signals in phase with its power clock and compute the logic value concurrently. Therefore, the input signals and the output signals of the proposed circuits can appear in the same clock-phase. With this feature, more than one stage of gates can be evaluated within a single clock-phase. As a result, the input/output latency of a complex logic circuit is reduced and the number of buffers required for a pipelined circuit is also reduced. The PSPICE simulations have demonstrated the effectiveness of the proposed approach and the low power characteristic of the designed circuits. We believe that the proposed circuits can serve as an alternative in the design of low power systems.
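To put the reported power figures in perspective, the sketch below converts the Table 1 values for the proposed multiplexer into energy per clock cycle and evaluates the conventional switching energy $\frac{1}{2}CV_{dd}^2$ for a single 20 fF load. It is a back-of-the-envelope calculation under stated assumptions: the 5 V peak-to-peak clock swing is used in place of $V_{dd}$, and the two numbers are not a like-for-like comparison, since the simulated power also covers internal nodes and both output nodes. The function names are hypothetical.

```python
C_LOAD = 20e-15  # F, load at each output node (from the simulation setup)
V_SWING = 5.0    # V, ASSUMPTION: clock peak-to-peak swing stands in for Vdd

# Table 1, proposed multiplexer: clock frequency (MHz) -> power (uW)
TABLE1_PROPOSED_UW = {40: 5.3, 80: 15.2, 100: 19.9, 200: 54.6, 300: 99.4}


def conventional_energy(c, v):
    """Energy in joules dissipated per charging event, 0.5 * C * V^2."""
    return 0.5 * c * v ** 2


def energy_per_cycle_pj(power_uw, freq_mhz):
    """Average energy per clock cycle in pJ (uW divided by MHz gives pJ)."""
    return power_uw / freq_mhz


reference_pj = conventional_energy(C_LOAD, V_SWING) * 1e12
print(f"0.5*C*V^2 for one node: {reference_pj:.3f} pJ")

for freq_mhz, power_uw in sorted(TABLE1_PROPOSED_UW.items()):
    e_pj = energy_per_cycle_pj(power_uw, freq_mhz)
    print(f"{freq_mhz:3d} MHz: {e_pj:.3f} pJ per cycle")
```

The per-cycle energy obtained this way increases with clock frequency across the Table 1 data points, so the lowest energy per operation is found at the low end of the simulated frequency range.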
Acknowledgements

This work is supported partially by NNSF of China (No. 69973039).

References

[1] Q. Wu, M. Pedram, X. Wu, Clock-gating and its application to low power design of sequential circuits, IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 47 (3) (2000) 415–420.
[2] X. Wu, M. Pedram, Low-power design on sequential circuits using T flip-flops, International Journal of Electronics 88 (6) (2001) 635–643.
[3] X. Wu, M. Pedram, L. Wang, Multi-code state assignment for low power circuit design, IEE Proceedings - Circuits, Devices and Systems 147 (5) (2000) 271–275.
[4] A. Kramer, J.S. Denker, B. Flower, J. Moroney, Second order adiabatic computation with 2N-2P and 2N-2N2P logic circuits, Proceedings of the International Symposium on Low Power Design, Dana Point, April 1995, pp. 191–196.
[5] Y. Moon, D.K. Jeong, An efficient charge recovery logic circuit, IEEE Journal of Solid-State Circuits SC-31 (4) (1996) 514–522.
[6] V.C. Oklobdzija, D. Maksimovic, F. Lin, Pass-transistor adiabatic logic using single-clock supply, IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing 44 (10) (1997) 842–846.
[7] F. Liu, K.T. Lau, Pass-transistor adiabatic logic with NMOS pulldown configuration, Electronics Letters 34 (8) (1998) 739–741.
[8] K.W. Ng, K.T. Lau, ECRL-based low power flip-flop design, Microelectronics Journal 31 (5) (2000) 365–370.
[9] K.W. Ng, K.T. Lau, Low power flip-flop design based on PAL-2N structure, Microelectronics Journal 31 (2) (2000) 113–116.
[10] M. Pedram, X. Wu, Analysis of clocked power CMOS gates with application to the design of energy-recovery circuits, Proceedings of ASP-DAC, Pacifico Yokohama, January 2000, pp. 339–344.
[11] K.W. Ng, K.T. Lau, Improved PAL-2N logic with complementary pass-transistor logic evaluation tree, Microelectronics Journal 31 (1) (2000) 55–59.
[12] X. Wu, M. Pedram, Low power CMOS circuits with clocked power, Proceedings of IEEE Asia Pacific Conference on Circuits and Systems, Tianjin, December 2000, pp. 513–516.
[13] G. Hang, X. Wu, Adiabatic CMOS switching circuits adopting two-phase power-clock supply and avoiding floating output, Chinese Semiconductor Journal 22 (3) (2001) 366–372.
[14] L.G. Heller, W.R. Griffin, J.W. Davis, N.G. Thoma, Cascode voltage switch logic: a differential CMOS logic family, Proceedings of the International Conference on Solid-State Circuits, San Francisco, 1984, pp. 16–17.
[15] D. Somasekhar, K. Roy, Differential current switch logic: a low power DCVS logic family, IEEE Journal of Solid-State Circuits 31 (7) (1996) 981–991.