GDI based full adders for energy efficient arithmetic applications


ARTICLE IN PRESS Engineering Science and Technology, an International Journal ■■ (2015) ■■–■■

Journal homepage: http://www.elsevier.com/locate/jestch

Press: Karabuk University, Press Unit. ISSN (Printed): 1302-0056, ISSN (Online): 2215-0986, ISSN (E-Mail): 1308-2043

Full Length Article

Mohan Shoba *, Rangaswamy Nakkeeran

Department of Electronics Engineering, School of Engineering and Technology, Pondicherry University, Puducherry 605014, India

Article info

Article history: Received 9 June 2015; Received in revised form 25 August 2015; Accepted 7 September 2015; Available online

Keywords: Adder; GDI logic; Digital design; Full swing

Abstract

Addition is a vital arithmetic operation and acts as a building block for synthesizing all other operations. A high-performance adder is one of the key components in the design of application specific integrated circuits. In this paper, three low power full adders are designed with full swing AND, OR and XOR gates to alleviate the threshold voltage problem that is commonly encountered in Gate Diffusion Input (GDI) logic. This problem usually prevents full adder circuits from operating without additional inverters. The three full adders are nevertheless realized using full swing gates, with a significant improvement in their performance. The performance of the proposed designs is compared with that of other full adder designs, namely CMOS, CPL, hybrid and GDI, through SPICE simulations using 45 nm technology models. Simulation results reveal that the proposed designs have lower energy consumption than all the conventional designs taken for comparison. Copyright © 2015 The Authors. Production and hosting by Elsevier B.V. on behalf of Karabuk University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Adders are extensively used circuit elements in Very Large Scale Integration (VLSI) systems such as Digital Signal Processing (DSP) processors and microprocessors. The adder is the nucleus of many other operations like subtraction, multiplication, division and address calculation. In most digital systems, adders lie in the critical path, which influences the overall system performance. Hence, enhancing adder performance is becoming an important goal [1–3].

The explosive growth in portable systems like laptops has intensified the research efforts in low power microelectronics. The reason is that battery technology does not advance at the same rate as microelectronics technology, so only a limited amount of power is available for mobile systems. Therefore, low power design has become a major design consideration [4]. The advances in VLSI technology allow hardware realization of most computing intensive applications, such as multimedia processing and DSP, to enhance the speed of operation. Moreover, with the increasing demand for and popularity of portable electronic products, researchers are driven to strive for smaller silicon area, higher speed, longer battery life and enhanced reliability.

Full adder design is central to digital computing. The design criteria for a full adder are usually multifold [5]. Transistor count, which is one of the attributes, determines the system

* Corresponding author. Tel.: +91 7598308967. E-mail address: [email protected] (M. Shoba). Peer review under responsibility of Karabuk University.

complexity of arithmetic circuits like multipliers, Arithmetic Logic Units (ALUs), etc. Power consumption and speed are the other two important criteria in the design of full adders. However, they conflict with each other. Therefore, the power delay product, or energy consumption per operation, has been introduced to accomplish optimal design tradeoffs.

The performance of digital circuits can be optimized by proper selection of logic styles. Different logic styles tend to favor one performance aspect at the expense of others. Logic styles differ in the way intermediate nodes are computed and in transistor count, even though they implement the same function [6]. Numerous full adder designs in the classes of static CMOS, dynamic circuit, transmission gate, GDI logic and Pass Transistor Logic (PTL) are discussed in the literature [7–12]. The well known static CMOS adder, with complementary pull-up PMOS and pull-down NMOS networks, requires 28 transistors to generate the sum and carry outputs. PTL is an alternative to CMOS and implements most functions with fewer transistors. This may reduce the overall capacitance, which in turn increases the speed and decreases the power dissipation. However, in PTL based designs the output voltage is degraded by the threshold voltage drop between the input and the output. This problem can be resolved by adopting Complementary Pass Logic (CPL) or Swing Restored PTL (SRPL). But these logic styles exhibit larger short circuit current, higher transistor count and increased wiring complexity due to the demand for complementary input signals. Building logic using transmission gates is another choice to minimize complexity. The full adder design implemented using transmission gates is discussed in Reference 13. It requires 20 transistors; a further reduction in transistor count is possible using the transmission function adder, which needs 16 transistors and is discussed in Reference 14.

GDI logic [15] was introduced as an alternative to CMOS logic. It is a low power design technique that implements logic functions with fewer transistors. GDI gates provide reduced voltage swing at their outputs, i.e. the output high (or low) voltage deviates from VDD (or ground) by the threshold voltage Vt. The reduction in voltage swing is beneficial to power consumption. On the other hand, it may lead to slow switching in cascaded operation. At low VDD, the degraded output may even cause circuit malfunction. Therefore, special attention is needed to achieve full swing operation.

In this paper, an efficient methodology for implementing AND, OR and XOR gates with full swing is presented. Three full adders are then proposed based on the full swing gates in a standard 45 nm technology. The performance of the three proposed full adder designs is compared with that of other adders based on CMOS, CPL, hybrid and GDI logic cited in the literature.

The paper is organized as follows: Section 2 gives an overview of the GDI methodology and presents its advantages and limitations. The three proposed full adder implementations based on full swing AND, OR and XOR gates are discussed in Section 3. Section 4 discusses the simulation results and compares them with CMOS, CPL, hybrid and GDI based designs. The conclusion is drawn in Section 5.

http://dx.doi.org/10.1016/j.jestch.2015.09.006

2. GDI logic

The basic GDI cell is shown in Fig. 1. Although it resembles a conventional CMOS inverter, the source/drain diffusion inputs of its PMOS and NMOS transistors are connected differently. In a conventional inverter, the diffusion inputs of the PMOS and NMOS transistors are always tied to VDD and GND, respectively. In the GDI cell, on the other hand, these diffusion terminals act as external inputs. This allows various Boolean functions such as AND, OR, MUX, INVERTER, F1 and F2 to be realized, as listed in Table 1.

The main drawback of the GDI gate is the threshold voltage drop, which reduces current drive and affects the performance of the gate. The output voltage reduction can be compensated by swing restoration buffers at the output [16]. However, the inverters in the buffers increase the transistor count and also increase the static power consumption when they are connected in cascade. A multiple Vt technique is presented in lieu of the swing restoration buffer in Reference 15. This approach uses low threshold transistors where a voltage drop would occur and high threshold transistors for the inverters. Though this hybrid threshold voltage method minimizes power consumption, it becomes a bottleneck in the transistor fabrication process. Another method of swing restoration at the output of a GDI based full adder, using an Ultra Low Power Diode (ULPD) technique, is detailed in Reference 17. This technique configures the MOS transistor to work as a diode and uses 8 additional transistors to provide full swing. It mitigates the static power dissipation of a conventional swing restoration buffer, but the fabrication complexity of the ULPD still has to be taken into account.

The techniques presented so far to achieve full swing at the full adder output either increase the number of transistors (by more than half relative to the non-full swing design) or increase the power consumption (use of buffers). A general method is therefore required to achieve full swing at the gate level, i.e. for AND, OR, XOR, etc. Hence, an attempt is made to design full swing gates and, subsequently, three adders using the proposed gates; a detailed explanation is given in the following section.
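The Boolean behaviour of the basic GDI cell can be illustrated with a small logic-level sketch. It assumes an idealized cell in which the PMOS passes the P input when G is low and the NMOS passes the N input when G is high; threshold voltage drops are deliberately ignored, and the names `gdi_cell` and `TABLE_1` are hypothetical helpers introduced only for this illustration. The F1, F2, MUX and NOT entries are taken in their usual complemented forms (A'B, A'+B, A'B+AC, A').

```python
from itertools import product

def gdi_cell(g, p, n):
    """Idealized GDI cell: output follows N when G = 1, and P when G = 0.

    Logic-level assumption only; threshold voltage losses are not modelled.
    """
    return n if g == 1 else p

# For each function: (cell wiring with G = A, expected Boolean function).
TABLE_1 = {
    "F1":  (lambda a, b, c: gdi_cell(a, b, 0), lambda a, b, c: (1 - a) & b),
    "F2":  (lambda a, b, c: gdi_cell(a, 1, b), lambda a, b, c: (1 - a) | b),
    "OR":  (lambda a, b, c: gdi_cell(a, b, 1), lambda a, b, c: a | b),
    "AND": (lambda a, b, c: gdi_cell(a, 0, b), lambda a, b, c: a & b),
    "MUX": (lambda a, b, c: gdi_cell(a, b, c),
            lambda a, b, c: ((1 - a) & b) | (a & c)),
    "NOT": (lambda a, b, c: gdi_cell(a, 1, 0), lambda a, b, c: 1 - a),
}

def matches_table(name):
    """Compare the wired cell with its expected function on all inputs."""
    cell, expected = TABLE_1[name]
    return all(cell(a, b, c) == expected(a, b, c)
               for a, b, c in product((0, 1), repeat=3))

results = {name: matches_table(name) for name in TABLE_1}
print(results)
```

Under this idealization every wiring reproduces its listed function; the sketch says nothing about the degraded voltage levels, which are the subject of the following sections.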

3. Proposed full adders in GDI

In this section, three proposed full adder designs featuring GDI with full swing logic are discussed, with the goal of minimizing circuit complexity and achieving speed in cascaded operation. The strategy is to avoid threshold voltage losses with the help of full swing gates.

3.1. Basic gates for full adder design

The logic function of the full adder [6] can be represented as

$$\mathrm{Sum} = A \oplus B \oplus C_{in} \tag{1}$$

$$C_{out} = A \cdot B + B \cdot C_{in} + A \cdot C_{in} \tag{2}$$

From Eqs. (1) and (2), three basic gates are needed to implement the function, i.e. AND, OR and XOR. As illustrated in Table 1, each gate function can be achieved with two transistors (excluding the inverters for complementary input signals); the transistor level diagrams are shown in Fig. 2. The operational characteristics of these gates are given in Table 2, which lists the output voltages for the different input combinations, assuming that both inputs have full voltage swing. Table 2 shows that the output voltages are degraded by the threshold voltage drop for certain input combinations. The reduction in output voltage increases significantly with the number of stages. Therefore, the design of full swing gates is necessary, and it is discussed in the forthcoming subsections.

Fig. 1. Basic GDI cell.

Table 1. Different logic function realization using GDI cell.

| N | P | G | OUT | Function |
|---|---|---|---|---|
| ‘0’ | B | A | $\overline{A}B$ | F1 |
| B | ‘1’ | A | $\overline{A}+B$ | F2 |
| ‘1’ | B | A | $A+B$ | OR |
| B | ‘0’ | A | $AB$ | AND |
| C | B | A | $\overline{A}B+AC$ | MUX |
| ‘0’ | ‘1’ | A | $\overline{A}$ | NOT |

Table 2. Operational characteristics of AND, OR and XOR gate using GDI logic.

| A | B | AND | OR | XOR |
|---|---|---|---|---|
| ‘0’ | ‘0’ | \|Vtp\| | \|Vtp\| | \|Vtp\| |
| ‘0’ | ‘1’ | \|Vtp\| | VDD | VDD |
| ‘1’ | ‘0’ | Gnd | VDD-Vtn | VDD-Vtn |
| ‘1’ | ‘1’ | VDD-Vtn | VDD-Vtn | Gnd |

Fig. 2. (a) XOR gate (b) OR gate and (c) AND gate.

3.2. Full swing AND, OR and XOR gates using F1 and F2 functions

Conventionally, the universal gates NAND and NOR can be used to realize any logical expression. Similarly, in GDI, two functions are available, namely F1 ($\overline{A}B$) and F2 ($\overline{A}+B$), to realize logical expressions. These two functions also suffer from a threshold voltage drop. A solution to this issue is discussed in Reference 18, where a swing restoration transistor is provided at the output to compensate for the threshold voltage loss. The schematics of the AND, OR and XOR gates using F1 and F2 functions are shown in Fig. 3. This increases the transistor count from 2 to 3 for the AND and OR designs, yet full swing operation is achieved. The operational characteristics of the AND, OR and XOR gates with full swing are given in Table 3.

The F1 and F2 based AND and OR implementations require 3 transistors, whereas the CMOS based implementations demand 6 transistors. F1 and F2 are therefore a good choice for AND and OR gates, since they use few transistors (like PTL) and also provide full swing (like CMOS). However, the F1 and F2 based XOR gate implementation lags behind the CMOS based design. The reason might be one of the following:

(i) The XOR gate based on F1 and F2 needs a total of 9 transistors, which is more than twice the number required by GDI logic without full swing (4T), as seen from Fig. 3(c). It therefore defeats the goal of GDI logic, i.e. function realization using minimal transistors.
(ii) Due to the increased transistor count, the overall input gate capacitance (Cg) of the XOR function increases, since Cg is a direct function of the number of transistors seen by the inputs.

(iii) The number of intermediate nodes increases slightly, which may lead to glitches that are a source of power consumption.

Fig. 3. Full swing gates based on F1 and F2 (a) AND (b) OR and (c) XOR.

Table 3. Operational characteristics of AND, OR and XOR gate with full swing.

| A | B | AND | OR | XOR |
|---|---|---|---|---|
| ‘0’ | ‘0’ | Gnd | Gnd | Gnd |
| ‘0’ | ‘1’ | Gnd | VDD | VDD |
| ‘1’ | ‘0’ | Gnd | VDD | VDD |
| ‘1’ | ‘1’ | VDD | VDD | Gnd |

The realization of AND and OR gates with full swing is possible using F1 and F2 functions, respectively, and these gates operate relatively better than the conventional design, although the approach is not suitable for XOR realization. However, XOR-XNOR circuits are basic building blocks in various arithmetic circuits such as adders, multipliers, compressors, comparators and parity checkers. They provide an intermediate output used to generate the final sum and carry of the full adder. The importance of XOR-XNOR functions in implementations such as adders and multipliers is well explored in Reference 19. Therefore, a full swing XOR is necessary to drive successive stages reliably.

3.3. Proposed full swing XOR gate

This subsection details the proposed XOR gate, which is designed to achieve full swing operation. It acts as one of the basic modules for the realization of the three full adder designs, and the performance of the designed adders is investigated with the full swing XOR gate as one of the modules. The proposed XOR gate uses four transistors (excluding the inverter for the complementary input signal) to provide full swing at the output. The design of the XOR gate using GDI logic without and with full swing is shown in Fig. 4. The goal is to reduce the circuit complexity and to achieve faster cascaded operation. Before the operation of the proposed 4T XOR is explained, the operation of the GDI based XOR is discussed.

Logic ‘0’:

When AB = 00, the NMOS transistor is switched OFF and the PMOS transistor is switched ON. Therefore, an output approximately equal to |Vtp| is obtained, where Vtp is the threshold voltage of the PMOS transistor. However, when AB = 11, the NMOS transistor turns ON and the PMOS transistor turns OFF, and ground potential is passed to the output.

Logic ‘1’:

When AB = 01, the PMOS transistor is switched ON and the NMOS transistor is switched OFF. Therefore, VDD passes through the PMOS transistor. The opposite case occurs when AB = 10: the NMOS turns ON and the PMOS turns OFF, so that the NMOS passes a poor ‘1’ of about VDD-Vtn to the output, where Vtn denotes the threshold voltage of the NMOS transistor. The disadvantage of the XOR circuit in Fig. 4(a) is that the internal nodes do not have full voltage swing due to the threshold voltage drop.

The operation of the proposed XOR gate is explained as follows. The existing design lacks full swing operation in two cases, AB = 00 and AB = 10. The techniques presented in the literature directly use the supply rail VDD for a strong ‘1’ and VSS for a strong ‘0’. The proposed design does not use the supply rails, either GND or VDD, to obtain the perfect output. It uses the inputs, with proper biasing of the necessary transistor, which may be either PMOS or NMOS depending on the input level, to mitigate the threshold voltage loss that occurs in the conventional XOR design. For AB = 00, transistors P1 (PMOS), P3 (NMOS) and P4 (PMOS) conduct, and P3 is responsible for delivering a strong ‘0’. Likewise, for AB = 10, transistors P2 (NMOS), P3 (NMOS) and P4 (PMOS) conduct, and P4 passes a strong ‘1’ to the output. In the other cases, namely AB = 01 and 11, transistors P3 and P4 do not change the output potential. Hence, the correct output for the XOR gate is attained with the proposed design.

Fig. 4. XOR gate (a) using GDI logic and (b) proposed design.

3.4. Three full adder designs

The design of a GDI full adder with full swing is made possible by the full swing AND, OR and XOR gates discussed in the previous sections. This design completely eliminates the swing restoration buffers, which improves the performance. Three possible full swing GDI full adders are designed by rewriting the full adder expressions, Eqs. (1) and (2), to accommodate the full swing gates. The expressions for these designs [Eqs. (3)–(8)] are given below and their schematic diagrams are given in Fig. 5.

Fig. 5. Proposed full adder based on (a) Design 1 (b) Design 2 and (c) Design 3.

Design 1

The full adder’s Sum and Cout expressions are given in Eqs. (3) and (4), respectively.

$$\mathrm{Sum} = \overline{C_{in}}\,(A \oplus B) + C_{in}\,\overline{(A \oplus B)} \tag{3}$$

$$C_{out} = (A \oplus B)\,C_{in} + \overline{(A \oplus B)}\,A \tag{4}$$

Design 1 uses the XOR output as an intermediate result for computing Sum and Cout. The Sum output is obtained by multiplexing the XOR and its inverted version, XNOR, through the Cin input. Cout is obtained by multiplexing the inputs A and Cin, with the XOR of A and B acting as the selection input. The presence of an inverter on the critical path increases the delay of the whole circuit. This design is simple and requires a total of 18 transistors to realize the full adder function.

Design 2

The Sum and Cout expressions are given in Eqs. (5) and (6), respectively. This design is built from XOR, AND and OR gates along with multiplexer modules.

$$\mathrm{Sum} = A \oplus B \oplus C_{in} \tag{5}$$

$$C_{out} = \overline{C_{in}}\,(A \cdot B) + C_{in}\,(A + B) \tag{6}$$

In Design 2, the Cout function is realized with AND and OR gates, which are designed based on F1 and F2, respectively. Multiplexing the AND and OR outputs through the carry input Cin yields Cout. The XOR operation on the inputs A, B and Cin produces the Sum function. Design 2 uses a total of 22 transistors.

Design 3

This design is built from XOR, AND and OR gates; the Sum and Cout expressions are given in Eqs. (7) and (8).

$$\mathrm{Sum} = A \oplus B \oplus C_{in} \tag{7}$$

$$C_{out} = A \cdot B + (A \oplus B)\,C_{in} \tag{8}$$

In Design 3 the XOR module plays an important role, since the Sum output is obtained by XORing the inputs A, B and Cin. The output Cout is obtained with AND and OR gates operating on the inputs and the intermediate XOR output. The AND and OR gates are realized with the full swing F1 and F2 gates. GDI based F1 and F2 enable the implementation of AND and OR with only 3 transistors each, whereas CMOS needs 6 transistors for the same. In total, 23 transistors are needed for this full adder.
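As a quick sanity check of the three rewritten expression pairs, the following sketch compares each of them with ordinary binary addition over all eight input combinations. It works purely at the Boolean level; the function names are hypothetical, and the complemented terms in Design 1 and Design 2 follow the multiplexer descriptions given in the text.

```python
from itertools import product

def reference_full_adder(a, b, cin):
    """Sum and carry obtained from plain integer addition."""
    total = a + b + cin
    return total & 1, total >> 1

def design1(a, b, cin):
    """Sum and Cout both multiplexed around the intermediate XOR."""
    x = a ^ b
    xnor = 1 - x
    s = ((1 - cin) & x) | (cin & xnor)
    cout = (x & cin) | (xnor & a)
    return s, cout

def design2(a, b, cin):
    """Three-input XOR for Sum; AND/OR multiplexed by Cin for Cout."""
    s = a ^ b ^ cin
    cout = ((1 - cin) & (a & b)) | (cin & (a | b))
    return s, cout

def design3(a, b, cin):
    """Three-input XOR for Sum; AND-OR on the intermediate XOR for Cout."""
    x = a ^ b
    return x ^ cin, (a & b) | (x & cin)

def equivalent(design):
    """True if the design matches the reference on every input vector."""
    return all(design(a, b, c) == reference_full_adder(a, b, c)
               for a, b, c in product((0, 1), repeat=3))

for name, design in (("Design 1", design1), ("Design 2", design2),
                     ("Design 3", design3)):
    print(name, equivalent(design))
```

The check confirms only functional equivalence; it carries no information about voltage swing, delay or power, which depend on the transistor-level realization.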

4. Simulation results and comparison

In this paper, a full swing XOR gate is proposed and its performance is compared with the existing works cited in References 15,16,18. Three GDI full adders are designed based on the full swing AND, OR and XOR gates discussed in this paper, and their performance is also compared with that of other adders found in the literature in terms of speed of operation, power consumption and circuit complexity.

4.1. Simulation results of XOR gate

For the simulation environment, two inverters with the same W/L are used as output buffers. The power and delay of the inverters are included in the power and delay calculations of the whole circuit. The size of PMOS transistor is twice that of NMOS transistor size. SPICE simulations are performed in 45 nm technology with VDD = 1.1 V and a clock frequency of 100 MHz. Typical transistor sizes, i.e., (W/L)p = 120 nm/45 nm and (W/L)n = 120 nm/45 nm are used. The simulation results of the proposed XOR gate, along with the existing designs reported in References 15,16,18, are shown in Table 4.

Table 4. Simulation results of XOR gate.

| Design | Power (nW) | Delay (ps) | Transistor count | Energy (10^-18 J) |
|---|---|---|---|---|
| CMOS | 547.3 | 23.2 | 12 | 12.6 |
| Morgenshtein et al. [15] | 403.1 | 22.0 | 8 | 8.8 |
| Uma and Dhavachelvan [16] | 396.2 | 21.1 | 8 | 8.3 |
| Morgenshtein et al. [18] | 381.8 | 20.2 | 9 | 7.7 |
| Proposed XOR | 283.6 | 7.5 | 6 | 2.1 |

Among the simulated designs, the proposed XOR gate performs best in terms of delay, power consumption, transistor count and energy. The GDI based designs discussed in References 15,16,18 and the proposed XOR gate all perform better than the CMOS based design in every aspect. The delay improvement of the proposed XOR gate results from the reduced transistor count on its critical path. The XOR designs discussed in References 15,16 have more delay due to the buffer at the output, whereas the XOR gate in Reference 18 has a longer critical path, which slows down its operation. In respect of power consumption, the proposed XOR gate consumes the least, since it has no direct path between the power supply and ground rails, which eliminates direct short circuit current. The transistor count is also reduced compared with the other full swing XOR gates reported in the literature. Finally, as expected, the proposed XOR also has the lowest energy consumption among the designs. Therefore, choosing the proposed XOR gate as a basic module of an arithmetic circuit such as the full adder should improve the performance metrics and provide good driving capability for the subsequent stages. The performance of the proposed full adder designs, along with the existing full adders, is investigated in the forthcoming sub-sections.

4.2. Simulation results of single full adder

Full adders based on CMOS, CPL and hybrid logic are taken for comparison with the proposed designs. In addition to these full adders, adders based on the XOR gates discussed in References 15,16,18 are also taken into account. The CMOS adder consists of 28 transistors and is considered the reference for comparison. It has full voltage swing and buffered Sum and Cout signals. CPL, which is a variant of PTL, uses 32 transistors and provides both complementary and true outputs of the Sum and Cout signals. It uses feedback transistors to provide full swing. The design that uses a combination of CMOS and PTL to generate Sum and Cout, respectively, is called the hybrid design. It uses 24 transistors, which lies between CMOS and transmission gate designs. For all possible input combinations applicable to the full adder, the average power consumption and worst case delay are measured. Table 5 summarizes the simulation results of the single full adder.

Table 5. Simulation results of single full adder.

| Logic | Power (nW) | Delay (ps) | Transistor count | Energy (10^-18 J) | EDP (10^-28 J s) |
|---|---|---|---|---|---|
| CMOS | 975.6 | 46.2 | 28 | 45.1 | 20.8 |
| CPL | 2680 | 38.8 | 32 | 103.9 | 40.3 |
| Hybrid | 1613 | 35.21 | 24 | 56.8 | 19.9 |
| Morgenshtein et al. [15] | 1310 | 41.3 | 20 | 54.1 | 22.3 |
| Uma and Dhavachelvan [16] | 1685 | 49.13 | 20 | 82.7 | 40.6 |
| Morgenshtein et al. [18] | 1462 | 32.2 | 30 | 47.1 | 15.2 |
| Design 1 | 927.9 | 37.86 | 18 | 35.1 | 13.3 |
| Design 2 | 1140 | 26.87 | 22 | 30.6 | 8.2 |
| Design 3 | 1216 | 36.57 | 23 | 44.4 | 16.2 |

From the results in Table 5, it is clear that CPL consumes relatively more power due to the larger number of transistors it requires. The hybrid design performs comparably with CMOS in terms of delay and power consumption, while requiring fewer transistors than CMOS. Of the three proposed GDI based full adders, Design 2 in particular outperforms all the other adders in both delay and Energy Delay Product (EDP). This results from the reduced transistor count on the paths between input and output, which also decreases the parasitic capacitance at the Sum and Cout nodes. The area overhead of the three proposed adders is lower than that of the conventional CMOS, CPL and hybrid adders taken for comparison. However, Design 2 and Design 3 have a slightly higher transistor count than the full adders discussed in References 15,16. The performance metrics of all the simulated adders, such as delay, power consumption, energy consumption and process variation analysis, are discussed in detail in the forthcoming sub-sections.

Table 6. Monte Carlo simulation results of full adder’s power and delay distribution.

| Design | Power Min. (nW) | Power Max. (nW) | Power Mean μ (nW) | Power Std. Dev. σ (nW) | Power μ/σ | Delay Min. (ps) | Delay Max. (ps) | Delay Mean μ (ps) | Delay Std. Dev. σ (ps) | Delay μ/σ |
|---|---|---|---|---|---|---|---|---|---|---|
| CMOS | 893 | 1041 | 978 | 21.6 | 45.2 | 47.2 | 78.3 | 56.5 | 2.5 | 22.6 |
| CPL | 2599 | 3528 | 2721 | 106.4 | 25.5 | 31.9 | 129.7 | 45.9 | 9.9 | 4.6 |
| Hybrid | 1550 | 2532 | 1677 | 108.5 | 15.4 | 66.8 | 970.1 | 217.2 | 115.8 | 1.9 |
| Morgenshtein et al. [15] | 1080 | 2445 | 1678 | 293.0 | 5.7 | 38.5 | 55.0 | 44.2 | 2.9 | 15.2 |
| Uma and Dhavachelvan [16] | 1508 | 2065 | 1746 | 80.5 | 21.7 | 46.1 | 58.2 | 50.3 | 2.2 | 22.8 |
| Morgenshtein et al. [18] | 2228 | 2557 | 2412 | 51.0 | 47.2 | 28.3 | 49.1 | 77.7 | 3.8 | 20.4 |
| Design 1 | 870 | 990 | 930.2 | 18.9 | 49.2 | 38.5 | 54.6 | 44.4 | 2.2 | 20.2 |
| Design 2 | 1084 | 1217 | 1145 | 20.8 | 55.0 | 22.6 | 31.8 | 27.2 | 1.1 | 24.7 |
| Design 3 | 1093 | 1212 | 1146 | 21.4 | 53.5 | 35.4 | 49.6 | 41.1 | 2.1 | 19.5 |

4.2.1. Delay

The delay is measured as the time from 50% of the input voltage swing to 50% of the output voltage swing for each transition. The maximum delay is treated as the worst case delay [17]. The delay results of the simulated adders are given in Table 5. Among the three proposed adder designs, Design 2 has the lowest delay, since Cout and Sum are computed in parallel. The improved delay of Design 2 also results from the better driving capability of the proposed XOR gate. The adder based on Design 2 is faster by 34.9%, 45.3% and 16.5% than the adders based on the XOR gates discussed in References 15,16,18, respectively. The inverter in the critical path of Design 1 gives it the highest delay among the three proposed full adders, while Design 3 stands midway between Design 1 and Design 2. The full adder based on the XOR in Reference 15 has more delay, because the low output voltage at its internal nodes reduces the driving capability. Though the design discussed in Reference 16 operates at full swing, the buffer in the critical path slows down the operation.

The adder based on F1 and F2 gates in Reference 18 has a lower delay than those in References 15,16, at the cost of a higher transistor count. However, it is still slower than the proposed Design 2.

4.2.2. Power consumption

The power consumed by the adders is computed through simulation and is also presented in Table 5. The results show that the three proposed adders consume low power. Among the proposed adders, Design 1 consumes the least power, since it adopts the proposed XOR gate and has the lowest transistor count of the three. The power consumption of Design 2 and Design 3 is slightly higher than that of Design 1, but still lower than that of the other existing adders except the CMOS based adder. The power savings attained with Design 1 relative to the adders in References 15,16,18 are 29.2%, 44.9% and 36.5%, respectively.
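The relative improvements quoted for delay and power can be reproduced from the Table 5 entries with a simple percentage-reduction formula. The sketch below assumes the figures are computed as (baseline − proposed) / baseline; the dictionary and helper names are hypothetical, and the numbers are copied from Table 5.

```python
# (power in nW, worst-case delay in ps), copied from Table 5.
TABLE_5 = {
    "Ref. 15": (1310.0, 41.3),
    "Ref. 16": (1685.0, 49.13),
    "Ref. 18": (1462.0, 32.2),
    "Design 1": (927.9, 37.86),
    "Design 2": (1140.0, 26.87),
}

def reduction_percent(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return 100.0 * (baseline - improved) / baseline

references = ("Ref. 15", "Ref. 16", "Ref. 18")

# Delay reduction of Design 2 relative to each reference adder.
delay_reduction = [
    reduction_percent(TABLE_5[r][1], TABLE_5["Design 2"][1]) for r in references
]

# Power reduction of Design 1 relative to each reference adder.
power_reduction = [
    reduction_percent(TABLE_5[r][0], TABLE_5["Design 1"][0]) for r in references
]

for ref, d, p in zip(references, delay_reduction, power_reduction):
    print(f"{ref}: delay -{d:.2f}% (Design 2), power -{p:.2f}% (Design 1)")
```

Under this assumption the computed values agree with the percentages stated in the text to within rounding of the last digit.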

4.2.3. Energy consumption

From the simulation results, it is observed that the three proposed full adders consume a small amount of energy, which is made possible by the full swing gates in those designs. These gates switch only the transistors required for the particular input; hence, they consume less energy. Among the simulated designs, Design 2 has the lowest energy consumption. The energy saving achieved with Design 2 is 32.1%, 70.5% and 46.1% relative to CMOS, CPL and hybrid, respectively. The adder in Reference 16 provides full swing only at the output stage, owing to the buffering, whereas its intermediate nodes suffer from the voltage drop, like the adder discussed in Reference 15. Therefore, the energy consumption of that adder increases significantly. The full adder based on F1 and F2 gates in Reference 18 mitigates the threshold drop at intermediate nodes, but its overall energy consumption is high due to the larger transistor count required, as shown in Table 5. The EDP of Design 2 is better than that of all other designs.
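The energy and EDP columns of Table 5 appear to follow from the power and delay columns, with energy taken as power times delay and EDP as energy times delay. The sketch below recomputes both under that assumption and compares them with the tabulated values; the variable names are hypothetical, and the data are copied from Table 5.

```python
# name: (power nW, delay ps, tabulated energy 1e-18 J, tabulated EDP 1e-28 J s)
TABLE_5 = {
    "CMOS": (975.6, 46.2, 45.1, 20.8),
    "CPL": (2680.0, 38.8, 103.9, 40.3),
    "Hybrid": (1613.0, 35.21, 56.8, 19.9),
    "Ref. 15": (1310.0, 41.3, 54.1, 22.3),
    "Ref. 16": (1685.0, 49.13, 82.7, 40.6),
    "Ref. 18": (1462.0, 32.2, 47.1, 15.2),
    "Design 1": (927.9, 37.86, 35.1, 13.3),
    "Design 2": (1140.0, 26.87, 30.6, 8.2),
    "Design 3": (1216.0, 36.57, 44.4, 16.2),
}

def energy_aj(power_nw, delay_ps):
    """Energy in units of 1e-18 J: nW * ps = 1e-21 J, hence the factor 1e-3."""
    return power_nw * delay_ps * 1e-3

def edp(power_nw, delay_ps):
    """EDP in units of 1e-28 J s: (1e-18 J) * ps = 1e-30 J s, hence 1e-2."""
    return energy_aj(power_nw, delay_ps) * delay_ps * 1e-2

energy_error = {}
edp_error = {}
for name, (p, d, e_tab, edp_tab) in TABLE_5.items():
    energy_error[name] = abs(energy_aj(p, d) - e_tab)
    edp_error[name] = abs(edp(p, d) - edp_tab)
    print(f"{name}: energy {energy_aj(p, d):.2f}, EDP {edp(p, d):.2f}")

# Energy saving of Design 2 relative to CMOS, CPL and hybrid (tabulated energies).
saving = [100.0 * (TABLE_5[k][2] - TABLE_5["Design 2"][2]) / TABLE_5[k][2]
          for k in ("CMOS", "CPL", "Hybrid")]
print([round(s, 2) for s in saving])
```

The recomputed values match the table to within rounding, which supports reading the energy column as the power delay product discussed in the introduction.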

4.2.4. Process variation
As device dimensions shrink with advancing technology, process variation analysis of the circuits becomes necessary. Therefore, Monte Carlo simulations are carried out to verify that the proposed designs are more robust against global and local process variations than the existing designs. The Monte Carlo simulation results for the power and delay distributions of the full adders are given in Table 6. The power distributions of the proposed and the existing designs are illustrated in Figs. 6 and 7, respectively. The value of μ/σ measures the sensitivity of the circuits to process variation [18], where μ and σ represent the mean and standard deviation, respectively. A circuit with a larger value of μ/σ exhibits less variation with process changes. From the calculated μ/σ values, it is observed that the adder discussed in Reference 15 has the most variation in power distribution and the full adder proposed as Design 2 in this manuscript has the least. In order of increasing sensitivity of power to process variation, the adders taken for Monte Carlo simulation are Design 2, Design 3, Design 1, the adder based on XOR in Reference 18, CMOS, CPL, the adder based on XOR in Reference 16, hybrid and the adder based on XOR in Reference 15. The delay distributions of the proposed and the existing full adders are illustrated in Figs. 8 and 9, respectively. With reference to the μ/σ values, the simulated designs in order of increasing delay variation due to process changes are Design 2, the adder based on XOR in Reference 16, CMOS, Design 1, the adder based on XOR in Reference 18, Design 3, the adder based on XOR in Reference 15, CPL and hybrid. From the μ/σ values of the delay distribution, the full adder based on F1 and F2 gates is more sensitive to process variation than the CMOS based design, as discussed in Reference 18. It is observed from the delay distribution results that the hybrid full adder has the most variation and the Design 2 adder has the least. It can be concluded that the Design 2 adder has higher immunity to process variation in both delay and power distribution.

Fig. 6. Monte Carlo simulation results of power distribution of proposed full adder based on (a) Design 1 (b) Design 2 and (c) Design 3.

The three proposed full adder designs have advantages as well as limitations. Design 1 is an optimal candidate for applications in which minimum transistor count and low power are design requirements. Design 2 provides lower EDP and minimum delay, so it can be suitable for battery operated and real-time applications. It
has a slightly higher transistor count than Design 1. Design 3 lies midway between Design 1 and Design 2 and offers lower delay than Design 1. From the obtained results, it can be concluded that all three designs operate with less energy consumption than the existing adders taken for comparison. Hence, these designs can be suitable candidates for realizing energy efficient arithmetic applications.

Fig. 7. Monte Carlo simulation results of power distribution of full adder based on (a) CMOS (b) CPL (c) Hybrid (d) XOR in Reference 15 (e) XOR in Reference 16 and (f) XOR in Reference 18.

Fig. 8. Monte Carlo simulation results of delay distribution of proposed full adder based on (a) Design 1 (b) Design 2 and (c) Design 3.

5. Conclusion
In this paper, three full adder designs that use as few as twenty transistors per bit are proposed. The designs adopt full swing XOR, AND and OR gates to alleviate the threshold voltage problem and to enhance the driving capability for cascaded operation. The enhanced driving capability also facilitates lower voltage and faster operation, which leads to lower energy consumption. The proposed designs, along with existing adder circuits, are simulated using the SPICE simulation tool at 45 nm technology. The comparison is done in terms of power consumption, propagation delay, transistor count, energy and EDP. The three proposed designs have lower energy consumption than the other designs presented in the literature. The process variation of the circuits is studied through Monte Carlo simulation. The results show that the proposed adder based on Design 2 can operate reliably and has higher tolerance to process variation than the previously reported adders. Hence, the proposed designs may be suitable for low energy and high speed VLSI circuit applications.

Acknowledgements
This work is supported in part by the University Grants Commission (UGC), India, under the Junior Research Fellowship (JRF) scheme. The authors would like to thank VIT University, Vellore, India, for providing support to carry out some of the simulation work at the Integrated Circuit Design Laboratory.


Fig. 9. Monte Carlo simulation results of delay distribution of full adder based on (a) CMOS (b) CPL (c) Hybrid (d) XOR in Reference 15 (e) XOR in Reference 16 and (f) XOR in Reference 18.
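The μ/σ figure of merit applied to the Monte Carlo distributions can be sketched as follows. The samples are synthetic Gaussian values generated for illustration only, not the simulation data behind Figs. 6 to 9; the function names are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import random
import statistics


def mu_over_sigma(samples):
    """Return mean / standard deviation; a larger value means less spread."""
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation (assumed)
    if sigma == 0:
        raise ValueError("samples show no variation; ratio is undefined")
    return mu / sigma


def synthetic_samples(mean, spread, n=1000, seed=0):
    """Generate stand-in Monte Carlo results from a Gaussian distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, spread) for _ in range(n)]


# Two hypothetical circuits with the same nominal value but different spread.
tight = synthetic_samples(mean=1.0, spread=0.05)
wide = synthetic_samples(mean=1.0, spread=0.20)

# Rank from most robust (largest mu/sigma) to least robust.
ranking = sorted(
    {"tight": tight, "wide": wide}.items(),
    key=lambda item: mu_over_sigma(item[1]),
    reverse=True,
)
for name, samples in ranking:
    print(f"{name}: mu/sigma = {mu_over_sigma(samples):.2f}")
```

Sorting by decreasing μ/σ in this way places the design with the least process-induced variation first.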


References
[1] A. Chandrakasan, R.W. Broderson, Low Power Digital CMOS Design, Kluwer Academic Publishers, 2002.
[2] N.H.E. Weste, D. Harris, CMOS VLSI Design, Pearson Education, 2005.
[3] V.G. Oklobdzija, D. Villeger, Improving multiplier design using improved column compression tree and optimized final adder in CMOS technology, IEEE Trans. VLSI Syst. 3 (2) (1995) 292–301.
[4] A.M. Shams, D.K. Darwish, M.A. Bayoumi, Performance analysis of low power 1-bit CMOS full adder cells, IEEE Trans. VLSI Syst. 10 (1) (2002) 20–29.
[5] M. Maeen, V. Foroutan, K. Navi, On the design of low power 1-bit full adder cell, IEICE Electron. Expr. 6 (16) (2009) 1148–1154.
[6] J.M. Rabey, A. Chandrakasan, B. Nikolic, Digital Integrated Circuit, A Design Perspective, Prentice Hall, Englewood Cliffs, NJ, 2002.
[7] S. Purohit, M. Margala, Investigating the impact of logic and circuit implementation for full adder performance, IEEE Trans. VLSI Syst. 20 (7) (2012) 1327–1331.
[8] R. Singh, S. Akashe, Modeling and analysis of low power 10T full adder with reduced ground noise, J. Circuits Syst. Comput. 23 (14) (2014) 1–14.
[9] R. Patel, H. Parasar, M. Wajid, Faster arithmetic and logical unit CMOS design with reduced number of transistors, Proc. of Intl. Conf. on Advances in Communication, Network and Computing 142 (2011) 519–522.
[10] P.M. Lee, C.H. Hsu, Y.H. Hung, Novel 10-T full adders realized by GDI structure, Proc. IEEE Int. Symp. Integr. Circuits (2007) 115–118.
[11] K.K. Chaddha, R. Chandel, Design and analysis of a modified low power CMOS full adder using gate-diffusion input technique, J. Low Power Electron. 6 (4) (2010) 482–490.
[12] I. Hassoune, D. Flandre, I. O’Connor, J. Legat, ULPFA: a new efficient design of a power-aware full adder, IEEE Trans. Circuits Syst. 57 (8) (2010) 2066–2074.
[13] M. Aguirre-Hernandez, M. Linares-Aranda, CMOS full adders for energy efficient arithmetic applications, IEEE Trans. VLSI Syst. 19 (4) (2011) 718–721.
[14] G. Ramana Murthy, C. Senthil Pari, P. Velraj Kumar, T.S. Lim, A new 6-T multiplexer based full adder for low power and leakage current optimisation, IEICE Electronics Express 9 (17) (2012) 1434–1441.
[15] A. Morgenshtein, A. Fish, I.A. Wagner, Gate Diffusion Input (GDI)-A power efficient method for digital combinatorial circuits, IEEE Trans. VLSI Syst. 10 (5) (2002) 566–581.
[16] R. Uma, P. Dhavachelvan, Modified gate diffusion input technique: a new technique for enhancing performance in full adder circuits, Proc. ICCCS (2012) 74–81.
[17] V. Foroutan, M. Taheri, K. Navi, A. Mazreah, Design of two low power full adder cells using GDI structure and hybrid CMOS logic style, Integration (Amst) 47 (1) (2014) 48–61.
[18] A. Morgenshtein, I. Shwartz, A. Fish, Full swing Gate Diffusion Input (GDI) logic – case study for low power CLA adder design, Integration (Amst) 47 (1) (2014) 62–70.
[19] S. Wariya, R. Nagaria, S. Tiwari, Performance analysis of high speed hybrid CMOS full adder circuits for low voltage VLSI design, VLSI Design (2012) 1–18.
