Novel low-cost and fault-tolerant reversible logic adders


Computers and Electrical Engineering 53 (2016) 56–72 Contents lists available at ScienceDirect Computers and Electrical Engineering journal homepage...



Computers and Electrical Engineering journal homepage: www.elsevier.com/locate/compeleceng

Mojtaba Valinataj∗, Mahboobeh Mirshekar, Hamid Jazayeri

School of Electrical and Computer Engineering, Babol University of Technology, Shariati Street, Babol, P.B. 484, Iran

Article info

Article history: Received 3 September 2015; revised 15 June 2016; accepted 16 June 2016.

Keywords: Reversible logic; Parity preserving gates; Fault-tolerance; Low-power CMOS; Binary coded decimal adder; Carry skip adder; Carry look-ahead adder

Abstract

In recent years, reversible logic circuits have received considerable attention due to their diverse applications in various fields. As computing systems are susceptible to environmental effects that can disturb their intended operation, fault-tolerance capability is of great importance. In this paper, a novel reversible gate is first presented to achieve a parity preserving full adder, which serves as the main building block of different adders. Then, by using the proposed full adder and new arrangements of other reversible gates, new low-cost fault-tolerant adders are presented, including binary coded decimal, carry skip and carry look-ahead architectures. The new adders are highly efficient in quantum cost, total logical calculation and transistor count compared to the existing designs. In addition, regarding other factors including the number of gates, garbage outputs and maximum delay, they are the best or among the most favorable parity preserving reversible adders.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Processing elements based on classical integrated circuit designs will no doubt reach their physical limits. Even before that point, the high power dissipation of VLSI circuits built in the latest nanometer technologies has slowed the scaling of these circuits predicted by Moore's law. Therefore, many researchers are investigating new design paradigms such as reversible computing to overcome the current major challenges. Ordinary, irreversible computations always lead to energy dissipation. R. Landauer in 1961 [1] was the first to demonstrate that losing one bit of information leads to an energy dissipation of $kT \ln 2$ joules, in which $k$ is Boltzmann's constant and $T$ is the absolute temperature at which the computation is performed. Information is lost whenever the input data cannot be retrieved from the output, which is common in ordinary computations. Later, C. H. Bennett in 1973 [2] showed that no energy is dissipated if a circuit consists only of reversible gates. Thus, reversible computing is an attractive way to address one of the main challenges in computing systems by obtaining lower-power circuits. As a result of the inherent reversibility of closed quantum mechanical systems, quantum circuits are also reversible by nature [3]. Therefore, reversible logic circuits can be used to build quantum computers at the nanoscale in order to benefit from both a reduction in energy dissipation and a higher computational speed. Since no information is lost in a reversible circuit and the input vector can be recovered from the output vector, there must be a one-to-one correspondence between the input and output vectors, which means that the number of inputs equals the number of outputs. However, if a fault occurs inside a reversible circuit, the output vector will be incorrect and, as a result, the input vector cannot be retrieved.
∗ Corresponding author. E-mail address: [email protected] (M. Valinataj).

http://dx.doi.org/10.1016/j.compeleceng.2016.06.008 0045-7906/© 2016 Elsevier Ltd. All rights reserved.

Thus, fault-tolerance capability is vital for reversible circuits because these circuits, especially reversible low-power CMOS designs, are vulnerable to environmental effects. A straightforward and low-cost method to achieve fault-tolerance is the use of parity codes to detect errors in the output. In this way, the parity preserving characteristic can be applied to reversible gates and circuits at a reasonable cost. In a parity preserving reversible gate, the parity of the input vector is equal to the parity of the output vector. However, as feedback and fan-out are not permitted in reversible circuits [4], the synthesis of these circuits, especially fault-tolerant designs, is different from and more difficult than that of irreversible circuits.

So far, many reversible arithmetic operators have been designed to perform addition [5–12] and multiplication [3,13–17], some of which are parity preserving and thus beneficial for fault-tolerant reversible circuits. Some designs propose new gates and then use them in the proposed adder or multiplier circuit, such as [9,10,12,13], while the others are based on new arrangements of existing gates. In this paper, we employ both types of reversible circuit design to attain the intended fault-tolerant reversible adder architectures. First, we propose a new low-cost reversible gate that, after adjustment of its inputs, operates as a parity preserving full adder. Among the existing parity preserving full adders, this full adder requires the minimum hardware complexity, i.e. the minimum number of logical operations to prepare the outputs. Thereafter, by incorporating the proposed full adder and new arrangements of other required reversible gates, several low-cost parity preserving adder architectures are proposed, including the ripple carry adder (RCA), binary coded decimal (BCD) adder, carry skip (CSK) adder and carry look-ahead adder (CLA).
In addition, we introduce a more precise delay computation, not presented before, to be used in evaluating the proposed designs against their counterparts.

The rest of the paper is organized as follows. Section 2 describes some preliminaries and basic concepts, mainly regarding reversible gates. Section 3 reviews the related works, and Section 4 explains the proposed reversible gate needed for designing the low-cost parity preserving full adder. The new low-cost fault-tolerant reversible adder architectures are described in Section 5. Section 6 presents an approach for the transistor realization of the proposed gate and adder architectures. Section 7 illustrates a new delay estimation approach. The evaluation of the proposed reversible adders compared to the existing designs is presented and discussed in Section 8. Finally, conclusions and future works are drawn in Section 9.

2. Preliminaries

2.1. Basic concepts

A logic gate is reversible if there is a one-to-one correspondence between its input vector and output vector. Each output vector is uniquely determined from the input vector, and each input vector is uniquely recovered from the output vector. A reversible gate is denoted by n × n, which means that the number of inputs is n and equals the number of outputs. A circuit is reversible if it consists only of reversible gates. In a reversible gate or circuit, the garbage outputs are the outputs that are not used in subsequent computations. In other words, the outputs that are needed solely to maintain reversibility are called garbage outputs [7]. In addition, the constant inputs are the inputs whose values do not change in a gate and have to be maintained at either 0 or 1 in order for the gate to perform the intended function. These inputs are also added to a gate to make it reversible [18].

The quantum cost is defined as the number of 2 × 2 quantum primitives required to implement a reversible gate. All 2 × 2 quantum primitives, such as CNOT (controlled NOT), $V$ and $V^{+}$, have a unit quantum cost, where $V$ is the square root of the NOT gate and $V^{+}$ is its Hermitian conjugate. Thus, $V \times V = V^{+} \times V^{+} = \mathrm{NOT}$ and $V \times V^{+} = V^{+} \times V = I$, where $I$ is the identity matrix. In addition, the NOT gate is a 1 × 1 quantum primitive that has no quantum cost [7]. Reversible gates bigger than 2 × 2 cannot be directly realized by quantum techniques. Therefore, the quantum primitives should be used to implement the bigger gates.

Another criterion in the design of reversible circuits is the total logical calculation, which counts the XOR, AND, and NOT operations that appear in the output expressions. The hardware complexity is therefore proportional to the number of logical operations required to prepare the outputs. In other words, it reflects the computational complexity of a reversible circuit. Another important criterion is the delay of a reversible circuit, which is defined as the maximum number of gates on the paths from the inputs to the outputs [7].

Many reversible gates have been designed so far. However, some of them, the basic gates, are used more frequently in reversible circuits. In the following, these gates are introduced in two classes.

2.2. Simple reversible gates

The most important simple gates are the Feynman gate (FG) [19], Toffoli gate (TG) [20] and Peres gate (PG) [21]. The block diagram and quantum realization of FG, which is also called CNOT, are presented in Fig. 1, and Table 1 shows its truth table. The reversibility of FG is evident from its function and truth table. This 2 × 2 gate can also be used as the equivalent of a fan-out operation in reversible circuits when its B input is set to zero. The block diagram and quantum realization of TG are shown in Fig. 2 as a 3 × 3 reversible gate.
This gate can be considered a universal reversible gate, which means that any reversible logic circuit can be implemented using only this gate after properly setting some inputs to one or zero to produce the Boolean functions. The block diagram of PG is shown in Fig. 3 as the lowest-cost universal gate. In fact, the PG operation is a combination of the FG and TG operations, which also leads to a lower quantum cost. Based on Figs. 1–3, the quantum costs of FG, TG and PG are one, five and four, respectively.

Fig. 1. (a) Block diagram, and (b) quantum realization of Feynman gate.

Table 1 Truth table of Feynman gate.

A  B  P  Q
0  0  0  0
0  1  0  1
1  0  1  1
1  1  1  0

Fig. 2. (a) Block diagram, and (b) quantum realization of Toffoli gate.

Fig. 3. (a) Block diagram, and (b) quantum realization of Peres gate.

2.3. Parity preserving reversible gates

A gate is parity preserving if the following relation exists between the input vector $I_v = \{I_1, I_2, \ldots, I_n\}$ and the output vector $O_v = \{O_1, O_2, \ldots, O_n\}$:

$$I_1 \oplus I_2 \oplus \cdots \oplus I_{n-1} \oplus I_n = O_1 \oplus O_2 \oplus \cdots \oplus O_{n-1} \oplus O_n \qquad (1)$$

which means that the parity of the inputs is the same as the parity of the outputs. Accordingly, if a fault occurs on one of the outputs, it can be detected. A circuit is parity preserving if it is constructed only from parity preserving gates. Parity preservation is one of the simple but efficient characteristics that make a reversible circuit fault-tolerant. Thus, it is beneficial to make gates or circuits fault-tolerant in the form of parity preserving designs. According to Eq. (1) and the parity preserving characteristic, the fault model used in parity preserving reversible circuits is the logical fault model, in which output deviations at the logic level are considered without taking the cause of the faults into account.

There are some basic parity preserving gates such as the Fredkin gate (FRG) [22] and the double Feynman gate (F2G) [23]. In addition, two examples of more recent parity preserving gates are the modified Islam gate (MIG) [10] and the new fault-tolerant gate (NFT) [24]. The block diagrams of these parity preserving gates are shown in Fig. 4. Among these gates, the NFT gate, FRG and MIG are universal with quantum costs of five, five and seven, respectively, according to [24], [22] and [10], and F2G has a quantum cost of two as it includes two FGs.

Fig. 4. Block diagrams of parity preserving gates, (a) Fredkin gate, (b) Double Feynman gate, (c) new fault-tolerant gate, and (d) modified Islam gate.

3. Related works

3.1. Fault-tolerant reversible full adders

Different reversible full adder designs exist in the literature. Some of them are parity preserving and therefore naturally have a higher complexity and quantum cost. As shown in [5], a reversible full adder can be designed with at least one constant input and two garbage outputs. However, a parity preserving full adder requires at least two constant inputs and three garbage outputs [8,25]. Generally, the design of a reversible full adder can be performed in two ways: first, by exploiting a new arrangement of existing gates, and second, by creating a new gate to perform the intended operation. In [26], one of the first fault-tolerant full adders is proposed; it is a parity preserving full adder that combines four FRGs, with a quantum cost of 20. Another multi-gate full adder design [9] uses four F2Gs and two FRGs and thus has a quantum cost of 18, but requires six garbage outputs. The full adder presented in [12] requires three F2Gs and an NFT gate, with a total quantum cost of 11. The designs proposed in [10,27,28] use a new gate to achieve a parity preserving full adder. In [10], after introducing a new gate, MIG, a full adder is constructed by using two MIGs with a quantum cost of 14. The full adder designs presented in [27,28] each benefit from a new 5 × 5 gate that, after setting its last two inputs to zero, operates as a parity preserving full adder, with quantum costs of 14 and 8, respectively. It is worth mentioning that the introduced parity preserving full adders also differ in the other criteria, which will be investigated later.

3.2. Fault-tolerant reversible adder architectures

So far, many parity preserving reversible adder architectures have been proposed. Among them, ripple carry adders [10,12,27,28] are simply constructed by using the parity preserving reversible full adders described before, with a quantum cost that increases linearly with the adder size. The first parity preserving BCD adder [8], which also uses a carry skip adder, is a one-digit BCD adder with a quantum cost of 148. This adder requires a large number of garbage outputs, namely 36. Another BCD adder [11] reduced the number of garbage outputs and the quantum cost to 29 and 131, respectively. In [29], a more efficient BCD adder is proposed with a considerable reduction in the garbage outputs and quantum cost, which are reduced to 14 and 84, respectively.
Besides, a more recent work [28] reduced the quantum cost of this type of fault-tolerant adder to 61 by incorporating three new parity preserving gates, but increased the number of garbage outputs to 18. Moreover, a CSK-based BCD adder is proposed in [28] with a quantum cost of 81. The first parity preserving CSK adder [26], with a size of four bits, includes 24 FRGs and thus requires a quantum cost of 120 in addition to producing 23 garbage outputs. Other CSK adder architectures [10,12] propose more efficient designs with fewer garbage outputs and a lower quantum cost. Furthermore, the only valid parity preserving CLA is proposed in [10], with a quantum cost of 73 for a two-bit adder. (Another parity preserving CLA is introduced in [30], but it cannot be used for comparison because its proposed structure contains a major mistake.) The existing parity preserving adder designs also differ in the number of required gates and the overall delay. In this paper, we propose adder architectures that are more beneficial with respect to different criteria.

4. New low-cost reversible gate

Here, we introduce a new 5 × 5 reversible gate called the low-complexity gate (LCG), which has a very low hardware complexity compared to the previous similar gates proposed in [9,10,12,26–28], as well as a small quantum cost. This gate is used to construct a new low-cost parity preserving full adder. The block diagram and quantum realization of LCG are illustrated in Fig. 5. According to Fig. 5b, the quantum cost of LCG is 10, as it is constructed from ten 2 × 2 quantum primitives. At first glance, the LCG is not parity preserving. However, if its last two inputs (D and E) are set to zero as constant inputs, this gate operates as a parity preserving full adder. The block diagram of this gate operating as a parity preserving full adder and its truth table are shown in Fig. 6 and Table 2, respectively. According to Fig. 6, the LCG performs the add operation on the inputs A, B and C, and produces the corresponding outputs, Sum and carry (Cout), as required. Note that the output logic equation for Cout in this figure, $(A \oplus B)C \oplus AB$, is the same as the ordinary logic $AB + AC + BC$ for the output carry. In addition, Table 2 shows the equality of the input and output parities, as required in a parity preserving full adder. Furthermore, it shows that the number of constant inputs (D and E) and the number of garbage outputs (P, Q and T) are two and three, respectively, both of which are the minimum required for a parity preserving full adder, as proved in [10].

As stated before, the total logical calculation in a reversible circuit is defined as the number of XOR (α), AND (β), and NOT (γ) operations required for preparing the outputs. The total logical calculation, or hardware complexity, of LCG is thus T = 6α + 2β according to Fig. 5a, while this criterion is T = 9α + 3β + 1γ for ZPLG, the gate presented in [28] with the lowest quantum cost, equal to eight, that can operate as a parity preserving full adder according to Fig. 7. Thus, despite


Fig. 5. (a) Block diagram, and (b) quantum realization of the new low-complexity gate.

Fig. 6. LCG adjusted as a parity preserving full adder.

Fig. 7. (a) Block diagram of ZPLG, and (b) realization of a parity preserving full adder [28].

Table 2 Truth table of the proposed parity preserving full adder.

A  B  C  D  E  P  Q  Sum  Cout  T
0  0  0  0  0  0  0  0    0     0
0  0  1  0  0  0  0  1    0     0
0  1  0  0  0  0  1  1    0     1
0  1  1  0  0  0  1  0    1     0
1  0  0  0  0  1  1  1    0     0
1  0  1  0  0  1  1  0    1     1
1  1  0  0  0  1  0  0    1     0
1  1  1  0  0  1  0  1    1     0

the fact that the proposed LCG has a higher quantum cost than ZPLG, implementing low-power circuits in CMOS technologies with LCG is more cost-effective. In fact, we will show later that the transistor realization of LCG requires far fewer transistors than that of ZPLG. Therefore, during circuit synthesis such as adder design, the gates should be chosen by considering whether a quantum realization or a CMOS implementation is intended. It should be noted that, to obtain a precise hardware complexity, the total logical calculations presented in this paper for different gates and circuits are obtained by assuming logic sharing among the outputs, which means that a common operation is counted once. For example, $A \oplus B$ appears in four of the five outputs of LCG. However, its corresponding XOR


Fig. 8. General structure of a one-digit BCD adder.

Fig. 9. A four-bit ripple carry adder by using the proposed LCG.

gate is counted once (1α) instead of four times (4α). This is in accordance with the quantum realization, since the relevant quantum primitive is shared by four outputs according to Fig. 5b.

5. Proposed parity preserving adder architectures

5.1. BCD adder

A one-digit BCD adder (Fig. 8) is composed of three parts: a four-bit binary adder, which can be a simple ripple carry adder; an overflow detection logic, which determines whether the primary sum needs to be corrected; and a result correction logic, which produces the correct sum by adding 6 to the primary sum using another adder. In order to design a parity preserving reversible BCD adder, all three parts should be parity preserving. For the first part, a four-bit parity preserving RCA can be constructed by using four LCGs, as depicted in Fig. 9, with a quantum cost of 40. The remaining parts of a BCD adder, however, can be designed differently. Here, a new structure is proposed that performs the result correction in combination with the overflow detection, which reduces the total quantum cost. This structure is attained by a new arrangement of some parity preserving gates, as shown in Fig. 10. In this figure, $z_3 z_2 z_1 z_0$ and $z_4$ are the primary sum and carry, respectively. Accordingly, $S_3 S_2 S_1 S_0$ is the corrected sum and Cout is the output carry of the BCD adder. It should be noted that in Fig. 10 and similar figures, the unconnected or unnamed outputs of the gates are the garbage outputs, and the inputs fixed to zero or one are the constant inputs.

The first proposed parity preserving BCD adder architecture is shown in Fig. 11. This adder is constructed by connecting the circuits in Figs. 9 and 10. However, we can reduce the quantum cost even further and reach the second BCD adder if we use ZPLG instead of LCG, although the total logical calculation will increase. In addition, the second proposed BCD adder, depicted in Fig. 12, uses another new combined structure for overflow detection and result correction. Although this combined structure requires one more gate than the structure shown in Fig. 10, both have the same quantum cost. In addition, the second structure includes a different parity preserving gate called ZCG, whose block diagram is shown in Fig. 13. As will be shown later, the proposed BCD adders are the best among the existing designs in some criteria.

5.2. Carry skip adder

The general structure of a four-bit CSK adder is depicted in Fig. 14. In this figure, $P_i$ is the propagate signal and equals $a_i \oplus b_i$ as a function of the input operands' bits. If $P_i$ equals one, its input carry $c_i$ is transmitted to the next bit


Fig. 10. Proposed combined structure for overflow detection and result correction in a BCD adder.

of the adder. Therefore, the output carry $c_{out}$ of a four-bit CSK adder is computed as $P \cdot c_{in} + c_4$, in which $P$ equals the product $P_1 P_2 P_3 P_4$. Thus, the input carry $c_{in}$ can skip the full adders if $P$ equals one. However, to properly implement the carry skip logic with reversible gates, Eq. (2) should be used:

$$c_{out} = P \cdot c_{in} \oplus \bar{P} \cdot c_4 \qquad (2)$$
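As a behavioural illustration of the carry skip logic, the following minimal sketch compares the conventional OR form of the output carry with the XOR form of Eq. (2) for every four-bit input. It assumes that Eq. (2) reads $c_{out} = P \cdot c_{in} \oplus \bar{P} \cdot c_4$ (the complement on the second $P$ is an assumption here), and the function names are hypothetical. It models only the Boolean behaviour, not the reversible gate-level circuit.

```python
def ripple_carry_add(a_bits, b_bits, c_in):
    """Add two LSB-first bit lists; return (sum_bits, carry_out)."""
    carry = c_in
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        # Carry in XOR form: (a XOR b).carry XOR a.b equals majority(a, b, carry).
        carry = ((a ^ b) & carry) ^ (a & b)
    return sum_bits, carry


def carry_skip_add_4bit(a, b, c_in):
    """Behavioural model of a four-bit carry skip block (hypothetical helper)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    sum_bits, c4 = ripple_carry_add(a_bits, b_bits, c_in)

    # Block propagate P: the AND of all bit-wise propagate signals a_i XOR b_i.
    p = 1
    for ai, bi in zip(a_bits, b_bits):
        p &= ai ^ bi

    c_out_or = (p & c_in) | c4               # conventional form: P.c_in + c4
    c_out_xor = (p & c_in) ^ ((p ^ 1) & c4)  # assumed Eq. (2): P.c_in XOR (NOT P).c4
    return sum_bits, c_out_or, c_out_xor


def check_all_inputs():
    """Return True if both carry forms match ordinary addition for every input."""
    for a in range(16):
        for b in range(16):
            for c_in in (0, 1):
                sum_bits, c_or, c_xor = carry_skip_add_4bit(a, b, c_in)
                total = a + b + c_in
                value = sum(bit << i for i, bit in enumerate(sum_bits))
                if value != total % 16 or c_or != total // 16 or c_xor != total // 16:
                    return False
    return True
```

The two forms agree because the terms $P \cdot c_{in}$ and $\bar{P} \cdot c_4$ can never both be one, which is what allows the OR to be replaced by an XOR.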

The design of a low-cost parity preserving CSK adder depends on the structure of the carry skip logic as well as on the incorporated full adders. The required AND and XOR operations in the carry skip logic can be implemented by using either of the parity preserving gates FRG or NFT. The designs in [12,26] use FRG, and the design presented in [10] uses the NFT gate. However, an F2G is required to produce two fan-outs of $c_{in}$. Our proposed low-cost parity preserving CSK adder is depicted in Fig. 15, in which NFT gates are used for the carry skip logic. The proposed carry skip circuit is different from that of [10] and requires a quantum cost of 22 instead of the 24 needed in [10]. Furthermore, the overall quantum cost of the proposed CSK adder can be reduced, giving the second CSK adder, if the full adders are implemented by ZPLG instead of LCG. A complete evaluation of the proposed CSK adders compared to the previous designs is given in Section 8.

5.3. Carry look-ahead adder

A carry look-ahead adder attempts to generate all internal carries in parallel in order to eliminate the carry propagation delay incurred in a ripple carry adder. In a carry look-ahead adder, the generate signal $G_i$ equals $a_i \cdot b_i$ as a function of the input operands' bits. If $G_i$ equals one, the carry $c_{i+1}$ is produced and transmitted to the next bit of the adder. The definition of the propagate signal $P_i$ is the same as that of $P_i$ in a CSK adder, as stated before. In


Fig. 11. First proposed parity preserving BCD adder.

addition, each output sum bit $S_i$ is produced as $c_i \oplus P_i$, in which $c_i$ is the input carry for the $i$th bit of the adder and is computed as $G_{i-1} + c_{i-1} \cdot P_{i-1}$. Thus, for a four-bit carry look-ahead adder, after expansion of the carries, $c_1$ to $c_4$ are computed as follows, in which $c_0$ equals the input carry $c_{in}$ and $c_4$ equals the output carry $c_{out}$ of the whole adder:

$$c_1 = G_0 + c_0 P_0 \qquad (3)$$

$$c_2 = G_1 + G_0 P_1 + c_0 P_0 P_1 \qquad (4)$$

$$c_3 = G_2 + G_1 P_2 + G_0 P_1 P_2 + c_0 P_0 P_1 P_2 \qquad (5)$$

$$c_4 = G_3 + G_2 P_3 + G_1 P_2 P_3 + G_0 P_1 P_2 P_3 + c_0 P_0 P_1 P_2 P_3 \qquad (6)$$

According to the logic of $P_i$ and $G_i$, in each of Eqs. (3)–(6) at most one term equals one and the other terms are zero, because $G_i$ and $P_i$ of the same bit can never both be one. Therefore, all required OR operations can be replaced by XOR logic and are thus easily implemented by using reversible gates. The proposed four-bit parity preserving CLA is illustrated in Fig. 16, which also encompasses a two-bit CLA. In this figure, besides the proposed LCG, a new arrangement of the basic gates is used that is different from the two-bit CLA presented in [10]. For the first two bits of the proposed CLA, only 14 gates are required, compared to the 19 gates required in [10]. It should be noted that in Fig. 16, one of the F2Gs in the second bit of the adder is placed to produce an extra fan-out required in the third bit of the adder, thus it is not counted for the proposed two-bit CLA. Furthermore, the overall quantum cost of the proposed CLAs can be reduced if the full adders are implemented by ZPLG instead of LCG, although the total logical calculation will be higher. A complete evaluation of the proposed CLAs is given in Section 8.
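The claim that the OR operations in Eqs. (3)–(6) can be replaced by XOR can be checked exhaustively. The sketch below builds each carry from its product terms, combines them with XOR, and records the largest number of terms that are one at the same time. The function names are hypothetical, and the code is a behavioural check only, not a model of the reversible circuit in Fig. 16.

```python
def cla_4bit(a, b, c0):
    """Four-bit carry look-ahead addition with XOR-combined carry terms.

    Returns (sum_bits LSB-first, carry_out, max_active_terms), where
    max_active_terms is the largest number of product terms that were one
    at the same time in any of the carry equations c1..c4.
    """
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # generate signals G_i
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # propagate signals P_i

    carries = [c0]
    max_active_terms = 0
    for i in range(1, 5):
        terms = []
        # Terms G_j . P_{j+1} ... P_{i-1} for j = i-1 down to 0.
        for j in range(i - 1, -1, -1):
            term = g[j]
            for k in range(j + 1, i):
                term &= p[k]
            terms.append(term)
        # Last term c0 . P_0 ... P_{i-1}.
        term = c0
        for k in range(i):
            term &= p[k]
        terms.append(term)

        max_active_terms = max(max_active_terms, sum(terms))
        carry = 0
        for term in terms:
            carry ^= term          # XOR used in place of OR
        carries.append(carry)

    sum_bits = [carries[i] ^ p[i] for i in range(4)]
    return sum_bits, carries[4], max_active_terms


def check_cla():
    """Return True if the XOR-based CLA matches ordinary addition everywhere."""
    for a in range(16):
        for b in range(16):
            for c0 in (0, 1):
                sum_bits, c_out, active = cla_4bit(a, b, c0)
                total = a + b + c0
                value = sum(bit << i for i, bit in enumerate(sum_bits))
                if value != total % 16 or c_out != total // 16 or active > 1:
                    return False
    return True
```

A result of `True` from `check_cla()` means that, for all 512 input combinations, the XOR form gives the correct sum and carry and no carry equation ever has two terms equal to one.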


Fig. 12. Second proposed parity preserving BCD adder.
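The three-part structure of a one-digit BCD adder described in Section 5.1 (four-bit binary addition, overflow detection, and correction by adding 6) can be sketched behaviourally as follows. This is only an illustration of the arithmetic, not the reversible circuits of Figs. 11 and 12. The overflow expression $z_4 + z_3 z_2 + z_3 z_1$ is the standard textbook form and is an assumption here, since the gate-level detection logic is given only in the figures; the function name is hypothetical.

```python
def bcd_digit_add(x, y, c_in=0):
    """Add two BCD digits; return (corrected_sum_digit, carry_out)."""
    if not (0 <= x <= 9 and 0 <= y <= 9 and c_in in (0, 1)):
        raise ValueError("operands must be BCD digits and c_in must be 0 or 1")

    # Part 1: four-bit binary addition giving the primary sum z3..z0 and carry z4.
    z = x + y + c_in
    z4 = (z >> 4) & 1
    z3 = (z >> 3) & 1
    z2 = (z >> 2) & 1
    z1 = (z >> 1) & 1

    # Part 2: overflow detection (assumed textbook expression: primary sum > 9).
    c_out = z4 | (z3 & z2) | (z3 & z1)

    # Part 3: result correction, adding 6 to the primary sum when needed.
    s = ((z & 0xF) + 6 * c_out) & 0xF
    return s, c_out
```

For example, `bcd_digit_add(9, 9, 1)` has the primary sum 19, which triggers the correction and yields the digit 9 with an output carry of 1.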

Fig. 13. Block diagram of ZCG [28].

Fig. 14. General structure of a four-bit carry skip adder.

6. Transistor realization

Some works such as [31,32] have presented transistor-based implementations of reversible logic circuits. An approach for the electronic implementation of reversible circuits is presented in [31], and we extend that approach to implement our proposed parity preserving circuits. This type of implementation is beneficial for evaluating reversible circuits from the transistor realization point of view. In this way, a new criterion, the number of required transistors, can be used


Fig. 15. Proposed four-bit parity preserving carry skip adder.

Table 3 Required number of transistors for basic reversible gates.

Simple gates  Transistor count  Parity preserving gates  Transistor count
FG            8                 F2G                      16
TG            16                FRG                      16
PG            24                NFT                      24
                                MIG                      40

for comparing different reversible designs. Moreover, this type of implementation, in accordance with adiabatic switching [33], can be used in the design of low-power CMOS circuits. Adiabatic switching aims to make the transistors switch in a more energy-efficient way by reusing the signal energy.

Based on the approach proposed in [31], the implementation of the Q function ($A \oplus B$) of FG (Fig. 1) using on/off switches is illustrated in Fig. 17a. As this method requires both the input signals and their complements, it is assumed in [31] that all input signals are supplied along with their complements. In this way, all internal and output signals are produced together with their complement signals, and thus the NOT operations have no cost. Similar to [31], we use the CMOS transmission gate to implement the on/off switches, which eventually leads to an implementation of the reversible gates and the proposed adder designs that maintains their reversibility. The CMOS transmission gate can pass both the low voltage (logical '0') and the high voltage (logical '1') without a drop in the input voltage. This gate is made of two pass transistors, an NMOS and a PMOS transistor, that share a common source and drain. In addition, their control inputs are the complements of each other, i.e. if the control input of the NMOS transistor equals $C$, then the control input of the PMOS transistor should equal $\bar{C}$ for the pair to act as a simple on/off switch. If $C$ equals one, both transistors conduct, so the transmission gate is on and connects its input and output lines; if $C$ equals zero, the transmission gate is off and disconnects its input and output lines. In other words, if the control input of a transmission gate equals one, the gate conducts and its corresponding switch is closed. Similarly, if the control input equals zero, the gate does not conduct and its corresponding switch is open. In Fig. 17a, the horizontal switches and the vertical switches are controlled by the complementary signals $B$ and $\bar{B}$.

The implementation of FG using CMOS transmission gates is shown in Fig. 17b. According to this figure, an XOR operation, and as a result an FG, requires four transmission gates, i.e. eight transistors in total. The design presented in Fig. 17 can be extended to other reversible gates. For example, FRG needs two rings similar to Fig. 17a, and thus requires 16 transistors [31]. In addition, TG requires eight transmission gates and thus 16 transistors as well. It is worth mentioning that, as the P and Q outputs of TG have no cost, the cost of TG is the same as the cost of its R function, as depicted in Fig. 18 [31].

To obtain the transistor count of our proposed parity preserving adders compared to the previous designs, we also need to know the transistor count of other reversible gates. Therefore, we investigated the transistor realization of some basic gates, which led to the results presented in Table 3. Because transmission gates can operate in both directions, transistor-based reversible gates can exhibit their reversibility. Thus, in these gates it is possible to feed the outputs back into the circuit in reverse to obtain the initial inputs, which can be performed in one stage for some basic gates such as FG, TG, F2G and FRG according to their truth tables.

In addition to Table 3, it is necessary to know the transistor count of the full adders, as they are the main basic blocks in different adder architectures. To do so, we illustrate the transistor realization of our proposed full adder, and then obtain the transistor count of other parity preserving full adders. As our parity preserving full adder uses the proposed gate in Fig. 5, LCG, its required number of transistors should be computed. The P output function of LCG has no cost as it is the same as


Fig. 16. Proposed four-bit parity preserving carry look-ahead adder encompassing a two-bit carry look-ahead adder.

Fig. 17. (a) Implementation of XOR logic as the Q function of FG with on/off switches, (b) Realization of FG using CMOS transmission gates.

the first input. The Q function ($A \oplus B$) of LCG is the same as that of FG, thus Fig. 17a is applicable. Besides, since the Q function is repeated inside the R, S and T functions of LCG, its cost (eight transistors) is counted once. The R function can be stated as $Q \oplus C$, which requires eight transistors as well. In addition, the S function can be stated as $D \oplus QC \oplus AB$, which requires 32 transistors based on the implementation presented in Fig. 19. Finally, the T function can be represented as $B \oplus E \oplus S$, which requires twice as many transistors as the Q function, i.e. 16. Therefore, the total transistor count of LCG is 8 + 8 + 32 + 16 = 64.
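The output functions named above can be checked against the properties claimed for LCG. The sketch below assumes the functions are P = A, Q = A ⊕ B, R = Q ⊕ C, S = D ⊕ QC ⊕ AB and T = B ⊕ E ⊕ S; the XOR operators are a reconstruction, since they do not survive in every copy of the text, and the Python names are hypothetical. Under that assumption it tests reversibility over all 32 inputs, the full-adder behaviour and parity preservation with D = E = 0, and the transistor tally.

```python
from itertools import product


def lcg(a, b, c, d, e):
    """Assumed LCG output functions, reconstructed from the prose of Section 6."""
    p = a
    q = a ^ b
    r = q ^ c                      # Sum when D = E = 0
    s = d ^ (q & c) ^ (a & b)      # Cout when D = E = 0
    t = b ^ e ^ s
    return p, q, r, s, t


def parity(bits):
    """XOR of all bits in the sequence."""
    result = 0
    for bit in bits:
        result ^= bit
    return result


ALL_INPUTS = list(product((0, 1), repeat=5))

# Reversibility: the 32 input vectors must map to 32 distinct output vectors.
is_reversible = len({lcg(*v) for v in ALL_INPUTS}) == len(ALL_INPUTS)

# Full-adder behaviour and parity preservation with the constants D = E = 0.
full_adder_ok = True
parity_ok_with_constants = True
for a, b, c in product((0, 1), repeat=3):
    p, q, r, s, t = lcg(a, b, c, 0, 0)
    total = a + b + c
    full_adder_ok &= (r == total % 2) and (s == total // 2)
    parity_ok_with_constants &= parity((a, b, c, 0, 0)) == parity((p, q, r, s, t))

# Without the constants, parity is not preserved for every input vector.
parity_ok_in_general = all(parity(v) == parity(lcg(*v)) for v in ALL_INPUTS)

# Transistor tally using the per-function counts quoted in the text (P is free).
TRANSISTORS_PER_FUNCTION = {"P": 0, "Q": 8, "R": 8, "S": 32, "T": 16}
lcg_transistor_count = sum(TRANSISTORS_PER_FUNCTION.values())
```

With these assumed functions the gate is reversible and acts as a parity preserving full adder only when D and E are held at zero, which agrees with the remark in Section 4 that LCG is not parity preserving at first glance.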


Fig. 18. Implementation of the R function (AB ⊕ C) of TG using on/off switches.

Fig. 19. (a) Implementation of D ⊕ QC, and then (b) use of it in producing the S function of LCG.

Table 4. Required gates and transistor count for different parity preserving full adders.

| Fault-tolerant full adder | Required gates | Transistor count |
|---|---|---|
| [26] | 4 FRG | 64 |
| [9] | 4 F2G + 2 FRG | 96 |
| [10] | 2 MIG | 80 |
| [27] | 1 nameless gate | 128 |
| [34] | 1 F2PG | 98 |
| [12] | 3 F2G + 1 NFT | 72 |
| [28] | 1 ZPLG | 104 |
| Proposed design | 1 LCG | 64 |

Table 4 shows the type and number of gates used in the different parity preserving full adders, together with the required number of transistors. In this table, the transistor count of the full adders presented in [9,10,12,26] can be computed directly from their constituent basic gates. However, to compute the transistor count of the full adders presented in [27,28,34], each of which consists of a single complex gate, an approach similar to the one applied to LCG has been used, based on their output functions. According to Table 4, our proposed full adder, which uses one LCG, requires the minimum number of transistors to be implemented in a low-power CMOS technology.

7. Proposed delay estimation

Among the mentioned criteria, the delay of a reversible circuit can be redefined in order to reflect the differences between similar designs more precisely. The first delay definition was presented in [7] as the maximum number of reversible gates on the paths from the inputs to the outputs of a circuit. More recent works [3,12,15,25,28] have used this definition under the assumptions stated in [7]. In this first delay estimation, which corresponds to the logical depth of a combinational circuit in terms of the number of gates, reversible gates of any size have the same unit delay, which is not realistic because they have different quantum costs and logical operations. Besides, in this definition, all outputs of a gate have the same delay. Thus, in this paper, we propose a more precise delay computation to be used in the evaluation of different designs. We use a fine-grained delay computation in which the quantum primitives, instead of the reversible gates of the previous coarse-grained estimation, constitute the unit delays. Accordingly, we assume that each 2 × 2 quantum primitive (CNOT, V and V+) has one unit delay, so that different gates may have different delays due to the

Table 5. Comparison of different parity preserving full adders.

| Fault-tolerant full adder | No. of garbage outputs | Precise delay (Sum) | Precise delay (Carry) | Quantum cost | Total logical calculation | Transistor count |
|---|---|---|---|---|---|---|
| [26] | 4 | 12 | 11 | 20 | 8α + 16β + 4γ | 64 |
| [9] | 6 | 12 | 11 | 18 | 12α + 8β + 2γ | 96 |
| [10] | 3 | 2 | 6 | 14 | 6α + 4β + 2γ | 80 |
| [27] | 3 | 2 | 6 | 14 | 9α + 7β + 2γ | 128 |
| [34] | 3 | 2 | 6 | 14 | 6α + 5β + 2γ | 98 |
| [12] | 3 | 3 | 4 | 11 | 9α + 3β + 2γ | 72 |
| [28] | 3 | 3 | 4 | 8 | 9α + 3β + 1γ | 104 |
| Proposed design | 3 | 2 | 6 | 10 | 6α + 2β* | 64 |

* Based on Fig. 5a.
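One way to read the total logical calculation column of Table 5 without fixing numerical values for α, β and γ is a coefficient-wise comparison. The sketch below is an illustration added here, not the authors' procedure: the helper `no_worse` and the tuple encoding are assumptions, and the numbers are copied from Table 5.

```python
# Total logical calculation from Table 5 as (alpha, beta, gamma) coefficients.
TLC = {
    "[26]": (8, 16, 4),
    "[9]": (12, 8, 2),
    "[10]": (6, 4, 2),
    "[27]": (9, 7, 2),
    "[34]": (6, 5, 2),
    "[12]": (9, 3, 2),
    "[28]": (9, 3, 1),
    "Proposed": (6, 2, 0),
}


def no_worse(x, y):
    """True if every coefficient of x is less than or equal to that of y."""
    return all(a <= b for a, b in zip(x, y))


proposed = TLC["Proposed"]
dominated = [name for name, cost in TLC.items()
             if name != "Proposed" and no_worse(proposed, cost)]
print(len(dominated) == len(TLC) - 1)  # True
```

Because the proposed design is no worse in every coefficient, its total logical calculation is the smallest for any non-negative choice of α, β and γ.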

Table 6. Comparison of different overflow detection and result correction parts.

| Design | No. of gates | No. of garbage outputs | Simple delay (No. of gates on the critical path) | Precise delay (No. of quantum primitives on the critical path) | Quantum cost | Total logical calculation | Transistor count |
|---|---|---|---|---|---|---|---|
| [8] | 11 | 19 | 6 | 17 | 92 | 50α + 48β + 14γ | 640 |
| [11] | 9 | 17 | 8 | 24 | 75 | 49α + 37β + 14γ | 616 |
| [35] | 13 | 18 | 11 | 18 | 49 | 24α + 10β | 272 |
| [29] | 8 | 2 | 8 | 12 | 28 | 14α + 10β + 2γ | 128 |
| [28] | 4 | 6 | 4 | 12 | 29 | 21α + 8β + 2γ | 280 |
| 1st proposed circuit (Fig. 10) | 7 | 6 | 7 | 14 | 27 | 16α + 8β + 3γ | 160 |
| 2nd proposed circuit (a part of Fig. 12) | 8 | 8 | 7 | 14 | 27 | 22α + 8β + 3γ | 208 |

fact that different numbers of quantum primitives may exist on their critical paths. In addition, the outputs of a gate do not necessarily have the same delay. For example, according to Fig. 2, the P function of TG has no delay, the Q function has a delay of two, and the R function has a delay of three. Therefore, if the outputs of TG are connected to the inputs of other gates in a circuit, different path delays are produced, which should be taken into account in each circuit.

8. Results and discussion

To perform a precise comparison between different reversible logic circuits, several criteria are used, including the number of gates, the number of constant inputs, the number of garbage outputs, delay, quantum cost, and total logical calculation. Among these criteria, the first three affect the latter three. Thus, delay, quantum cost, and total logical calculation are more important in reversible logic circuits. In addition, as we have presented the transistor realization of the proposed LCG and of the other reversible gates, the transistor count can be used as an important criterion, especially when designing the low-power CMOS counterparts of the reversible circuits.

Table 5 presents comparative results for different full adders, including our proposed full adder, based on different criteria. In this table, the designs are ordered by their introduction date, starting from the oldest. The transistor count is presented again to make the comparison more precise and convenient. In Table 5, the delays of the sum and carry outputs are computed according to the proposed precise delay computation, which makes the differences more realistic. That is, the precise delay of a specific output is equal to the maximum number of quantum primitives on the paths from the inputs to that output. Furthermore, the total logical calculation criterion follows the approach presented in Section 4.
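The precise delay defined in Section 7 can be illustrated on TG with a minimal sketch. Two things here are assumptions rather than statements from the paper: the primitive ordering is the textbook five-primitive Toffoli decomposition (Fig. 2 may draw it differently), and the timing rule, in which only the target line of a primitive is delayed, is a modelling choice made because it reproduces the delays quoted above.

```python
# Textbook five-primitive Toffoli decomposition (controls A, B; target C).
# Each entry is (primitive, control line, target line). Assumed ordering.
TOFFOLI_PRIMITIVES = [
    ("V", "B", "C"),
    ("CNOT", "A", "B"),
    ("V+", "B", "C"),
    ("CNOT", "A", "B"),
    ("V", "A", "C"),
]


def precise_delays(primitives, lines):
    """Return the unit-delay arrival time of every line.

    Assumed rule: a 2x2 primitive adds one unit to its target line once both
    its control and its target are ready; the control line itself passes
    through without extra delay.
    """
    arrival = {line: 0 for line in lines}
    for _name, control, target in primitives:
        arrival[target] = max(arrival[control], arrival[target]) + 1
    return arrival


delays = precise_delays(TOFFOLI_PRIMITIVES, ["A", "B", "C"])
# Outputs P, Q and R of TG sit on lines A, B and C, respectively.
print(delays)  # {'A': 0, 'B': 2, 'C': 3}
```

Under these assumptions the sketch returns delays of zero, two and three for P, Q and R, in agreement with the values stated for TG, whereas the simple delay would assign one unit to all three outputs.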
In each column, the bold numbers are the minimum and hence the best values of the corresponding criterion. Thus, our proposed full adder is the best in four out of six criteria, either alone or jointly with some other full adders, whereas the most recent design, ZPLG [28], is the best in three criteria. In addition, our proposed full adder has distinctly the minimum total logical calculation, equal to 6α + 2β, besides requiring the minimum transistor count of 64. It is worth mentioning that, as an RCA is composed of full adders, the obtained results are applicable to RCAs as well.

Before comparing different parity preserving BCD adders, the overflow detection and result correction parts of the BCD adders are compared, as shown in Table 6. As stated before, in this paper a combined structure is proposed for joint overflow detection and result correction. However, among previous works, only [28,29] have presented a combined structure

Table 7. Comparison of different parity preserving BCD adders.

| 1-digit BCD adder | No. of gates | No. of garbage outputs | Simple delay | Precise delay | Quantum cost | Total logical calculation | Transistor count |
|---|---|---|---|---|---|---|---|
| [8] | 23 | 36 | 10 | 29 | 148 | 82α + 72β + 22γ | 1088 |
| [11] | 13 | 29 | 11 | 34 | 131 | 85α + 65β + 22γ | 1128 |
| [35] | 30 | 40 | 23 | 35 | 118 | 58α + 24β | 656 |
| [29] | 12 | 14 | 12 | 25 | 84 | 50α + 38β + 10γ | 640 |
| [28] | 8 | 18 | 8 | 19 | 61 | 57α + 20β + 6γ | 696 |
| 1st proposed adder (Fig. 11) | 11 | 18 | 11 | 27 | 67 | 40α + 16β + 3γ | 416 |
| 2nd proposed adder (Fig. 12) | 12 | 20 | 11 | 23 | 59 | 58α + 20β + 7γ | 624 |
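The entries of Table 7 for the RCA-based designs can be cross-checked against Tables 5 and 6, since a one-digit BCD adder of this kind consists of a four-bit RCA (four full adders) plus an overflow detection and result correction part. The sketch below is an illustrative consistency check; the function `bcd_cost` is hypothetical, and the pairing of the second proposed adder with the ZPLG-based full adder is an assumption inferred from the figures adding up, not a statement taken from the text.

```python
# (quantum cost, transistor count) of one full adder, from Table 5.
FULL_ADDER = {"LCG": (10, 64), "ZPLG": (8, 104)}

# (quantum cost, transistor count) of the correction part, from Table 6.
CORRECTION = {"[28]": (29, 280), "1st proposed": (27, 160), "2nd proposed": (27, 208)}

# Full adder used by each design (the 2nd proposed entry is an assumption).
ADDER_USED = {"[28]": "ZPLG", "1st proposed": "LCG", "2nd proposed": "ZPLG"}


def bcd_cost(design):
    """Cost of a one-digit BCD adder: four full adders plus the correction part."""
    fa_qc, fa_tr = FULL_ADDER[ADDER_USED[design]]
    part_qc, part_tr = CORRECTION[design]
    return part_qc + 4 * fa_qc, part_tr + 4 * fa_tr


for design in CORRECTION:
    print(design, bcd_cost(design))
# [28] (61, 696)
# 1st proposed (67, 416)
# 2nd proposed (59, 624)
```

The computed pairs agree with the quantum cost and transistor count columns of Table 7 for these three designs.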

Table 8. Comparison of different parity preserving CSK adders.

| 4-bit carry skip adder | No. of gates | No. of garbage outputs | Simple delay | Precise delay | Quantum cost | Total logical calculation | Transistor count |
|---|---|---|---|---|---|---|---|
| [26] | 24 | 23 | 10 | 23 | 120* | 48α + 96β + 24γ | 384 |
| [10] | 14 | 19 | 5 | 13 | 80 | 40α + 28β + 16γ | 448 |
| [12] | 21 | 16 | 7 | 12 | 66 | 46α + 28β + 12γ | 368 |
| 1st proposed adder (Fig. 15) | 9 | 17 | 5 | 13 | 62 | 38α + 20β + 8γ | 368 |
| 2nd proposed adder | 9 | 17 | 5 | 9 | 54 | 50α + 24β + 12γ | 528 |

* With the corrected fan-out it requires the quantum cost of 122.

which led to better results as perceived from Table 6. The circuit proposed in [35] is newer than the circuit presented in [29] with respect to its introduction date. However, its results are shown in Table 6 because it has achieved some better results compared to the other preceding designs. According to this table, both proposed circuits require the minimum quantum cost of 27 among all existing designs. In addition, in Table 6 and the following tables two different circuit delays are presented. The first one, called simple delay, is in accordance with the delays presented in the previous works, and equals the number of gates on the critical path. The second is the precise delay, which depends on the number of quantum primitives on the critical path, as stated before.

The comparative results for different one-digit parity preserving BCD adders, including the proposed designs, are presented in Table 7. To perform a fair comparison, only the BCD adders that incorporate an RCA as their internal adder, as the proposed BCD adders do, are chosen. In Table 7, the bold numbers are the minimum values in each column. As stated before, the delay and the implementation cost are the more important criteria. Regarding Table 7, the design presented in [28] has the minimum delay, the first proposed BCD adder (Fig. 11) requires the minimum total logical calculation and transistor count, and the second proposed BCD adder (Fig. 12) has the minimum quantum cost among all existing designs. In other words, the first proposed BCD adder distinctly requires the minimum total logical calculation, equal to 40α + 16β + 3γ, and the minimum transistor count of 416. It is worth mentioning that the second proposed BCD adder is the second best design with respect to both the transistor count (624) and the precise delay. Moreover, the second proposed BCD adder achieves a quantum cost of 59, which is lower than that of the best previous design, presented in [28].

Table 8 presents the comparative results for different four-bit parity preserving CSK adders. The CSK adder presented in [26], the oldest parity preserving design, requires a quantum cost of 122 after correcting the fan-out for the input carry. The design in [10] requires quantum costs of 56 and 24 for the incorporated four-bit RCA and the carry skip logic, respectively. It uses eight MIGs for the four-bit RCA, and two F2Gs and four NFT gates for the carry skip logic. In [12], a parity preserving CSK adder is presented that, in spite of using more gates, reduces the precise delay, quantum cost, and transistor count compared to [10]. This CSK adder uses one F2G and four FRGs to realize the carry skip logic with a quantum cost of 22. In contrast, our proposed parity preserving CSK adders require only nine gates, of which one F2G and four NFT gates realize the carry skip logic with a quantum cost of 22. The remaining gates are four LCGs in the first proposed CSK adder and four ZPLGs in the second. In other words, the second proposed CSK adder uses ZPLG instead of LCG.
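The gate breakdowns quoted above can be turned into a small quantum cost calculation. This is a sketch under stated assumptions: the per-gate costs in `GATE_QC` are the values implied by the totals in the text and by Table 5 (for instance, eight MIGs costing 56 gives 7 per MIG), the gate list for [12] assumes four full adders of three F2Gs and one NFT each as in Table 4, and all names are illustrative.

```python
# Per-gate quantum costs, consistent with the totals quoted in the text
# and with Table 5 (assumed values, not restated explicitly in this section).
GATE_QC = {"F2G": 2, "FRG": 5, "NFT": 5, "MIG": 7, "LCG": 10, "ZPLG": 8}

# Gate inventory of each four-bit CSK adder: RCA gates plus carry skip logic.
CSK_DESIGNS = {
    "[10]": {"MIG": 8, "F2G": 2, "NFT": 4},
    "[12]": {"F2G": 12 + 1, "NFT": 4, "FRG": 4},
    "1st proposed": {"LCG": 4, "F2G": 1, "NFT": 4},
    "2nd proposed": {"ZPLG": 4, "F2G": 1, "NFT": 4},
}


def quantum_cost(gates):
    """Sum of per-gate quantum costs over a gate inventory."""
    return sum(GATE_QC[name] * count for name, count in gates.items())


def gate_count(gates):
    """Total number of gates in an inventory."""
    return sum(gates.values())


for design, gates in CSK_DESIGNS.items():
    print(design, gate_count(gates), quantum_cost(gates))
# [10] 14 80
# [12] 21 66
# 1st proposed 9 62
# 2nd proposed 9 54
```

Both the gate counts and the quantum costs obtained this way agree with the corresponding columns of Table 8.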


Table 9. Comparison of different parity preserving CLAs.

| Carry look-ahead adder | No. of gates | No. of garbage outputs | Simple delay | Quantum cost | Total logical calculation | Transistor count |
|---|---|---|---|---|---|---|
| 2-bit [10] | 4 MIG + 10 F2G + 5 NFT = 19 | 28 | 12 | 73 | 47α + 23β + 14γ | 440 |
| 1st proposed 2-bit (Fig. 16) | 2 LCG + 7 F2G + 5 FRG = 14 | 20 | 9 | 59 | 36α + 24β + 5γ | 320 |
| 2nd proposed 2-bit | 2 ZPLG + 7 F2G + 5 FRG = 14 | 20 | 9 | 54 | 42α + 26β + 7γ | 400 |
| 1st proposed 4-bit (Fig. 16) | 4 LCG + 18 F2G + 14 FRG = 36 | 56 | 21 | 146 | 88α + 64β + 14γ | 768 |
| 2nd proposed 4-bit | 4 ZPLG + 18 F2G + 14 FRG = 36 | 56 | 21 | 138 | 100α + 68β + 18γ | 928 |
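How the proposed CLAs of Table 9 scale from two to four bits can be checked with a short sketch. It is an illustration only: the dictionaries simply restate Table 9, and the helper `growth` is a hypothetical name introduced here.

```python
# Table 9 figures for the proposed CLAs: (gates, quantum cost, transistors).
PROPOSED_CLA = {
    "1st": {"2-bit": (14, 59, 320), "4-bit": (36, 146, 768)},
    "2nd": {"2-bit": (14, 54, 400), "4-bit": (36, 138, 928)},
}


def growth(two_bit, four_bit):
    """Ratio of each four-bit figure to its two-bit counterpart."""
    return tuple(f / t for t, f in zip(two_bit, four_bit))


for name, sizes in PROPOSED_CLA.items():
    ratios = growth(sizes["2-bit"], sizes["4-bit"])
    print(name, [round(r, 2) for r in ratios])
# 1st [2.57, 2.47, 2.4]
# 2nd [2.57, 2.56, 2.32]
```

Every ratio exceeds two, so doubling the operand width more than doubles the gate count, quantum cost and transistor count of these look-ahead designs.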

Regarding Table 8, where the bold numbers are the best values in each column, both proposed CSK adders are the best in four out of seven criteria, either alone or jointly with some other designs. Among the more important criteria, the first proposed adder requires the minimum total logical calculation and the minimum transistor count. In addition, its quantum cost is lower than that of all previous designs. Furthermore, the second proposed adder requires the minimum quantum cost and incurs the minimum precise delay among all existing designs. In general, the proposed CSK adders together achieve the best values in all criteria except the number of garbage outputs compared to the previous designs.

Table 9 presents the comparative results for different parity preserving carry look-ahead adders. In this table, the number of gates of each type is shown in the column titled “No. of gates” to clarify the differences between the two-bit and four-bit adders. The only existing parity preserving CLA, presented in [10] with a size of two bits, requires a quantum cost of 73 using 19 gates. However, based on Table 9, our proposed two-bit parity preserving CLAs, composed of 14 gates, are better in all criteria compared to that of [10]. It should be noted that in Table 9, the second proposed adders differ from the first proposed adders in using ZPLG instead of LCG. Among the proposed CLAs, the first proposed two-bit (four-bit) adder requires the minimum total logical calculation and minimum transistor count, and the second proposed two-bit (four-bit) adder requires the minimum quantum cost among two-bit (four-bit) adders. In addition, the four-bit adders require more than twice the number of gates, total logical calculation, transistor count, and quantum cost of their two-bit counterparts, which is usually true for CLAs, including irreversible circuits.

9. Conclusion and future work

In this paper, some novel fault-tolerant reversible adders were presented with the aim of being both low-cost and parity preserving. First, a new low-cost reversible gate was proposed to be used as a parity preserving full adder. Then, by combining new arrangements of the existing parity preserving reversible gates with the proposed full adder, new low-cost parity preserving adder architectures were presented, including RCA, CLA, BCD adder and CSK adder. In addition, by investigating the logical operations that appear in the outputs of the parity preserving gates, together with the properties of the CMOS transmission gate, an efficient approach was illustrated for the transistor realization of the parity preserving full adders and the basic gates, which in turn yields the transistor realization of different reversible adder architectures. This type of implementation can be used in low-power CMOS circuits as well. Furthermore, a more precise delay computation was proposed and used in the evaluation of different reversible adders. The comparison of the proposed parity preserving adders with the existing designs showed that the new adders are highly efficient in different criteria, especially in quantum cost, total logical calculation and transistor count. For future work, the proposed designs and approaches can be extended to design different types of low-cost and fault-tolerant reversible multipliers.

References

[1] Landauer R. Irreversibility and heat generation in the computational process. IBM J Res Dev 1961;5:183–91.
[2] Bennett CH. Logical reversibility of computation. IBM J Res Dev 1973:525–32.
[3] Pouraliakbar E, Haghparast M, Navi K. Novel design of a fast reversible Wallace sign multiplier circuit in nanotechnology. Microelectron J 2011;42:973–81.
[4] Perkowski M, Al-Rabadi A, Kerntopf P, Buller A, Chrzanowska-Jeske M, Mishchenko A, et al. A general decomposition for reversible logic. In: Proc. RM; 2001. p. 119–38.
[5] Islam MS, Islam R. Minimization of reversible adder circuits. Asian J Inf Technol 2005;4(12):1146–51.
[6] Babu HMH, Chowdhury AR. Design of a compact reversible binary coded decimal adder circuit. J Syst Arch 2006;52(5):272–82.
[7] Biswas AK, Hasan MM, Chowdhury AR, Babu HMH. Efficient approaches for designing reversible binary coded decimal adders. Microelectron J 2008;39:1693–703.
[8] Islam MS, Begum Z. Reversible logic synthesis of fault tolerant carry skip BCD adder. J Bangladesh Acad Sci 2008;32(2):193–200.
[9] Haghparast M, Navi K. Design of a novel fault tolerant reversible full adder for nanotechnology based systems. World Appl Sci J 2008;3(1):114–18.
[10] Islam MS, Rahman MM, Begum Z, Hafiz MZ. Fault tolerant reversible logic synthesis: carry look-ahead and carry-skip adders. In: Intl. Conf. advances in computational tools for engineering applications (ACTEA); 2009. p. 396–401.
[11] Haghparast M. Design and implementation of nanometric fault tolerant reversible BCD adder. Aust J Basic Appl Sci 2011;5(10):896–901.
[12] Mitra SK, Chowdhury AR. Minimum cost fault tolerant adder circuits in reversible logic synthesis. In: 25th IEEE Intl. Conf. VLSI design (VLSID); 2012. p. 334–9.

[13] Islam MS, Rahman MM, Begum Z, Hafiz MZ. Low cost quantum realization of reversible multiplier circuit. Inf Technol J 2009;8:208–13.
[14] Haghparast M, Mohammadi M, Navi K, Eshghi M. Optimized reversible multiplier circuit. J Circ Syst Comput 2009;18(2):311–23.
[15] Moghadam MZ, Navi K. Ultra-area-efficient reversible multiplier. Microelectron J 2012;43:377–85.
[16] Babazadeh S, Haghparast M. Design of a nanometric fault tolerant reversible multiplier circuit. J Basic Appl Sci Res 2012;2(2):1355–61.
[17] Bhardwaj K, Deshpande BM. K-Algorithm: an improved Booth’s recoding for optimal fault-tolerant reversible multiplier. In: 26th Int. Conf. VLSI design (VLSID); 2013. p. 362–7.
[18] Maslov D, Dueck GW. Reversible cascades with minimal garbage. IEEE Trans CAD Integr Circuits Syst 2004;23(11):1497–509.
[19] Feynman R. Quantum mechanical computers. Optics News 1985;11:11–20.
[20] Toffoli T. Reversible computing. Tech. memo MIT/LCS/TM-151. MIT Lab for Comp. Sci.; 1980.
[21] Peres A. Reversible logic and quantum computers. Phys Rev 1985;32:3266–76.
[22] Fredkin E, Toffoli T. Conservative logic. Int J Theor Phys 1982;21:219–53.
[23] Parhami B. Fault-tolerant reversible circuits. 40th Asilomar Conf. signals, systems, and computers, Pacific Grove, CA, October; 2006.
[24] Haghparast M, Navi K. A novel fault tolerant reversible gate for nanotechnology based system. Am J Appl Sci 2008;5(5):519–23.
[25] Islam MS, Rahman MM, Begum Z, Hafiz MZ, Mahmud AA. Synthesis of fault tolerant reversible logic circuits. IEEE Intl. Conf. testing and diagnosis (ICTD), April; 2009.
[26] Bruce JW, Thornton MA, Shivakumaraiah L, Kokate PS, Li X. Efficient adder circuits based on a conservative reversible logic gate. In: IEEE computer society annual Symp. on VLSI (ISVLSI); 2002. p. 74–9.
[27] Dastan F, Haghparast M. A novel nanometric fault tolerant reversible divider. Int J Phys Sci 2011;6(24):5671–81.
[28] Zhou R-G, Li Y-C, Zhang M-Q. Novel designs for fault tolerant reversible binary coded decimal adders. Int J Electron 2014;101(10):1336–56.
[29] Haghparast M, Shams M. Optimized nanometric fault tolerant reversible BCD adder. Res J Appl Sci Eng Technol 2012;4(9):1067–72.
[30] Babu HMH, Jamal L, Saleheen N. An efficient approach for designing a reversible fault tolerant n-bit carry look-ahead adder. In: 26th IEEE Intl. SOC Conf. (SOCC); 2013. p. 98–103.
[31] Van Rentergem Y, De Vos A. Optimal design of a reversible full adder. Int J Unconv Comput 2005;1:339–55.
[32] Zhou R, Shi Y, Wanga H, Cao J. Transistor realization of reversible “ZS” series gates and reversible array multiplier. Microelectron J 2011;42:305–15.
[33] Koller J, Athas W. Adiabatic switching, low energy computing, and the physics of storing and erasing information. In: Workshop on physics and computation (PhysComp); 1992. p. 267–70.
[34] Qi X, Chen F, Zuo K, Guo L, Luo Y, Hu M. Design of fast fault tolerant reversible signed multiplier. Int J Phys Sci 2012;7(17):2506–14.
[35] Saligram R. Design and implementation of logical cost efficient nanometric fault tolerant reversible BCD adder. Annual IEEE India Conf. (INDICON); 2013.


Mojtaba Valinataj received his B.Sc., M.Sc. and Ph.D. degrees in computer engineering from the University of Tehran, Tehran, Iran, in 2000, 2003 and 2010, respectively. He has been a faculty member at Babol University of Technology, Babol, Iran, since 2010. His research interests include fault-tolerant system design, on-chip networks, computer arithmetic, many-core systems, and reversible logic design.

Mahboobeh Mirshekar received her B.Sc. and M.Sc. degrees in computer engineering from Shahed University, Tehran, Iran, and Babol University of Technology, Babol, Iran, in 2011 and 2015, respectively. Her research interests include quantum computing, reversible logic design and fault-tolerant system design.

Hamid Jazayeri received his B.Sc. and M.Sc. degrees in computer engineering from the University of Tehran and Isfahan University, Iran, in 1996 and 2000, respectively, and his Ph.D. from University Putra Malaysia in 2011. He has been a faculty member at Babol University of Technology, Babol, Iran, since 2002. His research interests include artificial intelligence, graph theory and reversible logic.