Stateful-NOR based reconfigurable architecture for logic implementation


Microelectronics Journal 46 (2015) 551–562

Contents lists available at ScienceDirect

Microelectronics Journal journal homepage: www.elsevier.com/locate/mejo

Pravin Mane*, Nishil Talati, Ameya Riswadkar, Ramesh Raghu, C.K. Ramesha

Department of Electrical, Electronics & Instrumentation Engineering, BITS Pilani, K.K. Birla Goa Campus, Zuarinagar, Goa 403726, India

Article info

Article history: Received 22 July 2014; received in revised form 25 March 2015; accepted 30 March 2015.

Keywords: Memristor; IMPLY logic; Stateful-NOR; Reconfigurable logic; FPGA

Abstract

Most commercial Field Programmable Gate Arrays (FPGAs) are limited in density, speed, configuration overhead and power consumption, mostly because of the SRAM cells used in Look-Up Tables (LUTs), configuration memory and programmable interconnects. In addition, the hardwired Application Specific Integrated Circuit (ASIC) blocks that FPGAs include for high performance arithmetic reduce the area available for reconfiguration. In this paper, we propose a novel generalized hybrid CMOS-memristor architecture that uses stateful-NOR gates as the basic building blocks for implementing logic functions. The logic functions are implemented on memristor nanocrossbar layers, while the CMOS layer is used for selection and connection of memristors. The proposed pipelined architecture combines features of ASIC, FPGA and microprocessor based designs. It offers high density, owing to the nanocrossbar layer, and high throughput, especially for arithmetic circuits. The proposed architecture for a three-input, one-output logic block is compared with a conventional LUT based Configurable Logic Block (CLB) having the same number of inputs and outputs; the comparison shows a 1.82× area saving, a 1.57× speedup and 3.63× lower power consumption. An automation algorithm to implement any logic function using the proposed architecture is also presented. © 2015 Elsevier Ltd. All rights reserved.

1. Introduction

ASIC design suffers from rapidly rising cost, long development time and inflexibility after fabrication, while microprocessor based design suffers from energy inefficiency and low performance. FPGA based design, which combines some of the best features of both, has therefore emerged as an attractive alternative in a variety of applications [1,2]. FPGAs are highly flexible and cost-effective, require a short application development time and can be customized at fine grain, but at the cost of programming overhead, configuration time and lower density. Attempts have been made to narrow the performance gap between ASIC and FPGA in terms of speed, area (density) and power consumption [3–8]. Specifically, monolithic stacking in 3D FPGAs has shown improvements in density, speed and power consumption [9,10]. SRAM based FPGA designs are very common. However, as the minimum dimensions of CMOS transistors shrink to 90 nm and below, problems such as high leakage current will prevent SRAM from maintaining its state for a very long time, which is important in FPGA

* Corresponding author. Mobile: +91 9657711842. Tel.: +91 832 2580 255 (Office). E-mail address: [email protected] (P. Mane).

http://dx.doi.org/10.1016/j.mejo.2015.03.021
0026-2692/© 2015 Elsevier Ltd. All rights reserved.

(specifically configuration data) [11,12]. As a consequence, researchers are looking for new devices to replace SRAM [13]. Recently the memristor, a new electrical element, has emerged as an attractive alternative to SRAM based memories. Material implication, a fundamental logical operation, has also been realized experimentally using memristors, and it can be used to implement logic functions [14]. The ability of a memristive device to perform logical operations and to retain its resistance value even when the power is switched off makes it a suitable candidate for future technologies [15]. Since the memristor is a passive element, crossbar arrays of memristive devices must be complemented with conventional CMOS arrays, which provide energy, signal restoration and gain [16]. 3D CMOS-memristor hybrid architectures combine the advantages of nanotechnology and 3D integration. Such architectures are reported in [6,17–21]. The memristor is a two-terminal passive circuit element, first introduced theoretically by Leon Chua in 1971 on the basis of symmetry [22]. Of the six possible one-to-one relationships between the four circuit variables (voltage v, current i, charge q and flux φ), five were well defined and one, between q and φ, was missing. Chua claimed that this missing relationship represents a fourth passive element, named the memristor (Fig. 1). He also showed that memristor behavior cannot be realized by interconnecting existing passive elements without an active source. The memristance (resistance of memristor) M of device at any instant of


2. Background

2.1. Memristor and CMOS-memristor hybrid circuits

Fig. 1. Memristor definition as missing relationship between q and φ postulated by Leon Chua.

time depends on the past history of charge through it. Thus it acts as a resistive memory. The memristor shows hysteretic behavior under sinusoidal excitation, a property of memory devices. The fabrication of a memristor at HP Labs, using TiO2 sandwiched between platinum plates and partly doped with oxygen deficiencies to form TiO2−x, together with the linear ion drift model [23], accelerated research on memristor related applications. Practical memristors, however, are mostly nonlinear [24]. They change their memristance only when the excitation crosses a threshold value. Because the dielectric solid state material is only nanometers thick, even small voltages produce a very strong electric field; memristors therefore show two extreme values of resistance (depending on the polarity of the applied voltage) and can store binary information in the form of resistance. Such devices can be used as memory elements [25,8], programmable interconnect switches [6], logic elements [26] and latches [27]. The immediate applications of memristors are in the design of high density memories, reconfigurable designs (FPGAs) and neuromorphic systems [28–31,20]. The ideal memristor model suggested by Strukov et al. [23] does not account for all effects observed in practical devices [32,33]. In fact, ideal memristors have never been fabricated. Two classes of redox-based bipolar resistive switches are considered key components for future memories and logic circuits [36,37]: electrochemical metallization (ECM) or conductive bridge memory (CBRAM), based on the formation of Cu or Ag filaments [34], and valence change mechanism (VCM) devices [35], based on the formation of oxygen-deficient filaments in a transition metal oxide.
More realistic models of such memristors are given by Pickett's physics-based nonlinear memristor model [38,39] and models derived from it in comparison to the linear memristor models based on Strukov's initial memristor model extended by different window functions [40,32]. The two terminal memory devices based on resistive switching are nothing but memristors [41]. Resistive switching memories have been reported in [29,42,43]. Because of its ability to modify memristance by changing value and/or direction of excitation, it can be used for computations. Thus, there is a scope for in-memory calculations in resistive memories. This is beyond von Neumann architecture [44] and it will be key to tackle memory bottleneck problem in many applications. In applications like memory and computational device, important characteristics of memristor include high speed, low power, good scalability, data retention, endurance, compatibility with conventional CMOS in terms of manufacturing, operating voltages and of course nonlinearity [24].

The resistance of a memristor depends on the amount and direction of charge passed through it. It retains its resistance when it is unpowered, or when the current through it and/or the voltage across it stays within its threshold limits. Its I–V characteristic shows a pinched hysteresis loop, the signature property of memory elements. Hence, it can be used as a nonvolatile memory. Memristors are bistable devices that can be programmed between ON and OFF states. In this work, we use the current threshold based TEAM model with the Kvatinsky window, since it is computationally efficient, highly nonlinear and sufficiently accurate [45]. It can model a real memristor if appropriate parameter values are chosen; hence, this model is adaptive in nature, like other models given in [46]. The current threshold levels are given by ION and IOFF. Memristors are ideal for realizing functions that need several transistors in a CMOS circuit, namely a configuration bit, a flip-flop and the associated data-routing multiplexer [47]. Hybrid circuits combining CMOS and memristor technologies are fully compatible in terms of both materials and processing techniques [29], and have been fabricated successfully by integrating memristor-based crossbars onto a foundry-built CMOS platform using nanoimprint lithography (NIL) [47]. Multiple nanocrossbar layers are placed over a CMOS layer. The conceptual diagram of a nanocrossbar layer is shown in Fig. 2. A single memristor from such a nanocrossbar layer can be accessed as shown in Fig. 4. Nanocrossbar layers have excellent scaling potential (10 GB/cm²) and exhibit high yield (99%), fast programming speed (in ps) [48–50], a high ON/OFF ratio (10³), long endurance (10⁶) and retention time (5 months) [29], low energy (1 pJ/operation), multiple-state operation, scalability and stackability [51].

2.2. Reconfigurable circuits

The use of memristors in programmable logic has shown promising improvements over LUT based FPGAs.
In [52,53], programmable interconnects over a CMOS logic block have shown improvements in density and delay. The CMOL FPGA architecture in [5] showed that reconfiguration can be done at the nanowire layer with extremely high density. The Field Programmable Nanowire Interconnect (FPNI) architecture proposed in [6] improves on the CMOL architecture by lifting the configuration bit and associated components out of the semiconductor plane and replacing them in the interconnect with nonvolatile switches. This decreases both the area and the power consumption of the circuit, with advantages such as simpler fabrication, more conservative process parameters and greater flexibility in the choice of nanoscale devices [6,7]. Tu et al. proposed a modified CMOL 3D architecture that sandwiches the nanolayer between CMOS dies to improve pin usage and density [17]. Stateful logic operations [54,26] can be performed in nanocrossbar memory [55]. A capacitive keeper circuit is used in [56] to improve the energy efficiency, for any fan-in and fan-out, of logic functions implemented with stateful logic. The Field Programmable Stateful Logic Array (FPSLA) uses the basic AND operation proposed by Kim et al., and many such concurrent operations, to build OR or NOR gates with large fan-in [8]. These units are connected through reconfiguration at the nanolayer to implement any functionality in a pipelined fashion. The reconfigurable stateful-NOR gate proposed in [57] is used as the basic building block in our architecture. The proposed architecture has 3 levels independent of the function, and the number of basic operations in one pipeline cycle is fixed.

Fig. 2. Conceptual representation of nanocrossbar layer.

Fig. 3. Selecting nanowire segment from CMOL architecture.

Fig. 4. Accessing single memristor [58].

2.3. Implication logic

Implication logic is the most natural operation that can be implemented using memristors. The truth table and circuit diagram of IMPLY logic are given in Table 1 and Fig. 5 respectively.

Table 1
IMPLY gate truth table.

P   Q   P→Q
0   0   1
0   1   1
1   0   0
1   1   1

Fig. 5. IMPLY gate using memristor.

In an IMPLY logic implementation using memristors, the inputs and outputs are in the form of resistances. Logic ‘0’ refers to ROFF (high resistance state, HRS) and logic ‘1’ refers to RON (low resistance state, LRS). The inputs are written to the memristors P and Q (Fig. 5) and the final output is stored in memristor Q. The output memristor (which in this case is also one of the input memristors) toggles only if both inputs are logic ‘0’. Also, only a ‘0’ to ‘1’ transition is possible. These observations are used in the design of the NOR gate, which is the basic building block in our design (refer to Appendix A for a mathematical analysis of IMPLY logic).
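The two observations about the IMPLY gate (Q toggles only for P = Q = ‘0’, and only from ‘0’ to ‘1’) can be checked with a few lines of Python. This is a purely logical sketch: it models the truth table of Table 1, not the memristor circuit of Fig. 5, and the function names are illustrative rather than taken from the paper.

```python
def imply(p: int, q: int) -> int:
    """Material implication P -> Q; in the memristor gate the result overwrites Q."""
    return int((not p) or q)


def imply_truth_table():
    """Rows (P, Q, P->Q) in the same order as Table 1."""
    return [(p, q, imply(p, q)) for p in (0, 1) for q in (0, 1)]


def toggling_inputs():
    """Input pairs for which the value stored in Q changes."""
    return [(p, q) for p, q, out in imply_truth_table() if out != q]


if __name__ == "__main__":
    for row in imply_truth_table():
        print(*row)
    print("Q toggles for:", toggling_inputs())
```

Because the result is written back into Q, listing the rows where the output differs from Q is enough to see which input pair switches the output memristor.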

3. Generalized architecture for logic function implementation

The work carried out in this paper is based on a NOR block consisting of (n + 1) memristors, where n is the number of inputs, constrained by the fan-in of the gate. Note that it is not possible to isolate a single memristor from the crossbar array (Fig. 2). Instead, it is accessed from the array using CMOS cells and mutually perpendicular nanowires (Fig. 4). Thus, the sneak path problem needs to be handled very carefully. In the proposed architecture, the CMOS layer is used for memristor selection and connection, and for writing to and reading from the memristors. The nanocrossbar layers are used for implementing logic functions. Interconnects are not required at the nanocrossbar layer in this architecture.

3.1. The basic building block

The schematic of the implication based n-input stateful-NOR gate is shown in Fig. 6. It consists of n input memristors and one evaluation memristor. The following steps are involved in implementing the NOR operation:

1. Initialization phase: [(M1, M2, …, Mn) ← data] – write data to the input memristors; (MEval ← clear) – clear the evaluation memristor. Switches are controlled by the data input to each input memristor (‘0’ or ‘1’) in Fig. 6, so that the correct data is written to all input memristors and logic ‘0’ to the evaluation memristor simultaneously. The write control signal (WRC) is active during this phase. WR0 and WR1 are used to write logic ‘0’ and ‘1’ respectively using the write circuit.


Fig. 6. Stateful-NOR logic circuit.

Fig. 7. NOR logic block symbol.

Fig. 8. NOR block mapping to CMOL Scheme-1.

2. Evaluation phase: [V(M1, M2, …, Mn) ← VCOND], [V(MEval) ← VSET] – evaluate the logic function. The lower terminals of the input memristors and the evaluation memristor are connected to ground through RG in Fig. 6. WRC is inactive during this phase.
3. Read phase: (MEval → VRead) – read the resistance of the evaluation memristor in the form of a voltage. The evaluation memristor is connected to ground through RS to read its resistance value.

Note that the evaluation memristor, initialized to logic ‘0’ in step 1, toggles only if all the inputs are logic ‘0’, which is the characteristic of a NOR gate. The block shown in Fig. 7 is the symbolic representation of the n-input stateful-NOR used in the subsequent description, where n is the number of inputs of the NOR gate, WRC is the write control signal, RDC is the read control signal (which applies the read voltage VR to the evaluation memristor) and DO is the output of the read operation. During evaluation, the EVAL control signal (not shown explicitly in the block) applies VCOND to the n input memristors and VSET to the evaluation memristor of the NOR block.

There are two possible schemes to map the NOR block to the CMOL architecture, as shown in Figs. 8 and 9 [59]. In scheme-1 (Fig. 8), all memristors on a single vertical nanowire segment V and successive horizontal nanowire segments H1, H2, H3, etc. are used for the NOR block. This method requires separate write cycles for writing ‘0’s and ‘1’s, and hence one extra clock cycle per operation. It is more compact, and it does not need the additional transistors that scheme-2 requires for shorting memristors in the evaluation phase. In scheme-2 (Fig. 9), the memristors at the crosspoints of successive horizontal and vertical nanowire segments (V1, H1), (V2, H2), (V3, H3), etc. are used to map the NOR block. In this method ‘0’s and ‘1’s can be written simultaneously, so one clock cycle fewer is required per operation. Additional speedup can be achieved in this scheme by using (Vn+1, Hn+1) and (Vn+2, Hn+2) alternately as the evaluation memristor. As one of these is isolated from the NOR block after the evaluation phase for the read operation, the other can be used in the NOR block to write another set of data simultaneously with the read operation on the previous evaluation memristor. This scheme, however, requires more area and additional transistors to short memristors in the evaluation phase.
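The three phases of the stateful-NOR block can be mimicked with a small behavioural model. The sketch below is an idealisation and not the authors' circuit: it ignores the voltages VCOND, VSET and VR and keeps only the rule that the evaluation memristor switches to RON when every input memristor is in the ROFF state. The class and method names are hypothetical; the resistance values are those listed in Table 2.

```python
R_ON = 100.0     # logic '1', low resistance state (value from Table 2)
R_OFF = 200e3    # logic '0', high resistance state (value from Table 2)


class StatefulNor:
    """Idealised n-input stateful-NOR: n input memristors plus one evaluation memristor."""

    def __init__(self, n: int):
        self.inputs = [R_OFF] * n
        self.m_eval = R_OFF

    def write(self, bits):
        """Initialization phase: store the data and clear the evaluation memristor."""
        if len(bits) != len(self.inputs):
            raise ValueError("expected one bit per input memristor")
        self.inputs = [R_ON if b else R_OFF for b in bits]
        self.m_eval = R_OFF

    def evaluate(self):
        """Evaluation phase: M_Eval is set only if every input is in the OFF state."""
        if all(r == R_OFF for r in self.inputs):
            self.m_eval = R_ON

    def read(self) -> int:
        """Read phase: report the state of M_Eval as a logic level."""
        return 1 if self.m_eval == R_ON else 0


def nor(bits) -> int:
    """Run one write/evaluate/read sequence on a fresh block."""
    block = StatefulNor(len(bits))
    block.write(bits)
    block.evaluate()
    return block.read()
```

Calling `nor((0, 0, 0))` walks through all three phases for a 3-input block; any input set to 1 leaves the evaluation memristor in its cleared state.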

Fig. 9. NOR block mapping to CMOL Scheme-2.

For a fixed resistance RG (used during the evaluation phase) in Fig. 6, the constraint on the fan-in of the NOR gate can be determined as follows:

$$V_{SET} - \left(\frac{V_{COND}\,R_G}{R_G + \dfrac{R_{OFF}}{n}}\right) \geq I_{ON}\,R_{OFF} \tag{1}$$

Hence,

$$n \leq \frac{R_{OFF}}{\dfrac{V_{COND}\,R_G}{V_{SET} - I_{ON}\,R_{OFF}} - R_G} \tag{2}$$

where VSET and VCOND are the voltages applied to evaluate IMPLY logic during the evaluation phase. ION and IOFF are the current thresholds in the TEAM model as described in Section 2.1. RON and ROFF are the ON and OFF state resistances of the memristors respectively.

3.2. Common blocks in CMOS layer

Four major logic blocks are used in this architecture to support the logic function implementation in the nanolayer. In the figures used to describe these blocks, T1 and T2 are the terminals between which the memristor(s) would be connected.

3.2.1. Write block

It is used to write data to the input memristors, to write default values to the evaluation and destination memristors, and to write results to the destination memristors after the read phase. The circuit diagram is given in Fig. 12.
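Returning briefly to the fan-in bound of Eq. (2) in Section 3.1, a short numerical check shows how the limit follows from Eq. (1). The operating point below is hypothetical: the paper does not state VSET, VCOND or RG in this section, so those three values are assumptions chosen only for illustration, and I_ON is used as a magnitude. Only ROFF and the 1 μA threshold come from Table 2.

```python
import math


def eq1_holds(n, v_set, v_cond, r_g, r_off, i_on):
    """Eq. (1): with n inputs in the OFF state, M_Eval must still see I_ON * R_OFF."""
    return v_set - v_cond * r_g / (r_g + r_off / n) >= i_on * r_off


def max_fan_in(v_set, v_cond, r_g, r_off, i_on):
    """Eq. (2): largest n that still satisfies Eq. (1)."""
    headroom = v_set - i_on * r_off
    if headroom <= 0:
        raise ValueError("V_SET is too small to switch the evaluation memristor")
    denom = v_cond * r_g / headroom - r_g
    if denom <= 0:
        return math.inf  # Eq. (1) then holds for every n
    return math.floor(r_off / denom)


# Hypothetical operating point: v_set, v_cond and r_g are assumed values.
PARAMS = dict(v_set=1.0, v_cond=0.91, r_g=10e3, r_off=200e3, i_on=1e-6)

if __name__ == "__main__":
    print("maximum fan-in:", max_fan_in(**PARAMS))
```

The helper `eq1_holds` makes it easy to confirm that the returned n satisfies Eq. (1) while n + 1 does not, which is a useful sanity check when trying other assumed voltages.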


Fig. 10. Block diagram to implement 1-bit full adder using the proposed architecture.

Fig. 11. Timing diagram of logic architecture (signals DataIn1–3, WR1–3, EVAL1–3, RD1–3, DInDest and WRDest; intervals twrite, teval and tread).

Fig. 13. Evaluation circuit.

3.2.2. Evaluate block

It applies VCOND to the input memristors and VSET to the evaluation memristor in the evaluation phase. The circuit diagram is given in Fig. 13.

3.2.3. Read block

It reads the state of the evaluation memristor in the form of a voltage in the read phase, and reads the destination memristor if the results are used in subsequent logic functions. The circuit diagram is given in Fig. 14.

Fig. 12. Write circuit.

Fig. 14. Read circuit.

3.2.4. Priority logic block

Notice that when the evaluation memristor of the 3-input NOR block toggles from logic low to high (i.e. all the inputs are logic zero), the evaluation memristors in all the NOR blocks in LEVEL1 toggle. Also, if any of the evaluation memristors in the 2-input NOR blocks toggles, 2 of the evaluation memristors in the 1-input NOR blocks toggle. To filter out these insignificant bits of data, we may use a priority logic (Fig. 15). This dramatically simplifies the logic operations in LEVEL3. The priority logic block takes as inputs all possible outputs of NORs with 3 or fewer inputs (i.e., NOR(A,B,C), NOR(A,B), NOR(A,C), NOR(B,C), NOT(A), NOT(B), NOT(C)), and produces as outputs the minterms m0–m6. Minterm m7 cannot be detected using this arrangement, so it is required to write the default value(s) at the destination memristor(s). The NOR block takes 3 cycles (write, evaluate and read) to execute its function. Had we used x′y′z′, x′y′, y′z′, x′z′, x′, y′ and z′ as the basis for computing functions, we would have required more stages of NOR blocks in LEVEL3, and more stages of NOR blocks mean more processing cycles. There are 256 different functions of 3 inputs. The priority block lies in the CMOS layer and is common to the computation of all these functions. Moreover, only one stage of NOR blocks is required in LEVEL3. Hence, the processing time is optimized.

Fig. 15. PRIORITY circuit (I1-highest priority, I7-lowest priority).

3.2.5. Other blocks

In addition, transmission gates are used to connect/isolate these blocks in different phases. Decoders are used to select nanowires for function implementation.

3.3. Sneak path problem

Since the architecture is implemented on a nanocrossbar, the sneak path problem arises. However, the size of the crossbar to be handled is small, because the nanocrossbar wires are interrupted by vias in the CMOL architecture as shown in Fig. 3 (the horizontal nanowire will be delimited in nanoimprint lithography). To write data to a selected memristor, one end of the memristor is connected to ±VWRITE and the other end to ∓VWRITE as shown in Fig. 12, so the effective voltage is ±2VWRITE. The voltage |VWRITE| is selected such that, when applied across a memristor, it does not cross the threshold (voltage or current), but |2VWRITE| does [60]. VREAD, shown in the read circuit of Fig. 14, is such that the memristor does not cross its threshold limit. The memristors involved in a read operation, and the unselected memristors during a write operation (one end of such a memristor will be floating), will not cross their threshold limits, and thus the sneak path problem can be handled in such small sections of the nanocrossbar.

3.4. Architecture for logic function implementation

The block diagram of the 3-input architecture is shown in Fig. 10 (it is for a 1-bit full adder specifically, but the structure will be similar for any other application). If the logic function has n inputs, the number of NOR blocks in LEVEL1 is given by

$${}^{n}C_{n} + {}^{n}C_{n-1} + \cdots + {}^{n}C_{2} + {}^{n}C_{1} = 2^{n} - 1 \tag{3}$$

where ${}^{n}C_{k}$ represents the binomial coefficient. Each k-input NOR block consists of k input memristors and one evaluation memristor. Hence, the total number of memristors in LEVEL1 of the architecture of an n-input logic function is

$$(n+1)\,{}^{n}C_{n} + n\,{}^{n}C_{n-1} + \cdots + 2\,{}^{n}C_{1} = n\,2^{n-1} + 2^{n} - 1 \tag{4}$$

LEVEL2 consists of the priority circuit. The number of NOR blocks after the priority block(s) in LEVEL3 is equal to the number of destination memristor(s). The number of memristors in these NOR blocks can be calculated as follows:

1. Find the output of the function when all inputs are logic high.
2. The number of input combinations that produce a different output from the above is the number of input memristors of the corresponding NOR block, in addition to the destination memristor(s).

In LEVEL1, the input data is given to the write block, which writes the data onto the memristors in each NOR block being used in LEVEL1. The outputs of these NOR blocks are given as the inputs to the priority circuit. The output of the priority circuit is given as the input to the NOR blocks in LEVEL3. Finally, the data is written to the destination memristor (in terms of its resistance) according to the outputs of the NOR blocks in LEVEL3.

3.5. Timing analysis

The timing diagram for the proposed architecture is shown in Fig. 11. The number after every signal name indicates the LEVEL number of the signal. One clock cycle can be saved at each level if we disconnect the evaluation memristor of each NOR block after the evaluation phase and replace it with another memristor. The timing performance of the logic function implementation is decided by the delays in the write, evaluation and read blocks, and by the switching speed of the memristors. The total delay is given by

$$\begin{aligned}
t_{total} = {} & t_{Write\_LEVEL1} + t_{Evaluate\_LEVEL1} + t_{Read\_LEVEL1}\,(= t_{Write\_LEVEL2}) \\
& + t_{Evaluate\_LEVEL2} + t_{Read\_LEVEL2}\,(= t_{Write\_LEVEL3}) \\
& + t_{Evaluate\_LEVEL3} + t_{Read\_LEVEL3}\,(= t_{Write\_back\_phase})
\end{aligned} \tag{5}$$

where

$$t_{Write\_phase} = t_{Write\_Block} + t_{Memristor\_switching} \tag{6}$$

$$t_{Evaluate\_phase} = t_{Evaluate\_Block} + t_{Memristor\_switching} \tag{7}$$


The parameter C0 used in Eqs. (11) and (12) is given by   C0 ¼

λ

2

þ

   nc λ D ni 2

ð13Þ

where λ is the transition/conduction/insulation region length; nc, nt and ni are the conduction, transition and insulation region concentrations respectively; D is the thickness of the insulating material (TiO2) in the memristor; and γt is the electron generating coefficient in the transition region. The rate of generation of electrons in the transition region is given by

$$\frac{d n_t}{dt} = \gamma_t \left(n_c - n_i(t)\right) E_t(t) \tag{14}$$

Here in Eq. (14), Et(t) is the electric field in the transition region. The parameter β used in Eqs. (11) and (12) is given by  3  2   1 nc 2 nc 1 nc β¼ 1 þ 1 þ 1 ð15Þ 4 ni 3 ni 2 ni

Fig. 16. Turnoff delay as a function of doped length and voltage.

The parameter β has no physical significance. The voltage across the memristor in the write cycle is fixed. However, the voltage in the evaluate cycle varies with the number of inputs to the block as given below:

$$V = V_{SET} - V_{COND}\left(\frac{R_G}{R_G + \dfrac{R_{OFF}}{n-k} \parallel \dfrac{R_{ON}}{k}}\right) \tag{16}$$

where k is the number of memristors in the ON state and n is the number of inputs satisfying Eq. (2). For the maximum value V = Vmax, k = 0. Therefore the minimum switching times for the memristor are given by

$$t_{on,min} = \frac{C_0^{2}}{2\gamma_t\,\beta\,V_{max}\,\lambda} \tag{17}$$

Fig. 17. Turnon delay as a function of doped length and voltage.

and

$$t_{Read\_phase} = t_{Read\_Block} \tag{8}$$

$$t_{Write\_back\_phase} = t_{Write\_Block} + t_{Memristor\_switching} \tag{9}$$

The time taken by a single set of data to execute an operation is given by Eq. (5). When n sets of data must be computed, the time taken need not be n × t_total. Since the operation can be pipelined, the next data can be processed in the (k − 1)th step while the previous data is being processed in the kth step. Thus, the total time for the operation would be

$$t_{pipelined} = t_{total} + (n - 1) \times t_{LEVEL1} \tag{10}$$

where t LEVEL1 ¼ t Write_LEVEL1 þ t Evaluate_LEVEL1 þ t Read_LEVEL1 The delay incurred in CMOS layer blocks can be found by the conventional methods of CMOS logic. The switching delays based on the physical properties of material and process involved in working of a memristor are given by [61] are C 20 t on ¼ 2γ t β V λ where ton is the time for ROFF to RON transition and     nc nc 1 D þ 2C 0 1  D ni ni t off ¼ 2γ t βλV

ð11Þ

ð12Þ

where toff is the time for RON to ROFF transition, V the voltage across memristor.

t off min

    nc nc D þ 2C 0 1  D 1 ni ni ¼ 2γ t βλV max

ð18Þ

Using Eqs. (17) and (18) with the physical parameters given in [61], we calculated the turn-on delay as 200 ps and the turn-off delay as 60 ps, even for an intrinsic concentration of TiO2 of 10¹⁴. (The value of ni for TiO2 is 1 m⁻³, but in [61] it is given as a function of electric field with the condition ni ≪ nc, nc being equal to 8.75 × 10²² m⁻³.) The graphs of turn-on and turn-off delay versus ni are given in Figs. 16 and 17 respectively, using the above equations. In [48] the reported memristor switching time is 120 ps, which is close to the average of the values we have calculated. The set and reset times for tantalum oxide based memristors are 105 and 120 ps respectively [49,50].
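The pipelining argument of Eqs. (5) and (10) can be illustrated with a small calculation. The sketch below assumes that each level is described by three phase delays and that, as Eq. (5) indicates, the read phase of one level coincides with the write phase of the next, so it is counted once. The `LevelDelay` container and the function names are hypothetical, and any delay values passed in are placeholders rather than figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class LevelDelay:
    """Phase delays of one level, in any consistent time unit."""
    write: float
    evaluate: float
    read: float


def level1_delay(l1: LevelDelay) -> float:
    """t_LEVEL1 = t_Write_LEVEL1 + t_Evaluate_LEVEL1 + t_Read_LEVEL1."""
    return l1.write + l1.evaluate + l1.read


def total_delay(l1: LevelDelay, l2: LevelDelay, l3: LevelDelay) -> float:
    """Eq. (5): the reads of LEVEL1 and LEVEL2 double as the writes of LEVEL2 and LEVEL3."""
    return (l1.write + l1.evaluate + l1.read
            + l2.evaluate + l2.read
            + l3.evaluate + l3.read)


def pipelined_delay(l1: LevelDelay, l2: LevelDelay, l3: LevelDelay, n_sets: int) -> float:
    """Eq. (10): each additional data set costs one more LEVEL1 period."""
    return total_delay(l1, l2, l3) + (n_sets - 1) * level1_delay(l1)
```

With every phase set to one time unit, a single data set needs 7 units, whereas four pipelined data sets need 16 units instead of 28.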

4. Automation algorithm

The automation algorithm for implementing any logic function with y inputs and z outputs is given in Algorithm 1. This algorithm gives the port mapping needed to implement the given functionality on the proposed architecture. In this algorithm, the notation ABC1x.I/Onm denotes NOR logic blocks in LEVEL1, where ABC is the name of the block or gate, x is the number of inputs to the block, I/O represents either an input (I) or output (O) pin, n is the pin number and m is the serial number of the NOR block. In LEVEL3, the notation ABC3x.I/On denotes NOR logic blocks, where ABC is the name of the block or gate, x is the number of inputs to the block, I/O represents either an input (I) or output (O) pin and n is the pin number. Moreover, in LEVEL3, the notation ABC3.I/On denotes NOT blocks, where ABC is the name of the block or gate, 3 is the level number, I/O represents either an input (I) or output (O) pin and n is the pin number.

Table 2
Values of parameters used for the simulation.

Parameter     Numerical value
RON           100 Ω
ROFF          200 kΩ
IOFF          1 μA
ION           −1 μA
αoff          3
αon           3
koff          −0.8 pm/s
kon           0.8 pm/s
DMemristor    3 nm
LCMOS         65 nm

Fig. 18. Simulation results for Inputs = 000. Outputs: Sum = 0 (200 kΩ), Carry = 0 (200 kΩ). Axes: resistance (Ω) versus time (µs).

Fig. 19. Simulation results for Inputs = 001. Outputs: Sum = 1 (100 Ω), Carry = 0 (200 kΩ). Axes: resistance (Ω) versus time (µs).

Fig. 20. Simulation results for Inputs = 011. Outputs: Sum = 0 (200 kΩ), Carry = 1 (100 Ω). Axes: resistance (Ω) versus time (µs).

Fig. 21. Simulation results for Inputs = 111. Outputs: Sum = 1 (100 Ω), Carry = 1 (100 Ω). Axes: resistance (Ω) versus time (µs).

Inputs: Array I(n, m), where n: 1 to 2^y, y is the number of inputs; m: 1 to z, z is the number of outputs; and I(n, m) are the minterms; NOR blocks; priority block.
Outputs: Number of NOR blocks with the number of inputs at each level; ‘netlist’-like connection pattern.

The overview of this algorithm can be given as follows.

1. Write the default values into the destination memristors.
2. Connect all the inputs, in combinations of ‘a’, to the ‘a’-input NOR blocks. The ‘a’-input implication-based NOR blocks produce minterms in which ‘a’ is the number of complemented inputs.
3. All the a-input NOR blocks are grouped together to produce nCa outputs, and these outputs are connected to consecutive inputs of the priority block, as opposed to fetching minterms consecutively. This prioritizes the a-input implication-based NOR blocks over the (a − 1)-, (a − 2)-, …-input implication-based NOR blocks.
4. Count the number of minterms that produce a different output from the output produced by the minterm with all standard inputs (i.e. all inputs are logic high).
5. Call NOR blocks with the above calculated count of inputs, one NOR block per output of the function to be implemented, and connect their inputs to the outputs of the priority block corresponding to the minterms that produce a different output from the output produced by the minterm with all standard inputs.
6. The outputs of all the NOR blocks are inverted.
7. The outputs of the inverters go to the write signals of the write blocks (to enable/disable the write block) of the corresponding destination memristors, where the outputs of the function are stored. Here, the data signal to the write blocks is the negation of the default value: the destination memristor toggles only if the output(s) for a particular input combination is (are) different from the default value(s), which is the optimal way of implementing any function.
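The seven steps above can be traced in software for the 1-bit full adder. The following Python sketch is a logical model only, under the assumption that the priority block forwards the highest-priority LEVEL1 NOR whose output is 1 (larger blocks first), which identifies the exact set of zero inputs. All function names are illustrative and are not the notation of Algorithm 1.

```python
from itertools import combinations, product


def level1_subsets(y):
    """Input subsets feeding the LEVEL1 NOR blocks, largest first (priority order)."""
    return [s for a in range(y, 0, -1) for s in combinations(range(y), a)]


def priority_select(bits):
    """Highest-priority LEVEL1 NOR with output 1, i.e. the exact set of zero inputs.

    Returns None when every input is 1, the minterm the priority block cannot detect.
    """
    for subset in level1_subsets(len(bits)):
        if not any(bits[i] for i in subset):
            return subset
    return None


def build_level3(func, y):
    """Steps 1, 4 and 5: default outputs and the priority outputs wired to each LEVEL3 NOR."""
    default = func(*([1] * y))
    wiring = [set() for _ in default]
    for bits in product((0, 1), repeat=y):
        zeros = priority_select(bits)
        if zeros is None:
            continue  # all inputs high: this is the default case itself
        for k, out in enumerate(func(*bits)):
            if out != default[k]:
                wiring[k].add(zeros)
    return default, wiring


def run(func, y, bits):
    """Steps 6 and 7: invert each LEVEL3 NOR and toggle the destination when enabled."""
    default, wiring = build_level3(func, y)
    active = priority_select(bits)
    result = []
    for k, d in enumerate(default):
        nor_out = int(active not in wiring[k])  # NOR of the one-hot priority outputs
        write_enable = 1 - nor_out              # inverter output drives the write block
        result.append(1 - d if write_enable else d)
    return tuple(result)


def full_adder(a, b, c):
    """Reference function: returns (sum, carry)."""
    return (a ^ b ^ c, (a & b) | (b & c) | (a & c))
```

For three inputs, `level1_subsets(3)` yields the 7 LEVEL1 blocks counted by Eq. (3), and `build_level3(full_adder, 3)` reports four priority outputs wired to each of the Sum and Carry NOR blocks, since both outputs differ from their default value of 1 for four of the eight input combinations.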

Algorithm 1. Implication-NOR based logic design.

559

Input: Array I(n, m), where n: 1 to 2^y, y is the number of inputs; m: 1 to z, z is the number of outputs; and I(n, m) are the minterms; NOR blocks; PRIORITY block
Output: Number of NOR blocks with the number of inputs at each level; 'netlist'-like connection pattern

memristor(k) ← I(2^y, k)    ▹ writing default values into the destination memristors
NOR1a.I(x)(1 to yCa) ← I(combinations of a from y inputs)    ▹ a: 1 to y, x: 1 to a
PRIORITY2((2^y) − 1).I(u) ← NOR1(m).O(x)(1 to nCm)    ▹ u: 1 to ((2^n) − 1)
    ▹ All the m-input NOR blocks are grouped together and these nCa outputs are connected to consecutive inputs of the PRIORITY block, as opposed to fetching minterms consecutively
for i ← 1 to z do
    for j ← 1 to 2^y − 1 do
        if I(j, i) != I(2^y, i) then
            Increment number of inputs to ith LEVEL3 NOR block
for i ← 1 to z do
    for j ← 1 to 2^y − 1 do
        if I(j, i) != I(2^y, i) then
            NOR3(number of inputs to ith LEVEL3 NOR block).I(pin_count) ← PRIORITY2(2^y − 1).Oj
            Increment pin_count
for i ← 1 to z do
    NOT3.Ii ← NOR3(number of inputs to ith LEVEL3 NOR block).Oi
for i ← 1 to z do
    memristor(i) ← NOT3.Ii
return connection pattern
    1. NOR1a.I(x)(1 to yCm) ← I(combinations of a from y inputs)    ▹ connection pattern between inputs and NOR inputs
    2. PRIORITY2(2^y − 1).Id ← NOR1a.Obs    ▹ connection pattern between outputs of NOR blocks at LEVEL1 and PRIORITY block inputs
    3. NOR3f.I(w)(k) ← PRIORITY2(2^y − 1).Id    ▹ connection pattern between outputs of PRIORITY at LEVEL2 and NOR block inputs

5. Simulation results

A 1-bit full adder using the stateful-NOR based architecture is implemented and simulated using the current-threshold based TEAM model with the Kvatinsky window [45]. The values of the parameters used in the TEAM model are listed in Table 2. The proposed design has been simulated in the Cadence Analog Design Environment (ADE) with the Spectre simulator. The results are shown in Figs. 18–21. The inputs and outputs of the 1-bit pipelined full adder are given in Figs. 22 and 23, respectively. The adder designs given in [44,54] use basic logic equations for the sum and carry outputs, and the number of basic steps varies from function to function. In the proposed architecture, however, the number of basic steps remains the same for any function.

6. Conclusions

We proposed a generalized reconfigurable architecture using the stateful-NOR gate as the basic building block for logic function implementation. The hybrid CMOS-memristor based FPNI architecture is modified by lifting some logic functions from the CMOS layer to the nanocrossbar layer. The reconfigurable logic blocks in the proposed architecture are dynamic (they are configured in the crossbar during runtime), as opposed to the SRAM-based LUTs in commercial FPGAs. A comparison between the proposed logic block with three inputs and one output (as a special case) and a conventional LUT based CLB having the same number of inputs and outputs is given in Table 3. The transistor count given for the LUT based design is native to the CLB, while in the proposed architecture the common CMOS blocks can be used for any configured logic block in a time-multiplexed manner, which greatly reduces the area requirement. The tabulated results show an improvement in area, delay and power dissipation for the proposed reconfigurable architecture over the conventional architecture. Ease of fabrication of nanocrossbar layers over the CMOS layer using nanoimprint technology, together with the high density of memristors in the nanocrossbar, makes such circuits suitable candidates for low-cost, high-density reconfigurable FPGAs. The automation algorithm for the proposed architecture is also presented.

7. Future work

The performance of the proposed architecture will be verified with benchmark circuits as part of future work. A CAD tool for this architecture is also to be developed.


[Figure: input waveforms A, B and C (volts, 0 to 3) versus time (µs, 0 to 0.8), with instants t1 to t4 marked.]

Fig. 22. Inputs applied to 1-bit pipelined full adder.

[Figure: outputs Carry (Ω) and Sum (Ω) versus time (µs, 0 to 0.8), with instants t1 to t4 marked.]

Fig. 23. Output of 1-bit pipelined full adder for different inputs. Output (t1) (Sum = 0, Carry = 0) for inputs ABC = 000; Output (t2) (Sum = 1, Carry = 0) for inputs ABC = 010; Output (t3) (Sum = 1, Carry = 1) for inputs ABC = 111; Output (t4) (Sum = 0, Carry = 1) for inputs ABC = 110.
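The four input/output cases listed in the caption of Fig. 23 can be cross-checked against the standard full-adder equations. The short Python sketch below does so; the function name `full_adder` and the dictionary `FIG23_CASES` are illustrative names, and treating C as the carry-in is an assumption (the result is the same whichever input plays that role, since both equations are symmetric).

```python
def full_adder(a, b, c):
    """Return (sum, carry) for three 1-bit inputs."""
    total = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return total, carry


# (A, B, C) -> (Sum, Carry), transcribed from the caption of Fig. 23.
FIG23_CASES = {
    "t1": ((0, 0, 0), (0, 0)),
    "t2": ((0, 1, 0), (1, 0)),
    "t3": ((1, 1, 1), (1, 1)),
    "t4": ((1, 1, 0), (0, 1)),
}

for label, (inputs, expected) in FIG23_CASES.items():
    status = "ok" if full_adder(*inputs) == expected else "mismatch"
    print(label, inputs, expected, status)
```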

Table 3
Conventional LUT based CLB and proposed stateful-NOR gate based logic block comparison.

Parameters                3-Input LUT based CLB    3-Input logic block using stateful-NOR gate
Memristor count           –                        33
Transistor count          342                      188
Delay (ns)                3.8                      2.43
Power dissipation (mW)    7.8                      2.15
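The improvement factors implied by Table 3 follow from simple ratios of the tabulated values. The Python sketch below computes them; the dictionary and function names are illustrative, and reading the area saving quoted in the abstract as the transistor-count ratio is an assumption.

```python
# Values transcribed from Table 3.
LUT_CLB = {"transistor_count": 342, "delay_ns": 3.8, "power_mw": 7.8}
STATEFUL_NOR = {"transistor_count": 188, "delay_ns": 2.43, "power_mw": 2.15}


def improvement(metric):
    """Ratio of the LUT based CLB value to the stateful-NOR block value."""
    return LUT_CLB[metric] / STATEFUL_NOR[metric]


for metric in LUT_CLB:
    print(f"{metric}: {improvement(metric):.2f}x")
```

The ratios come out at roughly 1.82 for transistor count, 1.56 for delay and 3.63 for power, close to the 1.82, 1.57 and 3.63 quoted in the abstract.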

Acknowledgments

This work is supported by DST-FIST grant (FIST SI NO. 133) for the Department of Electrical, Electronics & Instrumentation Engineering, BITS Pilani, K.K. Birla Goa Campus, Goa, by Government of India.

Appendix A. Analysis of material implication with memristor

To find the value of voltage V_SET, V_COND and R_G in Fig. 5 to implement IMPLY operation as shown in Table 1, we use the following conditions. Let R_1 and R_2 be resistance of memristors M1 and M2 respectively. Voltage across R_G is given by

\[ V_{R_G} = \frac{V_{SET}\,(R_1 \parallel R_G)}{(R_1 \parallel R_G) + R_2} + \frac{V_{COND}\,(R_2 \parallel R_G)}{(R_2 \parallel R_G) + R_1} \tag{A.1} \]

Voltage across R_1 and R_2 is given by

\[ V_{R_1} = V_{COND} - V_{R_G} \tag{A.2} \]

\[ V_{R_2} = V_{SET} - V_{R_G} \tag{A.3} \]

Let us consider four combinations of input and output from Table 1.

A.1. Case 1: R_1 = R_OFF, R_2 = R_OFF; final value of R_2 = R_ON

\[ V_{R_G} = \frac{V_{SET}\,(R_{OFF} \parallel R_G)}{(R_{OFF} \parallel R_G) + R_{OFF}} + \frac{V_{COND}\,(R_{OFF} \parallel R_G)}{(R_{OFF} \parallel R_G) + R_{OFF}} \tag{A.4} \]

To change the final value of R_2 from R_OFF to R_ON, current through R_2 must cross I_ON. Hence

\[ \frac{V_{R_2}}{R_2} = \frac{V_{SET} - V_{R_G}}{R_2} > I_{ON} \tag{A.5} \]

If R_ON ≪ R_G ≪ R_OFF (necessary to satisfy all conditions) then Eq. (A.4) can be approximated as

\[ V_{R_G} \approx 0 \tag{A.6} \]

Eq. (A.5) can be written as

\[ \frac{V_{R_2}}{R_2} = \frac{V_{SET}}{R_2} > I_{ON} \tag{A.7} \]

After this step, circuit will follow case 4.

A.2. Case 2: R_1 = R_OFF, R_2 = R_ON; final value of R_2 = R_ON

\[ V_{R_G} = \frac{V_{SET}\,(R_{OFF} \parallel R_G)}{(R_{OFF} \parallel R_G) + R_{ON}} + \frac{V_{COND}\,(R_{ON} \parallel R_G)}{(R_{ON} \parallel R_G) + R_{OFF}} \tag{A.8} \]

To maintain the final value of R_2 as R_ON, either I_OFF < I_R2 < I_ON or I_R2 may cross I_ON. Both conditions are satisfied by

\[ \frac{V_{R_2}}{R_2} = \frac{V_{SET} - V_{R_G}}{R_2} > I_{OFF} \tag{A.9} \]

If R_ON ≪ R_G ≪ R_OFF, Eq. (A.8) can be approximated as

\[ V_{R_G} \approx V_{SET} \tag{A.10} \]

Eq. (A.9) can be written as

\[ I_{OFF} < 0 \tag{A.11} \]

A.3. Case 3: R_1 = R_ON, R_2 = R_OFF; final value of R_2 = R_OFF

\[ V_{R_G} = \frac{V_{SET}\,(R_{ON} \parallel R_G)}{(R_{ON} \parallel R_G) + R_{OFF}} + \frac{V_{COND}\,(R_{OFF} \parallel R_G)}{(R_{OFF} \parallel R_G) + R_{ON}} \tag{A.12} \]

To maintain the final value of R_2 as R_OFF, either I_OFF < I_R2 < I_ON or I_R2 may fall below I_OFF. Both conditions are satisfied by

\[ \frac{V_{R_2}}{R_2} = \frac{V_{SET} - V_{R_G}}{R_2} < I_{ON} \tag{A.13} \]

If R_ON ≪ R_G ≪ R_OFF, Eq. (A.12) can be approximated as

\[ V_{R_G} \approx V_{COND} \tag{A.14} \]

Eq. (A.13) can be written as

\[ \frac{V_{R_2}}{R_2} = \frac{V_{SET} - V_{COND}}{R_2} < I_{ON} \tag{A.15} \]

A.4. Case 4: R_1 = R_ON, R_2 = R_ON; final value of R_2 = R_ON

\[ V_{R_G} = \frac{V_{SET}\,(R_{ON} \parallel R_G)}{(R_{ON} \parallel R_G) + R_{ON}} + \frac{V_{COND}\,(R_{ON} \parallel R_G)}{(R_{ON} \parallel R_G) + R_{ON}} \tag{A.16} \]

To maintain the final value of R_2 as R_ON, I_OFF < I_R2 < I_ON or I_R2 may cross I_ON. Both conditions are satisfied by

\[ \frac{V_{R_2}}{R_2} = \frac{V_{SET} - V_{R_G}}{R_2} > I_{OFF} \tag{A.17} \]

If R_ON ≪ R_G, Eq. (A.16) can be approximated as

\[ V_{R_G} \approx 0.5\,(V_{SET} + V_{COND}) \tag{A.18} \]

Eq. (A.17) can be written as

\[ \frac{V_{R_2}}{R_2} = 0.5\,\frac{V_{SET} - V_{COND}}{R_2} < I_{ON} \tag{A.19} \]

Remember that V_SET − V_COND cannot be negative as V_SET > V_COND. Similar analysis is given in [8,44] but we have neglected state drift (due to often alternate read and write) and the number of memristors in each NOR block is limited to four (like 3 input LUT) so scaling of R_G is not required. Also resultant equations are in terms of R_1, R_2, R_G, V_SET and V_COND which can be solved for getting R_G, V_SET and V_COND. (R_1, R_2 take values R_ON or R_OFF.)

References

[1] S. Hauck, A. DeHon (Eds.), Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2007.
[2] I. Kuon, R. Tessier, J. Rose, FPGA architecture: survey and challenges, Found. Trends Electron. Des. Autom. 2 (2) (2008) 135–253.
[3] I. Kuon, J. Rose, Quantifying and Exploring the Gap Between FPGAs and ASICs, 1st edition, Springer Publishing Company, Incorporated, New York, USA, 2009.
[4] K. Likharev, D. Strukov, CMOL: devices, circuits, and architectures, in: G. Cuniberti, K. Richter, G. Fagas (Eds.), Introducing Molecular Electronics, Lecture Notes in Physics, vol. 680, Springer, Berlin, Heidelberg, 2005, pp. 447–477.
[5] D.B. Strukov, K.K. Likharev, CMOL FPGA: a reconfigurable architecture for hybrid digital circuits with two-terminal nanodevices, Nanotechnology 16 (6) (2005) 888–900.
[6] G.S. Snider, R.S. Williams, Nano/CMOS architectures using a field-programmable nanowire interconnect, Nanotechnology 18 (3) (2007) 035204:1–11.
[7] D. Strukov, A. Mishchenko, Monolithically stackable hybrid FPGA, in: Design, Automation Test in Europe Conference Exhibition (DATE), 2010, pp. 661–666.
[8] K. Kim, S. Shin, S.-M. Kang, Field programmable stateful logic array, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 30 (12) (2011) 1800–1813.
[9] M. Lin, A.E. Gamal, Y.-C. Lu, S. Wong, Performance benefits of monolithically stacked 3D-FPGA, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 26 (2) (2007) 216–229.
[10] C. Dong, D. Chen, S. Haruehanroengra, W. Wang, 3-D nFPGA: a reconfigurable architecture for 3-D CMOS/Nanomaterial Hybrid digital circuits, IEEE Trans. Circuits Syst. I: Regular Papers 54 (11) (2007) 2489–2501.
[11] W. Zhao, C. Gamrat, Y. Lhuillier, Nanocomputing block based multi-context FPGA, in: T.P. Plaks (Ed.), ERSA, CSREA Press, Las Vegas Nevada, USA, 2009, pp. 297–298.
[12] O. Turkyilmaz, S. Onkaraiah, M. Reyboz, F. Clermidy, C. Anghel Hraziia, J.-M. Portal, M. Bocquet, RRAM-based FPGA for “normally off, instantly on” applications, J. Parallel Distrib. Comput. 74 (6) (2014) 2441–2451.
[13] O. Kavehei, S. Al-Sarawi, K.-R. Cho, K. Eshraghian, D. Abbott, An analytical approach for memristive nanoarchitectures, IEEE Trans. Nanotechnol. 11 (2) (2012) 374–385.
[14] E. Lehtonen, J. Poikonen, M. Laiho, Two memristors suffice to compute all Boolean functions, Electron. Lett. 46 (3) (2010) 239–240.
[15] A. Chattopadhyay, Z. Rakosi, Combinational logic synthesis for material implication, in: IEEE/IFIP 19th International Conference on VLSI and System-on-Chip (VLSI-SoC), 2011, pp. 200–203.
[16] D. Strukov, D. Stewart, J. Borghetti, X. Li, M. Pickett, G. Ribeiro, W. Robinett, G. Snider, J. Strachan, W. Wu, Q. Xia, J. Yang, R. Williams, Hybrid CMOS/memristor circuits, in: Proceedings of 2010 IEEE International Symposium on Circuits and Systems (ISCAS), 2010, pp. 1967–1970.
[17] D. Tu, M. Liu, W. Wang, S. Haruehanroengra, Three-dimensional CMOL: three-dimensional integration of CMOS/nanomaterial hybrid digital circuits, IET Micro Nano Lett. 2 (2) (2007) 40–45.
[18] D.B. Strukov, K.K. Likharev, A Reconfigurable Architecture for Hybrid CMOS/Nanodevice Circuits, in: Proceedings of the 2006 ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays, FPGA '06, ACM, New York, NY, USA, 2006, pp. 131–140.
[19] D.B. Strukov, R.S. Williams, Four-dimensional address topology for circuits with stacked multilayer crossbar arrays, Proc. Natl. Acad. Sci. 106 (48) (2009) 20155–20158. arXiv:http://www.pnas.org/content/106/48/20155.full.pdf+html.
[20] D. Strukov, 3D hybrid CMOS/memristor circuits: basic principle and prospective applications, in: Conference on Optoelectronic and Microelectronic Materials Devices (COMMAD), 2012, pp. 21–22.
[21] D. Strukov, K. Likharev, Prospects for the development of digital CMOL circuits, in: IEEE International Symposium on Nanoscale Architectures, 2007. NANOSARCH 2007, 2007, pp. 109–116.
[22] L. Chua, Memristor—the missing circuit element, IEEE Trans. Circuit Theory 18 (5) (1971) 507–519.
[23] D.B. Strukov, G.S. Snider, D.R. Stewart, R.S. Williams, The missing memristor found, Nature 453 (2008) 80–83.
[24] S. Kvatinsky, E. Friedman, A. Kolodny, U. Weiser, The desired memristor for circuit designers, IEEE Circuits Syst. Mag. 13 (2) (2013) 17–22.
[25] Y. Ho, G. Huang, P. Li, Nonvolatile memristor memory: device characteristics and design implications, in: IEEE/ACM International Conference on Computer-Aided Design—Digest of Technical Papers ICCAD, 2009, pp. 485–490.
[26] J. Borghetti, G.S. Snider, P.J. Kuekes, J.J. Yang, D.R. Stewart, R.S. Williams, Memristive switches enable stateful logic operations via material implication, Nature 464 (2010) 873–876.
[27] P.J. Kuekes, D.R. Stewart, R.S. Williams, The crossbar latch: logic value storage, restoration, and inversion in crossbar circuits, J. Appl. Phys. 97 (3) (2005) 034301:1–5.
[28] D.B. Strukov, K.K. Likharev, Defect-tolerant architectures for nanoelectronic crossbar memories, J. Nanosci. Nanotechnol. 7 (1) (2007) 151–167.
[29] S.H. Jo, W. Lu, CMOS compatible nanoscale nonvolatile resistance switching memory, Nano Lett. 8 (2) (2008) 392–397. http://dx.doi.org/10.1021/nl073225h.
[30] N. McDonald, R. Pino, P. Rozwood, B. Wysocki, Analysis of dynamic linear and non-linear memristor device models for emerging neuromorphic computing hardware design, in: The 2010 International Joint Conference on Neural Networks (IJCNN), 2010, pp. 1–5.
[31] S.H. Jo, T. Chang, I. Ebong, B.B. Bhadviya, P. Mazumder, W. Lu, Nanoscale memristor device as synapse in neuromorphic systems, Nano Lett. 10 (4) (2010) 1297–1301. http://dx.doi.org/10.1021/nl904092h.
[32] A. Ascoli, F. Corinto, V. Senger, R. Tetzlaff, Memristor model comparison, IEEE Circuits Syst. Mag. 13 (2) (2013) 89–105.
[33] F. Corinto, A. Ascoli, Memristive diode bridge with LCR filter, Electron. Lett. 48 (14) (2012) 824–825.
[34] I. Valov, R. Waser, J.R. Jameson, M.N. Kozicki, Electrochemical metallization memories fundamentals, applications, prospects, Nanotechnology 22 (25) (2011) 254003:1–22.
[35] R. Waser, R. Dittmann, G. Staikov, K. Szot, Redox-based resistive switching memories—nanoionic mechanisms, prospects, and challenges, Adv. Mater. 21 (25–26) (2009) 2632–2663.
[36] J.J. Yang, D.B. Strukov, D.R. Stewart, Memristive devices for computing, Nat. Nanotechnol. 8 (1) (2013) 13–24.
[37] M.D. Ventra, Y.V. Pershin, The parallel approach, Nat. Phys. 9 (4) (2013) 200–202.
[38] M.D. Pickett, D.B. Strukov, J.L. Borghetti, J.J. Yang, G.S. Snider, D.R. Stewart, R.S. Williams, Switching dynamics in titanium dioxide memristive devices, J. Appl. Phys. 106 (7) (2009) 074508:1–6.
[39] R. Williams, M. Pickett, J. Strachan, Physics-based memristor models, in: IEEE International Symposium on Circuits and Systems (ISCAS), 2013, pp. 217–220.
[40] E. Linn, A. Siemon, R. Waser, S. Menzel, Applicability of well-established memristive models for simulations of resistive switching devices, IEEE Trans. Circuits Syst. I: Regular Papers 61 (8) (2014) 2402–2410.
[41] L. Chua, Resistance switching memories are memristors, Appl. Phys. A 102 (4) (2011) 765–783.
[42] H. Manem, G.S. Rose, X. He, W. Wang, Design considerations for variation tolerant multilevel CMOS/Nano memristor memory, in: Proceedings of the 20th Symposium on Great Lakes Symposium on VLSI, GLSVLSI '10, ACM, New York, NY, USA, 2010, pp. 287–292.
[43] C. Yakopcic, T.M. Taha, G. Subramanyam, Hybrid crossbar architecture for a memristor based cache, CoRR abs/1302.6515.
[44] S. Kvatinsky, G. Satat, N. Wald, E. Friedman, A. Kolodny, U. Weiser, Memristor-based material implication (imply) logic: design principles and methodologies, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 22 (10) (2014) 2054–2066.
[45] S. Kvatinsky, E. Friedman, A. Kolodny, U. Weiser, TEAM: threshold adaptive memristor model, IEEE Trans. Circuits Syst. I: Regular Papers 60 (1) (2013) 211–221.
[46] C. Yakopcic, T. Taha, G. Subramanyam, R. Pino, Generalized memristive device spice model and its application in circuit design, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 32 (8) (2013) 1201–1214.
[47] Q. Xia, W. Robinett, M.W. Cumbie, N. Banerjee, T.J. Cardinali, J.J. Yang, W. Wu, X. Li, W.M. Tong, D.B. Strukov, G.S. Snider, G. Medeiros-Ribeiro, R.S. Williams, Memristor-CMOS hybrid integrated circuits for reconfigurable logic, Nano Lett. 9 (10) (2009) 3640–3645. http://dx.doi.org/10.1021/nl901874j.
[48] A. Mazady, M. Anwar, Memristor: Part II—DC, transient, and RF analysis, IEEE Trans. Electron Devices 61 (4) (2014) 1062–1070.
[49] A.C. Torrezan, J.P. Strachan, G. Medeiros-Ribeiro, R.S. Williams, Subnanosecond switching of a tantalum oxide memristor, Nanotechnology 22 (48) (2011) 485203:1–7.
[50] H. Owlia, P. Keshavarzi, A. Rezai, A novel digital logic implementation approach on nanocrossbar arrays using memristor-based multiplexers, Microelectron. J. 45 (6) (2014) 597–603.
[51] J.J. Yang, M.-X. Zhang, J.P. Strachan, F. Miao, M.D. Pickett, R.D. Kelley, G. Medeiros-Ribeiro, R.S. Williams, High switching endurance in TaOx memristive devices, Appl. Phys. Lett. 97 (23) (2010) 232102:1–3.
[52] W. Zhang, N.K. Jha, L. Shang, A Hybrid Nano/CMOS Dynamically Reconfigurable System Part I: architecture, ACM J. Emerg. Technol. Comput. Syst. 5 (4) (2009) 16:1–30.
[53] J. Cong, B. Xiao, mrFPGA: a novel FPGA architecture with memristor-based reconfiguration, in: IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), 2011, pp. 1–8.
[54] E. Lehtonen, M. Laiho, Stateful implication logic with memristors, in: IEEE/ACM International Symposium on Nanoscale Architectures NANOARCH '09, 2009, pp. 33–36.
[55] X. Zhu, X. Yang, C. Wu, N. Xiao, J. Wu, X. Yi, Performing stateful logic on memristor memory, IEEE Trans. Circuits Syst. II: Express Briefs 60 (10) (2013) 682–686.
[56] E. Lehtonen, J. Tissari, J. Poikonen, M. Laiho, L. Koskinen, A cellular computing architecture for parallel memristive stateful logic, Microelectron. J. 45 (11) (2014) 1438–1449.
[57] S. Shin, K. Kim, S.-M. Kang, Reconfigurable stateful nor gate for large-scale logic-array integrations, IEEE Trans. Circuits Syst. II: Express Briefs 58 (7) (2011) 442–446.
[58] D.B. Strukov, K.K. Likharev, CMOL FPGA circuits, in: Proceedings of International Conference on Computer Design, CDES2006, 2006, pp. 213–219.
[59] P. Mane, N. Talati, A. Riswadkar, B. Jasani, C. Ramesha, Implementation of NOR logic based on material implication on CMOL FPGA architecture, in: 2015 28th International Conference on VLSI Design (VLSID), 2015, pp. 523–528.
[60] M. Qureshi, M. Pickett, F. Miao, J.P. Strachan, CMOS interface circuits for reading and writing memristor crossbar array, in: IEEE International Symposium on Circuits and Systems (ISCAS), 2011, pp. 2954–2957.
[61] L. Zhang, Z. Chen, J.J. Yang, B. Wysocki, N. McDonald, Y. Chen, A compact modeling of TiO2–TiO2−x memristor, Appl. Phys. Lett. 102 (15) (2013) 153503:1–4.