CMOS LIM structure



Ain Shams Engineering Journal xxx (xxxx) xxx

Contents lists available at ScienceDirect

Ain Shams Engineering Journal journal homepage: www.sciencedirect.com

Electrical Engineering

A novel self write-terminated driver for hybrid STT-MTJ/CMOS LIM structure

Prashanth Barla, Vinod Kumar Joshi ⇑, Somashekara Bhat

Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, Karnataka, India

Article info

Article history: Received 11 June 2020; Revised 3 October 2020; Accepted 28 October 2020; Available online xxxx

Keywords: Spintronics; Logic-in-memory; Magnetic tunnel junction; Tunnel magnetoresistance; Spin transfer torque

Abstract

A novel self write-terminated driver is proposed for hybrid spin transfer torque-magnetic tunnel junction (STT-MTJ)/CMOS circuits based on the logic-in-memory (LIM) structure. Using a continuous write-monitoring mechanism, the circuit cuts off the write current once switching is complete, eliminating the write energy that is otherwise wasted. As a result, the total energy required for the writing process is reduced by 63.32% compared with the conventional write circuit. Monte Carlo simulation is then performed by incorporating process and mismatch variations for CMOS and for the extracted parameters of the MTJ. Simulations are also carried out for the proposed write driver by varying the transistor sizes and supply voltage to analyze its switching probability and to obtain a safe operating region. Further, the proposed write driver is integrated with a hybrid full adder to demonstrate its feasibility in low-power VLSI circuits.

© 2020 THE AUTHORS. Published by Elsevier BV on behalf of Faculty of Engineering, Ain Shams University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

⇑ Corresponding author. E-mail address: [email protected] (V.K. Joshi). Peer review under responsibility of Ain Shams University.

1. Introduction

In the modern era of big data and artificial intelligence, which demands high-performance computation, limited data-transfer bandwidth and a surge in power dissipation are the bottlenecks of the conventional Von-Neumann architecture; both stem from the physical separation between the memory and processor modules [1–5] (Fig. 1a). Among the possible methods to mitigate this problem, logic-in-memory (LIM) has emerged as one of the most contemporary solutions [1,2]. LIM also appears in the literature under several other names, such as processing in memory (PIM), computation in memory (CIM) and in-memory computation (IMC) [1,2]. LIM makes a significant change at the architectural level: computational capability is embedded into the memory, i.e., storage and processing of data are united in a single unit. The memory that takes part in processing is non-volatile, and among all the non-volatile memories (NVM), magnetic random access memory


(MRAM) has proved to be the most efficient owing to its advantages such as non-volatility, fast reading capability, high density, infinite endurance, 3D fabrication and ease of integration with the existing CMOS technology [1,2]. Fig. 1b shows the LIM structure, where the NVM is built directly on top of the existing CMOS using 3D integration. This creates a hybrid MRAM/CMOS structure that reduces global routing and data-transfer distance, alleviating the communication bottleneck seen in the conventional Von-Neumann architecture [6]. Most importantly, the use of NVM makes it possible to completely power off temporarily unused blocks during idle conditions without losing the stored data, thus saving a significant amount of standby power [2]. When power is restored, the non-volatile data is instantly recovered and available for processing. Hence this approach is also well suited to normally-off and instant-on systems [7]. Further, there is a large reduction of area in LIM structures due to 3D integration, since the MRAM does not occupy additional area on silicon.

Typically, MRAM consists of a spintronic device, the magnetic tunnel junction (MTJ), which utilizes the spin degree of freedom along with the charge of an electron. MTJs are basically of two types: p-MTJ (perpendicular magnetic tunnel junction) and i-MTJ (in-plane magnetic tunnel junction). In our work we have selected the p-MTJ over the i-MTJ due to its advantages such as low power, high thermal stability, long retention time of the stored data, and ease of scalability and fabrication [8,9]. In the LIM structure, p-MTJs not only store the bits but also participate in logic operations. There are several circuits developed

https://doi.org/10.1016/j.asej.2020.10.012
2090-4479/© 2020 THE AUTHORS. Published by Elsevier BV on behalf of Faculty of Engineering, Ain Shams University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).


Fig. 1. (a) Von-Neumann architecture showing logic and memory units. Both logic and memory are kept separate having their total area as X + Y and interconnects facilitate the communication between them. (b) LIM structure with stack integration having total area as either X or Y [6].

known as the fixed layer and the free layer, as shown in Fig. 2. The resistance difference between these two states arises from a quantum mechanical effect, tunnel magnetoresistance (TMR) [36], and is defined by Eq. (1),

using hybrid STT-MTJ/CMOS LIM structures, such as adders [10,11], ALUs [12,13], a decoder [14], logic gates [15,16], a random number generator [17], non-volatile ternary content-addressable memory (TCAM) [18], and non-volatile TCAM with priority decision [19]. All these hybrid circuits occupy a relatively smaller area and dissipate less power than their CMOS counterparts. However, the main issue in these hybrid circuits is the need for a high, spin-polarized, bidirectional STT write current (IW) during the write process. As discussed in Refs. [20,21], about 90% of the total energy consumption in the circuit is due to the MTJ write process. Since STT-MTJ switching is inherently stochastic due to incubation delay [22,23], the time needed to complete a write operation varies dramatically. To cope with this uncertain switching period, the STT-MTJ write time is generally kept longer than the ideal switching time (five times longer in [24]). This means that the STT write current flows even after MTJ switching is complete, which wastes energy in the write circuitry. Reducing this wastage of write energy is a major challenge.

Several circuits have been proposed in the literature to reduce the wasted write power or energy of the STT process using self-terminated write mechanisms [24–35]. However, all these circuits deal with either flip-flops or memory arrays. No write circuit has been developed for reducing the energy dissipation in LIM-based circuits, which employ a pair of MTJs in their logic network. In this paper, we propose a novel write driver for the hybrid STT-MTJ/CMOS LIM structure which eliminates the wastage of write energy that occurs after MTJ switching and is commonly observed in the conventional write circuit. In conventional writing, IW continues to flow in the write circuit as long as the enable signal is asserted high. The current which flows after the switching of the MTJ pair increases the overall energy consumed by the write circuit, and this additional energy is considered wastage. The proposed write driver consists of a continuous write-completion monitoring and self write-termination mechanism, with which IW ceases to flow soon after the p-MTJ switching takes place. Since a bidirectional STT current is used to switch the p-MTJs, the node voltage before and after switching depends on the data being written into them. Once the write operation is completed, this voltage difference is detected and amplified by a CMOS buffer to disable the STT writing process even when the p-MTJ write enable is ON. This curbs the wastage of write energy in hybrid circuits.

$$\mathrm{TMR} = \frac{R_{AP} - R_P}{R_P} \tag{1}$$

Attaining a high TMR is desirable because it produces a large swing, which results in faithful reproduction of the bit stored in the p-MTJ. For a CoFeB/MgO/CoFeB MTJ, the largest TMR reported to date is 604% at room temperature [37]. This has raised the expectation of utilizing MTJs in hybrid LIM structures.

2.2. Switching mechanism for p-MTJ

Changing the magnetization direction of the free layer, from the low resistance state (RP) to the high resistance state (RAP) or vice versa, is called p-MTJ writing or switching. Various switching mechanisms have been reported in the literature, such as field induced magnetic switching (FIMS) [38], thermally assisted switching [39,40], spin Hall effect (SHE) switching [41,42], voltage induced switching (VIS) [43,44] and spin transfer torque (STT) switching [45,46]. Among these, STT is preferred due to its feasibility of commercialization [47]. Hence we have adopted the STT switching mechanism in our work. Though Slonczewski [48] and Berger [49] theoretically predicted STT in 1996, experimental validation of the STT mechanism was achieved in 2004 using a deep-submicron-sized, low-resistance MTJ structure [50]. In this method, an IW greater than the critical current (ICO) is employed to switch the p-MTJ state. In the STT process, when electrons move from the fixed layer to the free layer, the electrons whose spin orientation matches the magnetic orientation of the fixed layer pass through easily, while the electrons with the opposite spin orientation are scattered. The electrons that are allowed to pass through are called majority electrons, whereas the scattered electrons are called minority electrons. This selective forward movement of electrons is called spin polarization. When these spin-polarized electrons reach the free layer, they exert a torque called spin transfer torque on the

2. Background

This section describes the construction and switching mechanism of the p-MTJ device. It also briefly covers the structure and working concept of hybrid MTJ/CMOS circuits and the conventional writing circuit.

2.1. Device structure of p-MTJ

Fig. 2. (a) Low resistance state of p-MTJ, where the magnetic orientations of the free layer and fixed layer are in the same direction. (b) High resistance state of p-MTJ, where the magnetic orientations of the free layer and fixed layer are in opposite directions [13].

The p-MTJ is a nano-pillar, where a non-magnetic thin oxide barrier is sandwiched between the two ferromagnetic materials


magnetization of the free layer to flip its orientation to the RP state. When electrons move in the reverse direction, i.e. from the free layer to the fixed layer, the back-scattered minority electrons exert a spin transfer torque that changes the p-MTJ to the RAP state. The STT switching mechanism is illustrated in Fig. 3.

3. The principle of operation of proposed self write-terminated driver

Fig. 6 shows the schematic of the proposed self write-terminated driver. It consists of a 4T writing core (Fig. 6a), a write completion detector (Fig. 6b), a write termination circuit (Fig. 6c) and a write enable generator (Fig. 6d). The different MTJ states and the various intermediate signals of the proposed write driver are shown in Fig. 6e. The proposed write driver works as follows. Consider that bit "1" is stored in MTJ0-MTJ1, represented as the P-AP configuration (Fig. 7c and d). To write bit "0", the IN and EN signals are asserted as "0" and "1" respectively. When EN = "1", the output of the domino logic inverter (MP2, MN2, MN3 and I3 in Fig. 6c) is still at "1" (because EN_CNT = "0"), and X2 generates its output CKT_EN = "1", as observed at T1 of Fig. 7f. This enables the write enable generator circuit (Fig. 6d). For the combination CKT_EN = "1" and IN = "0", the intermediate signals V1, V2, V3, V4 are "1010" respectively (Fig. 6e), which enables the MP0-MN1 transistor pair of the writing core (Fig. 6a), driving IW through MP0-MTJ0-MTJ1-MN1. As a result, MTJ0-MTJ1 switches from P-AP to AP-P, as observed at T2 of Fig. 7c and d, and bit "0" is written into the MTJ pair. The change in the state of the MTJ pair is immediately detected by the write completion detector (Fig. 6b): when MTJ0-MTJ1 changes from the P-AP to the AP-P configuration, the voltage at CMP_PT changes from "1" to "0". This can be explained using Eqs. (2) and (3). When MTJ0-MTJ1 are in the P-AP configuration, the voltage divider rule gives the voltage at node R as,

2.3. Hybrid MTJ/CMOS LIM structure

Fig. 4 shows the basic block diagram of the hybrid MTJ/CMOS LIM structure. It has three main parts: (a) a current-controlled sense amplifier to detect the results of the logic operations being performed; (b) an MTJ switching/writing block to change the MTJ state from RP to RAP and vice versa using the STT process; (c) a logic network (LN), which performs logical operations on the data. The LN is an amalgamation of a MOS logic structure for the volatile inputs and MTJs which store the non-volatile data. In the hybrid LIM structure, integration of MTJs with the existing CMOS is possible because of the resistance compatibility of the MTJs, i.e., the ON resistance of the MOS (RON) is less than RP (RON < RP) and the OFF resistance of the MOS (ROFF) is more than RAP (ROFF > RAP). The combination of the MOS logic tree and the MTJs decides the left and right branch resistances (RL and RR) of the LIM structure. RL and RR are inversely related to the reading currents IL and IR respectively, and so determine the current in each branch, thereby producing the output (OUT) and its complement ($\overline{\mathrm{OUT}}$). For instance, when IL > IR, OUT = 1 and $\overline{\mathrm{OUT}}$ = 0; otherwise, when IL < IR, OUT = 0 and $\overline{\mathrm{OUT}}$ = 1.
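The branch comparison described above can be sketched in a few lines of Python. This is only an idealized illustration: the function name `lim_outputs` and the resistance values are hypothetical, and the real sense amplifier is an analog circuit rather than a simple comparison.

```python
def lim_outputs(r_left_ohm, r_right_ohm):
    """Return (OUT, OUT_bar) for given left/right branch resistances.

    Idealized model: both branches see the same voltage, so the branch
    with the lower resistance carries the larger read current.
    """
    if r_left_ohm == r_right_ohm:
        raise ValueError("branch resistances must differ to resolve the output")
    out = 1 if r_left_ohm < r_right_ohm else 0   # R_L < R_R  <=>  I_L > I_R
    return out, 1 - out

# Hypothetical branch resistances, for illustration only.
print(lim_outputs(6.0e3, 18.0e3))   # left branch draws more current
print(lim_outputs(18.0e3, 6.0e3))   # right branch draws more current
```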

$$V_H = \frac{V_{dda} \times R_{AP}}{R_{AP} + R_P} \tag{2}$$

When MTJ0-MTJ1 change to AP-P configuration, the voltage at node point R can be written as,

2.4. Conventional writing circuit

The conventional STT-MTJ writing circuit, which employs IW, is shown in Fig. 5. Here the combination of the input IN and the control signal EN enables a pair of transistors in the writing core (either MP1-MN0 or MP0-MN1) to drive IW through the MTJ pair (MTJ0-MTJ1). IW continues to flow in the write circuitry as long as EN = "1", i.e. even after the MTJ switching process is complete. The current that flows after the MTJ pair has switched increases the overall energy consumed by the write circuit, and this additional energy is considered wastage. One possible way to overcome this issue is to employ a self write-terminating control circuit which can cut off IW soon after the MTJ switching process is complete.

$$V_L = \frac{V_{dda} \times R_P}{R_{AP} + R_P} \tag{3}$$
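As a rough numerical illustration of Eqs. (2) and (3), the sketch below evaluates the two node voltages for Vdda = 1.3 V and the zero-bias TMR of 200% listed in Table 1. It assumes that R_AP = R_P (1 + TMR) from Eq. (1) holds at the operating point, which ignores both the bias dependence of TMR and the MOS ON resistance, so the numbers are indicative only. The function name `node_r_voltages` is hypothetical.

```python
def node_r_voltages(vdda, tmr):
    """Return (V_H, V_L) at node R from Eqs. (2) and (3).

    Substituting R_AP = R_P * (1 + TMR) makes R_P cancel, so only the
    supply voltage and the TMR ratio are needed.
    """
    v_h = vdda * (1.0 + tmr) / (2.0 + tmr)   # Eq. (2), P-AP configuration
    v_l = vdda / (2.0 + tmr)                 # Eq. (3), AP-P configuration
    return v_h, v_l

v_h, v_l = node_r_voltages(vdda=1.3, tmr=2.0)   # TMR = 200 %
print(f"V_H = {v_h:.3f} V, V_L = {v_l:.3f} V, swing = {v_h - v_l:.3f} V")
```

Under these assumptions the two levels sit at three quarters and one quarter of Vdda, which is the swing the detector has to resolve.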

(Note: In Eqs. (2) and (3) the ON resistance of the MOS is ignored.) As soon as the MTJ writing process is complete, the voltage at R changes from VH to VL (i.e. from "1" to "0"). This change is amplified by the CMOS buffer (B1) and reproduced at CMP_PT. So at the outputs of X0 and X1, there are transitions from "0" to "1" and from "1" to "0" respectively. Since the MUX select line (S0) is IN = "0", the output of the MUX (EN_CNT) becomes "1". The logic "1" at EN_CNT turns on transistor MN2 of the domino logic inverter to produce "1" at its output (Fig. 6c). This signal, along with EN = "1", makes the write termination circuit change its output CKT_EN from "1" to "0". As a result, the write enable generator circuit disables the writing core (Fig. 6e). This completes the self write-termination process. In a similar fashion, write completion detection operates when the bit stored in the MTJ pair changes from "0" to "1" for EN = "1" and IN = "1", as shown at T4 of Fig. 6e.

4. Simulation results and discussions

We have used the Cadence tool (IC6.1.6-64b.500.4) with a 45 nm CMOS generic process design kit, together with an STT p-MTJ electrical model developed in the Verilog-A language [51]. The distinguishing features of this model are its flexibility and accuracy: it can easily incorporate future experimental or fabrication results into the Cadence environment. Table 1 shows the p-MTJ parameters used in our work; other values are retained as the defaults mentioned in [51]. For both the conventional write circuit and the proposed write driver, IW is driven by Vdda = 1.3 V, which is higher than the Vdd = 1.1 V at which the rest of the circuitry operates. During the simulation, the 4T writing transistors are set

Fig. 3. STT switching mechanism in p-MTJ: When IW flows from free to fixed layer RAP changes to RP, whereas when IW flows from fixed to free layer RP changes to RAP [13].

Fig. 4. Block diagram of LIM structure, consisting of sense amplifier, MOS logic structure with volatile inputs and STT-MTJs with non-volatile data. MTJ writing circuit is used to change the MTJ states [1].

Fig. 5. Conventional MTJ writing circuitry, consisting of (a) control circuitry and (b) 4T writing core circuit [1].

The reduction in write energy/bit achieved by the proposed write driver for bit "1" and bit "0" is 46.98% and 49.79% respectively. Furthermore, the proposed write driver achieves a large energy saving of 97.18% when the input IN is the same as the bit stored in the MTJ pair. Hence, there is a 64.63% reduction in the average write energy/bit. However, the energy consumed by the control circuitry of the proposed write driver is higher than that of the conventional one by 89.14%, due to the additional hardware, i.e. the write completion detector (Fig. 6b) and the write termination module (Fig. 6c). Even so, the total energy/bit consumed by the proposed write driver is 63.32% less than that of the conventional write circuit. The self write-terminated driver requires 49 more transistors than the conventional write circuit. Since the prime focus in low-power VLSI design is to achieve ultra-low power dissipation, some area overhead may be incurred; this is the power versus area trade-off that the design has to accept. However, in our proposed design the additional transistors are of minimum size (L = 45 nm and W = 120 nm), and advancements in fabrication technology such as 3D integration help to keep the lateral dimensions of silicon ICs within a nominal range. Further, in the conventional write circuit, IW flows through the MTJ even after the switching process is complete, and also when IN and the information stored in the MTJ are the same. This not only wastes energy but also poses endurance issues for the MTJ, because the flow of unnecessary writing current in the MTJ can result in the breakdown of the barrier dielectric material. Hence, the proposed self write-terminated circuit not

as W = 480 nm and all other transistor widths are set to W = 120 nm. The write completion detector module of the proposed write driver continuously monitors the change in voltage at node R (VH and VL), and the amplified version, EN_CNT, is used to control the output CKT_EN of the write termination module. This in turn controls IW in the writing core. Hence, a continuous monitoring mechanism is adopted here which immediately cuts off IW when the MTJ pair changes its configuration. For example, at time T1, when EN = "1" and IN = "0", IW begins to flow through the MTJ pair. At time T2, when the MTJ pair switches from the P-AP to the AP-P configuration, IW immediately ceases in the writing core due to the self write-termination mechanism. In the conventional write circuit, by contrast, IW continues to flow as long as EN is high, even after MTJ switching has been completed. The interval from T2 to T3 is the period over which the proposed self write-terminated driver saves energy by eliminating the flow of IW. Similarly, at T4, when the MTJ pair switches from the AP-P to the P-AP configuration, IW is cut off, as shown in Fig. 7h. Furthermore, at T5, when the input IN and the bit stored in the MTJ pair are the same (in this case "1"), IW is cut off immediately after EN becomes high (Fig. 7h), whereas in the conventional write circuit IW continues to flow (Fig. 7i). Hence, a significant amount of wasted write energy is eliminated by the proposed write driver compared with the conventional write circuit. Table 2 summarizes the performance comparison of the self write-terminated driver with the conventional write circuit.
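The timing behaviour described above can be mimicked with a small behavioural model. The sketch below is not the authors' circuit: it only counts how long IW would flow during one EN pulse, with and without termination. The function name `iw_on_time_ps`, the 100 ps step and the 2 ns switching delay are hypothetical; the 10 ns EN width is borrowed from Fig. 8.

```python
def iw_on_time_ps(en_width_ps, switch_delay_ps, self_terminated, step_ps=100):
    """Return the time (ps) for which I_W flows during one EN pulse.

    switch_delay_ps is the time the MTJ pair needs to flip; pass 0 when
    the stored bit already equals IN, so nothing has to switch.
    """
    on_time_ps = 0
    for t in range(0, en_width_ps, step_ps):   # EN = "1" during this window
        switched = t >= switch_delay_ps        # detector sees the node-R change
        if self_terminated and switched:
            break                              # CKT_EN drops and I_W is cut off
        on_time_ps += step_ps
    return on_time_ps

EN_WIDTH_PS = 10_000       # 10 ns EN pulse
SWITCH_DELAY_PS = 2_000    # hypothetical switching delay

for delay in (SWITCH_DELAY_PS, 0):
    conv = iw_on_time_ps(EN_WIDTH_PS, delay, self_terminated=False)
    prop = iw_on_time_ps(EN_WIDTH_PS, delay, self_terminated=True)
    print(f"delay={delay} ps: conventional={conv} ps, self-terminated={prop} ps")
```

The model is ideal in that detection is instantaneous, so the same-bit case gives zero on-time; in the simulated circuit a small write energy remains in that case (Table 2).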

Fig. 6. Schematic of proposed write circuitry, consisting of (a) 4T writing core, (b) write completion detector (c) write-termination and (d) write enable generator. (e) Various signals and its corresponding MTJ states.

Fig. 7. Timing diagram for both conventional write and proposed self write-termination driver. (a) Input IN. (b) Control signal enable EN. (c) State of MTJ0. (d) State of MTJ1. (e) Control signal EN_CNT from write completion detector. (f) Control signal CKT_EN from write termination. (g) Voltage at CMP_PT in write completion detector. (h) IW in proposed circuit, showing the energy saved. (i) IW in conventional write circuit.

4.2. Process and mismatch variations

Table 1. p-MTJ parameters used for the simulation [51].

| Parameter | Description | Value |
|---|---|---|
| Area | p-MTJ dimensions | 32 nm × 32 nm × π/4 |
| TMR(0) | TMR ratio with zero bias | 200% |
| tsl | Thickness of the free layer | 1.3 nm |
| RA | Resistance area product | 5 Ω·µm² |
| tox | Thickness of the oxide barrier | 0.85 nm |
| σTMR | Standard deviation of TMR | 3% of TMR |
| σtsl | Standard deviation of tsl | 3% of tsl |
| σtox | Standard deviation of tox | 3% of tox |
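For orientation, the Table 1 values can be turned into approximate zero-bias resistances. The sketch below assumes the common relation R_P = RA / Area, which is not stated in the text, and then applies Eq. (1) rearranged as R_AP = R_P (1 + TMR). The function name `mtj_resistances` is hypothetical, and the Verilog-A model of [51] may compute these values differently.

```python
import math

# Values taken from Table 1.
SIDE_NM = 32.0        # p-MTJ dimensions: 32 nm x 32 nm x pi/4
RA_OHM_UM2 = 5.0      # resistance-area product in ohm * um^2
TMR0 = 2.0            # TMR(0) = 200 %

def mtj_resistances(side_nm, ra_ohm_um2, tmr):
    """Return (R_P, R_AP) in ohms, assuming R_P = RA / area."""
    side_um = side_nm * 1e-3
    area_um2 = side_um * side_um * math.pi / 4.0
    r_p = ra_ohm_um2 / area_um2
    r_ap = r_p * (1.0 + tmr)          # Eq. (1) solved for R_AP
    return r_p, r_ap

r_p, r_ap = mtj_resistances(SIDE_NM, RA_OHM_UM2, TMR0)
print(f"R_P  = {r_p / 1e3:.2f} kOhm")
print(f"R_AP = {r_ap / 1e3:.2f} kOhm")
```

With these assumptions R_P comes out at roughly 6 kΩ and R_AP at about three times that value.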

Process and mismatch variations of the CMOS transistors, RP and RAP can seriously change the performance of the write circuit. The resistances RP and RAP are influenced by various parameters such as device dimensions, material properties, and TMR [55]. These resistances in turn affect the node voltages (VH and VL) at R (Fig. 6), which directly govern the performance of the write circuit. To study this effect at the design stage, we have incorporated 3% variations in the extracted MTJ parameters TMR, tox and tsl, which follow a Gaussian distribution, and performed a Monte Carlo (MC) simulation of 200 runs for both the self write-terminated and the conventional write circuits, as shown in Table 3. From Table 3 we can see that the wastage of energy in the self write-terminated driver is less than in the conventional write circuit for all of the min, max and mean cases. Although the difference in average energy consumed/bit for the write driver is small in the max case, it is considerably larger in the min and mean cases. Hence, the self write-terminated driver is superior to the conventional write circuit and eliminates the wastage of write energy.
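To give a feel for how a 3% Gaussian spread in TMR propagates to the detector, the sketch below runs a 200-sample Monte Carlo on the node-R swing VH − VL from Eqs. (2) and (3). It is a deliberately reduced stand-in for the Cadence simulation: only TMR is varied, R_AP = R_P (1 + TMR) is assumed, and tox, tsl and CMOS mismatch are left out because their effect depends on the device model of [51]. The helper name `sample_swing` and the fixed seed are hypothetical choices.

```python
import random
import statistics

def sample_swing(vdda, tmr_nominal, rel_sigma, rng):
    """Draw one TMR sample and return the swing V_H - V_L at node R."""
    tmr = rng.gauss(tmr_nominal, rel_sigma * tmr_nominal)
    v_h = vdda * (1.0 + tmr) / (2.0 + tmr)   # Eq. (2)
    v_l = vdda / (2.0 + tmr)                 # Eq. (3)
    return v_h - v_l

rng = random.Random(0)                        # fixed seed for repeatability
swings = [sample_swing(1.3, 2.0, 0.03, rng) for _ in range(200)]

print(f"mean swing = {statistics.mean(swings):.4f} V")
print(f"std dev    = {statistics.stdev(swings):.4f} V")
print(f"min / max  = {min(swings):.4f} V / {max(swings):.4f} V")
```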

Table 2. Performance comparison of write circuits.

| | Conventional write circuit [Fig. 5] | Proposed write driver [Fig. 6] |
|---|---|---|
| Self write-termination | No | Yes |
| Monitoring process | N/A | Continuous |
| Number of transistors | 20 | 69 |
| Control circuitry energy/bit (a) | 1.28 fJ | 11.79 fJ |
| Write energy/bit for bit "1" | 861.9 fJ | 457 fJ |
| Write energy/bit for bit "0" | 861.9 fJ | 433.1 fJ |
| Write energy/bit, when IN = bit stored in MTJ pair | 861.9 fJ | 24.34 fJ |
| Average write energy/bit (b) | 861.9 fJ | 304.81 fJ |
| Total energy/bit (a + b) | 863.18 fJ | 316.6 fJ |
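The headline percentages can be cross-checked directly from the Table 2 entries. The short script below is only an arithmetic check: it averages the three write cases, adds the control-circuit energy and computes the relative reductions. The dictionary keys and helper names are hypothetical.

```python
# Energy per bit in fJ, copied from Table 2.
conventional = {"control": 1.28, "bit1": 861.9, "bit0": 861.9, "same": 861.9}
proposed = {"control": 11.79, "bit1": 457.0, "bit0": 433.1, "same": 24.34}

def average_write(energy):
    """Mean of the three write cases: bit "1", bit "0", IN equal to stored bit."""
    return (energy["bit1"] + energy["bit0"] + energy["same"]) / 3.0

def reduction_pct(old, new):
    """Relative reduction of `new` with respect to `old`, in percent."""
    return 100.0 * (old - new) / old

avg_conv = average_write(conventional)
avg_prop = average_write(proposed)
total_conv = avg_conv + conventional["control"]
total_prop = avg_prop + proposed["control"]

print(f"average write energy/bit: {avg_conv:.2f} fJ -> {avg_prop:.2f} fJ "
      f"({reduction_pct(avg_conv, avg_prop):.2f} % lower)")
print(f"total energy/bit:         {total_conv:.2f} fJ -> {total_prop:.2f} fJ "
      f"({reduction_pct(total_conv, total_prop):.2f} % lower)")
```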

4.3. Integration of proposed write driver with hybrid full adder

To study its synchronous behavior with the LIM structure, we have integrated our write driver with the full adder proposed in Ref. [13], as shown in Fig. 9a. The proposed write driver has three points of contact, P, Q and R, with the hybrid full adder circuit. These are the same points as shown in Fig. 6. The full adder has three inputs A, B, Cin and two outputs SUM and CARRY. Input Cin is stored in a pair of MTJs (MTJ0-MTJ1) using the self write-terminated driver, imparting non-volatility to the full adder. Simulation results (Fig. 9b) confirm the proper working of the proposed write driver with the full adder developed using the hybrid MTJ/CMOS LIM structure. For instance, when the clock (CLK) is "1" and "ABCin" = "011", SUM and CARRY are "0" and "1" respectively.

only eliminates the write energy wastage but also increases the endurance of MTJs.

4.1. Analysis of switching probability

We have analyzed the switching behavior of the proposed write driver under different conditions by varying Vdda, the writing core transistor width (W) and the writing duration (the time for which EN = "1"), as illustrated in Fig. 8. In Fig. 8a and b, region-A depicts the safe operating points where the proposed driver switches correctly without any errors. This is because a larger Vdda and wider writing core transistors generate a higher IW density in the p-MTJs, thereby switching them deterministically, as described by the Sun model [52,53]. In Fig. 8c, wider writing core transistors and a longer write duration allow IW to flow for a longer period, establishing p-MTJ switching as described by the Neel-Brown model [54]. The points outside region-A indicate that the proposed write driver either may not switch the p-MTJs or switches them partially, producing erroneous results.

5. Conclusion

The hybrid MTJ/CMOS LIM structure is considered a candidate to replace the conventional Von-Neumann architecture owing to its advantages. The main hurdle, however, is the wastage of write energy during the MTJ write process. An effective way to eliminate this wastage is to develop self write-terminating drivers for hybrid circuits. For the first time, a power-efficient write driver has been proposed for the hybrid STT-MTJ/CMOS LIM structure. Thanks to the continuous monitoring mechanism adopted here

Fig. 8. Switching probability (a) by varying Vdda and W while EN = 10 ns, (b) by varying Vdda and EN while W = 480 nm, (c) by varying W and EN while Vdda = 1.3 V for proposed self write-terminated driver.

Table 3. Comparison between the conventional write circuit and the proposed write driver for the energy consumption in terms of MC simulation of 200 runs.

| Circuit type | Particulars of energy consumed/bit | Min (fJ) | Max (fJ) | Mean (fJ) | Standard deviation (fJ) |
|---|---|---|---|---|---|
| Conventional write circuit | Control circuitry | 1.242 | 1.341 | 1.291 | 0.018 |
| Conventional write circuit | Write driver* | 628.9 | 931.7 | 812.2 | 64.67 |
| Conventional write circuit | Total | 630.2 | 933 | 813.5 | 64.67 |
| Proposed write driver | Control circuitry | 3.17 | 25.13 | 11.35 | 3.88 |
| Proposed write driver | Write driver* | 283.7 | 908.2 | 473.4 | 149.9 |
| Proposed write driver | Total | 297.8 | 917 | 484.8 | 148.8 |

* Average value.

Fig. 9. (a) Proposed write driver showing the integration with hybrid MTJ/CMOS LIM full adder [13]. (b) SUM and CARRY output waveform of full adder showing the synchronous working.

which self-terminates the writing process soon after the MTJs are written, a significant saving in write energy is achieved, along with an improvement in the endurance of the MTJs.

[21] Kaushik BK, Verma S, Kulkarni AA, Prajapati S. Next Generation Spin Torque Memories. Singapore: Springer; 2017. doi: https://doi.org/10.1007/978-981-10-2720-8.
[22] Devolder T, Hayakawa J, Ito K, Takahashi H, Ikeda S, Crozat P, Zerounian N, Kim J-V, Chappert C, Ohno H. Single-shot time-resolved measurements of nanosecond-scale spin-transfer induced switching: stochastic versus deterministic aspects. Phys. Rev. Lett. 2008;100(5):057206. doi: https://doi.org/10.1103/PhysRevLett.100.057206.
[23] Iga F, Yoshida Y, Ikeda S, Hanyu T, Ohno H, Endoh T. Time-resolved switching characteristic in magnetic tunnel junction with spin transfer torque write scheme. Jpn. J. Appl. Phys. 2012;51(2):02BM02. doi: https://doi.org/10.1143/jjap.51.02bm02.
[24] Suzuki D, Natsui M, Mochizuki A, Hanyu T. Cost-efficient self-terminated write driver for spin-transfer-torque RAM and logic. IEEE Trans. Magn. 2014;50(11):1–4. doi: https://doi.org/10.1109/TMAG.2014.2322387.
[25] Bishnoi R, Ebrahimi M, Oboril F, Tahoori MB. Improving write performance for STT-MRAM. IEEE Trans. Magn. 2016;52(8):1–11. doi: https://doi.org/10.1109/TMAG.2016.2541629.
[26] Bishnoi R, Oboril F, Ebrahimi M, Tahoori MB. Self-timed read and write operations in STT-MRAM. IEEE Trans. Very Large Scale Integr. VLSI Syst. 2016;24(5):1783–93. doi: https://doi.org/10.1109/TVLSI.2015.2496363.
[27] Sayed N, Bishnoi R, Oboril F, Tahoori MB. A cross-layer adaptive approach for performance and power optimization in STT-MRAM. IEEE; 2018. p. 791–6. doi: https://doi.org/10.23919/DATE.2018.8342114.
[28] K. Monga, A. Malhotra, N. Chaturvedi, S. Gurunayaranan, A novel low power non-volatile SRAM cell with self write termination, in: 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), 2019, pp. 1–4. doi:10.1109/ICCCNT45670.2019.8944846.
[29] Gupta MK, Hasan Mohd. Self-terminated write-assist technique for STT-RAM. IEEE Trans. Magn. 2016;52(8):1–6. doi: https://doi.org/10.1109/TMAG.2016.2542785.
[30] Farkhani H, Tohidi M, Peiravi A, Madsen JK, Moradi F. STT-RAM energy reduction using self-referenced differential write termination technique. IEEE Trans Very Large Scale Integr. VLSI Syst. 2017;25(2):476–87. doi: https://doi.org/10.1109/TVLSI.2016.2588585.
[31] R. Bishnoi, M. Ebrahimi, F. Oboril, M.B. Tahoori, Asynchronous Asymmetrical Write Termination (AAWT) for a low power STT-MRAM, in: 2014 Design, Automation Test in Europe Conference Exhibition (DATE), 2014, pp. 1–6. doi:10.7873/DATE.2014.193.
[32] Bishnoi R, Oboril F, Ebrahimi M, Tahoori MB. Avoiding Unnecessary Write Operations in STT-MRAM for Low Power Implementation. IEEE; 2014. p. 548–53. doi: https://doi.org/10.1109/ISQED.2014.6783375.
[33] Zhou P, Zhao B, Yang J, Zhang Y. Energy reduction for STT-RAM using early write termination. In: 2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers. p. 264–8.
[34] Zhang D, Zeng L, Wang G, Zhang Y, Zhang Y, Klein JO, Zhao W. High-speed, low-power, and error-free asynchronous write circuit for STT-MRAM and logic. IEEE Trans. Magn. 2016;52(8):1–4. doi: https://doi.org/10.1109/TMAG.2016.2539519.
[35] Lakys Y, Zhao WS, Devolder T, Zhang Y, Klein J-O, Ravelosona D, Chappert C. Self-enabled "Error-Free" switching circuit for spin transfer torque MRAM and logic. IEEE Trans. Magn. 2012;48(9):2403–6. doi: https://doi.org/10.1109/TMAG.2012.2194790.
[36] Joshi VK. Spintronics: A contemporary review of emerging electronics devices. Eng. Sci. Technol. 2016;19(3):1503–13. doi: https://doi.org/10.1016/j.jestch.2016.05.002.
[37] Ikeda S, Hayakawa J, Ashizawa Y, Lee YM, Miura K, Hasegawa H, Tsunoda M, Matsukura F, Ohno H. Tunnel magnetoresistance of 604% at 300K by suppression of Ta diffusion in CoFeB/MgO/CoFeB pseudo-spin-valves annealed at high temperature. Appl. Phys. Lett. 2008;93(8):082508-1–8-3. doi: https://doi.org/10.1063/1.2976435.
[38] Wolf SA, Awschalom DD, Buhrman RA, Daughton JM, von Molnár S, Roukes ML, Chtchelkanova AY, Treger DM. Spintronics: a spin-based electronics vision for the future. Science 2001;294(5546):1488–95. doi: https://doi.org/10.1126/science.1065389.
[39] Wang Jianguo, Freitas PP. Low-current blocking temperature writing of double-barrier mram cells. IEEE Trans. Magn. 2004;40(4):2622–4. doi: https://doi.org/10.1109/TMAG.2004.834239.
[40] Prejbeanu IL, Kerekes M, Sousa RC, Sibuet H, Redon O, Dieny B, Nozières JP. Thermally assisted MRAM. J. Phys.: Condens. Matter 2007;19(16):165218. doi: https://doi.org/10.1088/0953-8984/19/16/165218.
[41] Liu L, Pai C-F, Li Y, Tseng HW, Ralph DC, Buhrman RA. Spin-torque switching with the giant spin hall effect of tantalum. Science 2012;336(6081):555–8. doi: https://doi.org/10.1126/science.1218197.
[42] Hirsch JE. Spin hall effect. Phys. Rev. Lett. 1999;83(9):1834–7. doi: https://doi.org/10.1103/PhysRevLett.83.1834.
[43] Shiota Y, Nozaki T, Bonell F, Murakami S, Shinjo T, Suzuki Y. Induction of coherent magnetization switching in a few atomic layers of FeCo using voltage pulses. Nat. Mater. 2012;11(1):39–43. doi: https://doi.org/10.1038/nmat3172.
[44] Wang W-G, Li M, Hageman S, Chien CL. Electric-field-assisted switching in magnetic tunnel junctions. Nat. Mater. 2011;11(1):64–8. doi: https://doi.org/10.1038/nmat3171.
[45] Ralph DC, Stiles MD. Spin transfer torques. J. Magn. Magn. Mater. 2008;320(7):1190–216. doi: https://doi.org/10.1016/j.jmmm.2007.12.019.
[46] Diao Z, Li Z, Wang S, Ding Y, Panchula A, Chen E, Wang L-C, Huai Y. Spin-transfer torque switching in magnetic tunnel junctions and spin-transfer

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] Zhao W, Prenat G. Spintronics-based Computing. Berlin, Germany: Springer International Publishing; 2015. doi: https://doi.org/10.1007/978-3-319-15180-9.
[2] W. Kang, E. Deng, Z. Wang, W. Zhao, Spintronic Logic-in-memory paradigms and implementations, in: M. Suri (Ed.), Applications of Emerging Memory Technology, vol. 63, Springer Series in Advanced Microelectronics, 2020, pp. 215–229. doi:10.1007/978-981-13-8379-3-9.
[3] Kim NS, Austin T, Blaauw D, Mudge T, Hu JS, Irwin MJ, Kandemir M, Narayanan V, et al. Leakage Current: Moore’s Law Meets Static Power. Computer 2003;36(12):68–75. doi: https://doi.org/10.1109/MC.2003.1250885.
[4] Kang W, Zhang Y, Wang Z, Klein J-O, Chappert C, Ravelosona D, Wang G, Zhang Y, Zhao W. Spintronics: Emerging ultra-low-power circuits and systems beyond mos technology. ACM J. Emerg. Technol. Comput. Syst. (JETC) 2015;12(2):1–42. doi: https://doi.org/10.1145/2663351.
[5] Wulf WA, McKee SA. Hitting the memory wall: implications of the obvious. ACM SIGARCH Comput. Architec. News 1995;23(1):20–4.
[6] Endoh T, Koike H, Ikeda S, Hanyu T, Ohno H. An overview of nonvolatile emerging memories– Spintronics for working memories. IEEE J. Emerging Sel. Top. Circuits Syst. 2016;6(2):109–19. doi: https://doi.org/10.1109/JETCAS.2016.2547704.
[7] E. Deng, Design and development of low-power and reliable logic circuits based on spin-transfer torque magnetic tunnel junctions, Ph.D. thesis, Université Grenoble Alpes (France), 2017.
[8] Lim H, Lee S, Shin H. A survey on the modeling of magnetic tunnel junctions for circuit simulation. Act. Passive Electron. Compon. 2016;2016:1–12. doi: https://doi.org/10.1155/2016/3858621.
[9] Ahmed I, Zhao Z, Mankalale MG, Sapatnekar SS, Wang J-P, Kim CH. A comparative study between spin-transfer-torque and spin-hall-effect switching mechanisms in PMTJ using SPICE. IEEE J. Explor. Solid-State Comput. Devices Circ. 2017;3:74–82. doi: https://doi.org/10.1109/JXCDC.2017.2762699.
[10] Matsunaga S, Hayakawa J, Ikeda S, Miura K, Endoh T, Ohno H, Hanyu T. MTJ-based nonvolatile logic-in-memory circuit, future prospects and issues. In: Proc. Conf. Design, Autom. Test Europe. p. 433–5. doi: https://doi.org/10.1109/DATE.2009.5090704.
[11] Deng E, Zhang Y, Klein J-O, Ravelosona D, Chappert C, Zhao W. Low power magnetic full-adder based on spin transfer torque MRAM. IEEE Trans. Magn. 2013;49(9):4982–7. doi: https://doi.org/10.1109/TMAG.2013.2245911.
[12] Guo W, Prenat G, Dieny B. A novel architecture of non-volatile magnetic arithmetic logic unit using magnetic tunnel junctions. J. Phys. D: Appl. Phys. 2014;47(16):165001. doi: https://doi.org/10.1088/0022-3727/47/16/165001.
[13] Barla P, Joshi VK, Bhat S. A novel low power and reduced transistor count magnetic arithmetic logic unit using hybrid STT-MTJ/CMOS circuit. IEEE Access 2020;8:6876–89. doi: https://doi.org/10.1109/ACCESS.2019.2963727.
[14] Deng EY, Prenat G, Anghel L, Zhao WS. Non-volatile magnetic decoder based on MTJs. Electron. Lett. 2016;52(21):1774–6. doi: https://doi.org/10.1049/el.2016.2450.
[15] Zhao W, Moreau M, Deng E, Zhang Y, Portal J-M, Klein J-O, Bocquet M, Aziza H, Deleruyelle D, Muller C, Querlioz D, Romdhane NB, Ravelosona D, Chappert C. Synchronous non-volatile logic gate design based on resistive switching memories. IEEE Trans. Circ. Syst. I: Regul. Pap. 2014;61(2):443–54. doi: https://doi.org/10.1109/TCSI.2013.2278332.
[16] Barla P, Shet D, Joshi VK, Bhat S. Design and analysis of lim hybrid mtj/cmos logic gates. In: 2020 5th International Conference on Devices, Circuits and Systems (ICDCS). p. 41–5. doi: https://doi.org/10.1109/ICDCS48716.2020.243544.
[17] Y. Wang, H. Cai, L.A.B. Naviner, J.-O. Klein, J. Yang, W. Zhao, A novel circuit design of true random number generator using magnetic tunnel junction, 2016, pp. 123–128. doi:10.1145/2950067.2950108.
[18] Wang C, Zhang D, Zeng L, Deng E, Chen J, Zhao W. A novel MTJ-based nonvolatile ternary content-addressable memory for high-speed, low-power, and high-reliable search operation. IEEE Trans. Circ. Syst. I 2018;66(4):1454–64. doi: https://doi.org/10.1109/TCSI.2018.2885343.
[19] Wang C, Zhang D, Zeng L, Zhao W. Design of magnetic non-volatile TCAM with priority-decision in memory technology for high speed, low power, and high reliability. IEEE Trans. Circ. Syst. I 2019;67(2):464–74. doi: https://doi.org/10.1109/TCSI.2019.2929796.
[20] Rajaei R, Fazeli M, Tabandeh M. Soft error-tolerant design of MRAM-based nonvolatile latches for sequential logics. IEEE Trans. Magn. 2015;51(6):1–14. doi: https://doi.org/10.1109/TMAG.2014.2375273.



torque random access memory. J. Phys.: Condens. Matter 2007;19(16):165209. doi: https://doi.org/10.1088/0953-8984/19/16/165209.
[47] Spin-transfer Torque MRAM Products | Everspin (2020). https://www.everspin.com/spin-transfer-torque-mram-products.
[48] Slonczewski JC. Current-driven excitation of magnetic multilayers. J. Magn. Magn. Mater. 1996;159(1):L1–7. doi: https://doi.org/10.1016/0304-8853(96)00062-5.
[49] Berger L. Emission of spin waves by a magnetic multilayer traversed by a current. Phys. Rev. B 1996;54(13):9353–8. doi: https://doi.org/10.1103/PhysRevB.54.9353.
[50] Huai Y, Albert F, Nguyen P, Pakala M, Valet T. Observation of spin-transfer switching in deep submicron-sized and low-resistance magnetic tunnel junctions. Appl. Phys. Lett. 2004;84(16):3118–20. doi: https://doi.org/10.1063/1.1707228.
[51] Y. Wang, Y. Zhang, W. Zhao, J.-O. Klein, T. Devolder, D. Ravelosona, C. Chappert, Compact model for perpendicular magnetic anisotropy magnetic tunnel junction, doi:10.4231/D3154DQ21. https://www.researchgate.net/publication/309355960.
[52] Koch RH, Katine JA, Sun JZ. Time-resolved reversal of spin-transfer switching in a nanomagnet. Phys. Rev. Lett. 2004;92(8):088302. doi: https://doi.org/10.1103/PhysRevLett.92.088302.
[53] Heindl R, Rippard WH, Russek SE, Pufall MR, Kos AB. Validity of the thermal activation model for spin-transfer torque switching in magnetic tunnel junctions. J. Appl. Phys. 2011;109(7):073910. doi: https://doi.org/10.1063/1.3562136.
[54] Faber L, Zhao W, Klein J, Devolder T, Chappert C. Dynamic compact model of Spin-Transfer Torque based Magnetic Tunnel Junction (MTJ). In: 2009 4th International Conference on Design Technology of Integrated Systems in Nanoscal Era. p. 130–5. doi: https://doi.org/10.1109/DTIS.2009.4938040.
[55] Joshi K V, Barla P, Bhat S, Kaushik K B. From MTJ device to hybrid CMOS/MTJ circuits: A review. IEEE Access 2020;8:194105–46. doi: https://doi.org/10.1109/ACCESS.2020.3033023.

Prashanth Barla received the B.E. degree in Electronics and Communication Engineering and the M.Tech. degree in Microelectronics and Control Systems from Visvesvaraya Technological University, Belgaum, Karnataka, India. He is currently pursuing a Ph.D. in the Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India. His research interests include VLSI design and spintronics. He is working under the guidance of Prof. Vinod Kumar Joshi and Prof. Somashekara Bhat on hybrid MTJ/CMOS circuits based on the logic-in-memory structure. He is an IEEE student member.

Vinod Kumar Joshi received the Ph.D. degree from Kumaun University, Nainital, India, and the M.Tech. degree from VIT University, Vellore, India. He is currently an Associate Professor with the Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal, India. His main research interests include spintronics-based VLSI and logic-in-memory based hybrid non-volatile logic circuits and their use in low-power applications. He is a member of the Institution of Engineering and Technology (IET), UK, and a life member of the Indian Society of Systems for Science and Engineering (LMISSE-00361), VSSC-ISRO, Trivandrum, India.

Somashekara Bhat, Ph.D., is currently serving as Professor in the Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India. He received the Ph.D. degree in the field of MEMS from the Indian Institute of Technology, Madras, India. His areas of interest are MEMS and electronics for biomedical applications.
