Microprocessors and Microsystems 72 (2020) 102937
Power transition X filling based selective Huffman encoding technique for test-data compression and scan power reduction for SOCs

Lokesh Sivanandam∗, Sakthivel Periyasamy, Uma Maheswari Oorkavalan
Department of Electronics and Communication Engineering, Anna University, Chennai 600025, India
Article history: Received 6 September 2019; Revised 21 October 2019; Accepted 29 October 2019; Available online 31 October 2019.

Keywords: Test data compression; Power transition X filling; System-on-a-chip (SOC); Testing power; X filling; Selective Huffman encoding; Chip area overhead
Abstract

Owing to the ever-growing demands on memory, data compression remains an evergreen research topic. Recognizing the constant demand for compression algorithms, this article presents a compression algorithm and analyses digital VLSI circuits for the optimization of constraints such as test data volume, switching power, chip area overhead and testing speed. The article proposes a new power transition X filling based selective Huffman encoding technique, which achieves better data compression, switching power reduction, chip area overhead reduction and testing speed. The performance of the proposed work is examined with the help of the ISCAS benchmark circuits. First, the don't care bits of the test set are replaced using the power transition X filling technique; the filled test set is then encoded by the selective Huffman encoding technique. The experimental results show that the proposed power transition X filling based selective Huffman encoding gives better results than related data compression techniques, with minimal time and memory consumption. © 2019 Elsevier B.V. All rights reserved.
1. Introduction

Every product development is subject to numerous constraints, and this holds for all designs. Hence, it is mandatory to analyse these constraints both before and after product development, as they may affect product performance and quality. A cost-effective product requires proper design and testing. Testing is a vital phase of every product development: it examines these constraints so that quality and effectiveness are measured before shipping to the market. Similarly, VLSI chip design has many issues that are examined by testing, such as application testing time, memory, power consumption and area overhead. In digital circuits, a System-on-a-Chip (SOC) uses pre-verified and predefined modules to achieve these goals. One of the major issues in SOCs is handling the large amount of test data during testing of the device on the Automatic Test Equipment (ATE). Because of this vast test data volume, the limited number of input pins of a device may become a bottleneck during processing, which increases the application testing time. To avoid this problem, the input test data is applied in compressed form to the input pins of the SOC. Chip design using pre-verified and predefined modules reduces the development time of the chips but increases the
∗ Corresponding author. E-mail address: [email protected] (L. Sivanandam).
https://doi.org/10.1016/j.micpro.2019.102937
complexity of the system-on-a-chip and creates a bottleneck. This issue is handled by transforming the test sets into compressed form and storing them in the ATE memory. The compressed test patterns pass to each core of the system-on-a-chip, where they are decompressed back into the original test set. A single chip contains millions of transistors, which increases the complexity of the chip when processing big test data sets. This complexity arises when the scan chain contains too many switching transitions in the data set. Another issue in SOCs is handling bottleneck conditions while processing big test data sets without affecting the size of the chip design. Increasing the number of cores in a chip design increases the hardware area overhead, which is a major issue in chip design. Chips should therefore be designed with a small area overhead and without creating any bottleneck. Limiting the hardware area, the testing application speed and the memory management all depend on an effective 'X' filling approach. 'X' filling reduces the switching power used in the device testing process, so that physical damage from the overheating caused by switching activity can be avoided. The compression of test data helps to minimize the tester memory, increase the processing speed, limit the pins of the chips and prevent physical damage to the chips. The speed of device testing always depends on the amount of data transferred to each core of the chip and on how fast the input test set is decompressed.
This paper presents a power transition X filling based selective Huffman encoding technique to compress the test data, reduce the area overhead, increase the operational frequency and reduce the switching power, and compares the results with various 'X' filling techniques such as 0s filling, 1s filling and random filling. The work consists of two parts: the first is power transition X filling and the second is selective Huffman encoding. Using power transition X filling, the don't care bits of the test set are filled; the filled test set is then divided into fixed-size blocks, whose frequencies are computed, and stored in the ATE memory. The compressed test data are then transferred to each core of the SOC, where they are decompressed. The selective Huffman encoding technique assigns unique code words by constructing the Huffman tree of the most frequent blocks of the test patterns. This power transition X filling based selective Huffman coding is more efficient than other techniques in reducing test application time, area overhead, test power and test data volume. The approach does not require any additional test setup, as it uses the existing decoder hardware for decompression. The major contributions of this work are as follows:

1. Test data compression is enhanced using the power transition X filling based selective Huffman technique.
2. A significant amount of application testing time and tester memory while processing the test set is saved.
3. The scan-in power and chip area overhead of digital VLSI circuits are minimized.

The paper is organized as follows. Section 2 reviews related work on test data compression and X filling techniques. Section 3 presents the proposed power transition X filling algorithm. Section 4 describes the selective Huffman encoding technique and the power and compression metrics used. Section 5 presents the decompression architecture, Section 6 discusses the experimental results and Section 7 concludes the paper.

2. Related works

In digital electronics, proper application testing techniques for system-on-chips play a major role in test power optimization and chip area reduction. In most cases, the issues are caused by poor 'X' filling of the test set. These 'X' filling techniques, which fill the don't care bits (unspecified bits, represented as 'X') with either 0 or 1, determine the compression ratio. An efficient 'X' filling technique can greatly reduce the amount of switching power used during test application. Many approaches have been proposed for test data compression to enhance the compression ratio, tester memory management, switching power reduction and operational frequency. The power requirement during application testing is higher than in the normal mode of operation. This happens because of excessive switching in the system-on-chip caused by poor 'X' filling. The high switching rate increases the temperature of the chips, which leads to physical damage of the product and increases the cooling cost, as discussed by Girard P. et al. [1], who address power consumption and its critical issues in digital VLSI circuits. One common approach to fill the don't care bits is to replace them all with either 1s or 0s.
These two approaches reduce the switching activity in test sets, which reduces the switching power, but they do not reduce the power dissipation by much. J. Aerts et al. [2] proposed a technique that reduces chip area overhead in line with the reduction in test data volume; it addresses the maximum utilization of input cores and tester memory but is not concerned with the compression mechanism. Test data compression and testing power reduction using run-length filling with Huffman encoding was proposed by Mehrdad Nourani et al. [3], but it does not address chip area overhead reduction. Gonciari P. et al. [4] studied the influence of three test data compression parameters, i.e. compression ratio, application processing time and chip area overhead, using a variable-length Huffman coding technique, but the compression ratio, chip area overhead and testing time are not optimal. The statistical coding approach proposed by Abhijit Jas et al. [5] assigns variable-length code words to fixed-size blocks of a test set to maximize the compression ratio and minimize the area overhead compared with earlier approaches [2-5]; however, the selective Huffman technique is efficient only if the frequencies of the test set blocks exceed a certain range. Optimal selective Huffman coding was proposed by Kavousianos X. et al. [9] to achieve better compression than selective Huffman encoding [5], maximizing the compression ratio and minimizing testing time and area overhead. Golomb codes, frequency-directed run-length (FDR) codes and alternating run-length codes were proposed by A. Chandra et al. in [6], [8] and [7] for efficient test data compression. The test data compression and decompression technique using Golomb codes [6] is more efficient than the run-length based Huffman technique but less efficient than variable-length Huffman coding [4], selective Huffman coding [5] and optimal selective Huffman coding [9]. Frequency-directed run-length coding [8] gives a better compression ratio than Golomb codes [6], but it only deals with the distribution of 0s for encoding. Alternating run-length coding [7] is more efficient in test data compression, application testing time and scan-in power than FDR coding [8], but it does not address the area overhead issue. A significant switching reduction is achieved using a Linear Feedback Shift Register (LFSR) with Memory Built-in Self-Test (MBIST), proposed by C.V. Krishna et al. [10], which enhances test data compression considerably but deals only with test data compression. Extended frequency-directed run-length coding, proposed by A.H. El-Maleh et al. [11], deals with both the 0s and the 1s distributions for encoding and is more effective for test data compression than FDR coding [8], but it does not address area overhead or application testing time. Equal run-length coding for test data compression and chip area overhead reduction was proposed by Zhan W. et al. [12]; it deals with both 0s and 1s, maximizes the equal consecutive runs and applies same-length codewords to those consecutive runs. A. Jas et al. [13] proposed a technique for test data compression and application testing time reduction that supports high-speed shifting of test bits to the SOC cores with a slower clock rate. K. Murali Krishna et al. [14] examined the relationship between input bandwidth optimization and application testing time as a function of the operating frequency; this approach is effective in reducing switching activity but is costly. Hybrid X filling and two-stage compression techniques were proposed by K. Thilagavathi et al. [15] for effective test data reduction using adjacent X filling [28] and modified 4m X filling [29], with results better than the previous approaches [3-13]. A test data compression approach based purely on basic geometric shapes, such as lines, triangles and squares, was proposed by A. El-Maleh et al. [16]; it is not optimal for chip area overhead reduction or scan-in power reduction. Dorsch et al. [17] proposed reducing the test data bandwidth by reusing the SOC cores, which enhances the test data compression to the ATE. A hybrid Built-in Self-Test (BIST) based data compression technique for VLSI circuits was proposed by D. Das et al. [18]; it reduces the tester memory but fails to optimize the application testing time and scan-in power.
V. Krishna et al. [19] proposed a partial LFSR reseeding technique that is effective for managing tester memory, but its drawback is a higher computational time. A. Jas et al. [20] proposed a hybrid resource partitioning scheme for tester memory management that is effective for area overhead reduction but has a longer application testing time. Sivanantham S. et al. [21] proposed CSP filling, Trinadh A. et al. [22] proposed an 'X' filling method based on dynamic programming, and [23] proposed the DP filling technique; these effectively reduce the average power and peak power during application testing but are not concerned with area overhead. Multilevel Huffman coding, proposed by Kavousianos X. et al. [24], reduces the application testing time by introducing a parallel decompression architecture. In [31], an efficient VLSI architecture for the parallel dictionary Lempel-Ziv-Welch (LZW) data compression algorithm was presented; it utilizes multiple dictionaries to speed up the encoding process. A VLSI implementation of a lossy-to-lossless long-term monitoring (LTM) Electrocardiogram (ECG) compression framework was proposed in [32]; it employs the lifting discrete wavelet transform to produce the coefficients of the tail bits and the bits truncated by considering the information loss, and the processed coefficients are encoded by a modified run-length code. A compression-based line buffer design for image and video processing circuits is presented in [33], which effectively utilizes variable-length codes in the line buffer architecture. An orthonormalized multi-stage discrete fast Stockwell transform based VLSI implementation for image compression was presented in [34]; its transform unit performs the split, predict and update operations with respect to the odd samples, which speeds up the entire compression process.

3. Power transition X filling algorithm

Usually, the test set generated by an automatic test pattern generation (ATPG) toolkit contains long series of 0s, 1s and don't care bits (specified as 'X') in the form of an m×n matrix, and each don't care bit is replaced by either 0 or 1 in order to maximize the compression ratio. Previous research has proposed various other X filling techniques for the don't care bits but, owing to their inefficiency in compression ratio and area overhead, here we propose the power transition X filling algorithm, a new X filling mechanism that maximizes the compression. The proportion of don't care bits in the input test set is quite high, and the power transition X filling technique fills each of them according to its neighboring bits. The algorithm works as follows. Consider the test set Ti = {XXXXX10XX10X}, where the first bit is T1, the second bit is T2 and l is the length of the test set. T2 is compared with T1: if they are equal (or T2 is an X), the scan advances and T3 is compared next; when a bit is reached that breaks the run, the bits of the run are filled with the NOT value of that breaking bit. In the example, T1 is equal up to T5 (all are X bits) and T6 = 1 sets the run value to 1; T7 = 0 then breaks the run, so T1 to T6 of Ti are filled with 1. The scan then restarts with T7 taking the role of T1 and T8 the role of T2; they are compared and the X bits are filled as before. The algorithm scans every bit of the test set Ti in this way up to its length l. After the X filling completes, the result is Ti = {111111000100}. The power transition X filling algorithm is listed below.

Input: unfilled test set (m×n), T = {j1, j2, j3, …, jm−1, jm, jm+1, …, jn}
Output: X filled test data set
Step 1: Start
Step 2: Declare variables i, j, FirstBit, NextBit, FillSize, FillBit, TestVector[]
Step 3: Initialize i ← 1 and j ← 0
Step 4: READ FirstBit ← TestVector[j]
Step 5: FOR j = 1 to length of TestVector[]
            READ NextBit ← TestVector[j]
            IF NextBit = FirstBit OR NextBit = X THEN
                INCREMENT i
            ELSE IF FirstBit = X THEN
                INCREMENT i
                IF NextBit = 1 THEN SET FirstBit ← 1
                IF NextBit = 0 THEN SET FirstBit ← 0
                SET FillSize ← 0
            ELSE IF FirstBit = 1 THEN
                SET FillSize ← i
                SET FillBit ← 1
                SET FirstBit ← 0
                FILL TestVector[] with FillBit up to FillSize
                SET i ← 1
            ELSE
                SET FillSize ← i
                SET FillBit ← 0
                SET FirstBit ← 1
                FILL TestVector[] with FillBit up to FillSize
                SET i ← 1
            ENDIF
        ENDFOR
Step 6: Stop

Consider the test set Ti that contains 4 scan vectors (T1, T2, T3 and T4), each of length 64, for a total length of 256 bits. The power transition X filling algorithm is applied to replace the don't cares of this Ti:

Ti =
XX00000001XXX1XX1XXX10000000011X1X0000X00X01XXXXXXXX111001X0X0X11
1111XXXX1XXX00001111X1X1X1XX1XXX1XX100X00X00X0100000000010X00X00
00000X0X0X11110110X01XXXX1XX1XXXXXX1X110001X11111111110X011XXX11
10001X0000000001XXXXXX1X1X1X1X0000X0X00001X0X10001XXXXXXXXX00000

The result of Ti after power transition X filling is as follows:

Ti =
0000000001111111111110000000111110000000001111111111110011000011
1111111111110000111111111111111111110000000000100000000010000000
0000000000111101100011111111111111111110001111111111110001111111
1000110000000001111111111111110000000000011001000111111111100000
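For concreteness, the following is a minimal C sketch of this fill rule. It assumes the behaviour described above: every X inherits the value of the nearest specified bit to its left, and a leading X run inherits the first specified bit, so the fill never introduces a new transition. The function name ptx_fill and the string representation of a scan vector are illustrative choices, not the authors' implementation.

#include <stdio.h>
#include <string.h>

/* Illustrative sketch of the power transition X fill rule: each 'X'
 * inherits the nearest specified bit to its left, and a leading run of
 * 'X's inherits the first specified bit, so no new transition is added. */
static void ptx_fill(char *t)
{
    size_t n = strlen(t);
    size_t i = 0;

    while (i < n && t[i] == 'X')        /* locate the first specified bit */
        i++;
    char first = (i < n) ? t[i] : '0';  /* all-X vector: assume 0 fill */
    for (size_t j = 0; j < i; j++)      /* back-fill the leading X run */
        t[j] = first;

    for (char last = first; i < n; i++) {
        if (t[i] == 'X')
            t[i] = last;                /* extend the previous value */
        else
            last = t[i];
    }
}

int main(void)
{
    char ti[] = "XXXXX10XX10X";         /* example test cube from the text */
    ptx_fill(ti);
    printf("%s\n", ti);                 /* prints 111111000100 */
    return 0;
}

Running this on the example cube reproduces the filled vector {111111000100} given above.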
After completing the X filling, the test set Ti is partitioned into k distinct blocks b1, b2, b3, …, bk with frequencies p1, p2, p3, …, pk, where l is the length of a block b (Table 1). For example, if Ti is partitioned into distinct blocks of length l = 4, then 2^l = 2^4 = 16 distinct blocks are possible, i.e. 0000, 0001, 0010, …, 1111.

4. Selective Huffman encoding

Huffman encoding is a lossless data compression approach that achieves the maximum compression ratio for the test set, but its major drawback is the area overhead: the decompressor architecture requires a large hardware area to encode all of the test set's blocks.
Table 1. Test set partition and frequency calculation.

Test set Ti (partitioned into 4-bit blocks):
0000 1111 0000 1000
0000 0111 1111 1111
1111 0000 0000 0011
1101 0001 1100 0000
1111 1111 1000 1111
1000 1111 1111 1111
0000 1111 1111 1111
1111 1111 1111 1100
1000 1111 1111 0000
0000 0000 1110 0000
0011 0000 0011 0110
1111 0010 1111 0100
1111 0000 1111 0111
1100 0000 1100 1111
1100 1000 0111 1110
0011 0000 1111 0000

Distinct block:             1111  0000  1100  1000  0011  0111  1110  1101  0100  0010  0001  0110
Frequency of occurrence:      24    16     5     5     4     3     2     1     1     1     1     1
Fig. 1. Huffman tree for selective Huffman encoding.
to encode all test set’s blocks. Due to the linear growth of the block size l, the distinct block limit k will exponentially rise so that more numbers of blocks need to be decoded. So we propose a selective Huffman encoding approach to encode only a few distinct blocks of the test set’s blocks. From the blocks b1 , b2 , b3 , . . . ., bk we take only most occurring blocks (f > 3) from Out of k distinct blocks i.e. b1 , b2 , b3 , . . . ., b f (f < k) only encoded selective Huffman encoding remain b f +1 , b f +2 ,. . . ., bk are not encoded. The selective Huffman encoding technique assigns code words (short keywords) to each unique block depending on the frequency of occurring. To assign code words the binary tree is constructed from blocks frequencies that beginning with two lowest frequency blocks and moving toward the root node. From the root node, ‘0 is assigned to every left node and ‘1 is assigned to every right node.
code words to applied to small frequency blocks. As already mentioned, to reduce area overhead the higher frequency blocks (f > 3) only encoded by selective Huffman encoding and apply code words with a prefix of ‘1 . For the lower frequency blocks (f < 3) encoding not applied so that blocks are transferred with the prefix of ‘0 . To optimize the area overhead we apply these selective Huffman encoding techniques for both fixed-size blocks (4 bit, 8 bit, 10 bit and 12 bit). The higher compression possible only if the number of encoded blocks is more than not encoded blocks and l size should be minimum. Fig. 1 shows the Huffman tree for selective Huffman encoding.
The code word obtains by traversing each node of the tree from the root node. Selective Huffman algorithm assigns smaller code words to the high frequency of occurrence of blocks and larger
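As an illustration, the following C sketch builds such a tree for the four most frequent blocks of Table 1 and prints their '1'-prefixed code words. It is a minimal sketch, not the authors' implementation; the resulting code words depend on how frequency ties are broken, so they may differ from those of Fig. 1.

#include <stdio.h>

/* Illustrative sketch of selective Huffman code construction for the four
 * most frequent blocks of Table 1. Only these selected blocks are encoded;
 * all other blocks would be shipped raw behind a '0' prefix. */
#define NSEL 4                      /* number of selected distinct blocks */

struct node { int freq; int left, right; };     /* left/right = -1: leaf */

static void emit_codes(const struct node *t, int i, char *buf, int depth,
                       const char **blocks)
{
    if (t[i].left < 0) {            /* leaf: print block and its codeword */
        buf[depth] = '\0';
        printf("block %s -> codeword 1%s\n", blocks[i], buf);
        return;
    }
    buf[depth] = '0';
    emit_codes(t, t[i].left, buf, depth + 1, blocks);
    buf[depth] = '1';
    emit_codes(t, t[i].right, buf, depth + 1, blocks);
}

int main(void)
{
    const char *blocks[2 * NSEL] = { "1111", "0000", "1100", "1000" };
    int freqs[NSEL] = { 24, 16, 5, 5 };         /* from Table 1 */
    struct node t[2 * NSEL];
    int alive[2 * NSEL], n = NSEL, total = NSEL;

    for (int i = 0; i < NSEL; i++) {
        t[i].freq = freqs[i];
        t[i].left = t[i].right = -1;
        alive[i] = 1;
    }
    while (n > 1) {                 /* merge the two lowest-frequency nodes */
        int a = -1, b = -1;
        for (int i = 0; i < total; i++) {
            if (!alive[i]) continue;
            if (a < 0 || t[i].freq < t[a].freq) { b = a; a = i; }
            else if (b < 0 || t[i].freq < t[b].freq) { b = i; }
        }
        t[total].freq = t[a].freq + t[b].freq;
        t[total].left = a;
        t[total].right = b;
        alive[a] = alive[b] = 0;
        alive[total] = 1;
        total++;
        n--;
    }
    char buf[2 * NSEL];
    emit_codes(t, total - 1, buf, 0, blocks);
    return 0;
}

With the frequencies 24, 16, 5 and 5 this prints, for one tie-breaking order, 1111 → 10, 0000 → 111, 1100 → 1100 and 1000 → 1101: the most frequent block receives the shortest code word.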
4.1. Total switching power, average power, peak power and compression ratio calculation

The switching power reflects the number of times the system's devices switch ON/OFF as a result of adjacent bits of the input test set. The switching power during application testing is generally higher than in the normal mode.
Table 2. Compression ratio, total power, average power and peak power of different X filling techniques for the example pattern.

X Filling Approach           Compression Ratio   Total Power   Average Power   Peak Power
0s filling                   8.20                2690          672             748
1s filling                   6.25                1845          461             648
Random filling               15.35               2180          622             710
Power Transition X filling   17.96               1068          267             366
This is due to the continuous scanning of a large number of differing adjacent bits; consequently, switching activity can be reduced by keeping adjacent bits identical. The switching power is highly influenced by the don't care bits of the input test set and by its length. The scanning power is calculated by Eq. (1).
$WT_{Total} = \sum_{m=1}^{n-1} (n - m)\,\left( T_{i,m} \oplus T_{i,m+1} \right)$        (1)

where n denotes the number of bits per scan vector, the sum runs over the adjacent bit positions m of the scan vector Ti, and ⊕ is the XOR operation, which is 1 whenever two adjacent bits differ. The average power over the m scan vectors of the test set is computed by the following formula.
$\text{Average Power} = \frac{1}{m} \sum_{j=1}^{m} WT_{Total}(j)$        (2)
The peak power is the maximum switching transition in an input test set. Excessive peak power is harmful to VLSI circuits, as it may lead to physical damage of the circuit. Peak power is highly influenced by the input test data set and its length. The peak power is calculated as follows:

$WT_{peak} = \max_{1 \le i \le n} \left\{ WT_{Total}(i) \right\}$        (3)
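A small C sketch of Eqs. (1)-(3) is given below; it is an illustration only, assuming that each scan vector is an already X-filled string of '0'/'1' characters, and the second vector is a hypothetical one added for the example.

#include <stdio.h>
#include <string.h>

/* Illustrative sketch of Eqs. (1)-(3): weighted transitions of each scan
 * vector, their average, and the peak value. */
static long wt_total(const char *v)
{
    long w = 0;
    size_t n = strlen(v);
    for (size_t m = 1; m < n; m++)       /* 1-based adjacent pair (m, m+1) */
        if (v[m - 1] != v[m])            /* T_im XOR T_i(m+1)              */
            w += (long)(n - m);          /* transition weight (n - m)      */
    return w;
}

int main(void)
{
    /* First vector is the filled example; the second is hypothetical. */
    const char *ti[] = { "111111000100", "110000111100" };
    int m = 2;
    long sum = 0, peak = 0;
    for (int j = 0; j < m; j++) {
        long w = wt_total(ti[j]);
        sum += w;
        if (w > peak)
            peak = w;                    /* Eq. (3): peak power */
    }
    printf("average = %ld, peak = %ld\n", sum / m, peak);   /* Eq. (2) */
    return 0;
}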
The compression ratio (CR) for the test set is calculated by the following formula:

$CR = \sum_{i=1}^{n} F_i \,(B - C_i) \Big/ \sum_{m=1}^{n} T_m$        (4)
In the above equation, F, B and C represent the frequency, size and codeword length of each block respectively, and Tm is the total number of bits in the input test set. The power transition X filling based selective Huffman encoding technique is applied to the example test set, and the compression ratio, total power, average power and peak power for the different X filling approaches are tabulated in Table 2. The table shows that power transition X filling gives more compression than the other approaches. The bandwidth utilization in SOC circuits is directly proportional to the compression ratio, and maximum compression helps to reduce the memory required in the ATE to process the input test data set, which also helps to maximize the throughput at the operating frequency. The average scanning power of the SOC depends on the number of switching activities of the circuits.
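As a toy illustration of Eq. (4), the snippet below uses the Table 1 frequencies together with hypothetical codeword lengths; the one-bit prefixes are ignored in this sketch.

#include <stdio.h>

/* Toy illustration of Eq. (4): k encoded block types of size B bits, each
 * occurring F[i] times and replaced by a codeword of C[i] bits. */
int main(void)
{
    int F[] = { 24, 16, 5, 5 };     /* block frequencies (Table 1)   */
    int C[] = { 2, 3, 4, 4 };       /* hypothetical codeword lengths */
    int B = 4, k = 4;
    long total_bits = 256;          /* bits in the example test set  */
    long saved = 0;
    for (int i = 0; i < k; i++)
        saved += (long)F[i] * (B - C[i]);
    printf("CR = %.2f%%\n", 100.0 * (double)saved / (double)total_bits);
    return 0;
}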
The relationship between average power and switching activity is given in Eq. (5):

$P_{avg} = \alpha_T \cdot C_{load} \cdot V^2 \cdot f_{clk}$        (5)

In the above equation, $\alpha_T$ denotes the switching activity, $C_{load}$ represents the total load capacitance, V stands for the voltage and $f_{clk}$ denotes the operating frequency. The decompression architecture is presented in the following section.

5. Decompression architecture

The compression and decompression architecture for our approach [5] is shown in Fig. 2. The on-chip decoder decompresses
the code words into the original test set. The decoder receives the variable-length code word of each unique fixed-size block and regenerates the original test set from these code words. The decoder accepts a serial input of code words from the ATE and replaces them with 0s and 1s on the scan lines. It decodes a variable-length code word into its corresponding fixed-size block only if the code word is prefixed with '1'; otherwise the following bits are sent directly to the scan lines. The generated blocks are then transferred to the scan chains. The serializer plays an important role in the decoder circuit because it provides the degree of parallelism. Parallelism here means handling two or more operations in the chip concurrently, which helps to reduce the testing time. To reduce the testing time further, the ATE never stops transmitting code words to the decoder; that is, it does not wait for the decoder to finish decompression and transfer. The decoder accepts the serial input from the ATE bit by bit at each clock cycle. After accepting a complete code word, the decoder immediately starts to receive the next code word bit by bit while, in parallel, it decompresses the current code word and transfers the decoded output to the scan chains. The decoder thus exploits two forms of parallelism: receiving a code word bit at each ATE clock, and simultaneously decoding the previously received code word, generating the corresponding block and transferring it to the scan chains. The scan chain uses two clocks for this operation: during the first clock cycle the decoder receives the code word from the ATE, and at the second clock cycle the decompressed code words are transferred to the scan chain while the next code word is loaded into the decoder. Since the decoder receives one code word bit per tester clock cycle, it takes N clock cycles in total to receive a code word of length N; after receiving the complete code word, the decoder decodes it in parallel and generates the corresponding block.

5.1. Area overhead calculation

Area overhead is the extra hardware space required by the encoder and decoder in VLSI circuits to process the input. It is highly influenced by the nature of the decoder, the don't care bits of the input test set and the length of the test set. To generate fixed-size input test patterns, shift registers are used, and for variable-size input test patterns, counters are used. The area overhead is calculated using the formula:
$\text{Area Overhead} = \dfrac{\text{Area of Decoder}}{\text{Area of Benchmark Circuit}}$        (6)
For example, consider the ATE serially sending the code words 00, 010, 11 and 0 bit by bit, with corresponding blocks 0000, 1000, 0101 and 1010. In this example, the code word lengths are variable while the blocks are fixed-size. Another important observation is that the number of bits transferred from the ATE is smaller than the number of bits transferred to the scan chain after decompression.
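To make the decode path concrete, here is a small software model of the prefix-driven decoding described above. The codeword table is hypothetical (chosen to be prefix-free so that a greedy match suffices) and does not reproduce the example code words; the actual on-chip decoder is hardware, not C.

#include <stdio.h>
#include <string.h>

/* Sketch of the decode path: a leading '1' announces a Huffman codeword,
 * matched against a hypothetical prefix-free table; a leading '0'
 * announces B raw bits that bypass the decoder. */
#define B 4

struct entry { const char *code; const char *block; };
static const struct entry table[] = {
    { "0", "1111" }, { "10", "0000" }, { "110", "1100" }, { "111", "1000" },
};
#define NCODES (sizeof table / sizeof table[0])

int main(void)
{
    const char *in = "10" "00110" "1110";  /* 1+code, 0+raw block, 1+code */
    size_t i = 0, n = strlen(in);
    while (i < n) {
        if (in[i++] == '0') {              /* raw block follows */
            printf("%.*s\n", B, in + i);
            i += B;
        } else {                           /* match a codeword */
            for (size_t c = 0; c < NCODES; c++) {
                size_t len = strlen(table[c].code);
                if (strncmp(in + i, table[c].code, len) == 0) {
                    printf("%s\n", table[c].block);
                    i += len;
                    break;
                }
            }
        }
    }
    return 0;
}

On the sample stream this prints 1111, 0110 and 1100: two decoded blocks and one raw block passed straight through.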
Fig. 2. Compression and decompression architecture.

Table 3. ISCAS'89 benchmark circuit Mintest profile.

Circuit   Bits per scan line   Total patterns   Test data volume   Total X-bits   % of X-bits
S5378     214                  111              23,754             17,249         72.615
S9234     247                  159              39,273             28,672         73.007
S13207    700                  236              165,200            153,887        93.152
S15850    126                  611              76,986             64,329         83.559
S35932    1763                 16               28,208             9957           35.298
S38417    1664                 99               164,736            112,154        68.081
S38584    1464                 136              199,104            163,817        82.277
This is easy to see from the example above, because each code word is shorter than its corresponding block. The ATE sends one 1-bit code word (0), two 2-bit code words (00 and 11) and one 3-bit code word (010), while the decoder generates four 4-bit blocks from these code words: the ATE sends 8 bits in total, but the decoder produces 16 bits (4 × 4 blocks). This shows that the number of bits transferred from the decoder to the scan chain after decompression is always higher than the number of bits arriving at the decoder from the ATE. This is made possible by using a scan chain with a faster clock than the ATE clock [27,29]: when the serializer clock is faster than the ATE clock, the decoder output can be transferred to the scan chains faster than the code words are transferred to the decoder. As Fig. 2 shows, one way to optimize the clock rate of the scan chains is to use a single tester channel feeding multiple scan chains, where the code word bits are distributed to each decoder on alternate clock cycles; if the chip contains two decoders with separate scan chains, decoder 1 accepts input on even clock cycles and decoder 2 on odd clock cycles. Two considerations are important in designing the decoder: it should have a minimum area overhead, and the transmission from the decoder to the serializer should not be faster than the transmission from the serializer to the scan chains. The experimental results of the proposed work are presented in the following section.

6. Experimental results

The proposed power transition X filling approach for test data compression is evaluated using the ISCAS'89 benchmark circuits [25].
The proposed approach is implemented as a C program and tested on an Intel Pentium Core Duo 2.2 GHz processor with 2 GB of DDR3 RAM. We used the Mintest test sets proposed by Hamzaoglu and Patel [26] as the input test data for our experimentation, as shown in Table 3. The compression ratio, total power, average power, peak power and application testing time are discussed in this section.

6.1. Average power and peak power

The average power and peak power for the proposed and existing X filling techniques are described in Table 4. Our power transition X filling technique is compared with five existing X filling approaches (0s filling, 1s filling, random filling, adjacent filling [28] and 4m filling [29]), and the results are presented in Table 4. The percentage improvement of PTX filling over the other filling techniques is tabulated in Table 5. The results show that, except for benchmark circuit S13207, power transition X filling performs much better than all existing approaches on the remaining benchmark circuits. The average power and peak power of PTX filling are compared with the existing filling techniques in the graphs of Figs. 3 and 4. The average power and peak power improvements of the proposed approach are calculated by the following formula:

$\text{Power Saving} = \dfrac{\text{Total Power}(\text{others}) - \text{Total Power}(\text{PTX filling})}{\text{Total Power}(\text{others})} \times 100$        (7)
Table 4. Average power and peak power for various filling approaches.

Average power:
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]   PT filling
S5378     4300         4086         7654             5851                    3796              3524
S9234     6705         6520         8567             14749                   5511              4003
S13207    12317        14528        16789            74854                   6710              8074
S15850    19448        25635        27654            102154                  14209             13612
S38417    194843       193143       201278           337658                  142560            118099
S38584    133322       142216       169923           610803                  117550            86136

Peak power:
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]   PT filling
S5378     12085        12375        13270            12516                   10908             9732
S9234     15395        15640        18400            20431                   15143             14093
S13207    110129       126820       137888           100896                  76140             84880
S15850    84360        88794        102916           99537                   62269             50876
S38417    514716       539019       565645           572169                  304948            297884
S38584    530464       533975       572544           583614                  528835            481159
Table 5. Average power and peak power improvement (%) of PTX filling over other approaches.

Average power improvement (%):
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]
S5378     18.04        13.75        53.95            39.77                   7.16
S9234     40.29        38.60        53.27            72.85                   27.36
S13207    34.44        52.39        51.90            89.21                   −16.89
S15850    30.00        46.89        50.77            4.20                    4.20
S38417    39.38        38.85        41.32            17.15                   17.15
S38584    35.39        39.43        49.30            26.72                   26.72

Peak power improvement (%):
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]
S5378     2.92         5.19         11.59            22.24                   10.78
S9234     8.45         9.89         23.40            31.02                   6.93
S13207    13.84        25.18        31.19            15.87                   −10.29
S15850    15.98        20.17        31.13            48.88                   18.29
S38417    14.92        18.76        22.58            47.93                   2.31
S38584    9.29         9.89         15.96            17.55                   9.01
6.2. Compression ratio

The compression ratio of the proposed method is compared with six existing methods (SC [30], Golomb [6], FDR [8], VIHC [4], ERLC [11] and SHC [5]); the results are tabulated in Tables 6 and 7. To achieve maximum compression, we also propose a pattern matching algorithm.
Fig. 3. Average power analysis of different X-filling techniques.
6.2.1. Pattern matching algorithm

Consider the sixteen 4-bit patterns 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110 and 1111. The patterns 0000 and 1111 are kept as they are, but the patterns 0001, 0010 and 0100 each have three 0s and one 1, so these three patterns are treated as 0000 by the pattern matching algorithm. Similarly, the pattern 0111 has three 1s and one 0, so it is treated as 1111. Here, pattern matching is applied to patterns varying in one bit; for 8-bit, 12-bit and 16-bit patterns, pattern matching is applied to patterns varying in up to two bits. The compression ratio with and without pattern matching is given in Table 6. Table 7 compares the compression ratio of the existing SHC technique [5] with that of the proposed technique. The results show that the power transition X filling based selective Huffman encoding approach gives better results than the previous approach: except for benchmark S38417, the method gives better compression for all the remaining benchmark circuits. The compression ratio increases with the block size, so block size 16 achieves better compression than the smaller block sizes 12, 8 and 4. Table 8 presents the compression ratio analysis against existing techniques, from which we can conclude that the proposed technique is more efficient than the existing ones.
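The mapping rule described above can be sketched as a Hamming-distance test, as below. This is an illustrative sketch only, with a threshold of 1 bit for 4-bit blocks (2 bits for wider blocks) and the all-0s/all-1s reference patterns of the example.

#include <stdio.h>

/* Illustrative sketch of the pattern matching step: a block whose Hamming
 * distance to a reference pattern is within the threshold is treated as
 * that reference pattern, raising the frequency of the selected blocks
 * before Huffman encoding. */
static int hamming(const char *a, const char *b)
{
    int d = 0;
    for (; *a != '\0'; a++, b++)
        d += (*a != *b);
    return d;
}

int main(void)
{
    const char *refs[] = { "0000", "1111" };
    const char *blocks[] = { "0001", "0010", "0100", "0111", "0110" };
    int threshold = 1;                        /* one-bit variation, B = 4 */
    for (int i = 0; i < 5; i++) {
        const char *mapped = blocks[i];
        for (int r = 0; r < 2; r++)
            if (hamming(blocks[i], refs[r]) <= threshold)
                mapped = refs[r];
        printf("%s -> %s\n", blocks[i], mapped);
    }
    return 0;
}

Blocks like 0110, which differ from both references by more than the threshold, remain unchanged.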
Fig. 4. Peak power analysis of different X-filling techniques.

From the graphs, we can conclude that the proposed scheme ensures a better improvement in reducing the average power and peak power.

6.3. Area overhead

We used the compression and decompression architecture of SHC [5] for the area overhead evaluation, and the results are tabulated in Table 9. The area overhead is computed for block sizes 8, 10 and 12, and the results show that smaller block sizes improve the area overhead reduction. Table 9 lists both the decoder area overhead and the serializer area overhead for block sizes 8, 10 and 12. From these results, PTX based selective Huffman encoding reduces the average decoder area overhead to 2.24% for block size 8, 4.51% for block size 10 and 4.91% for block size 12; similarly, it reduces the average serializer overhead to 1.75% for block size 8, 2.25% for block size 10 and 2.55% for block size 12.
Table 6. Compression ratio (%) for the proposed method without and with pattern matching.

Proposed encoding (PTX+SHC) without pattern matching:
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        17.9  38.6  38.3  36.2     39.3  41.1  38.8        38.4  42.9  41.7
S9234        14.2  31.6  25.6  20.9     33.6  30.9  25.5        32.9  34.6  30.5
S13207       31.7  58.0  62.7  62.8     58.5  64.9  65.6        58.0  66.4  68.6
S15850       23.6  46.0  47.4  45.1     47.3  49.7  48.4        46.3  51.9  51.4
S38417       23.6  45.6  45.3  41.9     46.7  48.4  45.4        45.7  50.9  49.8
S38584       22.7  44.2  44.1  40.6     45.7  47.9  44.5        44.8  50.3  49.8

Proposed encoding (PTX+SHC) with pattern matching:
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        32.0  53.6  56.5  51.3     54.4  57.1  53.6        54.8  59.3  60.9
S9234        35.1  54.7  52.9  55.0     54.3  57.0  52.1        56.6  59.6  58.9
S13207       50.7  74.0  80.3  79.2     69.0  79.9  85.0        73.9  80.9  85.2
S15850       44.0  65.3  68.1  64.7     65.0  71.3  63.9        60.9  70.5  69.9
S38417       40.8  60.1  60.8  59.2     57.2  57.8  58.9        60.6  65.8  64.2
S38584       40.3  59.6  66.0  66.5     63.5  66.8  69.6        59.2  69.7  67.5
Table 7. Compression ratio (%) comparison.

SHC [5]:
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        28.9  50.1  53.0  50.3     50.2  55.1  53.8        49.8  56.6  56.9
S9234        30.0  50.4  50.7  46.1     50.2  54.2  51.0        49.0  55.7  54.4
S13207       45.6  69.2  76.6  78.6     69.2  77.1  79.7        68.9  80.9  85.2
S15850       38.4  60.0  63.6  61.6     59.9  65.6  64.8        59.2  66.5  66.8
S38417       34.9  55.3  56.9  54.4     55.5  58.9  57.3        54.7  60.3  59.9
S38584       37.8  58.5  62.2  61.7     58.5  63.9  64.1        57.7  64.5  65.5

Proposed encoding (PTX+SHC):
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        32.0  53.6  56.5  51.3     54.4  57.1  53.6        54.8  59.3  60.9
S9234        35.1  54.7  52.9  55.0     54.3  57.0  52.1        56.6  59.6  58.9
S13207       50.7  74.0  80.3  79.2     69.0  79.9  85.0        73.9  80.9  85.2
S15850       44.0  65.3  68.1  64.7     65.0  71.3  63.9        60.9  70.5  69.9
S38417       40.8  60.1  60.8  59.2     57.2  57.8  58.9        60.6  65.8  64.2
S38584       40.3  59.6  66.0  66.5     63.5  66.8  69.6        59.2  69.7  67.5
Table 8. Compression ratio (%) analysis with existing approaches.

Circuit   Total Bits   SC [30]   Golomb [6]   FDR [8]   VIHC [4]   ERLC [11]   SHC [5]   PTX+SHC
S5378     23,754       43.6      37.60        48.0      46.9       47.8        55.1      57.1
S9234     39,273       40.0      41.53        43.5      46.1       43.5        54.2      57.0
S13207    165,200      74.4      72.97        81.3      80.4       80.6        77.0      80.0
S15850    76,986       58.8      46.79        66.23     64.4       66.4        66.0      71.3
S38417    164,736      45.1      42.38        43.26     47.8       58.7        59.0      57.8
S38584    199,104      55.2      47.67        60.91     59.6       61.6        64.1      66.8
Table 9. Area overhead of the decoder and serializer using the PTX based selective Huffman technique [5].

Circuit   Block Size   Num. States (n + b)   Decoder Area Overhead   Serializer Area Overhead
S5378     8            16                    4.2                     3.8
S5378     10           25                    11.1                    4.9
S5378     12           28                    10.3                    5.6
S9234     8            14                    4.8                     3.2
S9234     10           26                    8.4                     3.9
S9234     12           27                    8.9                     4.8
S13207    8            14                    1.7                     1.2
S13207    10           18                    2.6                     1.7
S13207    12           26                    3.7                     2.2
S15850    8            16                    1.4                     1.4
S15850    10           20                    3.2                     1.9
S15850    12           22                    4.4                     1.9
S38417    8            14                    0.9                     0.6
S38417    10           27                    1.5                     0.9
S38417    12           30                    1.4                     0.7
S38584    8            15                    0.8                     0.7
S38584    10           21                    1.3                     0.7
S38584    12           29                    1.5                     0.7
Table 10. PTX+SHC application testing time analysis with different techniques.

Circuit   Golomb [6]   FDR [8]   VIHC [4]   SHC [5]   PTX+SHC
S5378     4            128       8          6         4
S9234     4            64        5.3        6         4
S13207    16           512       16         6         4.5
S15850    8            512       12         6         4.3
S38417    4            1024      16         6         3.5
S38584    8            512       8          6         4
6.4. Application testing time

The total application testing time for each benchmark circuit with the various techniques is presented in Table 10. Comparing the application testing times, PTX+SHC gives the lowest testing time for the S13207, S15850, S38417 and S38584 benchmark circuits, while for the remaining circuits, S5378 and S9234, Golomb coding [6] and PTX+SHC give the same result. From this analysis, PTX+SHC is the better approach for application testing time reduction.
7. Conclusion

In this paper, we proposed a technique named power transition X filling based selective Huffman encoding for test data compression, scan-in power reduction, area overhead reduction and application testing time speed-up in digital VLSI circuits. According to the experimental results, the PTX+SHC encoding technique gives a better compression ratio than existing approaches on the ISCAS'89 benchmark circuits. Moreover, the proposed technique addresses all dimensions of the VLSI test constraints, i.e. average power, peak power, total power, area overhead and application time. According to the comparative study, the PTX+SHC approach is more efficient than previous techniques; its main advantage is that it is efficient in all factors of digital VLSI testing (data compression, testing time, scan-in power and area overhead) equally. A simple decoder architecture is also presented, which significantly reduces the area of the decoder and the input cores. This paper experimentally analyzed the data compression, application testing time and area overhead. In the future, this work can be extended to reduce the time and memory consumption even further.
Declaration of Competing Interest

There is no conflict of interest.

References

[1] P. Girard, N. Nicolici, X. Wen, Power-Aware Testing and Test Strategies for Low Power Devices, Springer US, 2010, ISBN 978-1-4419-0928-2.
[2] J. Aerts, E.J. Marinissen, Scan chain design for test time reduction in core-based ICs, in: Proc. Int. Test Conf., 1998, pp. 448–457.
[3] M. Nourani, M.H. Tehranipour, RL-Huffman encoding for test compression and power reduction in scan applications, ACM Trans. Design Autom. Electron. Syst. 10 (2004) 91–115.
[4] P. Gonciari, B. Al-Hashimi, N. Nicolici, Variable-length input Huffman coding for system-on-a-chip test, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 22 (6) (2003) 783–796.
[5] A. Jas, J. Ghosh-Dastidar, M.-E. Ng, N.A. Touba, An efficient test set compression scheme using selective Huffman coding, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 22 (6) (2003).
[6] A. Chandra, K. Chakrabarty, System-on-a-chip test-data compression and decompression architectures based on Golomb codes, IEEE Trans. Comput.-Aided Des. 20 (2001) 355–368.
[7] A. Chandra, K. Chakrabarty, A unified approach to reduce SOC test data volume, scan power and testing time, IEEE Trans. Comput.-Aided Des. 22 (2003) 352–363.
[8] A. Chandra, K. Chakrabarty, Test data compression and test resource partitioning for system-on-a-chip using frequency-directed run-length (FDR) codes, IEEE Trans. Comput. 52 (8) (2003) 1076–1088.
[9] X. Kavousianos, E. Kalligeros, D. Nikolos, Optimal selective Huffman coding for test-data compression, IEEE Trans. Comput. 56 (8) (2007) 1146–1152.
[10] C.V. Krishna, N.A. Touba, Reducing test data volume using LFSR reseeding with seed compression, in: Proc. Int'l Test Conf., Oct. 2002, pp. 321–330.
[11] A.H. El-Maleh, R.H. Al-Abaji, Extended frequency-directed run-length code with improved application to system-on-a-chip test data compression, in: Proc. Ninth Int'l Conf. Electronics, Circuits, and Systems, 2, 2002, pp. 449–452.
[12] W. Zhan, A. El-Maleh, A new scheme of test data compression based on equal-run-length coding (ERLC), Integr. VLSI J. 45 (1) (2012) 91–98.
[13] A. Jas, N.A. Touba, Using an embedded processor for efficient deterministic testing of system-on-a-chip, in: Proc. Int. Conf. Computer Design, 1999, pp. 418–423.
[14] K. Murali Krishna, M. Sailaja, Low power memory built in self-test address generator using clock controlled linear feedback shift registers, J. Electron. Test. 30 (2014) 77–85.
[15] K. Thilagavathi, S. Sivanantham, Two-stage low power test data compression for digital VLSI circuits, Comput. Electr. Eng. 71 (2018) 309–320.
[16] A. El-Maleh, S. Al-Zahir, E. Khan, A geometric-primitives-based compression scheme for testing systems-on-a-chip, in: Proc. VLSI Test Symp., 2001, pp. 54–59.
[17] R. Dorsch, H.-J. Wunderlich, Reusing scan chains for test pattern decompression, in: Proc. European Test Workshop, 2001, pp. 124–132.
[18] D. Das, N.A. Touba, Reducing test data volume using external/LBIST hybrid test patterns, in: Proc. Int. Test Conf., 2000, pp. 115–122.
[19] C.V. Krishna, A. Jas, N.A. Touba, Test set encoding using partial LFSR reseeding, in: Proc. Int. Test Conf., 2001, pp. 885–893.
[20] A. Jas, C.V. Krishna, N.A. Touba, Hybrid BIST based on weighted pseudo-random testing: a new test resource partitioning scheme, in: Proc. VLSI Test Symp., 2001, pp. 114–120.
[21] S. Sivanantham, K. Sarathkumar, J. Manuel, P. Mallick, J. Perinbam, CSP-filling: a new X-filling technique to reduce capture and shift power in test applications, in: International Symposium on Electronic System Design, 2012, pp. 135–139.
[22] A. Trinadh, S. Potluri, S. Balachandran, C. Babu, V. Kamakoti, XStat: statistical X-filling algorithm for peak capture power reduction in scan tests, J. Low Power Electron. 10 (1) (2014) 107–115.
[23] DP-fill: a dynamic programming approach to X-filling for minimizing peak test power in scan tests, in: Design, Automation and Test in Europe, DATE, 2015, pp. 836–841.
[24] X. Kavousianos, E. Kalligeros, D. Nikolos, Multilevel-Huffman test-data compression for IP cores with multiple scan chains, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 16 (7) (2008) 926–931.
[25] F. Brglez, D. Bryan, K. Kozminski, Combinational profiles of sequential benchmark circuits, in: IEEE International Symposium on Circuits and Systems, 3, 1989, pp. 1929–1934.
[26] I. Hamzaoglu, J.H. Patel, Test set compaction algorithms for combinational circuits, in: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, 1998, pp. 283–289.
[27] D. Heidel, S. Dhong, P. Hofstee, M. Immediato, K. Nowka, J. Silberman, K. Stawiasz, High speed serializing/de-serializing design-for-test method for evaluating a 1 GHz microprocessor, in: VLSI Test Symposium, 1998, pp. 234–238.
[28] K. Butler, J. Saxena, T. Fryars, G. Hetherington, A. Jain, J. Lewis, Minimizing power consumption in scan testing: pattern generation and DFT techniques, in: International Test Conference, 2004, pp. 355–364.
[29] H. Yuan, K. Guo, X. Sun, Z. Ju, A power efficient test data compression method for SOC using alternating statistical run-length coding, 32 (2016) 59–68.
[30] A. Jas, J. Ghosh-Dastidar, N.A. Touba, Scan vector compression/decompression using statistical coding, in: Proc. IEEE VLSI Test Symp., Apr. 1999, pp. 114–121.
[31] M. Safieh, J. Freudenberger, Efficient VLSI architecture for the parallel dictionary LZW data compression algorithm, IET Circuits, Devices & Systems 13 (5) (2019) 576–583.
[32] H. Zhu, Y. Pan, K. Li, R. Huan, Method and VLSI implementation of a lossy-to-lossless LTM ECG compression framework, Electron. Lett. 55 (2) (2018) 70–72.
[33] H. Wang, T. Wang, L. Liu, H. Sun, N. Zheng, Efficient compression-based line buffer design for image/video processing circuits, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. (2019).
[34] S. Tadisetty, A novel ortho normalized multi-stage discrete fast Stockwell transform based memory-aware high-speed VLSI implementation for image compression, Multimed. Tools Appl. 78 (13) (2019) 17673–17699.
Lokesh Sivanandam received his B.E. degree in Electrical and Electronics Engineering from Dr MGR Engineering College and his M.E. degree in Electronics Engineering from the Madras Institute of Technology, Chennai, India, in 2002 and 2005 respectively. He is currently a Research Scholar in the Department of Electronics and Communication Engineering, College of Engineering, Guindy, Anna University, Chennai, India. His research interest is in the area of testing of digital circuits and SOCs. He is a member of the Computer Society of India.
Sakthivel Periyasamy received his B.E. degree in Computer Science and Engineering from the University of Madras, India, in 1992, his M.E. degree in Computer Science and Engineering from Jadavpur University, India, in 1994 and his Ph.D. degree in Computer Science and Engineering from Anna University, India, in 2008. He is currently a Professor in the Department of Electronics and Communication Engineering, Anna University, India. He is a member of the Computer Society of India (CSI), the Institute of Electronics and Telecommunication Engineers (IETE), the Institution of Engineers (India), the Indian Society for Technical Education (ISTE), the VLSI Society of India (VSI), the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computing Machinery (ACM) and the Institute of Electronics, Information and Communication Engineers (IEICE). His research interests include VLSI design and testing and computer-aided design of VLSI circuits.
Uma Maheswari Oorkavalan received her B.E. degree in Electronics and Communication Engineering and M.E. Degree in Communication Engineering from Thiagarajar College of Engineering, Madurai, India, in the year 1998 and 1999 respectively. She obtained her Ph.D. Degree in the area “Nonlinear signal processing” in the year 2009 from Anna University, Chennai, India. She is currently Professor in the Department of Electronics and Communication Engineering, College of Engineering, Guindy, Anna University, Chennai, India. Her research includes nonlinear signal processing, underwater image enhancement, and analysis of FMRI. She is a senior member of IEEE and life member of Computer Society of India.