Microprocessors and Microsystems 72 (2020) 102937
Power transition X filling based selective Huffman encoding technique for test-data compression and scan power reduction for SOCs

Lokesh Sivanandam∗, Sakthivel Periyasamy, Uma Maheswari Oorkavalan
Department of Electronics and Communication Engineering, Anna University, Chennai 600025, India
Article history: Received 6 September 2019; Revised 21 October 2019; Accepted 29 October 2019; Available online 31 October 2019.

Keywords: Test data compression; Power transition X filling; System-on-a-chip (SOC); Testing power; X filling; Selective Huffman encoding; Chip area overhead
Abstract

Owing to the ever-growing demands on memory, data compression remains an evergreen research topic. Recognizing the constant demand for compression algorithms, this article presents a compression algorithm and analyses digital VLSI circuits for the optimization of constraints such as test data volume, switching power, chip area overhead and testing speed. The article proposes a new power transition X filling based selective Huffman encoding technique, which achieves better data compression, switching power reduction, chip area overhead reduction and testing speed. The performance of the proposed work is examined with the help of the ISCAS benchmark circuits. First, the don't care bits of the test set are replaced using the power transition X filling technique; the filled test set is then encoded by the selective Huffman encoding technique. The experimental results show that the proposed power transition X filling based selective Huffman encoding gives better results than related data compression techniques, with minimal time and memory consumption. © 2019 Elsevier B.V. All rights reserved.
1. Introduction

Every product development is subject to numerous constraints, and this holds for all designs. Hence, it is mandatory to analyse these constraints both before and after product development, as they may affect product performance and quality. A cost-effective product requires proper design and testing. Testing is a vital phase of every product development: it examines these constraints so that quality and effectiveness are measured before shipping to the market. Similarly, VLSI chip design has many issues that are examined by testing, such as application testing time, memory, power consumption and area overhead. In digital circuits, a System-on-a-Chip (SOC) uses pre-verified and predefined modules to achieve these goals. One of the major issues in SOCs is handling the large amount of test data during testing of the device on the Automatic Test Equipment (ATE). Because of this vast test data volume, the limited number of input pins of a device may become a bottleneck during processing, which increases the application testing time. To avoid this problem, the input test data is applied in compressed form to the input pins of the SOC. Chip design using pre-verified and predefined modules reduces the development time of the chips but increases the
∗ Corresponding author. E-mail address: [email protected] (L. Sivanandam).
https://doi.org/10.1016/j.micpro.2019.102937
complexity of the system-on-a-chip and creates a bottleneck. This issue is handled by transforming the test sets into compressed form and storing them in the ATE memory. The compressed test patterns pass to each core of the system-on-a-chip, where they are decompressed back into the original test set. A single chip contains millions of transistors, which increases the complexity of the chip when processing big test data sets. This complexity arises when the scan chain contains too many switching transitions in the data set. Another issue in SOCs is handling bottleneck conditions while processing big test data sets without affecting the size of the chip design. Increasing the number of cores in a chip design increases the hardware area overhead, which is a major issue in chip design. Chips should therefore be designed with a small area overhead and without creating any bottleneck. Limiting the hardware area, the testing application speed and the memory management all depend on an effective 'X' filling approach. 'X' filling reduces the switching power used in the device testing process, so that physical damage from the overheating caused by switching activity can be avoided. The compression of test data helps to minimize the tester memory, increase the processing speed, limit the pins of the chips and prevent physical damage to the chips. The speed of device testing always depends on the amount of data transferred to each core of the chip and on how fast the input test set is decompressed.
This paper presents a power transition X filling based selective Huffman encoding technique to compress the test data, reduce the area overhead, increase the operational frequency and reduce the switching power, and compares the results with various 'X' filling techniques such as 0s filling, 1s filling and random filling. The work consists of two parts: the first is power transition X filling and the second is selective Huffman encoding. Using power transition X filling, the don't care bits of the test set are filled; the filled test set is then divided into fixed-size blocks, whose frequencies are computed, and stored in the ATE memory. The compressed test data are then transferred to each core of the SOC, where they are decompressed. The selective Huffman encoding technique assigns unique code words by constructing the Huffman tree of the most frequent blocks of the test patterns. This power transition X filling based selective Huffman coding is more efficient than other techniques in reducing test application time, area overhead, test power and test data volume. The approach does not require any additional test setup, as it uses the existing decoder hardware for decompression. The major contributions of this work are as follows:

1. Test data compression is enhanced using the power transition X filling based selective Huffman technique.
2. A significant amount of application testing time and tester memory while processing the test set is saved.
3. The scan-in power and chip area overhead of digital VLSI circuits are minimized.

The paper is organized as follows. Section 2 reviews related work on test data compression and X filling techniques. Section 3 presents the proposed power transition X filling algorithm. Section 4 describes the selective Huffman encoding technique and the power and compression metrics used. Section 5 presents the decompression architecture, Section 6 discusses the experimental results and Section 7 concludes the paper.

2. Related works

In digital electronics, proper application testing techniques for system-on-chips play a major role in test power optimization and chip area reduction. In most cases, the issues are caused by poor 'X' filling of the test set. These 'X' filling techniques, which fill the don't care bits (unspecified bits, represented as 'X') with either 0 or 1, determine the compression ratio. An efficient 'X' filling technique can greatly reduce the amount of switching power used during test application. Many approaches have been proposed for test data compression to enhance the compression ratio, tester memory management, switching power reduction and operational frequency. The power requirement during application testing is higher than in the normal mode of operation. This happens because of excessive switching in the system-on-chip caused by poor 'X' filling. The high switching rate increases the temperature of the chips, which leads to physical damage of the product and increases the cooling cost, as discussed by Girard P. et al. [1], who address power consumption and its critical issues in digital VLSI circuits. One common approach to fill the don't care bits is to replace them all with either 1s or 0s.
These two approaches reduce the switching activity in test sets, which reduces the switching power, but they do not reduce the power dissipation by much. J. Aerts et al. [2] proposed a technique that reduces chip area overhead in line with the reduction in test data volume; it addresses the maximum utilization of input cores and tester memory but is not concerned with the compression mechanism. Test data compression and testing power reduction using run-length filling with Huffman encoding was proposed by Mehrdad Nourani et al. [3], but it does not address chip area overhead reduction. Gonciari P. et al. [4] studied the influence of three test data compression parameters, i.e. compression ratio, application processing time and chip area overhead, using a variable-length Huffman coding technique, but the compression ratio, chip area overhead and testing time are not optimal. The statistical coding approach proposed by Abhijit Jas et al. [5] assigns variable-length code words to fixed-size blocks of a test set to maximize the compression ratio and minimize the area overhead compared with earlier approaches [2-5]; however, the selective Huffman technique is efficient only if the frequencies of the test set blocks exceed a certain range. Optimal selective Huffman coding was proposed by Kavousianos X. et al. [9] to achieve better compression than selective Huffman encoding [5], maximizing the compression ratio and minimizing testing time and area overhead. Golomb codes, frequency-directed run-length (FDR) codes and alternating run-length codes were proposed by A. Chandra et al. in [6], [8] and [7] for efficient test data compression. The test data compression and decompression technique using Golomb codes [6] is more efficient than the run-length based Huffman technique but less efficient than variable-length Huffman coding [4], selective Huffman coding [5] and optimal selective Huffman coding [9]. Frequency-directed run-length coding [8] gives a better compression ratio than Golomb codes [6], but it only deals with the distribution of 0s for encoding. Alternating run-length coding [7] is more efficient in test data compression, application testing time and scan-in power than FDR coding [8], but it does not address the area overhead issue. A significant switching reduction is achieved using a Linear Feedback Shift Register (LFSR) with Memory Built-in Self-Test (MBIST), proposed by C.V. Krishna et al. [10], which enhances test data compression considerably but deals only with test data compression. Extended frequency-directed run-length coding, proposed by A.H. El-Maleh et al. [11], deals with both the 0s and the 1s distributions for encoding and is more effective for test data compression than FDR coding [8], but it does not address area overhead or application testing time. Equal run-length coding for test data compression and chip area overhead reduction was proposed by Zhan W. et al. [12]; it deals with both 0s and 1s, maximizes the equal consecutive runs and applies same-length codewords to those consecutive runs. A. Jas et al. [13] proposed a technique for test data compression and application testing time reduction that supports high-speed shifting of test bits to the SOC cores with a slower clock rate. K. Murali Krishna et al. [14] examined the relationship between input bandwidth optimization and application testing time as a function of the operating frequency; this approach is effective in reducing switching activity but is costly. Hybrid X filling and two-stage compression techniques were proposed by K. Thilagavathi et al. [15] for effective test data reduction using adjacent X filling [28] and modified 4m X filling [29], with results better than the previous approaches [3-13]. A test data compression approach based purely on basic geometric shapes, such as lines, triangles and squares, was proposed by A. El-Maleh et al. [16]; it is not optimal for chip area overhead reduction or scan-in power reduction. Dorsch et al. [17] proposed reducing the test data bandwidth by reusing the SOC cores, which enhances the test data compression to the ATE. A hybrid Built-in Self-Test (BIST) based data compression technique for VLSI circuits was proposed by D. Das et al. [18]; it reduces the tester memory but fails to optimize the application testing time and scan-in power.
V. Krishna et al. [19] proposed a partial LFSR reseeding technique that is effective for managing tester memory, but its drawback is a higher computational time. A. Jas et al. [20] proposed a hybrid resource partitioning scheme for tester memory management that is effective for area overhead reduction but has a longer application testing time. Sivanantham S. et al. [21] proposed CSP filling, Trinadh A. et al. [22] proposed an 'X' filling method based on dynamic programming, and [23] proposed the DP filling technique; these effectively reduce the average power and peak power during application testing but are not concerned with area overhead. Multilevel Huffman coding, proposed by Kavousianos X. et al. [24], reduces the application testing time by introducing a parallel decompression architecture. In [31], an efficient VLSI architecture for the parallel dictionary Lempel-Ziv-Welch (LZW) data compression algorithm was presented; it utilizes multiple dictionaries to speed up the encoding process. A VLSI implementation of a lossy-to-lossless long-term monitoring (LTM) Electrocardiogram (ECG) compression framework was proposed in [32]; it employs the lifting discrete wavelet transform to produce the coefficients of the tail bits and the bits truncated by considering the information loss, and the processed coefficients are encoded by a modified run-length code. A compression-based line buffer design for image and video processing circuits is presented in [33], which effectively utilizes variable-length codes in the line buffer architecture. An orthonormalized multi-stage discrete fast Stockwell transform based VLSI implementation for image compression was presented in [34]; its transform unit performs the split, predict and update operations with respect to the odd samples, which speeds up the entire compression process.

3. Power transition X filling algorithm

Usually, the test set generated by an automatic test pattern generation (ATPG) toolkit contains long series of 0s, 1s and don't care bits (specified as 'X') in the form of an m×n matrix, and each don't care bit is replaced by either 0 or 1 in order to maximize the compression ratio. Previous research has proposed various other X filling techniques for the don't care bits but, owing to their inefficiency in compression ratio and area overhead, here we propose the power transition X filling algorithm, a new X filling mechanism that maximizes the compression. The proportion of don't care bits in the input test set is quite high, and the power transition X filling technique fills each of them according to its neighboring bits. The algorithm works as follows. Consider the test set Ti = {XXXXX10XX10X}, where the first bit is T1, the second bit is T2 and l is the length of the test set. T2 is compared with T1: if they are equal (or T2 is an X), the scan advances and T3 is compared next; when a bit is reached that breaks the run, the bits of the run are filled with the NOT value of that breaking bit. In the example, T1 is equal up to T5 (all are X bits) and T6 = 1 sets the run value to 1; T7 = 0 then breaks the run, so T1 to T6 of Ti are filled with 1. The scan then restarts with T7 taking the role of T1 and T8 the role of T2; they are compared and the X bits are filled as before. The algorithm scans every bit of the test set Ti in this way up to its length l. After the X filling completes, the result is Ti = {111111000100}. The power transition X filling algorithm is listed below.

Input: unfilled test set (m×n), T = {j1, j2, j3, …, jm−1, jm, jm+1, …, jn}
Output: X filled test data set
Step 1: Start
Step 2: Declare variables i, j, FirstBit, NextBit, FillSize, FillBit, TestVector[]
Step 3: Initialize i ← 1 and j ← 0
Step 4: READ FirstBit ← TestVector[j]
Step 5: FOR j = 1 to length of TestVector[]
            READ NextBit ← TestVector[j]
            IF NextBit = FirstBit OR NextBit = X THEN
                INCREMENT i
            ELSE IF FirstBit = X THEN
                INCREMENT i
                IF NextBit = 1 THEN SET FirstBit ← 1
                IF NextBit = 0 THEN SET FirstBit ← 0
                SET FillSize ← 0
            ELSE IF FirstBit = 1 THEN
                SET FillSize ← i
                SET FillBit ← 1
                SET FirstBit ← 0
                FILL TestVector[] with FillBit up to FillSize
                SET i ← 1
            ELSE
                SET FillSize ← i
                SET FillBit ← 0
                SET FirstBit ← 1
                FILL TestVector[] with FillBit up to FillSize
                SET i ← 1
            ENDIF
        ENDFOR
Step 6: Stop

Consider the test set Ti that contains 4 scan vectors (T1, T2, T3 and T4), each of length 64, for a total length of 256 bits. The power transition X filling algorithm is applied to replace the don't cares of this Ti:

Ti =
XX00000001XXX1XX1XXX10000000011X1X0000X00X01XXXXXXXX111001X0X0X11
1111XXXX1XXX00001111X1X1X1XX1XXX1XX100X00X00X0100000000010X00X00
00000X0X0X11110110X01XXXX1XX1XXXXXX1X110001X11111111110X011XXX11
10001X0000000001XXXXXX1X1X1X1X0000X0X00001X0X10001XXXXXXXXX00000

The result of Ti after power transition X filling is as follows:

Ti =
0000000001111111111110000000111110000000001111111111110011000011
1111111111110000111111111111111111110000000000100000000010000000
0000000000111101100011111111111111111110001111111111110001111111
1000110000000001111111111111110000000000011001000111111111100000
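For concreteness, the following is a minimal C sketch of this fill rule. It assumes the behaviour described above: every X inherits the value of the nearest specified bit to its left, and a leading X run inherits the first specified bit, so the fill never introduces a new transition. The function name ptx_fill and the string representation of a scan vector are illustrative choices, not the authors' implementation.

#include <stdio.h>
#include <string.h>

/* Illustrative sketch of the power transition X fill rule: each 'X'
 * inherits the nearest specified bit to its left, and a leading run of
 * 'X's inherits the first specified bit, so no new transition is added. */
static void ptx_fill(char *t)
{
    size_t n = strlen(t);
    size_t i = 0;

    while (i < n && t[i] == 'X')        /* locate the first specified bit */
        i++;
    char first = (i < n) ? t[i] : '0';  /* all-X vector: assume 0 fill */
    for (size_t j = 0; j < i; j++)      /* back-fill the leading X run */
        t[j] = first;

    for (char last = first; i < n; i++) {
        if (t[i] == 'X')
            t[i] = last;                /* extend the previous value */
        else
            last = t[i];
    }
}

int main(void)
{
    char ti[] = "XXXXX10XX10X";         /* example test cube from the text */
    ptx_fill(ti);
    printf("%s\n", ti);                 /* prints 111111000100 */
    return 0;
}

Running this on the example cube reproduces the filled vector {111111000100} given above.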
After completing the X filling, the test set Ti is partitioned into k distinct blocks b1, b2, b3, …, bk with frequencies p1, p2, p3, …, pk, where l is the length of a block b (Table 1). For example, if Ti is partitioned into distinct blocks of length l = 4, then 2^l = 2^4 = 16 distinct blocks are possible, i.e. 0000, 0001, 0010, …, 1111.

4. Selective Huffman encoding

Huffman encoding is a lossless data compression approach that achieves the maximum compression ratio for the test set, but its major drawback is the area overhead: the decompressor architecture requires a large hardware area to encode all of the test set's blocks.
Table 1. Test set partition and frequency calculation.

Test set Ti (partitioned into 4-bit blocks):
0000 1111 0000 1000
0000 0111 1111 1111
1111 0000 0000 0011
1101 0001 1100 0000
1111 1111 1000 1111
1000 1111 1111 1111
0000 1111 1111 1111
1111 1111 1111 1100
1000 1111 1111 0000
0000 0000 1110 0000
0011 0000 0011 0110
1111 0010 1111 0100
1111 0000 1111 0111
1100 0000 1100 1111
1100 1000 0111 1110
0011 0000 1111 0000

Distinct block:             1111  0000  1100  1000  0011  0111  1110  1101  0100  0010  0001  0110
Frequency of occurrence:      24    16     5     5     4     3     2     1     1     1     1     1
Fig. 1. Huffman tree for selective Huffman encoding.
to encode all test set’s blocks. Due to the linear growth of the block size l, the distinct block limit k will exponentially rise so that more numbers of blocks need to be decoded. So we propose a selective Huffman encoding approach to encode only a few distinct blocks of the test set’s blocks. From the blocks b1 , b2 , b3 , . . . ., bk we take only most occurring blocks (f > 3) from Out of k distinct blocks i.e. b1 , b2 , b3 , . . . ., b f (f < k) only encoded selective Huffman encoding remain b f +1 , b f +2 ,. . . ., bk are not encoded. The selective Huffman encoding technique assigns code words (short keywords) to each unique block depending on the frequency of occurring. To assign code words the binary tree is constructed from blocks frequencies that beginning with two lowest frequency blocks and moving toward the root node. From the root node, ‘0 is assigned to every left node and ‘1 is assigned to every right node.
code words to applied to small frequency blocks. As already mentioned, to reduce area overhead the higher frequency blocks (f > 3) only encoded by selective Huffman encoding and apply code words with a prefix of ‘1 . For the lower frequency blocks (f < 3) encoding not applied so that blocks are transferred with the prefix of ‘0 . To optimize the area overhead we apply these selective Huffman encoding techniques for both fixed-size blocks (4 bit, 8 bit, 10 bit and 12 bit). The higher compression possible only if the number of encoded blocks is more than not encoded blocks and l size should be minimum. Fig. 1 shows the Huffman tree for selective Huffman encoding.
The code word obtains by traversing each node of the tree from the root node. Selective Huffman algorithm assigns smaller code words to the high frequency of occurrence of blocks and larger
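As an illustration, the following C sketch builds such a tree for the four most frequent blocks of Table 1 and prints their '1'-prefixed code words. It is a minimal sketch, not the authors' implementation; the resulting code words depend on how frequency ties are broken, so they may differ from those of Fig. 1.

#include <stdio.h>

/* Illustrative sketch of selective Huffman code construction for the four
 * most frequent blocks of Table 1. Only these selected blocks are encoded;
 * all other blocks would be shipped raw behind a '0' prefix. */
#define NSEL 4                      /* number of selected distinct blocks */

struct node { int freq; int left, right; };     /* left/right = -1: leaf */

static void emit_codes(const struct node *t, int i, char *buf, int depth,
                       const char **blocks)
{
    if (t[i].left < 0) {            /* leaf: print block and its codeword */
        buf[depth] = '\0';
        printf("block %s -> codeword 1%s\n", blocks[i], buf);
        return;
    }
    buf[depth] = '0';
    emit_codes(t, t[i].left, buf, depth + 1, blocks);
    buf[depth] = '1';
    emit_codes(t, t[i].right, buf, depth + 1, blocks);
}

int main(void)
{
    const char *blocks[2 * NSEL] = { "1111", "0000", "1100", "1000" };
    int freqs[NSEL] = { 24, 16, 5, 5 };         /* from Table 1 */
    struct node t[2 * NSEL];
    int alive[2 * NSEL], n = NSEL, total = NSEL;

    for (int i = 0; i < NSEL; i++) {
        t[i].freq = freqs[i];
        t[i].left = t[i].right = -1;
        alive[i] = 1;
    }
    while (n > 1) {                 /* merge the two lowest-frequency nodes */
        int a = -1, b = -1;
        for (int i = 0; i < total; i++) {
            if (!alive[i]) continue;
            if (a < 0 || t[i].freq < t[a].freq) { b = a; a = i; }
            else if (b < 0 || t[i].freq < t[b].freq) { b = i; }
        }
        t[total].freq = t[a].freq + t[b].freq;
        t[total].left = a;
        t[total].right = b;
        alive[a] = alive[b] = 0;
        alive[total] = 1;
        total++;
        n--;
    }
    char buf[2 * NSEL];
    emit_codes(t, total - 1, buf, 0, blocks);
    return 0;
}

With the frequencies 24, 16, 5 and 5 this prints, for one tie-breaking order, 1111 → 10, 0000 → 111, 1100 → 1100 and 1000 → 1101: the most frequent block receives the shortest code word.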
4.1. Total switching power, average power, peak power and compression ratio calculation

The switching power reflects the number of times the system's devices switch ON/OFF as a result of adjacent bits of the input test set. The switching power during application testing is generally higher than in the normal mode.
Table 2. Compression ratio, total power, average power and peak power of different X filling techniques for the example pattern.

X Filling Approach           Compression Ratio   Total Power   Average Power   Peak Power
0s filling                   8.20                2690          672             748
1s filling                   6.25                1845          461             648
Random filling               15.35               2180          622             710
Power Transition X filling   17.96               1068          267             366
This is due to the continuous scanning of a large number of differing adjacent bits; consequently, switching activity can be reduced by keeping adjacent bits identical. The switching power is highly influenced by the don't care bits of the input test set and by its length. The scanning power is calculated by Eq. (1).
$WT_{Total} = \sum_{m=1}^{n-1} (n - m)\,\left( T_{i,m} \oplus T_{i,m+1} \right)$        (1)

where n denotes the number of bits per scan vector, the sum runs over the adjacent bit positions m of the scan vector Ti, and ⊕ is the XOR operation, which is 1 whenever two adjacent bits differ. The average power over the m scan vectors of the test set is computed by the following formula.
$\text{Average Power} = \frac{1}{m} \sum_{j=1}^{m} WT_{Total}(j)$        (2)
The peak power is the maximum switching transition in an input test set. Excessive peak power is harmful to VLSI circuits, as it may lead to physical damage of the circuit. Peak power is highly influenced by the input test data set and its length. The peak power is calculated as follows:

$WT_{peak} = \max_{1 \le i \le n} \left\{ WT_{Total}(i) \right\}$        (3)
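A small C sketch of Eqs. (1)-(3) is given below; it is an illustration only, assuming that each scan vector is an already X-filled string of '0'/'1' characters, and the second vector is a hypothetical one added for the example.

#include <stdio.h>
#include <string.h>

/* Illustrative sketch of Eqs. (1)-(3): weighted transitions of each scan
 * vector, their average, and the peak value. */
static long wt_total(const char *v)
{
    long w = 0;
    size_t n = strlen(v);
    for (size_t m = 1; m < n; m++)       /* 1-based adjacent pair (m, m+1) */
        if (v[m - 1] != v[m])            /* T_im XOR T_i(m+1)              */
            w += (long)(n - m);          /* transition weight (n - m)      */
    return w;
}

int main(void)
{
    /* First vector is the filled example; the second is hypothetical. */
    const char *ti[] = { "111111000100", "110000111100" };
    int m = 2;
    long sum = 0, peak = 0;
    for (int j = 0; j < m; j++) {
        long w = wt_total(ti[j]);
        sum += w;
        if (w > peak)
            peak = w;                    /* Eq. (3): peak power */
    }
    printf("average = %ld, peak = %ld\n", sum / m, peak);   /* Eq. (2) */
    return 0;
}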
The compression ratio (CR) for the test set is calculated by the following formula:

$CR = \sum_{i=1}^{n} F_i \,(B - C_i) \Big/ \sum_{m=1}^{n} T_m$        (4)
In the above equation, F, B and C represent the frequency, size and codeword length of each block respectively, and Tm is the total number of bits in the input test set. The power transition X filling based selective Huffman encoding technique is applied to the example test set, and the compression ratio, total power, average power and peak power for the different X filling approaches are tabulated in Table 2. The table shows that power transition X filling gives more compression than the other approaches. The bandwidth utilization in SOC circuits is directly proportional to the compression ratio, and maximum compression helps to reduce the memory required in the ATE to process the input test data set, which also helps to maximize the throughput at the operating frequency. The average scanning power of the SOC depends on the number of switching activities of the circuits.
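As a toy illustration of Eq. (4), the snippet below uses the Table 1 frequencies together with hypothetical codeword lengths; the one-bit prefixes are ignored in this sketch.

#include <stdio.h>

/* Toy illustration of Eq. (4): k encoded block types of size B bits, each
 * occurring F[i] times and replaced by a codeword of C[i] bits. */
int main(void)
{
    int F[] = { 24, 16, 5, 5 };     /* block frequencies (Table 1)   */
    int C[] = { 2, 3, 4, 4 };       /* hypothetical codeword lengths */
    int B = 4, k = 4;
    long total_bits = 256;          /* bits in the example test set  */
    long saved = 0;
    for (int i = 0; i < k; i++)
        saved += (long)F[i] * (B - C[i]);
    printf("CR = %.2f%%\n", 100.0 * (double)saved / (double)total_bits);
    return 0;
}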
The relationship between average power and switching activity is given in Eq. (5):

$P_{avg} = \alpha_T \cdot C_{load} \cdot V^2 \cdot f_{clk}$        (5)

In the above equation, $\alpha_T$ denotes the switching activity, $C_{load}$ represents the total load capacitance, V stands for the voltage and $f_{clk}$ denotes the operating frequency. The decompression architecture is presented in the following section.

5. Decompression architecture

The compression and decompression architecture for our approach [5] is shown in Fig. 2. The on-chip decoder decompresses
the code words into the original test set. The decoder receives the variable-length code word of each unique fixed-size block and regenerates the original test set from these code words. The decoder accepts a serial input of code words from the ATE and replaces them with 0s and 1s on the scan lines. It decodes a variable-length code word into its corresponding fixed-size block only if the code word is prefixed with '1'; otherwise the following bits are sent directly to the scan lines. The generated blocks are then transferred to the scan chains. The serializer plays an important role in the decoder circuit because it provides the degree of parallelism. Parallelism here means handling two or more operations in the chip concurrently, which helps to reduce the testing time. To reduce the testing time further, the ATE never stops transmitting code words to the decoder; that is, it does not wait for the decoder to finish decompression and transfer. The decoder accepts the serial input from the ATE bit by bit at each clock cycle. After accepting a complete code word, the decoder immediately starts to receive the next code word bit by bit while, in parallel, it decompresses the current code word and transfers the decoded output to the scan chains. The decoder thus exploits two forms of parallelism: receiving a code word bit at each ATE clock, and simultaneously decoding the previously received code word, generating the corresponding block and transferring it to the scan chains. The scan chain uses two clocks for this operation: during the first clock cycle the decoder receives the code word from the ATE, and at the second clock cycle the decompressed code words are transferred to the scan chain while the next code word is loaded into the decoder. Since the decoder receives one code word bit per tester clock cycle, it takes N clock cycles in total to receive a code word of length N; after receiving the complete code word, the decoder decodes it in parallel and generates the corresponding block.

5.1. Area overhead calculation

Area overhead is the extra hardware space required by the encoder and decoder in VLSI circuits to process the input. It is highly influenced by the nature of the decoder, the don't care bits of the input test set and the length of the test set. To generate fixed-size input test patterns, shift registers are used, and for variable-size input test patterns, counters are used. The area overhead is calculated using the formula:
$\text{Area Overhead} = \dfrac{\text{Area of Decoder}}{\text{Area of Benchmark Circuit}}$        (6)
For example, consider the ATE serially sending the code words 00, 010, 11 and 0 bit by bit, with corresponding blocks 0000, 1000, 0101 and 1010. In this example, the code word lengths are variable while the blocks are fixed-size. Another important observation is that the number of bits transferred from the ATE is smaller than the number of bits transferred to the scan chain after decompression.
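To make the decode path concrete, here is a small software model of the prefix-driven decoding described above. The codeword table is hypothetical (chosen to be prefix-free so that a greedy match suffices) and does not reproduce the example code words; the actual on-chip decoder is hardware, not C.

#include <stdio.h>
#include <string.h>

/* Sketch of the decode path: a leading '1' announces a Huffman codeword,
 * matched against a hypothetical prefix-free table; a leading '0'
 * announces B raw bits that bypass the decoder. */
#define B 4

struct entry { const char *code; const char *block; };
static const struct entry table[] = {
    { "0", "1111" }, { "10", "0000" }, { "110", "1100" }, { "111", "1000" },
};
#define NCODES (sizeof table / sizeof table[0])

int main(void)
{
    const char *in = "10" "00110" "1110";  /* 1+code, 0+raw block, 1+code */
    size_t i = 0, n = strlen(in);
    while (i < n) {
        if (in[i++] == '0') {              /* raw block follows */
            printf("%.*s\n", B, in + i);
            i += B;
        } else {                           /* match a codeword */
            for (size_t c = 0; c < NCODES; c++) {
                size_t len = strlen(table[c].code);
                if (strncmp(in + i, table[c].code, len) == 0) {
                    printf("%s\n", table[c].block);
                    i += len;
                    break;
                }
            }
        }
    }
    return 0;
}

On the sample stream this prints 1111, 0110 and 1100: two decoded blocks and one raw block passed straight through.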
Fig. 2. Compression and decompression architecture.

Table 3. ISCAS'89 benchmark circuit Mintest profile.

Circuit   Bits per scan line   Total patterns   Test data volume   Total X-bits   % of X-bits
S5378     214                  111              23,754             17,249         72.615
S9234     247                  159              39,273             28,672         73.007
S13207    700                  236              165,200            153,887        93.152
S15850    126                  611              76,986             64,329         83.559
S35932    1763                 16               28,208             9957           35.298
S38417    1664                 99               164,736            112,154        68.081
S38584    1464                 136              199,104            163,817        82.277
This is easy to see from the example above, because each code word is shorter than its corresponding block. The ATE sends one 1-bit code word (0), two 2-bit code words (00 and 11) and one 3-bit code word (010), while the decoder generates four 4-bit blocks from these code words: the ATE sends 8 bits in total, but the decoder produces 16 bits (4 × 4 blocks). This shows that the number of bits transferred from the decoder to the scan chain after decompression is always higher than the number of bits arriving at the decoder from the ATE. This is made possible by using a scan chain with a faster clock than the ATE clock [27,29]: when the serializer clock is faster than the ATE clock, the decoder output can be transferred to the scan chains faster than the code words are transferred to the decoder. As Fig. 2 shows, one way to optimize the clock rate of the scan chains is to use a single tester channel feeding multiple scan chains, where the code word bits are distributed to each decoder on alternate clock cycles; if the chip contains two decoders with separate scan chains, decoder 1 accepts input on even clock cycles and decoder 2 on odd clock cycles. Two considerations are important in designing the decoder: it should have a minimum area overhead, and the transmission from the decoder to the serializer should not be faster than the transmission from the serializer to the scan chains. The experimental results of the proposed work are presented in the following section.

6. Experimental results

The proposed power transition X filling approach for test data compression is evaluated using the ISCAS'89 benchmark circuits [25].
The proposed approach is implemented as a C program and tested on an Intel Pentium Core Duo 2.2 GHz processor with 2 GB of DDR3 RAM. We used the Mintest test sets proposed by Hamzaoglu and Patel [26] as the input test data for our experimentation, as shown in Table 3. The compression ratio, total power, average power, peak power and application testing time are discussed in this section.

6.1. Average power and peak power

The average power and peak power for the proposed and existing X filling techniques are described in Table 4. Our power transition X filling technique is compared with five existing X filling approaches (0s filling, 1s filling, random filling, adjacent filling [28] and 4m filling [29]), and the results are presented in Table 4. The percentage improvement of PTX filling over the other filling techniques is tabulated in Table 5. The results show that, except for benchmark circuit S13207, power transition X filling performs much better than all existing approaches on the remaining benchmark circuits. The average power and peak power of PTX filling are compared with the existing filling techniques in the graphs of Figs. 3 and 4. The average power and peak power improvements of the proposed approach are calculated by the following formula:

$\text{Power Saving} = \dfrac{\text{Total Power}(\text{others}) - \text{Total Power}(\text{PTX filling})}{\text{Total Power}(\text{others})} \times 100$        (7)
Table 4. Average power and peak power for various filling approaches.

Average power:
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]   PT filling
S5378     4300         4086         7654             5851                    3796              3524
S9234     6705         6520         8567             14749                   5511              4003
S13207    12317        14528        16789            74854                   6710              8074
S15850    19448        25635        27654            102154                  14209             13612
S38417    194843       193143       201278           337658                  142560            118099
S38584    133322       142216       169923           610803                  117550            86136

Peak power:
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]   PT filling
S5378     12085        12375        13270            12516                   10908             9732
S9234     15395        15640        18400            20431                   15143             14093
S13207    110129       126820       137888           100896                  76140             84880
S15850    84360        88794        102916           99537                   62269             50876
S38417    514716       539019       565645           572169                  304948            297884
S38584    530464       533975       572544           583614                  528835            481159
Table 5. Average power and peak power improvement (%) of PTX filling over other approaches.

Average power improvement (%):
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]
S5378     18.04        13.75        53.95            39.77                   7.16
S9234     40.29        38.60        53.27            72.85                   27.36
S13207    34.44        52.39        51.90            89.21                   −16.89
S15850    30.00        46.89        50.77            4.20                    4.20
S38417    39.38        38.85        41.32            17.15                   17.15
S38584    35.39        39.43        49.30            26.72                   26.72

Peak power improvement (%):
Circuit   0s filling   1s filling   Random filling   Adjacent filling [28]   4m filling [29]
S5378     2.92         5.19         11.59            22.24                   10.78
S9234     8.45         9.89         23.40            31.02                   6.93
S13207    13.84        25.18        31.19            15.87                   −10.29
S15850    15.98        20.17        31.13            48.88                   18.29
S38417    14.92        18.76        22.58            47.93                   2.31
S38584    9.29         9.89         15.96            17.55                   9.01
6.2. Compression ratio

The compression ratio of the proposed method is compared with six existing methods (SC [30], Golomb [6], FDR [8], VIHC [4], ERLC [11] and SHC [5]); the results are tabulated in Tables 6 and 7. To achieve maximum compression, we also propose a pattern matching algorithm.
Fig. 3. Average power analysis of different X-filling techniques.
6.2.1. Pattern matching algorithm

Consider the sixteen 4-bit patterns 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110 and 1111. The patterns 0000 and 1111 are kept as they are, but the patterns 0001, 0010 and 0100 each have three 0s and one 1, so these three patterns are treated as 0000 by the pattern matching algorithm. Similarly, the pattern 0111 has three 1s and one 0, so it is treated as 1111. Here, pattern matching is applied to patterns varying in one bit; for 8-bit, 12-bit and 16-bit patterns, pattern matching is applied to patterns varying in up to two bits. The compression ratio with and without pattern matching is given in Table 6. Table 7 compares the compression ratio of the existing SHC technique [5] with that of the proposed technique. The results show that the power transition X filling based selective Huffman encoding approach gives better results than the previous approach: except for benchmark S38417, the method gives better compression for all the remaining benchmark circuits. The compression ratio increases with the block size, so block size 16 achieves better compression than the smaller block sizes 12, 8 and 4. Table 8 presents the compression ratio analysis against existing techniques, from which we can conclude that the proposed technique is more efficient than the existing ones.
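The mapping rule described above can be sketched as a Hamming-distance test, as below. This is an illustrative sketch only, with a threshold of 1 bit for 4-bit blocks (2 bits for wider blocks) and the all-0s/all-1s reference patterns of the example.

#include <stdio.h>

/* Illustrative sketch of the pattern matching step: a block whose Hamming
 * distance to a reference pattern is within the threshold is treated as
 * that reference pattern, raising the frequency of the selected blocks
 * before Huffman encoding. */
static int hamming(const char *a, const char *b)
{
    int d = 0;
    for (; *a != '\0'; a++, b++)
        d += (*a != *b);
    return d;
}

int main(void)
{
    const char *refs[] = { "0000", "1111" };
    const char *blocks[] = { "0001", "0010", "0100", "0111", "0110" };
    int threshold = 1;                        /* one-bit variation, B = 4 */
    for (int i = 0; i < 5; i++) {
        const char *mapped = blocks[i];
        for (int r = 0; r < 2; r++)
            if (hamming(blocks[i], refs[r]) <= threshold)
                mapped = refs[r];
        printf("%s -> %s\n", blocks[i], mapped);
    }
    return 0;
}

Blocks like 0110, which differ from both references by more than the threshold, remain unchanged.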
Fig. 4. Peak power analysis of different X-filling techniques.

From the graphs, we can conclude that the proposed scheme ensures a better improvement in reducing the average power and peak power.

6.3. Area overhead

We used the compression and decompression architecture of SHC [5] for the area overhead evaluation, and the results are tabulated in Table 9. The area overhead is computed for block sizes 8, 10 and 12, and the results show that smaller block sizes improve the area overhead reduction. Table 9 lists both the decoder area overhead and the serializer area overhead for block sizes 8, 10 and 12. From these results, PTX based selective Huffman encoding reduces the average decoder area overhead to 2.24% for block size 8, 4.51% for block size 10 and 4.91% for block size 12; similarly, it reduces the average serializer overhead to 1.75% for block size 8, 2.25% for block size 10 and 2.55% for block size 12.
Table 6. Compression ratio (%) for the proposed method without and with pattern matching.

Proposed encoding (PTX+SHC) without pattern matching:
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        17.9  38.6  38.3  36.2     39.3  41.1  38.8        38.4  42.9  41.7
S9234        14.2  31.6  25.6  20.9     33.6  30.9  25.5        32.9  34.6  30.5
S13207       31.7  58.0  62.7  62.8     58.5  64.9  65.6        58.0  66.4  68.6
S15850       23.6  46.0  47.4  45.1     47.3  49.7  48.4        46.3  51.9  51.4
S38417       23.6  45.6  45.3  41.9     46.7  48.4  45.4        45.7  50.9  49.8
S38584       22.7  44.2  44.1  40.6     45.7  47.9  44.5        44.8  50.3  49.8

Proposed encoding (PTX+SHC) with pattern matching:
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        32.0  53.6  56.5  51.3     54.4  57.1  53.6        54.8  59.3  60.9
S9234        35.1  54.7  52.9  55.0     54.3  57.0  52.1        56.6  59.6  58.9
S13207       50.7  74.0  80.3  79.2     69.0  79.9  85.0        73.9  80.9  85.2
S15850       44.0  65.3  68.1  64.7     65.0  71.3  63.9        60.9  70.5  69.9
S38417       40.8  60.1  60.8  59.2     57.2  57.8  58.9        60.6  65.8  64.2
S38584       40.3  59.6  66.0  66.5     63.5  66.8  69.6        59.2  69.7  67.5
Table 7. Compression ratio (%) comparison.

SHC [5]:
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        28.9  50.1  53.0  50.3     50.2  55.1  53.8        49.8  56.6  56.9
S9234        30.0  50.4  50.7  46.1     50.2  54.2  51.0        49.0  55.7  54.4
S13207       45.6  69.2  76.6  78.6     69.2  77.1  79.7        68.9  80.9  85.2
S15850       38.4  60.0  63.6  61.6     59.9  65.6  64.8        59.2  66.5  66.8
S38417       34.9  55.3  56.9  54.4     55.5  58.9  57.3        54.7  60.3  59.9
S38584       37.8  58.5  62.2  61.7     58.5  63.9  64.1        57.7  64.5  65.5

Proposed encoding (PTX+SHC):
             8 enc. dist. blocks        16 enc. dist. blocks    32 enc. dist. blocks
Block size   4     8     12    16       8     12    16          8     12    16
S5378        32.0  53.6  56.5  51.3     54.4  57.1  53.6        54.8  59.3  60.9
S9234        35.1  54.7  52.9  55.0     54.3  57.0  52.1        56.6  59.6  58.9
S13207       50.7  74.0  80.3  79.2     69.0  79.9  85.0        73.9  80.9  85.2
S15850       44.0  65.3  68.1  64.7     65.0  71.3  63.9        60.9  70.5  69.9
S38417       40.8  60.1  60.8  59.2     57.2  57.8  58.9        60.6  65.8  64.2
S38584       40.3  59.6  66.0  66.5     63.5  66.8  69.6        59.2  69.7  67.5
Table 8. Compression ratio (%) analysis with existing approaches.

Circuit   Total Bits   SC [30]   Golomb [6]   FDR [8]   VIHC [4]   ERLC [11]   SHC [5]   PTX+SHC
S5378     23,754       43.6      37.60        48.0      46.9       47.8        55.1      57.1
S9234     39,273       40.0      41.53        43.5      46.1       43.5        54.2      57.0
S13207    165,200      74.4      72.97        81.3      80.4       80.6        77.0      80.0
S15850    76,986       58.8      46.79        66.23     64.4       66.4        66.0      71.3
S38417    164,736      45.1      42.38        43.26     47.8       58.7        59.0      57.8
S38584    199,104      55.2      47.67        60.91     59.6       61.6        64.1      66.8
Table 9. Area overhead of the decoder and serializer using the PTX based selective Huffman technique [5].

Circuit   Block Size   Num. States (n + b)   Decoder Area Overhead   Serializer Area Overhead
S5378     8            16                    4.2                     3.8
S5378     10           25                    11.1                    4.9
S5378     12           28                    10.3                    5.6
S9234     8            14                    4.8                     3.2
S9234     10           26                    8.4                     3.9
S9234     12           27                    8.9                     4.8
S13207    8            14                    1.7                     1.2
S13207    10           18                    2.6                     1.7
S13207    12           26                    3.7                     2.2
S15850    8            16                    1.4                     1.4
S15850    10           20                    3.2                     1.9
S15850    12           22                    4.4                     1.9
S38417    8            14                    0.9                     0.6
S38417    10           27                    1.5                     0.9
S38417    12           30                    1.4                     0.7
S38584    8            15                    0.8                     0.7
S38584    10           21                    1.3                     0.7
S38584    12           29                    1.5                     0.7
Table 10. PTX+SHC application testing time analysis with different techniques.

Circuit   Golomb [6]   FDR [8]   VIHC [4]   SHC [5]   PTX+SHC
S5378     4            128       8          6         4
S9234     4            64        5.3        6         4
S13207    16           512       16         6         4.5
S15850    8            512       12         6         4.3
S38417    4            1024      16         6         3.5
S38584    8            512       8          6         4
6.4. Application testing time

The total application testing time for each benchmark circuit with the various techniques is presented in Table 10. Comparing the application testing times, PTX+SHC gives the lowest testing time for the S13207, S15850, S38417 and S38584 benchmark circuits, while for the remaining circuits, S5378 and S9234, Golomb coding [6] and PTX+SHC give the same result. From this analysis, PTX+SHC is the better approach for application testing time reduction.
7. Conclusion

In this paper, we proposed a technique named power transition X filling based selective Huffman encoding for test data compression, scan-in power reduction, area overhead reduction and application testing time speed-up in digital VLSI circuits. According to the experimental results, the PTX+SHC encoding technique gives a better compression ratio than existing approaches on the ISCAS'89 benchmark circuits. Moreover, the proposed technique addresses all dimensions of the VLSI test constraints, i.e. average power, peak power, total power, area overhead and application time. According to the comparative study, the PTX+SHC approach is more efficient than previous techniques; its main advantage is that it is efficient in all factors of digital VLSI testing (data compression, testing time, scan-in power and area overhead) equally. A simple decoder architecture is also presented, which significantly reduces the area of the decoder and the input cores. This paper experimentally analyzed the data compression, application testing time and area overhead. In the future, this work can be extended to reduce the time and memory consumption even further.
Declaration of Competing Interest

There is no conflict of interest.

References

[1] P. Girard, N. Nicolici, X. Wen, Power-Aware Testing and Test Strategies for Low Power Devices, Springer US, 2010, ISBN 978-1-4419-0928-2.
[2] J. Aerts, E.J. Marinissen, Scan chain design for test time reduction in core-based ICs, in: Proc. Int. Test Conf., 1998, pp. 448–457.
[3] M. Nourani, M.H. Tehranipour, RL-Huffman encoding for test compression and power reduction in scan applications, ACM Trans. Design Autom. Electron. Syst. 10 (2004) 91–115.
[4] P. Gonciari, B. Al-Hashimi, N. Nicolici, Variable-length input Huffman coding for system-on-a-chip test, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 22 (6) (2003) 783–796.
[5] A. Jas, J. Ghosh-Dastidar, M.-E. Ng, N.A. Touba, An efficient test set compression scheme using selective Huffman coding, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 22 (6) (2003).
[6] A. Chandra, K. Chakrabarty, System-on-a-chip test-data compression and decompression architectures based on Golomb codes, IEEE Trans. Comput.-Aided Des. 20 (2001) 355–368.
[7] A. Chandra, K. Chakrabarty, A unified approach to reduce SOC test data volume, scan power and testing time, IEEE Trans. Comput.-Aided Des. 22 (2003) 352–363.
[8] A. Chandra, K. Chakrabarty, Test data compression and test resource partitioning for system-on-a-chip using frequency-directed run-length (FDR) codes, IEEE Trans. Comput. 52 (8) (2003) 1076–1088.
[9] X. Kavousianos, E. Kalligeros, D. Nikolos, Optimal selective Huffman coding for test-data compression, IEEE Trans. Comput. 56 (8) (2007) 1146–1152.
[10] C.V. Krishna, N.A. Touba, Reducing test data volume using LFSR reseeding with seed compression, in: Proc. Int'l Test Conf., Oct. 2002, pp. 321–330.
[11] A.H. El-Maleh, R.H. Al-Abaji, Extended frequency-directed run-length code with improved application to system-on-a-chip test data compression, in: Proc. Ninth Int'l Conf. Electronics, Circuits, and Systems, 2, 2002, pp. 449–452.
[12] W. Zhan, A. El-Maleh, A new scheme of test data compression based on equal-run-length coding (ERLC), Integr. VLSI J. 45 (1) (2012) 91–98.
[13] A. Jas, N.A. Touba, Using an embedded processor for efficient deterministic testing of system-on-a-chip, in: Proc. Int. Conf. Computer Design, 1999, pp. 418–423.
[14] K. Murali Krishna, M. Sailaja, Low power memory built in self-test address generator using clock controlled linear feedback shift registers, J. Electron. Test. 30 (2014) 77–85.
[15] K. Thilagavathi, S. Sivanantham, Two-stage low power test data compression for digital VLSI circuits, Comput. Electr. Eng. 71 (2018) 309–320.
[16] A. El-Maleh, S. Al-Zahir, E. Khan, A geometric-primitives-based compression scheme for testing systems-on-a-chip, in: Proc. VLSI Test Symp., 2001, pp. 54–59.
[17] R. Dorsch, H.-J. Wunderlich, Reusing scan chains for test pattern decompression, in: Proc. European Test Workshop, 2001, pp. 124–132.
[18] D. Das, N.A. Touba, Reducing test data volume using external/LBIST hybrid test patterns, in: Proc. Int. Test Conf., 2000, pp. 115–122.
[19] C.V. Krishna, A. Jas, N.A. Touba, Test set encoding using partial LFSR reseeding, in: Proc. Int. Test Conf., 2001, pp. 885–893.
[20] A. Jas, C.V. Krishna, N.A. Touba, Hybrid BIST based on weighted pseudo-random testing: a new test resource partitioning scheme, in: Proc. VLSI Test Symp., 2001, pp. 114–120.
[21] S. Sivanantham, K. Sarathkumar, J. Manuel, P. Mallick, J. Perinbam, CSP-filling: a new X-filling technique to reduce capture and shift power in test applications, in: International Symposium on Electronic System Design, 2012, pp. 135–139.
[22] A. Trinadh, S. Potluri, S. Balachandran, C. Babu, V. Kamakoti, XStat: statistical X-filling algorithm for peak capture power reduction in scan tests, J. Low Power Electron. 10 (1) (2014) 107–115.
[23] DP-fill: a dynamic programming approach to X-filling for minimizing peak test power in scan tests, in: Design, Automation and Test in Europe, DATE, 2015, pp. 836–841.
[24] X. Kavousianos, E. Kalligeros, D. Nikolos, Multilevel-Huffman test-data compression for IP cores with multiple scan chains, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 16 (7) (2008) 926–931.
[25] F. Brglez, D. Bryan, K. Kozminski, Combinational profiles of sequential benchmark circuits, in: IEEE International Symposium on Circuits and Systems, 3, 1989, pp. 1929–1934.
[26] I. Hamzaoglu, J.H. Patel, Test set compaction algorithms for combinational circuits, in: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, 1998, pp. 283–289.
[27] D. Heidel, S. Dhong, P. Hofstee, M. Immediato, K. Nowka, J. Silberman, K. Stawiasz, High speed serializing/de-serializing design-for-test method for evaluating a 1 GHz microprocessor, in: VLSI Test Symposium, 1998, pp. 234–238.
[28] K. Butler, J. Saxena, T. Fryars, G. Hetherington, A. Jain, J. Lewis, Minimizing power consumption in scan testing: pattern generation and DFT techniques, in: International Test Conference, 2004, pp. 355–364.
[29] H. Yuan, K. Guo, X. Sun, Z. Ju, A power efficient test data compression method for SOC using alternating statistical run-length coding, 32 (2016) 59–68.
[30] A. Jas, J. Ghosh-Dastidar, N.A. Touba, Scan vector compression/decompression using statistical coding, in: Proc. IEEE VLSI Test Symp., Apr. 1999, pp. 114–121.
[31] M. Safieh, J. Freudenberger, Efficient VLSI architecture for the parallel dictionary LZW data compression algorithm, IET Circuits, Devices & Systems 13 (5) (2019) 576–583.
[32] H. Zhu, Y. Pan, K. Li, R. Huan, Method and VLSI implementation of a lossy-to-lossless LTM ECG compression framework, Electron. Lett. 55 (2) (2018) 70–72.
[33] H. Wang, T. Wang, L. Liu, H. Sun, N. Zheng, Efficient compression-based line buffer design for image/video processing circuits, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. (2019).
[34] S. Tadisetty, A novel ortho normalized multi-stage discrete fast Stockwell transform based memory-aware high-speed VLSI implementation for image compression, Multimed. Tools Appl. 78 (13) (2019) 17673–17699.
Lokesh Sivanandam received his B.E. degree in Electrical and Electronics Engineering from Dr MGR Engineering College and his M.E. degree in Electronics Engineering from the Madras Institute of Technology, Chennai, India, in 2002 and 2005 respectively. He is currently a Research Scholar in the Department of Electronics and Communication Engineering, College of Engineering, Guindy, Anna University, Chennai, India. His research interest is in the area of testing of digital circuits and SOCs. He is a member of the Computer Society of India.
Sakthivel Periyasamy received his B.E. degree in Computer Science and Engineering from the University of Madras, India, in 1992, his M.E. degree in Computer Science and Engineering from Jadavpur University, India, in 1994 and his Ph.D. degree in Computer Science and Engineering from Anna University, India, in 2008. He is currently a Professor in the Department of Electronics and Communication Engineering, Anna University, India. He is a member of the Computer Society of India (CSI), the Institute of Electronics and Telecommunication Engineers (IETE), the Institution of Engineers (India), the Indian Society for Technical Education (ISTE), the VLSI Society of India (VSI), the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computing Machinery (ACM) and the Institute of Electronics, Information and Communication Engineers (IEICE). His research interests include VLSI design and testing and computer-aided design of VLSI circuits.
Uma Maheswari Oorkavalan received her B.E. degree in Electronics and Communication Engineering and M.E. Degree in Communication Engineering from Thiagarajar College of Engineering, Madurai, India, in the year 1998 and 1999 respectively. She obtained her Ph.D. Degree in the area “Nonlinear signal processing” in the year 2009 from Anna University, Chennai, India. She is currently Professor in the Department of Electronics and Communication Engineering, College of Engineering, Guindy, Anna University, Chennai, India. Her research includes nonlinear signal processing, underwater image enhancement, and analysis of FMRI. She is a senior member of IEEE and life member of Computer Society of India.