Information Sciences 195 (2012) 266–276
Coupled map lattice based hash function with collision resistance in single-iteration computation

Shihong Wang a,b, Gang Hu c,⇑

a School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
b Key Laboratory of Optical Communication and Light Wave, Beijing University of Posts and Telecommunications, Beijing 100876, China
c Department of Physics, Beijing Normal University, Beijing 100875, China

Article info

Article history: Received 9 February 2010; received in revised form 3 January 2012; accepted 8 January 2012; available online 24 January 2012.
Keywords: Hash function; Coupled map lattice; Chaos

Abstract

A new hash function based on a coupled chaotic map lattice is proposed. By combining floating-point chaotic computations with algebraic operations as well as local and global couplings, the system reaches high bit confusion and diffusion rates and thus desirable collision resistance with even one-iteration computation. The algorithm can be used to calculate hash values of 128, 160, 192, 256, 384 and 512 bits with little difference in performance for the different hash values. The algorithm has both strong collision resistance and high efficiency and can serve as a new type of candidate hash function in software. © 2012 Elsevier Inc. All rights reserved.
1. Introduction

With the rapid development of e-commerce, the study of hash functions has become increasingly important in modern cryptography. Especially after the discovery by Wang et al. of an effective method for reducing the complexity of finding collisions in SHA-1, a federal information-processing standard issued by NIST [12,22], analysing and designing optimal hash functions has attracted great attention [15–17,23,28]. Most hash functions designed for complexity and nonlinearity are based on purely digital algebraic operations that are convenient for implementation in hardware. To enhance message diffusion and confusion rates and to improve collision resistance, large numbers of rounds of operations are needed to make various collision attacks difficult. However, the correlation between different rounds in such multi-round approaches can become a weakness that attackers can exploit for collision analysis.

In the field of chaos, many cryptographic systems based on chaotic dynamics have been proposed, such as chaotic encryption ciphers [2,8,19,10], chaotic image encryption [3,29], chaotic key agreement protocols [26,7] and chaotic hash functions [9,20,21]. Kwok and Tang developed a hash function for message authentication using high-dimensional discrete chaotic maps [9]. Although the approach is 1.5 times as efficient as MD5, Li found a collision based on its non-uniform sensitivity to initial conditions and system parameters [11], while Deng et al. also suggested that ensuring the security of the approach called for further improvement [6]. Some other hash functions were also proposed using chaotic tent maps or piecewise linear maps [27,25,1]. Spatiotemporal chaos, such as 1-dimensional and 2-dimensional coupled map lattices, has also been suggested for enhancing the complexity of hash functions [20,21,18].

All of these algorithms rely on multiple iterations of the chaotic dynamics to enhance bit diffusion and confusion rates. This lowers efficiency and makes security analysis more difficult.
⇑ Corresponding author. E-mail address: [email protected] (G. Hu).
doi:10.1016/j.ins.2012.01.032
In the present paper, we design a novel type of collision-resistant hash function based on a coupled chaotic map lattice combining floating-point computations with digital algebraic operations as well as local and global couplings. The coupled map lattice based hash function (CMLHF) has strong collision resistance with even a single round of operations, so the weakness due to correlations between different rounds can be avoided. Moreover, CMLHF is fast, and its software implementation is convenient.

The paper is organized as follows. In Section 2, we introduce our hash function operations. In Section 3, the characteristics and advantages of CMLHF are discussed. In Section 4, a detailed security analysis of CMLHF is presented. The last section presents conclusions.

2. Model of CMLHF

2.1. Collision-resistant hash function

A hash function is a function that compresses an input message of arbitrary length into an output of short, fixed length:
$$h = (h_1 h_2 \ldots h_q) = H(M), \quad M = (m_1 m_2 m_3 \ldots m_p), \quad p \gg q, \tag{1}$$
where $H$ represents the operation that maps a message $M$ of arbitrary length $p$ bits to a $q$-bit hash value (also called a digest). A hash value usually serves as the fingerprint of the corresponding message. With a hash function $H$ and a received message $M'$, the hash value $h' = H(M')$ is calculated, and if $h'$ is the same as the received hash value $h$, the received message $M'$ is taken to be the true message $M$. For a secure hash function $H$ and a given hash value $h$, it is difficult to find a message $M$ (called a pre-image) such that $H(M) = h$, and it is equally difficult to find a fake message $M'$ with the same hash value $h$ as the true message $M$, i.e., $H(M') = H(M) = h$, $M' \neq M$ [15].

2.2. The design of CMLHF

In this section, we provide a detailed description of the algorithm of CMLHF. CMLHF is constructed from a coupled map lattice combining floating-point calculations with some simple digital algebraic operations. The algorithm is based on $N$ coupled maps with $N$ = 4, 5, 6, 8, 12, 16, corresponding to hash value sizes of 128, 160, 192, 256, 384 and 512 bits, respectively.

2.2.1. Padding and processing of CMLHF

Before the hashing operation, the input $M$ is padded to a multiple of $32N$ bits by adding a one and some zeroes, i.e., 100...0, where $32N$ bits is the message block size. The last padded 64 or 128 bits are used to represent the length of the original message. The maximum length of the message is $2^{64}$ for the algorithms with hash sizes of 128, 160, and 192 bits and $2^{128}$ for the others with larger hash sizes. If the last block of the input $M$ has fewer than 64 or 128 bits, one more block is added. The padded $M$ is then divided into $t$ blocks denoted by $M_1$ through $M_t$, each block having $32N$ bits.

CMLHF is based on a round transformation. Fig. 1a shows the scheme of CMLHF. $H_i^0 \in [0, 2^{32})$, $i = 1, 2, \ldots, N$, are the initial variables. The hash value of the message is the output after successively processing all message blocks $M_1, M_2, \ldots, M_t$.

2.2.2. Round transformation

The most important part of the proposed algorithm is the design of the round transformation. Fig. 1b shows the scheme for the round transformation that maps a $32N$-bit chaining value $H_i$, $i = 1, 2, \ldots, N$, and a $32N$-bit block message $m_i$, $i = 1, 2, \ldots, N$, to another $32N$-bit chaining value new $H_i$, $i = 1, 2, \ldots, N$. The round transformation includes five parts:

(i) Message expansion, which expands a block message from $32N$ bits to $64N$ bits.
(ii) Parameter computation, producing the parameters of the coupled map lattice.
(iii) Variable computation, producing the variables of the maps from the chaining values.
(iv) Dynamics of the coupled map lattice.
(v) Output transformation, giving new chaining values from the variables of the dynamics.

2.2.2.1. Message expansion. This step expands the block message from $32N$ bits to $64N$ bits. The $32N$-bit message is represented as $m_1, m_2, \ldots, m_N$, $m_i \in [0, 2^{32})$, and the expansion is made by nonlinear transformations defined as:
$$\mathrm{sum} = (m_1 \oplus m_2 \oplus \cdots \oplus m_N \oplus H_1 \oplus H_2 \oplus \cdots \oplus H_N) \ggg 10, \tag{2}$$

$$m_{i+N} = g(\mathrm{sum} \oplus m_i \oplus H_i), \quad i = 1, 2, \ldots, N, \tag{3}$$

where $\oplus$ denotes bitwise XOR and $x \ggg y$ ($x \lll y$) stands for a right (left) rotation of $x$ by $y$ bits. The function $g(A)$ is a nonlinear transformation shown in Fig. 1c. First, a 32-bit input $A$ is transformed to a 32-bit $A'$ by rotations and bitwise XORs:

$$A' = A \oplus (A \ggg 4) \oplus (A \ggg 12) \oplus (A \ggg 19) \oplus (A \lll 5). \tag{4}$$
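The rotation-and-XOR steps of Eqs. (2) and (4) can be sketched in a few lines of Python. This is only an illustration: the function names (`rotr32`, `rotl32`, `rotated_xor_sum`, `linear_mix`) are hypothetical and do not come from the paper, and all words are assumed to be unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF  # keeps every intermediate result within 32 bits


def rotr32(x: int, y: int) -> int:
    """Rotate the 32-bit word x right by y bits."""
    x &= MASK32
    y %= 32
    return ((x >> y) | (x << (32 - y))) & MASK32


def rotl32(x: int, y: int) -> int:
    """Rotate the 32-bit word x left by y bits."""
    x &= MASK32
    y %= 32
    return ((x << y) | (x >> (32 - y))) & MASK32


def rotated_xor_sum(m, H) -> int:
    """Eq. (2): XOR all message words and chaining words, then rotate right by 10."""
    acc = 0
    for word in list(m) + list(H):
        acc ^= word & MASK32
    return rotr32(acc, 10)


def linear_mix(A: int) -> int:
    """Eq. (4): A' = A xor (A>>>4) xor (A>>>12) xor (A>>>19) xor (A<<<5)."""
    A &= MASK32
    return A ^ rotr32(A, 4) ^ rotr32(A, 12) ^ rotr32(A, 19) ^ rotl32(A, 5)
```

For instance, `linear_mix(1)` spreads the single set bit to five positions, which is the bit-spreading effect that Eq. (4) is designed to provide before the S-box lookup.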
Fig. 1. (a) A scheme of CMLHF. (b) The structure of the round transformation. (c) The scheme of the transformation g from 32-bit to another 32-bit.
Then, in order of bit significance from high to low, the 32-bit $A'$ is divided into four bytes, $A'_1 A'_2 A'_3 A'_4$, each of which undergoes an 8-bit to 8-bit nonlinear S-box transformation $S(A'_i)$ (the S-box is presented in Section 2.2.3 and Table 1). The final four new bytes are combined into a new 32-bit integer $S(A'_1) S(A'_2) S(A'_3) S(A'_4)$. The rotations of Eq. (4) guarantee that (i) each bit of $A$ can influence each bit of $A'$ through the $g$ transformation of Eq. (3) and (ii) each bit of $A$ plays different roles in different bytes $A'_i$, $i = 1, 2, 3, 4$.

2.2.2.2. Parameters computation. This step produces the parameters of the coupled map lattice from the above expanded message by the following linear transformations:
$$a_i = 3.75 + \frac{m_i}{2^{34}}, \quad b_i = 0.25 + \frac{m_{i+N}}{2^{34}}, \quad i = 1, 2, \ldots, N. \tag{5}$$
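As a small sketch of Eq. (5), the mapping from the expanded message words to the lattice parameters might look as follows. The function name `lattice_parameters` is hypothetical, and the expanded message is assumed to be given as a list of $2N$ unsigned 32-bit integers.

```python
def lattice_parameters(expanded):
    """Eq. (5): map the 2N expanded 32-bit words to parameters (a, b).

    The first N words give a_i in [3.75, 4.0) and the last N words give
    b_i in [0.25, 0.5), because each word is smaller than 2**32.
    """
    if len(expanded) % 2 != 0:
        raise ValueError("expected an even number of expanded message words")
    n = len(expanded) // 2
    a = [3.75 + m / 2**34 for m in expanded[:n]]
    b = [0.25 + m / 2**34 for m in expanded[n:]]
    return a, b
```

Dividing by $2^{34}$ rather than $2^{32}$ confines each parameter to an interval of width 1/4, so $a_i$ stays in the strongly chaotic range of the logistic map used in Eq. (8).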
2.2.2.3. Computation of variables of maps. With the following formula, this step transforms the chaining values $H_i$ to the real variables $x_i$, $i = 1, 2, \ldots, N$, of the coupled map lattice:

$$x_i = \frac{H_i}{2^{32}}, \quad H_i \in [0, 2^{32}), \quad i = 1, 2, \ldots, N, \tag{6}$$
where the chaining values $H_i$ are shown in Fig. 1a and are either the initial variables $H_1^0, H_2^0, \ldots, H_N^0$ or the outputs of each round transformation $H_1^i, H_2^i, \ldots, H_N^i$. The method producing the initial variables for different hash value sizes is presented in Section 2.2.4. The specific values are listed in Table 2.

Table 1. 8-bit to 8-bit nonlinear transformation of the S-box (row x, column y; all values hexadecimal).

x\y  0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0    B5 B0 40 F3 AC 0E B1 08 5A 1E 0F 85 CF 26 82 29
1    42 4C 6F 02 DD A3 73 EC C1 D3 91 89 23 4F E9 9B
2    7D 12 BE 5C BB 9C 5E 3A 51 15 E1 9E 99 34 01 55
3    45 58 8C B9 92 FE A6 DF 97 57 0C 8F AA 71 E7 F9
4    9F 61 D8 A4 4E FD E3 36 6A E5 79 1C 0D F0 25 10
5    63 D2 D9 39 4D FC 81 72 6E C7 DC B8 A1 E8 0A 6B
6    65 54 B2 CA 9D A5 EF 80 E4 84 EB B7 1B 2B 05 68
7    70 3D 96 74 11 94 19 F2 8A C0 1D 33 E6 BF D5 1A
8    B4 46 C2 EE 69 3F FF FB 0B 86 7B 87 2A AD D4 59
9    F7 04 8D AF D1 53 93 C4 35 3B DE 52 14 F6 17 EA
A    60 A8 E0 38 CB CC 56 20 90 DA 21 32 F1 BC 76 50
B    B6 4B 37 B3 88 27 8E 22 D0 62 AE 83 7A F4 00 7C
C    5F 28 16 67 CD 44 48 A0 31 2E 66 43 3E C6 24 D6
D    4A A9 18 7F 98 7E 3C C9 E2 F5 47 BA 75 C3 95 5B
E    C8 5D 03 CE 2D 30 ED 8B 6C 49 77 6D 1F 9A 07 C5
F    06 13 D7 AB 78 09 F8 FA 64 A7 DB 2F BD 41 2C A2
2.2.2.4. Dynamics of the coupled map lattice. A coupled map lattice with both local and global couplings is designed to perform bit confusion and diffusion between the input message and the chaining values. The function reads

$$y_i = f_1(a_i, x_i) + f_2(b_{i-1}, k_{i-1}, x_{i-1}) + f_1(a_{i+1}, u) + f_2(b_{i+2}, k_{i+2}, w), \quad i = 1, 2, \ldots, N, \tag{7}$$

$$f_1(a, x) = a x (1 - x), \quad a \in [3.75, 4.0], \quad x \in [0, 1.0], \tag{8}$$

$$f_2(b, k, x) = b (x + k), \quad b \in [0.25, 0.5], \quad k \in [0.0, 1.0], \quad x \in [0, 1.0], \tag{9}$$
with the parameters $k_i$, $i = 1, 2, \ldots, N$, representing a set of real constants. Section 2.2.4 specifies how to produce $k_i$, and Table 2 lists their specific values. Periodic boundary conditions are used: $k_{i+N} = k_i$, $a_{i+N} = a_i$, $b_{i+N} = b_i$, $i = 1, 2, \ldots, N$, and $x_0 = x_N$. The real terms $u$ and $w$ on the right-hand side of Eq. (7) produce global couplings, and they are defined as

$$u_1 = \left\lfloor 2^{48} \sum_{i=1}^{N} f_2(b_i, k_i, x_i) \right\rfloor \bmod 2^{32}, \quad u = \frac{g(u_1)}{2^{32}}, \tag{10}$$

$$w_1 = \left\lfloor 2^{48} \sum_{i=1}^{N} f_1(a_i, x_i) \right\rfloor \bmod 2^{32}, \quad w = \frac{g(w_1)}{2^{32}}, \tag{11}$$

where the operation $g$ is given by Fig. 1c and Table 1.

2.2.2.5. Output transformation. The output function digitizes the real output of the coupled map lattice and produces the new chaining values. The output function is defined as
$$Z_i = \left\lfloor 2^{48} y_i \right\rfloor \bmod 2^{32}, \quad Z_i \in [0, 2^{32}), \tag{12}$$

$$\mathrm{new}H_i = Z_i \oplus m_{N-i+1} \oplus m_{2N-i+1}, \quad i = 1, 2, \ldots, N. \tag{13}$$
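To make the data flow of Eqs. (5)–(13) concrete, the following Python sketch evaluates one round for an already expanded message. It is a reading of the equations under stated assumptions, not the authors' implementation: the names `g`, `digitize` and `round_core` are hypothetical, the S-box is passed in as a 256-entry list (Table 1 would be used in practice), the digitization is assumed to truncate toward zero before the modulo, and the message expansion of Eqs. (2) and (3) is assumed to have been applied beforehand.

```python
import math

MASK32 = 0xFFFFFFFF


def rotr32(x, y):
    return ((x >> y) | (x << (32 - y))) & MASK32


def rotl32(x, y):
    return ((x << y) | (x >> (32 - y))) & MASK32


def g(A, sbox):
    """Eq. (4) followed by a byte-wise S-box lookup, as in Fig. 1c."""
    A &= MASK32
    mixed = A ^ rotr32(A, 4) ^ rotr32(A, 12) ^ rotr32(A, 19) ^ rotl32(A, 5)
    out = 0
    for shift in (24, 16, 8, 0):  # bytes A'_1..A'_4, most significant first
        out = (out << 8) | sbox[(mixed >> shift) & 0xFF]
    return out


def f1(a, x):
    return a * x * (1.0 - x)  # Eq. (8)


def f2(b, k, x):
    return b * (x + k)  # Eq. (9)


def digitize(value):
    """Assumed reading of Eqs. (10)-(12): floor(2**48 * value) mod 2**32."""
    return int(math.floor(2**48 * value)) % 2**32


def round_core(H, m, k, sbox):
    """H: N chaining words, m: 2N expanded words, k: N constants. Returns new H."""
    n = len(H)
    a = [3.75 + m[i] / 2**34 for i in range(n)]      # Eq. (5)
    b = [0.25 + m[i + n] / 2**34 for i in range(n)]  # Eq. (5)
    x = [h / 2**32 for h in H]                       # Eq. (6)
    u = g(digitize(sum(f2(b[i], k[i], x[i]) for i in range(n))), sbox) / 2**32
    w = g(digitize(sum(f1(a[i], x[i]) for i in range(n))), sbox) / 2**32
    new_H = []
    for i in range(n):  # zero-based i; indices wrap around periodically
        y = (f1(a[i], x[i])
             + f2(b[(i - 1) % n], k[(i - 1) % n], x[(i - 1) % n])
             + f1(a[(i + 1) % n], u)
             + f2(b[(i + 2) % n], k[(i + 2) % n], w))  # Eq. (7)
        z = digitize(y)                                # Eq. (12)
        # Eq. (13): one-based m_{N-i+1} and m_{2N-i+1} become these indices.
        new_H.append(z ^ m[n - 1 - i] ^ m[2 * n - 1 - i])
    return new_H
```

Because $y_i$ is at most a small multiple of one, the product $2^{48} y_i$ stays below $2^{53}$ and is represented exactly as a double before truncation, which is presumably why 64-bit floating point suffices for a 32-bit output word.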
2.2.3. S-box of CMLHF

A large number of 8-bit-to-8-bit random maps are constructed. Their statistical properties are compared with respect to the strict avalanche criterion, differential cryptanalysis, and linear cryptanalysis, and an optimal map is chosen as our S-box from $10^8$ random maps. Table 1 gives the 8-bit-to-8-bit transformation of this S-box, where the input variables xy (x and y are the row and column variables, with hexadecimal values) and the corresponding outputs are presented as hexadecimal values. For example, the input 'xy = 10' produces the output '42'.

2.2.4. Parameters of CMLHF

We specify the initial chaining values $H_i$ in Eq. (6) and the real constants $k_i$ in Eq. (7). The real constants $k_i$ are given by the following formulas:
$$k_i = \cos\frac{i}{N+1}, \quad i = 1, 2, \ldots, N, \quad N = 4, 5, 6, 12, 16,$$

$$k_i = \cos\frac{i}{N+3}, \quad i = 1, 2, \ldots, N, \quad N = 8.$$
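The two formulas for $k_i$ can be checked numerically with a short Python sketch. The function name `lattice_constants` is hypothetical; the argument of the cosine is taken in radians, which is what reproduces the four-decimal values quoted in the paper.

```python
import math


def lattice_constants(n):
    """Real constants k_1..k_N: cos(i/(N+1)), except cos(i/(N+3)) for N = 8."""
    denom = n + 3 if n == 8 else n + 1
    return [math.cos(i / denom) for i in range(1, n + 1)]


if __name__ == "__main__":
    for n in (4, 5, 6, 8, 12, 16):
        print(n, [round(k, 4) for k in lattice_constants(n)])
```

The special denominator for $N = 8$ is taken directly from the formulas above; the paper does not state a reason for it.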
Table 2. Parameters of CMLHF for different hash value sizes.

N = 4, output size 128 bits
  Initial variables H_i: 91ED38AA, BC4AC89D, D8A01D1A, EE4B30CB
  Constants k_i: 0.9801, 0.9211, 0.8253, 0.6967

N = 5, output size 160 bits
  Initial variables H_i: 880C115A, B067BB33, CB9005FD, E0698708, F178C3C5
  Constants k_i: 0.9861, 0.9450, 0.8776, 0.7859, 0.6724

N = 6, output size 192 bits
  Initial variables H_i: 8018B519, A6C50F56, C0EDC832, D514141C, E59AD573, F3ADFDEE
  Constants k_i: 0.9898, 0.9595, 0.9096, 0.8411, 0.7556, 0.6546

N = 8, output size 256 bits
  Initial variables H_i: 6ADA291D, 8CB280AB, A3EFC179, B6030647, C4F176FB, D1BB3843, DCF3E6BF, E6F91BB7
  Constants k_i: 0.9959, 0.9835, 0.9630, 0.9346, 0.8985, 0.8549, 0.8043, 0.7470

N = 12, output size 384 bits
  Initial variables H_i: 63C1258E, 83DF84C4, 9A0BEFB1, AB5AC599, B9B0AE14, C5FE6CFD, D0CF18EC, DA7B1242, E33E5756, EB43ABBF, F2AAB63B, F98B9A28
  Constants k_i: 0.9970, 0.9882, 0.9735, 0.9530, 0.9269, 0.8954, 0.8585, 0.8166, 0.7698, 0.7184, 0.6629, 0.6034

N = 16, output size 512 bits
  Initial variables H_i: 592B1196, 76992425, 8B16AC27, 9B2C65DA, A88D7C2D, B4131777, BE3B2292, C755E0A4, CF99F4BD, D72E69A4, DE303559, E4B57966, EACF8964, F08C3CA7, F5F6D254, FB18903E
  Constants k_i: 0.9983, 0.9931, 0.9845, 0.9724, 0.9571, 0.9384, 0.9164, 0.8913, 0.8631, 0.8319, 0.7979, 0.7610, 0.7216, 0.6796, 0.6353, 0.5888
For example, if N = 6, the six real constants are k1 = 0.9898, k2 = 0.9595, k3 = 0.9096, k4 = 0.8411, k5 = 0.7556, k6 = 0.6546. Table 2 lists the constants of the systems for different hash value sizes. The initial chaining values Hi are generated by the following formulas:
$$H_i = \mathrm{int}\left[\ln\left(1 + 1.718\sqrt{\frac{i}{N+1}}\right) \cdot 2^{32}\right], \quad i = 1, 2, \ldots, N, \quad N = 4, 5, 6, 12, 16,$$

$$H_i = \mathrm{int}\left[\ln\left(1 + 1.718\sqrt{\frac{i}{N+3}}\right) \cdot 2^{32}\right], \quad i = 1, 2, \ldots, N, \quad N = 8.$$
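A minimal Python sketch of these formulas is given below. The function name `initial_chaining_values` is hypothetical, and `int` is assumed to mean truncation toward zero. The sketch reproduces the leading hexadecimal digits of the tabulated values; whether the lowest bits match exactly depends on floating-point details that have not been checked here.

```python
import math


def initial_chaining_values(n):
    """H_i = int[ln(1 + 1.718 * sqrt(i / d)) * 2**32], d = N + 3 for N = 8, else N + 1."""
    denom = n + 3 if n == 8 else n + 1
    return [int(math.log(1.0 + 1.718 * math.sqrt(i / denom)) * 2**32)
            for i in range(1, n + 1)]


if __name__ == "__main__":
    print([format(h, "08X") for h in initial_chaining_values(6)])
```

Since $1 + 1.718\sqrt{i/(N+1)}$ is always below $e \approx 2.71828$, the logarithm is below one and every $H_i$ fits in 32 bits.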
For example, if N = 6, the six initial chaining values are H1 = 8018B519, H2 = A6C50F56, H3 = C0EDC832, H4 = D514141C, H5 = E59AD573, H6 = F3ADFDEE as hexadecimal numbers. Table 2 presents the specified initial chaining values of the systems for different hash value sizes.

3. Characteristics and advantages of CMLHF

Our previous work [20] was the first to use chaotic coupled map lattices to design hash algorithms. The present work develops the idea of coupled map lattices to design a hash function with one-iteration collision resistance. CMLHF has the following novel characteristics:

(i) The hash function operations are mainly based on coupled map lattices with both one-way local and global couplings (through variables u and w; see the third and fourth terms on the right-hand side of Eq. (7), and Eqs. (10) and (11)). The combined structure of the local and global couplings yields strong bit confusion and diffusion rates among the state variables of all sites xi and all message parameters ai, bi, i = 1, 2, . . . , N, and makes it difficult to analyze any part of the bits separately, even with a single iteration of Eq. (7). In our previous work, different maps have distinct functions (see Eqs. (5a)–(5e) in Ref. [20]), with the Nth map different from all other maps and the other maps dynamically different according to their distances to the Nth map. Some of these maps might have particular weaknesses and be vulnerable to attacks; in addition, the heterogeneity of the maps makes security analysis difficult. In the present work, all coupled maps are structurally equivalent through their nearest-neighbor local couplings and global couplings (Eq. (7)). Security analysis of any single map can therefore represent the security of the whole system.

(ii) All the input message bits of m1, m2, . . . , mN and the chaining value bits of H1, H2, . . . , HN affect every bit of the parameters of the coupled maps, b1, b2, . . . , bN, affect the global coupling variables u and w through Eqs.
(3), (5), (10) and (11), and further affect every bit of the output new Hi. The fast diffusion and confusion rates make one-iteration collision resistance possible.

(iii) Pure floating-point computation may be easily analyzed by analytical computations, even for chaotic systems. CMLHF incorporates various algebraic operations such as input bit expansion (Eqs. (2)–(4)), S-box operations (Eqs. (2)–(4), (10), and (11)), and modulo operations (Eqs. (10)–(12)). Though all of the algebraic operations are very simple and individually weak against algebraic attacks, the joint application of algebraic operations and analytical chaos computations considerably enhances the efficiency of bit confusion and diffusion and gives strong resistance against both algebraic and analytical attacks.

Due to the above characteristics, the CMLHF system has two major advantages over the other hash functions proposed so far. The first is that with one iteration of operations, CMLHF has strong resistance against collision attacks, so the security of the system can be estimated more easily than that of other hash function systems. All previously known hash functions have weak collision resistance in one or a few rounds, so multiple and sometimes very large numbers of rounds are employed to strengthen their security as in, for example, SHA-1 [12,17] and other algorithms proposed in [18,20,21]. In other words, time complexity is utilized in these systems to strengthen security. In these cases the links from round to round could hide some fatal weaknesses that enable attacks based on local collision analysis [23,28]. CMLHF makes use of spatial complexity instead of time complexity. The spatial complexity in the CMLHF system is generated first by two global couplings, i.e., variables u and w in Eq.
(7), compared with commonly used one-way coupled map lattices like $x_{n+1}(i) = (1 - \varepsilon) f(x_n(i)) + \varepsilon f(x_n(i-1))$, and second by floating-point computations of 64-bit size compared with algebraic computations on integers of 32-bit size. This spatial complexity makes it possible to use the algebraic modulo operation, which is extremely important for strengthening the single-iteration collision resistance of our system.

This system is anticipated to have high security after a single iteration of operations. Though a precise evaluation is impossible, the following heuristic analysis helps in estimating the strength of security. Redundancy of computational variable spaces seems to be impossible because the input bit expansion (Eqs. (2)–(4)) and the local and global coupling of maps make each input bit diffused and confused to each output bit. All attacks based on analytical computation are difficult because of the S-box (Eqs. (2), (3), (4), (10), and (11)) and modulo operations (Eqs. (10)–(12)). On the other hand, attacks based on algebraic computations will also be difficult because of the random map (Table 1) together with the complexity of the nonlinear floating-point computation of Eqs. (7)–(11). We thus estimate that (i) Given the input chaining value Hi (i.e.,
xi) and the output new Hi, one cannot recover the message mi with computational expense less than $2^{32N}$ tests. (ii) Given the input message mi and the output new Hi, one cannot find the input chaining value Hi with computational expense less than $2^{32N}$ tests. (iii) One cannot find any two distinct sets of the input chaining value Hi and mi producing the same new Hi with computational expense less than $2^{16N}$ (the security expense for birthday attacks). These estimates have not yet been proven, and they may be regarded as challenging tasks for any interested attackers.

The second advantage of CMLHF is its efficiency in producing hash values. In Ref. [18], the authors compared the efficiencies of four chaotic hash algorithms proposed in [18,21,25], and concluded that Ren's scheme in [18] had much higher efficiency than the others. To compare efficiency, we measure the speeds of our algorithm and of Ren's and Wang's algorithms with Visual C/C++ [18,20]. PC I and PC II stand for Intel Pentium IV 3.0 GHz computers and Intel Core Duo 2.8 GHz computers, respectively. Table 4 shows that our system is three times faster than Ren's scheme and comparatively faster than Wang's algorithm [20]. For the SHA series and hash functions based on AES, increasing the size of the hash value can considerably decrease the efficiency of producing hash values [17,13]. Table 3 shows the performance of some hash functions on a Pentium III processor, as reported in Refs. [17,13]. In Table 3, we can observe that SHA-256, SHA-512, and Whirlpool (based on AES) are considerably slower than SHA-1. A remarkable advantage of CMLHF is that the speed of message processing is not affected by increasing the number of coupled maps N (from 128 bits for N = 4 to 160, 192, 256, 384, and 512 bits for N = 5, 6, 8, 12, 16, respectively), namely the hash value size. A detailed comparison of speeds between CMLHF and SHA-1 is given in Table 4.
Table 4 shows that while our system is slightly slower than SHA-1 for 160-bit hash values, it is considerably faster than the SHA series for larger hash sizes. Recently, hash values with large bit numbers have been most often used for guaranteeing the security of hash systems. In these cases, the high efficiency of our system for large hash value sizes is a remarkable advantage.

4. Security analysis of CMLHF

This section investigates first the statistical properties of CMLHF related to collision resistance and second some attacks against it, including the sensitivity of output hash values to input parameters and initial chaining values, distinguishing attacks, and algebraic cryptanalysis. Without loss of generality, we choose the algorithm with N = 4 in Eq. (7), which has a 128-bit hash value and message block. The parameters of the algorithm are H1 = 91ED38AA, H2 = BC4AC89D, H3 = D8A01D1A, H4 = EE4B30CB, k1 = 0.9801, k2 = 0.9211, k3 = 0.8253, k4 = 0.6967.

4.1. Sensitivity of output hash values to changes of message bits

To study the sensitivity of the output hash values to the message, we compute a pair of outputs $Z_i$ and $Z'_i$ in Eqs. (2)–(12) with a pair of different messages $(m_1, m_2, m_3, m_4)$ and $(m'_1, m'_2, m'_3, m'_4)$. The difference between $Z_i$ and $Z'_i$ is $\Delta Z_i = Z_i \oplus Z'_i$, with $\Delta Z_i = \Delta z_{32(i-1)+1} \Delta z_{32(i-1)+2} \ldots \Delta z_{32(i-1)+32}$, $i = 1, 2, 3, 4$. The averages of all 128 bits $\Delta z_1, \Delta z_2, \ldots, \Delta z_{128}$ over $T$ pairs of different messages are

$$\langle \Delta z_i \rangle = \frac{1}{T} \sum_{1}^{T} \Delta z_i, \quad i = 1, 2, \ldots, 128. \tag{14}$$
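The statistic of Eq. (14) is straightforward to estimate for any 128-bit hash. The Python sketch below does so for message pairs that differ only in the least significant bit of $m_1$. Since a full CMLHF implementation is not reproduced here, a truncated SHA-256 from the standard library is used purely as a stand-in; the names `bit_flip_averages` and `sha256_stand_in` are hypothetical.

```python
import hashlib
import random


def bit_flip_averages(hash_fn, n_bits, trials, seed=0):
    """Estimate <dz_j> of Eq. (14): the fraction of trials in which output bit j flips."""
    rng = random.Random(seed)
    counts = [0] * n_bits
    for _ in range(trials):
        m1 = rng.getrandbits(32)
        block = [m1, 0, 0, 0]        # m_2 = m_3 = m_4 = 0
        flipped = [m1 ^ 1, 0, 0, 0]  # m'_1 = m_1 xor 1
        delta = hash_fn(block) ^ hash_fn(flipped)
        for j in range(n_bits):
            counts[j] += (delta >> j) & 1
    return [c / trials for c in counts]


def sha256_stand_in(words):
    """Stand-in 128-bit hash (NOT CMLHF): SHA-256 truncated to 16 bytes."""
    data = b"".join(w.to_bytes(4, "big") for w in words)
    return int.from_bytes(hashlib.sha256(data).digest()[:16], "big")


if __name__ == "__main__":
    averages = bit_flip_averages(sha256_stand_in, 128, 2000)
    print(min(averages), max(averages))
```

For a hash with good avalanche behaviour, each average should be close to 1/2, with deviations shrinking roughly like $1/\sqrt{T}$ as the number of trials $T$ grows.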
The outputs of floating-point computations of chaotic dynamics are less sensitive to the low significant bits of the parameters and state variables. Therefore our investigation focuses on these "weak" bits. We take $m'_i = m_i = 0$, $i = 2, 3, 4$, and $m'_1 = m_1 \oplus 1$, where $m_1 \in [0, 2^{32})$ is arbitrarily chosen, with $H_1 = H_2 = H_3 = H_4 = 0$. In Fig. 2a the averages $\langle \Delta z_i \rangle$ are plotted for all 128 bits with $T = 10^8$ in Eq. (14). Fig. 2a shows that the average values of all output bits are near $\frac{1}{2}$, and Fig. 2b shows that the fluctuations $|\langle \Delta z_i \rangle_{\max,\min} - \frac{1}{2}|$ are proportional to $1/\sqrt{T}$ (the circles and dots correspond to the maximal and minimal values of the 128 fluctuations, respectively). Fig. 2 indicates that the hash values produced by the two different messages are completely uncorrelated. Other tested cases, such as arbitrary differences between $m_1$ and $m'_1$, show similarly good statistical properties.

Fig. 2. Simulation results of Eqs. (2)–(12). $\langle \Delta z_i \rangle$ are defined in Eq. (14). $m'_i = m_i$, $i = 2, 3, 4$, and $m'_1 = m_1 \oplus 1$. $m_1$ is randomly chosen in the range $[0, 2^{32})$. $H'_i = H_i = 0$, $i = 1, 2, 3, 4$. (a) All $\langle \Delta z_i \rangle$ are equal to $\frac{1}{2}$ with some random fluctuations, indicating satisfactory statistical randomness. $T = 10^8$. (b) $|\langle \Delta z \rangle_{\max} - \frac{1}{2}|$ and $|\langle \Delta z \rangle_{\min} - \frac{1}{2}|$ are plotted vs. $T$; circles and dots correspond to the maximum and minimum of all $\langle \Delta z_i \rangle$, $i = 1, 2, \ldots, 128$.

Table 3. Performance results on a Pentium III processor in Refs. [17,13].

Algorithm    Output size (bits)  Block size (bits)  Number of rounds  Relative performance  Speed (cycles/byte)
SHA-1        160                 512                80                1.00                  8.3
SHA-256      256                 512                64                0.40                  20.59
SHA-512      512                 1024               80                0.21                  40.18
Whirlpool    512                 512                10                0.23                  36.52
RIPEMD-160   160                 512                80                0.73                  11.34

Table 4. Performance results of SHA-1, CMLHF, and other systems on two Intel processors. Speeds are in Mbit/s; the number of rounds is given in parentheses.

Algorithm            Output size (bits)  Block size (bits)  Speed (PC I)  Speed (PC II)
SHA-1                160                 512                250 (80)      440 (80)
CMLHF                128                 128                130 (1)       330 (1)
                     160                 160                120 (1)       370 (1)
                     192                 192                150 (1)       380 (1)
                     256                 256                147 (1)       400 (1)
                     384                 384                152 (1)       400 (1)
                     512                 512                156 (1)       410 (1)
Wang's scheme [20]   160                 256                128 (10)      320 (10)
                     256                 448                143 (10)      422 (10)
Ren's scheme [18]    128                 256                32 (4)        98 (4)

Entries with 2 rounds (six values per machine; their row labels are missing in the source): PC I: 86, 97, 99, 108, 100, 110; PC II: 200, 210, 230, 240, 250, 260.

4.2. Sensitivity of output hash values to changes of initial chaining values

Fig. 3 demonstrates the sensitivity of the hash values to the initial chaining values. With a pair of initial chaining values $H_i$ and $H'_i$, outputs $Z_i$ and $Z'_i$ are calculated through Eqs. (2)–(12). We keep $m'_i = m_i = 0$, $i = 1, 2, 3, 4$, $H'_i = H_i$, $i = 2, 3, 4$, and $H'_1 = H_1 \oplus 1$. With randomly chosen $H_1$, the averages of the differences in Eq. (14) are plotted in Fig. 3 as in Fig. 2. The differences of Eq. (14) are clearly random for the two sets of initial chaining values with a slight difference; therefore the hash values of the algorithm are sensitive to the initial chaining values.

4.3. S-box analysis

The security of CMLHF is closely correlated with the statistical properties of the S-box transformation. We have analyzed various statistical properties of the S-box, including the strict avalanche criterion (SAC), differential cryptanalysis (DC), and linear cryptanalysis (LC) [5]. If an S-box has favorable SAC, a one-bit change in the input should result in a 50% probability of change for each output bit. Table 5 gives the probabilities that the output bits change when the input bits change. S1 denotes the input bit change 00000001 and S8 denotes 10000000. In and Out indicate the input and output of the S-box. Indeed, the probabilities of all output changes in Table 5 are around 0.5.
Fig. 3. The same curves as Fig. 2 with a pair of different initial variables $H'_i = H_i = 0$, $i = 2, 3, 4$; $H'_1 = H_1 \oplus 1$. $H_1$ is randomly chosen in $[0, 2^{32})$, $m'_i = m_i = 0$, $i = 1, 2, 3, 4$.
To measure DC we compute difference propagation values, i.e., the numbers of occurrences of the output differential patterns for all different input combinations. The output differential pattern is defined as

$$c = S(a) \oplus S(b), \quad a, b = 0, 1, \ldots, 255, \tag{15}$$

where $a$ and $b$ are two arbitrary inputs of the S-box, and $S(a)$ and $S(b)$ are the corresponding outputs of the S-box. S-boxes with preferable DC properties should minimize the largest difference propagation value. Numerical results show that the difference propagation values of our S-box are 0, 2, 4, 6, 8 and 10, and the corresponding probabilities are 60.65%, 30.34%, 7.54%, 1.31%, 0.14%, 0.02%. The maximum difference propagation value is as low as 10 (corresponding to a difference propagation probability of $10/2^8 = 5/128$), which is close to the value of the S-box of AES [5].

To study LC we compute the input/output correlation of the S-box [5]. S-boxes with good security quality should minimize the largest non-trivial correlation between linear combinations of input bits and linear combinations of output bits. The maximum input/output correlation of the S-box in our protocol is as low as $32/2^8 = 2^{-3}$, which is the same as that of the S-box of AES [5].
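The difference propagation values of Eq. (15) can be tabulated for any S-box with a few lines of Python. The sketch below is generic: the function name `max_difference_propagation` is hypothetical, the S-box is assumed to be given as a list whose length is a power of two, and the result for Table 1 is not asserted here because it has not been recomputed.

```python
def max_difference_propagation(sbox):
    """Largest count of inputs a with S(a) xor S(a xor din) = dout, over all din != 0."""
    size = len(sbox)
    worst = 0
    for din in range(1, size):
        counts = {}
        for a in range(size):
            dout = sbox[a] ^ sbox[a ^ din]  # Eq. (15) with b = a xor din
            counts[dout] = counts.get(dout, 0) + 1
        worst = max(worst, max(counts.values()))
    return worst


if __name__ == "__main__":
    identity = list(range(256))
    # A linear map is the worst case: every input difference maps to one output difference.
    print(max_difference_propagation(identity))
```

Dividing the returned value by the table size gives the maximum difference propagation probability, the quantity reported as $10/2^8$ for the S-box of the paper.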
Table 5. The probabilities of the output bit changes against the input bit changes in the S-box transformation (rows: Out; columns: In).

Out\In  S1     S2     S3     S4     S5     S6     S7     S8
S1      0.516  0.453  0.594  0.500  0.453  0.578  0.547  0.484
S2      0.531  0.500  0.453  0.484  0.437  0.437  0.500  0.578
S3      0.500  0.562  0.578  0.562  0.484  0.562  0.547  0.531
S4      0.437  0.516  0.500  0.484  0.406  0.531  0.484  0.516
S5      0.516  0.453  0.500  0.469  0.437  0.547  0.531  0.469
S6      0.516  0.562  0.500  0.469  0.531  0.422  0.500  0.469
S7      0.500  0.484  0.531  0.484  0.578  0.562  0.422  0.578
S8      0.516  0.453  0.437  0.500  0.484  0.437  0.469  0.531
4.4. Distinguishing attack

A useful approach to analyzing the security of synchronous stream ciphers is the distinguishing attack, a tool aimed at distinguishing cipher sequences from a true random sequence. Our analysis shows that a distinguishing attack against our system requires $2^{256}$ trials. The analysis is given below.

We first investigate how the message affects the outputs of the system. Starting from randomly chosen initial chaining values $H_1, H_2, H_3, H_4$ and block message $m_1, m_2, m_3, m_4$, and taking the variable $y_2$ in Eq. (7) as an example, we obtain the following output:

$$y_2 = f_1(a_2, x_2) + f_2(b_1, k_1, x_1) + f_1(a_3, u) + f_2(b_4, k_4, w). \tag{16}$$

By changing $m_1$ to $m'_1$, another output is obtained:

$$y'_2 = f_1(a_2, x_2) + f_2(b'_1, k_1, x_1) + f_1(a_3, u') + f_2(b'_4, k_4, w'). \tag{17}$$
Now consider a pair of messages with a difference in the least significant bit, $m'_1 = m_1 \oplus 1$, while $m'_i = m_i$ for $i = 2, 3, 4$. Random expanded messages $m'_i$ and $m_i$, $i = 5, 6, 7, 8$, are obtained from the message expansion Eqs. (2), (3) and the function $g(x)$. The random coupled variables $u'$, $u$ and $w'$, $w$ can be represented by the formulas $u' = f_u(b'_1, b'_2, b'_3, b'_4)$, $u = f_u(b_1, b_2, b_3, b_4)$, $w' = f_w(a'_1)$, $w = f_w(a_1)$, in which

$$a'_1 \neq a_1, \quad b'_i \neq b_i, \quad i = 1, 2, 3, 4, \quad u' \neq u, \quad w' \neq w.$$

All pairs except $a'_1$ and $a_1$ in the above relations show good random differential properties, since the nonlinear function $g(x)$ has optimal confusion and diffusion properties through the multiple rotations and random nonlinear S-box operations. If $y'_2$ is equal to $y_2$, the following equation holds:

$$f_1(a_2, x_2) + f_2(b_1, k_1, x_1) + f_1(a_3, u) + f_2(b_4, k_4, w) = f_1(a_2, x_2) + f_2(b'_1, k_1, x_1) + f_1(a_3, u') + f_2(b'_4, k_4, w'). \tag{18}$$

Approximate Eq. (18) as

$$F(xx') = F(xx), \tag{19}$$

where $F$ denotes a 128-bit-to-32-bit nonlinear random transformation, and $xx'$ and $xx$ are two 128-bit random inputs, $xx' = u' \| w' \| b'_1 \| b'_4$, $xx = u \| w \| b_1 \| b_4$. The following theorem gives the collision probability of the outputs of $F(xx)$.

Theorem 1. Let $F$ be an $m$-bit-to-$n$-bit S-box with $m \geq n$ whose $n$-bit elements are all randomly generated. Let $xx'$, $xx$ be two $m$-bit random inputs to $F$. Then $F(xx') = F(xx)$ with probability $2^{-m} + 2^{-n} - 2^{-m-n}$ [24].
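The collision probability in Theorem 1 can be checked on small random lookup tables. The Python sketch below computes the exact collision probability of two independent uniform inputs for each sampled table and compares the average with $2^{-m} + 2^{-n} - 2^{-m-n}$, which follows from conditioning on whether the two inputs coincide. The function names and the small sizes $m = 10$, $n = 4$ are illustrative assumptions, not values from the paper.

```python
import random


def collision_probability(table):
    """Exact Pr[F(x') = F(x)] for independent uniform inputs x, x' of a lookup table F."""
    size = len(table)
    counts = {}
    for value in table:
        counts[value] = counts.get(value, 0) + 1
    return sum(c * c for c in counts.values()) / (size * size)


def theorem1_estimate(m, n):
    """2^-m (equal inputs) + (1 - 2^-m) * 2^-n (distinct inputs, random outputs)."""
    return 2.0 ** -m + 2.0 ** -n - 2.0 ** -(m + n)


def average_over_random_tables(m, n, tables, seed=0):
    """Average the exact collision probability over randomly generated m-to-n-bit tables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(tables):
        table = [rng.getrandbits(n) for _ in range(2 ** m)]
        total += collision_probability(table)
    return total / tables


if __name__ == "__main__":
    print(average_over_random_tables(10, 4, 50), theorem1_estimate(10, 4))
```

For $m = 128$ and $n = 32$ the last term is negligible, which is why the text that follows works with $2^{-32} + 2^{-128}$.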
According to Theorem 1, and neglecting the $2^{-160}$ term, $2^{-32} + 2^{-128}$ is the probability that Eq. (19) holds and $2^{-1} + 2^{-128}$ is the probability that any pair of corresponding bits of $y_2$ and $y_2'$ are equal. So the maximum bias $\delta$ is equal to $2^{-128}$. The probability of success in distinguishing the output of our CMLHF from a completely random selection can be estimated as
\[ \Pr\left(x > \delta\sqrt{N_L}\right) = \int_{\delta\sqrt{N_L}}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}\, dx, \tag{20} \]
where $N_L$ is the number of trials. After $2^{256}$ trials of Eq. (18), $\delta\sqrt{N_L} = 1$ and the output of CMLHF can be distinguished from a random selection with success rate 0.84 (the false negative and false positive rates are both 0.16).
Next we investigate how the initial chaining values $H_i$ affect the outputs of the system. Using the same block message $m_i$, $i = 1, 2, 3, 4$, and changing only $H_1$ to $H_1'$ among the initial chaining values while keeping the rest, we have the following relations:
\[ x_1' \neq x_1; \quad b_i' \neq b_i, \ i = 1, 2, 3, 4; \quad u' \neq u; \quad w' \neq w. \]
The variables $u'$, $u$ and $w'$, $w$ can be represented by the formulas $u' = f_u(b_1', b_2', b_3', b_4', x_1')$, $u = f_u(b_1, b_2, b_3, b_4, x_1)$; $w' = f_w(x_1')$, $w = f_w(x_1)$. Taking the 3rd map of the lattice in Eq. (7) as an example, the variable reads
\[ y_3 = f_1(a_3, x_3) + f_2(b_2, k_2, x_2) + f_1(a_4, u) + f_2(b_1, k_1, w) \tag{21} \]
and
\[ y_3' = f_1(a_3, x_3) + f_2(b_2', k_2, x_2) + f_1(a_4, u') + f_2(b_1', k_1, w'). \tag{22} \]
To keep $y_3 = y_3'$, the corresponding probability is $2^{-32} + 2^{-128}$. After $2^{256}$ trials of Eqs. (21) and (22), the output of one of the coupled maps can be distinguished from a random selection with a success rate of about 0.84.
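The figures 0.84 and 0.16 quoted in this subsection can be reproduced from the standard normal tail in Eq. (20). The Python sketch below is illustrative only. It assumes that the error rate is the upper tail beyond $\delta\sqrt{N_L}$ and that the success rate is its complement, which is how the text's numbers appear to be obtained. The helper names `gaussian_upper_tail` and `distinguishing_rates` are hypothetical.

```python
import math


def gaussian_upper_tail(t: float) -> float:
    """Pr(x > t) for a standard normal variable x, via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def distinguishing_rates(log2_bias: float, log2_trials: float):
    """Return (threshold, error_rate, success_rate) for bias 2^log2_bias and 2^log2_trials trials.

    Working with base-2 logarithms avoids overflow and underflow:
    threshold = delta * sqrt(N_L) = 2^(log2_bias + log2_trials / 2).
    """
    threshold = 2.0 ** (log2_bias + log2_trials / 2.0)
    error_rate = gaussian_upper_tail(threshold)
    # Assumption: success is taken as the complement of the tail probability.
    return threshold, error_rate, 1.0 - error_rate


if __name__ == "__main__":
    # Values from the text: delta = 2^-128, N_L = 2^256.
    threshold, error_rate, success_rate = distinguishing_rates(-128, 256)
    print(f"threshold = {threshold}, error = {error_rate:.4f}, success = {success_rate:.4f}")
```

Under the same assumption, fewer trials shrink the threshold below 1 and push the success rate toward 0.5, which is the sense in which about $2^{256}$ trials are needed.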
4.5. Algebraic cryptanalysis of CMLHF
This part considers algebraic cryptanalysis of CMLHF by solving multivariate equations over the field GF(2). The aim is to solve for the unknown 128-bit message $m_1, m_2, m_3, m_4$ (i.e., $a_1, a_2, a_3, a_4$), given the known 128-bit initial values $H_1, H_2, H_3, H_4$ (i.e., $x_1, x_2, x_3, x_4$) and the 128-bit output $Z_1, Z_2, Z_3, Z_4$. From Eqs. (7)–(9) and (12) we have
\[ y_1 = f_1(a_1, x_1) + f_2(b_4, k_4, x_4) + f_1(a_2, u) + f_2(b_4, k_4, w), \qquad Z_1 = 2^{48} y_1 \bmod 2^{32}. \]
The above bit equalities give 32 multivariate quadratic equations in which the quadratic terms come from the multiplications $a_2 u$ and $b_4 w$ of the nonlinear functions $f_1(a_2, u)$ and $f_2(b_4, k_4, w)$. For the 128-bit output $Z_1, Z_2, Z_3$ and $Z_4$, there are 128 quadratic equations. For the intermediate variables $b_1, b_2, b_3, b_4$ and $u$, $w$, based on Eqs. (2)–(5), (10) and (11), the transformations can be represented by
\[ b_i = NF_1(a_1, a_2, a_3, a_4), \ i = 1, 2, 3, 4; \qquad u = NF_2(b_1, b_2, b_3, b_4); \qquad w = NF_3(a_1, a_2, a_3, a_4), \]
which include 192 corresponding bit equalities. Because the S-box transformation in $NF_1$, $NF_2$ and $NF_3$ is a random map and cannot be represented analytically, we assume that these 192 equations are multivariate quadratic (MQ), which are less complex and therefore weaker than the actual S-box transformation. Together with the 128 output equations, there are then 320 MQ equations in 320 unknowns. Solving our system thus reduces to solving an MQ problem with 320 equations and 320 unknowns. This algebraic problem can be transformed into an NP-hard problem, and the complexity is about $O(2^{640})$, which is higher than $O(2^{128})$ (the complexity of exhaustive key search [4]) and higher than that of the 128-bit AES system [14]. All of the above analysis is based on single-iteration operations. The complexity can be greatly enhanced if two-iteration operations are used.

5. Conclusions
We have designed a hash algorithm, CMLHF, based on a coupled map lattice that incorporates three key ingredients: chaotic map functions, local and global couplings, and a combination of floating-point computations and algebraic bit operations. These ingredients give the algorithm collision resistance even in a single iteration. This characteristic automatically invalidates attacks based on local collisions. Another advantage of the algorithm is its high efficiency. The one-iteration CMLHF should provide adequate security, but we suggest using the two-iteration CMLHF for even stronger protection. The floating-point computation of CMLHF is suitable for software implementation. CMLHF can be implemented in hardware with the inserted CPU technique. Overall, CMLHF systems may serve as completely new and effective hash function generators in practical applications.

Acknowledgments
The authors thank the anonymous reviewers for their valuable suggestions. This work was supported by Grant Nos.
60973109 and 10975015 from the National Natural Science Foundation of China and the National Basic Research Program of China (973 Program) under Grant No. 2007CB814800.

References
[1] M. Amin, O.S. Faragallah, A.A. Abd El-Latif, Chaos-based hash function (CBHF) for cryptographic applications, Chaos, Solitons and Fractals 42 (2009) 767–772.
[2] L.M. Cuomo, A.V. Oppenheim, Circuit implementation of synchronized chaos with applications to communications, Physical Review Letters 71 (1993) 65–68.
[3] G. Chen, Y. Mao, C.K. Chui, A symmetric image encryption scheme based on 3D chaotic cat maps, Chaos, Solitons and Fractals 21 (2003) 749–761.
[4] N. Courtois, A. Klimov, J. Patarin, A. Shamir, Efficient algorithms for solving overdefined systems of multivariate polynomial equations, in: Advances in Cryptology-EUROCRYPT 2000, LNCS 1807 (2000), pp. 392–407.
[5] J. Daemen, V. Rijmen, Note on Naming.
[6] S.J. Deng, Y.T. Li, D. Xiao, Analysis and improvement of a chaos-based Hash function construction, Communications in Nonlinear Science and Numerical Simulation 15 (2010) 1338–1347.
[7] X. Guo, J. Zhang, Secure group key agreement protocol based on chaotic Hash, Information Sciences 180 (2010) 4069–4074.
[8] L. Kocarev, U. Parlitz, General approach for chaotic synchronization with applications to communication, Physical Review Letters 74 (1995) 5028–5031.
[9] H.S. Kwok, K.S. Tang, A chaos-based cryptographic hash function for message authentication, International Journal of Bifurcation and Chaos in Applied Sciences and Engineering 15 (2005) 4043–4050.
[10] H.P. Lu, S.H. Wang, X.W. Li, G.N. Tang, J.Y. Kuang, W.P. Ye, G. Hu, A new spatiotemporally chaotic cryptosystem and its security and performance analyses, Chaos 14 (2004) 617–629.
[11] C. Li, S.H. Wang, A new one-time signature scheme based on improved chaos hash function, Computer Engineering and Applications 43 (35) (2007) 133–136 (in Chinese).
[12] NIST, Secure Hash Standard (SHS).
[13] J. Nakajima, M. Matsui, Performance analysis and parallel implementation of dedicated Hash functions, in: L.R. Knudsen (Ed.), Proceedings of Eurocrypt'02, Lecture Notes in Computer Science, vol. 2332, Springer-Verlag, 2002, pp. 165–180.
[14] H. Nover, Algebraic Cryptanalysis of AES: An Overview.
[15] B. Preneel, Analysis and Design of Cryptographic Hash Functions, Doctoral thesis, 1993.
[16] B. Preneel et al., NESSIE Security Report, 2003.
[17] B. Preneel et al., Final Report of European Project Number IST-1999-12324 Named New European Schemes for Signature, Integrity and Encryption, 2004.
[18] H.J. Ren, Y. Wang, Q. Xie, H.Q. Yang, A novel method for one-way hash function construction based on spatiotemporal chaos, Chaos, Solitons and Fractals 42 (2009) 2014–2022.
[19] D.G. Van Wiggeren, R. Roy, Communication with chaotic lasers, Science 279 (1998) 1198–1200.
[20] S. Wang, G. Hu, Hash function based on chaotic map lattices, Chaos 17 (2007) 023119.
[21] Y. Wang, X. Liao, D. Xiao, K.W. Wong, One-way hash function construction based on 2D coupled map lattices, Information Sciences 178 (2008) 1391–1406.
[22] X. Wang, Y. Yin, H. Yu, Finding Collisions in the Full SHA-1.
[23] X. Wang, X. Lai, D. Feng, H. Chen, X. Yu, Cryptanalysis of the Hash functions MD4 and RIPEMD, in: Advances in Cryptology-EUROCRYPT 2005, LNCS 3494 (2005), pp. 1–18.
[24] H.J. Wu, The Stream Cipher HC-128.
[25] D. Xiao, X. Liao, S. Deng, One-way Hash function construction based on the chaotic map with changeable-parameter, Chaos, Solitons and Fractals 24 (2005) 65–71.
[26] D. Xiao, X. Liao, S. Deng, Using time-stamp to improve the security of a chaotic maps-based key agreement protocol, Information Sciences 178 (2008) 1598–1602.
[27] X. Yi, Hash function based on chaotic tent maps, IEEE Transactions on Circuits and Systems. Part II: Express Briefs 52 (6) (2005) 354–357.
[28] H. Yu, X. Wang, Multi-collision attack on the compression functions of MD4 and 3-Pass HAVAL, Information Security and Cryptology (2007) 206–226.
[29] Q. Zhou, K.W. Wong, X. Liao, T. Xiang, Y. Hu, Parallel image encryption algorithm based on discretized chaotic map, Chaos, Solitons and Fractals 38 (2008) 1081–1092.