Fast algorithms of public key cryptosystem based on Chebyshev polynomials over finite field


The Journal of China Universities of Posts and Telecommunications, April 2011, 18(2): 86–93
www.sciencedirect.com/science/journal/10058885
http://www.jcupt.com

LI Zhi-hui (1, 2), CUI Yi-dong (1), XU Hui-min (2)
1. Key Laboratory of Trustworthy Distributed Computing and Service, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

The computation of a Chebyshev polynomial over a finite field is the dominant operation in a public key cryptosystem based on such polynomials. Two generic algorithms with running time $O(\mathrm{lb}\,n)$ have been presented for this computation: the matrix algorithm and the characteristic polynomial algorithm. Both are feasible but not optimized. In this paper, the procedures of these two algorithms are modified to obtain faster execution. The complexity of the modified algorithms is still $O(\mathrm{lb}\,n)$, but the number of required operations is reduced, so the execution speed is improved. In addition, a new algorithm related to the eigenvalues of the matrix used to represent Chebyshev polynomials is presented, which can further reduce the running time of the computation if certain conditions are satisfied. Software implementations of these algorithms are realized, and a comparison of their running times is given. Finally, an efficient scheme for the computation of Chebyshev polynomials over a finite field is presented.

Keywords Chebyshev polynomial, algorithm, running time, square root

1 Introduction

The application of chaotic maps in cryptography has been studied for more than twenty years [1], but most attempts were made on symmetric key cryptosystems [2–3]. A public key cryptosystem based on chaotic Chebyshev maps was later presented [4], but was soon broken by a trigonometric substitution attack [5]. To resist that kind of attack, the definition of the Chebyshev polynomial was extended from the real field to a finite field, and a modified cryptosystem was presented [6–9], which can be treated as a generalization of the ElGamal algorithm [10]. Refs. [6–7,11] proved that breaking the cryptosystem through trigonometric substitution in a finite field is equivalent to solving the discrete logarithm problem. Although the security of the cryptosystem is equivalent to that of the ElGamal cryptosystem, its efficiency is not satisfactory.

The most frequently performed operation in that cryptosystem is the computation of a Chebyshev polynomial of large degree. For example, in the key pair generation process, the degree represents the private key and the value of the Chebyshev polynomial of that degree is a part of the public key. The same operation is performed in the encryption and decryption processes as well. So the efficiency of this cryptosystem mainly depends on the speed of computation of the Chebyshev polynomial. To perform that computation efficiently, Refs. [12–14] proposed to select a composite integer with small factors as the private key, so that the computation can make use of the semi-group property of Chebyshev maps. However, this is not a generic algorithm, because an integer with large factors cannot be used as the private key. To compute a Chebyshev polynomial of arbitrary degree, a recursive algorithm can also be applied [5,14], but its costs in both time and memory are very large. For the same purpose, a generic algorithm using the matrix representation of the Chebyshev polynomial was proposed [6–8], which requires less memory and less time than the recursive algorithm. In Ref. [6], Fee and Monagan generalized the matrix algorithm by the Cayley-Hamilton theorem, and the characteristic polynomial algorithm was presented. The characteristic polynomial algorithm is faster than the matrix algorithm, because it requires fewer multiplications and squares, but neither of them is optimized.

Received date: 03-08-2010. Corresponding author: LI Zhi-hui, E-mail: [email protected]. DOI: 10.1016/S1005-8885(10)60049-0


In this paper, the original matrix algorithm and characteristic polynomial algorithm are discussed, and modified algorithms are proposed to attain better performance. In addition, a new algorithm is presented, which can further speed up the computation of the Chebyshev polynomial when certain conditions are satisfied. These algorithms are realized in software, and their running times are given. Finally, all the proposed algorithms are compared and an efficient computation scheme for the cryptosystem is recommended.

2 Cryptosystem based on Chebyshev polynomials over finite field

Let $n \in \mathbb{Z}^+$ and $x \in F_p$. The Chebyshev polynomials are recursively defined as

$$T_n(x) = 2x\,T_{n-1}(x) - T_{n-2}(x) \bmod p \qquad (1)$$

with $n \ge 2$, $T_0(x) = 1 \bmod p$ and $T_1(x) = x \bmod p$. Thus, the first Chebyshev polynomials are:

$$T_0(x) = 1 \bmod p$$
$$T_1(x) = x \bmod p$$
$$T_2(x) = 2x^2 - 1 \bmod p$$
$$T_3(x) = 4x^3 - 3x \bmod p$$
$$T_4(x) = 8x^4 - 8x^2 + 1 \bmod p$$

The Chebyshev polynomials have the commutative semi-group property, i.e. for arbitrary $r, s \in \mathbb{Z}^+$ and $x \in F_p$, the equation

$$T_s[T_r(x)] = T_{sr}(x) = T_{rs}(x) = T_r[T_s(x)] \bmod p$$

holds. Utilizing the commutative semi-group property, a public key cryptosystem is constructed. Its process can be described as follows:

1) Key pair generation
a) Randomly select integers $s \in \mathbb{Z}_n^*$ and $x \in F_p^*$ with $s \ne 1$ and $x \ne 1$, and compute $pk = T_s(x)$.
b) Then $s$ is the private key and $(x, pk)$ is the public key.

2) Message encryption
Assume Alice wants to send a message $m \in F_p^*$ to Bob. The encryption process is:
a) Alice randomly selects an integer $r \in \mathbb{Z}_n^*$ with $r \ne 1$.
b) Alice uses Bob's public key to compute
$$k_1 = T_r(x), \qquad k_2 = T_r(pk), \qquad c = m k_2 = m\,T_r(pk) \bmod p$$
c) Alice sends $(c, k_1)$ to Bob as the cipher.

3) Message decryption
After receiving the cipher, Bob can decrypt it as follows:
a) Compute $k_2' = T_s(k_1)$.
b) Compute $m' = c / k_2' \bmod p$.

Since $m' = m$, Bob gets the message from the cipher. Using fast algorithms, Bob can easily compute the public key $(x, pk)$ from the private key $s$, but the computation of the private key $s$ from the public key $(x, pk)$ is practically impossible. In the decryption process, Bob can recover the message from the cipher quickly, using the semi-group property of Chebyshev polynomials. In this way the security of the cryptosystem is guaranteed. The computation of the Chebyshev polynomial is the primary operation in these three processes, and its speed has a significant influence on the efficiency of the cryptosystem.

3 Matrix algorithm and characteristic polynomial algorithm

The Chebyshev polynomial can be written in matrix form [6]:

$$\begin{pmatrix} T_n(x) \\ T_{n+1}(x) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 2x \end{pmatrix}^{n} \begin{pmatrix} T_0(x) \\ T_1(x) \end{pmatrix} = A^n \begin{pmatrix} T_0(x) \\ T_1(x) \end{pmatrix} = A^n \begin{pmatrix} 1 \\ x \end{pmatrix} \bmod p \qquad (2)$$

or

$$\begin{pmatrix} T_{n-1}(x) \\ T_n(x) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 2x \end{pmatrix}^{n-1} \begin{pmatrix} T_0(x) \\ T_1(x) \end{pmatrix} = A^{n-1} \begin{pmatrix} T_0(x) \\ T_1(x) \end{pmatrix} = A^{n-1} \begin{pmatrix} 1 \\ x \end{pmatrix} \bmod p \qquad (3)$$

According to Eq. (3), if $A^{n-1}$ is known, then the value of $T_n(x)$ is easy to compute. In Ref. [6] the authors use a 'square and multiply' method to compute $A^{n-1}$. One can also base the calculation on Eq. (2), computing $A^n$ to get the value of $T_n(x)$. The latter choice simplifies the procedure, while only $[\mathrm{ceil}(\mathrm{lb}\,n) - \mathrm{ceil}(\mathrm{lb}(n-1))]$ additional 'for' loop iterations are executed, where lb denotes the base-2 logarithm. In the present article, for convenience of discussion, the latter procedure is adopted.

Let $n = \sum_{i=0}^{r} b_i 2^i$ be the binary representation of the exponent $n$, use $C$ to represent the intermediate result in the process, and let $c_{ij}$ be the elements of $C$. By the matrix algorithm, $T_n(x)$ is computed as follows:

    if n = 0 then return 1
    else if n > 0
        C ← I, S ← A
        for i = 0 up to r − 1 by 1
            if b_i = 1 then C ← C × S
            S ← S × S
        C ← C × S
        return c11 + x c12
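The matrix algorithm above, and the way it is used in the three processes of Sect. 2, can be sketched as follows. This is only an illustrative Python sketch, not the authors' Java implementation; the function names `mat_mul` and `chebyshev_matrix`, the Mersenne prime $2^{61}-1$, and the toy values of $x$, $s$, $r$ and $m$ are assumptions chosen for the example, and the parameters are far too small to be secure.

```python
def mat_mul(C, S, p):
    """Multiply two 2x2 matrices modulo p (form 'C x S': 8 mults, 4 adds)."""
    (c11, c12), (c21, c22) = C
    (s11, s12), (s21, s22) = S
    return (
        ((c11 * s11 + c12 * s21) % p, (c11 * s12 + c12 * s22) % p),
        ((c21 * s11 + c22 * s21) % p, (c21 * s12 + c22 * s22) % p),
    )


def chebyshev_matrix(n, x, p):
    """T_n(x) mod p by right-to-left square-and-multiply on A = [[0, 1], [-1, 2x]]."""
    if n == 0:
        return 1 % p
    C = ((1, 0), (0, 1))                     # C <- I
    S = ((0, 1), ((-1) % p, (2 * x) % p))    # S <- A
    r = n.bit_length() - 1                   # index of the most significant bit b_r
    for i in range(r):                       # i = 0 .. r-1
        if (n >> i) & 1:                     # b_i = 1
            C = mat_mul(C, S, p)
        S = mat_mul(S, S, p)
    C = mat_mul(C, S, p)                     # b_r = 1, so this step always runs
    return (C[0][0] + x * C[0][1]) % p       # first row of A^n applied to (1, x)^T


# Toy round trip through key generation, encryption and decryption.
# All values below are hypothetical and only for illustration.
p = 2**61 - 1                                # a known Mersenne prime
x, s = 123_456_789, 987_654_321              # public base x, Bob's private key s
pk = chebyshev_matrix(s, x, p)               # Bob's public key is (x, pk)

r_alice, m = 555_555_555, 424_242            # Alice's random integer and message
k1 = chebyshev_matrix(r_alice, x, p)
k2 = chebyshev_matrix(r_alice, pk, p)
c = (m * k2) % p                             # cipher is (c, k1)

k2_bob = chebyshev_matrix(s, k1, p)          # equals k2 by the semi-group property
m_recovered = (c * pow(k2_bob, -1, p)) % p   # modular inverse needs Python 3.8+
```

The decryption step relies on $k_2$ being invertible modulo $p$, which holds for any nonzero $k_2$ when $p$ is prime.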


At each step of the 'for' loop, $C = A^k$, where $k = \sum_{j=0}^{i} b_j 2^j$, and $S = A^{2^{(i+1)}}$. At the end of that loop, $k = \sum_{j=0}^{r-1} b_j 2^j$ and $S = A^{2^r}$. When $n > 1$, $b_r$ must equal 1, so the last operation $C \leftarrow C \times S$ is always executed, and finally $C = A^n$. The algorithm takes $O(\mathrm{lb}\,n)$ matrix multiplications to compute $A^n$.

The Cayley-Hamilton theorem states that a matrix satisfies its own characteristic polynomial, i.e. if the characteristic polynomial of a square matrix $A$ is $f(\lambda)$, then $f(A) = 0$. So instead of powering a matrix, we can just compute $\lambda^n$ modulo this characteristic polynomial using the same 'square and multiply' scheme. Because $f(\lambda)$ is a quadratic polynomial, the $n$th power of $\lambda$ reduces to a linear polynomial. Denote it by $a_1\lambda + a_0$; the $n$th power of $A$ must have the same form, $a_1 A + a_0$, so $T_n(x)$ can be calculated as

$$T_n(x) = A^n \begin{pmatrix} T_0(x) \\ T_1(x) \end{pmatrix} = (a_1 A + a_0) \begin{pmatrix} 1 \\ x \end{pmatrix} = a_1 x + a_0 \bmod p$$

where the first component of the resulting vector is taken. Therefore, the characteristic polynomial algorithm can be described as follows:

    if n = 0 then return 1
    else if n > 0
        a1 λ + a0 ← 1, s1 λ + s0 ← λ
        for i = 0 up to r − 1 by 1
            if b_i = 1 then a1 λ + a0 ← (a1 λ + a0) × (s1 λ + s0)
            s1 λ + s0 ← (s1 λ + s0) × (s1 λ + s0)
        a1 λ + a0 ← (a1 λ + a0) × (s1 λ + s0)
        return a1 x + a0

The complexity of the characteristic polynomial algorithm is $O(\mathrm{lb}\,n)$, the same as that of the matrix algorithm. But the time cost of a linear polynomial multiplication is less than that of a square matrix multiplication, so the characteristic polynomial algorithm is faster than the matrix algorithm. In these two algorithms, a modulo operation is performed after every integer multiplication, square or addition, to restrict the length of intermediate results and improve the efficiency of computation.
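The characteristic polynomial algorithm above can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the names `poly_mul` and `chebyshev_charpoly` are hypothetical, and a linear polynomial $a_1\lambda + a_0$ is stored as the pair `(a1, a0)`. The reduction uses $\lambda^2 = 2x\lambda - 1$, which follows from $f(\lambda) = \lambda^2 - 2x\lambda + 1$.

```python
def poly_mul(a, s, x, p):
    """(a1*L + a0)(s1*L + s0) reduced modulo L^2 - 2x*L + 1 and modulo p."""
    a1, a0 = a
    s1, s0 = s
    hi = a1 * s1                              # coefficient of L^2; L^2 = 2x*L - 1
    return ((2 * x * hi + a1 * s0 + a0 * s1) % p, (a0 * s0 - hi) % p)


def chebyshev_charpoly(n, x, p):
    """T_n(x) mod p via right-to-left square-and-multiply on L^n mod f(L)."""
    if n == 0:
        return 1 % p
    a = (0, 1)                                # a1*L + a0 <- 1
    s = (1, 0)                                # s1*L + s0 <- L
    r = n.bit_length() - 1
    for i in range(r):                        # i = 0 .. r-1
        if (n >> i) & 1:
            a = poly_mul(a, s, x, p)
        s = poly_mul(s, s, x, p)
    a = poly_mul(a, s, x, p)                  # final step for b_r = 1
    return (a[0] * x + a[1]) % p              # T_n(x) = a1*x + a0
```

Each step handles two coefficients instead of four matrix entries, which is where the saving over the matrix algorithm comes from.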

4 Modified matrix algorithm and characteristic polynomial algorithm

4.1 Modified matrix algorithm

In the matrix algorithm mentioned above, the binary digits of exponent $n$ are checked one by one in 'right-to-left' order. In each step of the 'for' loop, a matrix square must be performed, and one matrix multiplication is executed if the conditional expression is true, i.e. the present binary digit is 1. In most of these operations, all elements of the matrices involved are large integers. However, if the inverse check-up order is applied, matrix $A$ itself participates in the calculation, and it has three particular elements: 1, $-1$ and 0. As a result, the required multiplications and additions are reduced, and the calculation speed is improved. Based on this, the modified matrix algorithm is presented, whose procedure can be described as follows:

    C ← I
    for i = r down to 0 by −1
        C ← C × C
        if b_i = 1 then C ← C × A
    return c11 + x c12
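A minimal Python sketch of the modified ('left-to-right') matrix algorithm follows, assuming the matrix $A$ of Eq. (2). The helper names `mat_square`, `mul_by_A` and `chebyshev_matrix_ltr` are hypothetical. The point of the sketch is that the multiplication by $A$ needs only two large multiplications, because the other entries of $A$ are 0, 1 and $-1$.

```python
def mat_square(C, p):
    """C x C for a 2x2 matrix: 3 multiplications, 2 squares, 3 additions."""
    (c11, c12), (c21, c22) = C
    t = (c11 + c22) % p
    cross = c12 * c21                         # shared by both diagonal entries
    return (
        ((c11 * c11 + cross) % p, (c12 * t) % p),
        ((c21 * t) % p, (cross + c22 * c22) % p),
    )


def mul_by_A(C, two_x, p):
    """C x A with A = [[0, 1], [-1, 2x]]: 2 multiplications, 2 additions."""
    (c11, c12), (c21, c22) = C
    return (
        ((-c12) % p, (c11 + two_x * c12) % p),
        ((-c22) % p, (c21 + two_x * c22) % p),
    )


def chebyshev_matrix_ltr(n, x, p):
    """T_n(x) mod p, scanning the bits of n from most to least significant."""
    C = ((1 % p, 0), (0, 1 % p))              # C <- I
    two_x = (2 * x) % p                       # computed once, reused in every step
    for i in range(n.bit_length() - 1, -1, -1):
        C = mat_square(C, p)
        if (n >> i) & 1:
            C = mul_by_A(C, two_x, p)
    return (C[0][0] + x * C[0][1]) % p
```

For $n = 0$ the loop body never runs, so the function returns $T_0(x) = 1$ without a special case.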

The modified algorithm checks the binary digits of $n$ in 'left-to-right' order. In each step of the 'for' loop, $C = A^k$, where $k = \sum_{j=i}^{r} b_j 2^{j-i}$. When the loop ends, $k = \sum_{j=0}^{r} b_j 2^j$ and $C = A^n$. The procedure is also simpler than that of the original matrix algorithm.

To analyze the running times of these two matrix algorithms, suppose the average time cost of one modular multiplication is $t_m$, that of one modular squaring is $t_s$, and that of one modular addition or modular subtraction is $t_a$. We also suppose all matrix multiplications are classified into three forms:

1) $C \times S$, where $C \ne S$ and all elements of these two matrices are large integers. The multiplication is calculated as

$$C \times S = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix} = \begin{pmatrix} c_{11}s_{11} + c_{12}s_{21} & c_{11}s_{12} + c_{12}s_{22} \\ c_{21}s_{11} + c_{22}s_{21} & c_{21}s_{12} + c_{22}s_{22} \end{pmatrix}$$

which needs 8 multiplications and 4 additions. Its time cost is about $8t_m + 4t_a$.

2) $C \times C$, where all elements of $C$ are large integers. The square is calculated as

$$C \times C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} = \begin{pmatrix} c_{11}^2 + c_{12}c_{21} & c_{12}(c_{11} + c_{22}) \\ c_{21}(c_{11} + c_{22}) & c_{21}c_{12} + c_{22}^2 \end{pmatrix}$$

which needs 3 multiplications, 2 squares and 3 additions. Its time cost is about $3t_m + 2t_s + 3t_a$.

3) $C \times A$ or $A \times C$, where all elements of $C$ are large integers, and $A = \begin{pmatrix} 0 & 1 \\ -1 & 2x \end{pmatrix}$ is the square matrix in Eqs. (2) or (3). The multiplication $C \times A$ is calculated as

$$C \times A = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 2x \end{pmatrix} = \begin{pmatrix} -c_{12} & c_{11} + 2x c_{12} \\ -c_{22} & c_{21} + 2x c_{22} \end{pmatrix}$$

Only 2 multiplications and 2 additions are required for the computation, with a time cost of $2t_m + 2t_a$. The calculation of $A \times C$ costs the same time.

In our classification, the numerically negligible $A \times A$ is classified in the third form, even though it needs only one multiplication and one addition. Small integers may appear as matrix elements in intermediate results, but all of these undetermined values are treated as large integers.

As for the original matrix algorithm in Sect. 3, assume the number of 1's in the binary representation of $n$ is $m$. The matrix square operation is performed $r$ times, one of which has the form $A \times A$, and the expression $C \leftarrow C \times S$ is executed $m + 1$ times, with the first multiplication having the operand $I$, whose running time can be neglected. After $A^n$ is worked out, one multiplication and one addition are required for the computation of $T_n(x)$. So the total running time is approximately

$$T_1 = (r-1)(3t_m + 2t_s + 3t_a) + m(8t_m + 4t_a) + (2t_m + 2t_a) + (t_m + t_a)$$
$$\phantom{T_1} = (r-1)(3t_m + 2t_s + 3t_a) + m(8t_m + 4t_a) + 3(t_m + t_a)$$

In the modified matrix algorithm, the matrix square operation is done $r + 1$ times, of which the first is $I \times I$ and the second is $A \times A$. The operation $C \leftarrow C \times A$ inside the conditional expression is executed $m$ times, and one of these is $I \times A$. The total running time is

$$T_2 = (r-1)(3t_m + 2t_s + 3t_a) + m(2t_m + 2t_a) + (t_m + t_a)$$

We can compare the time costs of these two algorithms:

$$T_1 - T_2 = m(6t_m + 2t_a) + 2(t_m + t_a)$$

The modified algorithm is always faster than the original one, even when $m = 1$, owing to its preferable procedure. There is a linear relationship between running time and $m$ in these two algorithms. If $m$ becomes greater, both algorithms take longer to perform the computation. The running time of the original matrix algorithm grows faster with $m$ than that of the modified matrix algorithm, so the improvement of the modified matrix algorithm is more remarkable when $m$ is greater.

4.2 Modified characteristic polynomial algorithm

In the characteristic polynomial algorithm, the 'left-to-right' order of check-up can also be applied to improve the computation speed. The modified algorithm computes $T_n(x)$ as follows:

    s1 λ + s0 ← 1
    for i = r down to 0 by −1
        s1 λ + s0 ← (s1 λ + s0)(s1 λ + s0)
        if b_i = 1 then (s1 λ + s0) ← (s1 λ + s0) λ
    return s1 x + s0

In order to compare the modified algorithm with the original one, there are mainly three kinds of operations that should be considered:

1) $(s_1\lambda + s_0)(a_1\lambda + a_0)$, where $s_1\lambda + s_0 \ne a_1\lambda + a_0$, and $s_1$, $s_0$, $a_1$ and $a_0$ are large integers. The multiplication can be computed as

$$(s_1\lambda + s_0)(a_1\lambda + a_0) = (2x s_1 a_1 + s_1 a_0 + s_0 a_1)\lambda + (s_0 a_0 - s_1 a_1)$$

Its time cost is $5t_m + 3t_a$.

2) $(s_1\lambda + s_0)(s_1\lambda + s_0)$, where $s_1$ and $s_0$ are large integers. The square is calculated as

$$(s_1\lambda + s_0)(s_1\lambda + s_0) = (2x s_1^2 + 2 s_1 s_0)\lambda + (s_0^2 - s_1^2)$$

where $2 s_1 s_0$ can be obtained from $s_1 s_0$ through one left shift operation or one addition operation. If addition is used, the computation requires 2 multiplications, 2 squares and 3 additions, and the time cost is $2t_m + 2t_s + 3t_a$.

3) $\lambda(s_1\lambda + s_0)$, where $s_1$ and $s_0$ are large integers. The multiplication is calculated as

$$\lambda(s_1\lambda + s_0) = (2x s_1 + s_0)\lambda - s_1$$

which requires only one multiplication and one addition, with a time cost of $t_m + t_a$.

Using the same analysis method as for the matrix algorithms, the running time of the original characteristic polynomial algorithm is approximately

$$T_3 = (r-1)(2t_m + 2t_s + 3t_a) + (m-1)(5t_m + 3t_a) + (t_m + t_a) + (t_m + t_a)$$
$$\phantom{T_3} = (r-1)(2t_m + 2t_s + 3t_a) + (m-1)(5t_m + 3t_a) + 2(t_m + t_a)$$

and the running time of the modified characteristic polynomial algorithm is

$$T_4 = (r-1)(2t_m + 2t_s + 3t_a) + (m-1)(t_m + t_a) + (t_m + t_a) = (r-1)(2t_m + 2t_s + 3t_a) + m(t_m + t_a)$$

The difference between $T_3$ and $T_4$ is

$$T_3 - T_4 = (m-1)(4t_m + 2t_a) + (t_m + t_a)$$

The modified characteristic polynomial algorithm always takes less time than its original counterpart. A linear relationship between running time and $m$ also exists in these two algorithms, as in the matrix algorithms. We can also compare all four algorithms, since their basic operations are the same. For example, we can compare the modified matrix algorithm with the original characteristic polynomial algorithm:

$$T_2 - T_3 = (r+3)t_m - m(3t_m + t_a) + 2t_a$$
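The modified characteristic polynomial algorithm and the four running-time estimates above can be sketched in Python as follows. The names `chebyshev_charpoly_ltr` and `running_times` are hypothetical, and the unit costs passed to `running_times` are arbitrary illustrative numbers, not measurements from the paper.

```python
def chebyshev_charpoly_ltr(n, x, p):
    """T_n(x) mod p: left-to-right square-and-multiply on s1*L + s0."""
    two_x = (2 * x) % p
    s1, s0 = 0, 1 % p                         # s1*L + s0 <- 1
    for i in range(n.bit_length() - 1, -1, -1):
        # Square: (2x*s1^2 + 2*s1*s0) L + (s0^2 - s1^2)
        q = s1 * s1 % p
        s1, s0 = (two_x * q + 2 * s1 * s0) % p, (s0 * s0 - q) % p
        if (n >> i) & 1:
            # Multiply by L: (2x*s1 + s0) L - s1
            s1, s0 = (two_x * s1 + s0) % p, (-s1) % p
    return (s1 * x + s0) % p


def running_times(r, m, tm, ts, ta):
    """Estimated costs T1..T4 from Sect. 4 for bit index r and m one-bits."""
    sq_mat = 3 * tm + 2 * ts + 3 * ta         # one full matrix square
    sq_pol = 2 * tm + 2 * ts + 3 * ta         # one linear-polynomial square
    T1 = (r - 1) * sq_mat + m * (8 * tm + 4 * ta) + 3 * (tm + ta)
    T2 = (r - 1) * sq_mat + m * (2 * tm + 2 * ta) + (tm + ta)
    T3 = (r - 1) * sq_pol + (m - 1) * (5 * tm + 3 * ta) + 2 * (tm + ta)
    T4 = (r - 1) * sq_pol + m * (tm + ta)
    return T1, T2, T3, T4


# Illustrative unit costs only: a multiplication 10x and a square 8x an addition.
T1, T2, T3, T4 = running_times(r=1199, m=600, tm=10, ts=8, ta=1)
```

Evaluating `running_times` over a range of `m` reproduces the linear growth in $m$ described above, with the steepest slope for $T_1$ and the shallowest for $T_4$.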


When $m > \left[\dfrac{(r+3)t_m + 2t_a}{3t_m + t_a}\right]$, we have $T_2 < T_3$, and the modified matrix algorithm costs less time than the original characteristic polynomial algorithm. Otherwise its running time is longer than that of the latter. Hence there is a threshold with respect to $m$ between these two algorithms. Denote this threshold by $m_{th}$, whose value is

$$m_{th} = \frac{(r+3)t_m + 2t_a}{3t_m + t_a}$$

If $t_a$ is far less than $t_m$, and $r$ is far greater than 3, then $m_{th} \approx r/3$.

5 Eigenvalue algorithm

In this section, a fast algorithm related to the eigenvalues of matrix $A$ is presented, which can run faster than the modified characteristic polynomial algorithm in some circumstances. The eigenvalue algorithm is based on the following proposition:

Proposition 1 The Chebyshev polynomial can be written as

$$T_n(x) = \frac{\left(x + \sqrt{x^2-1}\right)^n + \left(x - \sqrt{x^2-1}\right)^n}{2} \bmod p \qquad (4)$$

Proof Firstly, the Chebyshev polynomial can be represented as

$$\begin{pmatrix} T_n(x) \\ T_{n+1}(x) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 2x \end{pmatrix}^{n} \begin{pmatrix} T_0(x) \\ T_1(x) \end{pmatrix} = A^n \begin{pmatrix} 1 \\ x \end{pmatrix} \bmod p$$

where $A = \begin{pmatrix} 0 & 1 \\ -1 & 2x \end{pmatrix}$ is a nonsingular square matrix. The characteristic polynomial of $A$ is

$$f(\lambda) = \lambda^2 - 2x\lambda + 1$$

The eigenvalues of matrix $A$ can be obtained from the above equation:

$$\lambda_1 = x + \sqrt{x^2-1}, \qquad \lambda_2 = x - \sqrt{x^2-1}$$

The eigenvectors of $A$ corresponding to $\lambda_1$ and $\lambda_2$ are

$$\alpha_1 = \begin{pmatrix} 1 \\ \lambda_1 \end{pmatrix}, \qquad \alpha_2 = \begin{pmatrix} 1 \\ \lambda_2 \end{pmatrix}$$

Then the $n$th power of $A$ can be computed as follows:

$$A^n = (\alpha_1\ \alpha_2) \begin{pmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{pmatrix} (\alpha_1\ \alpha_2)^{-1} = \frac{1}{\lambda_2 - \lambda_1} \begin{pmatrix} \lambda_1^n\lambda_2 - \lambda_2^n\lambda_1 & \lambda_2^n - \lambda_1^n \\ \lambda_1^{n+1}\lambda_2 - \lambda_2^{n+1}\lambda_1 & \lambda_2^{n+1} - \lambda_1^{n+1} \end{pmatrix}$$

Apply it in Eq. (2), and notice that $\dfrac{\lambda_1 + \lambda_2}{2} = x \bmod p$; then the value of $T_n(x)$ with respect to $\lambda_1$ and $\lambda_2$ is

$$T_n(x) = \frac{\lambda_1^n + \lambda_2^n}{2} = \frac{\left(x + \sqrt{x^2-1}\right)^n + \left(x - \sqrt{x^2-1}\right)^n}{2} \bmod p.$$

When $x + \sqrt{x^2-1}$ and $x - \sqrt{x^2-1}$ are in the finite field $F_p$, the value of the Chebyshev polynomial can be calculated by the expression

$$T_n(x) = \frac{\lambda_1^n + \lambda_2^n}{2} \qquad (5)$$

where $\lambda_1$ and $\lambda_2$ are the eigenvalues of matrix $A$.

If Eq. (5) is to be used, two conditions must be satisfied. First, the square roots of $x^2 - 1$ must exist in $F_p$; in other words, $x^2 - 1$ must be a quadratic residue in $F_p$. This can be verified by the Legendre symbol [15]

$$L = \left(\frac{x^2 - 1}{p}\right) = (x^2 - 1)^{\frac{p-1}{2}} \bmod p \qquad (6)$$

When $L = 1$ or $L = 0$, the square roots of $x^2 - 1$ are in $F_p$.

Second, the time cost of computing $\sqrt{x^2-1}$ must not be too large, or else the entire time for computation of the Chebyshev polynomial is still intolerable. This can be achieved by selecting an appropriate parameter $p$. From Refs. [15] and [16] we know that if $p \equiv 3 \bmod 4$ or $p \equiv 5 \bmod 8$, the computation of the square roots of a quadratic residue in a prime field requires only a few steps of basic operations, so the second condition is satisfied. If $p$ is neither $3 \bmod 4$ nor $5 \bmod 8$, the running time for computation of $\sqrt{x^2-1}$ is undetermined, so the speed of the algorithm cannot be guaranteed.

Suppose the computation of $\lambda_1$ and $\lambda_2$ is executed whenever the eigenvalue algorithm is applied, and denote its time cost by $t_e$. The factor $1/2$ in Eq. (5) is substituted by $(p+1)/2$, and the time cost for its computation is negligible. After $\lambda_1$ and $\lambda_2$ are known, two modular exponentiations, one modular addition and one modular multiplication are needed for the computation of the Chebyshev polynomial. If the computation of $\lambda_1^n$ and $\lambda_2^n$ uses the 'square and multiply' method in 'left-to-right' order, the total running time for that algorithm is

$$T_5 = t_e + 2(r + m - 1)t_m + (t_m + t_a)$$

We can compare that running time with that of the modified characteristic polynomial algorithm:

$$T_4 - T_5 = (-m - 1)t_m + (2r - 2)t_s + (3r + m - 4)t_a - t_e$$
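The eigenvalue algorithm of Eq. (5), together with the Legendre-symbol check of Eq. (6), can be sketched in Python as follows. This is a minimal sketch under a stated restriction: it handles only the case $p \equiv 3 \bmod 4$, where a square root of a quadratic residue $d$ is $d^{(p+1)/4} \bmod p$, and it returns `None` otherwise so that a caller can fall back to a generic algorithm. The function name `chebyshev_eigen` is hypothetical, and Python's built-in `pow` stands in for the Java `modPow()` used in the paper.

```python
def chebyshev_eigen(n, x, p):
    """T_n(x) mod p via Eq. (5); None when this sketch's conditions do not hold."""
    if p % 4 != 3:
        return None                           # sketch covers only p = 3 mod 4
    d = (x * x - 1) % p
    if d != 0 and pow(d, (p - 1) // 2, p) != 1:
        return None                           # Legendre symbol is -1: no root in F_p
    root = pow(d, (p + 1) // 4, p)            # sqrt(x^2 - 1) for p = 3 mod 4
    lam1 = (x + root) % p                     # eigenvalues of A
    lam2 = (x - root) % p
    half = (p + 1) // 2                       # inverse of 2 modulo p
    return (pow(lam1, n, p) + pow(lam2, n, p)) * half % p
```

When the same $x$ is reused for many exponents, `lam1` and `lam2` can be cached, which corresponds to spreading the cost $t_e$ over $N$ computations.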

When $(-m - 1)t_m + (2r - 2)t_s + (3r + m - 4)t_a > t_e$, $T_4 > T_5$, and the eigenvalue algorithm is faster than the modified characteristic polynomial algorithm. If the computation is done $N$ times with the same parameter $x$, and the average running times of these two algorithms with respect to $m$ are compared, then the parameter $m$ in $T_5$ should be substituted by $(r+1)/2$ ($m$ starts from 1 when $r > 1$), and $t_e$ by $t_e/N$. Then $T_5$ is changed to

$$T_5' = \frac{t_e}{N} + (3r - 1)t_m + (t_m + t_a) = \frac{t_e}{N} + 3r t_m + t_a$$

For the same reason $T_4$ will be

$$T_4' = (r-1)(2t_m + 2t_s + 3t_a) + \left(\frac{r+1}{2}\right)(t_m + t_a) = \frac{5r-3}{2}t_m + (2r-2)t_s + \frac{7r-5}{2}t_a$$

and the difference between them is

$$T_4' - T_5' = -\frac{(r+3)}{2}t_m + (2r-2)t_s + \left(\frac{7r-7}{2}\right)t_a - \frac{t_e}{N}$$

If $T_4' > T_5'$, then $-[(r+3)/2]t_m + (2r-2)t_s + [(7r-7)/2]t_a > t_e/N$; when $-t_m + 4t_s + 7t_a > 0$ (which is the usual case), this gives

$$r > \frac{2t_e/N + 3t_m + 4t_s + 7t_a}{-t_m + 4t_s + 7t_a}$$

It means that the eigenvalue algorithm is faster than the modified characteristic polynomial algorithm in average execution speed only if $r$ is beyond a threshold.

There are many algorithms for fast modular exponentiation. By using one or more of them, the eigenvalue algorithm can be further improved in speed, and the threshold required to make it faster than the modified characteristic polynomial algorithm will come down. In our implementation, the 'modPow()' function from the Java BigInteger class library is used to do the modular exponentiation, in which the Montgomery modular reduction algorithm and a window exponentiation algorithm are applied. Analysis of the required number of multiplications for it is difficult, according to Ref. [15], but its performance can be seen in the experimental results presented in Sect. 6.

6 Implementations

We implement the four generic algorithms and the eigenvalue algorithm on a computer with a 2.5 GHz Pentium CPU, using the Java 'BigInteger' class library. A strong prime $p$ with a length of 1 315 bits satisfying $p \equiv 5 \bmod 8$ is generated using Gordon's algorithm [15]. The value of $x$ is chosen closest to $p/2$, the statistical average of all elements in $F_p$, such that $x^2 - 1$ is a quadratic residue in the field. According to our analysis in Sect. 4, the running times of the matrix algorithms and characteristic polynomial algorithms increase linearly with $m$, so we fix the length of $n$ at 1 200 bits, and change $m$ from 1 to 1 200. When $m = 1$, $n = 2^{1\,199}$, with only the most significant binary digit set; when $m = 1\,200$, $n = 2^{1\,200} - 1$, and all binary digits of $n$ are set. When $m$ is between 1 and 1 200, $m - 1$ binary digits are selected randomly and set, along with the most significant binary digit. The value of the Chebyshev polynomial with different $n$ is computed by the different algorithms, and their average running times are recorded. The eigenvalue algorithm is tested in the same way, although the relationship between its running time and $n$ is more complex. We also plot the running time for computation of $x^n \bmod p$, to compare the eigenvalue algorithm with the modular exponentiation algorithm adopted in the Java 'BigInteger' class library. The relationship between running time and $m$ for the different algorithms is shown in Fig. 1. The following abbreviations are used to denote the different algorithms in the figure (and in Fig. 2):

MEA: modular exponentiation algorithm.
EA: eigenvalue algorithm.
OMA: original matrix algorithm.
OCPA: original characteristic polynomial algorithm.
MMA: modified matrix algorithm.
MCPA: modified characteristic polynomial algorithm.
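The construction of test exponents described above (fixed bit length, most significant bit set, and $m - 1$ further bits chosen at random) can be sketched as follows. The function name `exponent_with_weight` and its parameters are assumptions for illustration; the paper does not give its generator.

```python
import random


def exponent_with_weight(bits, m, rng=random):
    """Return an integer of exactly `bits` bits with exactly `m` bits set.

    The most significant bit is always set; the other m - 1 one-bits are
    drawn uniformly without replacement from the remaining bits - 1 positions.
    """
    if not 1 <= m <= bits:
        raise ValueError("m must satisfy 1 <= m <= bits")
    n = 1 << (bits - 1)                       # most significant binary digit
    for pos in rng.sample(range(bits - 1), m - 1):
        n |= 1 << pos
    return n


# Example: one 1200-bit exponent for each of a few values of m.
samples = {m: exponent_with_weight(1200, m) for m in (1, 400, 1200)}
```

With `bits=1200`, the cases `m=1` and `m=1200` give $2^{1199}$ and $2^{1200} - 1$, matching the two extremes stated in the text.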

Fig. 1 Running times of different algorithms

From Fig. 1, an approximately linear relationship between running time and $m$ can be seen for the matrix algorithms and the characteristic polynomial algorithms, with the running time increasing as $m$ increases. The running time curve of the original matrix algorithm has the greatest slope, followed by the original characteristic polynomial algorithm, the modified matrix algorithm, and the modified characteristic polynomial algorithm. When $m = 1$, the running time of the modified matrix algorithm is hardly different from that of the original matrix algorithm. As $m$ increases, the former grows more slowly than the latter, so the difference between them becomes large. This means the average running time of the modified matrix algorithm is less than that of the original one; in other words, the modified matrix algorithm is faster. Similarly, the modified characteristic polynomial algorithm is faster on average than its original counterpart. Among these four generic algorithms, the modified characteristic polynomial algorithm is the fastest, while the original matrix algorithm is the slowest.

The time curves of the modified matrix algorithm and the original characteristic polynomial algorithm intersect at a point where $m$ is about 400. The original characteristic polynomial algorithm is faster than the modified matrix algorithm before this point, and slower after it. As for the comparison of their average speed, suppose the value of $m$ at their intersection point is $m_{th}$, as mentioned before; then the modified matrix algorithm is the better one only when $r > 2(m_{th} - 1)$. Because $m_{th}$ depends on $r$, the speed comparison of these two algorithms also depends on it.

Unlike the generic algorithms, the running time of the eigenvalue algorithm is seemingly invariant when $m$ is changed. Its performance mainly depends on the speed of the modular exponentiation operation, which is influenced by factors other than $m$ when the Montgomery modular reduction algorithm and the sliding-window algorithm are adopted. In Fig. 1 the running time curve marked with squares denotes just the time spent on computation of the Chebyshev polynomial using Eq. (5), assuming the values of $\lambda_1$ and $\lambda_2$ are already known. The running time is about 60 ms, twice as long as the average time of one modular exponentiation operation. The time for computation of $\lambda_1$ and $\lambda_2$ is about 100.2 ms in the present simulation. If these two parameters are calculated for every execution of the eigenvalue algorithm, the total time cost is depicted with the dotted line in the figure, which is still less than that of the best generic algorithm all the time. If the computation of $T_n(x)$ is performed repeatedly with different $n$ and the same $x$, the average time for computation of $\lambda_1$ and $\lambda_2$ comes down. It means that the eigenvalue algorithm can be faster than the other generic algorithms in certain circumstances.

Because of the cost of computing $\lambda_1$ and $\lambda_2$, the eigenvalue algorithm is faster than one of these four generic algorithms only if $r$ is beyond a threshold value, which depends on the algorithm adopted for modular exponentiation, the time costs of modular multiplication and modular addition, the time cost of computing the square roots of $x^2 - 1$, etc. Suppose $\lambda_1$ and $\lambda_2$ are calculated for every execution of the eigenvalue algorithm, and this algorithm is used only when it is faster than the modified characteristic polynomial algorithm all the time. This condition entails that the former is faster than the latter when $m = 1$. To determine that threshold of $r$, we vary $r$ from 1 to 1 200 with $m$ fixed at 1, then compute $T_n(x)$ using these two algorithms. The average running times are shown in Fig. 2.

Fig. 2 Running time comparison of MCPA and EA

From Fig. 2 it can be seen that the eigenvalue algorithm costs more time than the modified characteristic polynomial algorithm when $r$ is small, owing to the cost of computing $\lambda_1$ and $\lambda_2$. When $r$ increases, both running times increase but that of the latter grows faster, so the two running time curves intersect at a point where $r$ is about 850 bits. If $r$ is greater than that value, the eigenvalue algorithm is faster than the other in the case of $m = 1$, and therefore it must be the faster one all the time.

When $p \equiv 3 \bmod 4$ is adopted, fewer modular exponentiations are executed for computation of $\lambda_1$ and $\lambda_2$, so the speed of the eigenvalue algorithm can be further improved to a small extent. The simulation result for $p \equiv 3 \bmod 4$ is similar to Figs. 1 and 2, and is not shown in a figure.

7 Hybrid scheme in application

Theoretically there are mainly four kinds of algorithms to compute the value of a Chebyshev polynomial of arbitrary degree: the iteration algorithm, the recursive algorithm, the matrix algorithms, and the characteristic polynomial algorithms. When the degree of the Chebyshev polynomial is small, the simplest way to compute its value is by the iteration algorithm. But as the degree used in the cryptosystem is large, it cannot be used in practice. The average running time of the recursive algorithm is less than that of the iteration algorithm, but its costs in both time and space still grow exponentially with the length of the degree. Hence the recursive algorithm is also impracticable. By contrast, the matrix algorithms and characteristic polynomial algorithms are practically feasible, with time complexity of $O(\mathrm{lb}\,n)$ and space complexity of $O(1)$. According to our discussion in previous sections, the modified characteristic polynomial algorithm is the fastest of them all the time, so it is preferable as the primary generic algorithm in the cryptosystem.

There are also some algorithms that can be faster than the best generic algorithm, but whose application is restricted to a few particular cases, such as the eigenvalue algorithm and the semi-group algorithm. The eigenvalue algorithm is faster than the characteristic polynomial algorithm only when certain values of $x$ and $p$ are used. As the validation of these conditions requires only a few steps of computation, it can be used together with the modified characteristic polynomial algorithm to speed up the execution of the cryptosystem. The semi-group algorithm can be faster than the characteristic polynomial algorithm and even the eigenvalue algorithm in some cases, for example, when all factors of the private key are less than 5. If the private key is selected randomly, the validation of such a condition results in a factorization problem. Otherwise, if the private key is selected only from integers having no large prime factor, the space of private keys is reduced, so there is a risk that the private key is recovered by an exhaustive attack. Hence its usage in the cryptosystem is not recommended.

From the viewpoints of security and efficiency, we recommend using the modified characteristic polynomial algorithm as the generic algorithm, accompanied by the eigenvalue algorithm as a special-purpose algorithm, to perform the computation of the Chebyshev polynomial. When the conditions for the eigenvalue algorithm are satisfied, it is used; in other circumstances the characteristic polynomial algorithm is applied.

8 Conclusions

In this paper, two fast algorithms for computation of the Chebyshev polynomial have been modified to improve the efficiency of a public key cryptosystem. Through adoption of different procedures, the number of required operations in the modified algorithms is reduced, and the running time is decreased. A new algorithm related to the eigenvalues of the matrix used in the representation of the Chebyshev polynomial has also been presented. It converts the computation of the Chebyshev polynomial to modular exponentiation operations. By techniques for fast modular exponentiation, the speed of the eigenvalue algorithm is further improved. When parameter $p$ is neither $3 \bmod 4$ nor $5 \bmod 8$, the modified characteristic polynomial algorithm is recommended. If $p \equiv 3 \bmod 4$ or $p \equiv 5 \bmod 8$ is satisfied, and $x^2 - 1$ is a quadratic residue in $F_p$, the eigenvalue algorithm may be the faster one. A hybrid scheme including the two of them can be used in practice to get the best performance.

Acknowledgements

This work was supported by the National Basic Research Program of China (2009CB320505) and the National Natural Science Foundation of China (61002011).

References

1. Kocarev L. Chaos-based cryptography: A brief overview. IEEE Circuits and Systems Magazine, 2001, 1(3): 6–21
2. Pareek N K, Patidar V, Sud K K. A random bit generator using chaotic maps. International Journal of Network Security, 2010, 10(1): 32–38
3. He B, Luo L Y, Xiao D. A method for generating S-box based on iterating chaotic maps. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2010, 22(1): 89–93 (in Chinese)
4. Kocarev L, Tasev Z. Public-key encryption based on Chebyshev maps. Proceedings of the International Symposium on Circuits and Systems (ISCAS'03), Vol 3, May 25–28, 2003, Bangkok, Thailand. Los Alamitos, CA, USA: IEEE Computer Society, 2003: 28–31
5. Bergamo P, D'Arco P, Santis A D, et al. Security of public-key cryptosystems based on Chebyshev polynomials. IEEE Transactions on Circuits and Systems I: Regular Papers, 2005, 52(7): 1382–1393
6. Fee G J, Monagan M B. Cryptography using Chebyshev polynomials. Proceedings of the Maple Summer Workshop (MSW'04), Jul 11–14, 2004, Burnaby, Canada. 2004: 15p
7. Kocarev L, Tasev Z, Amato P, et al. Encryption process employing chaotic maps and digital signature process. United States Patent 6892940. 2005
8. Kocarev L, Makraduli J, Amato P. Public-key encryption based on Chebyshev polynomials. Circuits, Systems, and Signal Processing, 2005, 24(5): 497–571
9. Ning H Z, Liu Y, He D Q. Public key encryption algorithm based on Chebyshev polynomials over finite fields. Proceedings of the 8th International Conference on Signal Processing (ICSP'06): Vol 4, Nov 16–20, 2006, Beijing, China. Piscataway, NJ, USA: IEEE, 2006: 4p
10. Elgamal T. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Transactions on Information Theory, 1985, 31(4): 469–472
11. Lima J B, Campello de Souza R M, Panario D. Security of public-key cryptosystems based on Chebyshev polynomials over prime finite fields. Proceedings of the IEEE International Symposium on Information Theory (ISIT'08), Jul 6–11, 2008, Toronto, Canada. Piscataway, NJ, USA: IEEE, 2008: 1843–1847
12. Wang D H, Yang H Z, Yu F S, et al. A new key exchange scheme based on Chebyshev polynomials. Proceedings of the Congress on Image and Signal Processing (CISP'08), May 27–30, 2008, Sanya, China. Piscataway, NJ, USA: IEEE, 2008: 124–127
13. Wang D H, Hu Z G, Tong Z J, et al. An identity authentication system based on Chebyshev polynomials. Proceedings of the 1st International Conference on Information Science and Engineering (ICISE'09), Dec 26–28, 2009, Nanjing, China. Piscataway, NJ, USA: IEEE, 2010: 1648–1650
14. Wang X Y, Zhao J F. An improved key agreement protocol based on chaos. Communications in Nonlinear Science and Numerical Simulation, 2010, 15(12): 4052–4057
15. Menezes A J, Van Oorschot P C, Vanstone S A. Handbook of applied cryptography. New York, NY, USA: CRC Press, 1997
16. Muller S. On the computation of square roots in finite fields. Designs, Codes and Cryptography, 2004, 31(3): 301–312

(Editor: ZHANG Ying)