Computers & Securin/, 9 (1990) 257-262
Cryptanalysis of a Fast Cryptographic Checksum Algorithm Bart Preneel”, Antoon Bosselaers, Renk Govaerts and Joos Vandewalle Katirolieke Universiteit Lpuven. Laboratorium ESAT, Kardinaal Mercierlaan 94. B-3030 Heverlee, Be/@nr
A critical analysis of the modified cryptographic checksum algorithm published by F. Cohen and Y. J. Huang points out two major weaknesses in the scheme. The first weakness is a reduced dependency on the beginning of the plaintext. Secondly, the first bits of the key can be derived with an adaptive chosen text attack. With the aid of these bits, the plaintext can be manipulated with a negligible chance of detection. Keywords: Information protection, Integrity, Cryptographic checksum, Message authentication
Cryptography, code.
1. introduction
R
ecently emerging threats like computer viruses, Trojan horses and worms have focused attention on the protection of the integrity of data and programs stored in computers. Also, in communication networks, data integrity together with secrecy is an important issue. This holds especially for banking networks or in networks for exchange of strategic information. Three major methods can be distinguished
‘NFWO aspirant navorser, Foundation of Belgium.
0167-4048/90/$3.50
sponsored
by the National
for the
Science
protection of data integrity [7, 81. A first approach consists of compressing the plaintext under the control of a secret key. The compression function and often the result arc called a message authentication code (MAC). A second approach consists of compressing the information with a non-keyed hash function or manipulation detection code (MDC). In this case the integrity (not the secrecy) of the MDC must be protected. An alternative is encrypting the MDC with a secret key, resulting in a MAC scheme. Finally, a digital signature scheme can offer a very nice solution to the problem, especially when it is combined with an MDC to increase the efficiency. The three approaches result in a different key management and require a different level of protection of the checksum or signature. Together with the efficiency of implementations these factors will dcterminc the most economic solution. It is impossible to compare these schemes within the scope of this article. The algorithm proposed in ref. [2] and modified in ref. [6] is of the MAC type. The requirements that must bc satisfied by a MAC are [9]
0 1990, Elsevier Science Publishers Ltd.
257
B. Preneel et al. /Fast Cryptographic Checksum Algorithm
(1) The description of the MAC(. ,.) must be publicly known and the only secret information lies in the key (extension of Kerckhoff s principle).
is based on modular exponentiation (RSA, Ill]). The factorization of the modulus is discarded so that the built-in trapdoor cannot be opened.
(2) The data X can be of arbitrary MAC has a fixed length m.
The plaintcxt consists of N blocks B,, through B, _ , where I?,, is the file name or the first plaintext block The logical EXOR operation is denoted with 0.
(3) Given MAC(K,X)
X and K, the must be “easy”.
length
computation
and the
of
Algorithm I-A (4) Given
X, it is computationally infeasible to determine MAC(K,X) with a probability of success significantly higher than l/2”‘. Even when a large set of pairs {X,,MAC(K,X,)} arc known, it is computationally infeasible for an opponent to determint the key K or to compute MAC(K,X’) for any X'ZX, with a probability of success significantly higher than l/2”‘. (5) The MAC must bc collision free: this means that it is computationally infeasible to find two distinct mcssagcs which hash to the same result without knowing the secret key K. However, a good MAC is only a first step in obtaining integrity protection. A secure system should also provide a serial number, time information, a specification of the number of blocks and provide a padding proccdurc to deal with plaintexts of which the length is not an integer multiple of the blocklength. After a description of the MAC scheme, WC will point out two important wcakncsses in the scheme and it will be shown how they can bc cxploitcd to construct different plaintcxts with the same checksum. 2. The Scheme
of F. Cohen and Y. J. Huang
The design of a fast and secure MAC or MDC based on blockciphers can certainly not be considcrcd as a trivial task. One only has to look at the long list of proposals coming from rcputcd cryptographers that wcrc broken 17, 91. The scheme proposed in ref. [2] and modified in rcf [6]
258
Revised Checksum Algorithm
Select an RSA key K= (I&M) a user-specified seed S. Initialization
Set Q. = RSA(B,
Main loop
Fori=l toN-1: Qi = RSA[l + {(B, 63 S) mod (Q,- , -
Result
and
0 S, K)
l>L K]
The result of the checksum
is
QN- 1 Remark that the length of the result of the checksum m is equal to the number of bits in the binary representation of M. As indicated in ref. [2], there is no reason to keep K sccrct, so the actual key of the cryptographic checksum algorithm is the seed S. The use of the RSA algorithm implies that certain parts of the algorithm are cryptographically strong. However, the computation of modular exponcntiations is very slow, cvcn with the fastest dedicated hardware available [5] (17-20 kbit s- ’ for a modulus and an exponent of 5 12 bits). The performance can be improved by means of a preliminary compression of the plaintext. This can bc looked at as a tradeoff bctwccn two extremes: in the first case, thcrc is no compression which results in a secure but slow compression algorithm; on the other hand, the message could be compressed with an MDC and then the result could be encrypted with the RSA algorithm. In the last case, the RSA does not improve directly the cryptographical strength of the MDC. Our attack is worked out under the assumption that there is no preliminary compression such that the schcmc takes maximal profit of
Computers and Security, Vol. 9, No. 3
the strength of the RSA. Because the compression C(B) was not specified in rcfs. 12, 61 we are obliged to omit it. To take advantage of the properties of the RSA algorithm, WC propose to compress the message with an cfficicnt MDC schcmc and to sign the result with the secret key. This requires only one modular exponentiation. For the selection of an MDC, rcccnt evolutions in cryptanalysis must be taken into account (e.g. the scheme of Jueneman [7] is broken by Coppersmith), but new proposals stem to bc very promising [3,9]. A last remark concerns the length of the result. We believe that there is no reason to keep the full 5 12 bits of the result. Without decreasing the security level significantly, the result can be reduced to 128 bits by repeatedly EXORing lower and higher parts.
3. Weakness
of the Modulo
Reduction
The coupling of the blocks by a modulo reduction results in specific weaknesses like non-uniform distributions of the intermediate variables. When x and y arc independent random integers between 0 and M - 1, the probability that x mod y equals I’ is given by
P{(, mod v) = i) = $
whcrc lz] denotes the largest integer smaller than or equal to .z and .x mod 0 is defined as x^mod M = ,I’. For i = M - 1 this probability equals l/M’ instead of 1/M for a uniform distribution. For i = 0 the sum can be worked out to [4]
ln(M-1)+2y+O
where
y = 0.5772.. . is Euler’s constant.
Because S is a uniform random variable, the same holds for Bj 0 S. The good randomness properties of the RSA cause the result Q; of the RSA operation also to bc random. As a consequence
P(B,OS
cast,
B, is independent
of all previous
Under the assumption that the plaintext blocks are independent, uniformly distributed random variables, one can easily prove the following theorem. Theorem 1 If the first t bits of Q,_ h (2 G k G I) are e ual to 1, the probability that Q,- 1 is independent o P Q,_ k - and thus of the data blocks B, to B,_ b - is approximately equal to 1 - 1/2k+‘-‘. This opens the door to tamper with messages by changing individual blocks and combining blocks of different plaintexts in one new plaintext with a low probability of detection. There is especially a very small dependcncc of the checksum result on the first plaintext blocks. One can wonder how an attacker can observe the intermediate result Q,_k, but this is very easy when he can observe the checksum for a shorter version of the plaintext. However, the error probability of an attack could be significantly lowered when he would know S or at least the first bits of S. In the following paragraph WC will show how the first s bits of S can be derived by means of an adaptive chosen plaintext attack.
4. How to Derive the First s Bits of S WC assume that an attacker can select a large number of plaintcxts and obtain the corresponding checksum pairs without knowing the secret key. This is possible when he is able to enter data or programs in the directory of the attacked user and can observe the checksums or if he is able to use
259
B. Preneel et aLlFast Cryptographic Checksum Algorithm
the checksum program without having access to the sccrct key S. The term “adaptive chosen plaintext attack” implies that the sclcction of messages can depend on the outcome of previous observations. One can argue that an attacker who can obtain plaintext and corresponding checksum pairs, is more likely to use them to modify directly a number of plaintexts and the corresponding checksums. However, this may not always be the cast: suppose an attacker has access during a limited time and he wants to carry out manipulations on a later date. The idea behind our attack is the following: the in a very attacker looks for a plaintext rcsultin E e adds the large checksum QN_ ,. Subsequently, blocks B, and B,, , to the plaintext and observes the corresponding checksums QN and QN+ , Q,=RSA[l+{(B,OS)mod(Q,_,-l)},K] 0 S)mod(Qlv-, Q N+I =RSA[l+@%,+, With a probability Q,,,_ ,/M, f can be simplified to Q,=RSA[l
- l)}, K]
the equation
for QN
+(B,OS),K]
To compute the first bit of the key, the attacker selects blocks B, until Q,,,. has the following structure
Qdm)= 1 Q,(i)=O,
i=m-l,m-2
,...,
m-k+l.
Here the bits of a number A are denoted with [A(m), A(m1) . . . . A(l)] and A(m) is the most significant bit. This will require on the average 2” plaintext-checksum pairs. Then we choose B,,, equal to B, and compare QN and QN+ ,. If they are equal, we decide that S(m) = Bdm) and if they are different, we decide that s(m) = BN(m). The key observation of the attack is that if s(m) and B,(m) are equal, B,@ S will always be smaller
than QN. On the other hand, if S(m) and B,(m) are different, B,@ S will be greater than Qy with a probability close to 1 - l/2”. As a consequence it is easily proven that only in the last case (an event with probability l/2), this algorithm will result in a wrong decision with a probability 1/2L. In ref. [lo], a more detailed proof was given by the authors and the following more general results arc obtained: the first s bits of the key can be dctcrmined with a similar attack, that will require 0(2’+‘) plaintext-checksum pairs. The probability that all s bits arc correct is
( 1 1-L’
2k+’
A further extension is to make use of short plaintexts consisting of 1 and 2 blocks. The condition that QN-, must be large is no longer necessary, but the asymmetry between the initialization and the main loop (in the first case thcrc is no addition of 1) renders the attack slightly more complicated. Owing to the exponential increase of the number of plaintext-checksum pairs, it is computationally infeasible to compute all the key bits. This implies that adding blocks is not possible. Nevertheless, the most significant bits arc sufficient to carry out manipulations with a very small probability of detection. First, an attacker can determine the first bits of Bj@ S, and he will be able to tell whether BiO S < Qi_ , - 1 with an error probability of l/2’. Because he knows the first s bits of S, he can force the first s bits of Bj 0 S to zero, which implies that all previous blocks B,, through Bj_ , can be changed that the arbitrarily. In that case, the probability checksum will change is smaller than l/2’. 5. Summary Owing to the large number of modular exponcntiations, the modified version of the cryptographic checksum algorithm proposed by F. Cohen and Y. J. Huang is not very efficient. The result of the
Computers and Security, Vol. 9, No. 3
checksum is insensitive to changes in the initial part of the plaintext which allows several manipulations. Moreover, an attacker can compute the first bits of the key using an adaptive chosen text attack Knowledge of these bits reduces significantly the chances of detecting a fraud.
References [l] E. F. Brickell, A survey of hardware implementations of KSA, Advances in Cryptology, hoc. Crypt0 ‘89, Springer, Berlin. [2] F. Cohen, A cryptographic checksum for integrity protection, Comput. Secur., 6 (1987)505-510. [3] 1. B. Damgird, Design principles for hash functions, Advances in Cyptolog, Proc. Cqyto ‘89, Springer, Berlin. [4] G. H. Hardy and E. M. Wright, An introduction to the theory of numbers. 5th edition, Oxford University Press, Oxford, 1979. [5] F. Hoornaert, M. Decroos, J. Vandewalle and K. Govaerts, Fast RSA hardware: dream or reality?, Advances in Cyq~olOQ, Proc. Eurocrypt ‘88, Springer, Berlin, pp. 257-264. [6] Y. J. Huang and F. Cohen, Some weak points of one fast cryptographic checksum algorithm and its improvement, Compuf. Secur., 7 (1988) 503-505. [7] K. K. Jeuneman, A high speed manipulation detection code, Advances in Cryptology, Proc. Cypto ‘86, Springer, Berlin, pp. 327-347, [8] C. H. Meyer and M. Schilling, Secure program load with manipulation detection code, Proc. Securirom 1988, pp. 111-130.
PI B.
Preneel, K. Govaerts and J. Vandewalle, Cryptographically secure hash functions: an overview, Internal Rep., 1989 (ESAT Laboratories K. U. Leuven). [‘OlB. Preneel, A. Bosselaers, K. Govaerts and J. Vandewalle, A chosen text attack on the modified cryptographic checksum algorithm of Cohen and Huang. Advances in Cwrool09y &or. Gyplo ‘89, Springer, Berlin. 1’‘I K. L. Kivest, A. Shamir and L. Adleman, A method for obtaining digital signatures and public-key cryptosystems, Commun. ACM,2 (1978) 120-126.
Appendix: Nomenclature 0
x K
B;, Xi Qi
K M
s” ks N
C(B) RSA 14 Y
logical exclusive or complement of X cryptographic key plaintext blocks intermediate values of checksum RSA encryption exponent RSA modulus number of bits of M user specified seed parameters of our attack number of blocks of the plaintext B compression of plaintext B encryption wirh RSA public-key algorithm largest integer smaller than or equal to z Euler’s constant
Bart Preneel was born in Leuven, Belgium, on October 15, 1963. He obtained an electrical engineering d egree in applied science, from the Katholieke Leuven, Universiteit Belgium, in 1987.
Belgium, Research sciences.
sponsored (Belgium)
His research cations.
interests
From 1987 he has been a Research Assistant at the E.SA.T. Laboratory of the Katholieke Universiteit Leuven, by the National Fund for Scientific and is preparing a doctorate in applied
are cryptography
and digital
Antoon Bosselaers was born in Turnhout, Belgium, on August 7, 1963. He obtained an electrical engineering degree from the Katholieke Universiteit Leuven, Belgium, in 1987. From 1988 he has been a Research Assistant at the E.S.A.T. Laboratory of the Katholieke Universiteit Leuven, Belgium. His research tions.
interests
are cryptography
and digital communica-
communi-
261
B. Preneel et aLlFast Cryptographic
RenC Govaerts
was born
Checksum Algorithm
Joos Vandewalle was born in Kortrijk, on August 31, 1948. He Belgium, received an engineering degree and a doctorate in applied science, both from the Katholieke Universiteit Leuvcn, Belgium, in 197 1 and 1Y76 respectively.
in Leuven,
Belgium, on January 30, 1937. He received an engineering degree and a doctorate in applied science, both from the Katholieke Universiteit Leuven, Belgium in 1961 and 1971 respectively. In 1963 he became a Research Assistant and from 1964 until 1971 he was a Research Associate at the E.S.A.T. Laboratory of the Katholieke Universiteit Leuven, Belgium. In 1971 he joined the academic staff of the faculty of applied sciences and became Full Professor in 1977. His research interests are mainly thick film technology, digital video compressing techniques and communications securiry. He has authored these areas.
and
co-authored
more
than
100 papers
in
From 1972 to 1976 he was an Assistant at the E.SA.T. Laboratory of the Katholieke Universiteit Leuven. Belgium. From 1976 to 1978 he was a Research Associate and from July 1978 to July 1979, he was Visiting Assistant Professor both at the University of California, Berkeley. Since July 1979 he has been at the E.SA.T. Laboratory of the Katholieke Universiteit Leuven. Belgium where he is currently Full Professor. His research inrerests are mainly in mathematical system theory and its applications in circuit theory, control, signal processing, and cryptography. He has authored these areas.
262
and
co-authored
more
than
200 papers
in