Information Processing Letters 109 (2009) 838–841
Optimal correlation attack on the multiplexer generator

Jovan Dj. Golić a,∗, Guglielmo Morgari b

a Security Innovation, Telecom Italia, Via G. Reiss Romoli 274, 10148 Turin, Italy
b Telsy Elettronica e Telecomunicazioni, Corso Svizzera 185, 10149 Turin, Italy
Article info

Article history: Received 17 May 2007. Available online 9 April 2009. Communicated by Y. Desmedt.

Keywords: Cryptography; Multiplexer generator; Correlation attacks; A posteriori probability

Abstract

The security of the well-known multiplexer generator with respect to correlation attacks on the data shift register is investigated. Apart from the basic correlation attack exploiting the bitwise correlation between the output sequence and any data input sequence, two new correlation attacks are introduced. One is based on computing the a posteriori probabilities and is statistically optimal, whereas the other makes use of the accumulated bitwise correlation to all data input sequences. It is theoretically argued and experimentally confirmed that the optimal attack requires a significantly shorter output sequence to be successful than the basic attack. The experiments also show that the less complex accumulated correlation attack requires a somewhat longer output sequence than the optimal attack. © 2009 Elsevier B.V. All rights reserved.
∗ Corresponding author: J.Dj. Golić. 0020-0190/$ – see front matter. doi:10.1016/j.ipl.2009.04.009

1. Introduction

A class of binary pseudorandom sequences for cryptographic and spread spectrum applications, called the multiplexed sequences, was proposed and analyzed in [6] and widely popularized in [2]. Their use has also been recommended in an EBU standard for video encryption for pay-TV [10]. Multiplexed sequences are generated by a simple and fast scheme, known as the MUX generator. It consists of two linear feedback shift registers (LFSRs) and a multiplexer, whose address inputs are controlled by one of the LFSRs and whose data inputs are taken from the other. The two LFSRs are here called the address LFSR and the data LFSR, respectively.

In [6], under certain conditions, the multiplexed sequences were shown to possess good standard cryptographic properties such as a long period, a high linear complexity, and a low out-of-phase autocorrelation for most values of the phase shifts on a period. However, over the years, a number of weaknesses have also been found. Namely, the collision test [1] and the linear consistency test [11] are divide-and-conquer attacks
on the address LFSR. On the other hand, the basic correlation attack [8] and the fast correlation attacks [7] are divide-and-conquer attacks on the data LFSR which make use of the bitwise correlation [4] between the output sequence and any data input sequence. It is shown in [9] that the fast correlation attacks can be successful if the number of address bits is relatively small (e.g., 2 to 4). In [4], it is proved that the out-of-phase autocorrelation is necessarily high for relatively small values of the phase shifts and, in [5], this result is extended to higher-order statistical weaknesses as well. The resynchronization attack [3] can recover the initial states of both LFSRs, even if their lengths are large, by using a number of resynchronizations, provided that the resynchronization scheme is linear.

Recall that the correlation attack is a known plaintext attack conducted under the assumption that the cryptanalyst knows the complete structure of the generator and that the secret key defines only the initial states of the two LFSRs. Let $k$ and $K = 2^k$ be the numbers of address input bits and data input bits for the multiplexer, respectively. Then, at any time, the output bit is correlated to each data input bit with the correlation coefficient $c = 1/K$. Here, the correlation coefficient between two random bits $a$ and $b$ is defined as $\Pr\{a = b\} - \Pr\{a \neq b\}$. In the basic correlation
attack on the data LFSR, we select one of the $K$ data inputs and use the binary symmetric channel model to represent the bitwise correlation. The goal is to recover the selected data input sequence from a known segment of the output sequence. For each possible initial state of the data LFSR, a segment of the selected data input sequence is generated and then compared with the known output segment. In this model, it is statistically optimal to choose the initial state that minimizes the Hamming distance between the output segment and the corresponding data input segment.

The main objective of this letter is to study the correlation attack on the data LFSR in the MUX generator in more detail. More precisely, we would like to make use of the bitwise correlation between the output sequence and all data input sequences. Note that these sequences are not independent, as they are phase shifts of each other. We would thus like to find the statistically optimal correlation attack and analyze the output sequence length required for its success. Another aim is to compare this attack with the accumulated correlation attack, in which the bitwise correlations to individual data input sequences are accumulated by adding up the corresponding Hamming distances. The problems considered are of wider interest, as the MUX combining principle can also be applied within more complex generators.

2. MUX generator

The MUX generator is depicted in Fig. 1. It consists of two regularly clocked LFSRs whose outputs are combined by a multiplexer Boolean function, MUX. LFSR$_a$ is the address LFSR of length $r_a$ and LFSR$_d$ is the data LFSR of length $r_d$. With primitive feedback polynomials, they produce maximum-length sequences of periods $P_a = 2^{r_a} - 1$ and $P_d = 2^{r_d} - 1$, respectively. At any time $t$, the contents of a fixed set of $k$ stages of LFSR$_a$ are used to form a $k$-bit address, $A(t)$. If $k < r_a$, then the range of addresses is $\{0, 1\}^k$.
Let $\gamma$ be an invertible mapping from $\{0, 1\}^k$ into the set of indexes of the $r_d$ stages of LFSR$_d$. The MUX then selects a stage of LFSR$_d$ with the index $\gamma(A(t))$, from a fixed set of $K = 2^k$ stages $\{\gamma(A) : A \in \{0, 1\}^k\}$ providing data inputs to the MUX. The content of the selected stage is taken as the output bit $z(t)$ of the multiplexed sequence.

Fig. 1. Multiplexer generator.

3. Basic correlation attack

Multiplexed sequences possess a correlation weakness [4], namely, there is a bitwise statistical dependence between the output sequence $z$ and individual phase shifts of the LFSR$_d$ sequence providing the $K$ data inputs to the MUX. The correlation coefficient $c$ is equal to $1/K$, independently of $t$, if the $k$-bit addresses are uniformly distributed. More precisely, this is because $c = 1$ if $z(t)$ is chosen from the same stage as the respective data input, which happens with probability $1/K$, whereas $c = 0$ otherwise. In the basic correlation attack, as explained in Section 1, this correlation is used for reconstructing the initial state of LFSR$_d$. The basic correlation attack is equivalent to minimum distance decoding on the binary symmetric channel, since
the segments, of any given length $n$, of the sequences produced from all the initial states of LFSR$_d$ represent a linear block code. The output sequence length required for a successful correlation attack is $n = O(r_d / c^2)$, where the involved multiplicative constant depends on the success probability and is small (e.g., smaller than 10). It is known that this expression can be obtained either from channel capacity arguments or from the underlying binomial distributions. In particular, in the MUX generator case, the required length becomes $n = O(r_d K^2)$. The computational complexity is $O(2^{r_d} n)$, as we exhaustively search through all the initial states of LFSR$_d$ and, for each initial state, compute the Hamming distance between a given output segment of length $n$ and the corresponding data input segment produced from that initial state, and then choose the initial state giving rise to the minimum Hamming distance. The correlation attack is a divide-and-conquer attack on LFSR$_d$ and, once its initial state is recovered, the initial state of LFSR$_a$ can be reconstructed either by exhaustive search or, faster, by the information set decoding algorithm based on conditional correlations proposed in [9].

4. Optimal correlation attack

Our objective is to utilize the bitwise correlation between the MUX output sequence and all $K$ data input sequences to the MUX. A straightforward way of doing this is simply to accumulate the bitwise correlations as if they were all independent. In other words, we can add up the Hamming distances between the same output segment and all $K$ data input segments, produced from the same initial state of LFSR$_d$, and then decide on the initial state yielding the minimum accumulated Hamming distance. This algorithm is here called the accumulated correlation attack.
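The exhaustive searches described so far (minimum Hamming distance to one data input, and minimum accumulated Hamming distance to all $K$ data inputs) can be sketched on a toy instance. The following is only an illustration under stated assumptions, not the authors' implementation: the register length `R_D = 8`, the feedback taps `TAPS`, the tapped stages `STAGES` (a hypothetical choice of $\gamma$ with $K = 4$), and all function names are made up for this example, and the addresses are drawn uniformly at random as in the probabilistic model rather than from an address LFSR. Note that, with a single data input, the search can at best identify the state up to the $K$ phase shifts that align the selected input with one of the true data inputs.

```python
import random

R_D = 8                    # data LFSR length (toy value, assumption)
TAPS = (0, 2, 3, 4)        # recurrence s[t+8] = s[t] ^ s[t+2] ^ s[t+3] ^ s[t+4]
STAGES = (0, 2, 5, 7)      # hypothetical choice of the K = 4 tapped stages

def lfsr_bits(state, length):
    """First `length` bits of the LFSR sequence started in `state`."""
    s = list(state)
    for t in range(length - len(s)):
        fb = 0
        for tap in TAPS:
            fb ^= s[t + tap]
        s.append(fb)
    return s[:length]

def int_to_state(v):
    """Unpack an integer into R_D state bits."""
    return [(v >> i) & 1 for i in range(R_D)]

def mux_output(state, n, rng):
    """n output bits, each read from a uniformly random tapped stage."""
    s = lfsr_bits(state, n + max(STAGES))
    return [s[t + rng.choice(STAGES)] for t in range(n)]

def hamming_attack(z, stages):
    """Exhaustive search minimizing the Hamming distance summed over `stages`."""
    n = len(z)
    best, best_dist = None, None
    for v in range(1, 2 ** R_D):                 # every nonzero initial state
        s = lfsr_bits(int_to_state(v), n + max(STAGES))
        dist = sum(z[t] != s[t + o] for o in stages for t in range(n))
        if best_dist is None or dist < best_dist:
            best, best_dist = int_to_state(v), dist
    return best, best_dist

rng = random.Random(2009)
true_state = int_to_state(rng.randrange(1, 2 ** R_D))
z = mux_output(true_state, 2000, rng)

basic, _ = hamming_attack(z, STAGES[:1])         # one selected data input
accumulated, _ = hamming_attack(z, STAGES)       # all K data inputs
print("basic       :", basic)
print("accumulated :", accumulated)
print("true state  :", true_state)
```

Passing a one-element tuple of stages gives the basic attack and passing all stages gives the accumulated attack, so the two searches share the same loop and the same $O(2^{r_d} n)$ structure.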
Instead of the Hamming distance, which is the number of coordinates in which two binary vectors disagree, we can equivalently consider the number of coordinates in which they agree, which can be called the Hamming similarity. The accumulated correlation attack then maximizes the accumulated Hamming similarity. Heuristically, it may appear that we thus effectively prolong the given output segment K times and, hence, reduce the required output
sequence length $K$ times to $n = O(r_d K)$, while keeping the same computational complexity as in the basic correlation attack. However, this is not theoretically justified and needs to be tested experimentally, as each output bit is repeatedly used $K$ times.

We now show that it is possible to compute the a posteriori probability for the initial state of LFSR$_d$ given a segment of the MUX output sequence. More precisely, we can compute the probability that the initial state of LFSR$_d$ is equal to any assumed value, given an output segment. The algorithm that computes this a posteriori probability for any assumed initial state and then chooses the initial state with the maximum probability minimizes the probability of decision error and is hence statistically optimal. Such an attack is here called the optimal correlation attack.

Assume the probabilistic model in which the unknown address input sequence to the MUX is a sequence of uniformly distributed and mutually independent random variables that is independent of the initial state of LFSR$_d$, which is chosen uniformly at random. Let $z^n = (z_i)_{i=1}^{n}$ denote a given binary MUX output segment and let $X^n = (X_i)_{i=1}^{n}$ denote a segment of $K$-dimensional data input vectors produced from an assumed initial state $S_d$ of LFSR$_d$, where both segments have length $n$. We are interested in computing the a posteriori probability $\Pr\{S_d \mid z^n\}$. If $n \ge r_d$, then $X^n$ is an invertible function of $S_d$ and, hence, $\Pr\{S_d \mid z^n\} = \Pr\{X^n \mid z^n\}$. Further, we have
$$\Pr\{X^n \mid z^n\} = \Pr\{z^n \mid X^n\}\,\frac{\Pr\{X^n\}}{\Pr\{z^n\}}. \tag{1}$$
Since all $S_d$ are equiprobable, we get $\Pr\{X^n\} = 2^{-r_d}$, for all $X^n$. Then, as $\Pr\{z^n\}$ does not depend on $X^n$, maximizing the a posteriori probability $\Pr\{X^n \mid z^n\}$ over $X^n$ is equivalent to maximizing the conditional probability $\Pr\{z^n \mid X^n\}$ over $X^n$.

The problem thus reduces to computing $\Pr\{z^n \mid X^n\}$. In this case, $X^n$ is given and the conditional probability relates to the value of the output segment over random choices of the segment $A^n = (A_i)_{i=1}^{n}$ of the address input sequence. In the assumed model, for any given $X^n$, the address input vectors $A_i$ are uniformly distributed and mutually independent. Accordingly, we obtain
$$\Pr\{z^n \mid X^n\} = \prod_{i=1}^{n} \Pr\{z_i \mid X_i\}, \tag{2}$$
where $\Pr\{z_i \mid X_i\}$ is equal to the relative number of bits in $X_i$ that are equal to $z_i$. If $w_H(X_i)$ denotes the Hamming weight of $X_i$, i.e., the number of 1's in $X_i$, and $p_i = w_H(X_i)/K$, then we obtain
$$\Pr\{z^n \mid X^n\} = \prod_{i=1}^{n} p_i^{z_i} (1 - p_i)^{1 - z_i} = \frac{1}{K^n} \prod_{i=1}^{n} w_H(X_i)^{z_i} \bigl(K - w_H(X_i)\bigr)^{1 - z_i}. \tag{3}$$
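As a small sanity check of (2) and (3), the product form can be compared with a direct count over all equiprobable address sequences. This is a minimal sketch with made-up data input vectors; the function names `likelihood` and `likelihood_bruteforce` and the example values are assumptions introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

def likelihood(z, X):
    """Pr{z^n | X^n} via (3): product of w^z (K - w)^(1 - z), divided by K^n."""
    K = len(X[0])
    num = 1
    for zi, Xi in zip(z, X):
        w = sum(Xi)                          # Hamming weight w_H(X_i)
        num *= w if zi == 1 else K - w
    return Fraction(num, K ** len(z))

def likelihood_bruteforce(z, X):
    """Same probability by enumerating all K^n equiprobable address sequences."""
    K, n = len(X[0]), len(z)
    hits = sum(
        all(X[i][a[i]] == z[i] for i in range(n))
        for a in product(range(K), repeat=n)
    )
    return Fraction(hits, K ** n)

# Made-up example with K = 4 data inputs and n = 3 output bits.
X = [(1, 0, 1, 1), (0, 0, 1, 0), (1, 1, 0, 0)]
z = (1, 0, 1)
print(likelihood(z, X), likelihood_bruteforce(z, X))   # prints: 9/32 9/32
```

Here the three factors are 3, 3 and 2, so the product is 18 out of $4^3 = 64$ address sequences, which matches the enumeration. Exact fractions are used only to make the comparison exact; an actual search would more likely work with logarithms.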
Consequently, the optimal correlation attack maximizes (3). The output sequence length required for a successful correlation attack can be estimated as follows. If the initial state of LFSR$_d$ is guessed correctly, then, for each $i$
in the product in (3), the data input vector $X_i$ assumes the correct value and, hence, the given output bit $z_i$ was randomly chosen to be equal to 1 with probability $p_i$. If this state is guessed incorrectly, then $X_i$ assumes a wrong, random value and, hence, $z_i$ was randomly chosen to be equal to 1 not with probability $p_i$ but, rather, with the probability corresponding to the correct value of the data input vector. So, in the former case, the product in (3) will, on average, contain more terms bigger than one half and, in the latter, the number of terms bigger than one half will, on average, be equal to the number of terms smaller than one half. Accordingly, the product in (3) is expected to be larger in the former case than in the latter.

It is readily seen that the situation is similar to that for the statistically optimal decoding on a time-variant binary symmetric channel, where $p_i$ correspond to noise probabilities and $z_i$ to error bits (obtained by adding modulo two the guessed codeword bits to the received noisy codeword bits). The number of terms $n$ needed to distinguish, with a significant probability, the correct guess from one wrong guess depends on the probability distribution of $p_i$, which corresponds to the binomial distribution of $w_H(X_i)$ when $X_i$ is chosen uniformly at random. More precisely, $n$ can be estimated as $n = O(1/\overline{c_i^2})$, where the overline denotes the average value, $c_i = 1 - 2p_i$ is the correlation coefficient between $z_i$ and 0, conditioned on known $p_i$ when the guess is correct, and the involved multiplicative constant is small. By virtue of $\overline{(w_H(X_i) - 0.5K)^2} = K/4$, which holds for binomially distributed $w_H(X_i)$, we then obtain that $n = O(K)$. Therefore, the output sequence length required to distinguish the correct guess from all $2^{r_d} - 1$ wrong guesses is $n = O(r_d K)$, which is much smaller than $n = O(r_d K^2)$ for the basic correlation attack. The computational complexity is $O(2^{r_d} n)$ steps, but the step complexity is increased, as it corresponds to computing a product of two integers in (3).
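The step from the binomial variance to $n = O(K)$ can be written out explicitly. The short derivation below is a sketch that assumes, as in the text, that $X_i$ is uniformly distributed, so that $w_H(X_i)$ is binomial with parameters $K$ and $1/2$; the symbol $\mathbb{E}[\cdot]$ for the average is notation introduced here.

```latex
% Assumption: X_i is uniform on {0,1}^K, so w_H(X_i) ~ Binomial(K, 1/2);
% E[.] denotes the average over X_i.
\begin{align*}
c_i &= 1 - 2p_i = \frac{K - 2\,w_H(X_i)}{K}, \\
c_i^2 &= \frac{4\,\bigl(w_H(X_i) - K/2\bigr)^2}{K^2}, \\
\mathbb{E}\bigl[c_i^2\bigr] &= \frac{4}{K^2}\cdot\frac{K}{4} = \frac{1}{K}, \\
n &= O\bigl(1/\mathbb{E}[c_i^2]\bigr) = O(K).
\end{align*}
% Example: K = 16 gives E[c_i^2] = 1/16, versus c^2 = 1/K^2 = 1/256
% for a single data input in the basic correlation attack.
```

The factor of $K$ gained over the basic attack thus comes from comparing the mean squared correlation $1/K$ with the squared single-input correlation $1/K^2$.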
For comparison, the accumulated correlation attack maximizes the accumulated Hamming similarity, which can be put into the following form
$$\mathrm{Sim}(z^n, X^n) = \sum_{i=1}^{n} w_H(X_i)^{z_i} \bigl(K - w_H(X_i)\bigr)^{1 - z_i}. \tag{4}$$
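The two decision statistics, the product in (3) and the sum in (4), are built from the same per-bit terms, which makes them easy to compare on a toy instance. The sketch below is an illustration under assumptions and not the experimental setup of this letter: the register length `R_D = 8`, the taps `TAPS`, the tapped stages `STAGES` (a hypothetical $\gamma$ with $K = 4$), and the function names are all made up, and the addresses are drawn uniformly at random. The logarithm of the product is used, with the constant factor $1/K^n$ omitted, since neither change affects the ranking.

```python
import math
import random

R_D = 8                    # data LFSR length (toy value, assumption)
TAPS = (0, 2, 3, 4)        # recurrence s[t+8] = s[t] ^ s[t+2] ^ s[t+3] ^ s[t+4]
STAGES = (0, 2, 5, 7)      # hypothetical choice of the K = 4 tapped stages
K = len(STAGES)

def lfsr_bits(state, length):
    """First `length` bits of the LFSR sequence started in `state`."""
    s = list(state)
    for t in range(length - len(s)):
        fb = 0
        for tap in TAPS:
            fb ^= s[t + tap]
        s.append(fb)
    return s[:length]

def int_to_state(v):
    """Unpack an integer into R_D state bits."""
    return [(v >> i) & 1 for i in range(R_D)]

def mux_output(state, n, rng):
    """n output bits, each read from a uniformly random tapped stage."""
    s = lfsr_bits(state, n + max(STAGES))
    return [s[t + rng.choice(STAGES)] for t in range(n)]

def scores(z, state):
    """Return (log of the product of terms in (3), sum of terms in (4))."""
    s = lfsr_bits(state, len(z) + max(STAGES))
    log_prod, total = 0.0, 0
    for t, zt in enumerate(z):
        w = sum(s[t + o] for o in STAGES)        # Hamming weight w_H(X_t)
        term = w if zt == 1 else K - w           # w^z * (K - w)^(1 - z)
        total += term
        # A zero term means this state cannot have produced z at all.
        log_prod += math.log(term) if term else float("-inf")
    return log_prod, total

def ranks_of_true(z, true_state):
    """Number of candidates scoring strictly above the true state, per attack."""
    true_opt, true_acc = scores(z, true_state)
    rank_opt = rank_acc = 0
    for v in range(1, 2 ** R_D):                 # every nonzero initial state
        opt, acc = scores(z, int_to_state(v))
        rank_opt += opt > true_opt
        rank_acc += acc > true_acc
    return rank_opt, rank_acc

rng = random.Random(2009)
true_state = int_to_state(rng.randrange(1, 2 ** R_D))
for n in (16, 32, 64, 256):
    z = mux_output(true_state, n, rng)
    print(n, ranks_of_true(z, true_state))       # (optimal rank, accumulated rank)
```

Rank zero means that no candidate scores above the true state. In this toy model a wrong candidate is ruled out by the product as soon as one output bit matches none of its $K$ data input bits, whereas the sum only loses one term in that case.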
So, the accumulated correlation attack is close to being optimal inasmuch as the sum of the terms in (4) is close to the product of the same terms in (3). In general, one may expect a roughly monotonic increasing relation between a sum and a product of the same nonnegative integers, but, certainly, this relation cannot be described as a functional dependence. This can be regarded as a theoretical justification for the accumulated correlation attack. Accordingly, the accumulated and optimal correlation attacks are not equivalent and it is interesting to compare them experimentally.

5. Experimental results

Without loss of generality, to reduce the number of parameters, we conducted the experiments with $K = r_d$, i.e., by taking the data inputs to the MUX from all the stages of LFSR$_d$, where the address input vectors are pseudorandomly generated as uniformly distributed and mutually independent. According to the analysis from Section 4, the output sequence length was chosen as $n = m r_d^2$, $2 \le m \le 10$. For a number, 100, of pseudorandomly chosen initial states of LFSR$_d$, we generated the output segments of variable length $n$ and, for each segment and each length, we then computed the conditional probability (3) and the accumulated Hamming similarity (4) for all the initial states of LFSR$_d$ and found the rank of the correct guess, where rank zero corresponds to the maximal probability or similarity. The average rank over the chosen 100 initial states, as a function of $m$, is shown in Fig. 2, for $r_d = 20$. The probability that the correct guess has nonzero rank, over the chosen 100 initial states, is shown in Fig. 3. Similar figures were obtained for $r_d < 20$. Each graph contains two curves, corresponding to the optimal and accumulated correlation attacks.

Note that an attack can be regarded as successful whenever the rank of the correct guess is relatively small or even moderately large. It thus appears that any value $4 \le m \le 10$ is sufficient for success. It follows that the success rate of both attacks rapidly increases with the constant $m$ and that, for small $m$, the optimal attack is more successful, whereas, for large $m$, the two attacks have a similar success rate. This confirms the theoretical predictions from Section 4 and also shows that the less complex accumulated correlation attack is not far from the optimal one.

Fig. 2. Average rank versus normalized output sequence length $m = n/r_d^2$, $r_d = 20$.

Fig. 3. Nonzero rank probability versus normalized output sequence length $m = n/r_d^2$, $r_d = 20$.

6. Conclusion

Two new correlation attacks on the MUX generator are introduced, namely, the optimal correlation attack and its numerical approximation, the accumulated correlation attack. They both reduce the required output sequence length considerably, i.e., about $K$ times, in comparison with the basic correlation attack. This length is shorter for the optimal attack, but the complexity is higher. For longer output sequences, the new attacks perform similarly, as they both identify the correct guess with a very high probability.

References
[1] R. Anderson, Solving a class of stream ciphers, Cryptologia 14 (3) (1990) 285–288.
[2] H. Beker, F. Piper, Cipher Systems: The Protection of Communications, Northwood Publications, London, 1982.
[3] J. Daemen, R. Govaerts, J. Vandewalle, Cryptanalysis of MUX-LFSR based scramblers, in: Proc. 3rd Symposium on State and Progress of Research in Cryptography – SPRC ’93, Rome, Italy, 1993, pp. 55–61.
[4] J.Dj. Golić, M. Salmasizadeh, E. Dawson, Autocorrelation weakness of multiplexed sequences, in: Proc. International Symposium on Information Theory and its Applications – ISITA ’94, Sydney, Australia, vol. 2, 1994, pp. 983–987.
[5] J.Dj. Golić, M. Salmasizadeh, E. Dawson, Statistical weakness of multiplexed sequences, Finite Fields Appl. 8 (2002) 420–433.
[6] S.M. Jennings, A special class of binary sequences, PhD thesis, University of London, 1980.
[7] W. Meier, O. Staffelbach, Fast correlation attacks on certain stream ciphers, J. Cryptology 1 (3) (1989) 159–176.
[8] T. Siegenthaler, Decrypting a class of stream ciphers using ciphertext only, IEEE Trans. Comput. 34 (1985) 81–85.
[9] L. Simpson, J.Dj. Golić, M. Salmasizadeh, E. Dawson, A fast correlation attack on multiplexer generators, Inform. Process. Lett. 70 (1999) 89–93.
[10] Specification of the systems of the MAC/packet family, EBU Technical Document 3258-E, Oct. 1986.
[11] K.C. Zeng, C.H. Yang, T.R.N. Rao, On the linear consistency test (LCT) in cryptanalysis with applications, in: Advances in Cryptology – CRYPTO ’89, in: Lecture Notes in Computer Science, vol. 434, 1990, pp. 164–174.