Chapter 7 Difference Sets and Sequences
As seen in Chapter 6, the autocorrelation property of a binary periodic sequence is closely related to the difference property of its characteristic set with respect to the addition of $Z_N$, where $N$ is a period of the sequence. Generally speaking, the better the difference property of its characteristic set, the smaller $\max_{w \neq 0} |AC_s(w)|$ will be. In particular, for residue difference sets the autocorrelation functions of their characteristic sequences (briefly, DSC sequences) are 2-valued. For almost difference sets of $Z_N$ the autocorrelation functions of their characteristic sequences (briefly, ADSC sequences) are 3-valued. Furthermore, the characteristic sequences of difference sets and almost difference sets with parameters $(N, k, \lambda)$ having $k - \lambda \approx N/4$ have good autocorrelation property. The autocorrelation property of sequences is cryptographically important for at least one reason: the control of the transformation density of some stream ciphers [122]. In addition, the autocorrelation property determines the two-digit pattern distributions of binary sequences. Due to the cryptographic significance of DSC sequences and ADSC sequences this chapter mainly introduces the differential analysis of those sequences and presents some results about their linear complexity. The NSG realization of sequences is also presented to show the significance of the differential analysis of sequences.

7.1 The NSG Realization of Sequences
There are many ways to generate sequences, as shown by the many kinds of proposed generators. In spite of the flexibility of generating binary sequences, every binary sequence generator is equivalent to a natural sequence generator (NSG) described in Chapter 2. We say two generators are equivalent if, given any output sequence of one of the generators, the other generator can produce the same output sequence when the parameters of the generator are properly chosen. In this section we search for those NSGs which can produce some given sequences and for the equivalent NSGs of some known generators. To this end, we need the trace representation of sequences. It is well known that every periodic sequence in $K = GF(q)$ has a trace representation described by the following two propositions [276, pp. 406 and 467].
Proposition 7.1.1 Let $s^\infty$ be a periodic sequence in $K = GF(q)$ whose characteristic polynomial $f(x)$ of degree $k$ is irreducible over $K$. Let $\alpha$ be a root of $f(x)$ in the extension field $F = GF(q^k)$. Then there exists a uniquely determined $\theta \in F$ such that
$$s_n = \mathrm{Tr}_{F/K}(\theta \alpha^n), \quad n \ge 0,$$
where $\mathrm{Tr}_{F/K}(x)$ is the trace function.
The characteristic polynomial of a sequence refers to a zero polynomial of the sequence, which is a multiple of the monic minimal polynomial of the sequence. Proposition 7.1.1 gives a trace representation only for periodic sequences whose characteristic polynomials are irreducible over K. Generally we have the following conclusion [276, p. 467].
Proposition 7.1.2 Let $s^\infty$ be a periodic sequence in $K = GF(q)$ with characteristic polynomial $f(x) = f_1(x) \cdots f_r(x)$, where the $f_i(x)$ are distinct irreducible polynomials over $K$. For $i = 1, \ldots, r$, let $\alpha_i$ be a root of $f_i(x)$ in its splitting field $F_i$ over $K$. Then there exist uniquely determined elements $\theta_1 \in F_1, \ldots, \theta_r \in F_r$ such that
$$s_n = \mathrm{Tr}_{F_1/K}(\theta_1 \alpha_1^n) + \cdots + \mathrm{Tr}_{F_r/K}(\theta_r \alpha_r^n), \quad n \ge 0.$$
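As an illustration of Proposition 7.1.1, the following minimal sketch generates a binary sequence from its trace representation. The field $GF(2^3)$, the polynomial $f(x) = x^3 + x + 1$, the choice $\theta = 1$, and all function names are assumptions made for this example only; they do not come from the text.

```python
# Elements of GF(8) = GF(2)[x]/(x^3 + x + 1) are stored as 3-bit integers
# (bit i is the coefficient of x^i). The polynomial is an illustrative choice.
MOD = 0b1011   # f(x) = x^3 + x + 1, irreducible over GF(2)
K = 3          # degree of f(x), so F = GF(2^3)

def gf_mul(a, b):
    """Multiply two elements of GF(2^K) modulo f(x)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:      # degree reached K: reduce by f(x)
            a ^= MOD
    return result

def trace(x):
    """Tr(x) = x + x^2 + x^4 from GF(8) down to GF(2); the result is 0 or 1."""
    total, power = 0, x
    for _ in range(K):
        total ^= power
        power = gf_mul(power, power)
    return total

def trace_sequence(theta, alpha, length):
    """Return s_n = Tr(theta * alpha^n) for n = 0, ..., length - 1."""
    out, term = [], theta
    for _ in range(length):
        out.append(trace(term))
        term = gf_mul(term, alpha)
    return out

ALPHA = 0b010  # the residue class of x, a root of f(x)
s = trace_sequence(theta=1, alpha=ALPHA, length=14)
print(s[:7])   # one period of the sequence
```

Because $f(x)$ is a characteristic polynomial of the sequence, the output should satisfy the recurrence $s_{n+3} = s_{n+1} + s_n$ over $GF(2)$, which is easy to check on the printed terms.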
Now we describe an NSG realization of periodic sequences in the finite field $K = GF(q)$. Let $s^\infty$ be the sequence described in Proposition 7.1.1; then one of its NSG realizations is depicted by Figure 7.1. For the sequence $s^\infty$ of Proposition 7.1.2 we have an NSG realization in Figure 7.2. The NSG realization of the maximum-length sequences is easy given the above two propositions. If one has a characteristic polynomial of a sequence, it is possible to give an NSG realization of the sequence. However the computational complexity could be very large, depending on the sequence. Finding the minimal polynomial of a periodic sequence could be easy as we have the efficient Berlekamp-Massey algorithm. But factoring a polynomial and finding the parameters $\theta_i$ and $\alpha_i$ of Proposition 7.1.2 could be hard. We also note that the NSG realization of a sequence is not unique.

[Figure 7.1: The NSG realization of some sequences. Recoverable labels: key, N-cyclic counter.]

7.2 Differential Analysis of Sequences
For any sequence generator (SG), suppose that its output sequence $s^\infty$ over a finite group $(G, +)$ has period $N$. Let
$$C_s(g) = \{ i : s_i = g,\ 0 \le i \le N - 1 \}, \quad g \in G,$$
and $f_s$ be the characteristic function of the partition $\{C_s(g) : g \in G\}$. The analysis of the difference parameters
$$d_s(i,j;w) = |C_s(i) \cap (C_s(j) - w)|, \quad (i,j;w) \in G \times G \times Z_N,$$
is called the differential analysis of the sequence. The conservation laws between the difference parameters are given in Section 4.2.1. The differential analysis of sequences could be finer than the autocorrelation analysis. However, for binary sequences they are equivalent. It is clear that the differential analysis is in fact the two-character pattern distribution analysis, since the difference parameters $d_s(i,j;w)$ represent the number of appearances of one two-character pattern in a period of the sequence.

[Figure 7.2: The NSG realization of some general sequences. Recoverable labels: key, N-cyclic counter.]

Let $\chi$ be a group character of $(G, +)$. By definition the periodic autocorrelation function of a sequence $s^\infty$ of period $N$ over $G$ is given by
$$AC_s(l) = \sum_{i=0}^{N-1} \chi(s_i - s_{i+l}) = \sum_{v \in G} |\{ 0 \le i \le N - 1 : s_i - s_{i+l} = v \}| \, \chi(v) = \sum_{v \in G} \sum_{u \in G} d_s(u, u - v; l) \, \chi(v).$$
Thus, if the difference parameter $d_s(i,j;l)$ is a constant for all $(i,j) \in G \times G$, the autocorrelation value $AC_s(l) = 0$ if $l \neq 0$ (for a nontrivial character $\chi$, whose values sum to zero over $G$). Generally, the flatter the difference parameters, the smaller the autocorrelation values $|AC_s(l)|$ for $l \neq 0$. But the converse may not be true when $|G| \ge 3$. In summary, the differential analysis gives the autocorrelation analysis and two-character pattern analysis. Note that every periodic sequence has an NSG realization and many generators have an equivalent NSG. Thus, if an equivalent NSG of a keystream generator can be constructed, the differential analysis of the NSG is necessary due to the differential attack described in [122]. If we cannot rule out that an equivalent NSG of the keystream generator can be constructed, then we should carry out the differential analysis of the keystream. Otherwise, a bad difference property of the keystream sequence could lead to the determination of some parameters of the NSG with which the NSG could produce the same keystream sequence.
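The relation between difference parameters and autocorrelation can be checked numerically. The sketch below assumes a binary sequence over $Z_2$ with the character $\chi(v) = (-1)^v$; the sample sequence (one period of a maximum-length sequence of period 7) and the function names are illustrative choices, not taken from the text.

```python
from collections import Counter

def difference_parameters(s, w):
    """d_s(i, j; w) = #{t : s_t = i and s_{t+w} = j}, indices taken modulo the period."""
    n = len(s)
    return Counter((s[t], s[(t + w) % n]) for t in range(n))

def autocorrelation(s, w):
    """AC_s(w) computed directly as the sum of chi(s_t - s_{t+w}), chi(v) = (-1)^v."""
    n = len(s)
    return sum((-1) ** ((s[t] - s[(t + w) % n]) % 2) for t in range(n))

def autocorrelation_from_d(s, w):
    """AC_s(w) computed from the difference parameters only."""
    d = difference_parameters(s, w)
    return sum(count * (-1) ** ((i - j) % 2) for (i, j), count in d.items())

s = [1, 0, 0, 1, 0, 1, 1]   # one period of an m-sequence of period 7 (assumed example)
for w in range(1, len(s)):
    print(w, dict(difference_parameters(s, w)), autocorrelation(s, w))
```

For this sequence each shift $w \neq 0$ yields the two-character pattern counts $d_s(0,0;w) = 1$ and $d_s(0,1;w) = d_s(1,0;w) = d_s(1,1;w) = 2$, so both ways of computing the autocorrelation give $-1$.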
7.3 Linear Complexity of DSC (ADSC) Sequences
It is known that for any binary maximum-length sequence $s^\infty$ of period $2^m - 1$, its characteristic set is a $(2^m - 1, 2^{m-1}, 2^{m-2})$ difference set (for example, see [404], p. 314). On the other hand, the m-sequences also satisfy Golomb's three postulates. But these sequences have only linear complexity $m$, which is very small compared with the period $2^m - 1$. However, there are some DSC sequences with large linear complexity. In fact there do exist DSC sequences having maximum linear complexity, as described by the following proposition [122].

Proposition 7.3.1 Let $D$ be an $(N, k, \lambda)$-difference set of $Z_N$ and $s^\infty$ be its periodic characteristic sequence. Then
1. if $k$ is even and $\lambda$ odd, then $L(s^\infty) = N - 1$;
2. if $k$ is odd and $\lambda$ even, then $L(s^\infty) = N$;
3. if $k$ and $\lambda$ both are even, then
$$L(s^\infty) = \deg \left[ \frac{\gcd(s^N(x^{-1})x^N, x^N - 1)}{\gcd(\gcd(s^N(x), x^N - 1), \gcd(s^N(x^{-1})x^N, x^N - 1))} \right];$$
4. if $k$ and $\lambda$ both are odd, then
$$L(s^\infty) = \deg \left[ \frac{\gcd(s^N(x^{-1})x^N, x^N - 1)(x + 1)}{\gcd(\gcd(s^N(x), x^N - 1), \gcd(s^N(x^{-1})x^N, x^N - 1))} \right],$$
where $s^N(x) = s_0 + s_1 x + \cdots + s_{N-1} x^{N-1}$.
Proof: It is well known [138], [276, pp. 418-423] that the minimal polynomial of a sequence of period $N$ over $GF(q)$ can be expressed as
$$f_s(x) = \frac{x^N - 1}{\gcd(s^N(x), x^N - 1)}.$$
Since the characteristic sequences are binary, our arithmetic is now on $GF(2)$. Let $D$ be the characteristic set of $s^\infty$. Since $D$ is a difference set,
$$s^N(x) s^N(x^{-1}) x^N \equiv \sum_{i,j} x^{d_i - d_j} \equiv (n \bmod 2) + (\lambda \bmod 2)(1 + x + \cdots + x^{N-1}) \pmod{x^N - 1},$$
where $n = k - \lambda$.

If $k$ is even and $\lambda$ is odd, then $n$ is odd, and
$$s^N(x) s^N(x^{-1}) x^N \equiv 1 + (1 + x + \cdots + x^{N-1}) \pmod{x^N - 1}.$$
By the difference-set property $k(k - 1) = (N - 1)\lambda$. Thus $N$ must be odd. It follows further from the assumptions of the proposition that $(x + 1)$ but not $(x + 1)^2$ divides $s^N(x)$. Hence
$$\gcd(s^N(x), x^N - 1) = x - 1, \quad f_s(x) = (x^N - 1)/(x - 1).$$
Thus the linear complexity of the sequence is $N - 1$. This proves part one.

If $k$ is odd and $\lambda$ even, then
$$s^N(x) s^N(x^{-1}) x^N \equiv 1 \pmod{x^N - 1}.$$
It follows that $\gcd(s^N(x), x^N - 1) = 1$, and $L(s^\infty) = N$. This proves part two.

If $k$ and $\lambda$ both are even, then
$$s^N(x) s^N(x^{-1}) x^N \equiv 0 \pmod{x^N - 1}$$
and therefore
$$\gcd(s^N(x), x^N - 1) \gcd(s^N(x^{-1})x^N, x^N - 1) \equiv 0 \pmod{x^N - 1},$$
whence $\gcd(s^N(x), x^N - 1)$ is equal to
$$\frac{(x^N - 1) \gcd(\gcd(s^N(x), x^N - 1), \gcd(s^N(x^{-1})x^N, x^N - 1))}{\gcd(s^N(x^{-1})x^N, x^N - 1)}.$$
This proves part three. The remaining part four can be proved similarly. □
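The formula $L(s^\infty) = N - \deg \gcd(s^N(x), x^N - 1)$ used in the proof can be tried on small difference sets. The following sketch stores polynomials over $GF(2)$ as integer bitmasks and cross-checks the result with a standard binary Berlekamp-Massey routine run on two periods. The four example sets, the helper names, and the bitmask representation are assumptions of this illustration.

```python
def poly_deg(p):
    """Degree of a GF(2) polynomial stored as a bitmask (-1 for the zero polynomial)."""
    return p.bit_length() - 1

def poly_mod(a, b):
    """Remainder of a divided by b over GF(2); b must be nonzero."""
    db = poly_deg(b)
    while poly_deg(a) >= db:
        a ^= b << (poly_deg(a) - db)
    return a

def poly_gcd(a, b):
    while b:
        a, b = b, poly_mod(a, b)
    return a

def characteristic_sequence(D, N):
    members = {d % N for d in D}
    return [1 if i in members else 0 for i in range(N)]

def complexity_by_gcd(D, N):
    """L = N - deg gcd(s^N(x), x^N - 1); over GF(2), x^N - 1 equals x^N + 1."""
    s = characteristic_sequence(D, N)
    sN = sum(bit << i for i, bit in enumerate(s))
    return N - poly_deg(poly_gcd((1 << N) | 1, sN))

def complexity_by_berlekamp_massey(D, N):
    """Binary Berlekamp-Massey on two periods (enough, since L <= N)."""
    bits = characteristic_sequence(D, N) * 2
    n = len(bits)
    c, b = [0] * n, [0] * n
    c[0] = b[0] = 1
    L, m = 0, -1
    for i in range(n):
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:                          # discrepancy: update the connection polynomial
            t = c[:]
            shift = i - m
            for j in range(n - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L

# (D, N): k and lambda both odd; both even; k even, lambda odd; k odd, lambda even.
EXAMPLES = [((1, 2, 4), 7), ((0, 3, 5, 6), 7), ((0, 1, 3, 9), 13), ((1, 3, 4, 5, 9), 11)]
for D, N in EXAMPLES:
    print(D, N, complexity_by_gcd(D, N), complexity_by_berlekamp_massey(D, N))
```

The planar set $\{0, 1, 3, 9\}$ in $Z_{13}$ ($k = 4$, $\lambda = 1$) falls under part one and gives $N - 1 = 12$, while the quadratic residues modulo 11 ($k = 5$, $\lambda = 2$) fall under part two and give $N = 11$.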
Set $n = k - \lambda$. The linear complexity of the DSC sequences is optimal for those with parameter $n$ odd. This also shows the cryptographic importance of the parameter $n$. For those DSC sequences with parameter $n$ even, the linear complexity seems hard to control. As an example, we consider the binary maximum-length sequences. Their characteristic sets form $(2^m - 1, 2^{m-1}, 2^{m-2})$ difference sets. For those difference sets we have $n = k - \lambda = 2^{m-2}$, which is even. When $n$ is even, the formulae for the linear complexity in Proposition 7.3.1 are not practical in general. But in some special cases they might be reduced into practical ones.

Planar difference sets are those with parameters $(N, k, \lambda)$ having $\lambda = 1$. If we can find planar difference sets with $k$ even, then we get sequences with maximum linear complexity. However, since $k \approx \sqrt{N}$, those sequences are fairly unbalanced. If the prime $p \neq 2$, the periodic characteristic sequences of those $(p^{2j} + p^j + 1, p^j + 1, 1)$ difference sets have linear complexity $N - 1$ and they are also fairly unbalanced.

Another family of difference sets is the Singer difference sets with parameters
$$N = \frac{q^{m+1} - 1}{q - 1}, \quad k = \frac{q^m - 1}{q - 1}, \quad \lambda = \frac{q^{m-1} - 1}{q - 1},$$
which exist whenever $q$ is a prime power and $m \ge 2$ [405], [15, pp. 99-104], [404, pp. 313-314]. Since
$$k - \lambda = q^{m-1}, \quad \lambda = 1 + q + \cdots + q^{m-2},$$
the linear complexity of the periodic characteristic sequences of these difference sets is at least $N - 1$ if $q$ is not a power of 2 (by Proposition 7.3.1 it is $N - 1$ when $\lambda$ is odd and $N$ when $\lambda$ is even). However, unfortunately we have $N/k \approx q$. This kind of unbalance may restrict the cryptographic application of these sequences.

A difference set which is composed of all the $m$th powers modulo some prime $N$, or of the $m$th powers and zero, is called an $m$th power residue difference set. Probably the cryptographically most important periodic characteristic sequences of difference sets are those of the quadratic residue difference sets.

Let $D$ be an $(N, k, \lambda)$ difference set of $Z_N$ (see Proposition 4.3.3). The polynomial
$$H(x) = x^{d_1} + x^{d_2} + \cdots + x^{d_k}$$
over the ring $Z_N$ is called the Hall polynomial of the difference set, the generating polynomial of the difference set or the difference set polynomial. In terms of this polynomial the difference set property is
$$H(x) H(x^{-1}) \equiv \sum_{i,j}^{k} x^{d_i - d_j} \equiv n + \lambda (1 + x + \cdots + x^{N-1}) \pmod{x^N - 1},$$
where $n = k - \lambda$. Let $s^\infty$ be the periodic characteristic sequence of the $(N, k, \lambda)$ difference set $D$, then
$$s^N(x) = s_0 + s_1 x + \cdots + s_{N-1} x^{N-1} = x^{d_1} + x^{d_2} + \cdots + x^{d_k},$$
where "+" denotes the modulo 2 addition. Thus, if we consider the Hall polynomial over $GF(2)$, then we have $s^N(x) = H(x)$. It is by employing the formula
$$s^N(x) s^N(x^{-1}) \equiv \sum_{i,j}^{k} x^{d_i - d_j} \equiv n + \lambda (1 + x + \cdots + x^{N-1}) \pmod{x^N - 1}$$
that the above general conclusions about the linear complexity of DSC sequences have been proved. However, with almost difference sets we do not have such a nice fact to employ. So it seems not easy to control the linear complexity by controlling the parity of $n$. However, we can control the linear complexity of ADSC sequences by employing the results of Chapter 3. It should be mentioned here that there are ADSC sequences which have optimal linear complexity. Examples are the characteristic sequences of quadratic residues modulo primes of the form $4t + 1$ (see Proposition 4.3.3).

Research Problem 7.3.2 Analyze the linear complexity of the ADSC sequences.
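The Hall polynomial identity of this section can be checked by direct counting. The sketch below computes the integer coefficients of $H(x)H(x^{-1})$ modulo $x^N - 1$; for a difference set they should be $k = n + \lambda$ at exponent 0 and $\lambda$ elsewhere, whereas for the quadratic residues modulo 13 (a prime of the form $4t + 1$) two distinct values appear at the nonzero exponents. The function name and the example sets are assumptions of this illustration.

```python
def hall_product(D, N):
    """Integer coefficients of H(x) * H(x^{-1}) modulo x^N - 1.

    coeffs[w] counts the ordered pairs (d_i, d_j) in D with d_i - d_j = w (mod N).
    """
    coeffs = [0] * N
    for di in D:
        for dj in D:
            coeffs[(di - dj) % N] += 1
    return coeffs

# A (7, 3, 1) difference set: n = k - lambda = 2.
singer = hall_product([1, 2, 4], 7)
print(singer)

# Quadratic residues modulo 13 = 4*3 + 1: not a difference set.
qr13 = sorted({(a * a) % 13 for a in range(1, 13)})
print(qr13, hall_product(qr13, 13))

# Reducing the coefficients modulo 2 gives the GF(2) form used in the proof.
print([c % 2 for c in singer])
```

Reducing the first coefficient list modulo 2 gives $(n \bmod 2) + (\lambda \bmod 2)(1 + x + \cdots + x^{N-1})$ with $n = 2$ and $\lambda = 1$, which is the form employed for DSC sequences.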
7.4 Barker Sequences
In some communication systems the value maxl
N = 2: 00
N = 3: 001
N = 4: 0001; 0010
N = 5: 00010
N = 7: 0001101
N = 11: 00011101101
N = 13: 0000011001010
together with the sequences which may be derived from them by the following transformations: $s_i' = (i + s_i) \bmod 2$; $s_i' = (i + 1 + s_i) \bmod 2$; $s_i' = (1 + s_i) \bmod 2$.
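The listed sequences and the three transformations can be checked with a short script. It assumes the standard definition of a Barker sequence, namely that the aperiodic autocorrelation $C(w) = \sum_{i=0}^{N-1-w} (-1)^{s_i + s_{i+w}}$ satisfies $|C(w)| \le 1$ for $1 \le w \le N - 1$; this definition and the function names are supplied here as assumptions and are not quoted from the text.

```python
BARKER = ["00", "001", "0001", "0010", "00010",
          "0001101", "00011101101", "0000011001010"]

def aperiodic_autocorrelation(bits, w):
    """C(w) = sum over i of (-1)^(s_i + s_{i+w}), without wrapping around."""
    return sum((-1) ** (bits[i] ^ bits[i + w]) for i in range(len(bits) - w))

def is_barker(bits):
    """Assumed definition: |C(w)| <= 1 for every shift 1 <= w <= N - 1."""
    return all(abs(aperiodic_autocorrelation(bits, w)) <= 1
               for w in range(1, len(bits)))

def transforms(bits):
    """The three transformations listed in the text."""
    return [
        [(i + b) % 2 for i, b in enumerate(bits)],       # s_i' = (i + s_i) mod 2
        [(i + 1 + b) % 2 for i, b in enumerate(bits)],   # s_i' = (i + 1 + s_i) mod 2
        [(1 + b) % 2 for b in bits],                     # s_i' = (1 + s_i) mod 2
    ]

for word in BARKER:
    bits = [int(ch) for ch in word]
    ok = is_barker(bits) and all(is_barker(t) for t in transforms(bits))
    print(len(bits), word, ok)
```

The transformations preserve the property because complementing every bit leaves $C(w)$ unchanged, and complementing every other bit multiplies $C(w)$ by $(-1)^w$.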
It is known that a binary sequence of period $N > 13$ is a Barker sequence if and only if it is the characteristic sequence of a $(4n^2, 2n^2 - n, n^2 - n)$ difference set of $Z_{4n^2}$ [15, p. 97]. Thus, to construct Barker sequences, we have to find difference sets of this type, which are called Menon difference sets [7]. It was long known that if any further Barker sequences exist they must have $n \ge 55$, i.e., $N = 4n^2 \ge 12{,}100$ [15, p. 97]. For the next twenty years little was achieved in the search for Menon difference sets of residue rings [7]. Then in 1992 Eliahou and Kervaire [144, p. 363] raised the bound on $n$ to $n \ge 689$, so $N \ge 1{,}898{,}884$.

Barker sequences are cryptographically interesting from two points of view. On the one hand, a Barker sequence of period $4n^2$ has maximum linear complexity $4n^2$ if $n$ is odd. This can be seen from Proposition 7.3.1 since $k - \lambda = n^2$ is odd. On the other hand, if we use the characteristic function of the corresponding Menon difference set as the cryptographic function for the natural sequence generator and use further this generator as the keystream generator for the binary additive stream cipher, then the stream cipher has optimal local (encryption and decryption) transformation density (see Chapter 16). For our cryptographic applications, we need to consider at least two things: the search for Menon difference sets of $Z_{4n^2}$ with large $n$; and the realization of the characteristic functions of them.
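The parameter arithmetic behind these statements is easy to reproduce. The sketch below evaluates the Menon parameters for a few values of $n$, checks the counting identity $k(k - 1) = \lambda(N - 1)$ that any $(N, k, \lambda)$ difference set must satisfy, and reports which part of Proposition 7.3.1 would apply. It does not construct any difference set, and the function names are hypothetical.

```python
def menon_parameters(n):
    """(N, k, lam) = (4n^2, 2n^2 - n, n^2 - n), as stated in the text."""
    return 4 * n * n, 2 * n * n - n, n * n - n

def parity_case(n):
    """Which part of Proposition 7.3.1 the parities of k and lam select."""
    _, k, lam = menon_parameters(n)
    if k % 2 == 0 and lam % 2 == 1:
        return "part 1: L = N - 1"
    if k % 2 == 1 and lam % 2 == 0:
        return "part 2: L = N"
    return "part 3 or 4: gcd formula needed"

for n in (1, 55, 689):
    N, k, lam = menon_parameters(n)
    assert k * (k - 1) == lam * (N - 1)   # necessary counting identity
    assert k - lam == n * n
    print(n, (N, k, lam), parity_case(n))
```

Since $\lambda = n(n - 1)$ is always even and $k = 2n^2 - n$ has the parity of $n$, odd $n$ always lands in part two, in agreement with the maximum linear complexity $4n^2$ claimed for odd $n$.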
Research Problem 7.4.1 Find Menon difference sets of $Z_{4n^2}$ for large $n$ if there are any.

The Barker sequences are also closely related to the so-called circulant Hadamard matrices. A matrix is said to be a circulant if each successive row is derived from the previous row by shifting it cyclically one position to the right. An example is the following:
$$H = \begin{pmatrix} +1 & -1 & +1 & +1 \\ +1 & +1 & -1 & +1 \\ +1 & +1 & +1 & -1 \\ -1 & +1 & +1 & +1 \end{pmatrix}.$$
If a matrix has entries $\pm 1$ and its rows are orthogonal, it is called a Hadamard matrix. The above $H$ is a Hadamard matrix and is the only known circulant Hadamard matrix. It is not hard to see that there is a one-to-one correspondence between Barker sequences of even length $N > 4$ and circulant Hadamard matrices. Thus, if there exists any further circulant Hadamard matrix, its order $N$ must satisfy $N \ge 1{,}898{,}884$. Whether there are further circulant Hadamard matrices remains a well-known open problem.
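The two definitions can be made concrete with a few lines of code. The sketch below builds a circulant matrix from its first row by repeated cyclic right shifts and tests the Hadamard property by checking that distinct rows are orthogonal; the first row $(+1, -1, +1, +1)$ is read off the example matrix $H$, and the function names are illustrative.

```python
def circulant(first_row):
    """Row r is the first row shifted cyclically r positions to the right."""
    n = len(first_row)
    return [[first_row[(c - r) % n] for c in range(n)] for r in range(n)]

def is_hadamard(M):
    """Entries must be +1 or -1 and distinct rows must be orthogonal."""
    n = len(M)
    if any(entry not in (1, -1) for row in M for entry in row):
        return False
    return all(sum(a * b for a, b in zip(M[i], M[j])) == 0
               for i in range(n) for j in range(i + 1, n))

H = circulant([1, -1, 1, 1])
for row in H:
    print(row)
print(is_hadamard(H))
```

Replacing the first row by, say, $(+1, +1, +1, +1)$ still gives a circulant matrix, but its rows are no longer orthogonal, so the test fails.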
Research Problem 7.4.2 Investigate whether there are circulant Hadamard matrices of order $N \ge 1{,}898{,}884$.