3 July 1989
Information Processing Letters 32 (1989) 15-16 North-Holland
ON-LINE COMPUTATION
OF CONVOLUTIONS
Jill MATOUSEK Department
of Computer Science, Charles Uniuersity. Malostransk6
nrim. 25, II8 00 Praha I, Czechoslooakia
Communicated by W.M. Turski Received 28 December 1988 Revised 21 February 1989
Keywords:
Convolution, Fast Fourier Transform, counting combinatorial objects
Let u=(uO, a,, u*...) and b=(b,, b,, 6, ,...) be two sequences of numbers (or, more generally, of elements of a suitable commutative ring). Their convolution c = (c,,, c,, c2,. . .) is defined by ck = Using a Fast Fourier u,b, + u,b,_, + . -. +u,b,. Transform algorithm, we can evaluate cc,, c,, . . . , c,_, from uO, u ,,..., a,_, and b,, b ,,..., b,,_, by O(n log n) operations (see, e.g., [1,3]). However, sometimes an on-line computation of the convolution is needed; namely, we want to know each c,_, before a, and b,, are input. This problem arises e.g. when the coefficients of a generating function are computed from a functional equation for that function. Numerous examples of this situation are provided by the enumeration of combinatorial structures (see [2]). Here we present an algorithm satisfying the on-line restriction, which for every n computes the first n terms of the sequence c in time O(n log2n), using O(n) storage. Let us introduce some notation. When r Q t are integers and ~1= (u,, ui,. . .) is an infinite sequence or a finite sequence with at least t + 1 terms, we denote by u[r, t] the finite sequence j= u = (u,, ui, . . . , u,_,), where u, = u,+,, 0, l,..., t-r. If u=(t(,, ,..., u,_i) and u= u,_i) are finite sequences, we denote by (Q,..., u * u the sequence w=(wO ,..., w,+,_~), where w, is the ith term of the convolution of the sequences (u,, ,..., u,_,, 0, 0 ,...) and (u,, ,..., u,_,, 0, 0,. . .). We assume that u * u can be evaluated 0020-0190/89/$3.50
in time O((r + t)log(r + t)) (without loss of generality, we may restrict this assumption to the case r = t = 2q, 4 an integer). First we shall handle the case when the on-line restriction must be respected only for the sequence b, while the terms of the sequence a can be accessed freely. In fact, we shall see that during the computation which evaluates the terms c,_, (and also precomputes some data that co,..., will help to evaluate further terms of the sequence c), we access only the terms of a with index less than M, where M is the first power of 2 exceeding n. Every integer n can be uniquely represented in the form n = 2’1 + 2’1 + . . . + 2jp, where i, > i, > . . . >i. p, let B(n) denote the set (i,, . . . , ip}. Let us call the intervals (on integers) of the = [2k2q, (2k + 1)24 - l] (q, k integers) form 1q.k the canonical intervals (in analogy with the interval trees, segment trees etc. used e.g. in computational geometry, see [4]). Let B(n) = (i,, . . . , iP}, i, > i, > . . . > i,. The interval [0, n - l] is partitioned into canonical intervals as follows [o, n - l] = [o, 2’1 - l] u [2’1. 2’1 + 2’2 - l] u . . . u[2’1+
. . . +pp-l
, 2’1 + . . . + 2’p-’ + 2’p - l]
(the canonical decomposition of [0, n - 11). The main observation about the canonical decomposition we need is the following: the canoni-
0 1989, Elsevier Science Publishers B.V. (North-Holland)
15
Volume 32, Number 1
INFORMATION
cal interval Z4 Ir occurs in the canonical position of intervals [0, n - l] just for
PROCESSING
decom-
n E [(2k + 1)24, (2k + 2)2q - 11. In the computation of the convolution, we shall break the sum forming c,_, into parts corresponding to the intervals of the canonical decomposition of (0, n - 11: Let rq.k,s denote the sum of the terms ajb, with i, j > 0, i + j = s and i E I+. Then c,_, is the sum of tq.k.n_,, where the pair (q, k) runs over all the values such that Zq,k occurs in the canonical decomposition of [0, n - 11. The idea of the algorithm is that as soon as we know all the terms bj for i E Zq,kr we may precompute the expressions tq.k.s for all s for which they will be needed, i.e. for s E [(2k + 1)24 - 1, (2k + 2)24-
21.
These expressions are computed by a fast convolution algorithm, as the convolution of b[2k2q, (2k + 1)24 - l] with an appropriate portion of the sequence u. One easily verifies that the portion of the sequence a which must be taken into account is a[O, 2. 2’ - 21. Then it suffices to keep only the middle 2q terms of this convolution, which are just the expressions tq,k,l we need. Since the ranges in which we use two canonical intervals with the same q never overlap, for given q and for all k we may use a single array Tq with index range 0..2q - 1 for storing tq.k.s_ We shall present a formal description of the n th step of the algorithm, This step is executed for n=l,2 ,**-7. at the ith step the term c,_t is output. The execution of steps 1,. . . , IV assumes the accessibility of a,,..., aM, where M is the first power of 2 exceeding N. The following arrays are used: an array holding the elements b,, . . . , b,“_,, auxiliaj arrays Tq for all q > 0 with 2q < N ( Tq with index range 0..2q - l), and temporary arrays needed for the fast convolution algorithm. Step n: input
6, _ ,; q := max{ i : 2’ divides n } Tq := (a[O, 2 - 24 - 21 * b[n - 24, n - 1])[2q- 1, 2 - 2q - 21 output c,_i = C,, sc,,Tq[n mod 2q].
In an actual implementation, the arrays Tq can be combined into a single array R, with the corre16
LETTERS
3 July 1989
spondence Tq[i] = R[Zq + i]; the index range of R is 1.. M, where M has been defined above. About half of the additional storage can be saved if the partial convolutions are summed together as soon as they are formed; we omit the details here. When the algorithm is used for the computation of n terms of the sequence c, it computes one convolution of a 2q-term sequence with a (2 - 24 1)-term sequence for every canonical interval Zq,k contained in [0, n - 11. Since the total length of canonical intervals contained in [0, n - l] is 0( n X log n), these computations take time O(n log2n) as claimed (the other operations take less time). Now let us return to the original problem, when both the sequences a, b must be accessed on-line. In this case, we proceed in stages 0, 1, 2,.. .; at stage p we compute the terms czP, czP+,.. .., C2P’_,. At the beginning of stage p, we compute the convolution u = a[O, 2J’] * b[O, 2P]. This already yields the term cl,.. Then we use the above algorithm to compute the first 2p - 1 terms of the convolution u of a (known) sequence (a,, . . . , uzp_,) with an (on-line) sequence (bZPcl, b2P+2,...) and the same algorithm for computing the first 2P - 1 terms of the convolution w of a (known) sequence (b,, . . . , b,,_ ,) with an (on-line) sequence (azP+,, uzpc2,...) (the steps of these two computations are executed alternatively). Each term czP+i (i=l, 2,...,2P1) is then obtained as the sum of u2P+i and of the ith terms of u and w. The computation can be of course reorganized in many different ways, depending also on the fast convolution algorithm we have at our disposal. A challenging problem is whether one could compute the on-line convolution asymptotically faster.
References [1] A.V. Aho, J.E. Hopcroft and J.D. Pullman. The Design and Analysis of Computer Algorithms (Addison-Wesley, Reading. MA. 1983). [2] F. Harrary and E. Palmer, Graphical Enumeration (Academic Press, London, 1973). [3] H.J. Nussbaumer. Fast Fourier Transform and Convolution Algorithms (Springer, Berlin, 1982). [4] F.P. Preparata, I.M. Shamos, Computational Geometry An Introduction (Springer, Berlin, 1985).