Volume 4, number 6    INFORMATION PROCESSING LETTERS    March 1976
Apparently 6 multiplications are needed to compute the vector product as a set of bilinear forms, but there are actually several ways to do it in only 5. One such algorithm is implied by the identities

x2 y3 − x3 y2 = p1 − p2,
x3 y1 − x1 y3 = p5 − p2 − p3,
x1 y2 − x2 y1 = p5 − p2 − p4,

where

p1 = x2 y3,
p2 = x3 y2,
p3 = x1 (y1 + y2 + y3),
p4 = (x1 + x2 + x3) y1,
p5 = (x1 + x3)(y1 + y2).
One should observe that commutativity need not be assumed. The number of additions and subtractions in the "obvious" algorithm has increased from 3 to 11, and on all modern computing systems this will indeed not be desirable for even the slightest gain in computing time (on the contrary). There is, however, a different reason why we are presenting it. We shall prove that no matter what algorithm is used to compute vector-products, it will always require at least 5 non-scalar multiplications in any reasonable model of computation. Results of this type are shown most easily for non-commutative variables, and become much harder in the commutative case. To indicate what the difference is, we shall briefly consider the non-commutative case first.
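A 5-multiplication scheme of this kind is easily checked mechanically. The sketch below verifies one concrete choice of products (a reconstruction consistent with the text — 11 additions/subtractions, no use of commutativity — but not necessarily the paper's own identities):

```python
from itertools import product

# One concrete 5-multiplication scheme for the vector (cross) product.
# The products p1..p5 below are an illustrative reconstruction, not
# necessarily the paper's own choice.  Note every product has its
# x-factor on the left, so commutativity is never used.
def cross_5mult(x, y):
    x1, x2, x3 = x
    y1, y2, y3 = y
    p1 = x2 * y3                 # multiplication 1
    p2 = x3 * y2                 # multiplication 2
    p3 = x1 * (y1 + y2 + y3)     # multiplication 3
    p4 = (x1 + x2 + x3) * y1     # multiplication 4
    p5 = (x1 + x3) * (y1 + y2)   # multiplication 5
    return (p1 - p2, p5 - p2 - p3, p5 - p2 - p4)

# The defining 6-multiplication formula, for comparison.
def cross_direct(x, y):
    x1, x2, x3 = x
    y1, y2, y3 = y
    return (x2*y3 - x3*y2, x3*y1 - x1*y3, x1*y2 - x2*y1)
```

An exhaustive check over small integer vectors confirms that the two computations agree identically.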
It will follow that computing vector-products or Lie-products of 2 × 2 matrices in 5 multiplications is optimal from the algebraic point of view, in both the commutative and the non-commutative case.
In the study of optimal computations for bilinear problems over a field k it is customary to let "straight-line programs" be the most general type of algorithm that one would have to consider. To compute the cross-product of x = (x1, x2, x3) and y = (y1, y2, y3), each step in such a program is either an "input-step", requesting a particular element of

{x1, x2, x3} ∪ {y1, y2, y3},

or a "function-step" computing the sum, difference, or product of the results of two preceding steps (usually written in explicit algebraic form). A more general exposition of this kind of algorithmization and the adequacy of the model in complexity theory is given in ref. [1, ch. 12] and in ref. [2].

To see how the argument could proceed in the non-commutative case we shall follow the type of approach of ref. [3] first. The cross-product can be considered as the task to compute x^T B1 y, x^T B2 y, and x^T B3 y, where we use the matrices

B1 = [ 0  0  0 ]   B2 = [ 0  0 -1 ]   B3 = [ 0  1  0 ]
     [ 0  0  1 ]        [ 0  0  0 ]        [-1  0  0 ]
     [ 0 -1  0 ]        [ 1  0  0 ]        [ 0  0  0 ]

(which happen to be a basis for the 3 × 3 skew-symmetric matrices). Let us now develop a helpful result that is of independent interest also. It is well known that an arbitrary set of bilinear forms x^T B1 y, ..., x^T Bp y can be evaluated in p multiplications if and only if there is a set of rank 1 matrices D1, ..., Dp from which each Bi can be obtained by linear combination (see also refs. [8,9]). Assume that the Bi's are linearly independent.

Lemma 2.1. One can evaluate x^T B1 y, ..., x^T Bn y in n + r multiplications if and only if there are rank 1 matrices D1, ..., Dr, a non-singular n × n matrix E = [e_ij], and an n × r matrix F = [f_ij] such that

C_i = Σ_{k=1}^{n} e_ik B_k − Σ_{j=1}^{r} f_ij D_j

has rank 1 for each i (1 ≤ i ≤ n).

Proof. If x^T B1 y, ..., x^T Bn y can be evaluated in n + r multiplications, then there must be rank 1 matrices D1, ..., D_{n+r} such that for each i (1 ≤ i ≤ n)

B_i = Σ_{j=1}^{n+r} γ_ij D_j,

where G = [γ_ij] has rank n. Assume (as we may) that the first n columns of G are linearly independent, and let the inverse of this submatrix be E = [e_ij]. Then it follows that for each i (1 ≤ i ≤ n)

Σ_{k=1}^{n} e_ik B_k = D_i + Σ_{j=1}^{r} ( Σ_{k=1}^{n} e_ik γ_{k,n+j} ) D_{n+j},

and the lemma holds for D_{n+1}, ..., D_{n+r} and F = [f_ij] with f_ij = Σ_{k=1}^{n} e_ik γ_{k,n+j}. Conversely, if for each i the matrix
C_i = Σ_{k=1}^{n} e_ik B_k − Σ_{j=1}^{r} f_ij D_j is of rank 1, then it follows from the invertibility of E that each B_i may be written as a linear combination of C_1, ..., C_n, D_1, ..., D_r, and x^T B1 y, ..., x^T Bn y can be evaluated in n + r multiplications.

The algorithm for computing vector-products in 5 multiplications essentially follows, because we can decompose the matrices B1, B2, and B3 (given above) as

B1 = D1 + D2,   B2 = D2 + D3 + D5,   B3 = D2 + D4 + D5,
for the rank 1 matrices

D1 = [ 0  0  0 ]   D2 = [ 0  0  0 ]   D3 = [-1 -1 -1 ]
     [ 0  0  1 ]        [ 0  0  0 ]        [ 0  0  0 ]
     [ 0  0  0 ]        [ 0 -1  0 ]        [ 0  0  0 ]

D4 = [-1  0  0 ]   D5 = [ 1  1  0 ]
     [-1  0  0 ]        [ 0  0  0 ]
     [-1  0  0 ]        [ 1  1  0 ].
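Such a decomposition can be verified mechanically. The sketch below uses exact rational arithmetic and one valid choice of rank 1 matrices D1..D5 (an assumption where the original display is illegible; any choice satisfying the decomposition serves):

```python
from fractions import Fraction

# Exact matrix rank by Gaussian elimination over the rationals.
def rank(M):
    A = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(3):
        piv = next((i for i in range(r, 3) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        pv = A[r][c]
        A[r] = [v / pv for v in A[r]]           # normalize pivot row
        for i in range(3):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f*b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def add(*Ms):
    # entrywise sum of 3x3 matrices
    return [[sum(M[i][j] for M in Ms) for j in range(3)] for i in range(3)]

B1 = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]
B2 = [[0, 0, -1], [0, 0, 0], [1, 0, 0]]
B3 = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]

# One valid set of rank 1 matrices (illustrative reconstruction):
D1 = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
D2 = [[0, 0, 0], [0, 0, 0], [0, -1, 0]]
D3 = [[-1, -1, -1], [0, 0, 0], [0, 0, 0]]
D4 = [[-1, 0, 0], [-1, 0, 0], [-1, 0, 0]]
D5 = [[1, 1, 0], [0, 0, 0], [1, 1, 0]]
```

Each D has rank 1 while each B has rank 2, and the three decompositions hold entrywise.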
To show that one cannot improve on this, lemma 2.1 may be applied in the following way. Suppose that x^T B1 y, x^T B2 y, and x^T B3 y can be evaluated in 4 multiplications. (It follows from more elementary arguments that 3 multiplications are not sufficient, but that case is easily included here by default.) By the lemma there must be a single rank 1 matrix D and an invertible 3 × 3 matrix E = [e_ij] such that for all i (1 ≤ i ≤ 3)

Σ_{j=1}^{3} e_ij B_j + D

has rank 1. (We use here the fact that no combination of the B's alone can have rank 1, and therefore we may assume that F = [−1 −1 −1]^T.) A combination αB1 + βB2 + γB3 + D has rank 1 if and only if each 2 × 2 submatrix has determinant 0. Writing

D = (u v w)^T (1 y z)

(for appropriate u, v, w, y, and z, as we may), the resulting equations can easily be written down, and appear to have a unique solution in α, β, and γ (which means that E has rank 1). This contradicts the invertibility of E.
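The fact used above — that no non-zero combination of B1, B2, B3 can have rank 1 — can be checked directly: such a combination is the skew-symmetric matrix below, and three of its 2 × 2 minors are squares of the coefficients. A small exhaustive sketch:

```python
from itertools import product

# a*B1 + b*B2 + c*B3 written out explicitly (skew-symmetric).
def combo(a, b, c):
    return [[0, c, -b], [-c, 0, a], [b, -a, 0]]

# rank <= 1 is equivalent to all 2x2 minors vanishing.
def all_2x2_minors_zero(M):
    idx = (0, 1, 2)
    return all(M[r1][c1]*M[r2][c2] - M[r1][c2]*M[r2][c1] == 0
               for r1 in idx for r2 in idx for c1 in idx for c2 in idx
               if r1 < r2 and c1 < c2)
```

The leading 2 × 2 minor equals c², and the other two principal minors equal a² and b², so every non-zero combination has rank 2 (over integer coefficients this is checked exhaustively below).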
3. The main result

Although we shall follow the same pattern as before, the argument becomes harder in case we permit commutativity of variables, largely because there is no simple characterization in terms of rank 1 matrices anymore. (Instead, one is forced to use symmetrized versions, which are not as easy to handle; see refs. [8,9].) To prove that 5 multiplications are required for vector-products in the commutative case too seems intractable using the other elegant algebraic techniques which are presently available (e.g. in refs. [5,10,11]), and a different development is needed. The main idea in the following proof is patterned after a known construction, but additional details are needed here to see exactly how it can be applied. A related construction appears in ref. [4], but with little further analogy.

Theorem 3.1. Computing vector-products in 5 multiplications is optimal.

Proof. We have seen that 5 multiplications suffice. Suppose there is a straight-line program that would do it in less than 5, and look at the first product that is formed. Each operand must be linear in the x's and the y's, and may be assumed to be homogeneous (scalar multiplications can easily be accommodated afterwards). Without loss of generality we can assume that in one of the operands an x_i occurs (since the argument would be similar if we started with a y_i), and it is also no restriction to assume that it is x3 (for symmetry reasons). Then there must be a substitution
x3 = λ1 x1 + λ2 x2 + L(y),

with L(y) linear homogeneous in the y's, which will make the first product in the program always 0. Consistently substituting for and eliminating x3, we therefore arrive at a straight-line program (verify that it is one) computing the "bilinear" forms

x2 y3 − (λ1 x1 + λ2 x2 + L(y)) y2,
(λ1 x1 + λ2 x2 + L(y)) y1 − x1 y3,
x1 y2 − x2 y1,
or, in a more clarifying notation (and using commutativity),

[ −λ1 y2        y3 − λ2 y2 ]   [ x1 ]
[ λ1 y1 − y3    λ2 y1      ] · [ x2 ]  + f(y),
[ y2            −y1        ]
where f(y) is a vector with purely homogeneous, at most second-order coordinates in the y's, in at most 3 multiplications. Suppose that the new task can be computed in only p multiplications e1, ..., ep (1 ≤ p ≤ 3). The forms needed are then obtained as a linear combination of these products, together with some additional terms linear in the x's and the y's: the three forms above equal E (e1, ..., ep)^T modulo such linear terms, for some 3 × p matrix E = [e_ij].
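The elimination step can be checked numerically. In the sketch below the parameters l1, l2 and the coefficients of L(y) are illustrative choices, not values fixed by the text:

```python
# Check that substituting x3 = l1*x1 + l2*x2 + L(y) into the cross
# product yields a task that is bilinear in (x1, x2) and y, plus a
# vector f(y) depending on y alone.  l1, l2 and c (the coefficients of
# L(y) = c1*y1 + c2*y2 + c3*y3) are hypothetical parameters.
def reduced_forms(x1, x2, y, l1, l2, c):
    y1, y2, y3 = y
    L = c[0]*y1 + c[1]*y2 + c[2]*y3
    x3 = l1*x1 + l2*x2 + L            # the substitution
    # cross product with x3 eliminated
    direct = (x2*y3 - x3*y2, x3*y1 - x1*y3, x1*y2 - x2*y1)
    # matrix form M(y) * (x1, x2)^T + f(y) from the display above
    M = ((-l1*y2, y3 - l2*y2),
         (l1*y1 - y3, l2*y1),
         (y2, -y1))
    f = (-L*y2, L*y1, 0)
    via_matrix = tuple(M[i][0]*x1 + M[i][1]*x2 + f[i] for i in range(3))
    return direct, via_matrix
```

Both ways of writing the reduced task agree exactly for integer inputs.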
The matrix E must have row-rank (and thus column-rank) 3, since otherwise we could obtain one of the bilinear forms on the left to be 0 modulo expressions in separated variables. So p = 3 and E must be non-singular. Denoting the inverse of E by T = [t_ij],
we get

e_i = (−t_i1 λ1 y2 + t_i2 (λ1 y1 − y3) + t_i3 y2) x1 + (t_i1 (y3 − λ2 y2) + t_i2 λ2 y1 − t_i3 y1) x2 + (homogeneous) + (...),

where (homogeneous) and (...) denote homogeneous and linear or non-linear expressions in the appropriate type of variables, respectively. Since the expressions we are computing contain no terms that are purely second order in the x's, nor terms of more than second order in the x's and/or y's, we conclude that e_i must be of the form

e_i = ((...) x1 + (...) x2 + (homogeneous)) ((...) y1 + (...) y2 + (...) y3 + (homogeneous)),

and the part that is homogeneous and of higher order in the y's can be obtained from the e's in a purely "scalar" fashion, using no multiplications. Omitting the higher order terms in the y's from the bilinear part, it follows that the bilinear part of e_i can be obtained using a single multiplication of the form ((...) x1 + (...) x2)((...) y1 + (...) y2 + (...) y3), to which we may therefore apply Winograd's "column rank" criterion ([11], see also ref. [1, thm. 12.2]): the 1 × 2 matrix

[ −t_i1 λ1 y2 + t_i2 (λ1 y1 − y3) + t_i3 y2    t_i1 (y3 − λ2 y2) + t_i2 λ2 y1 − t_i3 y1 ]

of this bilinear task must have column rank 1. We claim that these matrices can have column rank 1 only if for all i

t_i1 = α t_i2 + β t_i3   (*)

for fixed α and β.
First, if t_i3 = 0, then the matrix reduces to the form

[ λ1 (t_i2 y1 − t_i1 y2) − t_i2 y3    λ2 (t_i2 y1 − t_i1 y2) + t_i1 y3 ],

and in order for it to have column rank 1 it is necessary that either t_i1 = α t_i2 or t_i1 = t_i2 = 0 (and (*) is fulfilled). On the other hand, if t_i3 ≠ 0, then the resulting matrices can have column rank 1 only when the coefficients satisfy relations which again amount to t_i1 = α t_i2 + β t_i3, for the same fixed α and β determined by λ1 and λ2 alone. Thus in both cases (*) holds. The consequence of (*) is that the columns of T are linearly dependent, contrary to T being non-singular. This proves the theorem.

It is interesting to contrast the construction with the argument outlined in section 2. Here we had to analyse the multiplications explicitly before we could draw any conclusions about the factors in the commutative case. The method clearly generalizes and may be useful for further applications.
4. A remark on products of two purely vectorial quaternions

Let x = x1 i + x2 j + x3 k and y = y1 i + y2 j + y3 k be purely vectorial quaternions. The product of two such quaternions is

x y = −(x1 y1 + x2 y2 + x3 y3) + (x2 y3 − x3 y2) i + (x3 y1 − x1 y3) j + (x1 y2 − x2 y1) k,

i.e., one can write x y in terms of the inner- and outer-product of x and y. Winograd [11] showed that 3 multiplications are minimal for inner-products, and it would seem that to compute the forms simultaneously 8 multiplications are needed: 3 for the inner-, and 5 for the outer-product. Howell and Lafon [6] gave an argument that at least 6 multiplications are required, and it is not hard to modify their argument so that it holds in the commutative case as well. The following straight-line instructions prove that the lower-bound can be attained, and that one may compute the inner- and outer-product of x = (x1, x2, x3) and y = (y1, y2, y3) in no more than 6 multiplications. Determine

I = x1 y1,
II = x2 y2,
III = x3 y3,
IV = (x2 + x3)(y2 − y3),
V = (x1 + x2)(y1 − y2),
VI = (x3 + x1)(y3 − y1),

and compute

x2 y3 − x3 y2 = II − III − IV,
x3 y1 − x1 y3 = III − I − VI,
x1 y2 − x2 y1 = I − II − V,

and

x1 y1 + x2 y2 + x3 y3 = I + II + III.

Note that we consider algorithms over a real field only, and one may well want to study what can happen if we change the field of scalars. (This was recently investigated and used more systematically by Winograd [13].) As an interesting example we note here that over the complex numbers one can multiply two vectorial quaternions in only 5 multiplications:

I = (x1 − x2 i/√2)(y1 + y2 i/√2),
II = (x1 + x2 i/√2)(y1 − y2 i/√2),
III = (x2/√2 − x3 i)(y2/√2 + y3 i),
IV = (x2/√2 + x3 i)(y2/√2 − y3 i),
V = (x1 − x3)(y1 + y3).

Then one can write

x1 y2 − x2 y1 = (I − II)/(i √2),
x2 y3 − x3 y2 = (III − IV)/(i √2),
x3 y1 − x1 y3 = (I + II − III − IV − 2V)/2,
x1 y1 + x2 y2 + x3 y3 = (I + II + III + IV)/2.

The lower-bound of 5 multiplications (over the complex numbers) can again be shown.
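Both schemes of this section can be tested numerically. The product definitions in the sketch below are reconstructions matching the stated multiplication counts (the real 6-multiplication scheme in particular is one valid choice, not necessarily the paper's own):

```python
# Inner- and outer-product of two vectorial quaternions.

def inner_outer_6mult(x, y):
    # 6 real multiplications (one valid scheme, an assumption)
    x1, x2, x3 = x
    y1, y2, y3 = y
    I, II, III = x1*y1, x2*y2, x3*y3
    IV = (x2 + x3) * (y2 - y3)
    V = (x1 + x2) * (y1 - y2)
    VI = (x3 + x1) * (y3 - y1)
    inner = I + II + III
    outer = (II - III - IV, III - I - VI, I - II - V)
    return inner, outer

def inner_outer_complex_5mult(x, y):
    # the same task over the complex numbers in only 5 multiplications
    r2 = 2 ** 0.5
    x1, x2, x3 = x
    y1, y2, y3 = y
    I   = (x1 - x2*1j/r2) * (y1 + y2*1j/r2)
    II  = (x1 + x2*1j/r2) * (y1 - y2*1j/r2)
    III = (x2/r2 - x3*1j) * (y2/r2 + y3*1j)
    IV  = (x2/r2 + x3*1j) * (y2/r2 - y3*1j)
    V   = (x1 - x3) * (y1 + y3)
    inner = (I + II + III + IV) / 2
    outer = ((III - IV) / (1j*r2),
             (I + II - III - IV - 2*V) / 2,
             (I - II) / (1j*r2))
    return inner, outer
```

The real scheme is exact over the integers; the complex scheme produces the same (real) values up to floating-point round-off.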
Acknowledgement

We thank Peter van Emde Boas and the referee for some helpful comments.
References

[1] A.V. Aho, J.E. Hopcroft and J.D. Ullman, The design and analysis of computer algorithms (Addison-Wesley, Reading, Massachusetts, 1974).
[2] A. Borodin and I. Munro, The computational complexity of algebraic and numeric problems, Theory of Comp. Series 1 (American Elsevier Publ. Company, New York, 1975).
[3] R. Brockett and D. Dobkin, On the optimal evaluation of a set of bilinear forms, Proc. 5th Ann. ACM Symp. on Theory of Computing, Austin, Texas (1973), pp. 88–95.
[4] H.F. de Groote, On the complexity of quaternion multiplication, Inform. Proc. Lett. 3 (1975) 177–179.
[5] C.M. Fiduccia, On obtaining upper-bounds on the complexity of matrix-multiplication, in: R.E. Miller and J.W. Thatcher (eds.), Complexity of computer computations (Plenum Press, New York, 1972).
[6] T.D. Howell and J-C. Lafon, The complexity of the quaternion product, Techn. Rep. TR 75-245, Dept. of Computer Science, Cornell Univ., Ithaca, New York (1975).
[7] A.G. Kurosh, Lectures on general algebra (Chelsea Publ. Company, New York, 1963).
[8] J-C. Lafon, Optimum computation of p bilinear forms, Lin. Alg. and its Appl. 10 (1975) 225–240.
[9] V. Strassen, Vermeidung von Divisionen, Crelle's J. Reine Angew. Math. 264 (1973) 184–202.
[10] J. van Leeuwen and P. van Emde Boas, Elementary proofs of lower bounds in complexity theory, Report IW 41/75 (Mathematisch Centrum, Amsterdam, 1975).
[11] S. Winograd, On the number of multiplications needed to compute certain functions, Comm. on Pure and Appl. Math. 23 (1970) 165–179.
[12] S. Winograd, On the multiplication of 2 × 2 matrices, Lin. Alg. and its Appl. 4 (1971) 381–388.
[13] S. Winograd, The effect of the field of constants on the number of multiplications, Proc. 16th Ann. Symp. on Found. of Computer Sc., Berkeley, California (1975), pp. 1–2.