Information Processing Letters 56 (1995) 329-335

On the additive complexity of 2 × 2 matrix multiplication

Nader H. Bshouty ¹
Department of Computer Science, University of Calgary, Calgary, Alberta, Canada

Received 3 February 1995; revised 22 August 1995
Communicated by D. Gries
Abstract

Probert proved that 15 additive operations are necessary and sufficient to multiply two 2 × 2 matrices over the binary field by a bilinear algorithm using 7 nonscalar multiplications. We prove this result for an arbitrary field.

Keywords: Algorithms; Computational complexity
1. Introduction

Let F be a field, let x = (x_1, ..., x_n)^T, y = (y_1, ..., y_m)^T be column vectors of indeterminates, and let G = {G_1, ..., G_t} be a set of n × m matrices with entries from F. A bilinear algorithm that computes the bilinear forms x^T G y = {x^T G_1 y, ..., x^T G_t y} with multiplicative complexity μ (or μ nonscalar multiplications) is a triple of matrices (A, B, C) such that

    (x^T G_1 y, ..., x^T G_t y)^T = A^T (Bx * Cy),    (1)

where A, B and C are μ × t, μ × n and μ × m matrices with entries from the field F and * is the componentwise product of vectors. In other words, we think of a computation of x^T G y according to a bilinear algorithm as consisting of four disjoint stages:
(i) Compute Bx = (m_1, ..., m_μ)^T.
(ii) Compute Cy = (m'_1, ..., m'_μ)^T.
(iii) Compute M = Bx * Cy = (m_1 m'_1, ..., m_μ m'_μ)^T.
(iv) Compute (x^T G_1 y, ..., x^T G_t y)^T = A^T M.
Therefore the multiplicative complexity of the bilinear algorithm is μ and the additive complexity is the number of additions needed in steps (i), (ii) and (iv).

¹ Email: [email protected].
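To make stages (i)-(iv) concrete, here is a short Python sketch of a generic evaluator (an illustration added here, not part of the paper). The example triple encodes Gauss's classical 3-multiplication algorithm for complex multiplication, which is likewise only an illustration:

```python
# Sketch: evaluating a bilinear algorithm (A, B, C) by the four stages
# (i)-(iv) above. The example triple is Gauss's 3-multiplication
# algorithm for complex multiplication (t = 2 forms, mu = 3).

def mat_vec(M, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def bilinear(A, B, C, x, y):
    m = mat_vec(B, x)                      # stage (i):   Bx
    mp = mat_vec(C, y)                     # stage (ii):  Cy
    s = [a * b for a, b in zip(m, mp)]     # stage (iii): Bx * Cy
    AT = [list(r) for r in zip(*A)]        # A^T
    return mat_vec(AT, s)                  # stage (iv):  A^T M

# (x1 + i x2)(y1 + i y2): the forms x1 y1 - x2 y2 and x1 y2 + x2 y1.
B = [[1, 1], [1, 0], [0, 1]]               # mu x n = 3 x 2
C = [[1, 0], [-1, 1], [1, 1]]              # mu x m = 3 x 2
A = [[1, 1], [0, 1], [-1, 0]]              # mu x t = 3 x 2

print(bilinear(A, B, C, [2, 3], [4, 5]))   # (2+3i)(4+5i) = -7 + 22i
```

Here the additive complexity is δ(A^T) + δ(B) + δ(C) = 2 + 1 + 2; the multiplicative complexity is the length 3 of the componentwise product.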
The minimal number of nonscalar multiplications needed to compute x^T G y is denoted by μ(G) or μ(x^T G y), and the minimal number of additive operations needed to compute Bx is denoted by δ(B). Therefore the additive complexity of the bilinear algorithm in (1) is δ(A^T) + δ(B) + δ(C). We prove the following theorem.

Theorem 1. 15 additive operations are necessary and sufficient to multiply two 2 × 2 matrices by a bilinear algorithm that uses 7 nonscalar multiplications.
2. 2 × 2 Matrices

Let F be a field and let X = (x_{i,j}) and Y = (y_{i,j}) be 2 × 2 matrices of indeterminates. Let

    vec(X) = (x_{1,1}, x_{2,1}, x_{1,2}, x_{2,2})^T.

Then a bilinear algorithm that computes XY with 7 nonscalar multiplications is a triple (A, B, C) with

    vec(XY) = A^T (B vec(X) * C vec(Y)),    (2)

where A, B and C are 7 × 4 matrices. It is known [4] that if (A, B, C) is an algorithm for vec(XY), then so are the dual algorithms

    (B, A, CW)    (3)

and

    (C, BW, A),    (4)

where W is the permutation matrix defined by W vec(Z) = vec(Z^T).
Suppose we prove the lower bounds δ(A) ≥ 4, δ(B) ≥ 4 and δ(C) ≥ 4. Then by the results in [5] we have δ(A^T) = δ(A) + 7 − 4 = δ(A) + 3 ≥ 7, so

    δ(A^T) + δ(B) + δ(C) ≥ 15.    (5)

This implies the following.

Lemma 2. If for every bilinear algorithm (A, B, C) for vec(XY) with multiplicative complexity 7 we have δ(A) ≥ 4, δ(B) ≥ 4 and δ(C) ≥ 4, then the additive complexity of bilinear algorithms for vec(XY) with multiplicative complexity 7 is δ(A^T) + δ(B) + δ(C) ≥ 15.
We now give the algorithm of Winograd, which computes the product of 2 × 2 matrices with 7 nonscalar multiplications and 15 additive operations:

    vec(XY) = A^T (B vec(X) * C vec(Y)),    (6)

where

    A^T = ( 0 1 1 0 0 0 0 )
          ( 1 1 0 1 0 0 1 )
          ( 1 1 0 0 1 1 0 )
          ( 1 1 0 1 1 0 0 ),

    B = ( -1  1  0  1 )        C = (  1  0 -1  1 )
        (  1  0  0  0 )            (  1  0  0  0 )
        (  0  0  1  0 )            (  0  1  0  0 )
        (  1 -1  0  0 ),           (  0  0 -1  1 ).
        (  0  1  0  1 )            ( -1  0  1  0 )
        (  1 -1  1 -1 )            (  0  0  0  1 )
        (  0  0  0  1 )            ( -1  1  1 -1 )
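The matrices in (6) can be checked mechanically. The following Python sketch (an addition, not part of the original paper) evaluates (6) on a sample pair of integer matrices and compares the result with the directly computed product:

```python
# Numerical check of (6): the triple (A, B, C) is the Winograd algorithm
# as displayed above; vec(X) = (x11, x21, x12, x22)^T (column-major).

A = [[0,1,1,1],[1,1,1,1],[1,0,0,0],[0,1,0,1],
     [0,0,1,1],[0,0,1,0],[0,1,0,0]]                 # 7 x 4
B = [[-1,1,0,1],[1,0,0,0],[0,0,1,0],[1,-1,0,0],
     [0,1,0,1],[1,-1,1,-1],[0,0,0,1]]               # 7 x 4
C = [[1,0,-1,1],[1,0,0,0],[0,1,0,0],[0,0,-1,1],
     [-1,0,1,0],[0,0,0,1],[-1,1,1,-1]]              # 7 x 4

def vec(M):                                          # column-major vec of a 2x2 matrix
    return [M[0][0], M[1][0], M[0][1], M[1][1]]

def apply(A, B, C, vx, vy):
    m  = [sum(r[j] * vx[j] for j in range(4)) for r in B]
    mp = [sum(r[j] * vy[j] for j in range(4)) for r in C]
    s  = [u * v for u, v in zip(m, mp)]              # the 7 multiplications
    return [sum(A[i][k] * s[i] for i in range(7)) for k in range(4)]  # A^T s

X, Y = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
XY = [[X[i][0]*Y[0][j] + X[i][1]*Y[1][j] for j in range(2)] for i in range(2)]
print(apply(A, B, C, vec(X), vec(Y)), vec(XY))       # the two vectors agree
```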
Now M = B vec(X) can be computed in 4 additive operations by

    l_1 = x_{1,1} − x_{2,1};  l_2 = x_{2,2} − l_1;  l_3 = x_{1,2} − l_2;  l_4 = x_{2,1} + x_{2,2};
    m_1 = l_2;  m_2 = x_{1,1};  m_3 = x_{1,2};  m_4 = l_1;  m_5 = l_4;  m_6 = l_3;  m_7 = x_{2,2},

and M' = C vec(Y) can be computed in 4 additive operations by

    l'_1 = y_{1,2} − y_{1,1};  l'_2 = y_{2,2} − l'_1;  l'_3 = y_{2,1} − l'_2;  l'_4 = y_{2,2} − y_{1,2};
    m'_1 = l'_2;  m'_2 = y_{1,1};  m'_3 = y_{2,1};  m'_4 = l'_4;  m'_5 = l'_1;  m'_6 = y_{2,2};  m'_7 = l'_3.

Define

    (s_1, s_2, ..., s_7) := M * M' = (m_1 m'_1, ..., m_7 m'_7).

Then A^T (M * M') can be computed in 7 additions by

    s_8 = s_1 + s_2;    s_9 = s_4 + s_8;
    z_{1,1} = x_{1,1} y_{1,1} + x_{1,2} y_{2,1} = s_2 + s_3;
    z_{2,1} = x_{2,1} y_{1,1} + x_{2,2} y_{2,1} = s_7 + s_9;
    z_{1,2} = x_{1,1} y_{1,2} + x_{1,2} y_{2,2} = s_5 + s_6 + s_8;
    z_{2,2} = x_{2,1} y_{1,2} + x_{2,2} y_{2,2} = s_5 + s_9.
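The schedule above translates directly into code. In the Python sketch below (an addition, not from the paper), l'_i is renamed k_i since primes are not valid identifiers; the operation counts 4 + 4 + 7 and 7 are visible in the structure:

```python
# Winograd's 2x2 multiplication: 7 multiplications, 15 additive operations.

def winograd(X, Y):
    (x11, x12), (x21, x22) = X
    (y11, y12), (y21, y22) = Y
    # M = B vec(X): 4 additive operations
    l1 = x11 - x21; l2 = x22 - l1; l3 = x12 - l2; l4 = x21 + x22
    m = (l2, x11, x12, l1, l4, l3, x22)
    # M' = C vec(Y): 4 additive operations (k_i stands for l'_i)
    k1 = y12 - y11; k2 = y22 - k1; k3 = y21 - k2; k4 = y22 - y12
    mp = (k2, y11, y21, k4, k1, y22, k3)
    # the 7 nonscalar multiplications
    s1, s2, s3, s4, s5, s6, s7 = (a * b for a, b in zip(m, mp))
    # A^T (M * M'): 7 additive operations
    s8 = s1 + s2; s9 = s4 + s8
    return [[s2 + s3, s5 + s6 + s8],
            [s7 + s9, s5 + s9]]

print(winograd([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```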
This implies the following.

Lemma 3. The algorithm (A, B, C) of Winograd for vec(XY) is of multiplicative complexity 7 and additive complexity 15.

Winograd also proved [8] that the multiplicative complexity of vec(XY) cannot be less than 7.

We write (A, B, C) ≡ (A', B', C'), equivalent bilinear algorithms, if (A', B', C') can be obtained from (A, B, C) by multiplying the ith rows of A, B and C by α_i, β_i and γ_i, respectively, where α_i β_i γ_i = 1, i = 1, ..., μ, and by permuting the rows of A, B and C with the same permutation. Obviously, equivalent bilinear algorithms compute the same bilinear forms and have the same multiplicative complexity and additive complexity. To prove Theorem 1, we begin with the following lemma.

Lemma 4. If (A, B, C) is a bilinear algorithm for vec(XY) with multiplicative complexity 7, then the rows of A, B and C are pairwise independent.
Proof. By (3) it is enough to prove that for every bilinear algorithm (A, B, C) for vec(XY) the rows of B are pairwise independent. Assume (without loss of generality) that two rows of B are dependent; after an equivalence transformation both corresponding multiplications have the same left factor

    u vec(X) = u_1 x_{1,1} + u_2 x_{2,1} + u_3 x_{1,2} + u_4 x_{2,2}

with, say, u_1 ≠ 0. Substituting x_{1,1} = −(u_2 x_{2,1} + u_3 x_{1,2} + u_4 x_{2,2})/u_1 in the algorithm annihilates both of these multiplications, so we obtain a new algorithm that computes XY for

    x_{1,1} = −(u_2/u_1) x_{2,1} − (u_3/u_1) x_{1,2} − (u_4/u_1) x_{2,2}    (7)

with 5 nonscalar multiplications. Since, by [1], the multiplicative complexity of (7) is 6, we have a contradiction. □
Let

    vec(XY) = A^T (B vec(X) * C vec(Y)).    (8)

Since XY = G (G^{-1} X H)(H^{-1} Y (J^T)^{-1}) J^T for nonsingular 2 × 2 matrices G, H and J, and since vec(RSL) = (L^T ⊗ R) vec(S), where ⊗ is the Kronecker product of matrices, we have

    vec(XY) = (J ⊗ G) vec((G^{-1} X H)(H^{-1} Y (J^T)^{-1})).    (9)

By (8) we have

    vec((G^{-1} X H)(H^{-1} Y (J^T)^{-1})) = A^T (B vec(G^{-1} X H) * C vec(H^{-1} Y (J^T)^{-1}))
                                           = A^T (B (H^T ⊗ G^{-1}) vec(X) * C (J^{-1} ⊗ H^{-1}) vec(Y)),

and with (9) we obtain

    vec(XY) = ((J ⊗ G) A^T) (B (H^T ⊗ G^{-1}) vec(X) * C (J^{-1} ⊗ H^{-1}) vec(Y)).    (10)
Since XY = (Y^T X^T)^T we have

    vec(XY) = vec((Y^T X^T)^T) = W vec(Y^T X^T) = (AW)^T (CW vec(X) * BW vec(Y)),

where W is the permutation matrix in (4). This with (3) and (10) implies

Lemma 5. If (A, B, C) is a bilinear algorithm for vec(XY), then (AW, CW, BW), (B, A, CW) and

    (A (J^T ⊗ G^T), B (H^T ⊗ G^{-1}), C (J^{-1} ⊗ H^{-1}))

are algorithms for vec(XY) with the same multiplicative complexity, where W is the matrix in (4) and J, G and H are any nonsingular 2 × 2 matrices.
Let

    I(A, B, C) = (A, B, C),
    Π_1(A, B, C) = (AW, CW, BW),
    Π_2(A, B, C) = (B, A, CW),
    Δ_{J,G,H}(A, B, C) = (A (J^T ⊗ G^T), B (H^T ⊗ G^{-1}), C (J^{-1} ⊗ H^{-1})).
De Groote [3] proved that every bilinear algorithm (M', N', K') for vec(XY) is equivalent to an algorithm that can be obtained from a sequence of the operations Π_1, Π_2 and Δ_{J,G,H} on the algorithm (A, B, C) defined in (6). To simplify the result of de Groote, we give the following.

Lemma 6. Every bilinear algorithm (M', N', K') for vec(XY) is equivalent to a bilinear algorithm (M, N, K) such that

    (M, N, K) = Π Δ_{J,G,H}(A, B, C)

for some Π ∈ {I, Π_1, Π_2, Π_1Π_2, Π_2Π_1, Π_1Π_2Π_1}, where (A, B, C) is the algorithm of Winograd.
Proof. It can be easily verified that W² = I and (J ⊗ G) W = W (G ⊗ J), so

    Π_1² = I,    Π_2² = I,
    Δ_{J_1,G_1,H_1} Δ_{J_2,G_2,H_2} = Δ_{J_1 J_2, G_1 G_2, H_1 H_2},
    Δ_{J,G,H} Π_1 = Π_1 Δ_{G,J,(H^{-1})^T},
    Δ_{J,G,H} Π_2 = Π_2 Δ_{H,(G^{-1})^T,J},
    Π_1 Π_2 Π_1 = Π_2 Π_1 Π_2.

Now the result follows from the result of de Groote [3]. □
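The two identities at the start of the proof are easy to confirm numerically; a NumPy sketch (an addition, not from the paper):

```python
# W^2 = I and (J kron G) W = W (G kron J), with W vec(Z) = vec(Z^T).
import numpy as np

W = np.array([[1,0,0,0],[0,0,1,0],[0,1,0,0],[0,0,0,1]])
rng = np.random.default_rng(1)
G, J = rng.integers(-9, 9, (2, 2)), rng.integers(-9, 9, (2, 2))

assert np.array_equal(W @ W, np.eye(4, dtype=int))
assert np.array_equal(np.kron(J, G) @ W, W @ np.kron(G, J))

# sanity check: W really transposes, for vec taken column-major
Z = rng.integers(-9, 9, (2, 2))
assert np.array_equal(W @ Z.flatten(order='F'), Z.T.flatten(order='F'))
print("identities verified")
```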
Since I, Π_1, Π_2, Π_1Π_2, Π_2Π_1 and Π_1Π_2Π_1 are operations that only permute the columns of A, B and C, the bilinear algorithms (A, B, C) and Π(A, B, C) have the same additive complexity. Therefore, to prove Theorem 1, by Lemma 2 and Lemma 6, it is sufficient to show that for all nonsingular matrices G, J and H we have

    δ(A (G^T ⊗ J^T)) ≥ 4,    δ(B (H^T ⊗ G^{-1})) ≥ 4,    δ(C (J^{-1} ⊗ H^{-1})) ≥ 4,

or

    δ(K (G^T ⊗ J^T)) ≥ 4

for every K ∈ {A, B, C} (note that every matrix of the form H^T ⊗ G^{-1} or J^{-1} ⊗ H^{-1} is again of the form G'^T ⊗ J'^T with G', J' nonsingular). For a permutation φ we define the matrix I_φ^{(a_1,...,a_n)} as follows:

    I_φ^{(a_1,...,a_n)}[i, j] = a_i if j = φ(i), and 0 otherwise.
Then it can be easily shown that

    B = I_{(2,6)(4,5)}^{(-1,1,1,1,-1,1,-1)} A (G_1 ⊗ J_1),    C = I_{(2,7)(4,5)}^{(1,1,-1,1,-1,1,-1)} A (G_2 ⊗ J_2),    (11)

where

    G_1 = ( 0  1 ),    J_1 = G_2 = ( 1  0 ),    J_2 = ( 0 -1 ).
          ( 1  0 )                 ( 0 -1 )           ( 1  0 )
Since the additive complexity does not change when we multiply a matrix from the left-hand side by I_φ^{(a_1,...,a_n)}, we obtain: if for all nonsingular matrices G and J we have δ(A (G^T ⊗ J^T)) ≥ 4, then for all nonsingular matrices G and J

    4 ≤ δ(A ((G G_1^T)^T ⊗ (J J_1^T)^T)) = δ(I_{(2,6)(4,5)}^{(-1,1,1,1,-1,1,-1)} A (G_1 ⊗ J_1)(G^T ⊗ J^T)) = δ(B (G^T ⊗ J^T)).
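The identities (11) can be verified mechanically. The NumPy sketch below (an addition, not from the paper; the permutations (2,6)(4,5) and (2,7)(4,5) are written 0-indexed) checks both equalities:

```python
# Check of (11): B and C arise from A by a signed row permutation on the
# left and a Kronecker multiplier on the right.
import numpy as np

A = np.array([[0,1,1,1],[1,1,1,1],[1,0,0,0],[0,1,0,1],
              [0,0,1,1],[0,0,1,0],[0,1,0,0]])
B = np.array([[-1,1,0,1],[1,0,0,0],[0,0,1,0],[1,-1,0,0],
              [0,1,0,1],[1,-1,1,-1],[0,0,0,1]])
C = np.array([[1,0,-1,1],[1,0,0,0],[0,1,0,0],[0,0,-1,1],
              [-1,0,1,0],[0,0,0,1],[-1,1,1,-1]])

def I_phi(phi, signs):                   # I_phi^{(a_1,...,a_n)}[i, j] = a_i if j = phi(i)
    n = len(signs)
    M = np.zeros((n, n), dtype=int)
    for i in range(n):
        M[i, phi[i]] = signs[i]
    return M

G1 = np.array([[0, 1], [1, 0]]); J1 = np.array([[1, 0], [0, -1]])
G2 = J1.copy();                  J2 = np.array([[0, -1], [1, 0]])

perm_B = [0, 5, 2, 4, 3, 1, 6]           # (2,6)(4,5), 0-indexed
perm_C = [0, 6, 2, 4, 3, 5, 1]           # (2,7)(4,5), 0-indexed
assert np.array_equal(B, I_phi(perm_B, [-1, 1, 1, 1, -1, 1, -1]) @ A @ np.kron(G1, J1))
assert np.array_equal(C, I_phi(perm_C, [1, 1, -1, 1, -1, 1, -1]) @ A @ np.kron(G2, J2))
print("identities (11) verified")
```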
Similarly for C. We have proved the following.

Lemma 7. If for every pair of nonsingular 2 × 2 matrices G and J and for the matrix

        ( 0 1 1 1 )
        ( 1 1 1 1 )
        ( 1 0 0 0 )
    A = ( 0 1 0 1 )    (12)
        ( 0 0 1 1 )
        ( 0 0 1 0 )
        ( 0 1 0 0 )

we have δ(A (G^T ⊗ J^T)) ≥ 4, then every bilinear algorithm for vec(XY) with multiplicative complexity 7 has additive complexity 15.
Denote by e_1, e_2, e_3, e_4 the rows of the 4 × 4 identity matrix. We first prove the following lemma.

Lemma 8. The matrix H = A (G^T ⊗ J^T) cannot have 4 rows from {α_1 e_1, α_2 e_2, α_3 e_3, α_4 e_4 | α_i ∈ F}.

Proof. Since by Lemma 4 the rows of H are pairwise independent, it is enough to prove that H cannot have among its rows the vectors α_1 e_1, α_2 e_2, α_3 e_3 and α_4 e_4. This is equivalent to saying that there exist no distinct rows v_{i_1}, v_{i_2}, v_{i_3}, v_{i_4} in A and nonzero β_1, β_2, β_3, β_4 such that

    β_j v_{i_j} (G^T ⊗ J^T) = α_j e_j,    j = 1, ..., 4,

which implies that

    ( β_1 v_{i_1} )
    ( β_2 v_{i_2} ) = S ⊗ R = ( s_{1,1} R   s_{1,2} R )
    ( β_3 v_{i_3} )           ( s_{2,1} R   s_{2,2} R )
    ( β_4 v_{i_4} )

for the nonsingular matrices S = (G^{-1})^T and R = (J^{-1})^T.

If s_{1,2} ≠ 0 and s_{2,2} ≠ 0, then since R is nonsingular

    ( s_{1,2} R )
    ( s_{2,2} R )

has two nonzero pairs of dependent rows. Observing the two last columns of A, we conclude that this case cannot happen, so s_{1,2} = 0 or s_{2,2} = 0. In the same way, we can prove that s_{1,1} = 0 or s_{2,1} = 0. Now if s_{1,2} = 0 then, since S is nonsingular, we have s_{1,1} ≠ 0 and therefore s_{2,1} = 0, which implies that

    ( β_1 v_{i_1} )
    ( β_2 v_{i_2} ) = ( s_{1,1} R       0      )    (13)
    ( β_3 v_{i_3} )   (     0       s_{2,2} R  )
    ( β_4 v_{i_4} )

Observing the rows of A we must have {i_1, i_2, i_3, i_4} = {3, 5, 6, 7}, and in this case the rows 3, 5, 6 and 7 are not of the form in (13). The case s_{2,2} = 0 is symmetric. □
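Over the binary field Lemma 8 can also be confirmed by brute force, since GL(2, F_2) has only 6 elements; a small sketch (an addition, not from the paper):

```python
# Brute-force check of Lemma 8 over GF(2): for every pair of nonsingular
# 2x2 matrices G, J over GF(2), H = A(G^T kron J^T) has at most 3 rows
# that are (multiples of) distinct unit vectors.
import itertools
import numpy as np

A = np.array([[0,1,1,1],[1,1,1,1],[1,0,0,0],[0,1,0,1],
              [0,0,1,1],[0,0,1,0],[0,1,0,0]])
units = {(1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)}

GL2 = [np.array(M).reshape(2, 2)
       for M in itertools.product([0, 1], repeat=4)
       if (M[0]*M[3] - M[1]*M[2]) % 2 == 1]
assert len(GL2) == 6                      # |GL(2, F_2)| = 6

for G, J in itertools.product(GL2, repeat=2):
    H = (A @ np.kron(G.T, J.T)) % 2
    unit_rows = {tuple(r) for r in H} & units
    assert len(unit_rows) <= 3
print("Lemma 8 holds over GF(2)")
```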
Lemma 9. For every pair of nonsingular matrices G and J,

    δ(A (G^T ⊗ J^T)) ≥ 4.
Proof. By Lemma 8, H = A (G^T ⊗ J^T) cannot contain more than 3 rows from {α_1 e_1, α_2 e_2, α_3 e_3, α_4 e_4} and by Lemma 4 the other 4 rows are distinct. Therefore each one of the other 4 rows requires at least one additive operation, which implies that δ(H) ≥ 4. □

References

[1] E. Feig and S. Winograd, On the direct sum conjecture, Linear Algebra Appl. 63 (1984) 193-219.
[2] H.F. de Groote, On varieties of optimal algorithms for the computation of bilinear mappings I. The isotropy group of a bilinear mapping, Theoret. Comput. Sci. 7 (1978) 1-24.
[3] H.F. de Groote, On varieties of optimal algorithms for the computation of bilinear mappings II. Optimal algorithms for 2 × 2-matrix multiplication, Theoret. Comput. Sci. 7 (1978) 127-148.
[4] J. Hopcroft and J. Musinski, Duality applied to the complexity of matrix multiplication, SIAM J. Comput. 2 (1973) 159-173.
[5] M. Kaminski, D.G. Kirkpatrick and N.H. Bshouty, Addition requirements for matrix and transposed matrix products, J. Algorithms 9 (1988) 354-364.
[6] R.L. Probert, On the additive complexity of matrix multiplication, SIAM J. Comput. 5 (1976) 187-203.
[7] V. Strassen, Gaussian elimination is not optimal, Numer. Math. 13 (1969) 354-356.
[8] S. Winograd, On multiplication of 2 × 2 matrices, Linear Algebra Appl. 4 (1971) 381-388.