On the inequivalence of bilinear algorithms for 3 × 3 matrix multiplication

Jinsoo Oh a, Jin Kim b, Byung-Ro Moon b,*

a Daum Communications, Jeju-do, 690-150, Republic of Korea
b School of Computer Science and Engineering, Seoul National University, Seoul, 151-744, Republic of Korea

* Corresponding author. E-mail addresses: [email protected] (J. Oh), [email protected] (J. Kim), [email protected] (B.-R. Moon).

Information Processing Letters 113 (2013) 640–645

Abstract

Since Laderman showed an algorithm for 3 × 3 matrix multiplication using 23 scalar multiplications, Johnson and McLoughlin have used numerical optimization together with a human-controlled procedure to give two parameterized algorithms in which the coefficients are rational numbers. Those algorithms are inequivalent to Laderman's with respect to the transformation introduced by de Groote. We present a simple and fast numerical heuristic for finding valid algorithms, and show that many of the obtained algorithms are inequivalent to the published ones.

Article history: Received 9 October 2012; received in revised form 14 May 2013; accepted 21 May 2013; available online 24 May 2013. Communicated by B. Doerr.

Keywords: Algorithms; Matrix multiplication

1. Introduction

1.1. Strassen algorithm

For two 2 × 2 matrices A and B, let C be the product AB. Strassen [1] showed that seven scalar multiplications are sufficient to obtain C. Let

$$P_t = \Bigl(\sum_{i,j} \alpha_{ijt} A_{ij}\Bigr)\Bigl(\sum_{i,j} \beta_{ijt} B_{ij}\Bigr) \tag{1}$$

and

$$C_{nm} = \sum_{t} \gamma_{mnt} P_t \tag{2}$$

where t = 0, 1, ..., 6 and α_{ijt}, β_{ijt} ∈ {−1, 0, 1}. By finding such numbers α_{ijt}, β_{ijt}, γ_{mnt}, he proved that 7 scalar multiplications are enough to compute C. By applying the result recursively to the problem of multiplying two n × n matrices, Strassen derived an algorithm with O(n^{log_2 7}) arithmetic operations, which is better than O(n^3), the complexity of the naive method. Later, Winograd [2] proved that 7 is the minimum number of multiplications required to multiply two 2 × 2 matrices in the context of bilinear combinations.

One way to obtain fast algorithms for multiplying two n × n matrices is to look at other small cases, such as the problem of multiplying a 2 × 2 matrix by a 2 × 3 matrix. For this case, it was proved that 11 multiplications are required: the upper bound was obtained by combining Strassen's algorithm with a vector-matrix multiplication, and the lower bound was obtained by Alekseyev [3]. For the case of multiplying a 2 × 3 matrix by a 3 × 3 matrix, the possible number of multiplications is either 14 or 15: the upper bound of 15 was derived by Hopcroft and Kerr [4], and the lower bound of 14 was proved by Brockett and Dobkin [5].

In this paper, we consider the problem of multiplying two 3 × 3 matrices. For this case, Laderman [6] devised an algorithm that uses 23 multiplications instead of 27, and Bläser [7] proved a lower bound of 19. The time complexity of Laderman's algorithm is O(n^{log_3 23}). Although this complexity is worse than that of Strassen's algorithm, the result has an important theoretical meaning, which is discussed in Section 1.2. Currently, the best complexity for the product of two n × n matrices is O(n^{2.3727}), due to Williams [8].
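To make the bilinear form of (1) and (2) concrete in the 2 × 2 case, the following sketch writes out Strassen's classical seven-multiplication identities in Python. This is the standard textbook scheme, included for illustration; it is not code from the paper.

    # Strassen's seven-multiplication scheme for 2 x 2 matrices
    # (classical identities; illustration only).
    def strassen_2x2(A, B):
        """Multiply two 2x2 matrices (nested lists) with 7 scalar multiplications."""
        (a11, a12), (a21, a22) = A
        (b11, b12), (b21, b22) = B
        p1 = (a11 + a22) * (b11 + b22)
        p2 = (a21 + a22) * b11
        p3 = a11 * (b12 - b22)
        p4 = a22 * (b21 - b11)
        p5 = (a11 + a12) * b22
        p6 = (a21 - a11) * (b11 + b12)
        p7 = (a12 - a22) * (b21 + b22)
        return [[p1 + p4 - p5 + p7, p3 + p5],
                [p2 + p4, p1 - p2 + p3 + p6]]

    assert strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]

Applied recursively to 2 × 2 block partitions, this scheme yields the O(n^{log_2 7}) bound mentioned above.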


Table 1
Rank distributions.

# of rank 1    # of rank 2    # of rank 3
    44             24              1
    46             22              1
    47             21              1
    47             22              0
    48             18              3
    48             20              1
    48             21              0
    49             19              1
    49             20              0
    50             15              4
    50             18              1
    50             19              0
    51             17              1
    51             18              0
    52             16              1
    52             17              0
    53             15              1
    53             16              0
    54             14              1
    54             15              0
    55             14              0

Table 2
An [46, 22, 1]-algorithm. [The table body, the 23 coefficient triples (α_t, β_t, γ_t) given as 3 × 3 matrices, did not survive text extraction and is omitted; the full algorithm is available in the online material cited in footnote 2.]

1.2. Equivalence classes

We can think of α_{ijt}, β_{ijt}, γ_{mnt} as matrices: α_{ijt} can be regarded as a 3 × 3 matrix in which i denotes the row index and j the column index, for i, j = 0, 1, 2. An algorithm for the multiplication of two 3 × 3 matrices using 23 scalar multiplications then consists of 23 · 3 matrices. Let α'_{ijt}, β'_{ijt}, γ'_{mnt} be another algorithm. Two algorithms, consisting of α_{ijt}, β_{ijt}, γ_{mnt} and α'_{ijt}, β'_{ijt}, γ'_{mnt} respectively, are called equivalent when one can be transformed into the other by a series of transformations: permutation, cyclic permutation, transposition, scalar multiplication, and matrix multiplication [9]. For the multiplication of 2 × 2 matrices, de Groote [10] proved that Strassen's algorithm is unique under this transformation by showing that any algorithm is transformable to Strassen's; in other words, every algorithm with 7 scalar multiplications is equivalent to Strassen's.

For 3 × 3 matrix multiplication, Johnson and McLoughlin [11] presented two parameterized algorithms and proved that they are inequivalent to Laderman's. However, their result does not answer a critical question: how many inequivalent algorithms exist? In this paper, we show that there exist many algorithms inequivalent to Laderman's and to Johnson–McLoughlin's algorithms.
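Two of these transformations are easy to make concrete: permuting the index t of the products P_t, and rescaling the α_t and β_t by nonzero constants while dividing γ_t by the product of those constants. The sketch below is ours, not the authors'; the tensor encoding of Strassen's 2 × 2 algorithm and the simplified index convention for γ_t (gamma[t][m][n] is the coefficient of P_t in C[m][n]) are our assumptions. Both operations visibly preserve the rank of every coefficient matrix, consistent with the invariance property discussed in Section 1.3.

    # Two simple equivalence transformations, applied to Strassen's algorithm.
    import numpy as np

    # Strassen's algorithm as 7 triples (alpha_t, beta_t, gamma_t);
    # gamma_t[m][n] is the coefficient of P_t in C[m][n].
    ALG = [
        ([[1, 0], [0, 1]], [[1, 0], [0, 1]], [[1, 0], [0, 1]]),    # (A11+A22)(B11+B22)
        ([[0, 0], [1, 1]], [[1, 0], [0, 0]], [[0, 0], [1, -1]]),   # (A21+A22)B11
        ([[1, 0], [0, 0]], [[0, 1], [0, -1]], [[0, 1], [0, 1]]),   # A11(B12-B22)
        ([[0, 0], [0, 1]], [[-1, 0], [1, 0]], [[1, 0], [1, 0]]),   # A22(B21-B11)
        ([[1, 1], [0, 0]], [[0, 0], [0, 1]], [[-1, 1], [0, 0]]),   # (A11+A12)B22
        ([[-1, 0], [1, 0]], [[1, 1], [0, 0]], [[0, 0], [0, 1]]),   # (A21-A11)(B11+B12)
        ([[0, 1], [0, -1]], [[0, 0], [1, 1]], [[1, 0], [0, 0]]),   # (A12-A22)(B21+B22)
    ]

    def multiply(alg, A, B):
        """Evaluate C = AB via the bilinear algorithm, as in (1) and (2)."""
        A, B = np.asarray(A), np.asarray(B)
        C = np.zeros((2, 2))
        for a, b, g in alg:
            p = (np.asarray(a) * A).sum() * (np.asarray(b) * B).sum()
            C += p * np.asarray(g)
        return C

    def scale(alg, c=2.0, d=-3.0):
        """Rescale alpha_t by c and beta_t by d; gamma_t absorbs 1/(c*d)."""
        return [(c * np.asarray(a), d * np.asarray(b), np.asarray(g) / (c * d))
                for a, b, g in alg]

    rng = np.random.default_rng(0)
    A = rng.integers(-5, 5, (2, 2))
    B = rng.integers(-5, 5, (2, 2))
    assert np.allclose(multiply(ALG, A, B), A @ B)
    assert np.allclose(multiply(scale(ALG), A, B), A @ B)   # rescaled: still valid
    assert np.allclose(multiply(ALG[::-1], A, B), A @ B)    # products permuted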


Table 3
An [50, 15, 4]-algorithm. [Table body omitted; it did not survive text extraction. See the online material cited in footnote 2.]

1.3. An invariance property of de Groote's transformation

De Groote's transformation has an invariance property for the ranks of the matrices, which is useful when proving inequivalence between two algorithms. Simply speaking, if the distribution of ranks for one algorithm differs from that for another, then the two algorithms are inequivalent. (We use only the distribution of ranks of the 69 matrices here; de Groote's transformation has a stronger invariance property than the one we mention.) Laderman's algorithm consists of 51 matrices of rank 1, 12 matrices of rank 2, and 6 matrices of rank 3. For convenience, we write this distribution as [51 : 12 : 6]. For Johnson and McLoughlin's first parameterized algorithm, the distribution is either [47 : 21 : 1] or [48 : 20 : 1], depending on the given parameters; the distribution of their second algorithm is [48 : 21 : 0], [49 : 20 : 0] or [50 : 19 : 0]. Since the rank distributions of both Johnson–McLoughlin algorithms differ from Laderman's, both algorithms are inequivalent to Laderman's. (Their first algorithm is also inequivalent to their second.) Johnson and McLoughlin used this invariance property to prove their result.
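This rank-distribution test is straightforward to implement. A minimal sketch (our code, not the authors'), assuming an algorithm is given as three sequences of 3 × 3 coefficient matrices:

    from collections import Counter
    import numpy as np

    def rank_distribution(alpha, beta, gamma):
        """Count the ranks of all coefficient matrices (69 in the 3x3 case)."""
        mats = list(alpha) + list(beta) + list(gamma)
        return Counter(np.linalg.matrix_rank(np.asarray(m, dtype=float))
                       for m in mats)

For Laderman's algorithm this yields Counter({1: 51, 2: 12, 3: 6}), i.e. the distribution [51 : 12 : 6]; any algorithm producing a different count is inequivalent to it.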

1.4. Difficulty of finding algorithms

To prove inequivalence, we need to find two candidate algorithms; however, it is difficult to find even one. To find his algorithm, Laderman relied on a deep analysis of the problem. Johnson and McLoughlin used computer assistance and a fairly delicate procedure, designed around de Groote's transformation, to obtain their two parameterized algorithms.


Table 4
An [48, 18, 3]-algorithm. [Table body omitted; it did not survive text extraction. See the online material cited in footnote 2.]

Since our first step is to find many candidate algorithms, the published methods mentioned above are not satisfactory: they take too much time. We therefore added a simple rounding step to Brent's numerical method, obtaining a simple and fast heuristic that finds many algorithms in a short time. Since we could find algorithms with 23 multiplications, we also tried to find an algorithm with 22 multiplications; unfortunately, given the α_{ijt}'s, β_{ijt}'s and γ_{mnt}'s found, at most 8 out of the 9 C_{nm}'s could be obtained.

1.5. Outline

We first present the basic numerical method commonly used for solving (1) and (2). Then we present our heuristic, which adds a simple rounding step to the basic numerical method. Finally, we give three algorithms that are inequivalent to the published ones; due to space limits, we cannot list all the inequivalent algorithms that we found. Before showing the three algorithms, we therefore summarize their rank distributions.

2. Heuristic

2.1. Basic numerical method

Brent [12] introduced a basic numerical method. Plugging (1) into (2), we get

$$\sum_{t=0}^{22} \alpha_{ijt} \beta_{klt} \gamma_{mnt} = \delta_{ni} \delta_{jk} \delta_{lm} \tag{3}$$

where i, j, k, l, m, n = 1, 2, 3. In [12], a function s(α, β, γ) was defined as follows:

$$s(\alpha, \beta, \gamma) = \sum_{i,j,k,l,m,n} \Bigl(\sum_{t} \alpha_{ijt} \beta_{klt} \gamma_{mnt} - \delta_{ni} \delta_{jk} \delta_{lm}\Bigr)^2. \tag{4}$$

The function s(α, β, γ) decomposes into three parts:

$$s(\alpha, \beta, \gamma) = \sum_{i,j,k,l,m,n} \Bigl(\sum_{t} \alpha_{ijt} \beta_{klt} \gamma_{mnt}\Bigr)^2 - 2 \sum_{i,j,k,l,m,n} \delta_{ni} \delta_{jk} \delta_{lm} \sum_{t} \alpha_{ijt} \beta_{klt} \gamma_{mnt} + \sum_{i,j,k,l,m,n} (\delta_{ni} \delta_{jk} \delta_{lm})^2. \tag{5}$$

Each part is computed as follows:

$$\sum_{i,j,k,l,m,n} \Bigl(\sum_{t} \alpha_{ijt} \beta_{klt} \gamma_{mnt}\Bigr)^2 = \sum_{t,u} \Bigl(\sum_{i,j} \alpha_{ijt} \alpha_{iju}\Bigr) \Bigl(\sum_{k,l} \beta_{klt} \beta_{klu}\Bigr) \Bigl(\sum_{m,n} \gamma_{mnt} \gamma_{mnu}\Bigr),$$

$$\sum_{i,j,k,l,m,n} \delta_{ni} \delta_{jk} \delta_{lm} \sum_{t} \alpha_{ijt} \beta_{klt} \gamma_{mnt} = \sum_{i,j,l,t} \alpha_{ijt} \beta_{jlt} \gamma_{lit},$$

$$\sum_{i,j,k,l,m,n} (\delta_{ni} \delta_{jk} \delta_{lm})^2 = 27.$$
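For reference, (4) can also be implemented directly; a candidate is a valid algorithm exactly when s(α, β, γ) = 0. A minimal sketch in Python (our naming; the product form above is what makes the per-variable updates cheap in practice):

    import numpy as np

    def s(alpha, beta, gamma):
        """Brent's objective (4); alpha, beta, gamma have shape (3, 3, 23)."""
        total = 0.0
        for i in range(3):
            for j in range(3):
                for k in range(3):
                    for l in range(3):
                        for m in range(3):
                            for n in range(3):
                                # sum over t of alpha_ijt * beta_klt * gamma_mnt
                                lhs = np.dot(alpha[i, j] * beta[k, l], gamma[m, n])
                                delta = float(n == i and j == k and l == m)
                                total += (lhs - delta) ** 2
        return total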

The strength of Brent's derivation comes from the fact that if we fix all the variables except one, then s(α, β, γ) becomes a quadratic function. (Suppose that all variables α_{ijt}, β_{ijt}, γ_{ijt} except α_{i*j*t*} are fixed. Then the first part of s(α, β, γ) is a quadratic function of α_{i*j*t*} and the second part is a linear function of α_{i*j*t*}, so s(α, β, γ) is a quadratic function of α_{i*j*t*}.) Since the function is quadratic, it is easy to find the value α_{i*j*t*} ∈ [−1, 1] at which s(α, β, γ) is minimal. This simple procedure is applied to the variables one by one until we find a valid algorithm, i.e., one with s(α, β, γ) = 0 and α_{ijt}, β_{ijt}, γ_{ijt} ∈ {−1, 0, 1} for all i, j, t.

2.2. Heuristic

Given that Johnson and McLoughlin needed a delicate procedure to find algorithms, it seemed difficult to devise a simple and fast method. Through much trial and error, we arrived at a rounding method: each α_{ijt} and β_{ijt} is rounded to the nearest value among −1, 0 and 1, and each γ_{ijt} to the nearest value among −1, −0.5, 0, 0.5 and 1. By applying this simple rounding inside Brent's method at appropriate times, we could obtain many algorithms in a short time: typically 18 to 31 algorithms per 10,000 tries, in 70,000 seconds, on an Intel(R) Core(TM)2 Duo CPU E6750 at 2.66 GHz. The following is the pseudocode of our heuristic.¹

    // N : number of iterations
    // RT : rounding period
    // Θ : predefined threshold
    Assign random values in [−1, 1] to all α_{ijt}, β_{ijt}, γ_{ijt};
    for k = 0 to N − 1 {
        if (k % RT = RT − 1) then round all γ_{ijt}'s;
        for all i, j, t, update α_{ijt} = α_{i*j*t*} such that s(α_{i*j*t*}) is minimum;
        for all i, j, t, update β_{ijt} = β_{i*j*t*} such that s(β_{i*j*t*}) is minimum;
        if (k % RT = RT − 1) {
            round all α_{ijt}'s and β_{ijt}'s;
            for all i, j, k, l, m, n, do the following:
                if |Σ_t α_{ijt} β_{klt} γ_{mnt} − δ_{ni} δ_{jk} δ_{lm}| ≥ Θ,
                    update γ_{mnt} = γ_{m*n*t*} such that s(γ_{m*n*t*}) is minimum;
        }
        for all i, j, t, update γ_{ijt} = γ_{i*j*t*} such that s(γ_{i*j*t*}) is minimum;
        if (k % RT = RT − 1) round all α_{ijt}'s and β_{ijt}'s;
    }
    Round all γ_{ijt}'s;

The three inner for-loops are just Brent's method; the "if" statements perform the rounding.

¹ Full source code can be found at: http://soar.snu.ac.kr/etc/online_material_ipl/.
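To make the update step concrete: in each term of (4) containing a given variable, say α_{i*j*t*}, the inner sum has the form c1·x + c0, so the term is (c1·x + c0 − δ)² and the minimizer over all terms is x = Σ c1(δ − c0) / Σ c1², clipped to [−1, 1]. The following sketch is our own reading of that step (names and array layout are our assumptions, not the authors' code):

    import numpy as np

    def update_alpha(a, b, g, i, j, t):
        """One coordinate step of Brent's method: reset a[i, j, t] to the
        minimizer of s over that single variable; a, b, g have shape (3, 3, 23)."""
        num = den = 0.0
        for k in range(3):
            for l in range(3):
                for m in range(3):
                    for n in range(3):
                        c1 = b[k, l, t] * g[m, n, t]            # coefficient of x
                        full = np.dot(a[i, j], b[k, l] * g[m, n])
                        c0 = full - a[i, j, t] * c1             # term value without x
                        d = float(n == i and j == k and l == m)
                        num += c1 * (d - c0)                    # from setting ds/dx = 0
                        den += c1 * c1
        if den > 0.0:
            a[i, j, t] = float(np.clip(num / den, -1.0, 1.0))

The β and γ updates are symmetric, and the rounding steps of the pseudocode snap the α's and β's to {−1, 0, 1} and the γ's to {−1, −0.5, 0, 0.5, 1}.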

3. Inequivalent algorithms

In this section, we first show some statistics of the rank distributions of the algorithms that we found. Then we show some algorithms inequivalent to the published ones. We cannot list all the inequivalent algorithms due to space limits.² Instead, we select three of them that are interesting in comparison with the published algorithms. The first is an integral algorithm, in which every coefficient is an integer; the other two are non-integral. We present two non-integral algorithms because, while the published algorithms have 0, 1 or 6 matrices of rank 3, these two have 4 and 3 matrices of rank 3, respectively.

² All the obtained algorithms are listed at: http://soar.snu.ac.kr/etc/online_material_ipl/.

3.1. Rank distributions of our algorithms

Recall that an algorithm with a rank distribution of [p, q, r] has p matrices of rank 1, q matrices of rank 2, and r matrices of rank 3. In Table 1, we list the rank distributions of the algorithms that we found. There are various rank distributions, each corresponding to an inequivalent algorithm. From these, we chose three algorithms with rank distributions [46, 22, 1], [50, 15, 4] and [48, 18, 3]. (See Tables 2–4.)

4. Conclusion

We devised a simple and fast heuristic for finding algorithms for 3 × 3 matrix multiplication with 23 scalar multiplications. Our heuristic is simple and considerably faster than the published methods. We then proved that our algorithms are inequivalent to the published ones.

References

[1] V. Strassen, Gaussian elimination is not optimal, Numer. Math. 14 (3) (1969) 354–356.
[2] S. Winograd, On multiplication of 2 × 2 matrices, Linear Algebra Appl. 4 (4) (1971) 381–388.
[3] V.B. Alekseyev, On the complexity of some algorithms of matrix multiplication, J. Algorithms 6 (1) (1985) 71–85.
[4] J.E. Hopcroft, L.R. Kerr, On minimizing the number of multiplications necessary for matrix multiplication, SIAM J. Appl. Math. 20 (1) (1971) 30–36.
[5] R.W. Brockett, D. Dobkin, On the number of multiplications required for matrix multiplication, SIAM J. Comput. 5 (4) (1976) 624–628.
[6] J.D. Laderman, A noncommutative algorithm for multiplying 3 × 3 matrices using 23 multiplications, Bull. Amer. Math. Soc. 82 (1) (1976) 126–128.
[7] M. Bläser, On the complexity of the multiplication of matrices of small formats, J. Complexity 19 (1) (2003) 43–60.

[8] V.V. Williams, Multiplying matrices faster than Coppersmith–Winograd, in: STOC, 2012, pp. 887–898.
[9] H.F. de Groote, On varieties of optimal algorithms for the computation of bilinear mappings I. The isotropy group of a bilinear mapping, Theoret. Comput. Sci. 7 (1) (1978) 1–24.
[10] H.F. de Groote, On varieties of optimal algorithms for the computation of bilinear mappings II. Optimal algorithms for 2 × 2-matrix multiplication, Theoret. Comput. Sci. 7 (2) (1978) 127–148.
[11] R.W. Johnson, A.M. McLoughlin, Noncommutative bilinear algorithms for 3 × 3 matrix multiplication, SIAM J. Comput. 15 (2) (1986) 595–603.
[12] R.P. Brent, Algorithms for matrix multiplication, Tech. Rep. TR-CS70-157, DCS, Stanford, 1970.