Sunday, 5 October 2014

Today I Learned: the Strassen Algorithm

There's a reason why most of us are taught addition before multiplication - it's just so much easier (*). And so, it was pretty big deal when people invented Logarithms, because the fact that \begin{align}\log(xy)=\log(x)+\log(y)\end{align} meant that (provided that you could convert from a number \(x\) into its logarithm \(\log(x)\)) you were able to covert a multiplication problem into an addition one.

Now, I must admit that I was super-surprised to find out that the first (proper) logarithm ever developed was (more-or-less) done in base \(1-10^{-7}\). It...just seemed sooooooo arbitrary! Though, I think that there's probably a good astronomy-based explanation for why this base was chosen (**)... in fact, I might've even heard this explained on one of the recent episodes of the BBC In Our Time podcast? (hint: you should totes go check it out =P)

Analogously, matrix multiplication is also harder than matrix addition. And the fact that even for \(2\times2\) matrices, matrix multiplication \begin{align} \left[\begin{array}{c}{c} a_{11} & a_{12} \\ a_{21} & a_{22}\end{array}\right] \left[\begin{array}{c}{c} b_{11} & b_{12} \\ b_{21} & b_{22}\end{array}\right] = \left[\begin{array}{c}{c} a_{11}\times b_{11}+a_{12}\times b_{21} & a_{11}\times b_{12}+a_{12}\times b_{22} \\ a_{21}\times b_{11}+a_{22}\times b_{21} & a_{21}\times b_{12}+a_{22}\times b_{22}\end{array}\right] \end{align} actually requires 8 little multiplications just adds insult to injury...ammirite?

Weeeeell, fear no more! Because we can totally reduce \(2\times2\) matrix multiplication down to just 7 multiplications (***)! And here's how:

Start by computing these guys:
  1. \(m_1:=(a_{11}+a_{22})\times (b_{11}+b_{22})\),
  2. \(m_2:=(a_{21}+a_{22})\times b_{11}\),
  3. \(m_3:=a_{11}\times (b_{12}- b_{22})\),
  4. \(m_4:=a_{22}\times (b_{21}-b_{11})\),
  5. \(m_5:=(a_{11}+a_{12})\times b_{22}\),
  6. \(m_6:=(a_{21}-a_{11})\times (b_{11}+b_{12})\),
  7. \(m_7:=(a_{12}-a_{22})\times (b_{21}+b_{22})\).
This takes up all 7 of the multiplications that we're allowing ourselves. To finish off the matrix multiplication, the four entries of our multiplied matrix are given by:
  1. \(m_1+m_4-m_5+m7\),
  2. \(m_3+m_5\),
  3. \(m_2+m_4\),
  4. \(m_1-m_2+m_3+m_6\).
Et voila! THIS is called the Strassen Algorithm.

Now, this might seem like kind of a stupid thing to do, but I think that it was supposedly kinda useful in the early days of computing, i.e.: when computers had trouble handling multiplication. Just...umm...goes to show, right? much harder than addition. =P

 *: unless if you're dealing with \(0\) which case, it's pretty easy either way
**: coupled with the fact that \((1-10^{-7})^{10^7}\approx\frac{1}{e}\)
***: but with a few more additions, which is fine, because additions are totally easier than multiplications

No comments:

Post a Comment