multiplying multiple matrices

Started by
8 comments, last by phresnel 15 years, 1 month ago
given the following, how can do i multiply three matrices together to achieve M? i know how to multiply two matrices, but a third - can it be done in a single calculation and if so, what indices do i use for the extra matrix?

//multiplying two matrices
void Matrix_Mult(int a1[][3], int a2[][3], int a3[][3])
{
   int i = 0;
   int j = 0;
   int k = 0;
       for(i = 0; i < 3; i++) 
           for( j = 0; j < 3; j++)
               for( k = 0; k < 3; k++) 
                   a3[j] +=  a1[k] * a2[k][j];
}

//multiplying three matrices?
void Matrix_Mult_3(int a1[][3], int a2[][3], int a3[][3], int a4[][3])
{
   int i = 0;
   int j = 0;
   int k = 0;
       for(i = 0; i < 3; i++) 
           for( j = 0; j < 3; j++)
               for( k = 0; k < 3; k++) 
                   a4[j] +=  a1[k] * a2[k][j] * a3[k][j]; //a3 should use which indices?
}

Advertisement
If you can correctly multiply two matrices, multiply the first two then, then multiply the result by the third and dont worry about writing a call for every possible combination.
this would be the only other combination i'd need, but if that's THE way it's usually done (breaking it into two multiplication steps), that's how i'll do it.

but was also just kind of interested in if it's possible, and if so how exactly?
1) Just about every graphics matrix API i've seen multiplies matricies out longhand instead of using loops. It is significantly faster without all the branches.

2) That said, you can just multiply out the matrix longhand, and come up with the formula for each cell the same way you would for #1.
ok, i'll do that. thanks!
If i understand correctly the compiler is just as happy to optimize the simple loops for you. So feel free to write them as loops.

And it's correct that it's mostly done in two multiplication stages. But if you've got a lot of matrices to multiply, it may be worth experimenting with another process of crunching all those numbers, maybe 3 matrices at a time.
Quote:Original post by KulSeran
1) ... multiplies matricies out longhand instead of using loops. It is significantly faster without all the branches

Did you do a performance analysis with a current compiler to back up this claim? I would trust the compiler to do a way better job of optimizing that I could and with respect to rather simple loops it is recommended to write them as loops since compilers are smart enough to produce fast code from those loops. Unrolling the loops takes information from the compiler that might help it to optimize.

But I have to admit that is only hearsaying cause I can not back up my claim with sources. :)
------------------------------------I always enjoy being rated up by you ...
I believe modern C and C family languages will unroll loops where they believe it will be helpful. I also believe that compiler optimisations aren't really for this board :p

Re the OP: Multiplying matrices together is an associative operation, i.e.

ABC = (AB)C = A(BC)

So work from either end of the chain and calculate the multiples as you go.
Waterwalker,

While normally I'd agree.
1) Maybe not the best to unroll for the matrix multiply, but other matrix functions tend to have a lot of repeated values and calculations that you can manually pull out.

2) When you have access to a vector unit (SSE, Cell SPU, PSP/PS2 vector co-processor), you have to hand roll it, since most compilers don't auto-vectorize. And the hand-rolling can be very worth it.

Bob Janova,

You do have a point. Optimization wasn't so much the issue. I was just pointing out in my origional post that the OP could just multiply it out by hand if he was really curious as to what the formula would look like.
Quote:Original post by Waterwalker
Quote:Original post by KulSeran
1) ... multiplies matricies out longhand instead of using loops. It is significantly faster without all the branches

Did you do a performance analysis with a current compiler to back up this claim? I would trust the compiler to do a way better job of optimizing that I could and with respect to rather simple loops it is recommended to write them as loops since compilers are smart enough to produce fast code from those loops. Unrolling the loops takes information from the compiler that might help it to optimize.


Also, KulSeran, even if it won't be rolled out it could yield better performance as such a tight loop fits far better into the execution pipeline, plus branch history tables will do a good job, plus there is nothing indeterministic w.r.t. branch prediction (code path is always the same).

Additionally, compilers (e.g. GCC) will find it easier to autovectorize such tight loops. For a glimpse, see this post.

edit:
Quote:
While normally I'd agree.
1) Maybe not the best to unroll for the matrix multiply, but other matrix functions tend to have a lot of repeated values and calculations that you can manually pull out.

The compiler will pull it out for you (it will even yield a fsincos fpu instruction when you have a sin() and a cos() on the same value; at least GCC will do that. For a proof, see my humble 4k flame fractal render and the corresponding dissassembly [ugly C-source code here] / edit: and that demo is already 2 years old)

Quote:
2) When you have access to a vector unit (SSE, Cell SPU, PSP/PS2 vector co-processor), you have to hand roll it, since most compilers don't auto-vectorize. And the hand-rolling can be very worth it.

See tbp's post on ompf (again: this).

This topic is closed to new replies.

Advertisement