4 x 4 ASM Matrix Multiplication

Caitlin


So I spent a fair amount of time yesterday and this morning trying to get around what looked like a need for a lot of data shuffling to multiply two 4x4 matrices using SIMD instructions. I didn't want to accept that, because shuffling data around is a waste of time to me unless it absolutely HAS to be done, so I looked things up on Intel's web site. Unfortunately, I was right: I have to do a ridiculous amount of data shuffling. OMG, it's horrible, so ugly it makes me want to [bawling]. Oh well, it HAS to be done, so I'll have to live with it.

Here is the unedited Intel code:

multiply_4x4_matrix: ;multiplies two 4 x 4 matrices

mov edx, dword ptr [esp+4] ; src1
mov eax, dword ptr [esp+0Ch] ; dst
mov ecx, dword ptr [esp+8] ; src2
movss xmm0, dword ptr [edx]
movaps xmm1, xmmword ptr [ecx]
shufps xmm0, xmm0, 0
movss xmm2, dword ptr [edx+4]
mulps xmm0, xmm1
shufps xmm2, xmm2, 0
movaps xmm3, xmmword ptr [ecx+10h]
movss xmm7, dword ptr [edx+8]
mulps xmm2, xmm3
shufps xmm7, xmm7, 0
addps xmm0, xmm2
movaps xmm4, xmmword ptr [ecx+20h]
movss xmm2, dword ptr [edx+0Ch]
mulps xmm7, xmm4
shufps xmm2, xmm2, 0
addps xmm0, xmm7
movaps xmm5, xmmword ptr [ecx+30h]
movss xmm6, dword ptr [edx+10h]
mulps xmm2, xmm5
movss xmm7, dword ptr [edx+14h]
shufps xmm6, xmm6, 0
addps xmm0, xmm2
shufps xmm7, xmm7, 0
movlps qword ptr [eax], xmm0
movhps qword ptr [eax+8], xmm0
mulps xmm7, xmm3
movss xmm0, dword ptr [edx+18h]
mulps xmm6, xmm1
shufps xmm0, xmm0, 0
addps xmm6, xmm7
mulps xmm0, xmm4
movss xmm2, dword ptr [edx+24h]
addps xmm6, xmm0
movss xmm0, dword ptr [edx+1Ch]
movss xmm7, dword ptr [edx+20h]
shufps xmm0, xmm0, 0
shufps xmm7, xmm7, 0
mulps xmm0, xmm5
mulps xmm7, xmm1
addps xmm6, xmm0
shufps xmm2, xmm2, 0
movlps qword ptr [eax+10h], xmm6
movhps qword ptr [eax+18h], xmm6
mulps xmm2, xmm3
movss xmm6, dword ptr [edx+28h]
addps xmm7, xmm2
shufps xmm6, xmm6, 0
movss xmm2, dword ptr [edx+2Ch]
mulps xmm6, xmm4
shufps xmm2, xmm2, 0
addps xmm7, xmm6
mulps xmm2, xmm5
movss xmm0, dword ptr [edx+34h]
addps xmm7, xmm2
shufps xmm0, xmm0, 0
movlps qword ptr [eax+20h], xmm7
movss xmm2, dword ptr [edx+30h]
movhps qword ptr [eax+28h], xmm7
mulps xmm0, xmm3
shufps xmm2, xmm2, 0
movss xmm6, dword ptr [edx+38h]
mulps xmm2, xmm1
shufps xmm6, xmm6, 0
addps xmm2, xmm0
mulps xmm6, xmm4
movss xmm7, dword ptr [edx+3Ch]
shufps xmm7, xmm7, 0
addps xmm2, xmm6
mulps xmm7, xmm5
addps xmm2, xmm7
movaps xmmword ptr [eax+30h], xmm2
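
For anyone trying to follow what all that shuffling is for, here is a rough C sketch of what the listing appears to compute. It is not Intel's code, and the name `multiply_4x4_reference` is made up for illustration. It assumes each matrix is 16 consecutive floats, which matches the offsets used above. Each `movss` + `shufps xmm, xmm, 0` pair copies one scalar from src1 into all four lanes of a register, `mulps` multiplies that by a whole row of src2, and `addps` accumulates the four products into one row of dst.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference for the SSE listing above.
   src1, src2 and dst each point to 16 consecutive floats.
   dst must not overlap src1 or src2 in this sketch. */
static void multiply_4x4_reference(const float *src1, const float *src2,
                                   float *dst)
{
    assert(src1 != NULL && src2 != NULL && dst != NULL);

    for (int i = 0; i < 4; ++i) {
        /* Accumulator for one output row (one xmm register in the asm). */
        float row[4] = {0.0f, 0.0f, 0.0f, 0.0f};

        for (int k = 0; k < 4; ++k) {
            /* movss + shufps ..., 0: replicate one scalar into four lanes. */
            float s = src1[i * 4 + k];
            float splat[4] = {s, s, s, s};

            /* mulps by row k of src2, then addps into the accumulator. */
            for (int lane = 0; lane < 4; ++lane)
                row[lane] += splat[lane] * src2[k * 4 + lane];
        }

        /* movlps/movhps (or movaps for the last row): store the row. */
        for (int lane = 0; lane < 4; ++lane)
            dst[i * 4 + lane] = row[lane];
    }
}
```

A couple of things worth noticing when comparing this with the assembly: the four rows of src2 are loaded with `movaps`, and the last row of dst is stored with `movaps`, so both of those pointers need to be 16-byte aligned, while src1 is only ever read one float at a time with `movss`. Also, the listing as posted stops at the final store, so a standalone routine would presumably still need a `ret` after it.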

5 Comments

Do you really need to use SIMD instructions? Might look better if you didn't have to use them.

SIMD makes things quicker. Doing the multiplication in C with Gaussian elimination takes 1074 cycles, C with Cramer's rule takes 846 cycles, and C with Cramer's rule using SIMD takes only 210 cycles. (Performance numbers are taken from the Intel document where I got the code.)

Maybe, but none of those numbers sound particularly large. Remember:
Premature optimization is the root of all evil [smile].

Still, you should probably profile before optimizing. That way, you don't waste time optimizing something that didn't really need to be optimized. You can always optimize it later if its speed is a problem.
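
One rough way to follow that advice, sketched in C: time a large number of calls to a plain scalar multiply and see whether the cost even matters for your workload. The names `multiply_4x4_scalar` and `time_multiply` are hypothetical, and `clock()` only gives coarse CPU time, so treat this as a first sanity check rather than a replacement for a real profiler.

```c
#include <assert.h>
#include <time.h>

/* Plain scalar 4x4 multiply over 16 consecutive floats per matrix;
   this is the candidate being timed. */
static void multiply_4x4_scalar(const float *a, const float *b, float *out)
{
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a[i * 4 + k] * b[k * 4 + j];
            out[i * 4 + j] = sum;
        }
    }
}

/* Returns the average CPU seconds per call over `iterations` calls. */
static double time_multiply(long iterations)
{
    float a[16], b[16], out[16];
    volatile float sink = 0.0f; /* keeps the compiler from dropping the work */

    assert(iterations > 0);
    for (int i = 0; i < 16; ++i) {
        a[i] = (float)i;
        b[i] = (float)(16 - i);
    }

    clock_t start = clock();
    for (long n = 0; n < iterations; ++n) {
        a[0] = (float)(n & 7); /* vary the input a little on each pass */
        multiply_4x4_scalar(a, b, out);
        sink += out[0];
    }
    clock_t end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC / (double)iterations;
}
```

Calling something like `time_multiply(1000000)` before and after swapping in a SIMD version gives a like-for-like number to compare, which is usually enough to tell whether the extra ugliness is paying for itself.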
