
Using SSE instructions

This topic is 4711 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I'm trying to optimize my program with SSE instructions. I've downloaded a couple of test programs that use SSE instructions just fine. However, my program causes an illegal instruction fault every time it tries to execute an SSE instruction. If anyone has any experience with this or knows any good sources of info... I'm using Visual Studio 6 (I know it's kinda old) on Windows XP.

Quote:
Original post by avianRR
However my program causes an illegal instruction fault every time it tries to execute an SSE instruction.
If anyone has any experience with this or knows any good sources of info...


Three reasons:
1. Your processor does not support SSE at all. Check it with a system information tool, e.g. SiSoft Sandra can tell you that.

2. You are using SSE the wrong way, e.g. an "aligned" instruction with non-aligned data. I assume you are talking about some sample programs, so that is probably not the case.

3. VC++ needs to have specific compile options set to use SSE. Check Configuration-Manager.

Give us more info. What instruction is causing the trouble?

Cheers.
/def
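
The first point can also be checked from code instead of with a separate tool. Below is a minimal sketch, assuming a reasonably modern compiler (the `__cpuid` intrinsic on MSVC and `__get_cpuid` on GCC/Clang are not available in Visual Studio 6). CPUID leaf 1 reports SSE in EDX bit 25 and SSE2 in EDX bit 26. The names `SimdSupport`, `decode_cpuid_edx`, `query_simd_support` and `assert_sse2_available` are made up for illustration.

```cpp
#include <cassert>
#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid
#else
#include <cpuid.h>    // __get_cpuid (GCC, Clang)
#endif

// Which SSE levels the CPU reports.
struct SimdSupport {
    bool sse;
    bool sse2;
};

// Decode the EDX value returned by CPUID leaf 1.
// Bit 25 reports SSE, bit 26 reports SSE2.
inline SimdSupport decode_cpuid_edx(unsigned int edx) {
    SimdSupport s;
    s.sse  = ((edx >> 25) & 1u) != 0;
    s.sse2 = ((edx >> 26) & 1u) != 0;
    return s;
}

// Ask the CPU this process is currently running on.
inline SimdSupport query_simd_support() {
    unsigned int edx = 0;
#if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);                       // regs = {EAX, EBX, ECX, EDX}
    edx = static_cast<unsigned int>(regs[3]);
#else
    unsigned int eax = 0, ebx = 0, ecx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        edx = 0;                            // leaf 1 unavailable: report nothing
    }
#endif
    return decode_cpuid_edx(edx);
}

// Debug-build guard to call before a code path that uses SSE2 instructions.
inline void assert_sse2_available() {
    assert(query_simd_support().sse2);
}
```

A CPU that sets only bit 25 decodes to `sse == true, sse2 == false`, so an SSE-only code path would be safe there while packed-double instructions would fault.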

Your problem is probably that your vectors are not aligned on 16-byte boundaries... is it crashing on a movaps instruction?

I've written a fair amount about this in other threads. Data alignment is really tricky: just getting the compiler and memory allocators to figure out that they need to align to 16 bytes is a huge pain. Another annoying problem is that the memory allocator does align on 8 bytes, so a lot of the time you'll think you've fixed it, but it's only working by chance.
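
For the heap side of this, here is a minimal sketch, assuming a compiler whose `<xmmintrin.h>` provides `_mm_malloc` and `_mm_free` (GCC and Clang do). The helper names `is_aligned_16`, `alloc_floats_16` and `load4_aligned` are hypothetical. The assert is there so a misaligned pointer fails every time in a debug build, rather than passing on the runs where the allocator happens to return a 16-byte-aligned block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics; also _mm_malloc/_mm_free on GCC and Clang

// True if p sits on a 16-byte boundary, as movaps requires for memory operands.
inline bool is_aligned_16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical helper: allocate `count` floats on a 16-byte boundary.
// The result must be released with _mm_free, not free or delete.
inline float* alloc_floats_16(std::size_t count) {
    return static_cast<float*>(_mm_malloc(count * sizeof(float), 16));
}

// Aligned load of four floats (movaps). The assert catches a misaligned
// pointer in debug builds instead of leaving it to chance.
inline __m128 load4_aligned(const float* p) {
    assert(is_aligned_16(p));
    return _mm_load_ps(p);
}
```

Note that stepping one float into an aligned buffer (`buf + 1`) is only 4-byte aligned, so such pointers need `_mm_loadu_ps` (movups) instead.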

Quote:
Original post by deffer
Three reasons:
1. Your processor does not support SSE at all. Check it with a system information tool, e.g. SiSoft Sandra can tell you that.

If you read my original post correctly, you'll notice I said I used other programs that worked just fine. For example...
http://www.psychology.nottingham.ac.uk/staff/cr1/simd.html
Quote:
Original post by deffer
2. You are using SSE the wrong way, e.g. an "aligned" instruction with non-aligned data. I assume you are talking about some sample programs, so that is probably not the case.

3. VC++ needs to have specific compile options set to use SSE. Check Configuration-Manager.

Give us more info. What instruction is causing the trouble?

I get the error on all of these instructions.

movupd xmm0,xmm1
movupd xmm0,[ebx]
movupd xmm1,[eax]
subpd xmm0,[eax]
movapd [ecx],xmm0

While this code in the exact same program runs just fine.

__m128 a, b, c;
a = _mm_set_ps(4, 3, 2, 1);
b = _mm_set_ps(4, 3, 2, 1);
c = _mm_set_ps(0, 0, 0, 0);
c = _mm_mul_ps(a, b);

So the compiler must not be the problem...
Am I using the instructions right?

After a quick review of what I'm doing...

OK, I got this working (sorta).

--- from the disassembler window
221: movaps xmm0,xmmword ptr [a]
004138D1 0F 28 45 F0 movaps xmm0,xmmword ptr [ebp-10h]
222: movapd xmm0,xmmword ptr [a]
004138D5 66 0F 28 45 F0 movapd xmm0,xmmword ptr [ebp-10h]

Line 221 works; line 222 causes an error on both my Pentium III and my Athlon XP systems.

Anyone got a clue?

You are using movapd. You need your data to be aligned on 16-byte boundaries. Use movupd, or use __declspec(align(16)) to align your vectors. Oh OK, this is not the problem.

Is "a" an array of floats or doubles?

[Edited by - vNistelrooy on January 18, 2005 4:38:44 AM]

Oh man, double-precision packed vectors are SSE2, and neither of your CPUs (P3 and Athlon XP) supports SSE2. Either use floats and movaps, or don't use SSE if you need doubles.

[Edited by - vNistelrooy on January 18, 2005 4:47:26 AM]
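
One way the "use floats" suggestion might look, written with intrinsics rather than inline assembly. This is only a sketch: `_mm_loadu_ps`, `_mm_sub_ps` and `_mm_storeu_ps` come from `<xmmintrin.h>` and need SSE only, while their packed-double counterparts (`_mm_sub_pd` and friends) are declared in `<emmintrin.h>` and need SSE2. The function name `sub4` is hypothetical.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE only: packed single-precision

// out[i] = a[i] - b[i] for four floats.
// Unaligned loads and stores (movups) are used so the pointers need no
// particular alignment; switch to _mm_load_ps/_mm_store_ps (movaps)
// once the data is known to be 16-byte aligned.
inline void sub4(const float* a, const float* b, float* out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_sub_ps(va, vb));
}
```

An XMM register holds four floats but only two doubles, so the float version also processes twice as many elements per instruction.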

Quote:
Original post by vNistelrooy
Oh man, double-precision packed vectors are SSE2, and neither of your CPUs (P3 and Athlon XP) supports SSE2. Either use floats and movaps, or don't use SSE if you need doubles.


That was it. The reference I was using for the instructions just says SSE. It doesn't mention SSE2, so I didn't know the exact instructions I was using were SSE2.
Now the code works; I just need to figure out how to force the struct to be created on the right boundary.

thanks,
rob
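
For the struct question, a possible sketch using the standard C++11 `alignas` specifier, which postdates Visual Studio 6; Microsoft compilers also accept the `__declspec(align(16))` spelling mentioned earlier in the thread. The `Vec4` type and `sub` function are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>

// Hypothetical 4-component vector, forced onto a 16-byte boundary.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// Subtract two vectors using aligned loads and stores (movaps).
inline Vec4 sub(const Vec4& a, const Vec4& b) {
    // The type guarantees this for automatic and static objects;
    // the asserts guard against pointers cast from unaligned storage.
    assert(reinterpret_cast<std::uintptr_t>(&a) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(&b) % 16 == 0);
    Vec4 r;
    _mm_store_ps(&r.x, _mm_sub_ps(_mm_load_ps(&a.x), _mm_load_ps(&b.x)));
    return r;
}
```

This covers objects the compiler places itself. Before C++17, plain `new Vec4` is not guaranteed to honor the extra alignment, so heap-allocated objects may still need an aligned allocator.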
