ttigue

SSE2 instruction causes access violation.


Hey,

I'm trying to experiment with SSE2 instructions. I am programming in Visual C++ 2003 (and have also tried this in Visual C++ 2005 Express). I am coding on a Pentium IV computer. I use the 'cpuid' instruction to check that SSE2 instructions are supported. However, my program throws an access violation whenever it hits an SSE2 instruction:

Unhandled exception at 0x00411983 in QuickEnum.exe: 0xC0000005: Access violation reading location 0xffffffff.

-----------------------------
unsigned char data[16];
_asm {
    movdqa xmm0, [data]
    pslldq xmm0, 1
    movdqa [data], xmm0
}
-----------------------------
(Sorry, I couldn't figure out how to embed code)

I just tried this on a Release target and the code worked properly. Is there a reason that Visual Studio does not allow SSE2 instructions in Debug Mode?

Thanks,
Tyler

Quote:
Original post by gumpy macdrunken
is the data aligned on a 16-byte boundary?

Yup, you have to manually align the variable on the stack to a 16-byte boundary. Or alternatively use movdqu to access it (with a performance penalty).

Thus something like this should work (feel free to wrap it up in a template/macro):
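// Over-allocate by 15 bytes, then round an interior address down to a 16-byte boundary.
unsigned char data_buf[16+15];
unsigned char *data = (unsigned char *) ((uintptr_t) &data_buf[15] & ~(uintptr_t) 15);
edit: Perhaps this would work:
// alignment must be a power of two.
template <typename T, size_t alignment = 16>
class aligned {
    enum { MASK = alignment - 1 };
    // Raw storage with enough slack to find an aligned address inside it.
    char _data[sizeof(T) + MASK];

    // Rounds an interior address down to the alignment boundary.
    // Note: T's constructor and destructor are never run on this storage.
    T *instance() {
        return (T *) ((uintptr_t) &_data[MASK] & ~(uintptr_t) MASK);
    }

public:
    operator T *() {
        return instance();
    }

    T *operator ->() {
        return instance();
    }
};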
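
The rounding used in both snippets above can be checked in isolation. What follows is only a minimal sketch: the helper names (align_down, is_aligned, aligned16_in) are made up for illustration and are not from the thread, and it assumes a compiler that provides <cstdint>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds p down to the previous multiple of alignment (must be a power of two).
inline unsigned char *align_down(unsigned char *p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    return reinterpret_cast<unsigned char *>(addr & ~mask);
}

// True if p sits on a multiple of alignment (a power of two).
inline bool is_aligned(const void *p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Same trick as the two-line snippet: start 15 bytes into a 31-byte buffer
// and round down, which leaves 16 usable bytes at a 16-byte-aligned address.
inline unsigned char *aligned16_in(unsigned char (&buf)[16 + 15]) {
    return align_down(&buf[15], 16);
}
```

Rounding down from &buf[15] moves the pointer back by at most 15 bytes, so the result never falls before the start of the buffer, and the 16 bytes after it always end at or before the end of the buffer.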

Quote:
Original post by doynax
Quote:
Original post by gumpy macdrunken
is the data aligned on a 16-byte boundary?

Yup, you have to manually align the variable on the stack to a 16-byte boundary. Or alternatively use movdqu to access it (with a performance penalty).

Thus something like this should work (feel free to wrap it up in a template/macro):
unsigned char data_buf[16+15];
unsigned char *data = (unsigned char *) ((uintptr_t) &data_buf[15] & ~15);
edit: Perhaps this would work:
template <typename T, size_t alignment = 16>
class aligned {
    enum { MASK = alignment - 1 };
    char _data[sizeof(T) + MASK];

    T *instance() const {
        const char *null = NULL;
        return (T *) ((uintptr_t) &_data[MASK] & ~MASK);
    }

public:
    operator T *() const {
        return instance();
    }

    T *operator ->() const {
        return instance();
    }
};


with microsoft compilers you can just use this.
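
The link behind "this" did not survive in this copy of the thread. Judging from the replies, it presumably pointed at Microsoft's __declspec(align(N)) directive. Assuming so, usage would look roughly like the sketch below. ALIGN16 and the function name are hypothetical, and the non-Microsoft branch uses GCC's attribute spelling only so that the sketch compiles elsewhere; how well that attribute handled stack variables on GCC at the time is discussed in the replies that follow.

```cpp
#include <cstdint>

// ALIGN16 is a hypothetical convenience macro, not something from the thread.
#if defined(_MSC_VER)
#  define ALIGN16 __declspec(align(16))         // Microsoft compilers
#else
#  define ALIGN16 __attribute__((aligned(16)))  // GCC-style spelling
#endif

// Declares a local 16-byte buffer with the alignment directive and reports
// whether its address really is a multiple of 16.
inline bool aligned_local_is_16_byte_aligned() {
    ALIGN16 unsigned char data[16] = {0};
    return (reinterpret_cast<std::uintptr_t>(data) & 15) == 0;
}
```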

I aligned the data to 16 bytes and it worked properly. Does the Release Mode automatically align the data members? Maybe that is why I was not getting an error in Release Mode.

Thanks for your help.

Tyler

Quote:
Original post by ttigue
I aligned the data to 16 bytes and it worked properly. Does the Release Mode automatically align the data members? Maybe that is why I was not getting an error in Release Mode.
No, you were probably just lucky.
The stack is normally 4-byte aligned, so you've got a 25% chance of success, provided that you only call the function from one path in the call hierarchy.
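
The 25% figure is just counting: of the 4-byte-aligned addresses, every fourth one is also 16-byte aligned. A small sketch of that arithmetic, assuming every 4-byte-aligned address is equally likely (count_aligned is a hypothetical helper, not from the thread):

```cpp
#include <cstddef>
#include <cstdint>

// Counts the addresses in [base, base + span) that are multiples of alignment.
inline unsigned count_aligned(std::uintptr_t base, std::uintptr_t span,
                              std::size_t alignment) {
    unsigned n = 0;
    for (std::uintptr_t a = base; a < base + span; ++a) {
        if (a % alignment == 0) {
            ++n;
        }
    }
    return n;
}
```

Over any 64-byte window that starts on a 16-byte boundary there are 16 addresses aligned to 4 bytes and 4 aligned to 16 bytes, hence one in four.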

Beware that SSE-2's 128-bit MMX is, in my experience, generally not worth the effort compared to plain old 64-bit MMX. At least I couldn't manage to get more than a 10-20% speedup at best (but none at all most of the time).
Quote:
Original post by gumpy macdrunken
with microsoft compilers you can just use this.
Unfortunately GCC's alignment directive doesn't handle the stack at all so you still have to use hacks like this when writing portable code.
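
For code that has to build with both compilers, the unaligned-access route mentioned earlier in the thread (movdqu) can also be written with SSE2 intrinsics instead of inline assembly. This is only a sketch under assumptions: it needs <emmintrin.h> and an x86 target with SSE2 enabled, and the function name is made up for illustration.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics

// Performs the operation from the original question (shift the 16 bytes
// at data left by one byte position), but with unaligned loads and stores,
// so data may sit at any address.
inline void shift_left_one_byte(unsigned char *data) {
    assert(data != nullptr);
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data));  // movdqu load
    v = _mm_slli_si128(v, 1);                                              // pslldq by 1 byte
    _mm_storeu_si128(reinterpret_cast<__m128i *>(data), v);                // movdqu store
}
```

Because the load and store are the unaligned variants, this does not fault on a misaligned buffer; the cost is the performance penalty already mentioned for movdqu.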

Quote:
Original post by doynax
Unfortunately GCC's alignment directive doesn't handle the stack at all so you still have to use hacks like this when writing portable code.

i wasn't aware of that. #pragma pack is supported by both compilers. could this be used instead?

Quote:
Original post by gumpy macdrunken
Quote:
Original post by doynax
Unfortunately GCC's alignment directive doesn't handle the stack at all so you still have to use hacks like this when writing portable code.

i wasn't aware of that. #pragma pack is supported by both compilers. could this be used instead?
I don't see how. That directive only controls padding between structure members, which the compiler already handles properly for 128-bit datatypes.
The real problem is that GCC doesn't align the instances themselves when you allocate them on the stack.

To do this properly, a compiler would have to add extra code to adjust the stack frame of such functions, which takes about two instructions and forces the function to use a stack frame register.
lea ebp,[esp-frame-16]
and ebp,-16
sub esp,frame+16
