fathom88

More Newbie SSE Questions (Sorry)



I was playing around with my code. Part of it uses an algorithm that works on a large array table. I found a dramatic difference in performance between the debug version and the release version, which surprised me, because in the past it never seemed to make much of a difference. However, at first I did not notice a difference between my SSE version and the pure (non-SSE) version with VC++. A difference only showed up once I increased the load (my algorithm can be adjusted); I had to triple the load before I saw it.

Here is my question: how can these two be the same? I cut and pasted them from an example.

int i;
float* pSource1 = pArray1;
float* pSource2 = pArray2;
float* pDest = pResult;

for ( i = 0; i < nSize; i++ )
{
    // do some calcs
    pSource1++;
    pSource2++;
    pDest++;
}

// SSE version
int nLoop = nSize / 4;
__m128 m1, m2, m3, m4;
__m128* pSrc1 = (__m128*) pArray1;
__m128* pSrc2 = (__m128*) pArray2;
__m128* pDest = (__m128*) pResult;

for ( int i = 0; i < nLoop; i++ )
{
    // do some calcs
    pSrc1++;
    pSrc2++;
    pDest++;
}

Aren't you missing some values when you loop nSize/4 times instead of nSize times? Does incrementing the pointer with ++ move to the next element in the array, or to the fourth one? Sorry if it's a stupid question. I'm a total newbie. Thanks for any replies.

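The SSE version processes 4 items in each iteration of the loop, so you only need n/4 iterations. If n%4 != 0 you have a problem: the last n%4 items are never touched by the SSE loop and have to be handled separately, for example with a plain scalar loop.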

++pointer advances the pointer to the first position after the thing currently pointed at. As a __m128 contains 4 floats, ++pointer advances the pointer by 4 floats.
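To make the n%4 case concrete, here is a minimal sketch of an SSE loop followed by a scalar tail loop. The "calcs" in the original example are not shown, so this assumes a simple element-wise addition; the function name add_arrays_sse is made up for illustration. It uses the unaligned load/store intrinsics so that it does not depend on the arrays being 16-byte aligned, which the pointer-cast version in the question does require.

```cpp
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps

// Hypothetical helper (not from the original post):
// result[i] = a[i] + b[i] for any n, not only multiples of 4.
void add_arrays_sse(const float* a, const float* b, float* result, std::size_t n)
{
    const std::size_t vecCount = n / 4;  // number of complete groups of 4 floats

    for (std::size_t i = 0; i < vecCount; ++i) {
        // Unaligned loads/stores work for any float pointer.
        __m128 va = _mm_loadu_ps(a + 4 * i);
        __m128 vb = _mm_loadu_ps(b + 4 * i);
        _mm_storeu_ps(result + 4 * i, _mm_add_ps(va, vb));
    }

    // Scalar tail: the last n % 4 elements that did not fill a whole __m128.
    for (std::size_t i = 4 * vecCount; i < n; ++i) {
        result[i] = a[i] + b[i];
    }
}
```

With n = 7, for instance, the SSE loop runs once and covers elements 0 to 3, and the tail loop covers elements 4 to 6.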

In other words, ++pointer advances the pointer by sizeof(*pointer) bytes. Now, if we have an array of floats, ++pointer simply advances it to point to the next float. But here:

__m128* pSrc1 = (__m128*) pArray1;
__m128* pSrc2 = (__m128*) pArray2;
__m128* pDest = (__m128*) pResult;

we're effectively telling the compiler to think that the array is not made of floats, but of __m128's. This works because a __m128 internally consists of four floats - thus, sizeof(__m128) == 4 * sizeof(float) and incrementing the pointer once jumps four floats (but only one __m128) forward.
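The pointer-step rule can be checked with a small sketch. To keep it portable, it uses a stand-in struct called Float4 (an assumption for illustration, not a real SSE type) that has the same size and alignment as a __m128: four packed floats, 16 bytes. The helper floats_per_step is likewise hypothetical.

```cpp
#include <cstddef>

// Stand-in for __m128 (assumption for illustration): four packed floats, 16-byte aligned.
struct alignas(16) Float4 { float v[4]; };

static_assert(sizeof(Float4) == 4 * sizeof(float), "Float4 must hold exactly four floats");

// Returns how many floats are skipped when a pointer of type T* is incremented once.
template <typename T>
std::size_t floats_per_step(const T* p)
{
    const char* before = reinterpret_cast<const char*>(p);
    const char* after  = reinterpret_cast<const char*>(p + 1);  // same address as ++p would give
    return static_cast<std::size_t>(after - before) / sizeof(float);
}
```

Called with a float* this reports a step of one float; called with a Float4* it reports a step of four floats, which is why the SSE loop only needs nSize/4 iterations.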
