v.mommersteeg

Help with SIMD


I'm trying to learn SIMD and tried to optimize the function below, but I can't quite get it to work. I think I may be overlooking something.

This is the original function without any SIMD which I'm trying to convert.



void MetalBall::Tick()
{
    if ((m_X -= .2f) < -50)
    {
        m_X = SCRWIDTH * 4;
    }
    m_Sprite->Draw( (int)m_X, (int)m_Y, m_Surface );
    Pixel* src1 = m_Sprite->GetBuffer() - 1;
    for ( int y = 0; y < 50; y++ )
    {
        int ty = (int)m_Y + y;
        if((ty < 0) || (ty >= SCRHEIGHT))
        {
            src1 += 50;
            continue;
        }
        float dy = a[y];
        int sy = (int)(((m_Y + 25) + (int)(sinxy[y] * dy) + SCRHEIGHT)) % SCRHEIGHT;

        for ( int x = 0; x < 50; x++ )
        {
            src1++;
            int tx = (int)m_X + x;
            if ((tx < 0) || (tx >= SCRWIDTH))
            {
                continue;
            }
            if (!(*src1 & 0xffffff))
            {
                continue;
            }
            float dx = a[x];
            float l = presqrt[y][x];
            int sx = (int)(((m_X + 25) + (int)(sinxy[y] * dx) + SCRWIDTH)) % SCRWIDTH;

            Pixel* src2 = m_Surface->GetBuffer() + sx + sy * m_Surface->GetPitch();
            Pixel* dst = m_Surface->GetBuffer() + tx + ty * m_Surface->GetPitch();
            *dst = AddBlend( *src1, *src2 & 0xffff00 );
        }
    }
}



This is my attempt at a SIMD version.


void MetalBall::Tick()
{
    if ((m_X -= .2f) < -50)
    {
        m_X = SCRWIDTH * 4;
    }

    m_Sprite->Draw( (int)m_X, (int)m_Y, m_Surface );
    Pixel* src1 = m_Sprite->GetBuffer() - 1;
    static union{float my[4]; __m128 my4;};
    static union{float dy[4]; __m128 dy4;};
    static union{int ty[4]; __m128i ty4;};
    static union{int tx[4]; __m128i tx4;};
    static union{float mx[4]; __m128 mx4;};
    static union{float dx[4]; __m128 dx4;};
    mx4 = _mm_add_ps(_mm_set_ps1(m_X), _mm_set_ps1(25));
    my4 = _mm_add_ps(_mm_set_ps1(m_Y), _mm_set_ps1(25));

    for ( int y = 0; y < 50/4; y+=4)
    {
        ty4 = _mm_add_epi32(_mm_set1_epi32(m_Y), _mm_set1_epi32(y));

        for(int k=0; k<4; k++)
        {
            if((ty[k] < 0) || (ty[k] >= SCRHEIGHT))
            {
                src1 += 50;
                continue;
            }

            dy4 = _mm_set_ps1(a[y]);

            int sy = (int)((my[k] + (int)(sinxy[y] * dy[k]) + SCRHEIGHT)) %SCRHEIGHT;

            for ( int x = 0; x < 50/4; x+=4 )
            {
                src1++;

                tx4 = _mm_add_epi32(_mm_set1_epi32(m_X), _mm_set1_epi32(x));

                if ((tx[k] < 0) || (tx[k] >= SCRWIDTH) )
                {
                    continue;
                }
                if (!(*src1 & 0xffffff))
                {
                    continue;
                }

                dx4 = _mm_set_ps1(a[x]);

                float l = presqrt[y][x];

                int sx = (int)((mx[k] + (int)(sinxy[x] * dx[k]) + SCRWIDTH)) %SCRWIDTH;

                Pixel* src2 = m_Surface->GetBuffer() + sx + sy * m_Surface->GetPitch();
                Pixel* dst = m_Surface->GetBuffer() + tx[k] + ty[k] * m_Surface->GetPitch();

                *dst = AddBlend( *src1, *src2 & 0xffff00 );
            }
        }
    }
}

If someone would be kind enough to help me out, I would be very grateful.
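
Two things in the attempt above are worth checking before anything else. First, `for (int y = 0; y < 50/4; y+=4)` only visits y = 0, 4 and 8, because 50/4 is 12 in integer arithmetic and the step is already 4; the x loop has the same shape. Second, `_mm_set1_epi32` and `_mm_set_ps1` broadcast one value to all four lanes, so `tx[k]` and `ty[k]` hold the same number for every k. The following is only a minimal sketch of the lane-offset idea, not a port of `Tick()`: it puts four consecutive x coordinates in one register, tests them against the screen bounds, and handles the two leftover pixels (50 is not a multiple of 4) with a scalar tail. `VisibleLaneMask`, `CountVisibleInRow` and `kSpriteSize` are hypothetical names introduced for illustration, and the screen width is passed in rather than taken from `SCRWIDTH`.

```cpp
#include <emmintrin.h>  // SSE2 integer intrinsics

// Assumption: the sprite is 50 pixels wide, as in the loops above.
constexpr int kSpriteSize = 50;

// Bit i of the result is set when (firstX + i) lies in [0, width).
inline int VisibleLaneMask(int firstX, int width)
{
    // Lane i holds firstX + i. A plain _mm_set1_epi32 would instead
    // put the same value in every lane.
    const __m128i tx = _mm_add_epi32(_mm_set1_epi32(firstX),
                                     _mm_setr_epi32(0, 1, 2, 3));
    const __m128i geZero  = _mm_cmpgt_epi32(tx, _mm_set1_epi32(-1));    // tx > -1
    const __m128i ltWidth = _mm_cmplt_epi32(tx, _mm_set1_epi32(width)); // tx < width
    // Collapse the per-lane all-ones/all-zeros results into 4 bits.
    return _mm_movemask_ps(_mm_castsi128_ps(_mm_and_si128(geZero, ltWidth)));
}

// Counts how many of the kSpriteSize pixels in one sprite row land on screen.
inline int CountVisibleInRow(int baseX, int width)
{
    int visible = 0;
    int x = 0;
    // Full groups of four: x = 0, 4, ..., 44.
    for (; x + 4 <= kSpriteSize; x += 4)
    {
        const int mask = VisibleLaneMask(baseX + x, width);
        for (int lane = 0; lane < 4; ++lane)
        {
            visible += (mask >> lane) & 1;
        }
    }
    // Scalar tail for the remaining pixels (x = 48, 49).
    for (; x < kSpriteSize; ++x)
    {
        const int tx = baseX + x;
        if (tx >= 0 && tx < width)
        {
            ++visible;
        }
    }
    return visible;
}
```

In a real port the mask would decide which of the four pixels get blended instead of being counted, and the per-pixel `continue` statements would become mask tests. Whether that ends up faster than the scalar loop depends on how the scattered `src2` reads and `AddBlend` are handled, which this sketch does not address.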

Friendly tip: put your code between [code ][/code ] tags (without the spaces). Before you do that though, copy your code to a plain text editor (like Notepad) so that formatting information is removed (also convert tabs to spaces, as the tabs won't show up on GameDev), then copy it from your plain text editor into your post, and surround it with the code tags. The easier it is for someone to read the code, the easier it is for them to help you.

If you don't copy it into a plain text editor first, you can end up with something like this:
[spoiler]It's in code tags, but it's not very pretty. Notice the URL had a [url ] tag placed on it, and the tabs in my original code don't show up, so my code isn't indented.
// Nevermind the potential incorrectness of the code...
class Sphere
{
public:
float radius;
Vector3f position;

bool intersect(const Ray& ray)
{
// from http://wiki.cgsociet...re_Intersection
const Vector3f t = ray.start - position;
float a = ray.direction.squaredLength();
float b = 2.0f * ray.direction.dot(t);
float c = t.squaredLength() - radius * radius;

float inside = b * b - 4.0f * a * c;
if (inside < 0)
{
return false;
}

// This increases floating point precision
float q = (b < 0) ? (-b + std::sqrt(inside)) / 2.0f : (-b - std::sqrt(inside)) / 2.0f;

float t0 = q / a; // first intersection
float t1 = c / q; // second intersection

return true;
}
};
[/spoiler]

[edit]

Ah, you added code tags while I was writing this. Now for the formatting :)

[edit edit]: And sorry, I'm no help with the SIMD.
