# Help with SIMD

I'm trying to learn SIMD, and I was trying to optimize the function below, but I can't quite get it to work. I think I may be overlooking something.

This is the original function, without any SIMD, that I'm trying to convert.

    void MetalBall::Tick()
    {
        if ((m_X -= .2f) < -50)
        {
            m_X = SCRWIDTH * 4;
        }
        m_Sprite->Draw( (int)m_X, (int)m_Y, m_Surface );
        Pixel* src1 = m_Sprite->GetBuffer() - 1;
        for ( int y = 0; y < 50; y++ )
        {
            int ty = (int)m_Y + y;
            if ((ty < 0) || (ty >= SCRHEIGHT))
            {
                src1 += 50;
                continue;
            }
            float dy = a[y];
            int sy = (int)(((m_Y + 25) + (int)(sinxy[y] * dy) + SCRHEIGHT)) % SCRHEIGHT;
            for ( int x = 0; x < 50; x++ )
            {
                src1++;
                int tx = (int)m_X + x;
                if ((tx < 0) || (tx >= SCRWIDTH))
                {
                    continue;
                }
                if (!(*src1 & 0xffffff))
                {
                    continue;
                }
                float dx = a[x];
                float l = presqrt[y][x];
                int sx = (int)(((m_X + 25) + (int)(sinxy[y] * dx) + SCRWIDTH)) % SCRWIDTH;
                Pixel* src2 = m_Surface->GetBuffer() + sx + sy * m_Surface->GetPitch();
                Pixel* dst = m_Surface->GetBuffer() + tx + ty * m_Surface->GetPitch();
                *dst = AddBlend( *src1, *src2 & 0xffff00 );
            }
        }
    }

This is my attempt at a SIMD version.

    void MetalBall::Tick()
    {
        if ((m_X -= .2f) < -50)
        {
            m_X = SCRWIDTH * 4;
        }
        m_Sprite->Draw( (int)m_X, (int)m_Y, m_Surface );
        Pixel* src1 = m_Sprite->GetBuffer() - 1;
        static union{float my[4]; __m128 my4;};
        static union{float dy[4]; __m128 dy4;};
        static union{int ty[4]; __m128i ty4;};
        static union{int tx[4]; __m128i tx4;};
        static union{float mx[4]; __m128 mx4;};
        static union{float dx[4]; __m128 dx4;};
        mx4 = _mm_add_ps(_mm_set_ps1(m_X), _mm_set_ps1(25));
        my4 = _mm_add_ps(_mm_set_ps1(m_Y), _mm_set_ps1(25));
        for ( int y = 0; y < 50/4; y+=4)
        {
            ty4 = _mm_add_epi32(_mm_set1_epi32(m_Y), _mm_set1_epi32(y));
            for(int k=0; k<4; k++)
            {
                if((ty[k] < 0) || (ty[k] >= SCRHEIGHT))
                {
                    src1 += 50;
                    continue;
                }
                dy4 = _mm_set_ps1(a[y]);
                int sy = (int)((my[k] + (int)(sinxy[y] * dy[k]) + SCRHEIGHT)) %SCRHEIGHT;
                for ( int x = 0; x < 50/4; x+=4 )
                {
                    src1++;
                    tx4 = _mm_add_epi32(_mm_set1_epi32(m_X), _mm_set1_epi32(x));
                    if ((tx[k] < 0) || (tx[k] >= SCRWIDTH) )
                    {
                        continue;
                    }
                    if (!(*src1 & 0xffffff))
                    {
                        continue;
                    }
                    dx4 = _mm_set_ps1(a[x]);
                    float l = presqrt[y][x];
                    int sx = (int)((mx[k] + (int)(sinxy[x] * dx[k]) + SCRWIDTH)) %SCRWIDTH;
                    Pixel* src2 = m_Surface->GetBuffer() + sx + sy * m_Surface->GetPitch();
                    Pixel* dst = m_Surface->GetBuffer() + tx[k] + ty[k] * m_Surface->GetPitch();
                    *dst = AddBlend( *src1, *src2 & 0xffff00 );
                }
            }
        }
    }
If someone would be kind enough to help me out, I would be very grateful.

##### Share on other sites
Friendly tip: put your code between [code ][/code ] tags (without the spaces). Before you do that though, copy your code to a plain text editor (like Notepad) so that formatting information is removed (also convert tabs to spaces, as the tabs won't show up on GameDev), then copy it from your plain text editor into your post, and surround it with the code tags. The easier it is for someone to read the code, the easier it is for them to help you.

If you don't copy it into a plain text editor first, you can end up with something like this:
[spoiler]It's in code tags, but it's not very pretty. Notice the URL had a [url ] tag placed on it, and the tabs in my original code don't show up, so my code isn't indented.
// Nevermind the potential incorrectness of the code... class Sphere { public: float radius; Vector3f position; bool intersect(const Ray& ray) { // from http://wiki.cgsociet...re_Intersection const Vector3f t = ray.start - position; float a = ray.direction.squaredLength(); float b = 2.0f * ray.direction.dot(t); float c = t.squaredLength() - radius * radius; float inside = b * b - 4.0f * a * c; if (inside < 0) { return false; } // This increases floating point preceision float q = (b < 0) ? (-b + std::sqrt(inside)) / 2.0f : (-b - std::sqrt(inside)) / 2.0f; float t0 = q / a; // first intersection float t1 = c / q; // second intersection return true; } };[/spoiler]

Ah, you added code tags while I was writing this. Now for the formatting... [edit]: And sorry, I'm no help with the SIMD.
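On the SIMD question itself, two things in the attempt might be worth checking. `_mm_set1_epi32(x)` puts the same value in all four lanes, so `tx[0]` through `tx[3]` are identical rather than four adjacent pixels. And a loop written as `x < 50/4; x += 4` only visits 0, 4 and 8, because `50/4` is 12. Below is a minimal sketch of just the `tx` bounds test done four pixels at a time, assuming SSE2 on x86. The names `kScreenWidth` (standing in for `SCRWIDTH`) and `VisibleLaneMask` are made up for illustration and are not from the original code.

```cpp
#include <emmintrin.h> // SSE2 intrinsics (also pulls in the SSE header)
#include <cassert>

// Hypothetical stand-in for the SCRWIDTH constant used in the post.
constexpr int kScreenWidth = 640;

// Computes tx = baseX + x + {0,1,2,3} for four adjacent pixels, stores the
// four values in txOut, and returns a 4-bit mask in which bit i is set when
// lane i satisfies 0 <= tx < kScreenWidth.
int VisibleLaneMask(int baseX, int x, int txOut[4])
{
    assert(txOut != nullptr);

    // _mm_set_epi32 takes its arguments from the highest lane down,
    // so lane 0 receives 0 and lane 3 receives 3.
    const __m128i laneOffsets = _mm_set_epi32(3, 2, 1, 0);
    const __m128i tx = _mm_add_epi32(_mm_set1_epi32(baseX + x), laneOffsets);

    // tx >= 0 is expressed as tx > -1, since SSE2 only offers gt/lt/eq.
    const __m128i geZero  = _mm_cmpgt_epi32(tx, _mm_set1_epi32(-1));
    const __m128i ltWidth = _mm_cmplt_epi32(tx, _mm_set1_epi32(kScreenWidth));
    const __m128i inRange = _mm_and_si128(geZero, ltWidth);

    // Unaligned store, so txOut needs no special alignment.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(txOut), tx);

    // Each comparison lane is all ones or all zeros; movemask collects the
    // top bit of every 32-bit lane into the low four bits of an int.
    return _mm_movemask_ps(_mm_castsi128_ps(inRange));
}
```

In the inner loop this would be called as `VisibleLaneMask((int)m_X, x, tx)` with `x` stepping 0, 4, 8, ... up to 48. A mask of 0 means the whole group can be skipped. Since 50 is not a multiple of 4, the last group covers x = 48 to 51, so the two lanes past the sprite edge would still need to be masked out or handled by a scalar tail.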