SSE confusion !

Started by
10 comments, last by sepul 18 years, 1 month ago
Quote: Original post by bpoint
SSE does not work well horizontally. For calculating the dot product, you have to add X, Y, and Z across a register, which can only be done by shuffling. If you have SSE3, there is a single opcode which does this for you, though. (can't remember it off the top of my head...)


HADDPS.
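For comparison, here is a minimal sketch of the shuffle-based horizontal add bpoint describes for hardware without SSE3, written with SSE1 intrinsics rather than inline asm. The `Vec4` struct and the `dot_sse1` name are made up for illustration (nothing here comes from the original poster's vect class), and the sketch assumes 16-byte aligned data with the 4th component set to zero.

```cpp
#include <xmmintrin.h>  // SSE1 intrinsics

// Hypothetical stand-in for a 4-float vector class.
// Assumption: 16-byte aligned, w is kept at 0 for 3D vectors.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// Dot product using only SSE1: one multiply, then a horizontal sum via shuffles.
inline float dot_sse1(const Vec4& a, const Vec4& b) {
    // p = (ax*bx, ay*by, az*bz, aw*bw)
    __m128 p = _mm_mul_ps(_mm_load_ps(&a.x), _mm_load_ps(&b.x));

    // Swap neighbours within each pair: (p1, p0, p3, p2)
    __m128 swapped = _mm_shuffle_ps(p, p, _MM_SHUFFLE(2, 3, 0, 1));

    // sums = (p0+p1, p0+p1, p2+p3, p2+p3)
    __m128 sums = _mm_add_ps(p, swapped);

    // Bring the upper pair down to the low lanes: (p2+p3, p2+p3, p2+p3, p2+p3)
    __m128 upper = _mm_movehl_ps(sums, sums);

    // Lane 0 now holds p0+p1+p2+p3
    return _mm_cvtss_f32(_mm_add_ss(sums, upper));
}
```

Calling `dot_sse1(a, b)` with `a = {1, 2, 3, 0}` and `b = {4, 5, 6, 0}` should give 32. The two shuffles plus two adds are the "horizontal" work that SSE has no cheap way around.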

You mean that by using SSE3 instructions, we can optimize a single dot product to something like this?
(This assumes the vector's 4th component is zero; I'm not very experienced with SIMD programming.)

	inline float operator*( const vect& v ) const
	{
		float r;
		_asm
		{
			mov esi, this
			mov edi, v
			movaps xmm0, [esi]
			mulps xmm0, [edi]     // xmm0 = (x*v.x, y*v.y, z*v.z, 0)
			haddps xmm0, xmm0     // xmm0 = (x*v.x + y*v.y, z*v.z, x*v.x + y*v.y, z*v.z)
			haddps xmm0, xmm0     // xmm0 = (x*v.x + y*v.y + z*v.z, ...)
			movss r, xmm0
		}
		return r;
	}


I don't have an SSE3 processor, so I can't test it, but do you think this piece of code would perform better than the normal dot product code?
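For anyone who wants to try the idea without inline asm, the same multiply plus two HADDPS sequence can be sketched with intrinsics. This is only an illustration under stated assumptions: `Vec4`, `dot_sse3` and the `SSE3_TARGET` macro are hypothetical names, the data is assumed to be 16-byte aligned with a zero 4th component, and the target attribute is there only so GCC/Clang accept `_mm_hadd_ps` without `-msse3`. Whether it actually beats the plain version is something only a benchmark on the target CPU can answer.

```cpp
#include <pmmintrin.h>  // SSE3 intrinsics (_mm_hadd_ps)

// Assumption: on GCC/Clang, enable SSE3 for this one function via a target
// attribute; other compilers (e.g. MSVC) need no annotation for intrinsics.
#if defined(__GNUC__) || defined(__clang__)
#define SSE3_TARGET __attribute__((target("sse3")))
#else
#define SSE3_TARGET
#endif

// Hypothetical stand-in for the vect class in the post.
// Assumption: 16-byte aligned, w is kept at 0.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// Dot product with two horizontal adds, mirroring the asm in the post.
SSE3_TARGET inline float dot_sse3(const Vec4& a, const Vec4& b) {
    // p = (ax*bx, ay*by, az*bz, aw*bw), with aw*bw == 0 by assumption
    __m128 p = _mm_mul_ps(_mm_load_ps(&a.x), _mm_load_ps(&b.x));

    // First HADDPS: (p0+p1, p2+p3, p0+p1, p2+p3)
    p = _mm_hadd_ps(p, p);

    // Second HADDPS: every lane holds p0+p1+p2+p3
    p = _mm_hadd_ps(p, p);

    // Extract lane 0, like "movss r, xmm0"
    return _mm_cvtss_f32(p);
}
```

Running this still requires a CPU that supports SSE3, just like the asm version.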

dark-hammer engine - http://www.hmrengine.com

