
SSE vector normalization



#1 masterbubu   Members   -  Reputation: 157


Posted 16 July 2012 - 12:11 AM

Hi,

This is the function I use to normalize a vec3:

inline const CVector3SSE& CVector3SSE::Normalize()
{
  m_fValsSSE = _mm_mul_ps(m_fValsSSE, _mm_rsqrt_ps(_mm_dp_ps(m_fValsSSE, m_fValsSSE, 0x7F)));
  return *this;
}

The problem is that the squared length passed to the rsqrt can be zero.

I want to first check that the vector is not the zero vector, and only then calculate the sqrt:

if ( *this != ZERO_VEC)
  normalize

I have thought about summing all the components in the register and checking the result, but I'm not sure how to do that.

Any solution is welcome.

Thanks

Edited by masterbubu, 16 July 2012 - 12:18 AM.



#2 Martins Mozeiko   Crossbones+   -  Reputation: 1422


Posted 16 July 2012 - 12:24 AM

Why do you want to check for a zero vector here? Isn't it a bug in your algorithm if you are trying to normalize a zero vector? Normalizing a vector means getting a vector with the same direction but with length equal to 1. There is no way to do that with a zero vector: it has no direction.

Edited by Martins Mozeiko, 16 July 2012 - 12:24 AM.


#3 Hodgman   Moderators   -  Reputation: 31214


Posted 16 July 2012 - 12:56 AM

What do you want the result to be when someone tries to normalize a zero-length vector?

#4 masterbubu   Members   -  Reputation: 157


Posted 16 July 2012 - 05:32 AM

Hi,

Martins Mozeiko:
I do some mathematical operations on the vector, and then I normalize it.
It is possible that the vector turns out to be the zero vector. I just want to check for that before I normalize it.


Hodgman:
If the vector is the zero vector, then don't normalize it. I just need a way to determine whether it is a zero vector (fast).

#5 Hodgman   Moderators   -  Reputation: 31214


Posted 16 July 2012 - 05:43 AM

N.B. I'm not very experienced with SSE, so this might not be the best solution.

You could use _mm_cmpneq_ps to compare the vector against (0,0,0,0), which sets each component of the result to either 0xffffffff or 0x0.
You could then use _mm_movemask_ps to pack the sign bits of those 4 result values into the low 4 bits of a single integer. This integer will be non-zero if any of the original inputs were non-zero, and will be zero if all of the original inputs were zero.
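
A minimal sketch of that check, assuming the vec3 lives in lanes 0 to 2 of the register and lane 3 may hold anything. The helper name IsNonZeroVec3 is made up for illustration and is not part of the original class:

```cpp
#include <xmmintrin.h>  // SSE: _mm_cmpneq_ps, _mm_movemask_ps

// Hypothetical helper: returns true if x, y or z is non-zero.
// The w lane is ignored by masking the movemask result with 0x7.
inline bool IsNonZeroVec3(__m128 v)
{
    // Each lane becomes 0xFFFFFFFF where v != 0, otherwise 0x0.
    const __m128 neq = _mm_cmpneq_ps(v, _mm_setzero_ps());

    // Bit i of the result is the sign bit of lane i of the comparison mask.
    return (_mm_movemask_ps(neq) & 0x7) != 0;
}
```

Note that -0.0f compares equal to zero here, so a vector of negative zeros is also treated as the zero vector.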

#6 masterbubu   Members   -  Reputation: 157


Posted 21 July 2012 - 09:26 AM

Thanks, I'll try that.

None of the vector normalization examples I saw online handle the zero vector case.

Why is that?

Does anyone know the effect on performance?

#7 Cornstalks   Crossbones+   -  Reputation: 6991


Posted 21 July 2012 - 09:33 AM

All the vector normalization examples I saw online, did not take care for the Zero vector case.

Why is that?

Most likely laziness or ignorance.

I don't know about performance here. I don't know anything about SSE stuff.
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]

#8 Bacterius   Crossbones+   -  Reputation: 9098


Posted 21 July 2012 - 06:46 PM

I don't know if it's laziness/ignorance. A zero vector means your code logic has failed somewhere (or your program inputs are invalid), and is not a normal occurrence that should be checked against beyond an assert in debug mode. I would just let it be; it will cause an exception in due time. Doing a conditional check for each normalization defeats the performance advantages of using SIMD instructions, imho.

Edited by Bacterius, 21 July 2012 - 06:48 PM.

The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

 

- Pessimal Algorithms and Simplexity Analysis


#9 Cornstalks   Crossbones+   -  Reputation: 6991


Posted 21 July 2012 - 08:35 PM

I don't know if it's laziness/ignorance. A zero vector means your code logic has failed somewhere (or your program inputs are invalid), and is not a normal occurrence that should be checked against beyond an assert in debug mode.

That's probably what I'd do, is use a debug assert. Anything less than that and I'd most likely (personally) call it laziness or ignorance. Anything more than that and I'd most likely call it pedantic (which may or may not be what your project needs).

Edited by Cornstalks, 21 July 2012 - 09:31 PM.


#10 RobTheBloke   Crossbones+   -  Reputation: 2349


Posted 26 July 2012 - 04:44 AM


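inline const CVector3SSE& CVector3SSE::Normalize()
{
  static const __m128 almostZero = _mm_set1_ps(1e-5f);
  // Squared length of (x, y, z), broadcast to all four lanes.
  __m128 dp = _mm_dp_ps(m_fValsSSE, m_fValsSSE, 0x7F);
  // All-ones mask where the squared length exceeds the threshold, else zero.
  const __m128 cmp = _mm_cmpgt_ps(dp, almostZero);
  dp = _mm_rsqrt_ps(dp);
  // A masked-out scale factor is 0, so a (near-)zero vector becomes the zero vector.
  m_fValsSSE = _mm_mul_ps(m_fValsSSE, _mm_and_ps(dp, cmp));
  return *this;
}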



#11 DerekEhrman   Members   -  Reputation: 127


Posted 27 July 2012 - 09:01 AM

There is actually a very valid reason to simply crash on a 0 vector in your normalization function and I promise it has nothing to do with "ignorance or laziness". The entire point of using SSE is that it is high performance code. Normalization of a 0 vector, as previously stated is technically an invalid operation.

Adding vector validation (i.e. checking for a 0 vector) in your normalization code will introduce unnecessary run-time overhead (in the form of potential branch mispredictions and load-hit-store stalls) to a performance-sensitive area of your code. As the operation in question is technically invalid, those concerned with performance will opt to have the function crash rather than introduce the overhead. If this code crashes, the real bug lies elsewhere (the attempt to normalize a 0 vector ... why is your vector 0? Why are you trying to normalize it if it is? These are the bugs you should be concerned with).

If there is a case in your code where you *may* be normalizing a 0 vector (direction derived via velocity when player is standing still perhaps?), then you should validate the vector *before* the attempt to normalize. The reason for this is that these cases are likely few and far between, and introducing the overhead that I explained above to *every* instance of a call to normalize is unfairly penalizing everyone who calls the function, whether they have a chance to pass a 0 vector or not.

#12 Cornstalks   Crossbones+   -  Reputation: 6991


Posted 27 July 2012 - 09:05 AM

*snip*

Sounds to me like a perfect time to use assert, like was suggested...

#13 DerekEhrman   Members   -  Reputation: 127


Posted 27 July 2012 - 09:20 AM

Sounds to me like a perfect time to use assert, like was suggested...


Agreed, I had meant to reiterate that a debug-only assertion was a valid solution! :)

#14 Cornstalks   Crossbones+   -  Reputation: 6991


Posted 27 July 2012 - 09:21 AM


Sounds to me like a perfect time to use assert, like was suggested...


Agreed, I had meant to reiterate that a debug-only assertion was a valid solution! :)

Ah, I see, I thought you were saying an assert was a bad idea. Looks like we're on the same page :)

#15 Dave Eberly   Members   -  Reputation: 1161


Posted 31 July 2012 - 11:44 PM

inline const CVector3SSE& CVector3SSE::Normalize()
{
  static const __m128 almostZero = _mm_set1_ps(1e-5f);
  __m128 dp = _mm_dp_ps(m_fValsSSE, m_fValsSSE, 0x7F);
  const __m128 cmp = _mm_cmpgt_ps(dp, almostZero);
  dp = _mm_rsqrt_ps(dp);
  m_fValsSSE = _mm_mul_ps(m_fValsSSE, _mm_and_ps(dp, cmp));
  return *this;
}


Although yours is the standard way folks do the normalization, for large components the dot product overflows. If you need something that is robust for all finite floating-point inputs, try this:
inline __m128 MaximumAbsoluteComponent (__m128 const v)
{
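	// -0.0f has only the sign bit set (0x80000000); passing the integer literal would convert it to 2147483648.0f.
	__m128 SIGN = _mm_set1_ps(-0.0f);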
	__m128 vAbs = _mm_andnot_ps(SIGN, v);
	__m128 max0 = _mm_shuffle_ps(vAbs, vAbs, _MM_SHUFFLE(0,0,0,0));
	__m128 max1 = _mm_shuffle_ps(vAbs, vAbs, _MM_SHUFFLE(1,1,1,1));
	__m128 max2 = _mm_shuffle_ps(vAbs, vAbs, _MM_SHUFFLE(2,2,2,2));
	__m128 max3 = _mm_shuffle_ps(vAbs, vAbs, _MM_SHUFFLE(3,3,3,3));
	max0 = _mm_max_ps(max0, max1);
	max2 = _mm_max_ps(max2, max3);
	max0 = _mm_max_ps(max0, max2);
	return max0;
}

inline __m128 Normalize (__m128 const v)
{
	// Compute the maximum absolute value component.
	__m128 maxComponent = MaximumAbsoluteComponent(v);

	// Divide by the maximum absolute component.  This is potentially a divide by zero.
	__m128 normalized = _mm_div_ps(v, maxComponent);

	// Set to zero when the original length is zero.
	__m128 zero = _mm_setzero_ps();
	__m128 mask = _mm_cmpneq_ps(zero, maxComponent);
	normalized = _mm_and_ps(mask, normalized);

	// (sqrLength, sqrLength, sqrLength, sqrLength)
	__m128 sqrLength = _mm_dp_ps(normalized, normalized, 0x7F);

	// (length, length, length, length)
	__m128 length = _mm_sqrt_ps(sqrLength);

	// Divide by the length to normalize.  This is potentially a divide by zero.
	normalized = _mm_div_ps(normalized, length);

	// Set to zero when the original length is zero or infinity.  In the latter case, this is considered to be an unexpected condition.
	normalized = _mm_and_ps(mask, normalized);
	return normalized;
}
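
To see why dividing by the largest absolute component first matters, here is a scalar sketch (not the SSE code above). NaiveLength and ScaledLength are made-up names, and the sketch assumes float expressions are evaluated in IEEE single precision, as is typical on x86-64:

```cpp
#include <algorithm>
#include <cmath>

// Naive length: with components around 1e20f, x * x is about 1e40,
// which exceeds the largest finite float (about 3.4e38) and becomes +inf.
inline float NaiveLength(float x, float y, float z)
{
    return std::sqrt(x * x + y * y + z * z);
}

// Scaled length: divide by the largest absolute component first, so every
// scaled component is in [-1, 1] and the sum of squares is at most 3.
inline float ScaledLength(float x, float y, float z)
{
    const float m = std::max({std::fabs(x), std::fabs(y), std::fabs(z)});
    if (m == 0.0f)
        return 0.0f;  // zero vector: nothing to scale

    const float sx = x / m, sy = y / m, sz = z / m;
    return m * std::sqrt(sx * sx + sy * sy + sz * sz);
}
```

For (1e20f, 1e20f, 1e20f) the naive version overflows while the scaled version stays finite, which is the same idea the SSE Normalize above applies before taking the dot product.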




