Fast sqrt

12 comments, last by grhodes_at_work 19 years, 5 months ago
There's a really cool paper about this fast sqrt trick, where everything is derived. It's from the thread posted here by Nice Coder.

invsqrt
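The thread does not reproduce the trick itself. Assuming it means the widely circulated bit-level inverse square root approximation (the "invsqrt" of the link above), a minimal sketch in C might look like the following. The function names and the magic constant 0x5f3759df are not taken from this thread; the constant is the commonly quoted one, and a single Newton-Raphson step is an illustrative choice.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x.
   Assumption: 0x5f3759df is the commonly quoted constant, not one from this thread. */
static float approx_rsqrt(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);      /* read the float's bit pattern safely */
    bits = 0x5f3759dfu - (bits >> 1);    /* integer trick gives a rough first guess */
    memcpy(&y, &bits, sizeof y);

    y = y * (1.5f - 0.5f * x * y * y);   /* one Newton-Raphson refinement step */
    return y;
}

/* sqrt(x) recovered as x * (1/sqrt(x)); hypothetical helper for illustration. */
static float approx_sqrt(float x)
{
    return x * approx_rsqrt(x);
}
```

With one refinement step the result is only good to a fraction of a percent, so this is a trade of accuracy for speed rather than a drop-in replacement for sqrtf.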
Isn't there an SSE sqrt instruction? Anyone know how that compares?
I'm going to close the thread, since the topic has been discussed in immense detail throughout the archives.

In addition to the other thread created within the past few days on the same subject, I found many, many, MANY discussions on fast sqrt dating back to 2001. I don't see much new in this thread. If anyone has a compelling argument why the thread should remain open, please send me a private message and make your case strongly. The topic of fast sqrt has been covered to death, and I will double-check any argument to reopen the thread against the archives to see if the argument holds water.
Graham Rhodes Moderator, Math & Physics forum @ gamedev.net
OK,

I reopened the thread just to post this example from superpig. It seems like a good contribution that may be useful. I am closing the thread again immediately.

Quote: From superpig via private message
SSE has both SQRT and RSQRT instructions, but you don't really get much benefit unless you're doing four of them at once. Say you want to get the lengths of four vectors, stored as a structure of arrays:

__declspec(align(16)) struct blockOfFourVectors
{
    float xValues[4];
    float yValues[4];
    float zValues[4];
    float lengths[4];
};

blockOfFourVectors data;

__asm
{
    movaps xmm0, [data + 0x00] ; load x components
    movaps xmm1, [data + 0x10] ; load y components
    movaps xmm2, [data + 0x20] ; load z components

    ; square each component
    mulps xmm0, xmm0
    mulps xmm1, xmm1
    mulps xmm2, xmm2

    ; sum them into xmm0
    addps xmm0, xmm1
    addps xmm0, xmm2

    ; sqrt to get length
    sqrtps xmm0, xmm0

    ; save out
    movaps [data + 0x30], xmm0
}


That would calc all four lengths into data.lengths. For just a single vector it's not really worthwhile (and unless you store that vector in a SoA, would require a load of shuffling).
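For compilers without MSVC-style inline assembly, roughly the same computation can be sketched with the SSE intrinsics from xmmintrin.h. This is an illustration, not superpig's code: the type and function names are made up here, it assumes an x86 target with SSE and a C11 compiler for _Alignas, and the second function uses RSQRTPS, which only returns an approximation of 1/length.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Same structure-of-arrays layout as the post above; names are hypothetical.
   Aligning the first member to 16 bytes aligns the whole struct, and every
   array is 16 bytes long, so each one starts on a 16-byte boundary. */
typedef struct {
    _Alignas(16) float xValues[4];
    float yValues[4];
    float zValues[4];
    float lengths[4];
} BlockOfFourVectors;

/* x*x + y*y + z*z for all four vectors at once. */
static __m128 squared_lengths(const BlockOfFourVectors *b)
{
    __m128 x = _mm_load_ps(b->xValues);   /* aligned loads, like movaps */
    __m128 y = _mm_load_ps(b->yValues);
    __m128 z = _mm_load_ps(b->zValues);

    __m128 sum = _mm_add_ps(_mm_mul_ps(x, x), _mm_mul_ps(y, y));
    return _mm_add_ps(sum, _mm_mul_ps(z, z));
}

/* Four lengths via SQRTPS, written to b->lengths. */
static void compute_lengths(BlockOfFourVectors *b)
{
    _mm_store_ps(b->lengths, _mm_sqrt_ps(squared_lengths(b)));
}

/* Four approximate reciprocal lengths via RSQRTPS (low precision, no refinement).
   The output array does not need to be aligned because an unaligned store is used. */
static void compute_inverse_lengths_approx(const BlockOfFourVectors *b, float out[4])
{
    _mm_storeu_ps(out, _mm_rsqrt_ps(squared_lengths(b)));
}
```

A caller would fill xValues, yValues and zValues for four vectors, call compute_lengths, and read the results from lengths. The reciprocal variant is the kind of thing one might use for normalising, where a rough 1/length is often acceptable.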


Graham Rhodes Moderator, Math & Physics forum @ gamedev.net

This topic is closed to new replies.
