Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.
Posted 25 April 2014 - 03:59 AM
Crealysm game & engine development: http://www.crealysm.com
Looking for a passionate, disciplined and structured producer? PM me
Posted 25 April 2014 - 04:13 AM
Well, distance comparisons don't "require" square roots as the square root part of the Euclidean metric does not change the order of comparisons (since the square root function is strictly increasing). Similarly, for light attenuation, you typically don't need the distance but the distance squared, so what is the point of calculating the square root just to square the result immediately after? Unless you need the actual distance/radius at some point, I don't see what you gain by doing the computation.
So, where do the "3 to 6 multiplications and value assignments" come in when doing distance comparisons or distance squared computations? To me it just seems like taking the square root is straight up a waste of energy here. If you have a specific situation where avoiding doing a square root requires some extra work, please mention it, because the examples you give don't really seem relevant to your question.
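To illustrate the ordering argument, here is a minimal sketch that picks the closest point using only squared distances. The `Vec3` struct and the names `distanceSquared` and `nearestIndex` are hypothetical, made up for this example; they are not from the original poster's code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal vector type; stands in for D3DXVECTOR3 or similar.
struct Vec3 { float x, y, z; };

// Squared Euclidean distance: three subtractions, three multiplies, two adds.
inline float distanceSquared(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Returns the index of the point closest to `target`, or points.size() if
// the list is empty. Only squared distances are compared; because sqrt is
// strictly increasing, the winner is the same as with true distances.
inline std::size_t nearestIndex(const std::vector<Vec3>& points, const Vec3& target)
{
    std::size_t best = points.size();
    float bestDistSq = 0.0f;
    for (std::size_t i = 0; i < points.size(); ++i)
    {
        const float d = distanceSquared(points[i], target);
        if (best == points.size() || d < bestDistSq)
        {
            best = i;
            bestDistSq = d;
        }
    }
    return best;
}
```

If the actual distance to the winner is needed afterwards, a single sqrt of the best squared distance is enough.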
“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”
Posted 25 April 2014 - 04:16 AM
Point in sphere and distance checks by themselves can be done by comparing the squared distance to the squared radius, thereby avoiding sqrt.
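A minimal sketch of that check, assuming a hypothetical `Vec3` struct and a non-negative radius (both are assumptions for illustration, not part of anyone's engine here):

```cpp
// Hypothetical minimal vector type; stands in for D3DXVECTOR3 or similar.
struct Vec3 { float x, y, z; };

// True if `point` lies inside or on the sphere. Assumes radius >= 0.
// Both sides of the comparison are squared, so no sqrt is needed.
inline bool pointInSphere(const Vec3& point, const Vec3& center, float radius)
{
    const float dx = point.x - center.x;
    const float dy = point.y - center.y;
    const float dz = point.z - center.z;
    const float distSq = dx * dx + dy * dy + dz * dz;
    return distSq <= radius * radius;
}
```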
When you actually need sqrt, it's not very evil on newer desktop processors. But the other instructions have also gotten faster over the same period, so they can still be faster than sqrt in relative terms.
Many newer processors also have special instructions for calculating square roots. One reference I found put sqrt for a single float in SSE at 19 clock cycles, while the instruction for 1 / sqrt, which is only an approximation accurate to some number of bits, takes only 3 cycles. So if the approximation is good enough for your use, that would probably be the fastest way.
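As a rough sketch of how the approximate reciprocal square root might be used for normalization: this assumes the instruction meant above is SSE's `rsqrtss`, reached through the `_mm_rsqrt_ss` intrinsic from `<xmmintrin.h>`. The `Vec3` struct, the helper names and the feature-test macro are hypothetical, and the code falls back to `1.0f / std::sqrt(x)` when SSE is not detected.

```cpp
#include <cmath>

// Assumption: these macros are a reasonable way to detect SSE on GCC/Clang/MSVC.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define APPROX_RSQRT_USE_SSE 1
#else
#define APPROX_RSQRT_USE_SSE 0
#endif

// Approximate 1/sqrt(x) for x > 0. With SSE the result is only accurate to
// roughly 12 bits; the fallback path is exact to float precision.
inline float approxRsqrt(float x)
{
#if APPROX_RSQRT_USE_SSE
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    return 1.0f / std::sqrt(x);
#endif
}

// Hypothetical minimal vector type; stands in for D3DXVECTOR3 or similar.
struct Vec3 { float x, y, z; };

// Normalizes v with one approximate rsqrt and three multiplies.
// Assumes v is not the zero vector.
inline Vec3 normalizeApprox(const Vec3& v)
{
    const float inv = approxRsqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return Vec3{ v.x * inv, v.y * inv, v.z * inv };
}
```

Whether the reduced accuracy is acceptable depends on the use; measuring on the target hardware is the only reliable way to know if it pays off.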
Posted 25 April 2014 - 05:13 AM
It's not that evil, but the fastest code is the code that you never have to run. So if you don't need to do a sqrt, then don't do it.
Posted 25 April 2014 - 05:27 AM
Normalization involves a sqrt, so since normalization is used so much, you can safely assume that GPUs are optimized for it.
Of course, you would be well advised to use the built-in normalize instruction rather than writing your own normalization, since it may be implemented directly in the hardware.
As the others said, for distance comparisons etc., where you can get away without doing a sqrt, do so.
It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.
Posted 25 April 2014 - 06:16 AM
Posted 25 April 2014 - 11:41 AM
GPUs have a sqrt instruction which is a single instruction (edit: it's not just 1 cycle I think), so doing the x1*x1 + y1*y1 ... > x2*x2 + y2*y2 ... comparison can actually end up being slower than just doing sqrt(vecn(...), vecn(...)).
CPUs have a similar instruction as well, but it's (afaik) SIMD only. However, it wouldn't surprise me if compilers implemented std::sqrt simply by calling that SIMD sqrt.
edit: yes, seems I was wrong. Thanks for correcting me.
Edited by agleed, 26 April 2014 - 06:35 AM.
Posted 25 April 2014 - 12:02 PM
Thanks, this gives a good view of what (not to) do.
I'll keep in mind that every sqrt (or anything else) that isn't really necessary shouldn't be done at all.
Two examples:
1. My CoordToCoord distance function:
float CoordToCoordDist(const D3DXVECTOR3 pv1, const D3DXVECTOR3 pv2)
{
    return sqrt(pow(pv1.x - pv2.x, 2) + pow(pv1.y - pv2.y, 2) + pow(pv1.z - pv2.z, 2));
}
How would you do that without the sqrt?
2. Point in sphere.
I currently save the radius of my bounding spheres as the 'normal' radius. I take the CoordToCoord distance from the world center of the sphere to the point I'm checking, and compare this distance to the radius. That would basically be solved if the above CoordToCoord distance function returned the squared distance. In that case I could take the squared radius of the sphere up front and save that (and keep the squared radius in sync whenever the radius is updated).
Note: in my shaders I don't use any sqrt at the moment. I'll look into how I currently do my attenuation.
Of course there are some normalizations in my VS/PS, which I think are needed (and cannot be done without a square root).
Posted 25 April 2014 - 01:07 PM
Edited by Ohforf sake, 25 April 2014 - 01:13 PM.
Posted 25 April 2014 - 01:16 PM
GPUs have a sqrt instruction which is a single instruction (edit: it's not just 1 cycle I think), so doing the x1*x1 + y1*y1 ... > x2*x2 + y2*y2 ... comparison can actually end up being slower than just doing sqrt(vecn(...), vecn(...)).
...good job they've also got a dot product instruction then...!
(dot (v1, v1) > dot (v2, v2))
Edited by mhagain, 25 April 2014 - 01:17 PM.
Posted 25 April 2014 - 03:44 PM
@Ohforf sake: thanks, I made a 2nd CoordToCoord distance function, now for the squared distance:
float CoordToCoordDistSqr(const D3DXVECTOR3 &pv1, const D3DXVECTOR3 &pv2)
{
    // Named local: taking the address of a temporary is not standard C++.
    const D3DXVECTOR3 diff = pv1 - pv2;
    return D3DXVec3LengthSq(&diff);
}
To be sure I use any possible optimizations, I also changed the non-squared version:
float CoordToCoordDist(const D3DXVECTOR3 &pv1, const D3DXVECTOR3 &pv2)
{
    const D3DXVECTOR3 diff = pv1 - pv2;
    return D3DXVec3Length(&diff);
}
Now that's done, I'll go through my code where I call the CoordToCoord distance function and see what I compare the result to. For example, for the radius of a sphere I can do "radius*radius" like you said. The same probably goes for checking the distance between mesh/renderable centers and point lights versus the point light radius (the radius would become radius*radius then).
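A possible sketch of the light check, extended slightly to take the mesh's own bounding radius into account. That extension, the `Vec3` struct and the name `lightTouchesMesh` are assumptions made for this example. Note that when two radii are involved, it is their sum that gets squared, not each radius separately.

```cpp
// Hypothetical minimal vector type; stands in for D3DXVECTOR3 or similar.
struct Vec3 { float x, y, z; };

// True if a point light's sphere of influence overlaps a mesh's bounding
// sphere. Assumes both radii are non-negative.
inline bool lightTouchesMesh(const Vec3& meshCenter, float meshRadius,
                             const Vec3& lightPos, float lightRadius)
{
    const float dx = meshCenter.x - lightPos.x;
    const float dy = meshCenter.y - lightPos.y;
    const float dz = meshCenter.z - lightPos.z;
    const float distSq = dx * dx + dy * dy + dz * dz;

    // dist <= r1 + r2  is equivalent to  dist^2 <= (r1 + r2)^2
    // as long as both sides are non-negative.
    const float reach = meshRadius + lightRadius;
    return distSq <= reach * reach;
}
```

Passing a mesh radius of 0 reduces this to the plain center-versus-light-radius check described above.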
Thanks for the help.
Posted 25 April 2014 - 07:36 PM
1. My CoordToCoord distance function:
float CoordToCoordDist(const D3DXVECTOR3 pv1, const D3DXVECTOR3 pv2)
{
    return sqrt(pow(pv1.x - pv2.x, 2) + pow(pv1.y - pv2.y, 2) + pow(pv1.z - pv2.z, 2));
}
OMG! You have three pow calls and you worry about a sqrt?! A pow is significantly more expensive.
Just do
float tmp = pv1.x - pv2.x;
tmp = tmp * 2; //or tmp = tmp + tmp;
Whether "tmp * 2" is better than "tmp + tmp" depends on the architecture you're running. On one hand, you've got addition vs multiplication, and often addition has lower latency than multiplication. On the other hand, the multiplication is against a constant value, and some archs may have special optimizations for that (i.e. custom opcodes, better pipelining). However both of them will be a zillion times better than a pow( tmp, 2 ).
Edited by Matias Goldberg, 25 April 2014 - 07:36 PM.
Posted 26 April 2014 - 03:00 AM
OMG! You have three pow calls and you worry about a sqrt?! A pow is significantly more expensive.
Just do
float tmp = pv1.x - pv2.x;
tmp = tmp * 2; //or tmp = tmp + tmp;
Whether "tmp * 2" is better than "tmp + tmp" depends on the architecture you're running.
I'm afraid this is not correct. Shouldn't it be tmp * tmp?
(tmp * 2 would only give the same result as squaring if tmp were always 2.)
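To make the correction concrete, here is a small sketch comparing the pow-based squared distance with the plain-multiplication version. The `Vec3` struct and both function names are hypothetical, used only for this example.

```cpp
#include <cmath>

// Hypothetical minimal vector type; stands in for D3DXVECTOR3 or similar.
struct Vec3 { float x, y, z; };

// Squared distance written with pow, as in the original function (minus the sqrt).
inline float distSqWithPow(const Vec3& a, const Vec3& b)
{
    return static_cast<float>(std::pow(a.x - b.x, 2) +
                              std::pow(a.y - b.y, 2) +
                              std::pow(a.z - b.z, 2));
}

// Same quantity using tmp * tmp. Doubling (tmp * 2) would be a different
// and wrong computation: squaring 3 gives 9, doubling it gives 6.
inline float distSqWithMul(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}
```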
I now have 2 coord to coord distance functions, one squared and one non-squared.
The next step is going through my codebase to see where I can use the squared one and multiply the other variable by itself. The alternative is saving the original value squared, but that doesn't sound as good, because I would then always have to keep track of it and rename all member vars.
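One way to limit that bookkeeping, sketched here under the assumption that the radius is only ever changed through a setter: keep both values private and update the squared one in a single place. The `Vec3` struct and the `BoundingSphere` class below are hypothetical and not taken from the poster's engine.

```cpp
// Hypothetical minimal vector type; stands in for D3DXVECTOR3 or similar.
struct Vec3 { float x, y, z; };

// Stores the radius and its square together so callers never have to
// remember to re-square it. Assumes radius >= 0.
class BoundingSphere
{
public:
    BoundingSphere(const Vec3& center, float radius) : center_(center)
    {
        setRadius(radius);
    }

    // The only place where the radius changes, so the cached square
    // cannot go stale.
    void setRadius(float radius)
    {
        radius_ = radius;
        radiusSq_ = radius * radius;
    }

    float radius() const { return radius_; }
    float radiusSquared() const { return radiusSq_; }

    // Point-in-sphere test against the cached squared radius; no sqrt.
    bool contains(const Vec3& p) const
    {
        const float dx = p.x - center_.x;
        const float dy = p.y - center_.y;
        const float dz = p.z - center_.z;
        return dx * dx + dy * dy + dz * dz <= radiusSq_;
    }

private:
    Vec3 center_;
    float radius_ = 0.0f;
    float radiusSq_ = 0.0f;
};
```

Existing code that reads the plain radius keeps working through `radius()`, so no member renaming is needed elsewhere.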
Posted 26 April 2014 - 08:33 AM