Koobazaur

Multiplication vs Division?


Recommended Posts

As I mentioned in another thread, the Arx Fatalis source code has been released. Poking through it, I noticed this magnificent little tidbit:


#define DIV2 0.5f
#define DIV3 0.3333333333333333333333f
#define DIV4 0.25f
#define DIV5 0.2f
#define DIV6 0.1666666666666666666666f
#define DIV7 0.1428571428571428571428f
#define DIV8 0.125f
#define DIV9 0.1111111111111111111111f
#define DIV10 0.1f
#define DIV11 0.0909090909090909090909f
#define DIV12 0.0833333333333333333333f
#define DIV13 0.0769230769230769230769f
#define DIV14 0.0714285714285714285714f
#define DIV15 0.0666666666666666666667f
#define DIV16 0.0625f
#define DIV17 0.0588235294117647058823f
#define DIV18 0.0555555555555555555555f
...


Resulting in code such as:

float myFloat = someVal * DIV2;

instead of

float myFloat = someVal / 2.0f;

I figured, "damn, that's a clever yet simple way of optimizing code!" But, of course, I decided to benchmark this out of curiosity, and multiplication proved to be some 5 times faster than division WHOA......... in Debug. When I switched to Release, the results were pretty much identical.
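For reference, something along these lines is all I mean by "benchmark" (this is an illustrative sketch, not the exact code I timed):

#include <chrono>
#include <cstdio>

#define DIV2 0.5f

int main()
{
    const int N = 100000000;
    volatile float sink = 0.0f;             // volatile so the loops aren't optimized away

    auto t0 = std::chrono::steady_clock::now();
    for (int i = 1; i <= N; ++i)
        sink = sink + (float)i * DIV2;      // multiply by the reciprocal
    auto t1 = std::chrono::steady_clock::now();
    for (int i = 1; i <= N; ++i)
        sink = sink + (float)i / 2.0f;      // plain division
    auto t2 = std::chrono::steady_clock::now();

    std::printf("mul: %lld ms, div: %lld ms\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count(),
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count());
}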

Out of curiosity, any thoughts on why this "optimization" might have been used if it produces no real benefit while complicating the code? Just to speed up debugging? Or do some older processors really handle mul that much faster than div?

A modern optimizing compiler should do the faster thing by itself, which will usually be a multiplication instead of a division. As you noticed, Release builds (which are generally configured to do such optimizations) have identical speeds, while Debug builds (where such optimizations are typically disabled) have different speeds. Since Arx Fatalis is quite old, it is quite likely they were using a compiler which was unable to do such optimizations by itself.
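To make that concrete, here is a small sketch of my own (not taken from the Arx code): a division by a power of two has an exact reciprocal, so a strict compiler can always rewrite it as a multiply, while a division by 3.0f can only be rewritten under relaxed floating-point settings such as /fp:fast on MSVC or -ffast-math on GCC.

// Can always become x * 0.5f, because 0.5f is exactly 1/2.
float half(float x)  { return x / 2.0f; }

// Has no exact reciprocal; a strict compiler keeps the divide, and only
// relaxed floating-point modes rewrite it as x * (1.0f / 3.0f).
float third(float x) { return x / 3.0f; }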


A modern optimizing compiler should by itself do the faster thing which will usually be a multiplication instead of a division.

Some useful reading on floating point optimization here. Floating point optimization may not be as trivial as you think, so don't count on this happening, simply because transforming the division into its reciprocal multiplication may not produce the same result. If the compiler transforms it anyway, the behavior of the program has been altered, and the compiler has not done its job properly (in a strict sense). The article concerns VS 2005, but as far as I have seen it still applies in VS 2010, and the default project settings won't perform the division-to-multiplication transformation.
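As a quick illustration of why the two are not interchangeable (my own snippet, not from the linked article), this counts the inputs where the reciprocal multiply rounds differently from the true division:

#include <cstdio>

int main()
{
    const float r = 1.0f / 3.0f;            // rounded reciprocal, not exactly 1/3
    int mismatches = 0;
    for (int i = 1; i <= 1000000; ++i)
    {
        float x = (float)i;
        if (x / 3.0f != x * r)              // true division vs. reciprocal multiply
            ++mismatches;
    }
    std::printf("%d of 1000000 values give a different result\n", mismatches);
}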

I wouldn't bother. It is a micro-optimisation, which can be applied where needed if profiling indicates a problem. In other cases, it just reduces the readability of the code. At the very least, don't use the pre-processor for this; use the "const" keyword if you must. That said, I do generally multiply by 0.5f rather than divide by 2, but that is partly to make accidental integer division less likely, and I don't feel it impacts the code's readability too much.
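For what it is worth, if you really want named reciprocals, something along these lines (just a sketch of the "const instead of the pre-processor" idea, names are my own) keeps them typed and scoped:

// Typed, scoped constants instead of untyped pre-processor macros.
const float  INV3_F = 1.0f / 3.0f;    // single-precision reciprocal
const double INV3_D = 1.0  / 3.0;     // double-precision reciprocal for double maths

float  scaleF(float v)  { return v * INV3_F; }
double scaleD(double v) { return v * INV3_D; }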

As Brother Bob mentioned, there are compiler settings which you can play with to make it more likely that this optimisation will be used. But heed the warning over the correctness issue.

Apart from all the issues already mentioned, it's also problematic that the constants are hard-coded as single-precision floats. What if, for some reason, you need to calculate something in double precision and use one of those macros, forgetting this little detail? Depending on the ordering of the expression, the compiler might cast the doubles to float, do the calculation, and cast back to double to store it, throwing away the benefit of using doubles in the first place.
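A quick sketch of the kind of accident I mean (the numbers and names here are purely illustrative):

#include <cstdio>

#define DIV3 0.3333333333333333333333f   // the single-precision macro from the Arx headers

int main()
{
    double big = 1e17;
    double viaMacro  = big * DIV3;       // multiplies by a float-precision constant
    double viaDivide = big / 3.0;        // full double-precision division
    std::printf("%.17g\n%.17g\n", viaMacro, viaDivide);   // they disagree after about 7 significant digits
}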

any thoughts on why this "optimization" might have been used
There was a time when division was absurdly slow. There was also a time when multiplication was so slow that lookup tables were used instead.

It may be worth it, it might not be. These types of optimizations are moving targets. Sometimes they work, sometimes they don't.

But at least you benchmarked, which is more than 99% of people do, and that answered the question. This is also the correct approach.

-------
Going into realm of mostly speculation here:

The running time of the above operations is probably dominated by load/store and other memory accesses. Most operations today are IO bound, either by DRAM or by the CPU caches. Load/store means that a variable is first loaded from memory into a register, manipulated in the register, then written back. This alone can introduce a ton of side effects which slow things down.

Ideally, one would convert the above into SIMD, then apply a single multiplication/division to thousands of elements in a row.
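For example, a sketch with SSE intrinsics (the function and its assumptions about alignment and element count are mine, purely illustrative):

#include <xmmintrin.h>   // SSE intrinsics

// Scale count floats by 1/3, four at a time. Assumes count is a multiple of 4
// and data is 16-byte aligned; a real version would handle the remainder.
void scaleByThird(float* data, int count)
{
    const __m128 inv3 = _mm_set1_ps(1.0f / 3.0f);   // broadcast the reciprocal
    for (int i = 0; i < count; i += 4)
    {
        __m128 v = _mm_load_ps(data + i);           // load 4 floats
        v = _mm_mul_ps(v, inv3);                    // one multiply covers all 4
        _mm_store_ps(data + i, v);                  // store them back
    }
}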

Also, if optimizing for throughput, one always wants to maximize the reads:writes ratio. Try to keep values that change in registers for as long as possible and minimize writes. Reading is mostly fine, especially if streamed.


This works for bulk operations, but when dealing with very fine-grained and conditional logic, things simply aren't cost-effective to optimize this way.

There is something that a lot of people seem not to understand: "Do not optimize until you have found a performance issue." In most cases, the part of the code that people think will have a performance issue is not the area that is really the problem. Next, it is a waste of time and effort until you find there is a real problem. Finally, most of the optimization that people do doesn't really help much.

theTroll


In most cases, the part of the code that people think will have a performance issue is not the area that is really the problem.

Gather 'round children, and I shall read to ye from the good book...Code Complete 2.

And brother McConnell sayeth in regards to performance "...programmers are notoriously poor at guessing which code really causes problems."

Amen.

Instead of this

double x = y / 7.0;

you can write this

double x = y * (1.0 / 7);

and you can be fairly sure that the correct constant double value will be calculated for you at compile time, probably even in Debug configuration.
The #defines are a waste of time.
