Declaring temporary variable to save 1 multiply?

Started by
7 comments, last by Chris_F 9 years, 8 months ago

Really simple question. Is it worth declaring a temporary (float) variable to reduce an operation from 2 multiplies to 1?

For example,

x = a * b / c;

y = a * b / d;

or..

float ab = a * b;

x = ab / c;

y = ab / d;

I'm not very familiar with assembly language, or how everything breaks down when it's compiled, so that's why I'm asking. I'm not trying to optimize (really); I'm actually hoping to do the opposite (avoid the temporaries). If the function weren't part of my math library, I probably wouldn't even bother worrying about it. The method I'm writing generates the (matrix-like) direction vectors of a quaternion. In the method, there are 2 of each of 9 unique multiplies, which means I could either multiply 18 times, or declare 9 variables and multiply 9 times.

I originally wrote the function by declaring the temporary variables, but it would look a lot nicer to simply do the math twice, as long as the difference is negligible.

Thanks a bunch for any advice

The compiler will do that automatically when it makes sense, if optimizations are turned on. If the temporary doesn't make sense it will probably remove it and turn the second example into the first example.

In general, however, I would probably recommend using the temporary for expensive operations, but I'm not sure a multiply qualifies. If it were sqrt or sin/cos, I would use the temporary.

EDIT: For float operations there is one more thing to consider: the order of operations can slightly alter the results (as floats are approximations). For example, (a * b) / c is not necessarily equal to a * (b / c). The compiler therefore cannot always apply rearrangements it would otherwise make, so writing the temporary explicitly can often be a good idea (though this can depend on compiler settings).
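A minimal sketch of the ordering issue, assuming a platform where float arithmetic is evaluated in single precision (typical for x86-64 and ARM64) and no unsafe math flags are enabled. The function names are made up for illustration.

```cpp
// Multiplies first, then divides: the intermediate product a * b can overflow.
float multiply_then_divide(float a, float b, float c) {
    return (a * b) / c;
}

// Divides first, then multiplies: the intermediate quotient b / c stays small.
float divide_then_multiply(float a, float b, float c) {
    return a * (b / c);
}
```

With a = b = c = 1e30f, the first version overflows to infinity in the intermediate product, while the second returns 1e30f. The two expressions are equal on paper but not in float arithmetic, which is why the compiler is not free to swap one for the other by default.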

The mechanism that Erik referred to is called Common Subexpression Elimination, and pretty much every compiler supports it to some degree.
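For illustration, here are the two variants from the original question wrapped in functions (the function names and the reference-parameter signature are assumptions, not from the thread). Since a * b / c already parses as (a * b) / c, an optimizing compiler is allowed to compute the shared product once in the first version without changing the result.

```cpp
// Version 1: the product a * b is written out twice.
void divide_twice(float a, float b, float c, float d, float& x, float& y) {
    x = a * b / c;
    y = a * b / d;
}

// Version 2: the product is stored in a named temporary.
void divide_with_temp(float a, float b, float c, float d, float& x, float& y) {
    const float ab = a * b;
    x = ab / c;
    y = ab / d;
}
```

Both versions perform the same operations in the same order, so they should return identical values. Whether the generated assembly is also identical depends on the compiler and its optimization settings.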

If you don't care about the exact order of the operations, you can turn on unsafe math optimizations (-ffast-math for the gcc compiler), which will allow the compiler to optimize more aggressively, possibly pulling out subexpressions even if this changes the order of additions within a sum, or the order of multiplications within a product.

Generally, if you have a large expression in which different parts of it meaningfully represent different values which make up the equation, then it makes sense to do this.

For example, it would be fruitless to do this when you're just doing some arbitrary operations. But if you have a formula which uses variables in it which you have to calculate, then it makes sense to abstract away these variables. In the distance formula:

d = sqrt((x2-x1)^2 + (y2-y1)^2)

(x2-x1) and (y2-y1) represent meaningful values: the x and y distance. Thus it would make sense to abstract the formula away to:

d = sqrt(xdistance^2 + ydistance^2)

Where:

xdistance = (x2 - x1)

and

ydistance = (y2-y1)
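In code, the same idea might look like the sketch below. The function name and parameter order are made up for the example; the temporaries are there for their names, not for speed.

```cpp
#include <cmath>

// Distance between (x1, y1) and (x2, y2), with the per-axis
// differences given meaningful names.
float point_distance(float x1, float y1, float x2, float y2) {
    const float xdistance = x2 - x1;
    const float ydistance = y2 - y1;
    return std::sqrt(xdistance * xdistance + ydistance * ydistance);
}
```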

I'm a game programmer and computer science ninja !

Here's my 2D RPG-Ish Platformer Programmed in Python + Pygame, with a Custom Level Editor and Rendering System!

Here's my Custom IDE / Debugger Programmed in Pure Python and Designed from the Ground Up for Programming Education!

Want to ask about Python, Flask, wxPython, Pygame, C++, HTML5, CSS3, Javascript, jQuery, C++, Vimscript, SFML 1.6 / 2.0, or anything else? Recruiting for a game development team and need a passionate programmer? Just want to talk about programming? Email me here:

hobohm.business@gmail.com

or Personal-Message me on here !

Make your code readable and maintainable first. That may mean giving a common sub-expression a meaningful name and a temporary variable because it makes sense, or it may not.

Only if your profiler actually tells you that this particular multiply is killing your performance should you actually change it.

Programmers (even ones who have been doing it for years) are horrible at knowing what is slowing down their code. Use and love your profiler :)

Thanks to everyone for the advice. I agree with pretty much everything said.

I'm always afraid to rely so heavily on profiling. A game (or any complex program) seems like a huge mess of fluctuating circumstances, where any small dynamic change can have a noticeable effect on performance. Is there a way to find the bottlenecks when the necks can morph and move around as the game state changes? In addition, newer programmers may write a lot of sluggish routines, which when used together, wouldn't create a bottleneck at all, right? Or at least not one that stands out much.

I don't think these two things are much of an issue for experienced programmers who know the impact that each code fragment they write will have on the CPU/GPU, but it's especially challenging for those who are mostly in the dark about such things.

Yes. You find these bottlenecks with a profiler.

Modern processors are fast. In the x86 family, each processor currently has a maximum rate of 4 CPU instructions per cycle. Internally, within the out-of-order (OOO) core, some modern processors can perform upwards of 20 micro-operations per cycle. You get roughly two billion cycles per second, per processor.

In total, that is somewhere around thirty billion to sixty billion operations per second, depending on the exact usage. Individual CPU instructions are plentiful and cheap. The slightly bigger cost comes when you need to make a round trip out to memory, which can take anywhere from a few nanoseconds to around a hundred, depending on whether the data is already in cache.

A single instruction is almost no time. A single memory lookup is almost no time. The cost comes when you run massive loops, running through thousands of items thousands of times. Suddenly the "almost no time" gets multiplied by very large numbers and becomes "a noticeable time".



It is generally best to just write code. Don't worry too much about performance. Solve the problem using any algorithm you can think of. Then, only if performance becomes an issue, you can use a profiler to find the issues.
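A real profiler is the right tool for this, but a crude manual timer can give a first impression. The sketch below is that swapped-in technique, not a profiler; the workload function, the `Timed` struct, and all names are hypothetical.

```cpp
#include <chrono>
#include <vector>

// Hypothetical workload: sums a * b / c over every element a of v.
float sum_scaled(const std::vector<float>& v, float b, float c) {
    float total = 0.0f;
    for (float a : v) {
        total += a * b / c;
    }
    return total;
}

// Result of one timed run: the computed value and the elapsed wall time.
struct Timed {
    float result;
    double milliseconds;
};

// Runs sum_scaled once and measures it with a monotonic clock.
Timed time_sum_scaled(const std::vector<float>& v, float b, float c) {
    const auto start = std::chrono::steady_clock::now();
    const float result = sum_scaled(v, b, c);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {result, elapsed.count()};
}
```

A single run like this is noisy, so it is only useful for spotting large differences. For anything subtle, a proper profiler or repeated measurements are a better guide.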

Over the years, most of the big problems I've seen have not been caused by an errant multiplication. The problem is when programmers accidentally embed a nested loop inside a nested loop inside a nested loop, with the innermost loop requiring a lot of memory loads. Then it shows up as a blip on the profiler as something that is slightly slow.
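To make the growth concrete, this small sketch (the function name is made up) simply counts how often the innermost body of three nested loops over n items runs.

```cpp
#include <cstddef>

// Counts executions of the innermost body for three nested loops over n items.
std::size_t inner_iterations(std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t k = 0; k < n; ++k) {
                ++count;  // stands in for the real per-item work
            }
        }
    }
    return count;
}
```

For n = 100 the body already runs a million times, so whatever the inner work costs is multiplied by n cubed. That multiplier, rather than any single multiply, is usually what a profiler ends up pointing at.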

The nearly half-century-old saying that "premature optimization is the root of all evil" still applies today. That does not mean we should write intentionally bad implementations; rather, for nearly all source code (ad-hoc numbers suggest between 97% and 99% of it), performance doesn't matter. For the small remainder (under 3%) where performance actually does matter, there are tools to help you identify exactly what needs to be changed. Usually those few changes are easy for an experienced developer to identify and correct.

Don't worry about it, just write whatever works.

Why ask us when you can ask your compiler? http://goo.gl/PcN9vz

You can see that both functions produce the same assembly.

This topic is closed to new replies.
