Dive_With_Me

Optimization, C++


There are two things about optimization that I can't really find any good info on on the internet. Both are regarding assignment.

First: is it more efficient to write

myvalue = myvalue2 = SomeValue;

than

myvalue = SomeValue;
myvalue2 = SomeValue;

?

The second thing I wonder about: is a simple addition (or other simple mathematical operation) faster than finding and assigning a value from a dynamic array (one allocated with new)? E.g., when Index is a variable and not a constant, is it more efficient to write

MyValue = SomeDynamicArray[Index];

than

MyValue = SomeValue + SomeOtherValue;

?

The first question is irrelevant because it either a) applies to basic types, at which point both forms result in identical code at the machine level, or b) refers to more complex types, at which point the actual implementation of that type is what determines which is faster. Use the second form, because it's more readable and packs fewer side effects into a single statement.

As for the second question, an array access at a non-constant index always requires an addition.

In general, to determine the fastest way to do something, try both ways, profile and choose the fastest one.

You're asking about micro-optimizations. In the majority of cases (excluding what has been said about complex objects with overloaded operators), the compiler will generate identical code.

In the situations where that might not be the case, there is no answer without much more contextual information.

Writing code that is micro-optimized in this fashion, where you are attempting to preempt the compiler's very, very good optimization abilities, is the hallmark of a very, very bad programmer. Don't go down that road.

Your job is to write code that works, code that is understandable and robust. It is not to worry about wringing every last bit of speed (or trying to, at least) out of every line of code in your program. That's what profilers are for; they tell you what is slow, then you fix it.

Thanks for the input.

I did mean that all the variables in the example are basic types.

Okay, for example 1 I was expecting the same result.

What do you mean by "an array access at a non-constant index always requires an addition"?

Yes, this is micro-optimization.
But micro-optimization can be worthwhile when you are dealing with a small piece of code that loops continuously in a program - i.e. small but critical code that needs to be fast.

You all mention profilers; I'm not very familiar with them. I'm using VC++ 2005 Express Edition - any tips on where I can learn more about profiling would be great...


Profilers can include such things as Intel's VTune, or AMD's CodeAnalyst.

Here's a link off AMD's site with a few nice utilities. Strange that they don't like Intel's stuff :P

As stated above, micromanagement of this type is usually irrelevant today.

In the end, you're looking at

MOV myvalue, SomeValue
MOV myvalue2, SomeValue

The only time I can remember the first form mattering was back when disk space was extremely limited (we're talking kilobytes), and people micromanaged like that to free up every byte for more code. Then again, most of the C compilers I remember from that time had an 8-character limit on variable names.

The general rule of micromanaging:

1) Profile your code and see where your bottleneck is
2) Rewrite the slow algorithms to be more efficient
3) If you get to the point where you need to micromanage small tidbits such as bit shifts and other things like that, go back to step 2

Not to say that it can't be useful at times, but if you don't understand what the outcome of your example will be, you most likely won't know how to micromanage your way to every free cycle either. If things are too slow, try posting some code, and I'm sure people here will enlighten you on ways to make it faster.

Quote:
Original post by Dive_With_Me
Yes, this is micro-optimization.
But micro-optimization can be worthwhile when you are dealing with a small piece of code that loops continuously in a program - i.e. small but critical code that needs to be fast.

Usually the optimization can be done at the algorithm level, not the code level. If you can find a better and faster algorithm to replace the one you are using, that's a far better optimization than any micro-optimization.

Quote:

You all mention profilers; I'm not very familiar with them. I'm using VC++ 2005 Express Edition - any tips on where I can learn more about profiling would be great...

Profiling is just a fancy word for asking "how fast is this code running?" You basically start a timer before you execute a piece of code, stop it after the code is done, and record the elapsed time.


For your first question: I've learned that it's an ugly construct that can generate bugs in other languages/compilers. For example:

a = b = foo(); // foo() does something

should be translated as:

a = foo();
b = a;

but some compilers can translate that as:

a = foo();
b = foo();

which is not what you want. A buggy compiler, but they are out there.

Quote:
Original post by Dive_With_Me
What do you mean by "an array access at a non-constant index always requires an addition"?


a[i] is defined to be exactly the same as *(a + i), so array access requires an addition. This means you won't gain anything by replacing an addition with an array access. Where you might see some gain is in replacing trig functions, exponentials, or combinations of them with an array access.

This also means that a[i] is exactly the same as i[a], but other than IOCCC entries, I'm hard pressed to think of a good use for this (and even IOCCC entries try to find better obfuscation techniques).

Quote:

Yes, this is micro-optimization.
But micro-optimization can be worthwhile when you are dealing with a small piece of code that loops continuously in a program - i.e. small but critical code that needs to be fast.


Do you know that the current method isn't fast enough?

Assuming the current method isn't fast enough, here's some advice: the fastest operation is to not do anything at all. You don't really get faster than a no-op. You think I'm kidding? Consider what others have said about finding better algorithms. If you replace your O(n²) algorithm with an O(n) algorithm, you've essentially replaced (n² - n) operations with no-ops.

You should learn to do this kind of thing yourself: learn a little asm and look at your compiler's assembly output when micro-optimizing. When optimizing at a higher level, or sometimes even when micro-optimizing, use a profiler.

Quote:
Original post by alnite
Quote:

You all mention profilers; I'm not very familiar with them. I'm using VC++ 2005 Express Edition - any tips on where I can learn more about profiling would be great...

Profiling is just a fancy word for asking "how fast is this code running?" You basically start a timer before you execute a piece of code, stop it after the code is done, and record the elapsed time.

No, a profiler is a performance analyser. It can give you lots of interesting data: how many misaligned data accesses do you make? What about cache misses? Which function takes the most time on average? Which function is called the most often? Which assembly instruction is used the most? What is the ratio of SSE to x87 instructions? And so on.

Quote:
For your first question: I've learned that it's an ugly construct that can generate bugs in other languages/compilers. For example:

a = b = foo(); // foo() does something

should be translated as:

a = foo();
b = a;

No, that doesn't have the same behavior in all cases. If you had to translate it, the following would be more appropriate:

b = foo();
a = b;

If you need to see why, consider this:

#include <iostream>
#include <ostream>

struct C
{
    C& operator=(int)
    {
        std::cout << "C=" << std::endl;
        return *this;
    }
};

struct D
{
    D& operator=(const C&)
    {
        std::cout << "D=" << std::endl;
        return *this;
    }
};

int main()
{
    D d;
    C c;
    d = c = 5;

    return 0;
}


Quote:
but some compilers can translate that as:

a = foo();
b = foo();

which is not what you want. A buggy compiler, but they are out there.

Could you give an example? If a compiler were that bad, then I'd also be afraid that it would evaluate 2+2 to 5.

Quote:
Original post by CTar
Quote:
For your first question: I've learned that it's an ugly construct that can generate bugs in other languages/compilers. For example:

a = b = foo(); // foo() does something

should be translated as:

a = foo();
b = a;

No, that doesn't have the same behavior in all cases. If you had to translate it, the following would be more appropriate:

b = foo();
a = b;

If you need to see why, consider this:

#include <iostream>
#include <ostream>

struct C
{
    C& operator=(int)
    {
        std::cout << "C=" << std::endl;
        return *this;
    }
};

struct D
{
    D& operator=(const C&)
    {
        std::cout << "D=" << std::endl;
        return *this;
    }
};

int main()
{
    D d;
    C c;
    d = c = 5;

    return 0;
}


Quote:
but some compilers can translate that as:

a = foo();
b = foo();

which is not what you want. A buggy compiler, but they are out there.

Could you give an example? If a compiler were that bad, then I'd also be afraid that it would evaluate 2+2 to 5.

Heh, it was a typo. I did mean:

b = foo();
a = b;

I don't remember which one anymore, but it was a commercially available Java optimizer. I don't understand why it would do that, but writing something like that was generally considered a huge warning sign at my company.

Thanks, I'm getting the profiler and then I'll try to figure out how it works.
Reading all your posts, I guess the only right way is to profile whatever alternative code pieces you have and then decide which one to use.

Thanks to Way Walker for clearing up the array thing, but I think I was misunderstood from the beginning.
My question is really this:

int Somevalue,
Someothervalue;

Is this piece of code more efficient:

for (Somevalue = initial_value; Somevalue < max; Somevalue += something)
{
    for (Someothervalue = initial_value2; Someothervalue < max2; Someothervalue += somethingelse)
    {
        foo[Index] = Somevalue;
        foo[Index + 1] = foo[Index] + Someothervalue;
    }
}

than this:

for (Somevalue = initial_value; Somevalue < max; Somevalue += something)
{
    for (Someothervalue = initial_value2; Someothervalue < max2; Someothervalue += somethingelse)
    {
        foo[Index] = Somevalue;
        foo[Index + 1] = Somevalue + Someothervalue;
    }
}

?
The code I have is not this simple and it would take quite some time to explain conceptually what it does, but I think this conveys the issue; if not, I'll try to clarify.
In other words, is it more efficient to assign from a dynamic array where a value is stored, or is it more efficient to calculate the same value when the calculation does not require a lot of math ops?
Another way to put it might be (not sure if it is correct to put it like this): "what is the overhead of getting a value from a dynamic array?"

I know the speed of the algorithm is sufficient for my solution, so now I also want to make the code as efficient as possible - not because it's too slow but because I like the challenge of optimization and I want to learn how to write code that is as efficient as possible.

The fastest, above, would be:

const int x = (max - initial_value) % something + initial_value;
const int y = x + (max2 - initial_value2) % somethingelse + initial_value2;
foo[Index] = x;
foo[Index + 1] = y;


Since the loops are obviously not required.

Quote:
Original post by Dive_With_Me
Thanks, I'm getting the profiler and then I'll try to figure out how it works.
Reading all your posts, I guess the only right way is to profile whatever alternative code pieces you have and then decide which one to use.

Thanks to Way Walker for clearing up the array thing, but I think I was misunderstood from the beginning.
My question is really this:

int Somevalue,
    Someothervalue;

Is this piece of code more efficient:

for (Somevalue = initial_value; Somevalue < max; Somevalue += something)
{
    for (Someothervalue = initial_value2; Someothervalue < max2; Someothervalue += somethingelse)
    {
        foo[Index] = Somevalue;
        foo[Index + 1] = foo[Index] + Someothervalue;
    }
}

than this:

for (Somevalue = initial_value; Somevalue < max; Somevalue += something)
{
    for (Someothervalue = initial_value2; Someothervalue < max2; Someothervalue += somethingelse)
    {
        foo[Index] = Somevalue;
        foo[Index + 1] = Somevalue + Someothervalue;
    }
}

?
The code I have is not this simple and it would take quite some time to explain conceptually what it does, but I think this conveys the issue; if not, I'll try to clarify.
In other words, is it more efficient to assign from a dynamic array where a value is stored, or is it more efficient to calculate the same value when the calculation does not require a lot of math ops?
Another way to put it might be (not sure if it is correct to put it like this): "what is the overhead of getting a value from a dynamic array?"

I know the speed of the algorithm is sufficient for my solution, so now I also want to make the code as efficient as possible - not because it's too slow but because I like the challenge of optimization and I want to learn how to write code that is as efficient as possible.

In your example Somevalue is a value, not an expression, so it isn't re-evaluated the way you seem to believe. If we have this code:

int a = 5 * 100;
int b = a;
int c = a;

then what happens (no optimizer, x86-like architecture) is that we put 5 in a register and multiply that register by 100. We then store the register's value to some memory location (a). We then copy the contents of that memory location to a new memory location (b), and then do the same again to another new memory location (c). So 5 * 100 is only evaluated once.

Accessing a specific element of an array is almost always (I can't imagine a system where this isn't true) slower than accessing a specific memory location.

Access single variable a:
1. Load contents at &a

Access element b of array a:
1. Load contents at &b
2. Multiply by size of one element of a (this is a compile time constant, and will be hardcoded in the assembly, therefore no need to load it)
3. Add with &a
4. Load the contents of the value we now have

If what you mean is whether
1) a[i] = f();
   b = f();
or
2) a[i] = f();
   b = a[i];
is the fastest, then that depends on how much work f needs to do. If f needs to do a lot, number two is the fastest, but the optimizer should be able to figure this out by itself, so do whatever is the most readable.

Thanks again for the input.

First of all:
Loops are necessary in my case.
When I wrote Index I meant it to be an integer that I would increase depending on some conditions in my loops - not a literal that does not change inside the loop.
Sorry about that.

Yes, it depends on how much work f() needs to do; that is my question - when should I choose to calculate the value and when should I use one that is stored in a dynamic array?
My guess is that the profiler will have the best answer to this question, so I'll run it by him =)

Quote:
Original post by Dive_With_Me
What do you mean by "an array access at a non-constant index always requires an addition"?
What he means is that this:
val = a[x];
is exactly the same as this:
val = *(a + x);
or for that matter, this:
val = *(x + a);
and strangely enough, even this:
val = x[a];
Quote:
Yes, this is micro-optimization.
But micro-optimization can be worthwhile when you are dealing with a small piece of code that loops continuously in a program - i.e. small but critical code that needs to be fast.
Sure, but the only way to know which loops actually take a significant amount of time, and are worth optimising, is by profiling.
Even better, by performing higher-level optimisations first, you might be able to do away with the loop entirely, or something like that.

