operator efficiency

This topic is 4811 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Hi all, I have a question about operator overloading and the implications of using auxiliary (non-member) operators over member functions. For the sake of simplicity and argument, I have a class, say CPos, with x, y and z doubles. To add two positions together I do:
...
CPos a(...), b(...), c(...);
...

CPos &CPos::Add(const CPos& l, const CPos& r) {
    x = l.x + r.x;
    y = l.y + r.y;
    z = l.z + r.z;	
    return *this;
}



which enables me to call the very (what I think is) efficient:
//assign a + b to c
c.Add(a, b);

now if I make an auxiliary operator for this, I do something along the lines of:
const CPos operator+ (const CPos &l, const CPos &r) {
	return CPos().Add(l, r);
}



//so the user has the option of doing:

a = b + c;  //slow
a = b + 1;  //slow
a = 1 + b;  //slow; works because of the NON-member operator
a.Add(b, c);    //fastest



I admit my knowledge of this area is shaky, so please point out any stupid mistakes I have made, but my query is basically along the lines of: can I make an operator that works just as fast as the member function? I know it is slower because it returns by value rather than by reference. Can operators return by reference? Thanks.

Hi,

You shouldn't worry too much about this level of
optimization.

But, to answer your question, you'd probably want to keep the
number of redundant function calls to a minimum.

Here is how I would re-implement your system:

class CPos {
public:
    float x, y, z;
    // add appropriate constructor code here.
    const CPos operator +(const CPos& rhs) const {
        CPos r;
        r.x = x + rhs.x;
        r.y = y + rhs.y;
        r.z = z + rhs.z;
        return r; // return the sum by value; 'this' is left unchanged
    }
};



Do you understand what's going on above?

EDIT: don't write code when you have the flu.

Don't be afraid to use the features of C++. When used
appropriately and sensibly, performance (in comparison with other
OO languages) is not even an issue.

Also remember that the clearer and cleaner your code is at
showing the compiler its intentions, the better job the compiler
can do at optimizing it (in most cases, far better than a human
ever could).

Cheers,
Danu

PS. me thinks we need a small FAQ section on optimization or something.

[Edited by - silvermace on October 10, 2004 12:44:37 AM]

Hi, thanks for your reply. I just want to get the very basic stuff optimized like crazy :)

I understand that what you've written is going to be as fast as it gets (being an inlined member operator returning an address) - is that right?

I think part of my worry is balancing maintainability with efficiency... as in your case I believe my third example of what the user can do:


a = 1 + b;

// will not work (as it will try)

a.operator=(1.operator+(b));

or something to this effect



Well, thanks for clearing this up for me :) As graphics programming students, we have it drilled into our heads how important efficiency is at every step of the way!
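
A minimal sketch of why the non-member form matters for `a = 1 + b`. The thread never shows the constructors of CPos, so the converting constructor `CPos(double)` below (which sets all three components to the same value) is an assumption made purely for illustration.

```cpp
// Hypothetical CPos, reduced to what the example needs.
struct CPos {
    double x, y, z;
    CPos() : x(0), y(0), z(0) {}
    CPos(double v) : x(v), y(v), z(v) {}   // assumed: implicit conversion from a scalar
    CPos(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
};

// Non-member operator: both operands are ordinary parameters, so either
// side may be implicitly converted from a number.
inline CPos operator+(const CPos& l, const CPos& r) {
    return CPos(l.x + r.x, l.y + r.y, l.z + r.z);
}
```

With this in place both `b + 1` and `1 + b` compile. If operator+ were a member function, `1 + b` would be rejected, because the left operand of a member operator is never implicitly converted.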

Quote:
Original post by silvermace
Hi,

You shouldn't worry too much about this level of
optimization.

But, to answer your question, you'd probably want to keep the
number of redundant function calls to a minimum.

Here is how I would re-implement your system:
*** Source Snippet Removed ***
Do you understand what's going on above?

This is a bad idea. You have redefined the fundamental behavior of operator+ to be mutating. Instead of operator+, you should be using operator+=.
Quote:

Don't be afraid to use the features of C++. When used
appropriately and sensibly, performance (in comparison with other
OO languages) is not even an issue.

Yes, and with that in mind, always make sure that your operators behave as one would expect them to. operator+ should be non-mutating, for instance.
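
One common way to keep the two operators consistent is sketched below: the mutating operator+= is a member that returns a reference, and the non-mutating operator+ is a non-member built on top of it. The CPos definition here is a stand-in written for this example, not the original poster's class.

```cpp
struct CPos {
    double x, y, z;
    CPos(double x_ = 0, double y_ = 0, double z_ = 0) : x(x_), y(y_), z(z_) {}

    // Mutating form: modifies the left operand and returns a reference to it.
    CPos& operator+=(const CPos& rhs) {
        x += rhs.x;
        y += rhs.y;
        z += rhs.z;
        return *this;
    }
};

// Non-mutating form: lhs is taken by value, so the caller's object is
// copied; the sum is accumulated into the copy, which is then returned.
inline CPos operator+(CPos lhs, const CPos& rhs) {
    lhs += rhs;
    return lhs;
}
```

This also speaks to the original question: operator+= can safely return by reference because the object it refers to outlives the call, whereas operator+ has to return by value because its result is a new object.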

I think silvermace is correct in proposing this function, because the '+' operator is returning the resulting value, very much like '+='.
The difference is that '+' returns the result as an expression which may or may not be used, whereas '+=' stores the result into the assigned variable.

Edit: This is wrong, don't believe it! See next post
const CPos &operator+(const CPos &rvalue) {
x += rvalue.x;
y += rvalue.y;
z += rvalue.z;
return *this;
}

This enables you to execute "b + c", which will be executed like "b.operator+(c)"

In order to execute "a = b + c", you additionally need the assignment operator defined, like so:

const CPos &operator=(const CPos &rvalue) {
x = rvalue.x;
y = rvalue.y;
z = rvalue.z;
return *this;
}

"a = b + c" will then be executed like "a.operator=(b.operator+(c))"


You could also implement the '+=' operator if you'd like to use it. Other than that, I would strongly recommend not implementing any operators which do not behave exactly as they do with primitive types, because nobody other than you (and at a later time maybe not even you) will expect them to behave in a "wrong" way.
If you need to have other behaviour, best stick with normal named functions like "CPos::assignSum(const CPos& value1, const CPos& value2)"


About your optimization concerns: I second the opinion of silvermace.
This is all just about one extra member-function call, which is really very little in execution time. I don't know how serious performance is to you, but I can hardly imagine that the difference is going to matter.

See, if speed really is that important, it might be better not to use objects at all, and rather do it all with primitive types in pure C style. The benefit of an object model like the one you are using is readability: readability traded against a tiny little bit of speed. So does it really make sense to give away readability only to spare a single function call, just to regain a hardly measurable fraction of that speed?

AV

[Edited by - beefsteak on October 9, 2004 8:12:43 PM]

I gave it a second thought, and I think Washu is right about the '+' operator, and I was wrong. :-)
I mustn't change the values of the object instance itself, because after "b + c", b must still have the same value inside.

Quote:
Original post by beefsteak
I gave it a second thought, and I think Washu is right about the '+' operator, and I was wrong. :-)
I mustn't change the values of the object instance itself, because after "b + c", b must still have the same value inside.

I take back all of the nasty things I said about your name just a second ago...well, ok, not all of them.

It is important to make sure that code is always consistent. When you come back to a code base even a few months later, then something stupid like making operator+ mutating would be completely forgotten. So you would make obvious mistakes, and bugs would run rampant.

Quote:
Original post by Striken
I just want to get the very basic stuff optimized like crazy :)
Bad idea. You should want to get the basic stuff as correct as possible. No matter how much you work on your Add method, it is likely not going to be the bottleneck in your application.

Never optimize without profiling first. At this stage, focus on writing syntactically and semantically correct code. Once that builds and performs correctly, then you want to profile your application to determine what portions spend the most time executing. You'll want to examine those portions to determine if you can make any simple changes to improve efficiency, followed by algorithm changes, finally culminating in code-intensive but probably platform-specific optimizations.
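
As a rough illustration of measuring before optimizing, here is a small timing harness. It is not a substitute for a real profiler, and the names `time_additions`, `step` and `acc` are made up for this sketch; it also assumes a C++11 compiler for `std::chrono` and brace initialization, which postdates this thread.

```cpp
#include <chrono>
#include <cstddef>

struct CPos { double x, y, z; };

inline CPos operator+(const CPos& l, const CPos& r) {
    return CPos{l.x + r.x, l.y + r.y, l.z + r.z};
}

// Runs `iterations` additions and returns the elapsed time in seconds.
// The accumulated result is written to `out` so the optimizer cannot
// discard the loop as dead code.
inline double time_additions(std::size_t iterations, CPos& out) {
    const CPos step{1.0, 2.0, 3.0};
    CPos acc{0.0, 0.0, 0.0};
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        acc = acc + step;
    }
    const auto stop = std::chrono::steady_clock::now();
    out = acc;
    return std::chrono::duration<double>(stop - start).count();
}
```

Timing the same loop with a named Add member instead of operator+, in an optimized build, is one way to check whether the difference being worried about is measurable at all.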

Washu has it right. Non-mutating operators should retain that status. Take the following example, with operator+ acting as silvermace described:

CPos a(1,1,1);
CPos b(2,2,2);

CPos c = a + b;

now, c = (3,3,3) and b = (2,2,2) but guess what: a also = (3,3,3), which is incorrect because a new value was never assigned to a.


@beefsteak: Glad you've come to your senses ;-)

Heh, I'm glad I recognized my error before getting my head chopped off. :-)

So, now I tried to write up a '+' function that may actually work, but don't slap me if I'm wrong again. I just gave it a shot. :-)

The following function needs both an assignment operator ('=') and a default (empty) constructor for CPos in order to work properly; the compiler-generated assignment operator is sufficient.


const CPos operator+(const CPos &rvalue) const {
    CPos result;
    result.x = x + rvalue.x;
    result.y = y + rvalue.y;
    result.z = z + rvalue.z;
    return result;
}


AV

Agreed. Your choice of operators most likely will not affect performance AT ALL in the long run, since it probably won't be the bottleneck. Spend your time optimizing stuff that matters; and if you don't know which stuff matters, spend your time figuring out which stuff matters.

Quote:
Original post by Washu
Quote:
Original post by silvermace
Hi,

You shouldn't worry too much about this level of
optimization.

But, to answer your question, you'd probably want to keep the
number of redundant function calls to a minimum.

Here is how I would re-implement your system:
*** Source Snippet Removed ***
Do you understand what's going on above?

This is a bad idea. You have redefined the fundamental behavior of operator+ to be mutating. Instead of operator+, you should be using operator+=.
Quote:

Don't be afraid to use the features of C++. When used
appropriately and sensibly, performance (in comparison with other
OO languages) is not even an issue.

Yes, and with that in mind, always make sure that your operators behave as one would expect them to. operator+ should be non-mutating, for instance.


That was obviously a mistake. You're a mod, why didn't you fix it?!

I've checked. If you turn on optimization and inline your operator+ and operator=, then on MSVC 6.0 you'll probably get the same assembly output in both cases. There's no call to = and no call to +; it just adds the vector components one by one and stores them in the destination.

Quote:
Original post by Dmytry
I've checked. If you turn optimization, inline your operator+ and operator=...
inline is only a suggestion. Stop using it, because the compiler does a better job of determining what needs to be inlined than you will in 99% of cases - and it can still ignore you!

If you're absolutely sure you want your function inlined, you can use compiler-specific extensions like __forceinline, though I wouldn't recommend it lightly, for the same reasons.

Quote:
Original post by Oluseyi
Quote:
Original post by Dmytry
I've checked. If you turn optimization, inline your operator+ and operator=...
inline is only a suggestion. Stop using it, because the compiler does a better job of determining what needs to be inlined than you will in 99% of cases - and it can still ignore you!

Actually it might still be a good idea to use it. I thought the same thing, but a friend's project (some heavy compression stuff) got some rather large speed-ups through careful use of 'inline' (with the VS6 compiler).

Admittedly it's probably highly compiler-dependent, but my point is that it's not a redundant keyword just yet.

Quote:
Original post by Oluseyi
Quote:
Original post by Dmytry
I've checked. If you turn optimization, inline your operator+ and operator=...
inline is only a suggestion. Stop using it, because the compiler does a better job of determining what needs to be inlined than you will in 99% of cases - and it can still ignore you!

If you're absolutely sure you want your function inlined, you can use compiler-specific extensions like __forceinline, though I wouldn't recommend it lightly, for the same reasons.

No, "for the same reasons" you should not use __forceinline, and should use inline instead. Why? Because, as you said, inline can be ignored if there's no need to inline. That is, inline is a hint and __forceinline is not, so __forceinline can really hurt performance a lot.

And I don't want to blow up the size of my programs by inlining everything. Many compilers either inline too much or don't inline at all.
There are many common myths about compilers. The two most common are that compilers:
1: are so stupid that they cannot optimize a=b+c;
2: are so smart that they can optimize everything without any hints, and that optimization hints from a human are very bad for performance.
For more-or-less smart compilers, inline is only a "hint" that can be ignored. And stupid compilers have three or fewer modes:
1: Don't inline at all; ignore the inline keyword.
2: Inline only functions marked with the inline keyword.
3: Inline everything they can.
Without the inline keyword you have only modes 1 and 3, and mode 3 works as if you had put __forceinline everywhere you can.

Inlining is needed mainly not because of call overhead, but because with inlining the compiler can optimize more things, IMHO. And I know that vector a+b is nearly always better inlined, while there's no point in inlining a matrix inverse.

To get optimal performance out of such operations you have to vectorize them. Look at a library such as Blitz++ or uBLAS (a C++ BLAS library, part of Boost); they use expression templates to build loop-amortized and vectorized calculations. This is difficult even for an expert to design and code correctly. To maximize performance on modern hardware, you must also arrange your data appropriately to take advantage of SIMD instructions (MMX and/or SSE on a Pentium II or later). For instance, it's common to have an array of structures for points, but the CPU wants a structure of arrays (e.g. one array for x, another for y, another for z).
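
The two data layouts mentioned above can be sketched as follows. The type and function names (`PointAoS`, `PointsSoA`, `add_in_place`) are invented for this example, and whether a given compiler actually emits SIMD instructions for these loops depends on the compiler and its flags.

```cpp
#include <cstddef>
#include <vector>

// Array of structures: the x, y and z of one point sit next to each other.
// A container of points would be std::vector<PointAoS>.
struct PointAoS { float x, y, z; };

// Structure of arrays: all x values are contiguous, likewise y and z.
struct PointsSoA {
    std::vector<float> x, y, z;
    explicit PointsSoA(std::size_t n) : x(n), y(n), z(n) {}
    std::size_t size() const { return x.size(); }
};

// Element-wise a += b. Each loop walks contiguous floats, a pattern that
// optimizing compilers can often vectorize.
inline void add_in_place(PointsSoA& a, const PointsSoA& b) {
    const std::size_t n = a.size();   // assumes b.size() == n
    for (std::size_t i = 0; i < n; ++i) a.x[i] += b.x[i];
    for (std::size_t i = 0; i < n; ++i) a.y[i] += b.y[i];
    for (std::size_t i = 0; i < n; ++i) a.z[i] += b.z[i];
}
```

The trade-off is that a single point is no longer one object, so code that handles points one at a time becomes less convenient in exchange for friendlier memory access when processing many points at once.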

Quote:
Original post by Oluseyi
Quote:
Original post by Dmytry
I've checked. If you turn optimization, inline your operator+ and operator=...
inline is only a suggestion. Stop using it, because the compiler does a better job of determining what needs to be inlined than you will in 99% of cases - and it can still ignore you!

If you're absolutely sure you want your function inlined, you can use compiler-specific extensions like __forceinline, though I wouldn't recommend it lightly, for the same reasons.


inline has semantic meaning beyond the optimization suggestion: it allows the same function definition to appear in more than one translation unit. You can and most likely should use it for operators such as those discussed here. You must use it if you define the operators in a header outside of a class definition (member functions defined inside a class definition are automatically inline); otherwise a program that includes the header from more than one source file is ill-formed, typically failing at link time with multiple-definition errors. It's probably best left to the compiler to decide whether or not to actually inline the object code (e.g. don't use __forceinline unless you really mean it).

To reiterate, the C/C++ keyword inline is not about optimization, it's about correct code. The optimization hint is a compiler implementation afterthought.

And since it's not very clear from the above: if you put the operator definitions in a .cpp file (a separate translation unit), it is very difficult for compilers to optimize calls to them from other files (inlining or any other optimization). MSVC 7.x is the only compiler that I know of that can even attempt to. So operators generally belong in headers, and should be inline code.
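
A sketch of what a header following this advice might look like. The file name `cpos.h` and the scaling operators are chosen only for illustration; the point is where the inline keyword is and is not required.

```cpp
// cpos.h (hypothetical header name)
#ifndef CPOS_H
#define CPOS_H

struct CPos {
    double x, y, z;

    // Defined inside the class body: implicitly inline, no keyword needed.
    CPos& operator*=(double s) {
        x *= s;
        y *= s;
        z *= s;
        return *this;
    }
};

// Defined in a header but outside the class: needs `inline`. Without it,
// every .cpp file that includes this header emits its own definition and
// the link step typically fails with a multiple-definition error.
inline CPos operator*(CPos p, double s) {
    p *= s;
    return p;
}

#endif // CPOS_H
```

Because the full definitions are visible in every file that includes the header, the compiler is free to inline the calls wherever it judges that worthwhile.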

Quote:
Original post by Dmytry
No, "for same reasons" you should not use __forceinline, and use inline instead. Why? Because, as you said, inline can be ignored if there's no need to inline. That is, inline is a hint, and __forceinline is not, so __forceinline can really hurt performance alot.

Sometimes, the profiler shows the optimizing compiler's wrong, and then you have to use __forceinline. The most recent example I've seen of this was some method in wxWindows' wxString.

