Return Value Optimization

16 comments, last by outRider 15 years, 9 months ago
I'm writing a small physics engine, so I've written my vector, matrix, quaternion, ... classes. I have, of course, overloaded the most common operators, and wherever possible I rely on the return value optimization, since my compiler supports it. But I'm still wondering whether the compound operators "+=" and "*=" give better performance than the "+" and "*" operators, even when the compiler supports the return value optimization. Thank you for your replies... [Edited by - johnstanp on July 5, 2008 5:26:27 AM]
Hey, I don't know about your question, but I was wondering:
What is return value optimization?
Well, given that they do different things, how and why do you want to directly compare performance?
It's easy -- try it and profile it. My guess is it won't be any different.
Quote:Original post by Oxyd
It's easy -- try it and profile it. My guess is it won't be any different.


Yes, I could do it that way... The problem is that I don't really trust the accuracy of my profiler (gprof), since the results vary from test to test. I use it simply to get a broad picture.

I just wanted to know whether, in theory, the performance is the same or whether there's a small difference. Anyway, I will do the profiling.
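When a sampling profiler gives noisy results, timing the two forms directly in a loop is one alternative. The following is only a sketch under stated assumptions: `Vec3`, `time_ns`, `TimingResult` and `compare_forms` are hypothetical names invented for illustration, not code from this thread, and the timings it produces depend heavily on compiler, flags and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 {
    double x, y, z;
    Vec3& operator+=(const Vec3& v) { x += v.x; y += v.y; z += v.z; return *this; }
};

inline Vec3 operator+(const Vec3& a, const Vec3& b) {
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

// Calls body(i) for i in [0, iterations) and returns the elapsed
// wall-clock time in nanoseconds, measured with a monotonic clock.
template <typename F>
long long time_ns(std::size_t iterations, F body) {
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) body(i);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

struct TimingResult {
    long long ns_plus;        // time for the "r = a + b" form
    long long ns_compound;    // time for the "r = a; r += b" form
    double checksum_plus;     // accumulated results, so the work is observable
    double checksum_compound;
};

inline TimingResult compare_forms(std::size_t iterations) {
    assert(iterations > 0);
    Vec3 a{1.0, 2.0, 3.0};
    Vec3 b{0.5, 0.25, 0.125};
    double sum_plus = 0.0;
    double sum_compound = 0.0;

    long long t_plus = time_ns(iterations, [&](std::size_t) {
        Vec3 r = a + b;
        sum_plus += r.x + r.y + r.z;
    });
    long long t_compound = time_ns(iterations, [&](std::size_t) {
        Vec3 r = a;
        r += b;
        sum_compound += r.x + r.y + r.z;
    });
    return TimingResult{t_plus, t_compound, sum_plus, sum_compound};
}
```

Calling `compare_forms` several times with a large iteration count and comparing the two timings gives a rough picture. An optimizing compiler may still simplify such a small loop, so the numbers are indicative at best.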


Quote:Original post by h3ro
Hey, I don't know about your question, but I was wondering:
What is return value optimization?


Here is an answer:

Quote:
Return Value
Methods that must return an object usually have to create an object to return. Since constructing this object takes time, we want to avoid it if possible. There are several ways to accomplish this.

* Instead of returning an object, add another parameter to the method which allows the programmer to pass in the object in which the programmer wants the result stored. This way the method won't have to create an extra object. It will simply use the parameter passed to the method. This technique is called Return Value Optimization (RVO).
* Whether or not RVO will result in an actual optimization is up to the compiler. Different compilers handle this differently. One way to help the compiler is to use a computational constructor. A computational constructor can be used in place of a method that returns an object. The computational constructor takes the same parameters as the method to be optimized, but instead of returning an object based on the parameters, it initializes itself based on the values of the parameters.


A simple example:

template <typename T>
Vector3<T> Vector3<T>::operator+( const Vector3<T>& v ) const
{
    return Vector3<T>( x + v.x, y + v.y, z + v.z );
}

Returning an unnamed temporary like this is how you encourage the optimization on a compiler that supports it.
The optimization can then be applied when you write:

v = v1 + v2;

v, v1 and v2 being vectors, of course.

The book "Efficient C++" explains it in its fourth chapter.
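For readers who want to see the effect, here is a small sketch that counts copy-constructor calls. In the usual C++ sense, RVO refers to the compiler eliding the copy of a returned object; the output-parameter approach from the quoted text is a manual alternative to it. Everything here (`Vec3`, the `copies` counter, `add`, `copies_for_initialization`) is hypothetical and made up for illustration. From C++17 on, the copy in `Vec3 v = v1 + v2;` is guaranteed to be elided when the function returns an unnamed temporary; earlier standards permit the elision but do not require it.

```cpp
#include <cassert>

// Hypothetical instrumented vector type; the counter exists only for this demo.
struct Vec3 {
    static int copies;   // number of copy-constructor calls so far
    double x, y, z;

    Vec3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
    Vec3(const Vec3& o) : x(o.x), y(o.y), z(o.z) { ++copies; }
    Vec3& operator=(const Vec3& o) { x = o.x; y = o.y; z = o.z; return *this; }

    // Returns an unnamed temporary, the form that is easiest for a compiler to elide.
    Vec3 operator+(const Vec3& v) const {
        return Vec3(x + v.x, y + v.y, z + v.z);
    }
};

int Vec3::copies = 0;

// Output-parameter alternative from the quoted text: the caller supplies
// the destination, so no object is returned at all.
inline void add(const Vec3& a, const Vec3& b, Vec3& out) {
    out.x = a.x + b.x;
    out.y = a.y + b.y;
    out.z = a.z + b.z;
}

// Returns how many copies were made while evaluating "Vec3 v = v1 + v2;".
inline int copies_for_initialization() {
    Vec3 v1(1.0, 2.0, 3.0);
    Vec3 v2(4.0, 5.0, 6.0);
    Vec3::copies = 0;
    Vec3 v = v1 + v2;   // the result can be constructed directly in v
    assert(v.x == 5.0 && v.y == 7.0 && v.z == 9.0);
    return Vec3::copies;
}
```

Note that `v = v1 + v2;` on an already existing `v` still runs the assignment operator; the elision only removes the copy of the returned temporary.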
Quote:Original post by rip-off
Well, given that they do different things, how and why do you want to directly compare performance?


To see whether writing code this way:
v1 = v2 + ( v3 - v4 ) * a

gives the same performance as:

v1 = v3;
v1 -= v4;
v1 *= a;
v1 += v2;

vi being vectors and "a", a scalar.

But I'll profile the two methods.
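The two forms being compared can be written side by side as follows. This is a sketch, not the poster's actual classes: `Vec3`, `expression_form` and `compound_form` are hypothetical names, and defining the binary operators in terms of the compound ones is a common idiom assumed here rather than something the thread prescribes.

```cpp
#include <cassert>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 {
    double x, y, z;
    Vec3& operator+=(const Vec3& v) { x += v.x; y += v.y; z += v.z; return *this; }
    Vec3& operator-=(const Vec3& v) { x -= v.x; y -= v.y; z -= v.z; return *this; }
    Vec3& operator*=(double s)      { x *= s;   y *= s;   z *= s;   return *this; }
};

// Binary operators built on the compound ones: the left operand is taken
// by value, modified in place, and returned.
inline Vec3 operator+(Vec3 lhs, const Vec3& rhs) { lhs += rhs; return lhs; }
inline Vec3 operator-(Vec3 lhs, const Vec3& rhs) { lhs -= rhs; return lhs; }
inline Vec3 operator*(Vec3 lhs, double s)        { lhs *= s;   return lhs; }

inline bool operator==(const Vec3& a, const Vec3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

// v1 = v2 + ( v3 - v4 ) * a, written as a single expression.
inline Vec3 expression_form(const Vec3& v2, const Vec3& v3, const Vec3& v4, double a) {
    return v2 + (v3 - v4) * a;
}

// The same computation, written with compound assignments only.
inline Vec3 compound_form(const Vec3& v2, const Vec3& v3, const Vec3& v4, double a) {
    Vec3 v1 = v3;
    v1 -= v4;
    v1 *= a;
    v1 += v2;
    return v1;
}
```

Both functions compute the same value, so any performance difference comes down to how well the compiler removes the temporaries in the expression form.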


Quote:Original post by johnstanp
To see whether writing code this way:
v1 = v2 + ( v3 - v4 ) * a

gives the same performance as:

v1 = v3;
v1 -= v4;
v1 *= a;
v1 += v2;

vi being vectors and "a", a scalar.
I'm hesitant to state for certain one way or another about this, since I don't know exactly what your compiler does [of course] or how well its various features are implemented, or even what compiler it is, but any compiler with a reasonable implementation of return value optimization will easily handle the above example.

It's a pretty simple optimization to make, so the odds that your compiler applies it successfully are pretty high. That's not to say it will produce exactly the code of your expanded-out version, but it should produce something equivalent or very close to it.

In short, don't worry about it.
Quote:I'm hesitant to state for certain one way or another about this, since I don't know exactly what your compiler does [of course] or how well its various features are implemented, or even what compiler it is, but any compiler with a reasonable implementation of return value optimization will easily handle the above example.


In fact, with my compiler (GNU g++), the expression v1 = v2 + v3 is computed about 1.5 times faster than v1 = v2; v1 += v3.
I am quite surprised by the difference in speed, but pleased.
Quote:Original post by Oxyd
It's easy -- try it and profile it. My guess is it won't be any different.


You were right: it was just a matter of profiling...

