Trenki

C++ STL containers as return parameter

Hi! Consider the following code:
std::string function1()
{
    // ... do something ...
    return a_string;
}

std::vector<somestruct> function2()
{
    // ... some other code ...
    return some_large_vector;
}

How does C++ handle this? Will it generate copies when returning them from the function, or is there some internal reference counting of the actual data (probably with copy-on-write etc.) that makes copying cheap? I would hope that it will be cheap even if the vector has, say, 1M or more elements. Is there anything in the standard that defines whether it will be cheap? Is it compiler dependent? I'm specifically interested in how GCC >= 4 and MSVC >= 2005 handle this. Thanks.

It's probably specified in the ABI, not the standard. For x86, IIRC, if the return type is too large to fit in eax (or ST0 for floating point), it's returned through the stack: the caller reserves the space and passes a hidden pointer to it, i.e.

foo bar()
{
    foo g;
    // ...
    return g;
}
// ...
foo f;
f = bar();


silently becomes

void bar(foo* result)
{
    // ... constructs the return value in *result ...
}
// ...
foo f;
bar(&f);

I believe in such cases, the copy constructor for the return class would be called.

Somebody who had read through a lot of STL source code once told me that the string class performs reference counting: a copy just copies the pointer, and a full reallocate-and-copy happens only when you modify one of the copies. (Basically, a copy-on-write mechanism.) But I don't know whether all implementations of the STL do this.
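
To make the copy-on-write idea concrete, here is a toy sketch. CowString and its members are hypothetical names invented for illustration; this is not how any particular standard library implements std::string, it assumes a C++11 compiler for std::shared_ptr, and it ignores thread safety entirely.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical toy class: copies share one buffer until somebody writes.
class CowString {
public:
    explicit CowString(const std::string& s)
        : data_(std::make_shared<std::string>(s)) {}

    // The implicit copy constructor copies only the shared_ptr (cheap).

    char get(std::size_t i) const { return (*data_)[i]; }

    // Writing detaches from other owners first: this is the "copy on write".
    void set(std::size_t i, char c) {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
        (*data_)[i] = c;
    }

    // Exposed only so the sharing can be observed from outside.
    const char* buffer() const { return data_->data(); }

private:
    std::shared_ptr<std::string> data_;
};
```

Copying a CowString leaves both objects pointing at the same buffer; the first call to set() on either one gives that object its own private copy. The reference count behind this is what has to be synchronized between threads in a real implementation.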

Quote:
Original post by Trenki
Hi!

Consider the following code:

*** Source Snippet Removed ***
How does C++ handle this? Will it generate copies when returning them from the function, or is there some internal reference counting of the actual data (probably with copy-on-write etc.) that makes copying cheap? I would hope that it will be cheap even if the vector has, say, 1M or more elements.


Vectors will be copied, or at the very least behave as if a copy was made. Strings can be implemented with COW/reference counting, but this is becoming less popular because it can degrade performance in multithreaded environments.


Quote:
Is there anything in the standard that defines if it will be cheap or not? Is it compiler dependant? I'm specifically interested in how GCC >= 4 and MSVC >= 2005 handle this.


The standard specifies that they should be copied. However, special provisions are made in some cases so that if the compiler can prove that a copy is not needed, it can be elided:

12.8.15 (any typos are mine):

"
When certain criteria are met, an implementation is allowed to omit the copy construction of a class object, even if the copy constructor and/or destructor for the object have side effects. In such cases, the implementation treats the source and target of the omitted copy operation as simply two different ways of referring to the same object, and the destruction of that object occurs at the later of the times when the two objects would have been destroyed without the optimization. This elision of copy operations is permitted in the following circumstances (which may be combined to eliminate multiple copies):

- in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object with the same cv-unqualified type as the function return type, the copy operation can be omitted by constructing the automatic object directly into the function's return value.

- when a temporary class object that has not been bound to a reference would be copied to a class object with the same cv-unqualified type, the copy operation can be omitted by constructing the temporary object directly into the target of the omitted copy.

[Example:

class Thing {
public:
    Thing();
    ~Thing();
    Thing(const Thing&);
};

Thing f() {
    Thing t;
    return t;
}

Thing t2 = f();



Here the criteria for elision can be combined to eliminate two calls to the copy constructor of class Thing: the copying of the local automatic object t into the temporary object for the return value of the function f() and the copying of that temporary object into object t2. Effectively, the construction of the local object t can be viewed as directly initializing the global object t2, and that object's destruction will occur at program exit -- end example]
"
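
One way to see what a given compiler does with the quoted rule is to count copy-constructor calls. The sketch below follows the shape of the standard's Thing example; Tracked, make_tracked and copies_made_by_return are hypothetical names, and the count depends on the compiler, the optimization settings and the language version, so no fixed output is claimed.

```cpp
#include <vector>

// Hypothetical instrumented type: counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    std::vector<int> payload;
    Tracked() : payload(1000000, 0) {}
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
};
int Tracked::copies = 0;

// Same shape as f() in the quoted example: returns a named local.
Tracked make_tracked() {
    Tracked t;
    return t;
}

// Returns how many copy constructions the compiler actually performed.
int copies_made_by_return() {
    Tracked::copies = 0;
    Tracked t2 = make_tracked();
    (void)t2;  // silence unused-variable warnings
    return Tracked::copies;
}
```

If both elisions described in the quote are applied, copies_made_by_return() reports 0; if neither is applied, it reports 2. Printing the value under your own compiler flags is a quick way to find out which case you are in.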

As for which compilers implement this when, I honestly don't know. I tend not to worry about returning strings (it's never been an issue under profiling) and tend to "return" other kinds of container into a reference passed to the function.

With the next revision of the standard, we will hopefully see support for "rvalue move semantics", which will make things much cheaper under certain circumstances (more so than reference counting, even). More info.
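
The revision in question became C++11, where rvalue references made it into the language. As a rough sketch of what that buys for containers, assuming a C++11 compiler (make_large_vector and move_reuses_buffer are hypothetical names for illustration):

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: builds a large vector and returns it by value.
std::vector<int> make_large_vector() {
    std::vector<int> v(1000000, 7);
    return v;  // elided or moved under C++11; not a deep copy
}

// Moves `source` into a new vector and reports whether the element buffer
// was handed over rather than duplicated.
bool move_reuses_buffer(std::vector<int>& source) {
    const int* before = source.data();
    std::vector<int> target(std::move(source));
    return target.data() == before;
}
```

A move transfers ownership of the existing allocation instead of duplicating the elements, so the cost does not grow with the size of the vector. After the move, `source` is left in a valid but unspecified state and should be reassigned or cleared before being used again.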

Without RVO or NRVO it will copy the entire vector and its contents, as vectors are not COW. Whether those optimizations are applied is certainly compiler dependent.

You can either rely on your compiler doing this, or find another way to get the data back to the caller. Usually you would do this by passing the function a reference parameter that it uses as an out parameter. Frustrating, I know, but it is all we can do for now.
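
A minimal sketch of the out-parameter approach, reusing the names function2 and somestruct from the original question. The id member and the count parameter are assumptions added only to make the example complete.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type standing in for `somestruct` from the question.
struct somestruct {
    int id;
};

// Out-parameter style: the caller owns the vector, so nothing is copied
// when the function returns.
void function2(std::vector<somestruct>& out, std::size_t count) {
    out.clear();
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        somestruct s;
        s.id = static_cast<int>(i);
        out.push_back(s);
    }
}
```

The caller declares a std::vector<somestruct>, passes it in, and reads the results from it afterwards. The price is a less natural call site, and the function has to decide what to do with any elements already in the vector (here it clears them).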
