C++ small dynamic array/vector returns

Started by
20 comments, last by Aressera 7 years, 1 month ago

Up until now when I needed to return a variable number of items (ie. not know at compile time) from a function I would either:

1) suck it up and use a vector: std::vector<int> Func();

2) pass a vector as an argument: void Func(std::vector<int>& v) { v.clear(); /* fill vector with return values*/ }

3) try to work around it: delgate/function pointer, change the algorithm, etc...

4) store vector in the object (if Func() is part of an object) and return a reference: const std::vector<int>& Class::Func();

Really I want to do number 1 but the global new/delete can be on the slow side. 2-4 are just work arounds that I really don't want to use if I don't have to. So to that end, how do you guys handle variable length arrays/returns? Do you just use new/delete and don't worry about performance (even in tight/critical loops)? Do you avoid allocations altogether in critical code (and if so, what methods are your 'go to' methods)? Do you use a global memory pool/custom allocator?

I remember a while ago (in a thread about C and VLA's if I remember correctly) Hodgman talking about how he uses a stack allocator of some sort (the way I remember him describing it was an allocator that worked like a stack and ran in parallel to the stack, but wasn't actually on the stack).

Advertisement

I'm assuming in case 2) you are passing in a static, or at least scope-level vector that is not constructed for each invokation, but less often? Otherwise 1) and 2) usually don't perform differently due to RVO/copy ellision, but if you can reuse the vector, 2) is obviously faster.

Using a stack allocator is really the way to go, I think. A stack allocator just allocates some larger amount of memory from the heap, and allows further allocations to happen from this preallocated area. Every allocation just increments the stack-pointer, and possibly decrements it after use -> so thats why its called a stack-allocator, because it functions like a stack, but uses custom heap-allocated memory instead of the actual c-stack. That way you can even implement an allocator that doesn't even decrement its stack pointer, but just throws away the entire memory at some point (ie. once per update-tick).

This gives you both the flexibility and the speed, though you obviously have to make the allocator accessible to where you want to use it. If you don't want to rely on globals (which is probably for the better of it), you'd have to pass them around eigther as class members, or to each function that uses it.

Ya for #2 I re-use the vector, as you said there'd be little reason otherwise.

For a stack allocator, how do you manage objects returned from a function? If the stack allocator is decremented when the object is destroyed or at the end of the function, either way you've lost the data you want to return. Course that wouldn't be the case for the 'never delete/clean up at the end' stack allocator, which is something I had never thought of before...

What do you find yourself personally using? What do you prefer?

For a stack allocator, how do you manage objects returned from a function? If the stack allocator is decremented when the object is destroyed or at the end of the function, either way you've lost the data you want to return. Course that wouldn't be the case for the 'never delete/clean up at the end' stack allocator, which is something I had never thought of before...

Shouldn't be a problem eigther way - std::vector<> is a wrapper around a dynamically allocated array, and you're constructing that array from the allocator, not the vector itself. So when returning, it should perform at least move-construction, so if the temporary object is destroyed from the function, it pointers to "nullptr" as its memory array, so it doesn't remove the memory from the allocator. Only if the returned object is destroyed without being further moved, it will decrement the stack pointer.

What do you find yourself personally using? What do you prefer?

Unfortunately I can't give you much pointers on this end, I havn't come around to using a good memory allocation scheme myself. I generally just stay away from dynamic allocations inside tight loops, and for occasional events, I'll just swallow the cost - it works out so far, but I'd really advice anyone to spend a few hours upfront to employ good memory allocation, hopefully someone else can give you more insights on that end (personally it is just too much work at this point and I'm too lazy to do to refactor everything right now :/).

There really isn't as large of a cost as you seem to think. You write "suck it up and return a vector", but this is actually one of the fastest solutions: std::vector<foo> bar = GetFooVector(); The returned object results in creating a single collection and the returned result is used directly, no copies are required.

If you assign to an existing object there is a cost to get rid of what used to be in there, but again the cost is minimal: bar = GetFooVector(); The assignment operator needs to wipe out whatever used to be in the old object, and there is a cost to creating the additional objects, but it is typically a bearable cost, particularly with your vector if integer values.

If someone is using it with an existing collection: bar.swap( GetFooVector() ); could be similar cost if the function's return result is compatible.

If you are not completely replacing the collection with another collection there needs to be some copying that takes place from one buffer to another buffer. For that, there is the option to use the insert() function family or use std::copy() with a back inserter. C++ gives plenty of options.

Really I want to do number 1 but the global new/delete can be on the slow side.

Then do it and don't use global new/delete. The std::vector template takes more than one parameter, although it is optional. All the container classes have a second optional template value for the memory allocator to use. By default it is the global allocator but you are under no obligation to use it. If you want to use a different allocator then do so. Code like std::vector<foo, myallocator> works just fine.

2-4 are just work arounds that I really don't want to use if I don't have to.

Not really, they do something different.

#2 is a great solution if your intention is to reuse an object. Often when you want to reuse something you don't clear it, allowing the calling code to accumulate a large number of items. Maybe they want to accumulate all the items that satisfy multiple searches.

#3 is sometimes necessary if a standard collection type doesn't quite satisfy your needs. For example, in older C++ there was no equivalent of a hash map or hash table in the standard, and many programs either used a hash map library or wrote their own. There is no shame in that, just be aware when you are doing it that somebody out there in the world has probably already implemented the solution for you and you don't need to spend your time writing and debugging your own, effectively re-inventing the wheel.

#4 is perfectly acceptable to if that is really what you need to do. Sometimes you'll have a collection internally that you want to return but you don't want other people manipulating your class's internals; returning a const reference to your internals allows them to see the item and use the contents, but not actually manipulate your collection.

So to that end, how do you guys handle variable length arrays/returns? Do you just use new/delete and don't worry about performance (even in tight/critical loops)?

Normally with option 1 as you described. Create a container in the function and return it, trusting the class user will make smart choices either to use RVO or to use move operators with the result.

Also, be sure to reserve enough space in your function so there is only a single allocation cost. It is frustrating to profile code and discover that someone is doing hundreds of push_back() calls where each one causes the container to expand. The reserve() function is your friend, and you should use it before doing any batch work.

Shouldn't be a problem eigther way - std::vector<> is a wrapper around a dynamically allocated array, and you're constructing that array from the allocator, not the vector itself. So when returning, it should perform at least move-construction, so if the temporary object is destroyed from the function, it pointers to "nullptr" as its memory array, so it doesn't remove the memory from the allocator. Only if the returned object is destroyed without being further moved, it will decrement the stack pointer.

What if we had this:


typedef std::vector<int, StackAllocator> stack_vector;

stack_vector Func() {
 stack_vector v1, v2;
 v1.push_back(1);
 v2.push_back(2);
 return v2;
 }

stack_vector v3 = Func();

The call to Func() uses RVO and v3 ends up with v2's pointer. This would be problematic for a stack based allocator since v1's data would be deallocated before v2's/v3's data. With a stack allocator isn't it required that you have to deallocate in reverse order that you allocated?

No, it is not about reverse order. If you use a stack-based allocator it is only valid within the scope where the stack lives. Once the code leaves the function that uses the stack everything that lived inside that scope is dead and invalid.

If you are using a stack-based allocator the buffer is only available within the scope of that object. C++ assumes you as the programmer have a solid understanding of object lifetimes. You need to know when objects come into existence, when they are valid, and when they are dead.

Several of your posts seem to be focused on a fear of time spent with new and delete. While there is certainly a cost to allocating and releasing memory, it is not a terrible cost. If you need to allocate memory then do so.

No, it is not about reverse order. If you use a stack-based allocator it is only valid within the scope where the stack lives. Once the code leaves the function that uses the stack everything that lived inside that scope is dead and invalid.

If you are using a stack-based allocator the buffer is only available within the scope of that object. C++ assumes you as the programmer have a solid understanding of object lifetimes. You need to know when objects come into existence, when they are valid, and when they are dead.

Several of your posts seem to be focused on a fear of time spent with new and delete. While there is certainly a cost to allocating and releasing memory, it is not a terrible cost. If you need to allocate memory then do so.

The whole point of this post was to see what techniques people were using when returning values from a function, when new/delete were considered too costly.

Juliean suggested a stack allocator but I can't see how it can pass values across a function call, that said I can think of 2-3 different ways a stack allocator could work, I can't see any version of them being useful here; but perhaps I'm wrong, I remember Hodgman talking about the virtues of a stack allocator a while back.

For example, a stack based allocator could be simply alloca (or similar) wrapped in a allocator, and all data is lost at end of function, so there's no way it can be used for returning values. A stack allocator could also be a FILO queue, where each allocation increments a pointer, and each deallocation decrements the pointer. As long as you maintain the FILO ordering, you don't need to worry about (and can cross) function scopes. This is what I imagined Juliean to be inferring to, but as I mentioned to him, this method is not foolproof. The 3rd way I could see implementing a stack allocator would be to never have the stack pointer decremented automatically, simply allocate to the top of the stack, then when you are certain all the data from the allocator is no longer needed, you wipe the entire stack at once. I don't know if I'd call this a true 'stack' (since you're not really unwinding the stack, simply wiping it at pre-determined locations), and this would be highly restrictive (too much to use as a general case I would argue) but it technically would work.

From what I've gathered from your responses you like to use new/delete and don't worry about the cost.

As above, RVO is your friend here :D
However, this is one particular use-case where scope-stack allocation comes into it's own.

Juliean suggested a stack allocator but I can't see how it can pass values across a function call, that said I can think of 2-3 different ways a stack allocator could work, I can't see any version of them being useful here; but perhaps I'm wrong, I remember Hodgman talking about the virtues of a stack allocator a while back.

Scope-stack allocation lets you use the same scope/stack concept which is built into the language for local variables, but instantiate as many other stacks and scope-chains as you want, with the ability to define your own scoping rules besides typical block-level scoping.

So normally you can't do the following kind of thing, because the language's built in scopes and stack are tied to the call graph:


int* MakeData( int* out_numData )
{
  int data[42];//allocate it on the call-stack at function-scope
  *out_numData = 42;
  return data;//return the stack allocation
}//oh shit, this is the end of function-scope, the allocation is deleted now!!!
void Test()
{
  int numData;
  int* data = MakeData(&numData);
  printf( ...something something data... );//crash or worse...
}

But when using scope-stack allocation, you can write this kind of code in a perfectly sane and working way:


int* MakeData( Scope& a, int* out_numData )
{
  int* data = new(a) int[42];//allocate it within the passed-in scope.
  //n.b. this does NOT have the typical high cost of a call to `new` - it's a simple stack allocation, closer in cost to declaring a local variable!
  *out_numData = 42;
  return data;
}
void Test()
{
  Scope temp;//make a custom scope as a local variable, which ties it to the current function-scope.
  int numData;
  int* data = MakeData(temp, &numData);//pass 'temp' as the scope withing which to allocate the data
  printf( ...something something data... );
}//our custom scope 'temp' goes out of scope here, so the data is deallocated. No leaks, no mess, no costly heap allocation.

However, I'm cheating a little bit here to make the point :D Yes, the scope needs to get memory from somewhere, so in that example it would `new` a buffer internally.
In practice, when using this across an entire project you've always got large existing stacks though, so you'd just pass one into `Test`:


void Test(Scope& a)
{
  Scope temp(a);//make a new child scope. This just marks the existing internal stack allocator so it can unwind to the right point at the end of the function.

Or you can grab some memory from an alloca type mechanism:


void Test()
{
  LocalScope<1024> temp;//Instead of using `new` to make the internal buffer, start off by grabbing 1KiB off the call-stack, and only use `new` once it overflows

I see, that's quite clever. Thank-you for the explanation Hodgman. Is this your preferred method of returning dynamic data from functions?

This topic is closed to new replies.

Advertisement