Passing a vector... by reference?

Started by
19 comments, last by Qw3r7yU10p! 18 years, 10 months ago
Quote:Original post by Will F
Quote:Original post by mgarriss

amen! i can't believe the over use of the heap i've seen on these forums. take advantage of C++'s scope-based deterministic destruction. if you must allocate memory using new or because you're using a lib that hands you object that way i would suggest using the boost.smart_ptr lib.


My understanding of the standard library (correct me if i'm wrong - this is from Effective STL item 3), is that when you push_back an object onto a vector, you're not actually putting the object in the vector, rather you're putting a copy of the object into it. And when you get an object back from the container, its a copy of the object. And once inside a container, it's not uncommon for it to be copied further. Meaning you might need a copy constructor and copy assignment operator. Using pointers to objects in containers makes the copying cheap and correct (at the expense of having to use the heap).

Memory deallocation is a pain though, it's not that difficult to .clear() a vector forgetting to delete the objects first.


You're indeed correct about that, however as you point out the dangers of losing the pointers is also quite high, as such storing bald pointer in a container is very often the worse idea you can have, instead as mgarriss says you should be storing a smart pointer (be it boost::shared_ptr or another) which acts like a pointer and is small so the copy operation wont cost you as much as possibly recopying a whole object.

However, as snk_kid points out, quite often you can just store the object its self on their if the copying is light enought (on average you wont copy the objects that often anyways)

edit: ok, as per snk_kid's reponce below you arent totally correct about it, tbh i didnt read your post properly, heh [grin]
Advertisement
Quote:Original post by Will F
is that when you push_back an object onto a vector, you're not actually putting the object in the vector, rather you're putting a copy of the object into it.


Yes, however with std::vector (as is the case with all standard library containers) allocation/deallocation and construction/destruction are separated for efficency reason. typically std::vector allocates uninitialized memory then later on will copy construct elements in-place when needed.

Quote:Original post by Will F
And when you get an object back from the container, its a copy of the object.


Absolutely incorrect, all standard library containers be it through an accessor method or deferencing an iterator returns a reference or constant reference (depending on context) to the element.

Quote:Original post by Will F
And once inside a container, it's not uncommon for it to be copied further.


Standard library containers don't, if you use it naively then its your own fault.

Quote:Original post by Will F
Meaning you might need a copy constructor and copy assignment operator.


You don't, all user-defined types have copy constructors & assignment operators implicitly defined, generated by the compiler that by default does a member-wise (or bit-wise if its a POD-type) copy/assign, if that behaviour is not appropriate for you then either disable them by declaring them private and store pointers or explicitly define a more appropriate one.

Quote:Original post by Will F
Using pointers to objects in containers makes the copying cheap and correct (at the expense of having to use the heap).


Erm exactly how does it make any more correct? and with the default allocator type your using the heap regardless.

If you know what your doing you can make it less expensive, for example using std::vector's reserve method which allocates a chunk of uninitialized memory, then use push_back to just initialize each element which is basically does an in-place copy construction on the uninitialized memory. No overhead of allocating & constructing at the same time.

If you've have actually profiled your code to find that copy construction is costing you then you would consider storing pointers. There are other cases when you have no choice but to store pointers, polymorphic types, types which are non-copyable (like the C++ I/O streams where copying has no meaning), etc.

The reasons why to prefer storing the actual type over pointers:

1. standard library containers is designed to efficently store the actual type.

2. if you store pointers in std::vector you've lost contiguousness of elements (which is important in some contexts) also increasing the chance of memory fragmentation thus hindering preformance.
Quote:Original post by Will F
My understanding of the standard library (correct me if i'm wrong)


Well I did ask to be corrected if I was wrong... I'll admit, my comfort level with the standard library is not where i'd like it to be. All my comments come straight from Effective STL by Scott Meyers. So either I misread Item 3 in the book, or is the book just flat out wrong? It was published in 2001, so I wouldn't think it was comepletely out of date.


Quote:Absolutely incorrect, all standard library containers be it through an accessor method or deferencing an iterator returns a reference or constant reference (depending on context) to the element.


I was more thinking of getting a copy of an object out of a container so you could do something with it - like move it to another container. I suppose that in reality you are going to be doing this so rarely as to make little preformance difference.


Quote:You don't, all user-defined types have copy constructors & assignment operators implicitly defined, generated by the compiler that by default does a member-wise (or bit-wise if its a POD-type) copy/assign, if that behaviour is not appropriate for you then either disable them by declaring them private and store pointers or explicitly define a more appropriate one.


That's what I meant (should have been more clear), I didn't mean to imply that the compiler doesn't implicitly define them - what I meant to say was that you might need to explicitly define one.


Quote:Standard library containers don't, if you use it naively then its your own fault.


Straight from the book:
"Once an object is in a container, it's not uncommon for it to be copied further. If you insert into or erase something from a vector, string, or deque, existing container elements are typically moved (copied) around. If you use any of the sorting algorithms; next_permutation or previous_permutation; remove, unique, or its ilk; reverse or rotate, etc., objects will be moved (copied) around. Yes, copying objects is the STL way"

So, is this incorrect?
I suppose using reserve() on vectors cuts down on worring about insert operations?

Quote:Erm exactly how does it make any more correct? and with the default allocator type your using the heap regardless.

You're argument is with Scott Meyers, not me. His words "An easy way to make copying efficient, correct, and immune to the slicing problem is to create a container of pointers instead of a container of objects."

I'm really confused all of a sudden and am beginning to feel maybe I should just write my own dynamic array class rather than ever using the standard library's vector.


Quote:If you've have actually profiled your code to find that copy construction is costing you then you would consider storing pointers. There are other cases when you have no choice but to store pointers, polymorphic types, types which are non-copyable (like the C++ I/O streams where copying has no meaning), etc.


Good point, I didn't mean to imply that creating something like a vector<int*> was a good idea. But a vector<Sprite*> is probably going to be faster then a vector<Sprite>. Plus the advatages of vector<Sprite*> being usable for polymorphism. Then again, I suppose you're right that it's best to profile the code to make sure it actually makes a noticable difference before you start getting pointer crazy.

Regardless, the point you made in an earlier post about if this is someone's first experience with vectors they should not be using pointers in vectors is probably the most important thing.

Anyway's correct me if i'm wrong. I'm working through the book, and maybe i'm misrepresenting what it's saying. Regardless I don't want to spread misinformation.

[Edited by - Will F on June 10, 2005 7:46:59 PM]
Quote:Original post by Will F
Straight from the book:
"Once an object is in a container, it's not uncommon for it to be copied further. If you insert into or erase something from a vector, string, or deque, existing container elements are typically moved (copied) around. If you use any of the sorting algorithms; next_permutation or previous_permutation; remove, unique, or its ilk; reverse or rotate, etc., objects will be moved (copied) around. Yes, copying objects is the STL way"

So, is this incorrect?
I suppose using reserve() on vectors cuts down on worring about insert operations?


You should have been more specific before i thought you where talking to something else, anyways if your going to do insertion/removal in arbitrary positions, frequently in the middle then your using the wrong container for the job.

Quote:Original post by Will F
You're argument is with Scott Meyers, not me. His words "An easy way to make copying efficient, correct, and immune to the slicing problem is to create a container of pointers instead of a container of objects."


Again you should have been more specific before and i think your taking his words out of context, do you know what he is referring to when he mentions "slicing problem" if you don't he is referring to polymorphic types i.e storing base classes, thats why you should store pointers in this case.

Quote:Original post by Will F
I'm really confused all of a sudden and am beginning to feel maybe I should just write my own dynamic array class rather than ever using the standard library's vector.


Its highly unlikely your going to out do a modern C++ compiler's imp of std::vector.

Not only is it very efficient, it does other things you wont have thought about:

1. Provides a level of exception safety, using RAII idioms, you can read about it here: Standard-Library Exception Safety

2. Seperates allocaiton/deallocation and construction/destruction for efficiency, to do this with C-style dynamic arrays in C++ is low-level advance C++ code, prone to error, its not every-day typical coding (placement new, explicitly invoking destructors, etc). This handled automatically for you.

3. Guaranteed contiguousness of elements (as of C++ 2003 TC1 standard).

4. Parameterized by allocator type allowing you to use custom memory models/schemes with it.

5. std::vector typically implementated with some form of exponential growth strategy instead of naively growing by one element at a time. This has been known to give good overall preformance.

6. Typically implements optimizations you most likely never heard of before, like EMO for example.

Rule of thumb, generally prefer std::vector over C-style dynamic arrays.

Quote:Original post by Will F
Anyway's correct me if i'm wrong. I'm working through the book, and maybe i'm misrepresenting what it's saying.


I think some of it you have, i suggest getting some other books too, particularly stuff from Herb Sutter, C++ In-depth series are the ones to look out for. Also you might want to go through some of Guru of the Week articles.

[Edited by - snk_kid on June 10, 2005 8:51:15 PM]
Quote:Original post by snk_kid
Quote:Original post by Will F
I suppose using reserve() on vectors cuts down on worring about insert operations?


You should have been more specific before i thought you where talking to something else, anyways if your going to do insertion/removal in arbitrary positions, frequently in the middle then your using the wrong container for the job.


Yeah, bad example (I suppose I should have used list there instead of vector). What I should have said there was use reserve() if you know how many elements the vector will contain - before you start using push_back()


Quote:Again you should have been more specific before and i think your taking his words out of context, do you know what he is referring to when he mentions "slicing problem" if you don't he is referring to polymorphic types i.e storing base classes, thats why you should store pointers in this case.


I think i'm doing a bad job of explaining myself here.
An example of slicing would be having a class called wizard, which inherits from Sprite, and then trying to add a wizard to a container of Sprite rather than Sprite*. Correct?
Meyers still does write that "copying pointers is fast, it always does exactly what you expect (it copies the bits making up the pointer)."
I was under the impression that as long as you're aware of the memory issues (making sure to explicitly call delete on the pointer, because clear(), pop_back(), etc won't do it for you) it is generally not a horrible idea to make containers hold pointers to objects.
So - if i'm not using polymorphism - the best way to know whether to use objects or pointers in my code is to profile the code, and only use pointers if there's a significant gain?

Generally when I use containers holding pointers I put it as a private member of a class (for example, I have one called CSpriteManager), and make have the manager take care of the memory management issues (via calls to the manager like addSprite and RemoveSprite) to ensure that no code anywhere else in the program does anything dumb with the list. Is this a good way of doing things - or should I be looking into another way to do it?

Quote:
Quote:Original post by Will F
I'm really confused all of a sudden and am beginning to feel maybe I should just write my own dynamic array class rather than ever using the standard library's vector.


Its highly unlikely your going to out do a modern C++ compiler's imp of std::vector.


I realize that, my point is that where I previously thought I had a basic handle on containers, but you've almost scared me into not wanting to use them. I'd almost rather have a container class that I wrote that I know exactly what's going on, at the cost of efficiency (and probably some nasty bugs).

Quote:Rule of thumb, generally prefer std::vector over C-style dynamic arrays.

I've always believed that.


Quote:I think some of it you have, i suggest getting some other books too, particularly stuff from Herb Sutter, C++ In-depth series are the ones to look out for. Also you might want to go through some of Guru of the Week articles.


Thanks for the suggestions. Now I just need to figure out whether the problem is with my understanding of containers or my ability to communicate effectively (probably a bit of both).

I suppose my next step is to look into shared pointers to make memory management in this situation easier?. I haven't really looked much at boost yet, and it should probably be put high on my todo list.
general rule of thumb:

prefer std::deque over std::vector -
Quote:So, for the three reasons cited earlier: Consider preferring deque by default in your programs, especially when the contained type is a class or struct and not a builtin type, unless the actual performance difference in your situation is known to be important or you specifically require the contained elements to be contiguous in memory (which typically means that you intend to pass the contents to a function that expects an array).
Quote:Original post by Will F
An example of slicing would be having a class called wizard, which inherits from Sprite, and then trying to add a wizard to a container of Sprite rather than Sprite*. Correct?


The problem of object slicing has nothing to do with the standard library containers themselfs, it can happen with or with-out, with C-style arrays, with objects statically or dynamically allocated.

Object slicing occurs when you have a base class which is instacable (its non-abstract class) and you copy (either via assignment or copy constructor) an instance of derived class to a instance of the base class, only the base members of the derived class are copied.

For example let A be a base class (super type) and let B be a derived class of A (B is a sub type of A):

1. with statically allocated objects:
B b;//....A a = b; // copy constructor, object slicedA c;c = a; // assignement, object sliced



2. with dynamically allocated objects:

A* a = new B;//....A* b = new A(*a); // copy constructor, object slicedA* c = new A;*c = *a; // assignement, object sliced


A typical method of correcting this is to use virtual constructor idiom, via a virtual clone method.

I forget to mention something else, when not using pointers/references dynamic type dispatch does not work which is obvious the reason why.

Quote:Original post by Will F
Meyers still does write that "copying pointers is fast, it always does exactly what you expect (it copies the bits making up the pointer)."


Yes but thats a shallow copy, you may not want that it depends on the context, it also at the cost of another level of indirection if its worth that cost depends on the context.

Quote:Original post by Will F
I was under the impression that as long as you're aware of the memory issues (making sure to explicitly call delete on the pointer, because clear(), pop_back(), etc won't do it for you)


Indeed, standard library containers handle there own memory management via the allocator types, any memory management you do yourself is your responsibility.

Quote:Original post by Will F
it is generally not a horrible idea to make containers hold pointers to objects.


Again it all depends on the context.

Quote:Original post by Will F
So - if i'm not using polymorphism - the best way to know whether to use objects or pointers in my code is to profile the code, and only use pointers if there's a significant gain?


Sort of yes it does depends on the context, sometimes you can just tell by looking at the code that copying/assignment is going to cost you big time but most of the time objects are often small so these operations are relatively inexpensive, especially if you can take advantage of compiler generated version of these (the default copy constructor and assignment operator implicitly defined).

Generally its dynamic allocation & de-allocation of these that can cost you more as memory allocation is typically optimal for medium-large allocations not for small objects but you do need to profile that to know for sure however. Good thing about the standard library containers i can provide a custom allocator type with-out resorting to redefining global operators new/delete just to use different/custom memory scheme/model, i can just do:

typedef std::list<foo, boost::fast_pool_allocator<foo> > fast_foo_list;


i'm done.

like i was saying earlier in some other cases you have no choice but to use pointers, there may be times when you have more than one container where one holds the actual instances while the others store pointers/iterators to these.

Quote:Original post by Will F
Generally when I use containers holding pointers I put it as a private member of a class (for example, I have one called CSpriteManager), and make have the manager take care of the memory management issues (via calls to the manager like addSprite and RemoveSprite) to ensure that no code anywhere else in the program does anything dumb with the list. Is this a good way of doing things - or should I be looking into another way to do it?


Sounds reasonable but so does storing the actual type as its sounds like CSpriteManager is the sole owner. Its hard to say for sure with-out further details, knowing what the requirements are, seeing whats going on in code, etc. some question you can ask yourself to find out are:

Object-Ownership, sounds trivial but its not especially when it comes to pointers. is sharing of instances significant? who owns what?, when should delete be invoked if ever etc.

Is random accessibility significant? if so are elements stored contiguously in memory significant? if not then std::deque might be more appropriate its also random accessible, its a better alternative for a random accessible container of (smart) pointers than std::vector, petewood just beat me to it [grin]:

Quote:Original post by petewood
general rule of thumb:


prefer std::deque over std::vector -
Quote:So, for the three reasons cited earlier: Consider preferring deque by default in your programs, especially when the contained type is a class or struct and not a builtin type, unless the actual performance difference in your situation is known to be important or you specifically require the contained elements to be contiguous in memory (which typically means that you intend to pass the contents to a function that expects an array).



At the moment i've gone brain dead [lol], if i can think of any more i'll let you know.

Quote:Original post by Will F
I realize that, my point is that where I previously thought I had a basic handle on containers, but you've almost scared me into not wanting to use them. I'd almost rather have a container class that I wrote that I know exactly what's going on, at the cost of efficiency (and probably some nasty bugs).


Woah sorry heh, all i can say to that is generally you don't need to be concerned with the internals (although nothing prevents you looking into your compilers implementation to see whats going on, i do this alot and have learn't quite abit from it).

The standard states the concepts & requirements of containers & algorithms (among other things) such as time/space complexities which a C++ compiler must adhere to to be a c++ compiler. Thats all you need to do is find a good reference such as here look at what operations (and structures) are available and there complexities then see any of them match your requirements.

If your compiler's imp does not conform to the requirements of the standard then you would bring it to the attention of the compiler vendor to sort it out ASP if they don't i would take my business else as its there job and there responsibility to do so.

Quote:Original post by Will F
I suppose my next step is to look into shared pointers to make memory management in this situation easier?.


reference counted smart pointers mainly make the problem of "object-ownership" easier but as a consequence take care of deleting, lets you do RAII. It makes your life easier why not [grin].

Quote:Original post by Will F
I haven't really looked much at boost yet, and it should probably be put high on my todo list.


Oh yeah well worth investigating, some components of boost are currently being reviewed to be included into the standard some compilers already implement some of it e.g. std::tr1::shared_ptr.

[Edited by - snk_kid on June 11, 2005 6:44:25 AM]
You really have 3 options:

vector<T>. Difficult to leak memory with this. Can be slow if copy constructor/ opeator = are expensive for T.

vector<T*> Easy to leak memory & due to cache misses, may be slower than
vector<T> for some uses.

vector< boost::shared_ptr<T> > Always going to be slower than vector<T*>, but much easier to avoid memory leaks.
Quote:Original post by Nitage
You really have 3 options:

vector<T>. Difficult to leak memory with this. Can be slow if copy constructor/ opeator = are expensive for T.

vector<T*> Easy to leak memory & due to cache misses, may be slower than
vector<T> for some uses.

vector< boost::shared_ptr<T> > Always going to be slower than vector<T*>, but much easier to avoid memory leaks.

which of these you choose is dependent on how your vector is being used. i've used them in two common ways...

for example, what if Tiles are just ever added? let's say no part of the program should be removing Tiles or getting/making copies. the vector is just passed around by const ref so other can search for tiles, use ones at certain indexes, etc. the users of vector are just doing things like:
void foo ( const vector<Tile> & vec ) { vec[12].some_tile_fn(); }

we are accessing the Tiles by ref so no copy is done. when the vector is no longer needed it goes out or scope and it's destructor is called. this use of a vector<Tile> is just about as efficent as vector<Tile*> IMO. the only reason i'm not sure is that i don't know if this is optimized by the compiler:
vec.push_back( Tile("grass.bmp") );

is this really a construction of a temp and then a copy? or can the compiler do something cool here? i'm thinking not because the construction and copy could have side effects. in this case adding tiles becomes a bit more costly for vector<Tile> then vector<Tile*> unless Tile has a really smart copy constructor. it boils down to whatever the diff is between a Tile copy and a Tile* copy.

now if one part of the program is adding Tiles and another is removing them and then using them i would go with the boost::shared_ptr idea.

_ugly
Thanks for all the replies. I'm at the point in C++ where i'm comfortable with the language, but perhaps a bit overconfident in what I think I know (which is a really dangerous place to be). And anything I can use to learn more is always appreciated (I did state in my first post, - correct me if i'm wrong - with the hope I would get replies like these, correcting me where i'm in error).

I kind of miss C where after a while I pretty much knew the language inside and out, whereas with C++ I think that 5 years from now i'll still be learning things about the standard language. Oh well, no turning back now.

Also thanks for the link to the Guru of the Week, that's a goldmine of great info.

This topic is closed to new replies.

Advertisement