This topic is now archived and is closed to further replies.

gimp

[STL] Quickly filling a vector with data...

I'm loading data from a file. I have a pointer and a size for a bunch of CVector3f classes. I have 30,000 of these and I think that push_back(CVector(~~~)) would be a little bit inefficient.

(Sub question: Correct me if I'm wrong, but doesn't push_back of an object just take a copy of the object?)

Is there a way of just 'shoving' the data into the vector? (It's possible I should be using boost::array for the container, but I've never looked into those yet. If it supports this kind of operation, I'm willing to read up on it.)

Any ideas on how to do this? And if so, is it safe?

I thought about getting a reference to the first object, resize(x)'ing the vector and just memcpy'ing it in, but that sounds evil, doesn't it? I believe that the STL spec says that a vector's memory must be contiguous.

Thoughts?

Chris Brodie
http:\\fourth.flipcode.com

quote:
Original post by gimp
I'm loading data from a file. I have a pointer and a size for a bunch of CVector3f classes. I have 30,000 of these and I think that push_back(CVector(~~~)) would be a little bit inefficient.

It can be made more efficient if you know how many elements you are planning to push_back. Then you can use reserve(n) to reserve storage for n elements in one hit - that will at least save a few reallocs.
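As a rough illustration of the reserve-then-push_back approach, here is a minimal sketch. The `CVector3f` struct is a hypothetical stand-in for the original poster's class (its real definition is not shown in the thread), and `load_with_reserve` is a name invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the poster's CVector3f class.
struct CVector3f {
    float x, y, z;
};

// Copies `count` elements from `src` into a new vector,
// reserving all the storage once before any push_back.
std::vector<CVector3f> load_with_reserve(const CVector3f* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    std::vector<CVector3f> out;
    out.reserve(count);             // one allocation for all elements
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(src[i]);      // copies each element, never reallocates
    }
    return out;
}
```

Each element is still copied once, but because the capacity is already large enough, none of the push_back calls has to grow the buffer and move the existing elements.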
quote:

(Sub question: Correct me if I'm wrong, but doesn't push_back of an object just take a copy of the object?)

Yes.
quote:

Is there a way of just 'shoving' the data into the vector?

vector actually uses placement new behind the scenes to separate object construction from memory allocation. If you have called reserve(), then memory allocation has already been done, and only in-place construction needs to be carried out.
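To make the separation of allocation and construction concrete, here is a small sketch of the same idea done by hand. It is not how any particular standard library implements vector, just an illustration of the technique. `CVector3f` and `placement_new_demo` are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical stand-in for the poster's class, with a real constructor.
struct CVector3f {
    float x, y, z;
    CVector3f(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}
};

// Allocates storage for n objects, constructs them in place,
// sums their x components, then destroys them and frees the storage.
float placement_new_demo(std::size_t n) {
    std::allocator<CVector3f> alloc;

    // Step 1: allocate raw memory. No constructors run here
    // (this is roughly what reserve() does).
    CVector3f* items = alloc.allocate(n);

    // Step 2: construct each object directly in the existing memory
    // using placement new (roughly what push_back does afterwards).
    for (std::size_t i = 0; i < n; ++i) {
        ::new (static_cast<void*>(items + i))
            CVector3f(static_cast<float>(i), 0.0f, 0.0f);
    }

    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += items[i].x;
    }

    // Step 3: destroy the objects explicitly, then release the memory.
    for (std::size_t i = 0; i < n; ++i) {
        items[i].~CVector3f();
    }
    alloc.deallocate(items, n);
    return sum;
}
```

Calling `placement_new_demo(4)` constructs four objects with x values 0, 1, 2 and 3 in one pre-allocated block and returns their sum.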

The other thing you need to consider is the expected usage pattern of your data structure, and whether a vector does actually suit that pattern.

quote:
Original post by gimp
I'm loading data from a file. I have a pointer and a size for a bunch of CVector3f classes. I have 30,000 of these and I think that push_back(CVector(~~~)) would be a little bit inefficient.

(Sub question: Correct me if I'm wrong, but doesn't push_back of an object just take a copy of the object?)

Is there a way of just 'shoving' the data into the vector?

Well, no. Remember that a vector is like an array and is almost certainly a single block of memory, say from memory location 0 to 100. If you then have some other variable at memory location 300, there's no way you can get it onto the end of the array without copying it. You can't move memory.

But if the copying is prohibitive, then as Ziphnor said, storing pointers is probably the best idea. I tend to store pointers rather than objects in my vectors and lists, and it works fine as long as you have a good memory management strategy.
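A minimal sketch of the pointer-storing approach follows. It uses `std::unique_ptr` (C++14 for `std::make_unique`) as one possible memory management strategy, which is an assumption of this example rather than something the thread prescribes. `CVector3f` and `make_owned` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the poster's CVector3f class.
struct CVector3f {
    float x, y, z;
};

// Builds a vector that owns its elements through pointers.
// Growing or reordering the vector only moves pointers, not objects.
std::vector<std::unique_ptr<CVector3f>> make_owned(std::size_t count) {
    std::vector<std::unique_ptr<CVector3f>> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(std::make_unique<CVector3f>(
            CVector3f{static_cast<float>(i), 0.0f, 0.0f}));
    }
    return out;  // elements are freed automatically when the vector dies
}
```

The trade-off is one heap allocation per element and an extra indirection on every access, so for 30,000 small objects this is not necessarily cheaper than copying them by value.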

quote:
Original post by gimp
I thought about getting a reference to the first object, resize(x)'ing the vector and just memcpy'ing it in


That would work - it's best to do one resize(30000) at the beginning and then memcpy everything in.
Using reserve & push_back may invoke unnecessary copy ctors, and will copy it element by element. A couple of big memcpys will be the fastest.
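A sketch of the resize-then-memcpy idea is shown below. It assumes the element type is trivially copyable, which the `static_assert` checks at compile time. The real CVector3f may or may not satisfy this, so the struct here is a hypothetical stand-in, as is the function name `load_with_memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the poster's CVector3f class.
struct CVector3f {
    float x, y, z;
};

// memcpy into live objects is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<CVector3f>::value,
              "CVector3f must be trivially copyable to use memcpy");

// Resizes once, then block-copies `count` elements from `src`.
std::vector<CVector3f> load_with_memcpy(const CVector3f* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    std::vector<CVector3f> out;
    out.resize(count);  // value-initialises `count` elements first
    if (count > 0) {
        std::memcpy(out.data(), src, count * sizeof(CVector3f));
    }
    return out;
}
```

Note that resize() still initialises every element before the memcpy overwrites it, so the data is effectively written twice.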

If you want insertion and random-access that are almost as good as list and vector (but not better) respectively, then give deque a try.
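For reference, a tiny sketch of what deque offers: insertion at either end plus indexed access. The function name `build_deque` is invented for this example.

```cpp
#include <cassert>
#include <deque>

// Builds a small deque using insertion at both ends.
std::deque<int> build_deque() {
    std::deque<int> d;
    d.push_back(2);
    d.push_back(3);
    d.push_front(1);      // insertion at the front is cheap too
    assert(d[0] == 1);    // random access by index, as with vector
    return d;
}
```

One caveat for this thread: a deque's storage is not a single contiguous block, so the memcpy idea discussed for vector does not carry over to it.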

As for avoiding the copy, have the container hold pointers instead of actual objects, as several posts suggest. Just remember to release them.

quote:
Original post by Magmai Kai Holmlor
That would work - it's best to do one resize(30000) at the beginning and then memcpy everything in.

That would still cause 30,000 object initialisations, and memcpy is only safe when it is certain to have the same effect as the assignment operator of the object - which you can generally guarantee for POD types, but is a flimsy assumption for UDTs.
quote:

Using reserve & push_back may invoke unnecessary copy ctors, and will copy it element by element.

These objects have to be constructed into being at some point. IOW, if he's going to store objects by-value in the vector, then he's going to have to construct them and pay for a copy. With a pre-emptive reserve(), the copy will not incur the expense of a memory allocation or, even worse, a reallocation.
quote:
A couple of big memcpys will be the fastest.

But not necessarily guaranteed to work. I would say it is more important for gimp to write code that surely does work in the first instance, and then determine whether it is fast enough for his needs. If this is a one-off hit at application start-up, then it's not going to degrade overall performance.
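One way to get the "surely does work" version is vector's iterator-range constructor, which accepts a pair of plain pointers and copy-constructs each element. The sketch below is an illustration, not something proposed in the thread. `CVector3f`, `Label` and `load_with_range` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the poster's CVector3f class.
struct CVector3f {
    float x, y, z;
};

// A non-POD type, to show the same code stays correct for UDTs.
struct Label {
    std::string text;
};

// Builds a vector from a pointer and a count using the range constructor.
// Elements are copy-constructed, so this is valid for any copyable type.
template <typename T>
std::vector<T> load_with_range(const T* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    return std::vector<T>(src, src + count);
}
```

Because the pointers are random-access iterators, the vector can size its storage once up front. For trivially copyable types, many standard library implementations are able to turn this into a block copy, though that is an optimisation and not a guarantee.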
