# boost::ptr_vector performance problems

This topic is 3667 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Hi folks, I'm doing some tests with boost::ptr_vector. I have a class that loads objects in a simple format. Originally I loaded the vertices into a plain raw array, like this: Vertex* p = new Vertex[countPoints]; and in the render step I read them with [], like this: p[face.a]. When I changed it to boost::ptr_vector I lost a LOT of performance: with the raw pointer I get 466 FPS, with boost::ptr_vector 210 FPS. I know there is some performance cost to using boost::ptr_vector, but this is a LOT. Is this expected, or is it a bad idea to store the vertices in a boost::ptr_vector in the first place?

To replace the raw pointer with a boost::ptr_vector, I add each vertex with boost::ptr_vector::push_back and read the vertex back with the [] operator.

If you have any tips to help with this performance problem I would appreciate it. Thanks in advance.

##### Share on other sites
Is there any reason you're using ptr_vector instead of a regular std::vector?

##### Share on other sites
From what you posted, it looks like you probably want std::vector, not boost::ptr_vector.

Can you post some of your code so that we can see how exactly you're using boost::ptr_vector?

##### Share on other sites
Well, first of all, don't measure performance in FPS. It's a stupid metric. Your scary sounding 466 FPS to 210 FPS drop, for instance, is really just about two and a half milliseconds per frame. The reciprocal nature of FPS makes it lousy for comparisons like these.
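To make the reciprocal relationship concrete, here is a minimal sketch that converts frame rates to per-frame times. The function names frame_time_ms and added_cost_ms are made up for this illustration.

```cpp
#include <cassert>

// Convert a frame rate (frames per second) to the time spent on one frame,
// in milliseconds.
double frame_time_ms(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Extra milliseconds per frame when the rate drops from fps_before to fps_after.
double added_cost_ms(double fps_before, double fps_after) {
    return frame_time_ms(fps_after) - frame_time_ms(fps_before);
}
```

With the numbers from this thread, added_cost_ms(466.0, 210.0) comes out to roughly 2.6 ms. The same 256 FPS drop starting from a higher base rate would correspond to a much smaller per-frame cost, which is why frame time is the more useful figure to compare.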

I really can't see, though, why anybody would ever want to use a ptr_vector for storing mesh vertices. The only way you should ever store vertices is in a packed array (or equivalent contiguous container, like std::vector) which can be directly given to the card as a vertex array or VBO. What functionality are you trying to gain with ptr_vector?

##### Share on other sites
Thanks to all for your fast replies! I just wanted to use ptr_vector to store pointers to Vertex... but reading your posts, maybe I'm wrong and std::vector is the correct way to do that :)

##### Share on other sites
Quote:
 but reading your posts, maybe I'm wrong and std::vector is the correct way to do that :)

Yes, std::vector is the correct way. Unless you're using immediate mode in OpenGL, where you can submit each vertex individually, you need your vertices in a contiguous block of storage to submit them to the card. std::vector provides that. Contiguous storage is also more efficient when traversing linearly across the entire vector, as I suspect you are doing (in conjunction with GL's immediate-mode functions). It is more cache-friendly to have all the vertex data close together: each fetch comes from roughly the same region of memory, which is likely already cached from a previous fetch, so you incur fewer cache misses.

Each element in a ptr_vector is a pointer. While the pointers will be contiguous in memory, the data they point to (the vertices) would most likely not be. In fact, the vertices may be scattered all over the heap. Traversing a ptr_vector linearly like this is likely to induce more cache misses; on a large enough data set, this will degrade performance to some extent.
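A rough sketch of the two layouts follows. To avoid a Boost dependency it uses std::vector<std::unique_ptr<Vertex>> as a stand-in for boost::ptr_vector; that substitution is an assumption of this example, as are the Vertex layout and the helper names is_contiguous and sum_x.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Assumed vertex layout for illustration only.
struct Vertex { float x, y, z; };

// True if every element sits exactly one Vertex after the previous one,
// which std::vector guarantees for values stored in place.
bool is_contiguous(const std::vector<Vertex>& v) {
    for (std::size_t i = 1; i < v.size(); ++i)
        if (&v[i] != &v[i - 1] + 1) return false;
    return true;
}

// Values stored in place: one linear walk through a single block of memory.
float sum_x(const std::vector<Vertex>& v) {
    float s = 0.0f;
    for (const Vertex& p : v) s += p.x;
    return s;
}

// Individually allocated vertices: the pointers are contiguous, but each step
// follows a pointer to wherever the allocator happened to place that vertex.
float sum_x(const std::vector<std::unique_ptr<Vertex>>& v) {
    float s = 0.0f;
    for (const std::unique_ptr<Vertex>& p : v) s += p->x;
    return s;
}
```

Both overloads compute the same result; the difference is only in how many separate memory locations the loop has to touch. With the by-value vector, v.data() is also a pointer to a packed block that can be handed to a graphics API as a vertex array.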

##### Share on other sites
jpetrie, thanks for the info. I'm using OpenGL; if I use immediate mode, what do you recommend?

##### Share on other sites
Quote:
 Original post by juglar: jpetrie, thanks for the info. I'm using OpenGL; if I use immediate mode, what do you recommend?

std::vector is probably still the best option (plus, you'll probably want to switch to vertex arrays at some point anyway for performance reasons).

##### Share on other sites
Quote:
 Original post by juglar: jpetrie, thanks for the info. I'm using OpenGL; if I use immediate mode, what do you recommend?

Not using immediate mode.

##### Share on other sites
Quote:
 Original post by Sneftel: The only way you should ever store vertices is in a packed array (or equivalent contiguous container, like std::vector) which can be directly given to the card as a vertex array or VBO. What functionality are you trying to gain with ptr_vector?

If you are loading from a file format like obj that stores "vertex data" like position, normal, and textureCoords separately then it is necessary to store them in an intermediate format as independent vector/arrays...in addition it seems easier to do calculations such as dynamic changes to the mesh or calculation of vertex tangents.

If you use D3DPOOL_MANAGED then there is no need to maintain a contiguous container in GPU form because you'll never need to pass them manually to the graphics card more than once, so there would be no point to maintaining an array in that form anyway.

Storing the data in a custom structure like "Vector3" makes it a lot easier to do arithmetic, which would be the only purpose of saving that data that I can think of. But if you don't use a vector of pointers, then you have to invoke a lot of extra constructors/assignment-ops, which is a lot less efficient...and you also cannot conveniently use the constructor when adding new objects like you can with pointers. So, it seems to me that a vector of pointers is the way to go.

##### Share on other sites
I think you're confusing packed arrays and interleaved arrays.

##### Share on other sites
I don't think so...what gives you that impression?

##### Share on other sites
Quote:
 If you are loading from a file format like obj that stores "vertex data" like position, normal, and textureCoords separately then it is necessary to store them in an intermediate format as independent vector/arrays...in addition it seems easier to do calculations such as dynamic changes to the mesh or calculation of vertex tangents.

This gets into specific requirements, so things start to get up in the air. However, in general, data that is intended to be used for rendering should be stored in a format that is efficient for rendering, which generally means keeping it as close to the native format of the render subsystem as possible, to minimize swizzling data around prior to render. While loading geometry from a file format, it may be necessary to store data in intermediate forms, but that doesn't mean that should be the final form of the data if efficient rendering is the goal.

If the goal of the application is to manipulate mesh data, then yes, certain other representations may be more useful.

Quote:
 Storing the data in a custom structure like "Vector3" makes it a lot easier to do arithmetic, which would be the only purpose of saving that data that I can think of. But if you don't use a vector of pointers, then you have to invoke a lot of extra constructors/assignment-ops, which is a lot less efficient...and you also cannot conveniently use the constructor when adding new objects like you can with pointers. So, it seems to me that a vector of pointers is the way to go.

You can store a contiguous array of "Vector3" objects and have all the benefits of having a nice class for your vectors, as long as the class is suitably designed and implemented such that its representation in memory is correct. That is, something like
struct Vector3{  float x,y,z;};

with no virtuals and appropriate compiler-specific options applied to ensure tightly packed, unpadded organization, et cetera. This is quite common. There is no reason, in general, why a collection of such objects needs to be stored as a vector of pointer-to-Vector3. It's not like these things are expensive to copy, and even then, you're faced with the problem of "which hurts more, lots of copies of a couple bytes to probably-in-cache memory, or lots of copies of a slightly smaller number of bytes but frequently references to out-of-cache memory." Most modeller tools, that would be doing the frequent manipulation of mesh data the way you describe, use higher-order representations of the data anyway. Only primitive vertex-only-based modellers do extensive per-vertex manipulation in random access. Most games, when they process their vertices, do it in a highly sequential fashion that would cause the pointer-to-Vector3 approach to cache miss all over.

EDIT: Also, what do you mean by "can't use the constructor" when adding new elements? Of course you can (push_back(Vector3(x,y,z))).
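As a minimal sketch of that point, here is a Vector3 with an ordinary constructor that is still just three packed floats. The exact class and the make_triangle helper are hypothetical; the static_assert is one way to check the "tightly packed" requirement at compile time, and emplace_back assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <vector>

// Hypothetical Vector3: it has a user-provided constructor, but no virtual
// functions, so an object still holds nothing but three floats.
struct Vector3 {
    float x, y, z;
    Vector3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}
};

// Compilation fails if the compiler adds padding on the target platform.
static_assert(sizeof(Vector3) == 3 * sizeof(float),
              "Vector3 must be tightly packed");

std::vector<Vector3> make_triangle() {
    std::vector<Vector3> v;
    v.reserve(3);
    v.push_back(Vector3(0.0f, 0.0f, 0.0f));  // build a temporary, then copy (or move) it in
    v.push_back(Vector3(1.0f, 0.0f, 0.0f));
    v.emplace_back(0.0f, 1.0f, 0.0f);        // C++11: construct directly inside the vector
    return v;
}
```

The constructor is available in both spellings, and the resulting storage is a contiguous run of nine floats.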

Quote:
 If you use D3DPOOL_MANAGED then there is no need to maintain a contiguous container in GPU form because you'll never need to pass them manually to the graphics card more than once, so there would be no point to maintaining an array in that form anyway.

There is, in fact, no point to maintaining the storage at all in many cases. But you still need a suitable representation to provide to the API that one time.

##### Share on other sites
Quote:
 Original post by jpetrie: You can store a contiguous array of "Vector3" objects and have all the benefits of having a nice class for your vectors, as long as the class is suitably designed and implemented such that its representation in memory is correct. That is, something like struct Vector3{  float x,y,z;}; with no virtuals and appropriate compiler-specific options applied to ensure tightly packed, unpadded organization, et cetera. This is quite common. ... EDIT: Also, what do you mean by "can't use the constructor" when adding new elements? Of course you can (push_back(Vector3(x,y,z))).

This is something that I have debated mentally for a while, so I'm glad to hear your input.

Thanks for your input. If there is a better solution than what I'm doing, I'm all ears.

It is my understanding that in order to get those benefits, it must be a POD type, meaning that it has no non-trivial constructor. And if it has no such constructor, then you can't do "push_back(Vector3(x,y,z))"

Secondly, if you do "push_back(Vector3(x,y,z))", then you first call Vector3 constructor when creating it and then call Vector3 copy constructor when assigning it. In previous time trials I have done, I found that calling the Vector3 constructor occupied a very large percentage of my processing time...far far more expensive than any of the internal vector arithmetic, so I have attempted to minimize the necessary constructor calls.

I started out by using POD types for Vector3, but it is such an inconvenience to use without a normal constructor when you want to insert into a vector like this.

Oops, I have to go...no time to finish these thoughts.

##### Share on other sites
Quote:
 It is my understanding that in order to get those benefits, it must be a POD type,

The "benefits" (having the type compact) need to be manually ensured regardless of whether or not the type is POD. A POD type may be aligned or padded by the compiler the same way a non-POD type may be. You don't have to worry about adding a virtual table pointer for a POD type, but you do need to worry about the other stuff. In this case there is really little reason to prefer POD to non-POD, as non-POD gives you the constructor you want.

Quote:
 Secondly, if you do "push_back(Vector3(x,y,z))", then you first call Vector3 constructor when creating it and then call Vector3 copy constructor when assigning it.

Correct. You will construct the object, then the container will copy it. This happens regardless of whether or not the type is POD or non-POD. The other comparison you've been making is pointer-to-type versus non-pointer-to-type, so let's examine that case. It would look like vec.push_back(new Vector3(x,y,z)). Here we have an allocation, which involves a walk through the heap to find an appropriate block, an invocation of the appropriate constructor, and we also copy the resulting pointer within the container. The allocation is expensive, but may be avoided if we're using a fancy pooling allocation scheme somewhere, or whatever. We still have the same overhead from the construction, so that washes. Then we have the copy of the pointer.

In effect, the non-pointer option involves a copy of sizeof(float) * 3 bytes (probably 12) and the pointer option involves a copy of sizeof(Vector3*) bytes (probably 4) and a potentially expensive allocation. You're concerned about 8 bytes?
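One way to see the allocation difference is to count it. The sketch below gives a hypothetical Vector3 a class-specific operator new that increments a counter; the counter and the two fill functions are invented for this illustration. std::vector<Vector3> obtains its buffer through its allocator and constructs elements with placement new, so the class-specific operator new is never called in the by-value case.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical Vector3 that counts how often one is allocated on its own
// with a `new Vector3(...)` expression.
struct Vector3 {
    float x, y, z;
    Vector3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}

    static std::size_t heap_allocations;
    static void* operator new(std::size_t n) {
        ++heap_allocations;
        return ::operator new(n);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};
std::size_t Vector3::heap_allocations = 0;

// Store n vectors by value. The vector still grows its own buffer a few
// times, but no element is allocated individually.
std::size_t fill_by_value(std::size_t n) {
    Vector3::heap_allocations = 0;
    std::vector<Vector3> v;
    for (std::size_t i = 0; i < n; ++i) v.push_back(Vector3(1.0f, 2.0f, 3.0f));
    return Vector3::heap_allocations;
}

// Store n vectors by pointer: one separate heap allocation per element.
std::size_t fill_by_pointer(std::size_t n) {
    Vector3::heap_allocations = 0;
    std::vector<Vector3*> v;
    for (std::size_t i = 0; i < n; ++i) v.push_back(new Vector3(1.0f, 2.0f, 3.0f));
    const std::size_t count = Vector3::heap_allocations;
    for (Vector3* p : v) delete p;  // raw pointers must be freed by hand
    return count;
}
```

For n elements the by-value version reports zero per-element allocations and the pointer version reports n, which is the cost being traded against the smaller copy.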

Quote:
 In previous time trials I have done, I found that calling the Vector3 constructor occupied a very large percentage of my processing time

Without the actual benchmark this doesn't mean much, but I would even contend that it's information that washes. Constructing the vector must be done, period, in both the non-pointer and pointer approaches. Given the same implementation of Vector3 for both tests, if the test is structured correctly, then you should spend the same amount of time constructing the Vector3. Copying the Vector3 will be more efficient on the order of (typically) eight bytes per copy. I'm skeptical that the performance impact of eight extra bytes on a collection of vertices is worthwhile, especially if you are going to be doing extensive linearized processing of the vertices.

And again, the bulk of the processing done on buffers of vertices is linear traversal. It is rare that you are actually moving vertices around within the vertex buffer itself with high frequency. That kind of operation tends to be present in simplistic point-and-poly based modelling tools, and even then you'd probably be better off moving the indices in an index buffer around, rather than the vertices themselves, for most operations. Even then there is little use for moving the vertices around beyond adding and removing them, except for mesh optimization, which is almost always going to be an offline process where speed is irrelevant and code clarity is more important. This is why modelling tools typically use different internal representations to actually work on, and if that's what you're doing, this discussion doesn't quite pertain.

Quote:
 I started out by using POD types for Vector3, but it is such an inconvenience to use without a normal constructor when you want to insert into a vector like this

Now I'm confused again. If the type is POD, it can't have a constructor. But earlier you said "and you also cannot conveniently use the constructor when adding new objects like you can with pointers" which leads me to believe you've got a disconnect somewhere. Whether or not you're involving pointers is orthogonal to being a POD type. If the type is a POD, it has no constructors you can use, even if you allocate it via new.

##### Share on other sites
Quote:
 This is why modelling tools typically use different internal representations to actually work on, and if that's what you're doing, this discussion doesn't quite pertain.

In fact this is what I am doing, storing the data in a format that is easiest for dynamic changes to the mesh -- such as optimizations, deformations, etc.

You say that in this case the discussion does not pertain, but if not doing this, then it seems there is no reason to maintain the vertex data in a std::vector at all...since it would be stored in a vertex buffer or display list that is managed by the graphics API already.

I have just done some timing tests, and you are absolutely right. In fact, it turns out that populating a std::vector of Vector3 is twice as fast as with Vector3*, and additionally, accessing the Vector3 data is twice as fast when not using pointers as well...not to mention that there is no need to delete that memory or deal with additional pointer dereferencing.
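The original test code was not posted, so the following is only a sketch of how such a comparison might be set up with std::chrono. The Timing struct and the function names are invented here, std::unique_ptr stands in for manually deleted raw pointers, and the measured numbers will vary with compiler, optimization level, and allocator.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

struct Vector3 {
    float x, y, z;
    Vector3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}
};

// Hypothetical result record: time to populate, time to traverse, and a
// checksum so the traversal cannot be optimized away.
struct Timing { double fill_ms; double read_ms; float checksum; };

using Clock = std::chrono::steady_clock;

double ms_since(Clock::time_point start) {
    return std::chrono::duration<double, std::milli>(Clock::now() - start).count();
}

Timing time_by_value(std::size_t n) {
    Timing t{};
    Clock::time_point start = Clock::now();
    std::vector<Vector3> v;
    for (std::size_t i = 0; i < n; ++i) v.push_back(Vector3(1.0f, 0.0f, 0.0f));
    t.fill_ms = ms_since(start);

    start = Clock::now();
    for (const Vector3& p : v) t.checksum += p.x;
    t.read_ms = ms_since(start);
    return t;
}

Timing time_by_pointer(std::size_t n) {
    Timing t{};
    Clock::time_point start = Clock::now();
    std::vector<std::unique_ptr<Vector3>> v;
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(std::unique_ptr<Vector3>(new Vector3(1.0f, 0.0f, 0.0f)));
    t.fill_ms = ms_since(start);

    start = Clock::now();
    for (const std::unique_ptr<Vector3>& p : v) t.checksum += p->x;
    t.read_ms = ms_since(start);
    return t;
}
```

Calling both functions with the same n and comparing fill_ms and read_ms gives a rough picture; for anything more than a sanity check, repeat the runs and build with optimizations enabled.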

##### Share on other sites
Quote:
 You say that in this case the discussion does not pertain, but if not doing this, then it seems there is no reason to maintain the vertex data in a std::vector at all...since it would be stored in a vertex buffer or display list that is managed by the graphics API already.

The concept of a "managed pool" in the same sense that D3D9 has does not exist in all graphics APIs, and furthermore the managed pool is not always the optimal storage location for vertex data. In those cases, the buffer may need to be kept around.
