Strange slowness, std::vector, MSVC++9

39 comments, last by Emmanuel Deloget 15 years, 8 months ago
Quote:Original post by Shannon Barber
You can forcibly .resize the vector then bit-blast the data in.


Yeah, you can do that. But if that's your approach to C++, just go back to C. Please.

My free book on Direct3D: "The Direct3D Graphics Pipeline"
My blog on programming, vintage computing, music, politics, etc.: Legalize Adulthood!

Quote:Original post by Glen Corby
Quote:
Crippling cost? Now you're reading into the original post. As I read it, they wondered why the one was much slower than the other. No mention of "crippling costs".


I definitely do consider a real time application that at best will run at 15 - 40 fps in release mode if doing nothing else but gathering a render list to be crippled.


Well, that's a judgment for the original poster to make, not me or you. You're reading a lot of assumptions into his post that he didn't state. They may be true and the poster may be speaking from the same perspective as you. Or they may not.

I gave up trying to mind read people's specific situations in their posts long ago.


Guys and girls,

I'm aware that there is a considerable overhead for my copy constructor. My only wonderment was why on earth the supposedly faster insert() is always 3 times slower than iterating and using push_back(). I'm aware that my gather time is too slow, and that this is caused by the amount of data I was passing around.

I'm reading your comments and taking them as advice on how to remove the other overheads. My initial question has been answered now: insert needs to allow for non-default constructors.

Basically, there will be a small number of render list nodes, fewer than 100. The mesh I am testing with has rather a lot because it loads the entire level and sends every node to be rendered on every frame. However, in the finished product this display list will change only when the player moves into a new portalised chunk, so the 1ms gather time is optimised enough. For dynamic objects, I will store the meshes on the stack and the render node will consist of just a view matrix and a reference to the appropriate object.
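
A minimal sketch of what such a lightweight render node could look like. The names Mesh, Matrix4, RenderNode and gather are hypothetical and do not come from the actual project; the point is only that copying a node copies a matrix and a pointer, never the mesh data.

```cpp
#include <vector>

// Hypothetical mesh type; the real one would hold vertex and index data.
struct Mesh {
    int vertexCount;
};

// Hypothetical 4x4 matrix stored as 16 floats.
struct Matrix4 {
    float m[16];
};

// Lightweight node: a view matrix plus a non-owning pointer to the mesh.
// The mesh must outlive every node that refers to it.
struct RenderNode {
    Matrix4 view;
    const Mesh* mesh;
};

// Appends one node to the render list; only the matrix and pointer are copied.
void gather(std::vector<RenderNode>& out, const Mesh& mesh, const Matrix4& view) {
    RenderNode node;
    node.view = view;
    node.mesh = &mesh;
    out.push_back(node);
}
```

With a node like this, the cost of push_back() or insert() no longer depends on how large the mesh is.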

[Edited by - speciesUnknown on August 10, 2008 10:02:17 PM]
Don't thank me, thank the moon's gravitation pull! Post in My Journal and help me to not procrastinate!
Quote:Original post by speciesUnknown
I'm reading your comments and taking them as advice on how to remove the other overheads.


IMO, you're already doing the most important thing: optimizing from measurements. The rest of us can only guess what you're doing, but you've got the data in front of you to guide you on your quest. Just remember that the order of magnitude improvements in performance generally come from choosing better algorithms or data structures (which is where using std::string as a value type or a reference type comes in).


I'm using std::string as a key type in various places, as well as to store names which are only changed once at load time. Is there a better alternative? I have written a C library wrapper around char * for a bullshit exam where we had to use C strings, so perhaps I could use that instead, and overload the < operator?
Don't thank me, thank the moon's gravitation pull! Post in My Journal and help me to not procrastinate!
Isn't using a list a better alternative? Why bother with the vector? After all, IMO, you would be travelling from the start to the end of the list... a RenderList is a (not the OOP is_a) list, right?
"after many years of singularity, i'm still searching on the event horizon"
I've been informed that a list may suffer from cache thrashing when I iterate through it, but I'll try it anyway.
Don't thank me, thank the moon's gravitation pull! Post in My Journal and help me to not procrastinate!
I don't think you need to worry about cache misses. Cache-friendly optimizations can definitely wait until the end of the project, if you're still running slow and you've run out of general algorithmic optimization wins.

I haven't done any work with hardware profilers, so take this with a pinch of salt, but I think using a list for a 'display list' isn't going to cause many cache misses, as your list additions aren't going to be interspersed with many other allocations, and as such the allocated list nodes will be fairly contiguous in memory. Though I am used to consoles, where the game is the only thing accessing its reserved block of RAM; I have no idea how other processes interact with an application's RAM allocations on a PC.

Now changing topic.

Using a std::string for a key value really depends on the usage. I'd say that if you're using std::string as a key in a std::map, it can be plenty fast enough for a main loop, but as legalize advises, let your profiling be the guide on this one.

Using std::strings for resource naming, resource lookups and things like that is perfectly fine, however. I generally never use raw char*.
If the vector is not being appended to, then avoid vector::insert( back, begin, end) and use vector::assign( begin, end) instead. In either of these (insert/assign) operations I would avoid the call to reserve, since you've provided enough information up front for the container to determine, using std::distance, whether it needs to reallocate.
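
A minimal sketch of the two calls side by side. RenderNode, rebuildList and appendList are hypothetical names used only for illustration.

```cpp
#include <vector>

// Hypothetical stand-in for the render list node discussed in this thread.
struct RenderNode {
    int meshId;
};

// Replaces the whole contents of dst with a copy of src in one call.
// No reserve() is needed: with forward iterators the container can
// measure the range itself before allocating.
void rebuildList(std::vector<RenderNode>& dst, const std::vector<RenderNode>& src) {
    dst.assign(src.begin(), src.end());
}

// Appends a copy of src after whatever dst already holds.
// dst and src must be different vectors.
void appendList(std::vector<RenderNode>& dst, const std::vector<RenderNode>& src) {
    dst.insert(dst.end(), src.begin(), src.end());
}
```

assign() discards the old contents, so it fits a list that is rebuilt from scratch; insert() at end() fits a list that accumulates nodes from several sources.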

It's a bit weird, though.

What is the final capacity() that you get at the end of each method, insert() vs push_back()? Could it be that somehow there is an internal resizing going on (calling reserve up front should avoid this, that's the point), so that extra default/copy constructing is going on? Calling capacity() would be an easy way to determine whether the allocation(s) were behaving differently.

If it's 227 elements (very small) taking 20/60ms, then 95% of the time must be spent doing object copy constructing, which perhaps is done differently in push_back vs insert. [edit] Maybe put a static int count = 0; ++count in the copy constructor and assignment operator to check whether each approach is really doing the same number of copies.

[edit] 2.

I was curious about this, so I tested it by implementing a static count in the default and copy constructors of an object and using a vector with n = 1000 elts.

$all these operations behave identically
v.resize( n),
v.assign( b,e),
insert( v.b, x.b, x.e)
copy( x.b, x.e, back_inserter( v))

With or without a reserve( n), they all do n = std::distance = 1000 default or copy constructs, i.e. they are algorithmically optimal.


$push_back() where n = 1000 with a pre reserve( 1000) does 2000 default or copy constructs.

$push_back() where n = 1000 without a reserve( 1000) does 3023 copy or default constructs due to incremental reallocation.

The assignment operator is never invoked in any of these operations, although it is required for a conforming container element type. I am using GNU libstdc++.

So anything that avoids push_back ought to be faster!! Weird.
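
A minimal sketch of this kind of counting experiment, not the exact test code used above. Counted, resetCounts, fillWithInsert and fillWithPushBack are hypothetical names. It assumes push_back() is handed a temporary, which would account for the extra default construction per element.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instrumented type: counts constructions and assignments.
// Declaring the copy constructor suppresses the implicit move constructor,
// so newer compilers also copy here instead of moving.
struct Counted {
    static int defaults;
    static int copies;
    static int assigns;
    Counted() { ++defaults; }
    Counted(const Counted&) { ++copies; }
    Counted& operator=(const Counted&) { ++assigns; return *this; }
};
int Counted::defaults = 0;
int Counted::copies = 0;
int Counted::assigns = 0;

void resetCounts() {
    Counted::defaults = 0;
    Counted::copies = 0;
    Counted::assigns = 0;
}

// Range insert at the end of dst: the source elements are copy-constructed
// into place, with no temporaries involved.
void fillWithInsert(std::vector<Counted>& dst, const std::vector<Counted>& src) {
    dst.insert(dst.end(), src.begin(), src.end());
}

// push_back of a temporary after reserve(): each iteration default-constructs
// the temporary and then copy-constructs it into the vector.
void fillWithPushBack(std::vector<Counted>& dst, std::size_t n) {
    dst.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        dst.push_back(Counted());
    }
}
```

Calling resetCounts() before each fill and reading the three counters afterwards shows how many constructions each approach performed. Dropping the reserve() call from the push_back() loop adds further copies whenever the vector regrows, and that number depends on the library's growth policy.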




[Edited by - chairthrower on August 11, 2008 5:15:21 AM]
Quote:Original post by speciesUnknown
I'm using std::string as a key type in various places, as well as to store names which are only changed once at load time. Is there a better alternative?


If the names are only changed once at load time, then why not keep all the names in a std::vector<std::string> and store in your data structures indices into this array of names? Then as you copy your data structures around, you're just copying POD ints and not classes with c'tors and d'tors.
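
A minimal sketch of that idea. NameTable and RenderNode are hypothetical names, and the linear search in add() is an assumption that only makes sense because names are registered once at load time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical name table, filled once at load time.
class NameTable {
public:
    // Stores the name and returns its index; a name that is already
    // present returns its existing index instead of being stored twice.
    std::size_t add(const std::string& name) {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) {
                return i;
            }
        }
        names_.push_back(name);
        return names_.size() - 1;
    }

    // Returns the name for an index; at() throws std::out_of_range on a bad index.
    const std::string& lookup(std::size_t index) const {
        return names_.at(index);
    }

private:
    std::vector<std::string> names_;
};

// Hypothetical node: copying it copies an integer, not a std::string.
struct RenderNode {
    std::size_t nameIndex;
};
```

The string is only touched when a name actually has to be displayed or compared, via lookup(); everything else passes the index around.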

