std::vector operator [] is evil?

Started by
24 comments, last by werekarg 18 years, 9 months ago
Quote:Original post by JY
When you use the [] operator the element in the vector gets copied. So if you have a vector of vectors, the inner vector will get copied when you use the first [] operator. This would account for the performance hit, and certainly did with my map of maps.

Like I say I could be wrong but I seem to remember reading somewhere that this is the case.


Yes you are completely wrong, both overloaded subscript operators for vector & (unordered_)map return constant and non-constant references depending on the context. If you don't declare your variables as of reference type then a copy will be made but this is exactly the same case for C-style arrays [headshake].

Quote:Original post by Arkion
I'm using Visual Studio.NET 2002 and Dinkumware STL implementation bundled with it.



I also suggest you update your compiler as VC++ 7.0 is almost as bad as VC++ 6.0 which is significant, if you want an MS compiler get VC++ 7.1 > *.
Advertisement
I had tried profiling the Release build back when the STL vector usage was still unoptimized (63 FPS in Debug), and the Release build resulted in framerate of 75 FPS.

I just tried the Release build again as many of you suggested, now with the optimized code, and it resulted in framerate of around 200 FPS! I can tell you, only optimizations I have done were that of STL vector usage. Summary of the results:

Before: Debug 63 FPS/Release 75 FPS
After: Debug 133 FPS/Release 200 FPS

This makes me draw a conclusion that Debug build can be indeed used for preliminary profiling. Because if FPS goes up in Debug mode, it will also go up in Release mode (though not necessarily in same proportion, as seen in the results above).

Just in case you are interested, here's a concrete example of optimized performance-critical code that bumped the FPS from 122 to 133 FPS in Debug mode. BSP tree for the test map has 172 I-nodes and 173 leaves (well balanced), making this code called 345 times per frame.

tree->m_nodes is a std::vector<BSPNode>.

Before:
void Renderer::RecursiveRender(BSPTree *tree, int node){	if(tree->m_nodes[node].m_isLeaf)	{		std::vector<Surface_t>::iterator iter;				for(iter = tree->m_nodes[node].m_surfaces.begin(); iter != tree->m_nodes[node].m_surfaces.end(); iter++)		{			glBindTexture(GL_TEXTURE_2D, iter->mat->diffMap);			glDrawArrays(GL_TRIANGLES, iter->firstVert, iter->count);		}		return;	}	RecursiveRender(tree, tree->m_nodes[node].m_iFrontChild);	RecursiveRender(tree, tree->m_nodes[node].m_iBackChild);}


After:
void Renderer::RecursiveRender(BSPTree *tree, int node){	BSPNode *pNode = &tree->m_nodes[node]; // Store pointer to the node and use it - much faster	if(pNode->m_isLeaf)	{		std::vector<Surface_t>::iterator iter;				for(iter = pNode->m_surfaces.begin(); iter != pNode->m_surfaces.end(); iter++)		{			glBindTexture(GL_TEXTURE_2D, iter->mat->diffMap);			glDrawArrays(GL_TRIANGLES, iter->firstVert, iter->count);		}		return;	}	RecursiveRender(tree, pNode->m_iFrontChild);	RecursiveRender(tree, pNode->m_iBackChild);}
if you missed it the first time i'll say it again.

Quote:Original post by snk_kid
Quote:Original post by Arkion
I'm using Visual Studio.NET 2002 and Dinkumware STL implementation bundled with it.



I also suggest you update your compiler as VC++ 7.0 is almost as bad as VC++ 6.0 which is significant, if you want an MS compiler get VC++ 7.1 > *.


Quote:Original post by Arkion
Just in case you are interested, here's a concrete example of optimized performance-critical code that bumped the FPS from 122 to 133 FPS in Debug mode


One thing I noticed while looking at your code is that you post-increment the iterator in your for loop(ex. iter++ ). You should get in the habit of pre-incrementing any non-trivial data tpyes(ex. ++iter). This could speed thing up more.

-John
- John
Quote:Original post by Teknofreek
Quote:Original post by Arkion
Just in case you are interested, here's a concrete example of optimized performance-critical code that bumped the FPS from 122 to 133 FPS in Debug mode


One thing I noticed while looking at your code is that you post-increment the iterator in your for loop(ex. iter++ ). You should get in the habit of pre-incrementing any non-trivial data tpyes(ex. ++iter). This could speed thing up more.


If we are going doing that road then i'll add to that [grin], constant-correctness (constant members functions, use a constant iterator etc as your only rendering) & use generic algorithms over explicit loop code.
Quote:Original post by Arkion
After:
for(iter = pNode->m_surfaces.begin(); iter != pNode->m_surfaces.end(); iter++)

You might want to cache the result of .end() to a const_iterator and compare with it, rather than evaluate the size of your vector couple hundred times ^^;
For what it's worth, I was a big fan of using []'s just because I was so familiar with the syntax thanks to c++'s for loops and it was a lot less to type than iterators. But then I was chastised for doing so because it was a lot slower. So now I always use the iterative method. I don't know that I've noticed any difference but I think there are a number of other things that hurt my performance even more.

But using the []'s in your performance critical loops sounds like a very bad thing and I'm not surprised in the least. I imagine that operator was being called extremely often each frame.

I attended a lecture at a past GDC about optomizing code and the speaker spent at least half of the lecture talking about how bad STL is for performance and how something simple like what you did (switching []'s to iterators) will make a huge improvement.
Quote:Original post by helix
But using the []'s in your performance critical loops sounds like a very bad thing


It's not. It is my sad duty to inform you that you should use release mode. Your compiler, if released within the past few years, now includes what's known as an "optimizer" which will eliminate the overhead for you, eliminating countless excuses to get down and jiggy with pointers and manual allocations. It's like depriving a pig of mud I tell you!!!

Quote:and I'm not surprised in the least. I imagine that operator was being called extremely often each frame.


In debug mode, yes, so you could in theory put a breakpoint in std::vector::operator[]. However, in release mode, "calling" is a misdominer - the assembly will not resemble anything like "calling", rather, it will produce the same assembly as something like this:

int * pie = new int[4238];
int i = pie[34];

The second line which, needless to say, involves absolutely no functions. This is due to optimizations such as automatic inlining.

Quote:I attended a lecture at a past GDC about optomizing code and the speaker spent at least half of the lecture talking about how bad STL is for performance and how something simple like what you did (switching []'s to iterators) will make a huge improvement.


How many years ago?

Recent compilers will optimize both out the same in many/most circumstances. There's still reasons to prefer iterators when it makes sense, but speed in most cases isn't one if them.

Advice gets old quick in this industry. 10 years ago in C you could write <<2 instead of *4 and see a preformance boost in your innermost loops. Today? Your compiler automatically preforms this and many many other optimizations as needed, automatically (if allowed by setting release mode/-O3/your compiler's equivilant of "optimize the heck outta my code" is), free of charge.
just wanna point out that this
glBindTexture(GL_TEXTURE_2D, iter->mat->diffMap);
or even (less so) this
glDrawArrays(GL_TRIANGLES, iter->firstVert, iter->count);

are gonna take (roughly 20billion times, slight exageration) as long as how youre doing the loop
ie
youre barking up the wrong tree
Quote:Original post by snk_kid
If you don't declare your variables as of reference type then a copy will be made...


But this isn't true for vectors?

"Absorb what is useful, reject what is useless, and add what is specifically your own." - Lee Jun Fan

This topic is closed to new replies.

Advertisement