std::vector slow

Started by
26 comments, last by Extrarius 16 years, 5 months ago
I have to calculate some geometry dynamically (about 6000 vertices and 18000 indices) every frame. As I calculated the geometry, I was using push_back to store the computed vertices and indices. Note: I called vector::reserve at initialization time with a large enough max, so no memory allocation should be needed at runtime. I narrowed it down that filling the vectors was the slow part. I replaced them with C-arrays, and got about a 20 ms performance increase per frame. I went from 20 ms per frame to 2 ms per frame. Any ideas why it was slow? The code looked like this:

WORD baseIndex = (WORD)m_vertices.size();

	for(int i = 0; i < numVertices; ++i)
		m_vertices.push_back(v);

	m_vertices.push_back(centroid);

	for(int i = 0; i < numVertices; ++i)
	{
		m_indices.push_back( baseIndex+i ); 
		m_indices.push_back( baseIndex + ((i+1)%numVertices) );
		m_indices.push_back( baseIndex + numVertices ); // centroid
	}
-----Quat
Advertisement
Was this in Debug or Release mode?

If in Debug mode then std::vector::push_back might be performing some error checking (memory verification stuff) that in large unoptimized quantities would be much slower than just an array index (which assembles to an offset even when Debugging I think?). This is the first thing that comes to mind, but if this is a Release build push_back probably would optimize to the array code anyway (besides the function call). Check the assembly output if possible or step through the assembly in your debugger. Thats as much as I can think of, as long as you're positive the vector isn't being cleared/reallocated every frame.
Its probably having to reallocate and copy memory around, try calling .reserve with the amount of pushes you are expecting before you start pushing them:

WORD baseIndex = (WORD)m_vertices.size();        m_vertices.reserve(numVertices+1);	for(int i = 0; i < numVertices; ++i)		m_vertices.push_back(v);	m_vertices.push_back(centroid);        m_indices.reserve(numVertices*3);	for(int i = 0; i < numVertices; ++i)	{		m_indices.push_back( baseIndex+i ); 		m_indices.push_back( baseIndex + ((i+1)%numVertices) );		m_indices.push_back( baseIndex + numVertices ); // centroid	}


Edit: Also, don't clear and fill the vector each frame, modify its contents instead.
If you really need to "clear" the vector, use resize(0) instead, since clear() might (depending on your compiler) actually also free the memory.
If you know how big the array needs to be why not just use an array?
// Bad:	for(int i = 0; i < numVertices; ++i)		m_vertices.push_back(v);// Good:	m_vertices.insert(m_vertices.end(), v, v+numVertices);
If you're using VS2k5, it performs quite a bit of error-checking by default, even in release builds. You have to make a couple of #define's to disable it.
Quote:
If you're using VS2k5, it performs quite a bit of error-checking by default, even in release builds. You have to make a couple of #define's to disable it.


I am using VS2005. Do you know what the defines are?
-----Quat
#define _SCL_SECURE 0


Also, Kwizatz is showing off the perfect way to get your .reserve()s wrong, you want:
m_vertices.reserve( baseIndex + numVertices + 1 );m_indices.reserve( m_indicies.size() + numVertices*3 );


VS2k5's implementation of .reserve, at least, also fails to grow exponentially if .reserve(n), n > .capacity() -- this could actually end up slowing things down as a result.
Thanks for the replies so far.

One thing I recently tried was to stop using push_back. Instead, I just used vector::operator[], and kept my own size counter. This is now comparable to the array performance.

I noticed push_back has a conditional statement (in case it needs to resize). Do you think that is the extra overhead? I read hardware does not like branching statements.
-----Quat

This topic is closed to new replies.

Advertisement