
ARB_vertex_buffer_object: No speed increase

This topic is 4856 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hello everyone, I'm now using VBOs to render a lot of triangles (65536) with GL_TRIANGLE_STRIP. However, I don't get any speed increase compared with using VAR. This is how I set up my VBO:
void CVertexBuffer::Init( CV3d* _pV, unsigned int _numV,
						 CV3d* _pN, unsigned int _numN, 
						 CColor* _pC, unsigned int _numC,
						 CV3d* _pT1, unsigned int _numT1,
						 CV3d* _pT2, unsigned int _numT2,
						 CV3d* _pT3, unsigned int _numT3,
						 CV3d* _pT4, unsigned int _numT4,
						 GLenum _mode, unsigned int* _pIndices, unsigned int _numIndices, bool _bUseHardwareBuffer )
{
	if( !_numV  || !_pV )
		return;

	// pack every kind of buffer into one large buffer

	// calculate the offset values

	m_offsets[O_V] = 0;
	m_offsets[O_N] = m_offsets[O_V] + sizeof(CV3d) * _numV;
	m_offsets[O_C] = m_offsets[O_N] + sizeof(CV3d) * _numN;
	m_offsets[O_T1] = m_offsets[O_C] + sizeof(CColor) * _numC;
	m_offsets[O_T2] = m_offsets[O_T1] + sizeof(CV3d) * _numT1;
	m_offsets[O_T3] = m_offsets[O_T2] + sizeof(CV3d) * _numT2;
	m_offsets[O_T4] = m_offsets[O_T3] + sizeof(CV3d) * _numT3;
	m_offsets[O_END] = m_offsets[O_T4] + sizeof(CV3d) * _numT4;

	// fill the buffer
	m_pBuffer = new unsigned char[ m_offsets[O_END] ];
	if( _numV )
		memcpy( &m_pBuffer[m_offsets[O_V]], _pV, m_offsets[O_V+1]-m_offsets[O_V] );
	if( _numN )
		memcpy( &m_pBuffer[m_offsets[O_N]], _pN, m_offsets[O_N+1]-m_offsets[O_N] );
	if( _numC )
		memcpy( &m_pBuffer[m_offsets[O_C]], _pC, m_offsets[O_C+1]-m_offsets[O_C] );
	if( _numT1 )
		memcpy( &m_pBuffer[m_offsets[O_T1]], _pT1, m_offsets[O_T1+1]-m_offsets[O_T1] );
	if( _numT2 )
		memcpy( &m_pBuffer[m_offsets[O_T2]], _pT2, m_offsets[O_T2+1]-m_offsets[O_T2] );
	if( _numT3 )
		memcpy( &m_pBuffer[m_offsets[O_T3]], _pT3, m_offsets[O_T3+1]-m_offsets[O_T3] );
	if( _numT4 )
		memcpy( &m_pBuffer[m_offsets[O_T4]], _pT4, m_offsets[O_T4+1]-m_offsets[O_T4] );

	if( _pIndices && _numIndices )
	{
		m_pIndices = new unsigned int[_numIndices];
		memcpy( m_pIndices, _pIndices, _numIndices * sizeof(unsigned int) );
		m_numIndices = _numIndices;
	}
	else
		m_numIndices = _numV;

	m_mode = _mode;

	m_bUseHardwareBuffer = _bUseHardwareBuffer;

	// generate the buffer in video memory if possible

	if( GLEW_ARB_vertex_buffer_object && m_bUseHardwareBuffer )
	{
		glGenBuffersARB( 1, &m_bufferHandle );
		glBindBufferARB( GL_ARRAY_BUFFER_ARB, m_bufferHandle );
		glBufferDataARB( GL_ARRAY_BUFFER_ARB, m_offsets[O_END], m_pBuffer, GL_STATIC_DRAW_ARB );
		
		if( _pIndices && _numIndices )
		{
			glGenBuffersARB( 1, &m_indexBufferHandle );
			glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, m_indexBufferHandle );
			glBufferDataARB( GL_ELEMENT_ARRAY_BUFFER_ARB, m_numIndices * sizeof(unsigned int), m_pIndices, GL_STATIC_DRAW_ARB );
		}
		glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
		glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, 0 );
	}
	else
	{
	}
}

Fill rate is not the limit, because I see no difference between:
- white color, no lights, no textures (still passing the texture coords though)
- point light, bump mapping, multitexturing, using vertex and pixel shaders

Any hints?

Lyve

Sorry, I was wrong: I rendered 131072 triangles, not 65536.

I've now tested with a lot more: a 1024x1024 grid (2097152 triangles), rendered as strips with one degenerate face per row.

With VBO: 1.72 fps
With VAR: 2.02 fps

System: Geforce FX 5700 / P4 3GHz HT
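
For readers unfamiliar with the setup described above, here is a minimal sketch of how a grid can be turned into a single GL_TRIANGLE_STRIP index list with degenerate joins between rows. The function names and the row-by-row vertex layout are assumptions for illustration, not the poster's code. This common variant repeats two indices at each row change, which may differ from the "one degenerate face per row" scheme mentioned in the post.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds one triangle-strip index list for a grid of quadsX x quadsY quads.
// Assumption: vertices are stored row by row, (quadsX + 1) vertices per row.
// Rows are joined by repeating two indices, which produces zero-area
// (degenerate) triangles that draw nothing.
std::vector<std::uint32_t> BuildGridStrip(std::uint32_t quadsX, std::uint32_t quadsY)
{
    std::vector<std::uint32_t> indices;
    if (quadsX == 0 || quadsY == 0)
        return indices;

    const std::uint32_t vertsPerRow = quadsX + 1;
    indices.reserve(static_cast<std::size_t>(quadsY) * (2 * vertsPerRow + 2));

    for (std::uint32_t y = 0; y < quadsY; ++y)
    {
        const std::uint32_t top = y * vertsPerRow;
        const std::uint32_t bottom = (y + 1) * vertsPerRow;

        // Bridge from the previous row: repeat its last index, then this
        // row's first index.
        if (y > 0)
        {
            indices.push_back(indices.back());
            indices.push_back(top);
        }

        // Zig-zag between the top and bottom vertex rows of this quad row.
        for (std::uint32_t x = 0; x < vertsPerRow; ++x)
        {
            indices.push_back(top + x);
            indices.push_back(bottom + x);
        }
    }
    return indices;
}

// Number of visible (non-degenerate) triangles in the grid.
std::size_t VisibleTriangles(std::uint32_t quadsX, std::uint32_t quadsY)
{
    return 2 * static_cast<std::size_t>(quadsX) * quadsY;
}
```

With this layout, a 1024x1024 grid of quads gives the 2097152 visible triangles quoted above; the bridging indices add only two extra entries per row.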

Regards,

Lyve

Just an idea, not to be taken too seriously.

Since you're getting such low frame rates, why don't you try decreasing the number of triangles and measure the FPS then? It could be fill rate.

How does VBO perform compared to plain VA? Could you try testing with an 'unsigned short' index buffer format? Could you also post how you render with the VBO?

I've seen big differences between VAR and VBO in early drivers, but now they perform about the same.

adam17: He stated that fill rate is not an issue.
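
As a rough sketch of the 'unsigned short' suggestion: the setup code above stores indices as unsigned int, so trying 16-bit indices means converting them and checking that every index fits. NarrowIndices is a hypothetical helper name, not something from the original code.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Copies 32-bit indices into a 16-bit array.
// Returns false and leaves `out` empty if any index does not fit in an
// unsigned short, in which case the mesh has to be split first.
bool NarrowIndices(const unsigned int* indices, std::size_t count,
                   std::vector<unsigned short>& out)
{
    const unsigned int kMaxIndex = std::numeric_limits<unsigned short>::max();

    out.clear();
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        if (indices[i] > kMaxIndex)
        {
            out.clear();
            return false;
        }
        out.push_back(static_cast<unsigned short>(indices[i]));
    }
    return true;
}
```

The narrowed array would then be uploaded with a size of `out.size() * sizeof(unsigned short)` and drawn with GL_UNSIGNED_SHORT as the index type instead of GL_UNSIGNED_INT.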

@DarkWing

Sorry that I misled you. I didn't mean Vertex Array Range; I meant plain vertex arrays, no extensions.

@adam My fillrate for 131xxx triangles is about 30-40. If I increase the number of triangles, the triangles are resized accordingly so that the same area gets filled.

___________

I've now tried it on another test system: AMD XP 2400+ / GFFX 5900. VBOs make a huge difference here. For 131k triangles I get about 12 fps with VA and about 60 fps with VBOs. Remember: I had 40 fps with and without VBOs on the P4 3GHz / GFFX 5700. How come there is such a huge difference on an AMD system but not on a P4 one? AGP bus differences?
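
To put these frame rates on one scale, it can help to convert them to triangles per second. A tiny sketch (TrianglesPerSecond is just an illustrative helper, and the inputs are the approximate figures quoted in this thread):

```cpp
// Triangles submitted per second = triangles per frame * frames per second.
double TrianglesPerSecond(unsigned long trianglesPerFrame, double fps)
{
    return static_cast<double>(trianglesPerFrame) * fps;
}
```

For 131072 triangles per frame, 12 fps is roughly 1.6 million triangles per second, 40 fps roughly 5.2 million, and 60 fps roughly 7.9 million.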

Regards,

Lyve

I rather think the difference between the systems is due to the graphics card, not the processor. However, it's true that drivers may be optimized for specific processors through MMX, SSE or 3DNow! code paths.

As for the difference between VA and VBO in general, it's not guaranteed that VBOs will always be faster than VAs. If you use VBOs the wrong way, they can slow things down significantly. "Wrong" use especially includes data types, which is why DarkWing's unsigned short advice is very important. The GeForce FX series can handle arrays bigger than 64k, but when you run your program on the GeForce1-4 series it will either kill your frame rate or not work at all. Try splitting your geometry into 64k chunks if you can. It saves significant memory on index arrays and ensures compatibility across all hardware.

"@adam My fillrate for 131xxx triangles is about 30-40. If I increase the number of triangles, the triangles are resized accordingly so that the same area gets filled."

Oh OK, that makes sense now.

My guess is that you are rendering too big a patch..
In my terrain demo I used regular vertex arrays and got about 100 fps..

Switched to VBO and didn't see any noticeable change..
Until I split it up and fed the VBO in batches of 2048 triangles..
So I split my 131k terrain into 64 pieces and used the VBO once for each piece..
KABOOM..
Fps was at and over 200..
More than a 100% speed increase..

Might be worth a try??
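
A minimal sketch of the splitting step, assuming the terrain is a regular grid of quads (for example 256x256 quads for 131072 triangles, cut into 32x32-quad patches). The Patch struct and SplitGrid function are hypothetical names for illustration; how each patch is then uploaded and drawn is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One rectangular piece of the terrain grid, measured in quads.
struct Patch
{
    unsigned int x0, y0;          // first quad column and row of the patch
    unsigned int quadsX, quadsY;  // patch size in quads

    std::size_t TriangleCount() const
    {
        return 2 * static_cast<std::size_t>(quadsX) * quadsY;
    }
    std::size_t VertexCount() const
    {
        return static_cast<std::size_t>(quadsX + 1) * (quadsY + 1);
    }
};

// Cuts a quadsX x quadsY grid into patches of at most patchQuads x patchQuads
// quads. Patches on the right and bottom edges may be smaller.
std::vector<Patch> SplitGrid(unsigned int quadsX, unsigned int quadsY,
                             unsigned int patchQuads)
{
    std::vector<Patch> patches;
    if (patchQuads == 0)
        return patches;

    for (unsigned int y0 = 0; y0 < quadsY; y0 += patchQuads)
    {
        for (unsigned int x0 = 0; x0 < quadsX; x0 += patchQuads)
        {
            Patch p;
            p.x0 = x0;
            p.y0 = y0;
            p.quadsX = std::min(patchQuads, quadsX - x0);
            p.quadsY = std::min(patchQuads, quadsY - y0);
            patches.push_back(p);
        }
    }
    return patches;
}
```

With 256x256 quads and 32-quad patches this yields 64 patches of 2048 triangles each, matching the numbers above, and each patch has few enough vertices to be indexed with unsigned short.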
