VBO (Vertex Buffer Object) Performance Issue

Started by Gumpngreen. 23 comments, last by Gumpngreen 20 years ago
Currently, rendering via compiled vertex arrays is about 3x faster than using VBOs. I have not noticed any graphical glitches with VBOs, but stability suffers as well: the program randomly crashes with VBOs enabled. If anyone could offer some hints on why the performance is so low, I would greatly appreciate it.

PC specs: Athlon XP 3200+, 1024 MB RAM, Radeon 9800 Pro DDR2 256 MB, Catalyst 4.3.

Here is the initialization code:

r_numVertexBufferObjects = VBO_ENDMARKER + glConfiguration.maxTextureUnits - 1;
if( r_numVertexBufferObjects > MAX_VERTEX_BUFFER_OBJECTS )
	r_numVertexBufferObjects = MAX_VERTEX_BUFFER_OBJECTS;

glGenBuffersARB( r_numVertexBufferObjects, r_vertexBufferObjects );

for( i = 0; i < r_numVertexBufferObjects; i++ ) {
	if( i == VBO_INDEXES )
		glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, r_vertexBufferObjects[i] );
	else
		glBindBufferARB( GL_ARRAY_BUFFER_ARB, r_vertexBufferObjects[i] );

	if( i == VBO_VERTS ) {
		glVertexPointer( 3, GL_FLOAT, 0, 0 );
	} else if( i == VBO_NORMALS ) {
		glBufferDataARB( GL_ARRAY_BUFFER_ARB, MAX_ARRAY_VERTS * sizeof( vec3_t ), NULL, GL_STREAM_DRAW_ARB );
		glNormalPointer( GL_FLOAT, 12, 0 );
	} else if( i == VBO_COLORS ) {
		glBufferDataARB( GL_ARRAY_BUFFER_ARB, MAX_ARRAY_VERTS * sizeof( byte_vec4_t ), NULL, GL_STREAM_DRAW_ARB );
		glColorPointer( 4, GL_UNSIGNED_BYTE, 0, 0 );
	} else if( i == VBO_INDEXES ) {
		glBufferDataARB( GL_ELEMENT_ARRAY_BUFFER_ARB, MAX_ARRAY_INDEXES * sizeof( int ), NULL, GL_STREAM_DRAW_ARB );
	} else {
		glBufferDataARB( GL_ARRAY_BUFFER_ARB, MAX_ARRAY_VERTS * sizeof( vec2_t ), NULL, GL_STREAM_DRAW_ARB );

		GL_SelectTexture( i - VBO_TC0 );
		glTexCoordPointer( 2, GL_FLOAT, 0, 0 );
		if( i > VBO_TC0 )
			GL_SelectTexture( 0 );
	}
}

glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
Here is the rendering code:

if( r_enableNormals ) {
	glEnableClientState( GL_NORMAL_ARRAY );
	glBindBufferARB( GL_ARRAY_BUFFER_ARB, r_vertexBufferObjects[VBO_NORMALS] );
	glBufferDataARB( GL_ARRAY_BUFFER_ARB, numVerts * sizeof( vec3_t ), normalsArray, GL_STREAM_DRAW_ARB );
	glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
}

GL_Bind( 0, r_texPointers[0] );
glEnableClientState( GL_TEXTURE_COORD_ARRAY );

if( numColors > 1 ) {
	glEnableClientState( GL_COLOR_ARRAY );
	glBindBufferARB( GL_ARRAY_BUFFER_ARB, r_vertexBufferObjects[VBO_COLORS] );
	glBufferDataARB( GL_ARRAY_BUFFER_ARB, numVerts * sizeof( byte_vec4_t ), colorArray, GL_STREAM_DRAW_ARB );
	glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
} else if( numColors == 1 ) {
	glColor4ubv( colorArray[0] );
}

for( i = 1; i < r_numAccumPasses; i++ ) {
	GL_Bind( i, r_texPointers[i] );
	glEnable( GL_TEXTURE_2D );
	glEnableClientState( GL_TEXTURE_COORD_ARRAY );
}

glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, r_vertexBufferObjects[VBO_INDEXES] );
if( glConfiguration.drawRangeElements )
	glDrawRangeElementsEXT( GL_TRIANGLES, 0, numVerts, numIndexes, GL_UNSIGNED_INT, 0 );
else
	glDrawElements( GL_TRIANGLES, numIndexes, GL_UNSIGNED_INT, 0 );
glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, 0 );
[edited by - Gumpngreen on April 13, 2004 3:23:19 PM]
The driver might need some VBO optimization. I've read lots of folks complain about the speed. From what I gather, one should create lots of small buffers in GL and the opposite in D3D. This is because D3D switches between kernel and user space modes frequently, which kills speed, so less switching by having deeper VBs is the way to go. Since GL doesn't take as much of a hit, you can switch more often and keep the verts in the internal caches, or something like that. I think the VBO speed should be improved in the 56.x drivers, from what I read in the release 55 NV driver docs.
For what it's worth, I have experienced similar issues with VBOs and eventually decided to avoid them.
Forget the last reply. The bug is quite simple: you are NOT using VBOs. Plus, you are sending data to the graphics card (glBufferDataARB) each frame. Look at the end of the extension specification for some examples of use (or some tutorials). In general it goes something like this (a code sketch follows the outline below):

init:
-create a buffer for each object (glGenBuffersARB). If you have static data use GL_STATIC_DRAW_ARB
-fill the buffer with vertex data via glBufferDataARB (just once!)
(same for indices)

render:
-bind the VBO
-set vertex pointers using offsets
-render
(optional: bind VBO 0 to disable VBO)
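For illustration, here is a minimal sketch of that init-once / render-many split, written in plain C against the GL_ARB_vertex_buffer_object entry points. The mesh data and the function names (InitMeshVBO, DrawMeshVBO, verts, indexes) are placeholders for this example, not code from the engine discussed above:

/* Upload a static mesh into VBOs exactly once at load time.
 * verts is numVerts * 3 floats, indexes is numIndexes unsigned ints. */
GLuint vertexVBO, indexVBO;

void InitMeshVBO( const float *verts, int numVerts,
                  const unsigned int *indexes, int numIndexes )
{
	glGenBuffersARB( 1, &vertexVBO );
	glBindBufferARB( GL_ARRAY_BUFFER_ARB, vertexVBO );
	/* GL_STATIC_DRAW_ARB: the data will not change after this upload */
	glBufferDataARB( GL_ARRAY_BUFFER_ARB, numVerts * 3 * sizeof( float ),
		verts, GL_STATIC_DRAW_ARB );

	glGenBuffersARB( 1, &indexVBO );
	glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, indexVBO );
	glBufferDataARB( GL_ELEMENT_ARRAY_BUFFER_ARB, numIndexes * sizeof( unsigned int ),
		indexes, GL_STATIC_DRAW_ARB );

	glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
	glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, 0 );
}

/* Per-frame path: bind, point into the buffer with an offset, draw. */
void DrawMeshVBO( int numIndexes )
{
	glEnableClientState( GL_VERTEX_ARRAY );

	glBindBufferARB( GL_ARRAY_BUFFER_ARB, vertexVBO );
	glVertexPointer( 3, GL_FLOAT, 0, (const GLvoid *)0 );	/* 0 = offset into the VBO */

	glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, indexVBO );
	glDrawElements( GL_TRIANGLES, numIndexes, GL_UNSIGNED_INT, (const GLvoid *)0 );

	/* optional: unbind so later plain vertex-array code is unaffected */
	glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
	glBindBufferARB( GL_ELEMENT_ARRAY_BUFFER_ARB, 0 );
	glDisableClientState( GL_VERTEX_ARRAY );
}

The key difference from the code in the original post is that glBufferDataARB runs once at load time, while the per-frame path only binds buffers and draws.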

You should never let your fears become the boundaries of your dreams.
As I said, for what it's worth.

DarkWing, do you have any experience with the effects of VBOs with fairly substantial datasets? I was experimenting with a dataset that was ~27 MB large. It ended up going a bit (not significantly) slower than using vertex arrays exclusively.
quote: Original post by haro
DarkWing, do you have any experience with the effects of VBOs with fairly substantial datasets? I was experimenting with a dataset that was ~27 MB large. It ended up going a bit (not significantly) slower than using vertex arrays exclusively.

I've played with VBO sizes when the extension was released, because binding a VBO was quite expensive. I remember there was a limit to the size of a single buffer, but I don't remember the exact size. After you exceeded the limit, the VBO dropped back to AGP memory (= slow). But the good thing is that the cost of binding a VBO is getting very small, so having lots of small buffers is not a problem anymore. Now I use mostly small buffers (< 1 MB). One thing to remember is that VBOs will not speed up your rendering (much) if you are not transfer bound.
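To illustrate the "lots of small buffers" idea, here is a rough sketch of splitting one large static vertex array across several VBOs so no single buffer exceeds a chosen budget. The 1 MB figure, the struct, and the function name are assumptions for this example only, not measured or taken from any engine in this thread:

/* Split a large static vertex array into VBOs of at most CHUNK_BYTES each. */
#define CHUNK_BYTES		( 1024 * 1024 )		/* illustrative ~1 MB budget per VBO */

typedef struct {
	GLuint	buffer;		/* VBO name */
	int		firstVert;	/* first vertex stored in this chunk */
	int		numVerts;	/* vertices stored in this chunk */
} vboChunk_t;

int UploadInChunks( const float *verts, int totalVerts, vboChunk_t *chunks, int maxChunks )
{
	const int vertsPerChunk = CHUNK_BYTES / ( 3 * sizeof( float ) );
	int numChunks = 0, first = 0;

	while( first < totalVerts && numChunks < maxChunks ) {
		int count = totalVerts - first;
		if( count > vertsPerChunk )
			count = vertsPerChunk;

		glGenBuffersARB( 1, &chunks[numChunks].buffer );
		glBindBufferARB( GL_ARRAY_BUFFER_ARB, chunks[numChunks].buffer );
		glBufferDataARB( GL_ARRAY_BUFFER_ARB, count * 3 * sizeof( float ),
			verts + first * 3, GL_STATIC_DRAW_ARB );

		chunks[numChunks].firstVert = first;
		chunks[numChunks].numVerts = count;
		first += count;
		numChunks++;
	}

	glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
	return numChunks;
}

At draw time each chunk is bound and drawn separately, trading a few extra binds for buffers that stay inside whatever size limit the driver handles well.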

You should never let your fears become the boundaries of your dreams.
27 MB is not a "fairly substantial dataset", it's a small dataset. At work I have datasets that are in the gigabyte range... but that's off-topic.

Tom Nuydens from delphi3d.net has a demo with occlusion culling on a medium dataset, around 200 MB of data, all put in static VBOs. It runs fine on a Radeon 9700+ (more than 50 MTris/sec); maybe you should check it out if that interests you.

Y.


[edited by - Ysaneya on April 14, 2004 3:10:45 PM]
EDIT: Flame - cut.


[edited by - haro on April 14, 2004 6:44:18 PM]
Ysaneya: The point was how much data you can put in one VBO before suffering from thrashing/swapping. As far as I can remember (it was a long time ago), Tom's demo uses a bunch of "small" buffers.

haro: No need to start a childish flame war here... The size of a dataset is relative. 1M vertices may be a lot for a Quake 3 level but is (or will be) very little for Unreal 3.

This topic is getting way off-topic. Just try to help the OP (Gumpngreen) before flaming on...

You should never let your fears become the boundaries of your dreams.

[edited by - _DarkWIng_ on April 14, 2004 5:28:59 PM]
clarification: as the author of this code, I can tell you that:
1) static VBOs are not used at all (they don't work very well in the Q3A environment)
2) calling glBufferDataARB each time you want to draw something is required (in fact, glBufferDataARB is a lot cheaper than glVertexPointer); see the sketch below
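As a rough sketch of that per-draw upload pattern, reusing the buffer names from the code in the original post (the function name and the comment about orphaning are this example's assumptions, not a statement about the engine):

/* Stream this draw call's vertices into the VBO that glVertexPointer was
 * already attached to at init time. vertsArray/numVerts are placeholder names. */
void StreamVerts( const vec3_t *vertsArray, int numVerts )
{
	glBindBufferARB( GL_ARRAY_BUFFER_ARB, r_vertexBufferObjects[VBO_VERTS] );

	/* Re-specifying the whole store with glBufferDataARB typically lets the
	 * driver hand back fresh memory ("orphan" the old data) instead of
	 * stalling until the GPU has finished reading the previous contents. */
	glBufferDataARB( GL_ARRAY_BUFFER_ARB, numVerts * sizeof( vec3_t ),
		vertsArray, GL_STREAM_DRAW_ARB );

	glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
}

Because gl*Pointer captures the buffer binding at the time it is called, the pointers set once in the initialization code keep referring to these VBOs; each draw only has to bind and refill them.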

This topic is closed to new replies.
