Unusual VBO slowdown

I implemented vertex arrays and VBOs in my planet generator experiment... To my great surprise, VBOs perform much slower than regular vertex arrays, and even slower than immediate mode. The polygons are divided into 20 batches (there are 20 unique textures), so the batch size in the following data varies from 1024 to 65536 triangles.

Throughput in Mtris/second:

tris      immediate  vertex array  VBO
20480     2.56       4.09          1.86
81920     2.64       13.7          1.95
327680    2.69       2.73          2.05
1310720   2.65       2.76          0.1 (out of VRAM?)

Setup: P4 2.4 GHz, Radeon 9700 Pro 128 MB
Note that the vertex array performance has a sharp peak at 4096 tris/batch, while VBO is consistently slow. Can you see anything obviously wrong in the code below? (It is called once per frame for each quadrant [batch]; the arrays are written only once, in an init function.)
#ifdef USE_VAR

		if (gfx.vbosup)
		{
			// VBO path: each attribute lives in its own buffer object
			glBindBufferARB( GL_ARRAY_BUFFER_ARB, orb->rants[r].nvar);
			glVertexPointer(3, GL_INT, 0, NULL);
			glBindBufferARB( GL_ARRAY_BUFFER_ARB, orb->rants[r].nnar);
			glNormalPointer(GL_FLOAT, 0, NULL);
			glBindBufferARB( GL_ARRAY_BUFFER_ARB, orb->rants[r].ntar);
			glTexCoordPointer(2, GL_FLOAT, 0, NULL);
		}
		else
		{
			// regular vertex arrays in system memory
			glVertexPointer(3, GL_INT, 0, orb->rants[r].var);
			glNormalPointer(GL_FLOAT, 0, orb->rants[r].nar);
			glTexCoordPointer(2, GL_FLOAT, 0, orb->rants[r].tar);
		}

		glDrawElements(GL_TRIANGLES, orb->rants[r].ntris*3, GL_UNSIGNED_INT, orb->rants[r].iar); // array of vertex/normal/texcoord indices

#else

		// immediate mode: one normal/texcoord/vertex call per triangle corner
		tris = orb->rants[r].tris;
		verts = orb->rants[r].verts;
		glBegin(GL_TRIANGLES);
		for (tri = 0; tri < orb->rants[r].ntris; tri++)
		{
			//glNormal3f(tris[tri].norm[0], tris[tri].norm[1], tris[tri].norm[2]);
			glNormal3f(orb->rants[r].nar[tris[tri].v[0]*3+0], orb->rants[r].nar[tris[tri].v[0]*3+1], orb->rants[r].nar[tris[tri].v[0]*3+2]);
			glTexCoord2f(orb->rants[r].tar[tris[tri].v[0]*2+0], orb->rants[r].tar[tris[tri].v[0]*2+1]);
			glVertex3i(orb->rants[r].var[tris[tri].v[0]*3+0], orb->rants[r].var[tris[tri].v[0]*3+1], orb->rants[r].var[tris[tri].v[0]*3+2]);

			glNormal3f(orb->rants[r].nar[tris[tri].v[1]*3+0], orb->rants[r].nar[tris[tri].v[1]*3+1], orb->rants[r].nar[tris[tri].v[1]*3+2]);
			glTexCoord2f(orb->rants[r].tar[tris[tri].v[1]*2+0], orb->rants[r].tar[tris[tri].v[1]*2+1]);
			glVertex3i(orb->rants[r].var[tris[tri].v[1]*3+0], orb->rants[r].var[tris[tri].v[1]*3+1], orb->rants[r].var[tris[tri].v[1]*3+2]);

			glNormal3f(orb->rants[r].nar[tris[tri].v[2]*3+0], orb->rants[r].nar[tris[tri].v[2]*3+1], orb->rants[r].nar[tris[tri].v[2]*3+2]);
			glTexCoord2f(orb->rants[r].tar[tris[tri].v[2]*2+0], orb->rants[r].tar[tris[tri].v[2]*2+1]);
			glVertex3i(orb->rants[r].var[tris[tri].v[2]*3+0], orb->rants[r].var[tris[tri].v[2]*3+1], orb->rants[r].var[tris[tri].v[2]*3+2]);
		}
		glEnd();

#endif

It seems like you need to access three different vertex buffers for one vertex. I don't know if that's regular behaviour, but when I tried to place my texcoords somewhere else (i.e. the moment I was accessing more than one VB at once) the performance degraded horribly. In other words: don't. Allocate one big buffer and either use offsets for the different kinds of data or (probably better) store them interleaved.

Something like either this:
glBindBufferARB( GL_ARRAY_BUFFER_ARB, orb->rants[r].nvar);
glVertexPointer(3, GL_INT, 0, 0);
glNormalPointer(GL_FLOAT, 0, (char*)NormOffset);
glTexCoordPointer(2, GL_FLOAT, 0, (char*)TexOffset);

or this (with struct Vertex defined as {int x,y,z; float nx,ny,nz; float u,v;}):
glVertexPointer(3, GL_INT, sizeof(Vertex), 0);
glNormalPointer(GL_FLOAT, sizeof(Vertex), (char*)(3*sizeof(int)));
glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), (char*)(3* (sizeof(int)+sizeof(float)) ));

The idea behind method 2 is that the closer together you store your data, the less the hardware has to jump all over memory to collect it, though I wouldn't expect it to make much difference. Just try keeping everything for one draw call in one single VB.
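
As a rough illustration of the two layouts described above, the byte offsets can be computed rather than hard-coded. This is only a sketch: `BlockOffsets`, `block_offsets` and the two `interleaved_*` helpers are hypothetical names, and the `Vertex` struct simply mirrors the one given in the post (including its `int` positions).

```c
#include <assert.h>
#include <stddef.h>

/* Interleaved layout from the post: position, normal, texcoord per vertex. */
typedef struct Vertex {
    int   x, y, z;      /* position (GL_INT in the original code) */
    float nx, ny, nz;   /* normal */
    float u, v;         /* texture coordinate */
} Vertex;

/* Byte offsets for method 1: three tightly packed blocks in one buffer,
   all positions first, then all normals, then all texcoords. */
typedef struct BlockOffsets {
    size_t position;
    size_t normal;
    size_t texcoord;
    size_t total;       /* size of the whole buffer in bytes */
} BlockOffsets;

BlockOffsets block_offsets(size_t vertex_count)
{
    BlockOffsets o;
    assert(vertex_count > 0);
    o.position = 0;
    o.normal   = o.position + vertex_count * 3 * sizeof(int);
    o.texcoord = o.normal   + vertex_count * 3 * sizeof(float);
    o.total    = o.texcoord + vertex_count * 2 * sizeof(float);
    return o;
}

/* Byte offsets for method 2: one interleaved Vertex per element.
   offsetof accounts for any padding the compiler might insert. */
size_t interleaved_normal_offset(void)   { return offsetof(Vertex, nx); }
size_t interleaved_texcoord_offset(void) { return offsetof(Vertex, u); }
```

With a buffer bound, these offsets would be passed as the pointer argument of the gl*Pointer calls, e.g. `(char*)NULL + o.normal` for method 1, or with a stride of `sizeof(Vertex)` for method 2.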

[edited by - Trienco on November 16, 2003 7:39:52 AM]

You should also use GL_ELEMENT_ARRAY_BUFFER_ARB to store indices in video memory. I know ATI prefers this over system-stored indices. Overall those numbers (tris/sec) are very low for that kind of card; you should be getting at least 5x more. The "out of VRAM" thing is probably because there is a 32 MB limit on VBO size (not written anywhere, but both NVIDIA and ATI fail to allocate this size in VRAM).

You should never let your fears become the boundaries of your dreams.

glVertexPointer(3, GL_INT , sizeof(Vertex), 0);
Ints aren't optimised on most drivers.

Blocks of vertex array data may be stored in buffer objects with the
same format and layout options supported for client-side vertex
arrays. However, it is expected that GL implementations will (at
minimum) be optimized for data with all components represented as
floats, as well as for color data with components represented as
either floats or unsigned bytes.


Jester, studient programmer
The Jester Home in French

jesterlecodeur is correct. Here's a table from ATI's OpenGL SDK (http://www.ati.com/developer/sdk/radeonsdk/Gl_sdk.zip):

Type                    Native  Alignment  Components  Range
GLdouble                No
GLfloat                 Yes     32-bit     1,2,3,4     +/- MAX_FLOAT
GLuint                  No
GLint                   No
GLushort                Yes     32-bit     2,4         [0,65536]
GLshort                 Yes     32-bit     2,4         [-32768,32767]
GLushort (normalized)   Yes     32-bit     2,4         [0,1]
GLshort (normalized)    Yes     32-bit     2,4         [-1,1]
GLubyte                 Yes     32-bit     4           [0,255]
GLbyte                  Yes     32-bit     4           [-128,127]
GLubyte (normalized)    Yes     32-bit     4           [0,1]
GLbyte (normalized)     Yes     32-bit     4           [-1,1]
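
Since the table above lists GLint as non-native, one way to act on it is to convert integer coordinates to floats once at load time and upload the float copy instead. The following is a minimal sketch under that assumption; `ints_to_floats` and its `scale` parameter are hypothetical and not taken from the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Convert integer vertex components to floats once, at init time.
   `scale` is a hypothetical factor for mapping integer units to
   whatever float units the renderer uses (pass 1.0f for none). */
void ints_to_floats(const int *src, float *dst, size_t count, float scale)
{
    size_t i;
    assert(src != NULL && dst != NULL);
    for (i = 0; i < count; i++)
        dst[i] = (float)src[i] * scale;
}
```

The converted array would then be described with GL_FLOAT instead of GL_INT in glVertexPointer. Keep in mind that a 32-bit float has a 24-bit significand, so integer coordinates beyond roughly 16.7 million in magnitude may lose precision in the conversion.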

Thanks for the good suggestions... In particular the lack of native int support would explain a lot. I'll try all of these and see how it turns out. I haven't used VBO before so this is all new to me.

Ah, the smell of progress

Throughput in Mtris/s:

tris      VAR    VBO1   VBO2
20480     4.09   4.09   4.09
81920     13.7   16.4   16.4
327680    2.73   27.3   41.0
1310720   2.76   21.5   42.3

Replacing ints with floats alone increased the triangle rates dramatically (VBO1). Interleaving the vertex/normal/texcoord data had negligible effect (<1ms/frame). Adding a hardware buffer for indices caused another performance jump at the high end of poly counts (VBO2), although I may not end up using it if/when I implement some kind of a LOD scheme.
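
To relate the Mtris/s figures to the per-frame cost mentioned here, throughput can be turned into a frame time. This is just the arithmetic, with `frame_time_ms` as a hypothetical helper name.

```c
#include <assert.h>

/* Milliseconds spent per frame for a given triangle count and
   throughput in millions of triangles per second. */
double frame_time_ms(double triangles, double mtris_per_second)
{
    assert(mtris_per_second > 0.0);
    return triangles / (mtris_per_second * 1.0e6) * 1000.0;
}
```

For the 327680-triangle case, 27.3 Mtris/s works out to about 12.0 ms per frame and 41.0 Mtris/s to about 8.0 ms, so a change of less than 1 ms is small compared with either step.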

Also it turns out that I'm not out of VRAM after all... I'm using ~21M for the vertex arrays at the highest detail level. Still, this means I'll have to cut the detail if I ever want to display more than one planet.

In case you're wondering what the thing looks like, here's a picture.
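
A back-of-the-envelope estimate of buffer sizes can be sketched as follows. The mesh layout is an assumption, not something stated in the thread: each of the 20 batches is taken to be a triangular patch with `n` segments per edge that shares no vertices with its neighbours, and each vertex is taken to occupy 32 bytes (three floats for position, three for the normal, two for the texcoord). All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A triangular patch with n segments per edge has n*n small triangles
   and (n+1)(n+2)/2 vertices. */
size_t patch_triangles(size_t n) { return n * n; }
size_t patch_vertices(size_t n)  { return (n + 1) * (n + 2) / 2; }

/* Total bytes of vertex data for `patches` independent patches. */
size_t vertex_bytes(size_t patches, size_t n, size_t bytes_per_vertex)
{
    assert(patches > 0 && n > 0);
    return patches * patch_vertices(n) * bytes_per_vertex;
}

/* Total bytes of index data: three indices per triangle. */
size_t index_bytes(size_t patches, size_t n, size_t bytes_per_index)
{
    assert(patches > 0 && n > 0);
    return patches * patch_triangles(n) * 3 * bytes_per_index;
}
```

Under these assumptions, `vertex_bytes(20, 256, 32)` gives 21,217,920 bytes for the 1310720-triangle level, which is in the same range as the ~21M quoted above, and `index_bytes(20, 256, 4)` gives 15,728,640 bytes for GL_UNSIGNED_INT indices.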

Thanks for your help!

Well, I don't think you're going to have multiple planets close enough together for such detail to need to be shown simultaneously on all...

So you're saying you don't have any slowdowns when using multiple VBs for position/normals etc.? Hm, time to either get an ATI or hope newer drivers work better, because the current setup is horribly chaotic *g*

Yes, it's interesting because what you said made a lot of sense. I'm keeping them all in a single array now anyway since it's easier to manage (and other hardware might not be as forgiving).

I did find that the indices themselves want to be as sequential as possible rather than jumping around within the array(s). And I guess it's easier for caching when subsequent triangles re-use vertices too. So ordering the triangles as if they formed a triangle strip seems to be the fastest to render.
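
The effect of index order on vertex re-use can be explored offline with a small simulation. The sketch below counts misses in a simple FIFO cache of recently used vertex indices; the FIFO policy and the cache size are assumptions for illustration, not documented behaviour of any particular card, and `count_cache_misses` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CACHE 64

/* Count how many indices miss a FIFO cache of `cache_size` entries.
   Fewer misses means more vertices are re-used between triangles. */
size_t count_cache_misses(const unsigned *indices, size_t count,
                          size_t cache_size)
{
    unsigned cache[MAX_CACHE];
    size_t used = 0, next = 0, misses = 0, i, j;

    assert(cache_size > 0 && cache_size <= MAX_CACHE);
    for (i = 0; i < count; i++) {
        int hit = 0;
        for (j = 0; j < used; j++) {
            if (cache[j] == indices[i]) { hit = 1; break; }
        }
        if (!hit) {
            misses++;
            cache[next] = indices[i];          /* overwrite the oldest entry */
            next = (next + 1) % cache_size;
            if (used < cache_size) used++;
        }
    }
    return misses;
}
```

Dividing the miss count by the number of triangles gives an average miss rate per triangle; a strip-like ordering should score lower than the same triangles in a scattered order.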

