Int19

Extremely bad VBO performance (nVidia)


Hi,

I recently discovered a very serious performance drop in scenes with a relatively large number of small VBOs. I know all about "Batch, batch, batch...", and the number of draw calls is not the problem.

The weird part is that by looking away (or moving the camera) so that most of the scene is culled (hence only a few VBOs being activated), I can get the fps up from 40 to over 400. Nothing strange about that. But once I look back (displaying the whole scene again), the problem is gone and I have a stable 400 fps. Sometimes I have to do it a couple of times before it "takes", but I can always do it.

I have tried a lot of different ideas (nVidia PerfKit/GLExpert) and removed more or less all of our CPU code to get to the root of the problem, but with no success. It somehow seems like nVidia is doing something behind the scenes that I can't figure out. It might be some kind of memory management (defrag?) being performed on the GPU memory area of the smaller VBOs that fails to complete while they are being rendered (which would explain why it suddenly works after having looked away).

I'm running out of ideas. Has anyone had a similar experience?

Greetz David

And here is a short version of what happens in our engine. In reality the work is split between shaders and the "VertexBroker", much like how Yann described his system on these boards a few years back.
// Allocate and upload VBO
// * Happens at "VertexBroker" init
// * Happens when slots are split/merged to create slots of other size
glGenBuffersARB( 1, &pnewSlot->uBufferID);
glBindBufferARB( GL_ARRAY_BUFFER_ARB, uBufferID);
glBufferDataARB( GL_ARRAY_BUFFER_ARB, uSize, NULL, GL_STATIC_DRAW_ARB );
glBufferSubDataARB( GL_ARRAY_BUFFER_ARB, cOffset, cBytes, p );

// Enable client states (once for each shader switch)
glEnableClientState( GL_VERTEX_ARRAY );
glEnableClientState( GL_NORMAL_ARRAY );
glClientActiveTextureARB(GL_TEXTURE0_ARB);
glEnableClientState( GL_TEXTURE_COORD_ARRAY );

// Activate VBO and set pointers (once for each geometry)
glClientActiveTextureARB(GL_TEXTURE0_ARB);
glTexCoordPointer(uFloatSize,GL_FLOAT,stride,(char*)(offset));
glNormalPointer(GL_FLOAT,stride,(char*)(offset));
glVertexPointer(uFloatSize,GL_FLOAT,stride,(char*)(offset));

Quick follow-up:

I just created a new test scene with about 100 VBOs (sizes ranging from 4k to 1024k, with most of the VBOs being 4k or 8k), and in this scene it is impossible to get the fps up unless every single geometry is culled so that no VBOs are used. After that you are free to turn around and move about at 200 fps with 200k tris and shadow mapping enabled.

I suppose the number of individual batches is your problem. I have also experienced severe framerate drops when the number of batches per scene exceeds ~300 or so (GF6600GT, a somewhat old piece of hardware, though it can still render stuff pretty fast if it's in VRAM).

From experience I can tell that if the GPU gets choked by the number of batches, the framerate drop is by no means small or smooth. With regard to batch count there seems to be some kind of thin red line, and once it is crossed the framerate simply STALLS. I experienced ~50-70% drops from a solid 60fps (vsynced) by adding only a few dozen visible batches per frame to a scene which already had a couple of hundred batches visible per frame.
Most of my geometry is in VBOs, as seems to be the case with your project too.

It seems that you're having the same kind of problem, especially since your FPS skyrockets when most of your geometry gets culled.

My solution was just to combine the geometry to reduce the batch count.
Polycount and fillrate don't seem to be the issue in this case; it's simply the number of draw calls that is the problem.
It might be related to the VBOs somehow. I think a VBO draw call might stall the GPU for a short time, so they should be kept to a minimum.
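To illustrate what combining geometry can look like on the CPU side, here is a minimal sketch in C++. The `Vertex` and `Mesh` types and the `appendMesh`/`mergeMeshes` helpers are hypothetical names invented for this example, not part of either poster's engine. It assumes the meshes already share the same vertex layout, shader and textures, and that their vertices are already in a common space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interleaved vertex layout (position, normal, one texcoord set).
struct Vertex {
    float position[3];
    float normal[3];
    float texcoord[2];
};

// Hypothetical CPU-side mesh; before merging, each one would be its own
// VBO and its own draw call.
struct Mesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Append 'src' to 'dst'. Indices are rebased by the number of vertices
// already in 'dst' so they still refer to the same vertices after merging.
void appendMesh(Mesh& dst, const Mesh& src) {
    const std::uint32_t base = static_cast<std::uint32_t>(dst.vertices.size());
    dst.vertices.insert(dst.vertices.end(),
                        src.vertices.begin(), src.vertices.end());
    dst.indices.reserve(dst.indices.size() + src.indices.size());
    for (std::uint32_t index : src.indices) {
        assert(index < src.vertices.size()); // source mesh must be valid
        dst.indices.push_back(base + index);
    }
}

// Merge a list of meshes into one mesh that can be uploaded as a single
// vertex buffer plus a single index buffer and drawn with one call.
Mesh mergeMeshes(const std::vector<Mesh>& meshes) {
    Mesh merged;
    for (const Mesh& mesh : meshes) {
        appendMesh(merged, mesh);
    }
    return merged;
}
```

The merged result trades per-object culling for fewer draw calls, so in practice it probably makes sense to merge only geometry that is close together in the scene.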

BTW, the project I'm currently working on is in the "Image of the Day" section, titled "Sylph Wind".

Thanks for your reply.

I cannot say for sure that the number of draw calls has nothing to do with it, but I'm fairly certain it is not the key. As I said in the original post, the framerate will go up when looking away, but then it will STAY at that high fps when you look back at the scene.

In the latest test case there are around 100 draw calls per frame (150 or so with CSM enabled). This is on a GF 7950GT (although I don't think the number of draw calls supported per frame is much different than on a 6600).

I can most definitely get the fps up by batching everything (which I have done to try to find the level at which the fps drops), but it seems I have to go down to about 20-40 draw calls to avoid the drop. That can't be right.

Thanks again for your reply, will have a look at your project :)

Greetz David

I just found the solution on the opengl.org boards. It's apparently an nVidia issue with the newer drivers running on multi-CPU systems:

http://www.opengl.org/cgi-bin/ubb/ultimatebb.cgi?ubb=get_topic;f=3;t=015274;p=0

I was able to fix the problem by forcing single-CPU use while creating my OpenGL context. Sadly, I will have to add some code to check the nVidia driver version once they release a new driver that tackles this issue. But for now this will do:

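// Remember the current affinity masks so they can be restored afterwards
// (the masks are DWORD_PTR, not DWORD, so this also builds for 64-bit)
DWORD_PTR procMask = 0;
DWORD_PTR sysMask = 0;
::GetProcessAffinityMask(::GetCurrentProcess(), &procMask, &sysMask);

// Set single CPU usage
::SetProcessAffinityMask(::GetCurrentProcess(), 0x1);

// Create opengl context
InitOpenGL();

// Restore the affinity the process had before
::SetProcessAffinityMask(::GetCurrentProcess(), procMask);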
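The driver version check mentioned above could look something like the following sketch. It rests on assumptions the thread does not confirm: that the `GL_VENDOR` string contains "NVIDIA", that the `GL_VERSION` string has the form "2.1.2 NVIDIA 169.21", and that a fixed driver version will exist at some point. The names `DriverVersion`, `parseNvidiaDriverVersion`, `needsAffinityWorkaround` and the `fixedIn` parameter are hypothetical and only serve the example.

```cpp
#include <cstdio>
#include <cstring>

// Driver version as two integers, e.g. "169.21" -> major 169, minor 21.
struct DriverVersion {
    int major;
    int minor;
};

// Try to extract the driver version from a GL_VERSION string.
// ASSUMPTION: the string looks like "2.1.2 NVIDIA 169.21".
// Returns false if the string does not match that shape.
bool parseNvidiaDriverVersion(const char* glVersion, DriverVersion& out) {
    if (!glVersion) return false;
    const char* tag = std::strstr(glVersion, "NVIDIA");
    if (!tag) return false;
    int major = 0;
    int minor = 0;
    if (std::sscanf(tag, "NVIDIA %d.%d", &major, &minor) != 2) return false;
    out.major = major;
    out.minor = minor;
    return true;
}

// Decide whether the single-CPU workaround should be applied.
// 'fixedIn' is a HYPOTHETICAL first driver version without the problem;
// the thread does not name one.
bool needsAffinityWorkaround(const char* glVendor, const char* glVersion,
                             DriverVersion fixedIn) {
    // Only nVidia drivers are reported to be affected.
    if (!glVendor || !std::strstr(glVendor, "NVIDIA")) return false;

    DriverVersion found;
    if (!parseNvidiaDriverVersion(glVersion, found)) {
        return true; // unknown version: keep the workaround to be safe
    }
    if (found.major != fixedIn.major) return found.major < fixedIn.major;
    return found.minor < fixedIn.minor;
}
```

One caveat: `glGetString(GL_VENDOR)` and `glGetString(GL_VERSION)` only return useful values while a context is current, whereas the workaround has to be in place before the context is created. A check like this would therefore need something like a throwaway context first, or another source for the driver version.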
