glGetQueryObjectuivARB - extremely slow

This topic is 2634 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Guys, yesterday I finally managed to implement occlusion culling in my engine, but the results are far from what I expected.

Here is what I do (as per the spec):


DRAW()
{
    // Pass 1: Z-fill pass for occlusion
    for (int x = 0; x < 32; x++)
        for (int z = 0; z < 32; z++)
        {
            if (cube[x][z]->infrustum)
            {
                cube[x][z]->drawTerrain();
            }
        }

    // Pass 2: occlusion query pass, with color and depth writes disabled
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthMask(GL_FALSE);
    for (int x = 0; x < 32; x++)
        for (int z = 0; z < 32; z++)
        {
            if (cube[x][z]->infrustum)
            {
                glBeginQueryARB(GL_SAMPLES_PASSED_ARB, cube[x][z]->OCQuery);
                cube[x][z]->drawTerrain();
                glEndQueryARB(GL_SAMPLES_PASSED_ARB);
                glGetQueryObjectuivARB(cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);
            }
        }
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_TRUE);

    // Pass 3: final pass, shaders, lighting and so on
    ...
    ...
    for (int x = 0; x < 32; x++)
        for (int z = 0; z < 32; z++)
        {
            // Draw only the cells whose query reported visible samples
            if (cube[x][z]->OCcount > 0)
                cube[x][z]->drawTerrain();
        }
}




Now, the thing is that glGetQueryObjectuivARB is killing my framerate: it drops from ~100 FPS to 45, and when I remove the call I get ~120 FPS.

I can run occlusion step 1, occlusion step 2 and the final render without glGetQueryObjectuivARB and still get ~120 FPS, but as soon as I call glGetQueryObjectuivARB on each cube my framerate dies.

Why does that happen?
What is the trick to using occlusion culling effectively? I seem to have fallen into a black hole with this. Why use it at all if it costs this much?

Just to clarify: the terrain is rendered with VBOs, with a single draw call per cell (cube).



I'm going to be honest: I know next to nothing about the ARB_occlusion_query extension. But I did look over your code and read a little snippet from the documentation.

Quote:

This extension solves both of those problems. It returns as its result the number of samples that pass the depth and stencil tests, and it encapsulates occlusion queries in "query objects" that allow applications to issue many queries before asking for the result of any one. As a result, they can overlap the time it takes for the occlusion query results to be returned with other, more useful work, such as rendering other parts of the scene or performing other computations on the CPU.


This doesn't really look like what you have:


for (int x = 0; x < 32; x++) {
    for (int z = 0; z < 32; z++) {
        glBeginQueryARB(GL_SAMPLES_PASSED_ARB, cube[x][z]->OCQuery);
        cube[x][z]->drawTerrain();
        glEndQueryARB(GL_SAMPLES_PASSED_ARB);
        glGetQueryObjectuivARB(cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);
    }
}




Any time you have an OpenGL program that involves a giant loop of set/get/set/get/set/get/... your performance is going to suffer, because every get has to wait for the commands issued just before it. Can you try something like this instead? It seems to be more in line with what the extension documentation recommends.


// Issue all the queries first...
for (int x = 0; x < 32; x++) {
    for (int z = 0; z < 32; z++) {
        glBeginQueryARB(GL_SAMPLES_PASSED_ARB, cube[x][z]->OCQuery);
        cube[x][z]->drawTerrain();
        glEndQueryARB(GL_SAMPLES_PASSED_ARB);
    }
}
// ...then read the results back in a separate loop
for (int x = 0; x < 32; x++) {
    for (int z = 0; z < 32; z++) {
        glGetQueryObjectuivARB(cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);
    }
}




I don't know if that will help, but it might be a possibility. Better yet, stick some other work in between the two sets of loops to reduce the chance of your code stalling while it waits for the draw commands to finish.
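
For what it's worth, the two-phase idea could be sketched roughly like this. It is only a sketch under assumptions: `QueryApi`, `Cell`, `issueQueries` and `collectResults` are made-up names, not part of OpenGL, and the wrapper is there so the snippet compiles without a GL context. In a real renderer its members would presumably forward to glBeginQueryARB, glEndQueryARB and glGetQueryObjectuivARB.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical thin wrapper (not a real GL type). In a real program:
//   begin  -> glBeginQueryARB(GL_SAMPLES_PASSED_ARB, id)
//   end    -> glEndQueryARB(GL_SAMPLES_PASSED_ARB)
//   result -> glGetQueryObjectuivARB(id, GL_QUERY_RESULT_ARB, &count)
struct QueryApi {
    std::function<void(unsigned id)> begin;
    std::function<void()> end;
    std::function<unsigned(unsigned id)> result;
};

// Hypothetical per-cell data, mirroring OCQuery / OCcount from the thread.
struct Cell {
    unsigned query = 0;          // query object id
    unsigned samples = 0;        // samples that passed the depth test
    bool inFrustum = false;
    std::function<void()> draw;  // stands in for drawTerrain()
};

// Phase 1: issue every query, reading nothing back yet.
void issueQueries(const QueryApi& gl, const std::vector<Cell>& cells) {
    for (const Cell& c : cells) {
        if (!c.inFrustum) continue;
        gl.begin(c.query);
        if (c.draw) c.draw();
        gl.end();
    }
}

// Phase 2: read the results once all queries have been submitted.
// Cells outside the frustum were never queried, so they count as hidden.
void collectResults(const QueryApi& gl, std::vector<Cell>& cells) {
    for (Cell& c : cells) {
        c.samples = c.inFrustum ? gl.result(c.query) : 0u;
    }
}
```

Any other per-frame work would go between the call to `issueQueries` and the call to `collectResults`, so the GPU has time to finish before the first read.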

karwosts - nice to hear from you again ;-)

I'll have a look and see if that is the issue.

The problem might be that I am using:


GLuint OCQuery;
GLuint OCcount;



per terrain cell, instead of one big array that I can query later on, as in the spec:


do {
    glGetQueryObjectivARB(queries[ioccl], GL_QUERY_RESULT_AVAILABLE_ARB, &available);
} while (!available);



Let me check whether that is the issue.
Of course, replies are more than welcome in the meantime
if anyone knows anything about this subject.

Haha I never thought anybody would recognize me :P

Yeah I definitely think you'd want to have a separate occlusion query for each cube, or at least try to round-robin with multiple queries if there's some performance hit to having so many query objects. Using only one is going to cause horrible bottlenecking.

Also, I don't think I understand what this is supposed to be:


do {
    glGetQueryObjectivARB(queries[ioccl], GL_QUERY_RESULT_AVAILABLE_ARB, &available);
} while (!available);




If you just call glGet on the query object, doesn't it block until the result is available? I could see using this loop if you were doing something useful with your time in the meantime, but as written it seems pointless. You're asking the GPU "are you done yet?" over and over, when you could just call glGet once, which amounts to "tell me when you've got the result".
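
To make that point concrete, here is a rough sketch of polling only while there is something useful to do, and falling back to a single waiting read otherwise. The names `AvailabilityApi` and `readWhenReady` are hypothetical, and the wrapper only stands in for the real calls so the snippet builds without a GL context. It assumes that reading GL_QUERY_RESULT_ARB waits for the result while GL_QUERY_RESULT_AVAILABLE_ARB returns immediately.

```cpp
#include <functional>

// Hypothetical wrapper (not a real GL type). In a real program:
//   available -> glGetQueryObjectivARB(id, GL_QUERY_RESULT_AVAILABLE_ARB, &flag)
//   result    -> glGetQueryObjectuivARB(id, GL_QUERY_RESULT_ARB, &count),
//                which waits until the result is ready
struct AvailabilityApi {
    std::function<bool(unsigned id)> available;
    std::function<unsigned(unsigned id)> result;
};

// Polls the query only while doWorkSlice() reports that it did useful work.
// doWorkSlice is a hypothetical callback: it performs one small unit of
// other work and returns false once nothing is left to do.
// slicesDone receives the number of work units completed while waiting.
unsigned readWhenReady(const AvailabilityApi& gl, unsigned id,
                       const std::function<bool()>& doWorkSlice,
                       unsigned& slicesDone) {
    slicesDone = 0;
    while (!gl.available(id)) {
        if (!doWorkSlice()) {
            break;  // nothing left to overlap, so stop polling
        }
        ++slicesDone;
    }
    // Returns at once if the result is ready; otherwise this one call waits,
    // which is no worse than spinning on the availability flag.
    return gl.result(id);
}
```

With no work to overlap, this collapses to a plain glGet-style read, which is what the empty polling loop amounts to anyway.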

karwosts,

have a look at the examples here:

http://www.opengl.org/registry/specs/ARB/occlusion_query.txt


// Do other work until "most" of the queries are back, to avoid
// wasting time spinning
i = N*3/4; // instead of N-1, to prevent the GPU from going idle
do {
    DoSomeStuff();
    glGetQueryObjectivARB(queries[i],
                          GL_QUERY_RESULT_AVAILABLE_ARB,
                          &available);
} while (!available);



It seems like I should keep the queries in one array and ask whether its results are available, instead of cycling through a "for" loop and trying to read each result right away.

The bottleneck in my code might be this:


for..
    for..
        glGetQueryObjectuivARB( cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);



and I think I should use


glGetQueryObjectivARB(queries[i],
                      GL_QUERY_RESULT_AVAILABLE_ARB,
                      &available);



and, once available is true, go on and read the results...

Kind of crazy, but I'm in the middle of implementing this.
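
A rough sketch of that plan, modelled on the spec example quoted above: do other work until the query about three quarters of the way through the array reports a result, then read every result in order. `ArrayQueryApi`, `gatherResults` and `doSomeStuff` are hypothetical names, and the wrapper replaces the real GL calls so the snippet compiles on its own.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical wrapper (not a real GL type). In a real program:
//   available -> glGetQueryObjectivARB(id, GL_QUERY_RESULT_AVAILABLE_ARB, &flag)
//   result    -> glGetQueryObjectuivARB(id, GL_QUERY_RESULT_ARB, &count)
struct ArrayQueryApi {
    std::function<bool(unsigned id)> available;
    std::function<unsigned(unsigned id)> result;
};

// queries holds the ids in the order the queries were issued.
// Returns one sample count per query, in the same order.
std::vector<unsigned> gatherResults(const ArrayQueryApi& gl,
                                    const std::vector<unsigned>& queries,
                                    const std::function<void()>& doSomeStuff) {
    std::vector<unsigned> samples(queries.size(), 0u);
    if (queries.empty()) {
        return samples;
    }
    // Watch the query at N*3/4 rather than the last one, as the spec example
    // does, so reading starts before the GPU has run out of work.
    const std::size_t i = queries.size() * 3 / 4;
    while (!gl.available(queries[i])) {
        doSomeStuff();  // other per-frame work goes here
    }
    // Earlier queries should be finished by now; later ones may still wait
    // briefly inside result().
    for (std::size_t k = 0; k < queries.size(); ++k) {
        samples[k] = gl.result(queries[k]);
    }
    return samples;
}
```

The per-cell OCQuery ids could be copied into such an array once at startup, so the per-cell fields and the array are not mutually exclusive.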

Well, go ahead and keep playing with it and see what gets you the best result. I agree that you want to use an array, but if you're going to poll whether the object is done, then you at least have to make use of that time somehow. It does no good to just spin in a loop querying over and over when a plain read of the result will wait until it is ready anyway.

Also, I'd suggest that you don't want any glGet command inside your drawing loop. If you don't need the result at that point, then I think calling glGet there will stall your command buffering; it's best to dispatch all the draw commands in one go instead of switching back and forth between draw calls and state queries.
