glGetQueryObjectuivARB - extremely slow

Started by
6 comments, last by razor1911 13 years, 6 months ago
Guys, yesterday I finally managed to implement occlusion culling in my engine, but the results are far from what I expected.

Here is what I do (as per the spec):

    void DRAW()
    {
        // Z-fill pass for occlusion
        for (int x = 0; x < 32; x++)
            for (int z = 0; z < 32; z++)
                if (cube[x][z]->infrustum)
                    cube[x][z]->drawTerrain();

        // Occlusion query pass: no color or depth writes
        glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
        glDepthMask(GL_FALSE);
        for (int x = 0; x < 32; x++)
            for (int z = 0; z < 32; z++)
            {
                if (cube[x][z]->infrustum)
                {
                    glBeginQueryARB(GL_SAMPLES_PASSED_ARB, cube[x][z]->OCQuery);
                    cube[x][z]->drawTerrain();
                    glEndQueryARB(GL_SAMPLES_PASSED_ARB);
                    glGetQueryObjectuivARB(cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);
                }
            }
        glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
        glDepthMask(GL_TRUE);

        // Final pass: shaders, lighting and so on
        for (int x = 0; x < 32; x++)
            for (int z = 0; z < 32; z++)
                if (cube[x][z]->OCcount > 0)
                    cube[x][z]->drawTerrain();
    }


Now, the thing is that glGetQueryObjectuivARB is killing my framerate: it drops from ~100 FPS to 45, and when I remove the call I get ~120 FPS.

I can run occlusion step 1, occlusion step 2 and the final render without glGetQueryObjectuivARB and still get ~120 FPS, but when I call glGetQueryObjectuivARB on each cube my framerate dies.


Why does that happen?
What is the 'trick' to utilising the power of occlusion culling? I seem to have fallen into a black hole with this. Why use it if it becomes such a pain?


Just to clarify - terrain is rendered with VBO and has a single call per cell (cube).



I'm gonna be honest: I know next to nothing about the ARB_occlusion_query extension. But I did look over your code and read a little snippet from the documentation.

Quote:
This extension solves both of those problems. It returns as its result the number of samples that pass the depth and stencil tests, and it encapsulates occlusion queries in "query objects" that allow applications to issue many queries before asking for the result of any one. As a result, they can overlap the time it takes for the occlusion query results to be returned with other, more useful work, such as rendering other parts of the scene or performing other computations on the CPU.


This doesn't really look like what you have:

    for (int x = 0; x < 32; x++) {
        for (int z = 0; z < 32; z++) {
            glBeginQueryARB(GL_SAMPLES_PASSED_ARB, cube[x][z]->OCQuery);
            cube[x][z]->drawTerrain();
            glEndQueryARB(GL_SAMPLES_PASSED_ARB);
            glGetQueryObjectuivARB(cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);
        }
    }


Any time you have an OpenGL program that involves a giant loop of set/get/set/get/set/get/... your performance is going to be screwed. Can you try something like this instead? It seems to be more in line with what the extension documentation recommends.

    // First issue every query...
    for (int x = 0; x < 32; x++) {
        for (int z = 0; z < 32; z++) {
            glBeginQueryARB(GL_SAMPLES_PASSED_ARB, cube[x][z]->OCQuery);
            cube[x][z]->drawTerrain();
            glEndQueryARB(GL_SAMPLES_PASSED_ARB);
        }
    }
    // ...then read all the results back.
    for (int x = 0; x < 32; x++) {
        for (int z = 0; z < 32; z++) {
            glGetQueryObjectuivARB(cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);
        }
    }


I don't know if that will help, but it might be a possibility. Or better yet, stick some other work in between the two sets of loops to reduce the chance of your code stalling while it waits for the draw commands to finish.
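To make the difference between the two loop layouts concrete, here is a minimal sketch that counts how often the CPU would have to wait for the GPU under each pattern. It is a toy model, not real OpenGL: `FakeGpu`, `submit`, `getResult`, `interleaved` and `batched` are hypothetical names, and the assumption that one blocking read lets the GPU finish everything queued so far is a simplification. It counts synchronization points only and says nothing about real frame times.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the driver (assumption, NOT real OpenGL).
// submit() only queues work; getResult() has to wait for the GPU
// if that query has not been processed yet.
struct FakeGpu {
    std::vector<unsigned> queued;  // sample counts of submitted queries
    std::size_t completed = 0;     // number of queries the GPU has finished
    int stalls = 0;                // times the CPU had to wait for the GPU

    // Models glBeginQueryARB + draw call + glEndQueryARB.
    std::size_t submit(unsigned samples) {
        queued.push_back(samples);
        return queued.size() - 1;
    }

    // Models glGetQueryObjectuivARB(..., GL_QUERY_RESULT_ARB, ...).
    unsigned getResult(std::size_t id) {
        assert(id < queued.size());
        if (id >= completed) {          // result not ready: CPU blocks
            ++stalls;
            completed = queued.size();  // GPU drains everything queued so far
        }
        return queued[id];
    }
};

// Pattern from the original post: read each result right after issuing it.
int interleaved(const std::vector<unsigned>& cells) {
    FakeGpu gpu;
    for (unsigned s : cells) gpu.getResult(gpu.submit(s));
    return gpu.stalls;
}

// Suggested pattern: issue every query first, then read them all back.
int batched(const std::vector<unsigned>& cells) {
    FakeGpu gpu;
    std::vector<std::size_t> ids;
    for (unsigned s : cells) ids.push_back(gpu.submit(s));
    for (std::size_t id : ids) gpu.getResult(id);
    return gpu.stalls;
}
```

Under this model a 32x32 grid pays one wait per cell when reads are interleaved with draws, and a single wait when the reads are batched. A real driver will behave less neatly, so treat it as an illustration of the idea rather than a prediction.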
My Projects:
Portfolio Map for Android - Free Visual Portfolio Tracker
Electron Flux for Android - Free Puzzle/Logic Game
karwosts - nice to hear from you again ;-)

I'll have a look and see if that is the issue.

The problem might be that I am using:

    GLuint OCQuery;
    GLuint OCcount;

per terrain cell, instead of one big array that I can query later on, as per the spec:

    do {
        glGetQueryObjectivARB(queries[ioccl], GL_QUERY_RESULT_AVAILABLE_ARB, &available);
    } while (!available);

Let me have a look to see if that is the issue. Of course, replies are more than welcome in the meantime, if anyone knows anything about this subject.
Haha I never thought anybody would recognize me :P

Yeah I definitely think you'd want to have a separate occlusion query for each cube, or at least try to round-robin with multiple queries if there's some performance hit to having so many query objects. Using only one is going to cause horrible bottlenecking.
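One way to picture the round-robin idea is a small ring of query slots that are handed out in rotation. The sketch below is only bookkeeping and makes no OpenGL calls: `QueryRing`, `acquire` and `markRead` are hypothetical names, and in real code each slot would presumably hold a `GLuint` created with `glGenQueriesARB`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hands out K query slots in rotation (hypothetical helper, not a GL API).
class QueryRing {
public:
    explicit QueryRing(std::size_t k) : inFlight_(k, false) { assert(k > 0); }

    // Returns the next slot in rotation. mustReadFirst is set to true when
    // that slot still holds a result nobody has read, which means the caller
    // has to fetch it (and possibly wait) before reusing the slot.
    std::size_t acquire(bool& mustReadFirst) {
        std::size_t slot = next_;
        next_ = (next_ + 1) % inFlight_.size();
        mustReadFirst = inFlight_[slot];
        inFlight_[slot] = true;
        return slot;
    }

    // Call after the slot's result has been read back.
    void markRead(std::size_t slot) {
        assert(slot < inFlight_.size());
        inFlight_[slot] = false;
    }

private:
    std::vector<bool> inFlight_;  // true while a slot has an unread result
    std::size_t next_ = 0;        // slot to hand out next
};
```

The smaller the ring, the sooner a slot comes around again with its result still unread, which is the bottleneck described above when only one query object is used.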

Also I don't think I understand what this is supposed to be:

    do {
        glGetQueryObjectivARB(queries[ioccl], GL_QUERY_RESULT_AVAILABLE_ARB, &available);
    } while (!available);


If you just call glGet on the query object, does it block until the result is available? I could see using this loop if you were doing something useful with your time in the meantime, but as written it seems pointless. You're asking the GPU "are you done yet" every millisecond, where you could just call glGet once, saying "tell me when you've got the result".
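The distinction between spinning on the availability flag and polling between chunks of useful work can be sketched as follows. Again this is a toy model rather than real OpenGL: `FakeQuery`, `pollsUntilReady` and `pollWhileWorking` are hypothetical, with `resultAvailable()` standing in for a `GL_QUERY_RESULT_AVAILABLE_ARB` check and `result()` for a `GL_QUERY_RESULT_ARB` read.

```cpp
#include <cassert>
#include <functional>

// Toy stand-in for a single query object (assumption, NOT real OpenGL).
struct FakeQuery {
    int pollsUntilReady;  // hypothetical latency, measured in polls
    unsigned samples;     // value the query will eventually report

    // Models glGetQueryObjectivARB(..., GL_QUERY_RESULT_AVAILABLE_ARB, ...):
    // returns immediately and never waits.
    bool resultAvailable() {
        if (pollsUntilReady > 0) --pollsUntilReady;
        return pollsUntilReady == 0;
    }

    // Models glGetQueryObjectuivARB(..., GL_QUERY_RESULT_ARB, ...), which
    // would wait if called early; here that case is simply asserted away.
    unsigned result() const {
        assert(pollsUntilReady == 0);
        return samples;
    }
};

// Polls only between chunks of useful work and returns how many chunks ran
// before the result became available.
int pollWhileWorking(FakeQuery& q, const std::function<void()>& doSomeStuff) {
    int chunks = 0;
    while (!q.resultAvailable()) {
        doSomeStuff();
        ++chunks;
    }
    return chunks;
}
```

If `doSomeStuff` is empty, this degenerates into the bare `do { ... } while (!available);` loop quoted above, which gains nothing over a single blocking read.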
karwosts,

have a look at the examples here:

http://www.opengl.org/registry/specs/ARB/occlusion_query.txt

    // Do other work until "most" of the queries are back, to avoid
    // wasting time spinning
    i = N*3/4; // instead of N-1, to prevent the GPU from going idle
    do {
        DoSomeStuff();
        glGetQueryObjectivARB(queries[i],
                              GL_QUERY_RESULT_AVAILABLE_ARB,
                              &available);
    } while (!available);


It seems like I should use one array and ask whether it has been filled, instead of cycling in a "for" loop and trying to read each result.

The bottleneck in my code might be the:

    for..for..glGetQueryObjectuivARB( cube[x][z]->OCQuery, GL_QUERY_RESULT_ARB, &cube[x][z]->OCcount);

and I think I should use

    glGetQueryObjectivARB(queries[i],
                          GL_QUERY_RESULT_AVAILABLE_ARB,
                          &available);

and when available is true, go on and read the results...

Kinda crazy, but I'm in the middle of implementing this.

Well go ahead and keep playing with it and see what gets you the best result. I agree that you want to use an array, but if you're going to poll if the object is done, then you at least have to make use of that time somehow. It does no good to just spin in a loop querying over and over when calling a read will wait for the result to be ready.

Also I'd suggest that you don't want any glGet command inside your drawing loop. If you don't need the result at that time, then I think calling glGet will screw up your command buffering (best to just dispatch all the draw commands in one swoop instead of switching back and forth between draw calls and state queries.)
Uhh, unfortunately it did not help.
Still, when occlusion culling is turned on, the framerate drops like mad.

Hmm.

Again, glGetQueryObjectuivARB is the biggest pain in the @55.

Removed that 'while' and it seems great now!

