glDrawRangeElements and performance

4 comments, last by Ysaneya 16 years, 9 months ago
Hello! I'm having performance problems with glDrawRangeElements and VBOs. I've used a sampling profiler to see where most of the CPU time is spent, and about 60% is spent in glDrawRangeElements. Shouldn't glDrawRangeElements get executed on the GPU? And 2500 calls per frame shouldn't be too much, right? Another question: are texture wrap and filter modes per texture (as in you only have to set them once), or global?
Quote:Original post by patrrr
Hello! I'm having performance problems with glDrawRangeElements and VBOs. I've used a sampling profiler to see where most of the CPU time is spent, and about 60% is spent in glDrawRangeElements.
This means little by itself. If the rest of the code is about the same complexity, don't be surprised if DrawRangeElements takes a large share of the time in relative terms. 60% seems like a lot, but it really depends on the complexity of the whole system.
Quote:Original post by patrrr
Shouldn't glDrawRangeElements get executed on the GPU?
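Not necessarily. The call itself runs driver code on the CPU (validating state and queuing commands) before the GPU does any work. One thing that comes to mind is that there's a maximum range size, but I don't recall how to query it.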
Quote:Original post by patrrr
And 2500 calls per frame shouldn't be too much, right?
That depends again on the complexity of the system, but I believe it's a count that starts to need some attention. If the application is otherwise finished, then it's OK. If this is only the starting point, then it's probably worth a closer look.
Quote:Original post by patrrr
texture wrap and filter modes, are they per texture (as in you only have to set them once), or global?
They're set with glTexParameteri, which modifies the currently bound texture object, so they're per texture and you only have to set them once.

Previously "Krohm"

Thanks for your answer!

Yes, I wouldn't be worried if it didn't take 100% CPU just to render a simple scene at 60 FPS, and that's the problem. I should be able to get 700-1000 fps out of this scene.

Is it glGet with GL_MAX_ELEMENTS_INDICES you're referring to? None of my index buffers is bigger than that; they are pretty small.
I currently have one index buffer per glDrawRangeElements call, which means I have ~2500 buffers. Could this be the problem? Wouldn't glBindBuffer have a larger impact if that was the case?
Would I get a big performance increase by having one index buffer for all the static data, then offsetting glDrawRangeElements into that buffer?


Quote:Original post by patrrr
I should be able to get 700-1000 fps out of this scene.
Not necessarily. Drivers are tuned for an optimal "command queue length", and applications that are too simple can end up running slower than you would expect.
Quote:Original post by patrrr
Is it glGet with GL_MAX_ELEMENTS_INDICES you're referring to?
It seems so.
Quote:Original post by patrrr
I currently have one index buffer per glDrawRangeElements call, which means I have ~2500 buffers. Could this be the problem?
It's extremely likely. Don't even think about having one buffer per batch, unless the batch contains several tens of thousands of triangles or there's a good reason for it (dynamic content is a good reason).
Quote:Original post by patrrr
Wouldn't glBindBuffer have a larger impact if that was the case?
No, because BindBuffer doesn't need to do any real work (besides recording an internal "note") until some command needs to verify that the state is consistent. Draw calls require a consistent state, so the driver only kicks in at that point.
Calling BindBuffer once is likely to have similar overhead to calling it ten times in a row followed by the same draw call.
Quote:Original post by patrrr
Would I get a big performance increase by having one index buffer for all the static data, then offsetting glDrawRangeElements into that buffer?

Please avoid those extremes ;-) Consider capping the buffer size (a few megabytes is good for static content). If you can work with unique indices, that would be better: theoretically, some hardware may not support index offsets, but I'm not sure those devices are still around. As for setting the offset, it may have a similar negative performance impact, but I admit I'm not sure about that, and someone else will likely have more accurate knowledge.
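A minimal sketch of the shared index buffer idea discussed here, written as plain CPU-side bookkeeping so it can be read without a GL context. The names SharedIndexData, BatchRange and shared_index_append are hypothetical and not part of OpenGL; 16-bit indices and a caller-chosen capacity (standing in for the size cap mentioned above) are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record describing one batch inside the shared index data. */
typedef struct {
    size_t   byte_offset; /* where this batch's indices start, in bytes */
    uint32_t count;       /* number of indices in the batch */
    uint32_t min_vertex;  /* smallest vertex index used ("start" argument) */
    uint32_t max_vertex;  /* largest vertex index used ("end" argument) */
} BatchRange;

/* Hypothetical CPU-side staging area for the shared index data. */
typedef struct {
    uint16_t *indices;  /* caller-provided storage */
    size_t    capacity; /* size cap, measured in indices */
    size_t    used;     /* indices appended so far */
} SharedIndexData;

/* Appends one batch and fills in its range record.
 * Returns 0 on success, -1 if the batch is empty or exceeds the cap. */
int shared_index_append(SharedIndexData *dst, const uint16_t *src,
                        uint32_t count, BatchRange *out)
{
    assert(dst && src && out);
    if (count == 0 || count > dst->capacity - dst->used)
        return -1;

    /* Find the vertex range the batch touches. */
    uint32_t lo = src[0], hi = src[0];
    for (uint32_t i = 1; i < count; ++i) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }

    memcpy(dst->indices + dst->used, src, count * sizeof *src);
    out->byte_offset = dst->used * sizeof *src;
    out->count       = count;
    out->min_vertex  = lo;
    out->max_vertex  = hi;
    dst->used += count;
    return 0;
}

/* After uploading dst->indices once with glBufferData and binding that
 * buffer to GL_ELEMENT_ARRAY_BUFFER, a batch r would be drawn roughly as:
 *
 *   glDrawRangeElements(GL_TRIANGLES, r.min_vertex, r.max_vertex,
 *                       (GLsizei)r.count, GL_UNSIGNED_SHORT,
 *                       (const GLvoid *)r.byte_offset);
 */
```

Whether this actually helps depends on the driver; the sketch only shows how the offsets and ranges could be tracked, so that ~2500 small buffers become a handful of capped ones.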

Previously "Krohm"

Quote:
No, because BindBuffer doesn't need to do any real work (besides recording an internal "note") until some command needs to verify that the state is consistent. Draw calls require a consistent state, so the driver only kicks in at that point.
Calling BindBuffer once is likely to have similar overhead to calling it ten times in a row followed by the same draw call.

Maybe not. It's best to make as few GL calls as possible, and glBindBuffer followed by the gl****Pointer calls can be expensive.
Another possible problem is unsupported vertex formats.
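One way to act on the "as few GL calls as possible" advice is to order draws so that consecutive ones share a buffer. The sketch below is only an illustration under assumptions: DrawItem, sort_draws_by_buffer and count_buffer_switches are hypothetical names, and it counts how often the binding (plus the pointer setup that follows it) would have to change rather than issuing any GL calls.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical description of one queued draw call. */
typedef struct {
    unsigned buffer;      /* buffer object name the draw reads from */
    unsigned first_index; /* first index of the batch in that buffer */
    unsigned index_count; /* number of indices to draw */
} DrawItem;

/* qsort comparator: orders draws by buffer name. */
static int compare_by_buffer(const void *a, const void *b)
{
    const DrawItem *x = a;
    const DrawItem *y = b;
    return (x->buffer > y->buffer) - (x->buffer < y->buffer);
}

/* Groups draws that use the same buffer next to each other. */
void sort_draws_by_buffer(DrawItem *items, size_t n)
{
    qsort(items, n, sizeof *items, compare_by_buffer);
}

/* Counts how many times the buffer binding would have to change if the
 * draws were issued in array order. The first draw always counts. */
size_t count_buffer_switches(const DrawItem *items, size_t n)
{
    size_t switches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i == 0 || items[i].buffer != items[i - 1].buffer)
            ++switches;
    }
    return switches;
}
```

Comparing count_buffer_switches before and after sorting gives a rough idea of how many glBindBuffer and gl****Pointer sequences could be skipped; it says nothing about the actual cost of each, which is driver-dependent.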
Sig: http://glhlib.sourceforge.net
an open source GLU replacement library. Much more modern than GLU.
float matrix[16], inverse_matrix[16];
glhLoadIdentityf2(matrix);
glhTranslatef2(matrix, 0.0, 0.0, 5.0);
glhRotateAboutXf2(matrix, angleInRadians);
glhScalef2(matrix, 1.0, 1.0, -1.0);
glhQuickInvertMatrixf2(matrix, inverse_matrix);
glUniformMatrix4fv(uniformLocation1, 1, GL_FALSE, matrix);
glUniformMatrix4fv(uniformLocation2, 1, GL_FALSE, inverse_matrix);
Quote:Original post by patrrr
Thanks for your answer! Yes, I wouldn't be worried if it didn't take 100% CPU just to render a simple scene at 60 FPS, and that's the problem. I should be able to get 700-1000 fps out of this scene.


60 FPS? Are you sure you have vsync disabled?

Y.

