Vertex Array Object + Direct State Access


Started by Chris_F December 03, 2013 04:15 AM

19 comments, last by Kaptein 10 years, 4 months ago


December 14, 2013 02:49 AM

There is no viable alternative to VAOs though, which is why we are all so confused.

Cross-platform, no. But on NVidia, bindless can easily exceed the performance you get from VAOs, and the reason is intuitive.

You ideally just "enable" your 5 attribs, and then proceed to render 300 VBOs. You can't. It sucks.

Instead I have to create 300 VAOs, with 5 "enabled" attribs each. Then render 300 VAOs.

No you don't. Having a bazillion little VAOs floating around with all the cache misses that go with them isn't necessarily the best approach. Best case, use bindless (does an end-run around the cache issues, but is NV-only) or a streaming VBO approach with reuse (which keeps the bind count down, and works cross-platform).
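To make the "streaming VBO with reuse" idea a bit more concrete, here is a minimal sketch of just the bookkeeping, with no GL calls so it compiles on its own. The names `StreamCursor` and `StreamRegion` are hypothetical and not from this thread, and the 16-byte default alignment is an assumption. The idea is one large buffer with a write cursor that advances per batch; when the cursor would run past the end, it wraps to zero and tells the caller to orphan the buffer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: where to write the next batch inside one big streaming VBO.
struct StreamRegion {
    std::size_t offset;   // byte offset at which to write the batch
    bool        orphaned; // true when the cursor wrapped and the buffer should be orphaned
};

class StreamCursor {
public:
    explicit StreamCursor(std::size_t capacityBytes)
        : capacity_(capacityBytes), head_(0) {}

    // Reserve `bytes`, rounded up to `alignment` (assumed default: 16 bytes).
    StreamRegion reserve(std::size_t bytes, std::size_t alignment = 16) {
        assert(alignment != 0);
        const std::size_t size = (bytes + alignment - 1) / alignment * alignment;
        assert(size <= capacity_); // a single batch must fit in the buffer

        bool orphaned = false;
        if (head_ + size > capacity_) {
            // Out of room. In real GL code this is where you would orphan, e.g.
            // glBufferData(target, capacity, nullptr, GL_STREAM_DRAW), or map the
            // range with GL_MAP_INVALIDATE_BUFFER_BIT, before writing at offset 0.
            head_ = 0;
            orphaned = true;
        }
        const StreamRegion region{head_, orphaned};
        head_ += size;
        return region;
    }

private:
    std::size_t capacity_; // total size of the streaming buffer in bytes
    std::size_t head_;     // next free byte
};
```

Since every batch lives in the same buffer, the VBO bind (and the VAO, if the vertex format is the same) stays put and only the draw offset changes from batch to batch.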

If you have a number of separate, preloaded VBOs, you can't (or refuse to) make them large enough to keep you from being CPU bound, and you can't or won't use bindless, then fall back on the following, in order of the performance I've observed on NVidia: 1) VBOs with VAOs, one per batch state combination; 2) client arrays; or 3) VBOs without VAOs, or with one VAO for all.
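One way to read "one VAO per batch state combination" is a small cache keyed on whatever defines the combination. The sketch below assumes the key is (VBO, index buffer, vertex layout id); that choice, the `VaoCache` name, and the `create` callback are all hypothetical. The callback stands in for `glGenVertexArrays` plus the attribute setup, so the example runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <tuple>
#include <utility>

// Hypothetical key: the buffers and vertex layout that a batch uses.
using BatchKey = std::tuple<std::uint32_t /*vbo*/,
                            std::uint32_t /*ibo*/,
                            std::uint32_t /*layoutId*/>;

class VaoCache {
public:
    // `create` stands in for glGenVertexArrays + glBindVertexArray +
    // glEnableVertexAttribArray / glVertexAttribPointer for this key.
    using CreateFn = std::function<std::uint32_t(const BatchKey&)>;

    explicit VaoCache(CreateFn create) : create_(std::move(create)) {
        assert(create_ != nullptr);
    }

    // Return the VAO for this combination, building it on first use only.
    std::uint32_t get(std::uint32_t vbo, std::uint32_t ibo, std::uint32_t layoutId) {
        const BatchKey key{vbo, ibo, layoutId};
        auto it = cache_.find(key);
        if (it == cache_.end())
            it = cache_.emplace(key, create_(key)).first;
        return it->second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    CreateFn create_;
    std::map<BatchKey, std::uint32_t> cache_;
};
```

At draw time you would look up the VAO for the batch, bind it, and issue the draw; sorting batches by key first keeps redundant binds down.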

In case it's not obvious, I do what gives me the best performance. I'm not a core purist.

It's really unfortunate AMD hasn't stepped up and supported bindless in their OpenGL drivers, at least for launching batches (vertex attribute and index list specification).

Kaptein


December 14, 2013 01:20 PM

I unfortunately don't have that luxury. The world is fully 3D and I employ a large number of optimizations.

If there was any way at all to gain even 20 ns I would rewrite everything in a heartbeat. The only caveat is that it has to work on any OpenGL 3.x implementation.

Also 300 VBOs was just a number. It's more like 300 to 5000, depending on config.ini settings.

Ideally I could draw 5000 per frame at 60 fps. But that wouldn't be right, now would it.

It's usually around 2000, going down to 1000 after occlusion culling. (A frame behind)

The terrain is dynamic and is produced on the fly. When the player moves, the world "revolves." Combining VBOs is definitely possible. I just wonder whether it would end up severely affecting the GPU. Combining all terrain data of the same (sectoral) x,z (internally known as columns) is something I'm already doing. The world just isn't a plane.
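For reference, one way combining could look for a world that "revolves" is a fixed-slot pool inside a single large VBO: each column gets a slot, and the slot is recycled when the column scrolls out of range. This is only a sketch under assumptions that are not from this thread: every column fits in a fixed vertex budget, and the `ColumnSlotPool` name and interface are made up for illustration. It contains only the bookkeeping, no GL calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pool: one large VBO split into equal slots, one per terrain column.
class ColumnSlotPool {
public:
    ColumnSlotPool(std::size_t slotCount, std::size_t verticesPerSlot)
        : slotCount_(slotCount), verticesPerSlot_(verticesPerSlot) {
        // Push in reverse so that slot 0 is handed out first.
        for (std::size_t i = slotCount; i > 0; --i)
            free_.push_back(i - 1);
    }

    // Writes a free slot index to `slot`; returns false when the pool is full.
    bool acquire(std::size_t& slot) {
        if (free_.empty()) return false;
        slot = free_.back();
        free_.pop_back();
        return true;
    }

    // Call when a column scrolls out of range so its slot can be reused.
    void release(std::size_t slot) {
        assert(slot < slotCount_);
        free_.push_back(slot);
    }

    // Value to pass as `first` to glDrawArrays(mode, first, count).
    std::size_t firstVertex(std::size_t slot) const {
        return slot * verticesPerSlot_;
    }

private:
    std::size_t slotCount_;
    std::size_t verticesPerSlot_;
    std::vector<std::size_t> free_; // indices of currently unused slots
};
```

With this layout there is one VBO and one VAO for all terrain, and each visible column is a `glDrawArrays` call with a different `first`, which is available in any OpenGL 3.x context. The trade-off is wasted memory in slots that are not full.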
