glBindTexture - what is the technical reason that it is slow?

5 comments, last by vlj 9 years, 1 month ago

I've been looking all over the place for an answer to this question and I'm only getting bits and pieces of the answer. I learned the hard way a long time ago that calling glBindTexture is relatively slow; I found out from a friend of mine that it was because I was binding the texture every time I drew a triangle lol! The simple fact is that calling glBindTexture comes at a price, and minimizing calls is always the best choice!
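For what it's worth, the "minimize calls" advice can be sketched as follows: sort the draws by texture and skip binds that would not change anything. This is only a CPU-side illustration under stated assumptions. `DrawItem`, `FakeContext`, and `submitSorted` are hypothetical names, and `FakeContext::bind` stands in for `glBindTexture` so the sketch runs without a GL context.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw record: which texture a batch of triangles uses.
struct DrawItem {
    std::uint32_t texture;      // texture object name, as passed to glBindTexture
    std::uint32_t firstVertex;
    std::uint32_t vertexCount;
};

// Stand-in for the GL context (an assumption, so this runs without a GPU).
// In a real renderer bind() would call glBindTexture(GL_TEXTURE_2D, tex).
struct FakeContext {
    std::uint32_t bound = 0;    // 0 is used here to mean "nothing bound yet"
    std::size_t bindCalls = 0;
    std::size_t drawCalls = 0;

    void bind(std::uint32_t tex) {
        if (tex == bound) return;   // skip redundant binds
        bound = tex;
        ++bindCalls;
    }
    void draw(const DrawItem&) { ++drawCalls; }
};

// Sorts draws by texture so each texture is bound once, then submits them.
// Returns the total number of bind calls recorded on the context.
inline std::size_t submitSorted(std::vector<DrawItem> items, FakeContext& ctx) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.texture < b.texture;
                     });
    for (const DrawItem& item : items) {
        assert(item.texture != 0 && "0 is reserved in this sketch");
        ctx.bind(item.texture);
        ctx.draw(item);
    }
    return ctx.bindCalls;
}
```

Five draws that use three distinct textures end up issuing three binds instead of five. Sorting like this only works when draw order does not matter, so blended or otherwise order-dependent geometry would need a different scheme.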

Anyway, there's a reason it's slower than most other function calls that change OpenGL state, and I'm looking for the technical answer.

So far, with my limited understanding of hardware in general, my best guess is that every time you call glBindTexture, the hardware copies the texture from texture RAM into the texture cache, because the cache is faster to read from. Although the copy is fast in its own right, copying larger textures should take longer than copying smaller ones, and if done enough times, no matter the size of the texture, it will become a performance bottleneck.

Yes/No? Is there more to it than that?

There's no copying going on, generally, unless your GPU is very memory constrained.

It's mostly a driver issue and more to do with flushing states out to the execution cores, how shaders are scheduled, how the resource binding tables are configured, flipping read/write states on textures automagically for you, swapping out sampler states (since those are embedded into the texture in older OpenGL versions), and so on.

In other words, the hardware side can be super, super fast. It's the driver - and the specific call you're referring to, glBindTexture - that is slow. Switch to bindless approaches and you'll see it get much faster.

Sean Middleditch – Game Systems Engineer – Join my team!

I'm a little confused. Is glBindTexture still slow, but not for the reason I thought, and it's been replaced by a different function: a bindless function?

Never mind, it just sank in. It doesn't seem like those would be the reasons for it being so slow, but I don't really understand the hardware.

What's the bindless function thing you're talking about?

Modern hardware doesn't bind texture slots. It has a huge giant array of resources. Some hardware uses separate arrays for different kinds of resources while some hardware mixes a certain subset of resources into one array.

Bindless is a feature of very recent OpenGL and Direct3D versions. All that "bindless" means in this case is that you don't have to call the various 'glBind*' functions to swap resources; instead, your shader indexes into these arrays to get at the resources. It's slightly more complicated than that, but not too much.

You may have to use texture arrays to simulate bindless textures for compatibility reasons. If you just google "opengl bindless textures" you'll find plenty of material on the subject. Google "AZDO OpenGL" (AZDO stands for "Approaching Zero Driver Overhead", the title of a presentation from a few years back) and you'll find material on both bindless resources and some other modern GPU programming primitives.
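To make the texture array fallback a bit more concrete, here is a rough sketch of the CPU-side bookkeeping it tends to need. All layers of one array texture share a size and format, so textures are grouped by that layout and each one is given an (array, layer) pair that a draw can carry instead of a texture name. `LayoutKey`, `Slot`, and `ArrayTextureAllocator` are hypothetical names, and no GL calls are made here. In a real program each array would be a `GL_TEXTURE_2D_ARRAY` object and the shader would sample it through a `sampler2DArray` using the layer index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Layers of one array texture must share size and format, so that is the key.
struct LayoutKey {
    std::uint32_t width, height, format;
    bool operator<(const LayoutKey& o) const {
        return std::tie(width, height, format) <
               std::tie(o.width, o.height, o.format);
    }
};

// What a draw would carry instead of a texture name.
struct Slot {
    std::uint32_t arrayIndex;   // which array texture
    std::uint32_t layer;        // which layer inside it
};

class ArrayTextureAllocator {
public:
    // maxLayers is an assumed per-array cap; real code would query
    // GL_MAX_ARRAY_TEXTURE_LAYERS or pick a smaller budget.
    explicit ArrayTextureAllocator(std::uint32_t maxLayers)
        : maxLayers_(maxLayers) {
        assert(maxLayers_ > 0);
    }

    // Hands out the next free layer for this layout, opening a new
    // array when none exists yet or the current one is full.
    Slot allocate(const LayoutKey& key) {
        auto it = open_.find(key);
        if (it == open_.end() || layersUsed_[it->second] == maxLayers_) {
            layersUsed_.push_back(0);
            const auto index =
                static_cast<std::uint32_t>(layersUsed_.size() - 1);
            open_[key] = index;
            it = open_.find(key);
        }
        const std::uint32_t arrayIndex = it->second;
        const std::uint32_t layer = layersUsed_[arrayIndex]++;
        return Slot{arrayIndex, layer};
    }

    std::size_t arrayCount() const { return layersUsed_.size(); }

private:
    std::uint32_t maxLayers_;
    std::map<LayoutKey, std::uint32_t> open_;  // layout -> array still accepting layers
    std::vector<std::uint32_t> layersUsed_;    // layers handed out, per array
};
```

With slots like these, draws that share an array can be issued back to back without rebinding, and only the layer index changes per draw.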

Sean Middleditch – Game Systems Engineer – Join my team!

Ha, so much has changed since I last wrote an OpenGL program! Thanks for the clarification and insight into the latest generation of graphics acceleration.

You can have a look at what a typical texture bind involves here:

http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/texobj.c

As you can see from _mesa_test_texobj_completeness, it's far from trivial.

There is also state revalidation involved when a glDraw* call is issued: the texture type must match the one expected by the shader, and the driver must add a barrier or a cache flush if the texture was previously written as the result of a draw call or a DMA transfer, for instance.
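As a very rough illustration of that kind of deferred work, here is a toy model. It is not Mesa's code and does not follow real GL error semantics. `ToyDriver`, `TextureObject`, and `TexTarget` are hypothetical names. A bind only marks state dirty, and the next draw pays for the checks: completeness, a type match against what the shader expects, and a flush if the texture was recently written by the GPU.

```cpp
#include <cstddef>

enum class TexTarget { Tex2D, TexCube };

struct TextureObject {
    TexTarget target;
    bool complete;      // e.g. all required mip levels are present
    bool writtenByGpu;  // recently a render target or DMA destination
};

// Toy driver: binds are cheap, validation is deferred to the next draw.
struct ToyDriver {
    TextureObject* bound = nullptr;
    TexTarget shaderExpects = TexTarget::Tex2D;
    bool dirty = true;
    std::size_t validations = 0;
    std::size_t flushes = 0;

    void bindTexture(TextureObject* tex) { bound = tex; dirty = true; }
    void useProgram(TexTarget expects) { shaderExpects = expects; dirty = true; }

    // Returns true when state validated cleanly. Returning false is a
    // simplification meaning "the driver hit an error or slow path".
    bool draw() {
        if (dirty) {
            ++validations;
            if (bound == nullptr || !bound->complete) return false;
            if (bound->target != shaderExpects) return false;
            if (bound->writtenByGpu) {
                ++flushes;                   // stand-in for a barrier or cache flush
                bound->writtenByGpu = false;
            }
            dirty = false;
        }
        return true;  // a real driver would now emit the draw commands
    }
};
```

In this model, two draws in a row with no state change validate once, while a bind or a program switch between them forces the checks to run again. That is one way to picture why frequent binds add up.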
