

Chris_F

Posted 20 March 2014 - 01:20 PM

 

 

A texture array is a native resource type designed to let you access pixels in a big 3d volume of mip-mapped 2d slices.

 

An array of samplers is syntactic sugar over declaring a large number of individual samplers.

It's the same as having sampler0, sampler1, sampler2, etc. If you're using them in a loop that the compiler can unroll, they're fine. If you're randomly indexing into the array, then I'd expect waterfalling to occur (i.e. if (i==0) use sampler0; else if (i==1) use sampler1; etc.)

 

I'm sure on old hardware that is the case. Old hardware doesn't even support bindless textures, so that is not really an issue anyway.

 

A quick experiment that renders 1024 random sprites per frame from a set of 32 images shows a performance difference of about 0.05 ms when comparing a GL_TEXTURE_2D_ARRAY to an array of bindless handles stored in an SSBO.

 

Edit: I just changed the SSBO to a UBO and now there is no performance difference at all.

 

That's really cool. What kind of GPU are you using?

I'd be interested to see what your shader code looks like... You've got an array of samplers inside a UBO, and then you just index that array when fetching texture samples?

 

I wonder if these performance characteristics are reliable for all vendors that support the necessary extensions... Which vendors do support bindless; is it just nVidia?

 

 

Basically:

#version 430
#extension GL_ARB_bindless_texture : require

layout(binding = 0, std430) readonly buffer texture_buffer
{
    sampler2D textures[];   // bindless handles uploaded from the C side
};

in vec2 tex_coord;
flat in int id;
out vec4 frag_color;

void main()
{
    frag_color = texture(textures[id], tex_coord);
}
GLuint textures[32] = { 0 };
GLuint64 texture_handles[32] = { 0 };
glGenTextures(32, textures);

for (int i = 0; i < 32; ++i) {
    GLuint texture = textures[i];
    // Allocate immutable storage: 8 mip levels of 128x128 sRGB.
    glTextureStorage2DEXT(texture, GL_TEXTURE_2D, 8, GL_SRGB8_ALPHA8, 128, 128);
    // Each 128x128 RGBA8 image is 128 * 128 * 4 = 65536 bytes.
    glTextureSubImage2DEXT(texture, GL_TEXTURE_2D, 0, 0, 0, 128, 128, GL_RGBA, GL_UNSIGNED_BYTE, image_buffer + i * 65536);
    glGenerateTextureMipmapEXT(texture, GL_TEXTURE_2D);
    glTextureParameteriEXT(texture, GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTextureParameteriEXT(texture, GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTextureParameteriEXT(texture, GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTextureParameteriEXT(texture, GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
    // The handle must be made resident before a shader can sample through it.
    GLuint64 handle = glGetTextureHandleARB(texture);
    glMakeTextureHandleResidentARB(handle);
    texture_handles[i] = handle;
}

// Upload the 64-bit handles and bind the buffer to SSBO binding point 0.
GLuint buffer;
glGenBuffers(1, &buffer);
glNamedBufferStorageEXT(buffer, sizeof(texture_handles), texture_handles, 0);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, buffer);

Tested this on a GTX 760, which is based on the GK104 version of Kepler. To my knowledge Kepler is currently the only hardware that supports bindless textures, but AMD and Intel could have it soon enough. It will probably never be available on older GPUs, as it is really more of an "OpenGL 5" hardware feature.

