
OpenGL Texture compression DDS / S3TC


Hey,

About texture compression... I've never used it before, but I just wrote an (OpenGL) program that loads a DDS file. So far so good, but before really implementing it in my engine I'd like some good reasons. What I can think of:
- No time wasted generating mip-maps
- Smaller disk size
- Smaller files > faster loading from disk
- Extra features such as cube maps / 3D textures
- Allows bigger resolutions in the same memory budget

Then on the other hand...
- Lower quality
- Slower rendering?? Or is it actually faster (less bandwidth)??

The quality loss probably depends on the compression settings, and images that really need detail can still use uncompressed formats. But how bad is the loss really? I usually can't see a difference, though I haven't looked at many examples.

About the performance. I've been taught, 100 years ago, that decompressing takes time. Not sure how the video card does things, but does the decompression hit the performance? The only thing I read is that it can actually boost performance because less bandwidth is used. In my case I have quite a lot of surface textures (512 x 512, 1024 x 1024, resolutions like that).


One last question: how to calculate the video memory usage of a compressed texture? Is it equal to the amount of bytes of pixel data you read? Or does OpenGL / the GPU convert the data?

[quote]Lower quality[/quote]

Not necessarily. For the same memory footprint as a standard uncompressed texture, an S3TC (DXT1) compressed one can actually hold 8 times as many texels (DXT1 is 4 bits per pixel versus 32 for RGBA8) while still keeping very good quality. So you can fit a lot more micro detail into a texture, even if the colours are quantised a little - in most cases there's still not much visible difference between the S3TC-compressed and uncompressed versions.

[quote]Slower rendering?? Or is it actually faster (less bandwidth)??[/quote]

Actually it is faster: less bandwidth, and you also need less memory storage.

[quote]About the performance. I've been taught, 100 years ago, that decompressing takes time. Not sure how the video card does things, but does the decompression hit the performance?[/quote]

S3TC decompression is implemented in hardware on the GPU, so it doesn't take any more time than sampling a standard uncompressed texture.

[quote]How to calculate the video memory usage of a compressed texture?[/quote]

Memory usage = number of pixels * bits per pixel of the compressed format (DXT1/BC1 is 4 bits per pixel, DXT3 and DXT5 are 8) - i.e. the uncompressed size divided by the compression ratio.
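
To make that concrete: the S3TC formats store each 4x4 texel block in a fixed number of bytes (8 for DXT1/BC1, 16 for DXT3/BC2 and DXT5/BC3), so a quick sketch of the exact math (my own illustration, not from this thread):

[code]
#include <cstddef>

// Exact footprint of one mip level of a block-compressed texture.
// bytesPerBlock: 8 for DXT1/BC1, 16 for DXT3/BC2 and DXT5/BC3.
size_t CompressedMipSize(size_t width, size_t height, size_t bytesPerBlock)
{
    size_t blocksWide = (width  + 3) / 4;   // round up to whole 4x4 blocks
    size_t blocksHigh = (height + 3) / 4;
    return blocksWide * blocksHigh * bytesPerBlock;
}

// Example: a 1024x1024 DXT1 texture is 256 * 256 * 8 = 524,288 bytes (0.5 MB),
// versus 4 MB for the same texture as uncompressed RGBA8.
[/code]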

[quote]Or does OpenGL / GPU convert the data?[/quote]

The good thing about S3TC compression is that the GPU keeps the texture compressed in VRAM - i.e. it really is a lot smaller than a standard texture.
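
If you want to double-check that at runtime, OpenGL lets you ask the driver (a small sketch; textureId is assumed to be a compressed texture you have already uploaded):

[code]
GLint isCompressed = 0, bytesInVram = 0;
glBindTexture(GL_TEXTURE_2D, textureId);
// Both queries are per mip level (level 0 here); the size query is only
// valid when the level really is stored in a compressed format.
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_COMPRESSED, &isCompressed);
if (isCompressed)
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_COMPRESSED_IMAGE_SIZE, &bytesInVram);
[/code]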

Note that S3TC compression is a must for megatextures.

Btw, I think I still have some S3TC compression code lying around if you'd like to try encoding, but I don't know whether it's patented or for how long (I wrote it just out of curiosity and I even read the patent; my code is different from what they describe :P)


In general compression is going to be a win. It can significantly ease the burden on your memory usage, and it helps with bandwidth as well. The GPU can decode the blocks on the fly (S3TC is designed so that a 4x4 block is very simple to decode), so you don't need to worry about any added cost for the decode. You do have to be careful with quality, though: in certain cases it's very easy to end up with a really crappy-looking texture once you've compressed it.

Here are some general tips. I'm going to use the D3D10/D3D11 terminology for the different compression formats, since I don't know the corresponding OpenGL names (sorry!):

1. Use a good compressor! This is key; some are much better than others. For offline compression I recommend the ATI_Compress library. It's dead simple, it's multithreaded, and it has great quality. We used to use the NVIDIA texture tools at work since they're open source, but the quality was worse and it was significantly slower.

2. Have the option to opt out of compression for certain textures. You're bound to find a few cases where the hit in quality just isn't worth it... for instance, anything with a really smooth gradient usually ends up getting banded pretty badly.

3. Use the right format! BC1 has the lowest memory footprint for an RGB texture, but it doesn't have an alpha channel. BC2 and BC3 do have alpha channels, with different means of encoding them (most people just stick to BC3). For monochrome textures you'll want BC4: it has one high-quality channel (basically the alpha channel from BC3). For normal maps you'll want BC5, which has two high-quality channels (store XY and reconstruct Z in the shader; see the sketch after this list). If BC4 and BC5 aren't available on your target spec, you can use BC3 for normal maps and put X in the alpha channel and Y in the green channel, then reconstruct Z in the shader (those are the two channels with the most precision).

4. For color maps you can get a bit better quality by determining the min and max values in the texture and rescaling that range to [0, 1]. Then in the shader you use the min and max to scale it back to the original range.

5. BC6 and BC7 are really awesome (HDR and high-quality LDR respectively), but only available on DX11-class hardware. There's also not really any tools support for them yet. The D3DX library can encode to them, but it's super slow. There's also a sample in the SDK that does the encoding on the GPU using a compute shader, but it's pretty bare-bones and doesn't support cube maps or mipmaps.
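
To make the reconstruct-Z trick from tip 3 concrete, here is a minimal sketch (my own illustration, names and all, not code from this thread); the same two lines of math go in the fragment shader:

[code]
#include <algorithm>
#include <cmath>

// Reference for "store XY, reconstruct Z": the normal is unit length, so
// z = sqrt(1 - x^2 - y^2). In GLSL the equivalent is roughly:
//   vec2 xy = texture(normalMap, uv).rg * 2.0 - 1.0;
//   float z = sqrt(max(0.0, 1.0 - dot(xy, xy)));
float ReconstructNormalZ(float x, float y)   // x, y already remapped to [-1, 1]
{
    return std::sqrt(std::max(0.0f, 1.0f - x * x - y * y));
}
[/code]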

Hey Vilem! It's been a while. Sorry for forgetting to mail you a year ago. It suddenly got so busy with people offering help on T22 :)

Thanks for the info, MJP & Vilem. These arguments are good enough to put DDS support in the engine. I thought decoding would give a small hit, but having the hardware do it for "free" is awesome.

Images with compression quality issues can still keep using uncompressed formats. About that... aside from smooth gradients, are there particular cases that have quality problems?
- Textures using the alpha channel for transparency (foliages, metal fences, ...)
- NormalMaps with small details
- Images with a lot of small details, but not varying colors (a sand texture for example)
Most of the images we're using are indoor material textures such as concrete walls or wood floors, btw.

Not sure what BC1..7 stands for. While coding a bit I came across DXT1, DXT3 and DXT5. Are those the same?

[quote]Sorry for forgetting to mail you a year ago. It suddenly got so busy with people offering help on T22[/quote]

No problem, I watch your T22 blog from time to time and your project looks very very promising!

[quote]Not sure what BC1..7 stands for. While coding a bit I came across DXT1, DXT3 and DXT5. Are those the same?[/quote]

AFAIR BC* is the Direct3D 10 naming, while the DXT* names come from DX9 and are also what the OpenGL extensions use:

DXT1 is actually BC1
DXT3 is actually BC2
DXT5 is actually BC3

There are also DXT2 and DXT4 (the premultiplied-alpha variants), but those aren't commonly used and I'm not sure GL implements them. BC6 and BC7 are new things; I think BPTC is OpenGL's equivalent of BC7 in Direct3D.

Yeah, DX9 used to call them DXT* and the old OpenGL extensions also called them DXT, but from DX10 onwards they started calling them BC* (where BC stands for "block compression"). I have no idea whether the names have changed with the newer (3.x - 4.x) versions of OpenGL, which is why I just used the DX10 names. For quick reference:

BC1 - 5:6:5 RGB + 1-bit alpha. DXT1 in DX9/GL extensions. 6:1 compression ratio (vs. 24-bit RGB)
BC2 - 5:6:5 RGB + 4-bit explicit alpha (better for non-coherent alpha values). DXT3 in DX9/GL extensions. 4:1 compression
BC3 - 5:6:5 RGB + 8-bit interpolated alpha (better for coherent alpha values). DXT5 in DX9/GL extensions. 4:1 compression
BC4 - 8-bit interpolated single channel (R), basically the alpha block from BC3 on its own. Was previously known as ATI1N in DX9; I think it was called LATC1 in GL extensions. 2:1 compression
BC5 - 8-bit interpolated RG, basically two BC4 channels. Was previously known as ATI2N in DX9, LATC2 in GL extensions. Referred to as "3Dc" in ATI marketing. 2:1 compression
BC6H - 16:16:16 floating-point RGB. No idea what this is called in OpenGL. 6:1 compression ratio
BC7 - 4-7 bit RGB + 0-8 bit alpha. Actually a combination of different encoding modes, where the best mode is chosen for each 4x4 block to best represent the data. 4:1 compression ratio

BC4 = RED_RGTC1 in OpenGL
BC5 = RG_RGTC2 in OpenGL
from http://www.opengl.org/registry/specs/ARB/texture_compression_rgtc.txt (in Core from 3.0)

BC6H = BPTC_UNORM / SRGB_ALPHA_BPTC_UNORM
BC7 = BPTC_SIGNED_FLOAT / BPTC_UNSIGNED_FLOAT
from http://www.opengl.org/registry/specs/ARB/texture_compression_bptc.txt (in Core from 4.2)

Both extensions specifically mention compatibility with DirectX.
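
For reference, a hedged sketch of how the legacy DDS FourCC codes map onto those GL internal formats (the helper and macro names are mine; note that BC6H/BC7 files use the DX10 extended header rather than a FourCC):

[code]
#include <GL/glew.h>   // or any loader that exposes the S3TC/RGTC enums

// DDS stores the FourCC as a little-endian uint32 ('D' in the low byte).
#define DDS_FOURCC(a, b, c, d) \
    ((unsigned)(a) | ((unsigned)(b) << 8) | ((unsigned)(c) << 16) | ((unsigned)(d) << 24))

GLenum InternalFormatFromFourCC(unsigned fourCC)
{
    switch (fourCC)
    {
    case DDS_FOURCC('D','X','T','1'): return GL_COMPRESSED_RGBA_S3TC_DXT1_EXT; // BC1
    case DDS_FOURCC('D','X','T','3'): return GL_COMPRESSED_RGBA_S3TC_DXT3_EXT; // BC2
    case DDS_FOURCC('D','X','T','5'): return GL_COMPRESSED_RGBA_S3TC_DXT5_EXT; // BC3
    case DDS_FOURCC('A','T','I','1'): return GL_COMPRESSED_RED_RGTC1;          // BC4
    case DDS_FOURCC('A','T','I','2'): return GL_COMPRESSED_RG_RGTC2;           // BC5
    default:                          return 0; // unknown, or needs the DX10 header path
    }
}
[/code]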

[quote name='Martins Mozeiko' timestamp='1316122987' post='4862230']
BC6H = BPTC_UNORM / SRGB_ALPHA_BPTC_UNORM
BC7 = BPTC_SIGNED_FLOAT / BPTC_UNSIGNED_FLOAT
from [url="http://www.opengl.org/registry/specs/ARB/texture_compression_bptc.txt"]http://www.opengl.or...ession_bptc.txt[/url] (in Core from 4.2)
[/quote]

I think these are backwards

[quote name='MJP' timestamp='1316128777' post='4862260']
[quote name='Martins Mozeiko' timestamp='1316122987' post='4862230']
BC6H = BPTC_UNORM / SRGB_ALPHA_BPTC_UNORM
BC7 = BPTC_SIGNED_FLOAT / BPTC_UNSIGNED_FLOAT
from [url="http://www.opengl.org/registry/specs/ARB/texture_compression_bptc.txt"]http://www.opengl.or...ession_bptc.txt[/url] (in Core from 4.2)
[/quote]

I think these are backwards
[/quote]
Oops! Yes, you are right. BC6H = BPTC_xyz_FLOAT, BC7 = xyz_BPTC_UNORM.

One more question. I managed to create a DDS file with the ATI_Compress tool. Works like a charm, but... does anyone know how to make multiple mip-map levels? The DDS writer in ATI_Compress_Helper only writes one level by default. I could try to make my own writer, but maybe it can be done with some easy adjustments.

Thanks
Rick


ATI_Compress won't generate mips; you have to generate them first and then compress each mip level individually. In our texture pipeline we generate the mips ourselves, but you could use another library to do it if you want.
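
A rough sketch of that flow (the two helpers are placeholders for whatever resampler and encoder you plug in - ATI_Compress, NVTT or your own code - not a real API; shown uploading straight to GL, but the same loop can write the levels into a DDS file instead):

[code]
#include <GL/glew.h>
#include <algorithm>
#include <vector>

// Placeholders: plug in your own box filter and DXT encoder here.
std::vector<unsigned char> HalveRGBA(const std::vector<unsigned char>& rgba, int w, int h);
std::vector<unsigned char> EncodeDXT5(const std::vector<unsigned char>& rgba, int w, int h);

void UploadWithMips(std::vector<unsigned char> rgba, int w, int h)
{
    for (int mip = 0; ; ++mip)
    {
        // Compress this level, then hand the pre-compressed bits straight to GL.
        std::vector<unsigned char> dxt = EncodeDXT5(rgba, w, h);
        glCompressedTexImage2D(GL_TEXTURE_2D, mip, GL_COMPRESSED_RGBA_S3TC_DXT5_EXT,
                               w, h, 0, (GLsizei)dxt.size(), dxt.data());

        if (w == 1 && h == 1)
            break;

        // Downsample for the next level before compressing again.
        rgba = HalveRGBA(rgba, w, h);
        w = std::max(1, w / 2);
        h = std::max(1, h / 2);
    }
}
[/code]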

Definitely faster rendering; I've benchmarked it at up to 20% faster. Quality can drop off quite badly depending on the texture you're using, though; for something like lower-resolution 2D GUI textures it can be unacceptable.

Make sure that you load the DDS natively and that you're not going through any software decompression/recompression stages too.
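
In OpenGL terms, loading natively means handing the file's already-compressed bits to glCompressedTexImage2D for each mip level; passing raw RGBA to glTexImage2D with a GL_COMPRESSED_* internal format makes the driver re-encode at load time, which is slow and usually lower quality. A minimal sketch, with the parameters assumed to come from your DDS parser:

[code]
#include <GL/glew.h>

// One call per mip level, straight from the DDS file - no CPU decode/re-encode.
void UploadDDSLevel(GLuint tex, int mip, int w, int h,
                    GLenum internalFormat, GLsizei byteCount, const void* bits)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glCompressedTexImage2D(GL_TEXTURE_2D, mip, internalFormat, w, h, 0, byteCount, bits);
}
[/code]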

[quote name='MJP' timestamp='1316110690' post='4862152']
BC6 and BC7 are really awesome (HDR, and hi-quality LDR respectively), but only available on DX11-class hardware. There's also not really any tools support for it yet. The D3DX library can encode to it, but it's super slow. There's also a sample in the SDK that does the encoding on the GPU using a compute shader, but it's pretty bare bones and doesn't support cube maps or mipmaps
[/quote]

MJP, this post wasn't all that long ago, but wondering whether you've since come across a decent compression tool that supports BC6/7.

[quote name='360GAMZ' timestamp='1326100237' post='4900890']
[quote name='MJP' timestamp='1316110690' post='4862152']
BC6 and BC7 are really awesome (HDR, and hi-quality LDR respectively), but only available on DX11-class hardware. There's also not really any tools support for it yet. The D3DX library can encode to it, but it's super slow. There's also a sample in the SDK that does the encoding on the GPU using a compute shader, but it's pretty bare bones and doesn't support cube maps or mipmaps
[/quote]

MJP, this post wasn't all that long ago, but wondering whether you've since come across a decent compression tool that supports BC6/7.
[/quote]

Unfortunately not. I've been meaning to look into it more, but haven't gotten around to it yet. I do have a hacked-up version of the DX GPU compressor that will do BC6H for cube maps and mip levels that's sitting on my PC at home, but that's about it.
