Posted 21 January 2013 - 03:06 PM

Is the way 5.6.5 colors are expanded into 8 bits or normalised floats documented or standardized anywhere, specifically for DX9 and DX11?

So this could be done a few different ways. For Red and Blue the binary values can be divided by 31.0f to give a color in the range 0.0 to 1.0.

Alternatively the color could be expanded into 8 bits and then divided by 255. A common way of expanding bits to larger word sizes is to replicate the high-order bits into the low-order bits. So expanding the 5 bits in the word ABCDE to 8 bits gives ABCDEABC. Is this what occurs for DXTn?
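For reference, the replication scheme described above can be sketched in C roughly as follows. The names replicate_to_8, Rgb8 and expand_565 are made up for illustration, and the bit layout (red in bits 15..11, green in bits 10..5, blue in bits 4..0) is an assumption about how the 16-bit value is packed. This only illustrates the scheme being asked about; it does not claim to be what D3D or the hardware actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Expand a channel of `bits` width (4..8) to 8 bits by copying its
   high-order bits into the vacated low-order bits: ABCDE -> ABCDEABC. */
static uint8_t replicate_to_8(unsigned value, unsigned bits)
{
    assert(bits >= 4 && bits <= 8);
    assert(value < (1u << bits));
    unsigned shift_up = 8 - bits;           /* 3 for a 5-bit channel */
    unsigned shift_down = bits - shift_up;  /* 2 for a 5-bit channel */
    return (uint8_t)((value << shift_up) | (value >> shift_down));
}

/* Hypothetical 8-bit-per-channel result type. */
typedef struct { uint8_t r, g, b; } Rgb8;

/* Unpack a 5.6.5 word, assuming R in the top 5 bits and B in the low 5. */
static Rgb8 expand_565(uint16_t packed)
{
    Rgb8 out;
    out.r = replicate_to_8((packed >> 11) & 0x1Fu, 5);
    out.g = replicate_to_8((packed >> 5) & 0x3Fu, 6);
    out.b = replicate_to_8(packed & 0x1Fu, 5);
    return out;
}
```

With this sketch, expand_565(0xFFFF) gives (255, 255, 255) and a 5-bit value of 4 expands to 33.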

Posted 22 January 2013 - 11:06 AM

http://msdn.microsoft.com/en-us/library/windows/desktop/bb694531(v=vs.85).aspx

Thanks for the link but unfortunately it doesn't cover the details of the question.

Posted 23 January 2013 - 03:46 AM

The way K-bit-wide fixed-width channels holding unsigned integer values in [0, 2^K-1] (UNORM types) are interpreted is standard: the unsigned number k in the range [0, 2^K-1] represents the rational number k / (2^K-1).

This means that a 5-bit channel can represent the rationals 0/31, 1/31, 2/31, 3/31, ..., 30/31, 31/31 = 1. An 8-bit color channel can represent the rationals 0/255, 1/255, ..., 254/255, 255/255 = 1.
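As a minimal sketch of that interpretation in C (the function name unorm_to_double and the 16-bit upper limit are just choices made for this example, not anything from a D3D or OpenGL header):

```c
#include <assert.h>

/* Interpret k as a UNORM value of the given bit width:
   k in [0, 2^bits - 1] maps to k / (2^bits - 1) in [0.0, 1.0]. */
static double unorm_to_double(unsigned k, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    unsigned max_value = (1u << bits) - 1u;  /* 31 for 5 bits, 255 for 8 */
    assert(k <= max_value);
    return (double)k / (double)max_value;
}
```

So unorm_to_double(31, 5) and unorm_to_double(255, 8) both return exactly 1.0, and unorm_to_double(1, 5) returns the double nearest to 1/31.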

When encoding a rational represented as a floating point as an unsigned integer, we (usually) pick the integer that is nearest to the floating point number in question, since that minimizes the generated rounding error.
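A rough C sketch of that encoding step, assuming round-half-up via adding 0.5 and clamping of out-of-range inputs (both are choices made for this example; double_to_unorm is a made-up name):

```c
#include <assert.h>

/* Encode x in [0.0, 1.0] as the nearest UNORM code of the given width.
   Inputs outside the range are clamped; NaN is mapped to 0. */
static unsigned double_to_unorm(double x, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    unsigned max_value = (1u << bits) - 1u;
    if (!(x > 0.0)) return 0u;          /* negative, zero, or NaN */
    if (x >= 1.0) return max_value;
    /* Scale to [0, max_value] and round to nearest by adding 0.5
       before the truncating conversion. */
    return (unsigned)(x * (double)max_value + 0.5);
}
```

For example, double_to_unorm(4.0 / 31.0, 8) gives 33, because 4/31 * 255 is about 32.9.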

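Converting the other way, from a 5-bit or 8-bit integer to floating point, is lossless (but not necessarily exact), since floats have more precision: every distinct integer code maps to a distinct float, although values such as 1/31 have no finite binary expansion and are rounded to the nearest representable float.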

The method of "expanding bits" you specify is an optimization that does not have mathematical basis, and it introduces an error.

Here is a small code snippet that converts colors encoded as 5-bit UNORM to floating point and to their closest 8-bit representative, as well as directly 5-bit to 8-bit using the approximation you describe:

#include <stdio.h>

int main(void)
{
    for (int c = 0; c <= 31; ++c)
    {
        double d = c / 31.0;          // Convert 5-bit UNORM color to nearest double representation.
        int u = (int)(d * 255.0);     // Convert to 8-bit UNORM. Note: this conversion truncates toward zero rather than rounding to nearest.
        int u2 = (c << 3) | (c >> 2); // Approximate conversion from 5-bit directly to 8-bit (bit replication).
        printf("5-bit:%3d as UNORM:%.05g Stored as 8-bit:%3d approx 5-bit->8-bit: %3d\n", c, d, u, u2);
    }
    return 0;
}

The output for that is:

5-bit:  0 as UNORM:0 Stored as 8-bit:  0 approx 5-bit->8-bit:   0
5-bit:  1 as UNORM:0.032258 Stored as 8-bit:  8 approx 5-bit->8-bit:   8
5-bit:  2 as UNORM:0.064516 Stored as 8-bit: 16 approx 5-bit->8-bit:  16
5-bit:  3 as UNORM:0.096774 Stored as 8-bit: 24 approx 5-bit->8-bit:  24
5-bit:  4 as UNORM:0.12903 Stored as 8-bit: 32 approx 5-bit->8-bit:  33
5-bit:  5 as UNORM:0.16129 Stored as 8-bit: 41 approx 5-bit->8-bit:  41
5-bit:  6 as UNORM:0.19355 Stored as 8-bit: 49 approx 5-bit->8-bit:  49
5-bit:  7 as UNORM:0.22581 Stored as 8-bit: 57 approx 5-bit->8-bit:  57
5-bit:  8 as UNORM:0.25806 Stored as 8-bit: 65 approx 5-bit->8-bit:  66
5-bit:  9 as UNORM:0.29032 Stored as 8-bit: 74 approx 5-bit->8-bit:  74
5-bit: 10 as UNORM:0.32258 Stored as 8-bit: 82 approx 5-bit->8-bit:  82
5-bit: 11 as UNORM:0.35484 Stored as 8-bit: 90 approx 5-bit->8-bit:  90
5-bit: 12 as UNORM:0.3871 Stored as 8-bit: 98 approx 5-bit->8-bit:  99
5-bit: 13 as UNORM:0.41935 Stored as 8-bit:106 approx 5-bit->8-bit: 107
5-bit: 14 as UNORM:0.45161 Stored as 8-bit:115 approx 5-bit->8-bit: 115
5-bit: 15 as UNORM:0.48387 Stored as 8-bit:123 approx 5-bit->8-bit: 123
5-bit: 16 as UNORM:0.51613 Stored as 8-bit:131 approx 5-bit->8-bit: 132
5-bit: 17 as UNORM:0.54839 Stored as 8-bit:139 approx 5-bit->8-bit: 140
5-bit: 18 as UNORM:0.58065 Stored as 8-bit:148 approx 5-bit->8-bit: 148
5-bit: 19 as UNORM:0.6129 Stored as 8-bit:156 approx 5-bit->8-bit: 156
5-bit: 20 as UNORM:0.64516 Stored as 8-bit:164 approx 5-bit->8-bit: 165
5-bit: 21 as UNORM:0.67742 Stored as 8-bit:172 approx 5-bit->8-bit: 173
5-bit: 22 as UNORM:0.70968 Stored as 8-bit:180 approx 5-bit->8-bit: 181
5-bit: 23 as UNORM:0.74194 Stored as 8-bit:189 approx 5-bit->8-bit: 189
5-bit: 24 as UNORM:0.77419 Stored as 8-bit:197 approx 5-bit->8-bit: 198
5-bit: 25 as UNORM:0.80645 Stored as 8-bit:205 approx 5-bit->8-bit: 206
5-bit: 26 as UNORM:0.83871 Stored as 8-bit:213 approx 5-bit->8-bit: 214
5-bit: 27 as UNORM:0.87097 Stored as 8-bit:222 approx 5-bit->8-bit: 222
5-bit: 28 as UNORM:0.90323 Stored as 8-bit:230 approx 5-bit->8-bit: 231
5-bit: 29 as UNORM:0.93548 Stored as 8-bit:238 approx 5-bit->8-bit: 239
5-bit: 30 as UNORM:0.96774 Stored as 8-bit:246 approx 5-bit->8-bit: 247
5-bit: 31 as UNORM:1 Stored as 8-bit:255 approx 5-bit->8-bit: 255
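One caveat about the "Stored as 8-bit" column: assigning d * 255.0 to an int truncates toward zero, so those values are rounded down rather than to nearest. A hedged sketch of the same comparison using round-to-nearest in exact integer arithmetic is below (the helper names are made up for this example):

```c
#include <assert.h>
#include <stdio.h>

/* Nearest 8-bit UNORM code for a 5-bit UNORM code, computed as
   round(c * 255 / 31) in integer arithmetic (no ties can occur). */
static int round_5_to_8(int c)
{
    assert(c >= 0 && c <= 31);
    return (c * 255 * 2 + 31) / (2 * 31);
}

/* Bit replication, ABCDE -> ABCDEABC. */
static int replicate_5_to_8(int c)
{
    assert(c >= 0 && c <= 31);
    return (c << 3) | (c >> 2);
}

/* Stores the 5-bit codes where the two methods disagree in out[]
   (at most cap entries) and returns the total number of disagreements. */
static int find_mismatches(int *out, int cap)
{
    int count = 0;
    for (int c = 0; c <= 31; ++c) {
        if (round_5_to_8(c) != replicate_5_to_8(c)) {
            if (count < cap) out[count] = c;
            ++count;
        }
    }
    return count;
}

/* Prints each disagreement on its own line. */
static void print_mismatches(void)
{
    int codes[32];
    int n = find_mismatches(codes, 32);
    for (int i = 0; i < n; ++i)
        printf("5-bit:%3d nearest:%3d replicated:%3d\n", codes[i],
               round_5_to_8(codes[i]), replicate_5_to_8(codes[i]));
}
```

Measured against round-to-nearest, bit replication differs for only four of the 32 codes (3, 7, 24 and 28), and by one step each time.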

I can't find a page on MSDN that describes this outright, but you can find the same interpretation in the OpenGL specification PDFs, where it is explained with formal rigor.
