compressing DXT5 textures, seemingly interesting results...

8 comments, last by cr88192 11 years, 1 month ago

I was interested, so I got around to throwing together a simplistic compressor for DXT5 (compressed textures).

this performs an additional compression pass over the normal DXTn compression, mostly for the sake of making the images smaller for storage on disk or similar (where they would be decompressed again before handing them off to the GPU).

this compressor doesn't use any entropy coding, so is basically a plain bytes-for-bytes encoder.

otherwise, it uses an LZ77 variant vaguely analogous to the one in Deflate, albeit block-based rather than byte based, and supporting a potentially much bigger sliding window (currently the same as the image size).

currently:


/*
DXTn packed images.

Each block will be encoded in the form (byte):
0 <block:QWORD>        Literal Block (single raw block), Value=0.
1-127                  Single byte block index (Value=1-127).
128-191    X           Two byte block index (16384 blocks, Value=128-16383).
192-223    XX          Three byte block index (2097152 blocks).
224-238    I           LZ/RLE Run (2-16 blocks, Index)
239        LI          LZ/RLE Run (Length, Index)
240        XXX         24-Bit Index
241        XXXX        32-Bit Index
242-246                Literal Blocks (2-6 raw blocks)
247        L           Literal Blocks (L raw blocks)
248-255                Reserved

The block index will indicate how many blocks backwards to look for a matching block (1 will repeat the prior block). Length/Index values will use the same organization as above, only limited to encoding numeric values.

Note that DXT5 images will be split into 2 block-planes, with the first encoding the alpha component, followed by the plane encoding the RGB components.
*/
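to make the tag layout a bit more concrete, here is a rough sketch of how a decoder might dispatch on the tag byte for a few of the cases (just a sketch with made-up names; the exact bias for the multi-byte indices and the way run indices are read are simplifying assumptions here, not the actual code):

#include <string.h>

/* sketch: decode one tag from 'in', appending one or more 8-byte blocks at *out.
   only a few cases are shown; the multi-byte index bias and the run-index
   encoding are simplified assumptions, not the real format. */
static const unsigned char *DecodeTagSketch(const unsigned char *in, unsigned char **out)
{
    unsigned char *o = *out;
    int tag = *in++;

    if (tag == 0) {                             /* literal block: 8 raw bytes follow */
        memcpy(o, in, 8); in += 8; o += 8;
    } else if (tag <= 127) {                    /* 1-byte index: repeat the block 'tag' blocks back */
        memcpy(o, o - 8 * tag, 8); o += 8;
    } else if (tag <= 191) {                    /* 2-byte index: 64*256 = 16384 possible values */
        int idx = ((tag - 128) << 8) | *in++;   /* assumed: no extra bias added */
        memcpy(o, o - 8 * idx, 8); o += 8;
    } else if (tag >= 224 && tag <= 238) {      /* LZ/RLE run of 2-16 blocks */
        int len = (tag - 224) + 2, i;
        int idx = *in++;                        /* simplified: real format reuses the index scheme above */
        for (i = 0; i < len; i++) {             /* block-by-block copy, so RLE-style overlap (idx=1) works */
            memcpy(o, o - 8 * idx, 8); o += 8;
        }
    }   /* ... remaining cases (3/4-byte indices, literal runs) follow the same pattern */

    *out = o;
    return in;
}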
 

interesting results:

it is generally spitting out smaller output images than their JPEG analogues.

this much was unexpected...

for example, a 512x512 image is compressing down to roughly 30kB, and a 64x64 image to slightly under 2kB.

the JPEG versions of each are 85kB and 3.5kB.

the PNG versions of each are 75kB and 7kB.

the raw DXT5 versions of each are 256kB and 4kB.

so, this much seems "interesting" at least.

compression is a little worse with UVAY textures (YUVA colorspace with Y in the alpha channel, UV in RG, and a mixed alpha and UV scale in B), but this is to be expected (results were ~ 39kB and ~ 3.8kB).

currently, the compression is a little slow (due mostly to the logic for searching for runs, which does a brute-force search). speeding this up should be possible via the use of hash chains and similar, but I am not currently doing so.
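for reference, the usual hash-chain idea over 8-byte blocks would look roughly like this (a generic sketch with made-up names, not the actual encoder code):

#include <stdint.h>
#include <string.h>

#define MF_HASH_BITS 16
#define MF_HASH_SIZE (1 << MF_HASH_BITS)

/* generic hash-chain sketch: head[] holds the most recent block position for each
   hash (-1 if none), prev[] links back to earlier positions with the same hash.
   head[] must be set to -1 and prev[] allocated (one entry per block) before use. */
typedef struct {
    int head[MF_HASH_SIZE];
    int *prev;
} MatchFinder;

static uint32_t HashBlock8(const uint8_t *blk)
{
    uint32_t h;
    memcpy(&h, blk, 4);                     /* cheap hash of the first 4 bytes */
    h *= 2654435761u;
    return h >> (32 - MF_HASH_BITS);
}

/* returns the distance (in blocks) back to a matching block, or 0 if none found */
static int FindMatchBlock(MatchFinder *mf, const uint8_t *blocks, int pos, int max_tries)
{
    uint32_t h = HashBlock8(blocks + pos * 8);
    int cand = mf->head[h], dist = 0;

    while (cand >= 0 && max_tries-- > 0) {
        if (!memcmp(blocks + cand * 8, blocks + pos * 8, 8)) { dist = pos - cand; break; }
        cand = mf->prev[cand];
    }

    mf->prev[pos] = mf->head[h];            /* link the current position into the chain */
    mf->head[h] = pos;
    return dist;
}

extending a match into a run would then just be comparing successive blocks at 'pos' and 'pos - dist'.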

decode speeds seem to measure in at approximately 1500 Mpix/s (~1.5 GB/s, since DXT5 is 1 byte per pixel). this is *much* faster than my JPEG decoder.

it is yet to be seen if I will use this for much, but it could be considered as a possible alternative to JPEG for certain tasks (probably with a header thrown on and put into a TLV container).

thoughts?...


You give some file size comparisons, but how was the image quality with those sizes? It would be interesting to see the L1 or L2 norms of the different compression techniques, but then again I strongly doubt that you are going to do better than wavelet-based coding techniques like JPEG2000, except for special cases maybe. But perhaps I am wrong and you can surprise me ;)

How about some comparison shots, including deltas? I can compress an image to 1 bit, and do that ultra fast :)

There could also be legal issues when compressing already-compressed image files (I can't find the link, but I remember trouble with an existing patent); at the least, I would do some research in this field.

I can compress an image to 1 bit, and do that ultra fast :)

Quick! Patent it before anyone steals your amazing compression algorithm!

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

There could also be legal issues when compressing already-compressed image files (I can't find the link, but I remember trouble with an existing patent); at the least, I would do some research in this field.

Some ex-nVidia employee slash apparent patent troll by the name of Doug Rogers has a patent called "High Compression Rate Texture Mapping". IMHO it's a dubious patent of an obvious idea, with plenty of implementations that pre-date his patent, but he was the first-to-file, giving him trolling rights.
Basically, if you treat the DXT blocks as individual items, and then compress them via quantization etc, this guy might harass you for money.

He tried to legally threaten Valve (indirectly, perhaps accidentally) because they were using crunch, but there was such an uproar that he backed down and promised to leave open source implementations alone (basically, if you're *making money* from "his" idea, he might try and muscle in).

[edit] Doug has taken issue with this post, accusing me of libel, so here's the squish author's post on the subject for some factual balance:
http://richg42.blogspot.com.au/2012/07/the-saga-of-crunch.html?m=1

I would link to Doug's blog where a lot of the discussion took place and where statements were made, but he took it down from the web shortly after promising to cooperate with open source projects.

Yes, I don't like it when patents step on the toes of game tech R&D. I was pretty shitty with you at the time, Doug. We're not used to patent folk on our lawns.
If there's a specific fact I've gotten wrong, then please correct me. My apologies for calling you an [expletive deleted] over actions which probably weren't intended to be malicious on your part, but were taken to be malicious by a large number of people.

Right, and it is such a shame because I know Doug Rogers from the Autodesk® FBX® forums where he helped me with a lot of issues when I was learning the Autodesk® FBX® SDK.

However I don’t believe he was suing Valve. He was suing one guy who works at Valve and happens to maintain crunch on the side.

And yes the method described here for further compressing DXT blocks sounds basically like his method. At the very least the wording of his patent will definitely encompass this algorithm. It was probably not in your plans anyway but keep in mind you won’t be able to sell it or he will come for you and your first-born.

L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

You give some file size comparisons, but how was the image quality with those sizes? It would be interesting to see the L1 or L2 norms of the different compression techniques, but then again I strongly doubt that you are going to do better than wavelet-based coding techniques like JPEG2000, except for special cases maybe. But perhaps I am wrong and you can surprise me ;)

side notes:

the algorithm is LZ77 based, and beyond the initial (lossy) conversion into DXTn, there shouldn't be any additional loss in quality (apart from potential encoder bugs).

typically, a person will compare the source data with the output decompressed data, and make sure that they match.

it is basically analogous to Deflate, but works in larger 8-byte units, and lacks any entropy coding (such as Huffman), and uses a larger window.

basically, each coded block consists of a tag-byte, followed by any operand bytes, and possibly followed by raw 8-byte blocks (the "literal block cases").

if a particular raw block has been seen previously, it is reused by index. if a matching span of blocks is found, it is also encoded as a run (this includes copying runs from the window, or potentially encoding an RLE run of repeated blocks).
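a stripped-down version of the encode loop, using only the literal and 1-byte-index cases from the format above (just a sketch with a limited search window, not the actual code):

#include <string.h>

/* sketch of a minimal encode loop over 8-byte blocks, emitting only the
   literal (tag 0) and 1-byte-index (tags 1..127) cases; the real encoder
   also emits runs and the longer index forms. returns the new output pointer. */
static unsigned char *EncodeBlocksSketch(const unsigned char *blocks, int nblocks,
                                         unsigned char *out)
{
    int pos, back;
    for (pos = 0; pos < nblocks; pos++) {
        const unsigned char *cur = blocks + pos * 8;
        int dist = 0;
        /* brute-force search backwards for an identical block (window limited to 127 here) */
        for (back = 1; back <= 127 && back <= pos; back++) {
            if (!memcmp(cur, cur - back * 8, 8)) { dist = back; break; }
        }
        if (dist) {
            *out++ = (unsigned char)dist;           /* tag 1..127: block index */
        } else {
            *out++ = 0;                             /* tag 0: literal block */
            memcpy(out, cur, 8); out += 8;
        }
    }
    return out;
}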

for DXT5, which uses 16-byte blocks, the image is split into two planes of 8-byte blocks, which are encoded as two passes.
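the plane split is basically just de-interleaving the 16-byte blocks, roughly like this (a sketch, not the actual code):

#include <string.h>

/* sketch: split DXT5 (16 bytes per 4x4 block) into an alpha plane and a color plane
   of 8-byte blocks each, so the two planes can be LZ-coded as separate passes. */
static void SplitDxt5PlanesSketch(const unsigned char *dxt5,
    unsigned char *alpha, unsigned char *color, int nblocks)
{
    int i;
    for (i = 0; i < nblocks; i++) {
        memcpy(alpha + i * 8, dxt5 + i * 16,     8);    /* alpha endpoints + 3-bit indices */
        memcpy(color + i * 8, dxt5 + i * 16 + 8, 8);    /* RGB565 endpoints + 2-bit indices */
    }
}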

granted, comparing this against Deflate could also make sense; I have yet to test/compare them.

granted, there is some possible variation in compression depending on how "good" the DXTn encoder is; in my case it is optimized more for speed than quality.

I was actually surprised that such a naive strategy was managing to do better (compression-wise) than the JPEG images, but there is a possible explanation:

DXTn itself is a bit more lossy than JPEG.

(it was mostly intended for "quick and dirty" compression, but did better than expected).

IOW: you probably won't want to be using DXTn for your photos, but it works well enough for game textures.

( so, it isn't really in the same camp as JPEG, JPEG-2K, or JPEG-XR ).

here is a link for the encoder being used for the DXT5 part:

http://pastebin.com/8rq3z5F5

and, for other information:

http://en.wikipedia.org/wiki/DXTn

http://en.wikipedia.org/wiki/LZ77

other possibilities exist, but this is all for now.

There could also be legal issues when compressing already-compressed image files (I can't find the link, but I remember trouble with an existing patent); at the least, I would do some research in this field.

Some ex-nVidia employee slash patent troll by the name of Doug Rogers has a patent called "High Compression Rate Texture Mapping". It's a dubious patent of an obvious idea, with plenty of implementations that pre-date his patent, but he was the first-to-file, giving him trolling rights.

Basically, if you treat the DXT blocks as individual items, and then compress them via quantization etc, this [expletive deleted] might harass you.

He tried to sue Valve because they were using crunch, but there was such an uproar that he backed down and promised to leave open source implementations alone (basically, if you're making money from "his" idea, he'll try and steal some of it).

good to know...

looking at the patent though, it appears mostly to apply to VQ or LZ78 compressors (which make use of an explicit dictionary or codebook), and I would think it may not apply to LZ77, which uses a sliding window instead (and I am currently not using any secondary quantization, apart from a fairly naive range-fit DXTn encoder...).

this particular code in my case is part of my "BTJPG" library, which I currently have under the MIT license.

for my engine itself, dunno. not exactly making money off this at the moment (currently it is donationware, but no one is making donations...).

ok, here are a few comparison images.

the PNG images are the originals, the first dump is a JPEG version (100% quality), and the "dump_dxt1" versions are after converting to DXT5, LZ compressing, decompressing, converting back to RGBA, and saving out a JPEG of the resulting image.

LZ compressed versions of these images are 29kB and 32kB, respectively.

the artifacts visible are due mostly to the conversion to DXT5 (otherwise I would kind of need to write a better-quality converter...).

(EDIT: uploading seems to have re-encoded the JPEGs, they are now smaller and no longer apparently at 100% quality...).

(EDIT2:

Packed DXTn Decoder:

http://pastebin.com/QK3ZeVJn

Encoder:

http://pastebin.com/UxnbXycA

)

slightly updated DXT textures, mostly after experimenting with fine-tuning the arithmetic and adding fixed-pattern dithering.

basically, I added a slight bias for even/odd pixels (via slight fiddling with the fixed-point arithmetic), mostly to cause pixels "on the edge" to alternate patterns, as well as some logic to better approximate the colors of flat areas via dithering. some fine-tuning of the color rounding was also added (it no longer simply truncates). a rough sketch of the idea follows the pattern below.

RMSE seems to be improved slightly for the sorts of images I am testing.

there seems to be little obvious adverse impact on the LZ compression.

this has little impact on performance.

dither pattern is basically:

- + - +
+ - + -
- + - +
+ - + -
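the bias ends up being roughly like this when quantizing down to RGB565 (a sketch of the idea with made-up names, not the actual encoder code):

/* sketch: 2x2 checkerboard bias of about half a quantization step, applied before
   reducing 8-bit channels to 5/6 bits, so pixels sitting right on a step edge
   alternate between the two nearest representable values. */
static unsigned short QuantizeRGB565DitherSketch(int r, int g, int b, int x, int y)
{
    int sign = ((x ^ y) & 1) ? 1 : -1;      /* the - + / + - checkerboard pattern above */
    int r5, g6, b5;

    r += sign * 4; g += sign * 2; b += sign * 4;    /* ~half a step for 5/6/5 bits */
    if (r < 0) r = 0; if (r > 255) r = 255;
    if (g < 0) g = 0; if (g > 255) g = 255;
    if (b < 0) b = 0; if (b > 255) b = 255;

    r5 = (r * 31 + 127) / 255;              /* round to nearest instead of truncating */
    g6 = (g * 63 + 127) / 255;
    b5 = (b * 31 + 127) / 255;
    return (unsigned short)((r5 << 11) | (g6 << 5) | b5);
}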

