DXTn + AC & Deflate

Published June 20, 2013
well, I was working more on the DXTn based "BTIC" images.


ultimately, the downfall of the AC codec seems to be, once again, that its raw performance is a bit weak, and that it isn't "generally better" enough than other options (such as Deflate) to make the performance impact worthwhile.

granted, yes, it has demonstrated an ability to squeeze down JPEG images slightly in my tests, which could be worthwhile, but it has less of an advantage over Deflate for the BTIC images (where Deflate seems to get comparable compression to AC, but with higher performance). note that Deflate does not work effectively with JPEG images in my tests.

granted, with some fiddling I have gotten BTIC within comparable output size ranges to JPEG, albeit the image quality with photographic images is worse (BTIC results in more obvious blocking and patches).


note: some of the size/quality tradeoff does have to do with the block-reduction filtering, which I was fiddling with (basically, trying to improve the algorithm for matching and reducing the number of unique blocks used). further tweaking involved adding explicit support for "patch" images (more relevant to video), which will try (when possible) to match against the blocks used by the prior I-Frame, and try to create new blocks sparingly.
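as a rough illustration of the block-reduction idea (not the actual BTIC filter), here is a minimal sketch, assuming blocks are compared as flat tuples of 16 sample values with a sum-of-squared-differences error. the names `block_error`, `reduce_blocks`, `max_error`, and `seed` are made up for the example; `seed` stands in for the "patch" case, where blocks from the prior I-Frame are tried first and new blocks are only created when nothing is close enough.

```python
def block_error(a, b):
    """Sum of squared differences between two blocks of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reduce_blocks(blocks, max_error, seed=()):
    """Greedy reduction: map each block to the closest representative
    whose error is <= max_error, creating a new one only when none fits.

    seed: optional blocks carried over from a prior frame (hypothetical
    stand-in for matching against the previous I-Frame).
    """
    representatives = list(seed)  # unique blocks kept so far
    indices = []                  # per-input index into representatives
    for blk in blocks:
        best_i, best_e = -1, max_error + 1
        for i, rep in enumerate(representatives):
            e = block_error(blk, rep)
            if e < best_e:
                best_i, best_e = i, e
        if best_i < 0:
            # nothing close enough: this block becomes a new unique block
            representatives.append(blk)
            best_i = len(representatives) - 1
        indices.append(best_i)
    return representatives, indices
```

a real filter would likely compare decoded pixels from the DXTn blocks and use something faster than a linear search; this only shows the match-or-create decision.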

I am left wondering some if it could be possible to "precondition" DXTn blocks to compress better with Deflate, but this would introduce additional decoding complexity (the need to convert the blocks back into the format the GPU expects).


included are image comparisons between BTIC images at the current 50% quality, vs similarly-sized JPEG images.

the match-up at 50% in both cases is mostly coincidence. the pony image was previously closer to 25%, but altering the "choice" algorithm to go back to choosing DXT5 in this case seems to have made the BTIC image a bit larger, and thus closer to the 50% quality JPEG (either that, or it was a few tweaks made to the block-reduction algorithm).


on a related note:
got around to adding engine (video texture) support for both BTIC and Motion-PNG video.
unlike BTJ and BTIC video, M-PNG does not currently support extended components or layers, but does support lossless RGBA video.

note that BTIC is inherently lossy in its current form (unlike PNG, which is lossless, and BTJ, which supports both lossy and lossless modes).

or such...

Comments

Navyman

This is an area that I do not have a lot of experience in and have been enjoying your insight.

June 21, 2013 02:00 AM
cr88192

yeah. compression is one of those things that requires a lot of fiddling to work on.

some patterns hold though, for example, technologies that have worked well in the past will tend to work well in the future, and ones which haven't usually worked well will often continue not to.

however, there will sometimes be edge cases, especially in non-standard use-cases, where a different strategy may be more desirable.

hence me designing several codecs with arguably "awful" designs by conventional standards (IOW: poor size/quality tradeoffs, ...); sometimes there may be advantages to this sort of thing.

for example, the "industry standard" video codecs are typically ones based around the DCT or Hadamard transforms and the use of block-based motion compensation. while these get good size/quality tradeoffs, they are typically designed for opaque single-layer video and decoding to RGB.

however, a lot of the quality is wasted if converting to DXT (which doesn't really preserve image quality very well), and ideally we may want to deal with multiple concurrent streams rather than high-resolution high-quality streams, ...

likewise, my custom audio codec was designed primarily for random access to compressed audio, which would have been more complex with a more traditionally designed audio codec (never mind all the issues of trying to get reasonably acceptable audio quality from fixed-sized fixed-format audio blocks...).

granted, it could have been a little easier had I used bigger blocks (which would have resulted in less per-block header overhead, but oh well...). I did partly implement a version which would have gone from 256-bit to 1024-bit blocks, but it was more complex (and I was less happy with the added complexity).

June 21, 2013 04:30 AM
cr88192

ADD:

trivia, one thing the BTIC codec does do well is decode video at around 300 Mpix/second (for the Deflate version), and around 600 Mpix/second for the raw version (optimized build).

(for one of the videos, at 512x512, this is around 2200 frames/second decoding...).

so, it is presently one of my faster codecs...

June 21, 2013 05:53 AM
Navyman

Oh wow. How is it on larger resolutions? 1080p?

June 21, 2013 10:34 AM
cr88192

600 Mpix / second would mean around 300 frames/second at 1080p, and 300 Mpix/second would mean around 150 frames/second.
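the arithmetic behind these estimates, as a small sketch (assuming "Mpix" means 10^6 pixels and that decode cost scales linearly with pixel count; `frames_per_second` is just a name for the example):

```python
def frames_per_second(mpix_per_second, width, height):
    """Frames decoded per second at a given pixel throughput."""
    return mpix_per_second * 1_000_000 / (width * height)


for rate in (600, 300):
    for w, h in ((1920, 1080), (512, 512)):
        print(f"{rate} Mpix/s at {w}x{h}: "
              f"{frames_per_second(rate, w, h):.0f} frames/second")
```

for 1080p this works out to about 289 and 145 frames/second, so the "around 300" and "around 150" figures are rounded slightly up.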

granted, size/quality tradeoffs would be a bit worse than with other codecs.

IOW, block-reduced DXT isn't really high quality.

it is actually loosely analogous in some ways to how codecs like Cinepak worked.

the primary differences are mostly that Cinepak used YUV pixel-blocks, initially at 2x2 pixels (used to build a table of 4x4 pixel blocks), and then used a table of 4x4 pixel blocks to build the output.

in my case, the 4x4 pixel blocks are stored as DXT-blocks, and are handled by an LZ77 scheme, where basically the stream either directly encodes blocks, references individual prior blocks, or references spans of blocks. this is done relative to a sliding window. for P-Frames, the I-Frame is used as the starting-window, so the P-Frame can reference its blocks.
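a minimal sketch of that kind of block-level LZ77 decode, under made-up assumptions: the token kinds `"lit"`, `"ref"`, and `"span"` and the function `decode_blocks` are hypothetical and are not the real BTIC bitstream, and no limit on the window size is enforced here.

```python
def decode_blocks(tokens, start_window=()):
    """Decode a list of tokens into a list of blocks.

    tokens (hypothetical format):
      ("lit", block)         - a directly encoded block
      ("ref", dist)          - copy the single block `dist` blocks back
      ("span", dist, count)  - copy `count` blocks starting `dist` back
    start_window: blocks of the prior I-Frame when decoding a P-Frame.
    """
    window = list(start_window)  # everything references may point at
    base = len(window)           # output starts after the seeded blocks

    def copy_from(dist):
        if not 1 <= dist <= len(window):
            raise ValueError(f"bad distance {dist}")
        window.append(window[-dist])

    for tok in tokens:
        kind = tok[0]
        if kind == "lit":
            window.append(tok[1])
        elif kind == "ref":
            copy_from(tok[1])
        elif kind == "span":
            _, dist, count = tok
            for _ in range(count):
                # copied one block at a time, so a span may overlap
                # the blocks it is producing (as in byte-level LZ77)
                copy_from(dist)
        else:
            raise ValueError(f"unknown token {kind!r}")
    return window[base:]
```

seeding the window with the I-Frame's blocks is what lets a P-Frame consist mostly of references, with literal blocks only where something actually changed.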

note that using a table-driven strategy, and possibly tweaking how escape-coding is handled, could potentially allow for faster decoding speeds (and could also open more room for improving compression, by increasing the compressibility of DXT blocks, likely by splitting them up and using a per-frame color palette, ...).

ADD:

basically, a possible image structure something like:

palette, likely YUV based;

pixel-pattern table;

DXT1/5 block-table, each contains two palette references and a pixel-pattern reference;

DXT5 alpha-block table;

image-frame (block indices, likely LZ-like, or relying on Deflate for LZ).

each would be stored as an individual optionally Deflate-compressed lump.

ADD2:

some experimental results (mostly from trying things out) imply that the above would not buy much compression-wise. neither the pixel-pattern data nor the color data really seems to be compressible, and the two don't seem to be independent of each other either. palette reduction also seems to somewhat hurt image quality.

June 21, 2013 03:29 PM
cr88192

ADD3:

did add something that slightly works:

splitting up the raw block data and indices/LZ data.

while the DXT blocks still don't really compress (and are still typically the bulk of the image data), splitting up the data does at least cause the data to compress slightly better.
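the split itself can be sketched roughly like this, assuming Python's `zlib` module as the Deflate implementation (note it adds a small zlib header and checksum around the Deflate data). the 4-byte length prefixes and the names `pack_split` / `unpack_split` are invented for the example and say nothing about the actual BTIC container.

```python
import zlib


def pack_split(block_data: bytes, index_data: bytes) -> bytes:
    """Deflate the raw block data and the index/LZ data as two
    separate lumps, each prefixed by its compressed length."""
    out = bytearray()
    for lump in (block_data, index_data):
        z = zlib.compress(lump, 9)
        out += len(z).to_bytes(4, "little") + z
    return bytes(out)


def unpack_split(packed: bytes):
    """Inverse of pack_split: returns (block_data, index_data)."""
    lumps = []
    pos = 0
    for _ in range(2):
        n = int.from_bytes(packed[pos:pos + 4], "little")
        pos += 4
        lumps.append(zlib.decompress(packed[pos:pos + n]))
        pos += n
    return lumps[0], lumps[1]
```

the idea is that each lump then has more uniform statistics for Deflate to work with; how much this helps depends entirely on the data.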

note that each block is 4x4 pixels, and requires 64 bits.

each typical block contains:

2 x 16 bits: color data

32 bits: pixel data
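for reference, a small sketch of pulling those fields out of a 64-bit block, assuming the standard DXT1 layout (two little-endian RGB565 colors followed by 32 bits of 2-bit per-pixel selectors, pixel 0 in the lowest bits). the function names are just for the example.

```python
import struct


def unpack_dxt1(block: bytes):
    """Split an 8-byte DXT1 block into (color0, color1, selectors)."""
    c0, c1, bits = struct.unpack("<HHI", block)
    # 16 pixels, 2 bits each, pixel 0 in the lowest bits
    selectors = [(bits >> (2 * i)) & 3 for i in range(16)]
    return c0, c1, selectors


def rgb565_to_rgb888(c):
    """Expand a 16-bit RGB565 color to 8 bits per channel."""
    r = (c >> 11) & 31
    g = (c >> 5) & 63
    b = c & 31
    # replicate the high bits into the low bits so 31 -> 255, 63 -> 255
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))
```

the two 16-bit colors and the 32 selector bits are the "color data" and "pixel data" being discussed here.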

no means have yet been found to effectively decorrelate color and pixel data, nor to effectively predict either (note that at this point the image data has typically already been reduced to a few thousand "unique" blocks). splitting color and pixel data also has relatively little effect.

this likely mostly leaves more fiddling with the block-reduction filter.

one option could be more aggressively merging rare blocks (to reduce the amount of image data going into "one off" blocks, which will be made to tolerate a higher error).

June 22, 2013 08:45 PM