misc: FRIR2, (Possible) Alpha + Theora and XviD

Posted by BGB, 02 December 2013 · 392 views

recently was working some on a new interpreter design I was calling FRIR2.

what is it?

basically a Three-Address-Code Statically-Typed bytecode format;
the current intention is mostly to make a bytecode which is at least theoretically viable to JIT compile into a form that is more performance-competitive with native code, mostly for real-time audio/video stuff, while still allowing scripts to be changed readily (no rebuild required, and possibly being able to tweak things interactively).

made some progress implementing it, but it still has a ways to go before it could be usable (and considerably more work before it is likely to be within the target range WRT performance).

not an immediate priority though.

ADD: FWIW, as-is FRIR2 ASM syntax will look something like:
neg.i r13, r9;             //2 byte instruction
add.i r14, r7, r11;        //3 byte instruction
...
neg.i r19, r23;            //5 byte instruction
add.i r42, r37, r119;      //6 byte instruction
...
neg.v3f r19, r23;          //6 byte instruction
add.v3f r42, r37, r119;    //7 byte instruction
...
mov.ic r3, 0
L0:
jmp_ge.ic r3, 10, L1
inc.i r3, r3
jmp L0
L1:

...

//with declarations:
var someVar:i;    //someVar is an integer
function SomeFunc:i(x:f, y:f)    //int SomeFunc(float, float)
{
    var z:f;
    add.f z, x, y;
    convto.f t0, z, 'i';
    ret.i t0;
}
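
to make the counting loop in the listing above a bit more concrete, here is a toy interpreter for just those four opcodes. this is only a sketch: the opcode semantics are guessed from the mnemonics (`mov.ic` loads a constant, `inc.i dst, src` sets dst = src + 1, `jmp_ge.ic` jumps when the register is >= the constant), and the `run` function and its text-based parsing are hypothetical, not how FRIR2 itself is implemented (which works on bytecode, not text).

```python
# Toy interpreter for the counting loop above. The opcode semantics are
# assumptions for illustration, not the real FRIR2 definition.

def run(program, num_regs=16):
    # First pass: record label positions ("L0:" style lines) and
    # split every other line into (opcode, [args]).
    labels = {}
    code = []
    for line in program:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = len(code)
        elif line:
            op, _, rest = line.partition(" ")
            args = [a.strip() for a in rest.split(",")] if rest else []
            code.append((op, args))

    regs = [0] * num_regs
    reg = lambda name: int(name[1:])      # "r3" -> 3
    pc = 0
    steps = 0
    while pc < len(code):
        op, args = code[pc]
        pc += 1
        steps += 1
        if op == "mov.ic":                # dst = immediate constant
            regs[reg(args[0])] = int(args[1])
        elif op == "inc.i":               # dst = src + 1
            regs[reg(args[0])] = regs[reg(args[1])] + 1
        elif op == "jmp_ge.ic":           # if reg >= constant, jump to label
            if regs[reg(args[0])] >= int(args[1]):
                pc = labels[args[2]]
        elif op == "jmp":                 # unconditional jump
            pc = labels[args[0]]
        else:
            raise ValueError("unknown opcode: " + op)
    return regs, steps

LOOP = [
    "mov.ic r3, 0",
    "L0:",
    "jmp_ge.ic r3, 10, L1",
    "inc.i r3, r3",
    "jmp L0",
    "L1:",
]

regs, steps = run(LOOP)
```

under these assumed semantics the loop leaves 10 in r3. a real implementation would dispatch on decoded bytecode rather than strings, but the three-address shape (explicit destination and source registers per instruction) is the same.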

otherwise, more idle thoughts for how to do alpha blending with Theora and XviD (within an AVI).

previously, I had tried the use of out-of-gamut colors, which while able to encode transparency, would do so with some ugly artifacts and limitations (namely violet bands and an inability to accurately encode colors for alpha-blended areas).

another possibility is to utilize some tricks similar to those used by Google for WebM, namely one of:
encode a secondary video channel containing alpha data (implementation PITA, little idea how existing video players will respond);
double the vertical resolution, encoding the extended information in the lower half, and indicating somehow that this has been done (would be handled via a special hack in the image decoder).


current leaning is toward the resolution-doubling strategy, as it is likely to be less effort.
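
as a rough sketch of what the decoder-side hack might look like: the decoded frame comes back at double height, and gets split into the normal color image (top) and the extended data (bottom). the function names and the choice to read alpha from the first component of the bottom-half pixel are assumptions for illustration only; no real codec API is being used here.

```python
def split_stacked_frame(frame):
    """Split a double-height frame into (color, extended) halves.

    frame is a list of rows; the top half is assumed to hold the normal
    color image and the bottom half the extra data (alpha in this sketch).
    """
    if len(frame) % 2 != 0:
        raise ValueError("stacked frame must have an even number of rows")
    half = len(frame) // 2
    return frame[:half], frame[half:]


def attach_alpha(color, extended):
    """Combine (r, g, b) pixels with alpha taken from the extended half.

    Assumption: alpha is read from the first component of each pixel in
    the bottom half; a real layout could pack it differently.
    """
    return [
        [(r, g, b, ext[0]) for (r, g, b), ext in zip(crow, erow)]
        for crow, erow in zip(color, extended)
    ]


# A 2x2 image stored as a 2-wide, 4-tall stacked frame.
stacked = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (9, 9, 9)],
    [(255, 255, 255), (128, 128, 128)],   # extended (alpha) rows start here
    [(0, 0, 0), (64, 64, 64)],
]
color, ext = split_stacked_frame(stacked)
rgba = attach_alpha(color, ext)
```

a player which does not know about the hack would simply show all four rows, which is the "double height" behavior described under compatibility.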

the main issue is likely how to best encode the use of the hack:
somehow hacking it into one of the existing headers (how to best avoid breaking something?...);
possibly add an extra chunk which would mostly have the role of indicating certain format extensions (would need to be handled in the AVI code and passed back to the codec code).

contents of the extended components:
most likely, DAE (Depth, Alpha, Exponent).

Depth: used for bump-maps, possibly also for generating normal-maps via a Sobel filter (or cheaper analogue), ignored otherwise;
Alpha: obvious enough;
Exponent: Exponent for HDR images, ignored for LDR.

likely, DAE would still be subject to RGB/YUV conversions (could be skipped if only alpha were used).
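
for the Depth component, generating a normal-map via a Sobel filter could look something like the following. this is a minimal sketch: the `strength` parameter, the clamped edge handling, and the sign convention (normals lean away from rising depth, Z pointing out of the surface) are all assumptions, not something fixed by the format.

```python
import math

def sobel_normal_map(depth, strength=1.0):
    """Return per-pixel unit normals (nx, ny, nz) from a 2D depth grid.

    Edge pixels are handled by clamping coordinates. `strength` is a
    hypothetical tuning knob that scales the slope.
    """
    h, w = len(depth), len(depth[0])

    def at(x, y):
        # Clamp to the image bounds so border pixels have neighbors.
        x = min(max(x, 0), w - 1)
        y = min(max(y, 0), h - 1)
        return depth[y][x]

    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # 3x3 Sobel kernels: horizontal and vertical gradients.
            gx = (at(x + 1, y - 1) + 2 * at(x + 1, y) + at(x + 1, y + 1)
                  - at(x - 1, y - 1) - 2 * at(x - 1, y) - at(x - 1, y + 1))
            gy = (at(x - 1, y + 1) + 2 * at(x, y + 1) + at(x + 1, y + 1)
                  - at(x - 1, y - 1) - 2 * at(x, y - 1) - at(x + 1, y - 1))
            nx, ny, nz = -gx * strength, -gy * strength, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            row.append((nx / length, ny / length, nz / length))
        normals.append(row)
    return normals


flat = sobel_normal_map([[5, 5, 5], [5, 5, 5], [5, 5, 5]])
ramp = sobel_normal_map([[0, 1, 2], [0, 1, 2], [0, 1, 2]])
```

a flat depth image gives normals pointing straight out, while a ramp tilts them along X. a cheaper analogue would just take the difference of the two horizontal and the two vertical neighbors instead of the full 3x3 kernels.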


compatibility?
resolution doubling at least should work without too much issue for existing video players and similar, but would double the height of the video for normal players (leaving all the alpha-related stuff in the bottom of the screen).

relevance?
Theora and XviD compress a little better than my BTIC2C format, so this could offer a better size/quality tradeoff, but likely worse decoding speeds (BTIC2C is roughly on-par with XviD as-is while already using an alpha channel);
unlike some other options, this would still not support specular or glow maps.

most likely, this is more relevant to video sequences than to animated textures, where raw RGB or RGBA is more likely to be sufficient.

still not sure if this is a big enough use-case to really bother with though.


performance?
this could potentially add a fairly significant cost to the color-conversion, doubling the number of pixels handled and potentially adding some extra filtering cost for normal-maps;
this should still be fast enough for 720p-equivalent resolutions though.

FWIW, the BGBTech-JPEG format has a similar cost (it supports alpha and normal maps via additional images embedded within the main image).


otherwise, went and added more video textures (to my game project):
water and slime now are video-mapped (using the BTIC1C codec, *);
ended up using 256x256 for the video-textures (was going to use 512x512, figured this was overkill);
discovered and fixed a few bugs (some engine related, a few minor decoder bugs in 1C discovered and fixed, ...);
made a lot of minor cosmetic tweaks (scaling textures, ...);
...

a minor tweak is that 1C will now try to "guess" the missing green and alpha bits based on the other bits;
basically, 1C normally stores RGB in 555 format (vs 565 as DXTn uses), so the low green bit is missing;
likewise for alpha, which is stored using 7 bits vs the usual 8.

in both cases, the guess is currently made by assuming that the low bit follows the high bit, so the high bit is copied into the missing low bit. while naive, this seems to be better than just leaving it as 0.
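
the bit-copying guess can be sketched as below. note that the bit layout used here (red in the top 5 bits, then green, then blue) and the function names are assumptions for the example; the actual packing inside 1C may differ.

```python
def expand_555_to_565(pixel555):
    """Expand a 15-bit RGB555 value to 16-bit RGB565.

    The missing low green bit is guessed by copying green's high bit.
    Assumed layout: rrrrr ggggg bbbbb (red in the top bits).
    """
    r = (pixel555 >> 10) & 0x1F
    g = (pixel555 >> 5) & 0x1F
    b = pixel555 & 0x1F
    g6 = (g << 1) | (g >> 4)          # low bit = copy of the high bit
    return (r << 11) | (g6 << 5) | b


def expand_alpha_7_to_8(alpha7):
    """Expand 7-bit alpha to 8 bits, again copying the high bit down."""
    a = alpha7 & 0x7F
    return (a << 1) | (a >> 6)
```

the main practical gain over leaving the bit as 0 is at the top of the range: full green and full alpha expand to 63 and 255 rather than 62 and 254, while zero still stays zero.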

the other option is preserving these bits, but the quality gain is not particularly noticeable vs the image size increase.


*: note, 1C and 2C are different formats. 1C uses an RPZA-based format (RPZA + Deflate + more features), whereas 2C is loosely JPEG-based (and does RGBA mostly by encoding 4-component YUVA images).

1C is primarily focused on decoding to DXTn. it is effectively LDR only (HDR is theoretically possible, but the size and quality from some tests is "teh suck"). when decoding to DXTn, it is drastically faster than most other options.

2C is mostly intended for intermediate video and HDR (it can do HDR mostly by encoding images filled with 16-bit half-floats, and/or using one of several fixed-point formats). speed and perceptual size/quality are a little worse than XviD or Theora, but the image quality is much higher at higher bitrates (ex: 30-70 Mbps).

decode speeds are "similar" to those of XviD (both are fast enough to do 1080p30, but 2C can do 1080p30 with HFloat+Alpha). generally, 2C is ~80 Mpix/sec vs ~105 Mpix/sec for XviD.

if XviD were used at 2x resolution to do alpha, this would likely cut the effective speed to around 53 Mpix/sec.
similar applies to Theora.
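
the arithmetic behind these figures, as a quick sketch. it takes the ~105 Mpix/sec figure as XviD's (which is what the ~53 Mpix/sec estimate implies), and assumes that stacking the extended data simply halves the rate of visible pixels; the extra color-conversion and filtering cost is not modeled, and the 30 fps used for 720p is an assumption.

```python
def required_mpix_per_sec(width, height, fps):
    """Pixel rate needed to play a stream in real time, in Mpix/sec."""
    return width * height * fps / 1e6


def effective_rate_with_stacked_alpha(decoder_mpix_per_sec):
    """Visible-pixel throughput when half of each frame holds extra data."""
    return decoder_mpix_per_sec / 2


need_1080p30 = required_mpix_per_sec(1920, 1080, 30)   # about 62.2
need_720p30 = required_mpix_per_sec(1280, 720, 30)     # about 27.6
xvid_stacked = effective_rate_with_stacked_alpha(105)  # 52.5, i.e. ~53
```

so under these assumptions, XviD at 2x resolution falls a bit short of 1080p30 with alpha, but stays comfortably above what 720p at 30 fps needs.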

note: BTJPEG is around 90 Mpix/sec for raw RGB images, and around 60 for RGB+Alpha, for similar reasons.

this leaves the advantage of XviD and Theora mostly in terms of better image quality at lower bitrates (IOW: not throwing 30+ Mbps at the problem...).




BTW, if anyone cares, new engine release available:

http://cr88192.dyndns.org:8080/wiki/index.php/BGB_Current_Status
