The documentation says those bits are reserved...

17 comments, last by dave j 10 years, 10 months ago

Always validate your input.

It doesn't matter if you are writing SQL queries or a UI library, or a jpeg loader, or anything else mentioned above.

Validate your input. If the input is invalid you assert and fail.
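
As a rough illustration of what that could look like for a flags byte with reserved bits, here is a minimal sketch. The flag names and bit layout are made up for the example and are not from any real format, and it raises an exception instead of using a literal assert so the check survives optimized builds.

```python
# Hypothetical flag layout for illustration; not taken from any real format.
FLAG_COMPRESSED = 0x01
FLAG_HAS_ALPHA = 0x02
KNOWN_FLAGS = FLAG_COMPRESSED | FLAG_HAS_ALPHA
RESERVED_MASK = 0xFF & ~KNOWN_FLAGS  # bits 2..7 are reserved and must be zero


def validate_flags(flags: int) -> int:
    """Return flags unchanged if valid, otherwise raise ValueError."""
    if not 0 <= flags <= 0xFF:
        raise ValueError(f"flags byte out of range: {flags!r}")
    if flags & RESERVED_MASK:
        # Someone set a bit we reserved: fail loudly instead of guessing.
        raise ValueError(f"reserved bits set: {flags & RESERVED_MASK:#04x}")
    return flags
```

With this in place, `validate_flags(0x03)` passes through, while `validate_flags(0x10)` fails immediately rather than letting a reserved bit slip into later code.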


...that must mean we can use them.

I’ve honestly never thought of it that way. My impression has always been that anything flagged as “reserved” is short-hand for “reserved for our own future uses, not yours, so don’t touch them.”

I document my own engine heavily and I doubt I would feel bad if I said a bit was “Reserved” (with no further details) and then someone’s code started breaking because he or she thought that meant it was “Available to the user”. I’d explicitly say a bit was “Reserved for the user” if that is what I meant.


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

I wouldn't use the wording "reserved" for something the user is free to use =P

Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

yeah. I think it depends some on other considerations though, for example:

if the bits are reserved in a file-format which hasn't changed much in 20 or 30 years, and a person for whatever reason wants to add more features to it, sometimes it is necessary to resort to hacks like using reserved bits or invalid bit-combinations (and sometimes even more severe measures).

granted, yes, unless a person is careful about it, this can break compatibility with other implementations (so, it requires some level of care, and testing to validate that various other implementations still work if exposed to the new features, ...).

for example, my own hacked JPEG-variant had internally done some "questionable" hacking onto the format to add some features, most notably in a variant mode I had called "NBCES" (for "Non Backward Compatible Extensions"), which had in a few cases gone as far as to alter how blocks were encoded (images encoded in NBCES mode can't be decoded in a conventional decoder, at all, but it can support a few more features and higher compression).

decided to leave out the specifics, but basically "NBCES" made some fairly severe hacks in many areas of the format, partly to support things like direct floating-point HDR textures (vs hacking them into an RGBE variant as part of the colorspace transform), larger blocks, and optional per-block PNG-like filters (which could improve compression over straight DCT), ...

given the way my BTJ and NBCES formats work, NBCES images will appear like empty images to conventional JPEG decoders (they only contain APPn markers), which while invalid, is probably better than the decoder running head-first into unrecognized data and probably blowing up in some less predictable way.

( note this is relative to the more conservative BTJ format, which can support a lot of features in a "sort-of backwards-compatible" way. as-in, the base RGB image can still be decoded by a conventional decoder even if a lot of the extended features will no longer work, and the decoder can still decode images encoded in the baseline format. )

sometimes things are more acceptable though if the images aren't likely to be used for interchange, but rather to be used specifically with a controlled implementation. (much like libjpeg has its own sets of format extensions, many of which are not mutually compatible...).

if hacking things in a way which uses or interacts with an implementation maintained by someone else, then it is probably an "all bets are off" scenario.

and, even then, even with one's own implementation, sometimes minor version issues, quirks, or bugs in prior versions may still need to be worked around... as it is kind of a pain sometimes designing things which deal gracefully with version issues...

sometimes some of this tends to leave a "distinctive look" to long-lived file-formats (like those which have been around for multiple decades), where many people will see these sorts of formats and immediately be like "barf" at all the bit-twiddling, information spread around all over the place, multiple layers of packaging and daisy-chained headers, ... but, at times, there can be a sort of elegance to them as well... (or, at least, clues about how to design file-formats to be extensible...). (like, say, if a file format has survived, at least in part, since the 70s, it is probably at least doing something right...).

so, in this way, one person's reserved bits can often become another person's extension-point.

Honestly in that case I'd have just changed the header and called it a new format =P It's based on JPEG, but the changes you mention sound serious enough to be considered its own new thing, really.

interestingly, JPEG doesn't really have a "file header" in the traditional sense, but is instead a sequence of escape-coded markers (IOW: the byte value 0xFF is magic, with the following byte as a marker, ...). this makes it a little "interesting" in a few ways...
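
To make the escape-coding idea concrete, here is a small sketch of FF -> FF 00 byte stuffing and a naive marker scan. The function names are made up for this example. Note the simplification: in real JPEG files only the entropy-coded data is stuffed, while marker segments such as APPn carry a length field and may contain raw 0xFF bytes, so a real parser follows the segment lengths instead of scanning blindly.

```python
def stuff(payload: bytes) -> bytes:
    """Escape every 0xFF in the payload as 0xFF 0x00."""
    return payload.replace(b"\xff", b"\xff\x00")


def unstuff(data: bytes) -> bytes:
    """Undo stuff(): turn each 0xFF 0x00 pair back into a single 0xFF."""
    return data.replace(b"\xff\x00", b"\xff")


def find_markers(stream: bytes):
    """Yield (offset, marker_byte) for each 0xFF followed by a marker code.

    0xFF 0x00 is a stuffed data byte and 0xFF 0xFF is fill, so neither
    counts as a marker. Assumes the whole stream is stuffed (see lead-in).
    """
    i = 0
    while i + 1 < len(stream):
        if stream[i] == 0xFF and stream[i + 1] not in (0x00, 0xFF):
            yield i, stream[i + 1]
            i += 2
        else:
            i += 1


# SOI (FF D8), some stuffed payload, EOI (FF D9)
example = b"\xff\xd8" + stuff(b"\x12\xff\x34") + b"\xff\xd9"
markers = list(find_markers(example))
```

Here `markers` contains only the SOI and EOI markers; the 0xFF inside the payload is hidden from the scan by its trailing 0x00. Because markers can never appear in stuffed payload data, a decoder that loses its place can also scan forward to the next marker and resynchronize.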

typically for these files, I am using a "BTJ" extension, and generally calling the format "BGBTech-JPEG". it has enough options that it can actually generate a range of sub-formats, ranging from those resembling and compatible with conventional JPEG, to the significantly altered (and incompatible) NBCES variant, which is pretty much "beaten beyond recognition" (most of the internal headers and tables are different, the block-coding is different, it incorporates parts from PNG and Deflate, ...).

a related format is "BTIC" ("BGBTech Image Codec"), which was partly based on NBCES, but is internally based more around secondary compression of DXTn images (and high speed decompression), so is really its own format (during implementation I decided against trying to shoe-horn it in with BTJ and NBCES). it still retains a few elements from JPEG, mostly in terms of reusing a few of the markers (SOI/EOI, APPn, JPGn -> SYSn, COM, ...), and some of the high-level structure from BTJ and NBCES, but is otherwise its own format (it uses its own internal headers and similar, and apart from these markers, there is little else in common).

the main thing here is that the original JPEG markers were used to build a TLV format for BTJ, which is then expanded on and used as part of the general structure for building the other formats (*3).

BTJ basically just used them to add layers and a few additional informational headers.

BTJ-NBCES basically used them to implement new headers (and mostly discards the original JPEG headers, *1);

BTIC, basically similar, but has no headers or other structures in common with NBCES (*2).

*1: the NBCES headers are basically stored together as a big glob of bits, using a fixed-width bit-field to identify various headers, and typically with the headers themselves entropy-coded (vaguely similar to the strategy used in Deflate). some minor tweaks are also made to the FF escape-coding to make it more space efficient (owing to the loss of required JPEG compatibility, and nested packaging with each level adding an FF -> FF 00 conversion gets expensive, making a more efficient encoding scheme desirable...).

*2: most of its structures are byte-based, and currently no entropy coding is used, mostly to allow very fast decoding at the expense of compression ratio.

*3: sort of like RIFF, which was used as the base for AVI/WAV/RMI/... but, this TLV format offers a few more interesting possibilities: optional longer marker names (FOURCC/EIGHTCC/ASCIZ), distinguishes optional vs must-understand, allows for open-ended markers (paired start/end markers), as well as for resynchronization (re-aligning with an in-progress data-stream, mostly as the markers can't validly appear in the payload data). multiplexing could also be possible (likely via an additional wrapping layer).
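
The optional vs must-understand distinction can be sketched with a generic TLV reader. To be clear, this is not the BTJ layout (which isn't described here); it assumes a hypothetical chunk of 4-byte tag, 4-byte big-endian length, then payload, and borrows PNG's convention that a lowercase first letter marks a chunk as optional. The tag names are also made up.

```python
import struct

KNOWN_TAGS = {b"HEAD", b"DATA"}  # hypothetical tags this reader understands


def make_chunk(tag: bytes, payload: bytes) -> bytes:
    """Build one chunk: 4-byte tag, 4-byte big-endian length, payload."""
    return struct.pack(">4sI", tag, len(payload)) + payload


def read_chunks(buf: bytes):
    """Return a list of (tag, payload) for the chunks this reader understands.

    Unknown optional chunks (lowercase first tag byte) are skipped;
    unknown must-understand chunks raise ValueError.
    """
    chunks = []
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 8:
            raise ValueError("truncated chunk header")
        tag, length = struct.unpack_from(">4sI", buf, pos)
        pos += 8
        if length > len(buf) - pos:
            raise ValueError(f"chunk {tag!r} overruns the buffer")
        payload = buf[pos:pos + length]
        pos += length
        if tag in KNOWN_TAGS:
            chunks.append((tag, payload))
        elif tag[0:1].islower():
            continue  # optional extension we don't know: safe to skip
        else:
            raise ValueError(f"unknown must-understand chunk {tag!r}")
    return chunks
```

An old reader given `HEAD`, an unknown `xtra`, and `DATA` would return just the two chunks it knows, while an unknown `XTRA` makes it fail instead of silently misreading the file. That is roughly the property that lets a format grow extension points without needing to hijack reserved bits.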

currently, I am using the following FOURCC codes (for AVIs):

"MJPG": Motion JPEG and sometimes very-conservative subsets of BTJ;

"MBTJ": Motion BTJ;

"MBTC": Motion BTIC.

I originally considered using a similar packaging for "BTAC" (BGBTech Audio Codec), but was lazy and just used a fixed file-header instead (sort of like BMP).

decided to leave out stuff mostly about possible future directions and TLV escape tagging (BTJ vs my network protocol vs MPEG and friends...).

short answer: my network protocol (SBXE) and MPEG use longer (but different) escape-sequences, but there are various tradeoffs here.

I'm pretty sure that the OP was presenting "that must mean we can use them" as the WTF/'Coding Horror' for the thread.
He's implying that it's horrible to make this assumption, not seriously suggesting it.

yes, true.

I see it more as pros/cons though...

otherwise, how would we have gotten some of the data representations and file-formats and similar in common use today?...

I'm pretty sure that the OP was presenting "that must mean we can use them" as the WTF/'Coding Horror' for the thread.
He's implying that it's horrible to make this assumption, not seriously suggesting it.


Absolutely this.

The guy who originally made the decision to do it had left the company before the problem arose, which was probably better for us than his next employer.

This topic is closed to new replies.
