
The documentation says those bits are reserved...

18 replies to this topic

#1 dave j   Members   -  Reputation: 592

3 Likes

Posted 07 June 2013 - 06:22 AM

...that must mean we can use them.

Many APIs have flag words that contain various options. Frequently not all the bits are used. Despite the documentation stating that all unused bits in flag words were reserved and should be set to zero, a GUI project that sat on top of OS/2's Presentation Manager (think Windows GUI - they are almost identical) used the reserved bits for its own purposes. A few years later IBM changed OS/2 to use some of those bits. Not only did the framework need changing to stop using those bits, 600 screens needed updating as well (think changing 600 Windows .RC files).

Top tip:

If you are documenting an API that has flag words, define every single unused bit in every flag word as "must be zero". If you define a standard for unused bits only once at the beginning of your documentation, some idiot will either not notice it or ignore it.


#2 markr   Crossbones+   -  Reputation: 1653

6 Likes

Posted 07 June 2013 - 01:41 PM

Yes, and moreover, if any of the "reserved" bits are set, the function should return an error.
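As a rough sketch of what rejecting reserved bits could look like in C: the function `win_create`, the flag names, and the error codes below are all hypothetical, invented only to illustrate the idea of an explicit "valid bits" mask.

```c
#include <stdint.h>

/* Hypothetical flag word for an imaginary window-creation API. */
#define WIN_FLAG_VISIBLE   (1u << 0)
#define WIN_FLAG_RESIZABLE (1u << 1)
#define WIN_FLAG_MODAL     (1u << 2)

/* Every bit with a documented meaning; all other bits are reserved. */
#define WIN_FLAGS_VALID    (WIN_FLAG_VISIBLE | WIN_FLAG_RESIZABLE | WIN_FLAG_MODAL)
#define WIN_FLAGS_RESERVED (~WIN_FLAGS_VALID)

enum { WIN_OK = 0, WIN_ERR_RESERVED_BITS = -1 };

/* Refuse the call outright if any reserved bit is set. */
int win_create(uint32_t flags)
{
    if (flags & WIN_FLAGS_RESERVED)
        return WIN_ERR_RESERVED_BITS;

    /* ... a real implementation would create the window here ... */
    return WIN_OK;
}
```

Deriving the reserved mask from the valid mask means that adding a new flag later only requires touching one definition.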



#3 Sik_the_hedgehog   Crossbones+   -  Reputation: 1743

0 Likes

Posted 07 June 2013 - 05:54 PM

This is even more true for hardware. From what I know, many hardware designers are on bad terms with programmers because programmers insist on completely ignoring the requirement to keep reserved bits untouched (that's another thing: if you just want to change a flag, make sure you only modify that bit rather than overwriting the entire value).
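The "only modify that bit" advice is the usual read-modify-write pattern. A minimal sketch in C, assuming a made-up 32-bit control register whose bit 3 enables an interrupt (the layout and the name `CTRL_IRQ_ENABLE` are hypothetical, not from any real device):

```c
#include <stdint.h>

/* Hypothetical register layout: bit 3 enables an interrupt.
   All other bits, reserved ones included, must be left as they are. */
#define CTRL_IRQ_ENABLE (1u << 3)

static inline void ctrl_set_irq_enable(volatile uint32_t *reg, int on)
{
    uint32_t value = *reg;          /* read the current contents, reserved bits included */

    if (on)
        value |= CTRL_IRQ_ENABLE;   /* set only the bit we own */
    else
        value &= ~CTRL_IRQ_ENABLE;  /* clear only the bit we own */

    *reg = value;                   /* write back; every other bit is unchanged */
}
```

Writing a constant such as `*reg = CTRL_IRQ_ENABLE;` instead would zero every other bit, which is exactly the overwrite the post warns about.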

 

Then again, don't assume that just because you document it as "must be zero" programmers will pay attention. There will always be somebody who notices that those bits do nothing and then uses them for their own purposes... Of course it's even worse when two programmers notice it separately and decide to reuse those bits for different purposes in each case - enjoy the debugging nightmare.


Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

#4 dave j   Members   -  Reputation: 592

0 Likes

Posted 08 June 2013 - 04:55 AM


Yes, and moreover, if any of the "reserved" bits are set, the function should return an error.


Errors are too easy to ignore - it might be better to just AND the flag word with a mask of the valid bits.
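A minimal sketch of the masking approach, assuming a hypothetical API in which only the low four bits of the flag word are defined (`DLG_FLAGS_VALID` and `dlg_sanitize_flags` are invented names):

```c
#include <stdint.h>

/* Hypothetical: only the low four bits of the flag word have a meaning. */
#define DLG_FLAGS_VALID 0x0000000Fu

/* Returns the flag word with every reserved bit forced to zero. */
static inline uint32_t dlg_sanitize_flags(uint32_t flags)
{
    return flags & DLG_FLAGS_VALID;
}
```

The trade-off is that the caller gets no indication that it passed something invalid; the stray bits are silently dropped.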

#5 Bacterius   Crossbones+   -  Reputation: 8857

0 Likes

Posted 08 June 2013 - 08:12 AM

Errors are too easy to ignore - it might be better to just AND the flag word with a mask of the valid bits.

 

That won't solve the problem if the library is dynamically linked, though. As soon as you roll out a version of the library which actually uses those reserved bits, the crappy application will still be using the bits incorrectly and you're back to square one.


The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

 

- Pessimal Algorithms and Simplexity Analysis


#6 dave j   Members   -  Reputation: 592

1 Like

Posted 08 June 2013 - 10:08 AM

That won't solve the problem if the library is dynamically linked, though. As soon as you roll out a version of the library which actually uses those reserved bits, the crappy application will still be using the bits incorrectly and you're back to square one.


Surely the crappy application would never have worked when they tried to use the reserved bits with an earlier version of the dynamic library so they'd have changed it to use another method for their flags.

#7 slicer4ever   Crossbones+   -  Reputation: 3885

1 Like

Posted 08 June 2013 - 10:51 AM


Yes, and moreover, if any of the "reserved" bits are set, the function should return an error.

Errors are too easy to ignore - it might be better to just AND the flag word with a mask of the valid bits.


that's just doing unnecessary work to fix potential problems that the library shouldn't be held responsible to fix in the first place.

if the library says it's reserved, and you get screwed in the future because you decided to use the reserved bits, that's not the library maker's fault, nor should they have to add unjustifiable overhead to correct an issue that only a handful of people might make.
Check out https://www.facebook.com/LiquidGames for some great games made by me on the Playstation Mobile market.

#8 Sik_the_hedgehog   Crossbones+   -  Reputation: 1743

0 Likes

Posted 08 June 2013 - 11:30 PM

Tell that to Microsoft. A large number of Windows programs do stuff that they should never do in the first place, and then if a new version of Windows comes along that breaks those programs relying on undefined behavior, the users will blame Microsoft, not the developers of those programs. There's a reason why Windows has such a ridiculous amount of weird stuff for the sake of backwards compatibility.

 

That said: a function that returns an error if you use reserved flags should be the kind of issue that pops up immediately, right? I mean, the function would refuse to work, and that alone would prevent the buggy code from working in the first place, which would force developers to not use reserved flags, period. OK, in some situations it may not be obvious and get ignored, but it should reduce the number of cases considerably.



#9 ApochPiQ   Moderators   -  Reputation: 15698

0 Likes

Posted 09 June 2013 - 12:10 PM

Simple solution: Don't just return an error, fail violently.

#10 BGB   Crossbones+   -  Reputation: 1554

0 Likes

Posted 09 June 2013 - 05:29 PM

Simple solution: Don't just return an error, fail violently.

 

sometimes this is good and sometimes it is bad, and sometimes it helps find errors, or results in the app failing for stupid reasons.

 

 

case where it was good (maybe):

encoding JPEG files where the DC coefficient Huffman table had more than 16 entries;

"Windows Photo Viewer" was like "nope, not going to open this...", turns out this case is essentially reserved for some later (not widely used or supported) JPEG extensions, namely the ability to scale the quantization table on a block-per-block basis (and photo viewer cared how many entries were in the table, rather than which values were actually being used).

 

 

cases where it was not as good:

getting an error box whenever a LoadLibrary or GetProcAddress call failed, in a case where I really did want it to fail quietly;

I then ended up having to resort to using SEH to catch an exception, so that the code could quietly return a NULL on failure (it was being used to probe for optional libraries and features);

the event which originally prompted me to look into developing a custom script language in the first place (a bit over a decade ago): code within a library (IIRC, Guile) being hard-coded to call "abort()" on the first sign of trouble (*1).

 

 

*1: at the time, this seemed like damn near the worst possible solution to the problem, but it wasn't until later on that I discovered that the intention was that people would then use "signal()" with "SIGABRT" and "longjmp()" to implement a makeshift exception handler...

 

for the most part for C code, I have tended to prefer the "return with an error status" strategy. a lot of code basically ended up using a special pointer value (which I called "UNDEFINED"), generally with an error-status variable hidden in the background, so that it would be possible to spot the combination of UNDEFINED and an error status, and handle it as appropriate.

 

a better strategy in retrospect would have probably been to make UNDEFINED be a value range, where the type of condition that raised the error could be encoded within the value (with the use of an error variable, it is possible for multiple pieces of code to set the status in the course of the same error, making it harder to detect the initial or "most important" error condition).
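One way the "UNDEFINED as a value range" idea could be sketched in C is to treat the very top of the address space as error codes, much like the `ERR_PTR` convention in the Linux kernel. Everything here is an assumption for illustration: the names are made up, the range size of 4095 is arbitrary, and the code relies on implementation-defined pointer/integer conversions plus the assumption that no real object lives in the last few kilobytes of the address space.

```c
#include <stdint.h>

/* Hypothetical: the top UNDEF_MAX_CODE addresses never hold a real object,
   so pointer values in that range can carry an error code instead. */
#define UNDEF_MAX_CODE 4095

/* Build an "undefined" pointer carrying an error code (1..UNDEF_MAX_CODE). */
static inline void *undef_make(int code)
{
    return (void *)((uintptr_t)0 - (uintptr_t)code);
}

/* True if the pointer falls inside the reserved error range. */
static inline int undef_is_error(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)0 - UNDEF_MAX_CODE;
}

/* Recover the error code from an "undefined" pointer. */
static inline int undef_code(const void *p)
{
    return (int)((uintptr_t)0 - (uintptr_t)p);
}
```

Because the code travels inside the returned value itself, a later failure elsewhere cannot overwrite it the way a shared error-status variable can.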

 

granted, initially, this was before I had the idea of encoding information inside pointer value-ranges either (tagged references and pointers were seen as separate, and tagged references would often actually encode the "index" of a heap object rather than the "address" of the object, and pointers were generally treated as raw pointers and generally assumed to normally point to accessible memory objects, ...). over the years, a lot of this would change...



#11 frob   Moderators   -  Reputation: 21237

2 Likes

Posted 10 June 2013 - 12:55 PM

Always validate your input.

 

It doesn't matter if you are writing SQL queries or a UI library, or a jpeg loader, or anything else mentioned above.

 

Validate your input.  If the input is invalid you assert and fail.
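A small sketch of "assert and fail" applied to a flag word, using the standard `assert` macro. The function `tex_load` and the mask `TEX_FLAGS_VALID` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: bits 0-2 are defined, everything else is reserved. */
#define TEX_FLAGS_VALID 0x00000007u

int tex_load(uint32_t flags)
{
    /* Abort immediately if the caller set any reserved bit. */
    assert((flags & ~TEX_FLAGS_VALID) == 0 && "reserved flag bits must be zero");

    /* ... a real loader would do its work here ... */
    return 0;
}
```

Note that `assert` compiles to nothing when `NDEBUG` is defined, so this catches misuse in debug builds only; a release build that must also fail hard would need an explicit check.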


Edited by frob, 10 June 2013 - 12:56 PM.

Check out my personal indie blog at bryanwagstaff.com.

#12 L. Spiro   Crossbones+   -  Reputation: 13577

0 Likes

Posted 13 June 2013 - 03:44 PM

...that must mean we can use them.

I’ve honestly never thought of it that way. My impression has always been that anything flagged as “reserved” is short-hand for “reserved for our own future uses, not yours, so don’t touch them.”

I document my own engine heavily and I doubt I would feel bad if I said a bit was “Reserved” (with no further details) and then someone’s code started breaking because he or she thought that meant it was “Available to the user”. I’d explicitly say a bit was “Reserved for the user” if that is what I meant.
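To make the distinction between "Reserved" and "Reserved for the user" concrete, here is a hedged sketch of a flag word split into three regions. The names and the particular bit ranges are invented for illustration and are not taken from any real engine.

```c
#include <stdint.h>

/* Hypothetical three-way split of a 32-bit flag word. */
#define ENT_FLAGS_DEFINED  0x000000FFu  /* bits 0-7: meanings documented by the library */
#define ENT_FLAGS_USER     0xFF000000u  /* bits 24-31: explicitly "reserved for the user" */
#define ENT_FLAGS_RESERVED (~(ENT_FLAGS_DEFINED | ENT_FLAGS_USER))  /* bits 8-23: must be zero */

/* Accepts library-defined and user bits; rejects anything in the reserved region. */
static inline int ent_flags_ok(uint32_t flags)
{
    return (flags & ENT_FLAGS_RESERVED) == 0;
}
```

Giving users a region of their own removes the temptation to borrow bits the library may claim later.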


L. Spiro
It is amazing how often people try to be unique, and yet they are always trying to make others be like them. - L. Spiro 2011
I spent most of my life learning the courage it takes to go out and get what I want. Now that I have it, I am not sure exactly what it is that I want. - L. Spiro 2013
I went to my local Subway once to find some guy yelling at the staff. When someone finally came to take my order and asked, “May I help you?”, I replied, “Yeah, I’ll have one asshole to go.”
L. Spiro Engine: http://lspiroengine.com
L. Spiro Engine Forums: http://lspiroengine.com/forums

#13 Sik_the_hedgehog   Crossbones+   -  Reputation: 1743

0 Likes

Posted 13 June 2013 - 11:27 PM

I wouldn't use the wording "reserved" for something the user is free to use =P



#14 BGB   Crossbones+   -  Reputation: 1554

0 Likes

Posted 14 June 2013 - 02:16 AM

 

...that must mean we can use them.

I’ve honestly never thought of it that way. My impression has always been that anything flagged as “reserved” is short-hand for “reserved for our own future uses, not yours, so don’t touch them.”

I document my own engine heavily and I doubt I would feel bad if I said a bit was “Reserved” (with no further details) and then someone’s code started breaking because he or she thought that meant it was “Available to the user”. I’d explicitly say a bit was “Reserved for the user” if that is what I meant.


L. Spiro

 

 

 

yeah. I think it depends some on other considerations though, for example:

if the bits are reserved in a file-format which hasn't changed much in 20 or 30 years, and a person for whatever reason wants to hack more features onto it, sometimes it is necessary to resort to hacks like using reserved bits or invalid bit-combinations to hack on new features (and sometimes even more severe).

 

granted, yes, unless a person is careful about it, this can break compatibility with other implementations (so, it requires some level of care, and testing to validate that various other implementations still work if exposed to the new features, ...).

 

 

for example, my own hacked JPEG-variant had internally done some "questionable" hacking onto the format to add some features, most notably in a variant mode I had called "NBCES" (for "Non Backward Compatible Extensions"), which had in a few cases gone as far as to alter how blocks were encoded (images encoded in NBCES mode can't be decoded in a conventional decoder, at all, but it can support a few more features and higher compression).

 

decided to leave out the specifics, but basically "NBCES" made some fairly severe hacks in many areas of the format, partly to support things like direct floating-point HDR textures (vs hacking them into an RGBE variant as part of the colorspace transform), larger blocks, and optional per-block PNG-like filters (which could improve compression over straight DCT), ... given the way my BTJ and NBCES formats work, NBCES images will appear like empty images to conventional JPEG decoders (they only contain APPn markers), which while invalid, is probably better than the decoder running head-first into unrecognized data and probably blowing up in some less predictable way. ( note this is relative to the more conservative BTJ format, which can support a lot of features in a "sort-of backwards-compatible" way. as-in, the base RGB image can still be decoded by a conventional decoder even if a lot of the extended features will no longer work, and the decoder can still decode images encoded in the baseline format. )

 

sometimes things are more acceptable though if the images aren't likely to be used for interchange, but rather to be used specifically with a controlled implementation. (much like libjpeg has its own sets of format extensions, many of which are not mutually compatible...).

 

if hacking things in a way which uses or interacts with an implementation maintained by someone else, then it is probably an "all bets are off" scenario.

 

and, even then, even with one's own implementation, sometimes minor version issues, quirks, or bugs in prior versions may still need to be worked around... as it is kind of a pain sometimes designing things which deal gracefully with version issues...

 

sometimes some of this tends to leave a "distinctive look" to long-lived file-formats (like those which have been around for multiple decades), where many people will see these sorts of formats and immediately be like "barf" at all the bit-twiddling, information spread around all over the place, multiple layers of packaging and daisy-chained headers, ... but, at times, there can be a sort of elegance to them as well... (or, at least, clues about how to design file-formats to be extensible...). (like, say, if a file format has survived, at least in part, since the 70s, it is probably at least doing something right...).

 

so, in this way, one person's reserved bits can often become another person's extension-point.



#15 Sik_the_hedgehog   Crossbones+   -  Reputation: 1743

0 Likes

Posted 14 June 2013 - 02:31 AM

Honestly in that case I'd have just changed the header and called it a new format =P It's based on JPEG, but the changes you mention sound serious enough to be considered its own new thing, really.



#16 BGB   Crossbones+   -  Reputation: 1554

0 Likes

Posted 14 June 2013 - 04:20 AM

Honestly in that case I'd have just changed the header and called it a new format =P It's based on JPEG, but the changes you mention sound serious enough to be considered its own new thing, really.

 

 

interestingly, JPEG doesn't really have a "file header" in the traditional sense, but is instead a sequence of escape-coded markers (IOW: the byte value 0xFF is magic, with the following byte as a marker, ...). this makes it a little "interesting" in a few ways...
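To illustrate the marker structure, here is a rough sketch of walking the marker segments at the start of a conventional JPEG stream. The function name `jpeg_walk_markers` is made up, and the walk deliberately stops at SOS (0xDA) or EOI (0xD9), since entropy-coded data follows SOS and needs different handling. It is a simplified reading of the baseline marker layout, not a complete parser.

```c
#include <stddef.h>
#include <stdint.h>

/* Records marker codes (the byte after 0xFF) into out[] until SOS or EOI.
   Returns the number of markers seen, or -1 if the stream is malformed. */
int jpeg_walk_markers(const uint8_t *buf, size_t len, uint8_t *out, size_t out_cap)
{
    size_t pos = 0;
    int count = 0;

    while (pos + 1 < len) {
        if (buf[pos] != 0xFF)
            return -1;                    /* every marker starts with 0xFF */

        uint8_t marker = buf[pos + 1];
        if (marker == 0xFF) {             /* extra 0xFF bytes are fill; skip one */
            pos++;
            continue;
        }
        if (marker == 0x00)
            return -1;                    /* FF 00 is a stuffed data byte, not a marker */

        pos += 2;
        if ((size_t)count < out_cap)
            out[count] = marker;
        count++;

        if (marker == 0xD9 || marker == 0xDA)
            break;                        /* EOI, or SOS: entropy-coded data follows */

        /* SOI, TEM and RST0-RST7 stand alone and carry no length field. */
        if (marker == 0xD8 || marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7))
            continue;

        /* Other segments carry a 2-byte big-endian length that includes itself. */
        if (len - pos < 2)
            return -1;
        size_t seg_len = ((size_t)buf[pos] << 8) | buf[pos + 1];
        if (seg_len < 2 || seg_len > len - pos)
            return -1;
        pos += seg_len;
    }
    return count;
}
```

For a tiny stream consisting of SOI, one APP0 segment with a two-byte payload, and EOI, the function reports three markers: 0xD8, 0xE0, 0xD9.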

 

 

typically for these files, I am using a "BTJ" extension and generally calling the format "BGBTech-JPEG". it has enough options that it can actually generate a range of sub-formats, ranging from those resembling and compatible with conventional JPEG, to the significantly altered (and incompatible) NBCES variant, which is pretty much "beaten beyond recognition" (most of the internal headers and tables are different, the block-coding is different, it incorporates parts from PNG and Deflate, ...).

 

a related format is "BTIC" ("BGBTech Image Codec"), which was partly based on NBCES, but is internally based more around secondary compression of DXTn images (and high speed decompression), so is really its own format (during implementation I decided against trying to shoe-horn it in with BTJ and NBCES). it still retains a few elements from JPEG, mostly in terms of reusing a few of the markers (SOI/EOI, APPn, JPGn -> SYSn, COM, ...), and some of the high-level structure from BTJ and NBCES, but is otherwise its own format (it uses its own internal headers and similar, and apart from these markers, there is little else in common).

 

the main thing here is that the original JPEG markers were used to build a TLV format for BTJ, which is then expanded on and used as part of the general structure for building the other formats (*3).

 

BTJ basically just used them to add layers and a few additional informational headers.

BTJ-NBCES basically used them to implement new headers (and mostly discards the original JPEG headers, *1);

BTIC, basically similar, but has no headers or other structures in common with NBCES (*2).

 

*1: the NBCES headers are basically stored together as a big glob of bits, using a fixed-width bit-field to identify various headers, and typically with the headers themselves entropy-coded (vaguely similar to the strategy used in Deflate). some minor tweaks are also made to the FF escape-coding to make it more space efficient (owing to the loss of required JPEG compatibility, and nested packaging with each level adding an FF -> FF 00 conversion gets expensive, making a more efficient encoding scheme desirable...).

 

*2: most of its structures are byte-based, and currently no entropy coding is used, mostly to allow very fast decoding at the expense of compression ratio.

 

*3: sort of like RIFF, which was used as the base for AVI/WAV/RMI/... but, this TLV format offers a few more interesting possibilities: optional longer marker names (FOURCC/EIGHTCC/ASCIZ), distinguishes optional vs must-understand, allows for open-ended markers (paired start/end markers), as well as for resynchronization (re-aligning with an in-progress data-stream, mostly as the markers can't validly appear in the payload data). multiplexing could also be possible (likely via an additional wrapping layer).
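The FF -> FF 00 conversion mentioned in footnote *1 is the byte stuffing that keeps payload bytes from being mistaken for markers. A minimal sketch of the encoding side follows; the function name `ff_stuff` and its buffer-based interface are assumptions for the example, not anything from BTJ itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies src to dst, inserting 0x00 after every 0xFF so the payload can
   never contain a marker. Returns the number of bytes written, or
   SIZE_MAX if dst is too small. */
size_t ff_stuff(const uint8_t *src, size_t n, uint8_t *dst, size_t cap)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        size_t need = (src[i] == 0xFF) ? 2 : 1;
        if (cap - written < need)
            return SIZE_MAX;

        dst[written++] = src[i];
        if (src[i] == 0xFF)
            dst[written++] = 0x00;   /* the stuffed byte */
    }
    return written;
}
```

Each 0xFF in the input costs one extra byte, and every additional layer of nested packaging that applies the same stuffing adds one more, which is the overhead the footnote describes.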

 

 

currently, I am using the following FOURCC codes (for AVIs):

"MJPG": Motion JPEG and sometimes very-conservative subsets of BTJ;

"MBTJ": Motion BTJ;

"MBTC": Motion BTIC.

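For readers unfamiliar with FOURCC codes: they are four characters packed into a 32-bit value, with the first character in the lowest byte so that the bytes appear in reading order in a little-endian file such as AVI. A small sketch, where the helper name `fourcc` is made up for the example:

```c
#include <stdint.h>

/* Packs four characters into a 32-bit code, first character in the low byte. */
static inline uint32_t fourcc(char a, char b, char c, char d)
{
    return  (uint32_t)(uint8_t)a
         | ((uint32_t)(uint8_t)b << 8)
         | ((uint32_t)(uint8_t)c << 16)
         | ((uint32_t)(uint8_t)d << 24);
}
```

With ASCII characters, `fourcc('M', 'J', 'P', 'G')` evaluates to `0x47504A4D`.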
 

 

I originally considered using a similar packaging for "BTAC" (BGBTech Audio Codec), but was lazy and just used a fixed file-header instead (sort of like BMP).

 

decided to leave out stuff mostly about possible future directions and TLV escape tagging (BTJ vs my network protocol vs MPEG and friends...).

short answer: my network protocol (SBXE) and MPEG use longer (but different) escape-sequences, but there are various tradeoffs here.


Edited by cr88192, 14 June 2013 - 04:28 AM.


#17 Hodgman   Moderators   -  Reputation: 30380

0 Likes

Posted 14 June 2013 - 06:23 AM

I'm pretty sure that the OP was presenting "that must mean we can use them" as the WTF/'Coding Horror' for the thread.
He's implying that it's horrible to make this assumption, not seriously suggesting it.

#18 BGB   Crossbones+   -  Reputation: 1554

0 Likes

Posted 14 June 2013 - 11:18 AM

I'm pretty sure that the OP was presenting "that must mean we can use them" as the WTF/'Coding Horror' for the thread.
He's implying that it's horrible to make this assumption, not seriously suggesting it.

 

yes, true.

I see it more as pros/cons though...

 

otherwise, how would we have gotten some of the data representations and file-formats and similar in common use today?...



#19 dave j   Members   -  Reputation: 592

0 Likes

Posted 14 June 2013 - 11:19 AM

I'm pretty sure that the OP was presenting "that must mean we can use them" as the WTF/'Coding Horror' for the thread.
He's implying that it's horrible to make this assumption, not seriously suggesting it.


Absolutely this.

The guy who originally made the decision to do it had left the company before the problem arose, which was probably better for us than for his next employer.




