Unique ID
defining the sizes of various numeric types (for example: "char" and "unsigned char" are 8 bits, "short" and "unsigned short" are 16-bits, ...);
C has addressed that with the fixed-width types from <stdint.h>, like int32_t.
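As a rough sketch of what that buys you, code can state its size assumptions up front and have the compiler check them. The BITS_OF macro here is a made-up helper for illustration, not something from the standard library, and _Static_assert assumes a C11 or later compiler.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: storage width of a type in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* The exact-width types have no padding bits, so these hold on any
   implementation that provides them (requires C11 for _Static_assert). */
_Static_assert(BITS_OF(int32_t) == 32, "int32_t must be exactly 32 bits");
_Static_assert(BITS_OF(uint16_t) == 16, "uint16_t must be exactly 16 bits");
```

If a target cannot provide one of these types, the build fails instead of silently using a different width.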
The "old style" / K&R style function declarations have been marked as obsolete since at least 1999, and many current compilers don't accept them any more. As always though, the C99 spec mentions that they didn't forbid them outright for fear of breaking people's code. C++11 has a similar amount of hoop-jumping, because an updated standard that breaks old code is basically just a new language, not an update.
all this doesn't even need to be "the standard", but maybe could be a "standardized profile".
Yeah, it would defeat the purpose of the whole C language to nail down all of those things, which are basically hardware details, as then the language couldn't be ported to other types of hardware. Having sub-standards/profiles for certain platforms, e.g. "C for x86", makes a bit more sense though.
if you don't nail down endianness, then the "profile" could apply equally to x86, PowerPC, and ARM.
if endianness is specified (say, as LE), then it is mostly x86 and most ARM devices.
IIRC, both PPC and ARM are bi-endian, but typically people run ARM devices in little-endian mode, and PPC in big-endian mode.
if it gets more specific, like whether or not unaligned load/stores are allowed, ... then it is probably target specific.
as for endianness, maybe 95% of the time, there is not much reason to care, and in the few cases where there is reason to care, it may make sense to have some way to specify it, and have the compiler emulate it if needed (like, say, if we say "this value needs to be LE or BE", then likely, the code has already agreed to pay for any requisite byte-swapping on loads/stores).
per-structure or per-value indication could be more useful than specifying it globally though, where the global endianness could still be left unspecified.
then, one can have a structure or similar, and be able to say that, with a compiler implementing this profile, and with a structure following the relevant rules, it is possible to know the exact byte-for-byte layout of the structure.
such a profile, if it existed, could still be "reasonably" portable within a range of targets, and possibly any points of disagreement could be emulated.
this would then make it a little more like Java or C# in these regards.
granted, this would be N/A for some targets, but these targets need not implement this profile.
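as a rough illustration of the "emulate it if needed" idea, here is what a per-value LE load/store can look like when written by hand today. the names store_le32 and load_le32 are made up for this sketch, they are not part of any standard or proposed profile.

```c
#include <stdint.h>

/* Hypothetical helper: write v as 4 little-endian bytes, whatever the host order. */
static void store_le32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v & 0xffu);
    dst[1] = (unsigned char)((v >> 8) & 0xffu);
    dst[2] = (unsigned char)((v >> 16) & 0xffu);
    dst[3] = (unsigned char)((v >> 24) & 0xffu);
}

/* Hypothetical helper: read 4 little-endian bytes back into a host value. */
static uint32_t load_le32(const unsigned char *src)
{
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}
```

since the shifts work on values rather than on memory, the same source gives the same byte layout on LE and BE hosts. a compiler implementing such a profile could presumably generate the equivalent code for a value marked as LE.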
Empty argument lists in C are a terrible idea! It's also different to C++, where it means the function takes no arguments. It just seems like an incredibly lazy way to prototype a function and what's more the compiler doesn't complain when you pass the wrong number or type of arguments!!! It's basically the same as using an ellipsis argument list except without the requirement that you have at least one argument before the ellipsis.
At least one bug has been incredibly hard to track down in our codebase because of this "feature" (i.e. someone does a local prototype in a c file for a function that was once void, and it continues to work (incorrectly of course) without complaint when someone added an argument to the function).
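A minimal sketch of the difference, with made-up function names. The bad call is left commented out because calling a function with the wrong arguments is undefined behaviour; note also that from C23 onwards "()" means the same as "(void)", so the first declaration only behaves this way in earlier standards.

```c
/* Before C23: declares a function with unspecified parameters.
   The compiler has nothing to check calls against. */
int get_answer();

/* A real prototype: the function takes no arguments, and calls are checked. */
int get_answer_checked(void);

int get_answer() { return 42; }
int get_answer_checked(void) { return 42; }

/* get_answer(1, 2);          accepted without complaint by pre-C23 compilers */
/* get_answer_checked(1, 2);  rejected: too many arguments                    */
```

Putting the "(void)" prototype in a shared header, instead of a local declaration in a .c file, is what lets the compiler catch the stale-declaration bug described above.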
Ah, thanks for clarifying. After some pondering, I realized I can just concatenate them with shift and OR normally. I guess I was trying too hard to use cool union tricks.
Oh, I didn't know it was for c++. Does it apply for c99? It would be nice if it is defined behaviour for c99.
Yes sorry, the situation is the same for both C/C++ -- the spec says it doesn't have to work, but the compiler vendors say that if you want to willingly break the strict aliasing rule, you should do so via a union.
According to the spec, the only way to do this generally is:
struct uid parts;
parts.blah = ...;
int32_t id;
memcpy( &id, &parts, 4 );
This is pretty silly, and the compiler vendors agree this is silly. So, they chose to allow you to (ab)use unions in this way, even though the spec says they don't have to allow it.
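Here is a fuller sketch of the memcpy approach. The struct layout is an assumption (two 8-bit cells and a 16-bit n, with no padding), and the helper names uid_to_id and id_to_uid are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout, not taken from the original code. */
struct uid {
    int16_t n;
    int8_t  cell_x;
    int8_t  cell_y;
};

/* Fail the build if the struct is not exactly 4 bytes (requires C11). */
_Static_assert(sizeof(struct uid) == sizeof(int32_t), "struct uid must be 4 bytes");

/* Copy the struct's bytes into an integer without breaking strict aliasing. */
static int32_t uid_to_id(struct uid parts)
{
    int32_t id;
    memcpy(&id, &parts, sizeof id);
    return id;
}

/* The reverse copy. */
static struct uid id_to_uid(int32_t id)
{
    struct uid parts;
    memcpy(&parts, &id, sizeof parts);
    return parts;
}
```

The numeric value of the resulting id depends on the host's byte order, so this is only suitable when the id never leaves the machine. Optimizing compilers typically turn a fixed-size memcpy like this into a plain register move.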
int32_t cell_x = uid.cell_x;
int32_t cell_y = uid.cell_y;
int32_t n = uid.n;
int32_t id = n | (cell_x << 16) | (cell_y << 24);
Will this work on systems with different endianness?
It will work, since shifts and ORs operate on values rather than on bytes in memory. Only the in-memory binary representation will be different.
Have a look at the functions ntohl and htonl which you can call to do endian swaps before sending things across a network between machines of different endianness.
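A small sketch of how those two calls can be used to get a fixed byte order in a buffer, which you could then send over a socket or write to a file. It assumes a POSIX system (htonl and ntohl come from <arpa/inet.h> there), and the names write_be32 and read_be32 are made up for this example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a host-order value as 4 big-endian bytes. */
static void write_be32(unsigned char out[4], uint32_t host_value)
{
    uint32_t net = htonl(host_value);   /* host order -> network (big-endian) order */
    memcpy(out, &net, sizeof net);
}

/* Hypothetical helper: read 4 big-endian bytes back into a host-order value. */
static uint32_t read_be32(const unsigned char in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof net);
    return ntohl(net);                  /* network order -> host order */
}
```

On a big-endian host both calls are no-ops, and on a little-endian host they swap the bytes, so the buffer contents are the same either way.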
Be careful. This may not work how you expect if your values can be negative. You might want to cast to unsigned ints of the appropriate size before doing the bit manipulation.
Oh, thanks. I take it those two functions are what I need if I want to store my values in a file portably, yes?
Thanks, what a gotcha. So if I shift a signed number, the sign bit will not change, yes? And if I cast to unsigned, I can cast that unsigned back to signed to get the same value?
yes, but this is more about what would be defined by the standard, not about whether or not it is good practice.
the issue is, there is still a lot of code floating around which would break if you took away "()", but relatively little which would break if the rest of old-style declarations were taken away.
(FWIW, my C compiler had effectively dropped both, making "()" behave like "(void)" and treating relying upon the old-style semantics as a warning).
but, anyways, to clarify a few of the thoughts: the endianness specifiers would probably be provided as special preprocessor defines, which would have undefined behavior (probably being no-op) if the compiler doesn't support the feature. preprocessor defines could exist to specify whether the feature is present and works.
the declaration ordering restrictions would be more subtle, probably placing a few restrictions like:
type qualifiers will precede specifiers (except in certain conditions);
other type specifiers and user-defined types (typedef-name) will be mutually exclusive;
only one (and exactly one) user-defined type may be referenced as part of a given declaration type;
...
most existing code already does this, and making this an optional requirement can result in a parser speedup (though, as-is, a command-line option can achieve similar effect). basically, it allows eliminating most cases where it is necessary to check whether or not an identifier is a known typedef (IME: this is where a big chunk of the time goes when parsing declarations from headers, at least in my tools).
Really the problem is sign extension when moving to a larger size. Let's say n is -5. As an int16_t your computer might represent that as 1111 1111 1111 1011. When you cast that to an int32_t it might become 1111 1111 1111 1111 1111 1111 1111 1011; all the upper bits are 1. Trying to do a bitwise or with that will wipe out any information that you want to get from cell_x and cell_y. If you cast it to a uint16_t first you'll get something like 0000 0000 0000 0000 1111 1111 1111 1011 which you can bitwise-or things in the upper bits with.
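The two cases can be checked with a couple of one-line functions. The names widen_signed and widen_unsigned are made up for this sketch.

```c
#include <stdint.h>

/* Widen directly: the sign bit is copied into all the upper bits. */
static uint32_t widen_signed(int16_t n)
{
    return (uint32_t)(int32_t)n;
}

/* Go through uint16_t first: the upper bits stay zero. */
static uint32_t widen_unsigned(int16_t n)
{
    return (uint32_t)(uint16_t)n;
}
```

For n = -5, widen_signed gives 0xFFFFFFFB and widen_unsigned gives 0x0000FFFB, matching the bit patterns written out above.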
To be pedantic, converting signed to unsigned is well defined (it wraps modulo 2^N), but converting an unsigned value that doesn't fit back to signed is implementation-defined. However, I've never worked on a platform where you wouldn't get the same value from a round trip.
Well the solution is to mask out the bits you don't want before you do the shift... looks like your x, y are 8 bits and n is 16 bits so do
int32_t id = (n & 0xffff) | ((cell_x & 0xff) << 16) | ((cell_y & 0xff) << 24);
EDIT: Missed an effing eff
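For completeness, here is a sketch of the same packing done entirely in unsigned arithmetic, plus the unpacking side. It assumes n is 16 bits and the two cells are 8 bits each, as guessed above; the function names are made up. Doing the shifts on uint32_t also avoids shifting a set bit into the sign bit of a signed int.

```c
#include <stdint.h>

/* Pack n into bits 0-15, cell_x into bits 16-23, cell_y into bits 24-31.
   Casting through the unsigned type of the same width drops the sign extension. */
static uint32_t pack_id(int16_t n, int8_t cell_x, int8_t cell_y)
{
    return (uint32_t)(uint16_t)n
         | ((uint32_t)(uint8_t)cell_x << 16)
         | ((uint32_t)(uint8_t)cell_y << 24);
}

/* Unpacking: mask, shift down, then convert back to the signed type.
   The unsigned-to-signed conversion is implementation-defined for values
   that don't fit, but gives the expected result on two's complement targets. */
static int16_t unpack_n(uint32_t id)
{
    return (int16_t)(uint16_t)(id & 0xffffu);
}

static int8_t unpack_cell_x(uint32_t id)
{
    return (int8_t)(uint8_t)((id >> 16) & 0xffu);
}

static int8_t unpack_cell_y(uint32_t id)
{
    return (int8_t)(uint8_t)((id >> 24) & 0xffu);
}
```

With n = -5, cell_x = 3 and cell_y = -1 this packs to 0xFF03FFFB, and the three unpack functions return the original values.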