Problem: turning a bit shift into a division

Started by
24 comments, last by Sik_the_hedgehog 12 years ago
At some point bit trickery must be used if a div instruction is to be avoided (either by you or the compiler) - there is no way around that. I assumed you had a problem with the unspecified behavior of ">>" (arithmetic vs not, although i have never encountered a compiler that used non-arithmetic for signed, but afaik it is allowed).

The problem strikes in that according to the C standard, bit shifting on signed integers is undefined behavior, so the idea here is to replace that piece of code with something that has defined behavior for the standard.

If the compiler outputs a bit shift it's OK. In fact, it's the ideal case =P (EDIT: in other words, we don't care what the compiler does, only we care about the source code)

Confused what the actual goal here is :/ (ie. what goal is there for the replacement of the shift?)

Must be defined behavior according to the C standard. This pretty much means we're stuck with pure arithmetic (+, -, *, /, %).
Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.
Advertisement
Then the second option (you probably missed the edit) in my previous reply is the only way (fully specified behavior) ... except.

Best choice would be: use a helper function for shift right. So you can use ">>" on compilers that use arithmetic (VC specifies that it uses arithmetic - no ambiguity there) and whatever else where not (some compiler specific intrinsic functions or "(val & ~255) / 256" if unknown compiler).

edit: if inlining / performance in non-optimized debug build is a problem then a macro function would be prudent as long as the value is used only once in the macro (this criteria is met here).

edit: Hm, actually. I think i am wrong. I doubt C specifies how signed values are represented. Bummer.
edit:This are the options (a far as i can see):
It would be interesting to compare the actual assembly that each of these produces, and see how clever modern optimisers are!

The whole point of this is to get rid of bit trickery though
float speed_to_float(float value) {
return value;
}
... Sorry tongue.png wink.png
float speed_to_float(float value) {
return value;
}
... Sorry tongue.png wink.png

And that breaks log-based replays across different machines! *hides*

Jeez, I'm amazed at how many things can go wrong so easily. And now is when I wish the C standard would just specify some defined behavior for two's complement machines, since all of them work the same. Only on DSPs and the like you may find other encodings these days.
Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

It would be interesting to compare the actual assembly that each of these produces, and see how clever modern optimisers are!

A bit OT, but got curious and ...


00000001400255A9 mov eax,dword ptr [test1 (1400FA310h)]
00000001400255AF test eax,eax
00000001400255B1 jns tester+28h (1400255B8h)
00000001400255B3 add eax,0FFFFFF01h
00000001400255B8 cdq
00000001400255B9 movzx edx,dl
00000001400255BC add eax,edx
00000001400255BE sar eax,8
00000001400255C1 mov dword ptr [test1 (1400FA310h)],eax

Very good imo (branchless instructions are an option also, but they usually perform much worse than branching code due to branch prediction and worse behavior with register-renaming hardware [or something i forgot {fucks up the dependency chain or something?}]). Expecting compiler to see through this code to just replace with a "sar" - would be too much to ask.


00000001400255C7 mov eax,dword ptr [test2 (1400FA314h)]
00000001400255CD and eax,0FFFFFF00h // if it would be clever then it would realize that this is actually unnecessary (compiler fails again because it assumes the next 3 instructions actually do something [it otherwise does keep track of this kind of bit manipulations and would remove the "and" - did test that])
00000001400255D2 cdq // this does absolutely nothing useful (it would be relatively easy for a compiler to deduce it based on previous instruction, but - evidently it does not)
00000001400255D3 movzx edx,dl // --||--
00000001400255D6 add eax,edx // --||--
00000001400255D8 sar eax,8 // this is actually all it should do
00000001400255DB mov dword ptr [test2 (1400FA314h)],eax

Absolutely terrible. Like i said - one should avoid bit manipulations as compilers just can not see through it most of the time.


00000001400255E1 sar dword ptr [test3 (1400FA318h)],8

Perfect. Not a fair comparision with 2. option tho as it rolled "load" and "save" also into the instruction - load / save should not be part of the test as in real case one could not roll them into one. So, the code should be seen as (for comparision purposes):

mov eax,dword ptr [test3]
sar eax,8
mov dword ptr [test3],eax


edit: compiled with full optimizations with _ReadWriteBarrier separating the tests to ensure compiler does not mix the tests.
The problem strikes in that according to the C standard, bit shifting on signed integers is undefined behavior
Despite being ill-defined, every project is going to have non-standard code in it somewhere. If you just want to document the (unlikely) portability issue and keep working, I usually just use static assertions nearby the bit-magic code//Non-standard compiler extensions assumed
STATIC_ASSERT( s8(0x80) >> 7 == s8(0xFF) );//signed shift is arithmetic / sign-extending
STATIC_ASSERT( u8(0x80) >> 7 == u8(0x01) );//unsigned shift is logical
This assumption is probably true on every platform you care about, and in the case where it's not true, the above code will produce an error you when you compile.
I personally make the above assumptions in my code, and don't expect them to fail me any time soon.
the code relies on this behavior[/quote]

Algorithm you're dealing with here is black box. There is no division that would make it more portable or better.

Because it gets worse. If you rely on exactly identical behavior, then you not only need to properly emulate operation, but also overflow behavior and CPU flags. And perhaps something else, who knows, there might be an interrupt hooked on overflow.

Code like this wasn't written for longevity, it's defined by the code itself.


As for >> and signed types - it wasn't until recently that developers finally realized that it doesn't work quite correctly for negative numbers. Before, >> was the equivalent to division. Best part - nobody noticed it wasn't correct.

Then there's matter of larger values - does it perform identically across 16/32/64 register sizes?

If trying for emulation, then the rabbit hole goes really deep.
Am I the only one who didn't understand the point of that last post at all?

Because it gets worse. If you rely on exactly identical behavior, then you not only need to properly emulate operation, but also overflow behavior and CPU flags. And perhaps something else, who knows, there might be an interrupt hooked on overflow.

First of all, this is C, I don't care about what the flags are or anything (the compiler takes care about that). And this code assumes overflow shouldn't happen (if it does then it's considered a bug in the game).

Second, the only behavior that matters here is the result the function returns, not what the compiler decides to do inside (though ideally the compiler should not do stupid things to achieve it!).

As for >> and signed types - it wasn't until recently that developers finally realized that it doesn't work quite correctly for negative numbers. Before, >> was the equivalent to division. Best part - nobody noticed it wasn't correct.

Pretty sure this issue was known for as long as bit shifting was used for this purpose. It would have appeared pretty quickly during debugging. (EDIT: assuming here you talk about the rounding issue, if you mean it not working on non-two's-complement hardware, nevermind)
Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.
First of all, this is C, I don't care about what the flags are or anything (the compiler takes care about that). And this code assumes overflow shouldn't happen (if it does then it's considered a bug in the game).

Second, the only behavior that matters here is the result the function returns, not what the compiler decides to do inside (though ideally the compiler should not do stupid things to achieve it!).[/quote]
No, the compiler is, by spec, required to do stupid things. Result of bitshift on a signed number is compiler-defined.

So whatever you're trying to emulate "correctly" has nothing to compare to.

The following function is correct:
int32_t speed_to_int(int32_t value) {
if (value + comma_offset >= 0) return (value+comma_offset) / 256;
return rand();
}
For a given compiler that chooses such behavior. C language does not forbid it, compiler that generates code like above is perfectly C standard compliant.

Since that is probably not what you intended, it is up to you what the supposed result will be and how it's implemented.

If log replayes across machines (as mentioned above) are important, then any behavior will do, as long as its consistent.

But there is no way to provide 1:1 correct version, since original one is undefined, the only thing one can do is compare against a specific result of a specific compiler.
No, the compiler is, by spec, required to do stupid things. Result of bitshift on a signed number is compiler-defined.

By "stupid" I meant "don't try to pull off 20 lines when it could have done with 5" and stuff like that.

So whatever you're trying to emulate "correctly" has nothing to compare to.

Except I've already described what I'm expecting, and the whole point here is to write code that achieves what I'm expecting without relying on undefined behavior (ideally without making the compiler output needlessly complex code).
Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

This topic is closed to new replies.

Advertisement