I really wish the standard were written better. The part that introduces the / and % operators seems to have been written with integer arithmetic in mind, because % means nothing for floating-point types and because it's not entirely clear what they mean by "zero". Is 0.0 zero? How about -0.0?
Why does division by 0 always result in negative NaN?
It actually specifically states that % works only on integral or (unscoped) enum types, whereas / works on arithmetic or (unscoped) enum types (5.6 paragraph 2). 5.6 paragraph 4 simply says "zero" with no numeric value attached.
Looking up "arithmetic types" is convoluted: one place refers to section 3.9.1 as a whole, but if you dig into that section, paragraph 8 specifically says "Integral and floating types are collectively called arithmetic types". Paragraph 7 defines the integral types as the signed/unsigned integer types, bool, and the char types. Paragraph 8 defines the floating-point types as float, double, and long double.
So apparently division with bools and chars is a thing (via integer promotion).
IMHO, any compiler for an IEEE FP platform that blindly follows that spec is simply being harmful towards its users. There should be a vendor-specific extension, enabled by default, that lets the IEEE spec override the C++ spec.
Checking GCC for the sake of it. Test program:
#include <stdio.h>

int foo(float x) {
    float y;
    y = 2.5f / x;
    printf("%f ", y);
    y = 0.0f / x;
    printf("%f ", y);
    if (x == 0.0f) return 0;
    return 1;
}

int main() {
    int x;
    x = foo(1.0f);
    printf("%d\n", x);
    x = foo(0.0f);
    printf("%d\n", x);
    x = foo(-1.0f);
    printf("%d\n", x);
    return 0;
}
Result (optimization set to -O3 and the standard to -std=c99; also, -ffast-math doesn't seem to make a difference):
2.500000 0.000000 1
inf 0.000000 0
-2.500000 0.000000 1
So at the very least GCC doesn't seem to be optimizing out any code. Can somebody come up with a more foolproof test, in case I missed something?
Which is what one gets on Clang using the C++ printing functions, so I can only consider one of two things. Either GCC considers the division of 0 by x to be 0 and optimizes the division out entirely, and since it does not expect x to be 0, you get 0 instead of NaN, which one would expect with UB (this would also explain why your third 0 is not -0). The other option is that printf on GCC is broken. Also, we're talking about C++ here, not C (although the C version in Clang ALSO prints NaN).
#include <iostream>

int foo(float x) {
    float y;
    y = 2.5f / x;
    std::cout << y << " ";
    y = 0.0f / x;
    std::cout << y << " ";
    if (x == 0.0f) return 0;
    return 1;
}

int main() {
    int x;
    x = foo(1.0f);
    std::cout << x << std::endl;
    x = foo(0.0f);
    std::cout << x << std::endl;
    x = foo(-1.0f);
    std::cout << x << std::endl;
    return 0;
}
Clang:
2.5 0 1
inf nan 0
-2.5 -0 1

VS:
2.5 0 1
1.#INF -1.#IND 0
-2.5 -0 1
Same results in release and debug (default project settings for both)
Interestingly, VS detects the divide by 0 in release mode and spits out a "potential divide by 0" warning. I imagine this is because it's inlining the calls and removing the x variable. An examination of the disassembly shows there are no jumps at all, and the divisions where the parameter is non-zero have been removed entirely while leaving the divisions by 0 intact.
So I guess the most interesting takeaway for me from this example is that the compiler, even knowing the exact platform it was emitting code for (32-bit x86), did not do the division itself, even though the processor follows the IEEE standard and the compiler could, theoretically, have followed the IEEE standard and computed the result at compile time.
Calling with a non-zero value:
x = foo(1.0f);
00F81006 push ecx
00F81007 mov ecx,dword ptr ds:[0F83060h]
00F8100D mov dword ptr [esp],40200000h
00F81014 call dword ptr ds:[0F83038h]
00F8101A mov ecx,eax
00F8101C call std::operator<<<std::char_traits<char> > (0F81130h)
00F81021 push ecx
00F81022 mov ecx,dword ptr ds:[0F83060h]
00F81028 mov dword ptr [esp],0
00F8102F call dword ptr ds:[0F83038h]
00F81035 mov ecx,eax
00F81037 call std::operator<<<std::char_traits<char> > (0F81130h)
std::cout<<x<<std::endl;
00F8103C push dword ptr ds:[0F8305Ch]
00F81042 mov ecx,dword ptr ds:[0F83060h]
00F81048 push 1
00F8104A call dword ptr ds:[0F8303Ch]
00F81050 mov ecx,eax
00F81052 call dword ptr ds:[0F83040h]
Calling with 0.0f:
x = foo(0.0f);
00F81058 movss xmm0,dword ptr ds:[0F8325Ch]
00F81060 divss xmm0,dword ptr ds:[0F83258h]
00F81068 push ecx
00F81069 mov ecx,dword ptr ds:[0F83060h]
00F8106F movss dword ptr [esp],xmm0
00F81074 call dword ptr ds:[0F83038h]
00F8107A mov ecx,eax
00F8107C call std::operator<<<std::char_traits<char> > (0F81130h)
00F81081 xorps xmm1,xmm1
00F81084 movaps xmm0,xmm1
00F81087 divss xmm0,xmm1
00F8108B push ecx
00F8108C mov ecx,dword ptr ds:[0F83060h]
00F81092 movss dword ptr [esp],xmm0
00F81097 call dword ptr ds:[0F83038h]
00F8109D mov ecx,eax
00F8109F call std::operator<<<std::char_traits<char> > (0F81130h)
std::cout<<x<<std::endl;
00F810A4 push dword ptr ds:[0F8305Ch]
00F810AA mov ecx,dword ptr ds:[0F83060h]
00F810B0 push 0
00F810B2 call dword ptr ds:[0F8303Ch]
00F810B8 mov ecx,eax
00F810BA call dword ptr ds:[0F83040h]
This page explains a bit on what optimizations are allowed under different floating point behavior settings in VC++:
http://msdn.microsoft.com/en-us/library/e7s85ffb.aspx
For example:
Expression optimizations that are invalid for special values (NaN, +infinity, -infinity, +0, -0) are not allowed. The optimizations x-x => 0, x*0 => 0, x-0 => x, x+0 => x, and 0-x => -x are invalid for various reasons. (See IEEE 754 and the C99 standard.)
I suspect we're oversimplifying this whole problem by saying it "optimizes out stuff relying on undefined behavior"... even though I don't usually proclaim "it formats C:", I think when division by zero can do anything, a later statement being optimized out isn't the major issue.
It's easy to add a _controlfp() call to a program to make it crash on division by zero, if that's the behavior we want. Though that can be very useful for a Debug build, for me it's usually not at all acceptable in a final product; on the contrary, I want it to absolutely not crash, since I specifically do not want to have to marshal my values. What we're considering is basically this (all __assumes auto-added by the compiler):
float r = n / d;
__assume(d nonzero);
What are we really assuming here?
That IF the implementation happens to accept division, and the program continues, we assume that d wasn't zero? Or are we assuming that a division by zero will never occur at all?
If it's the latter, would this also be OK?
__assume(d nonzero);
float r = n / d;
If so, consider this:
assert(!isnan(n) && !isnan(d));
__assume(n, d not nan);
__assume(d not zero);
float r = n / d;
__assume(r not nan); // can't possibly happen
if(isnan(r)) {
// optimized out?
}
When floats are used as acceptable approximations, and functions are expected to hit singularities due to numerical instability, a floating-point mode that gracefully handles Inf and NaN is very desirable, and incredibly useful for performance-sensitive approximating algorithms. As such I think that even if a standard doesn't explicitly define it, every major implementation will at least implement it as an option for targets that are expected to support it.
Also, if we use something like fast-math, or just generally accept floats as approximations that don't always round exactly the same, then it's not like with integers, where we can guarantee that a number won't ever be zero; value-marshalling would have to be added at unreasonably many points in the code if we were to guarantee that no division by zero ever occurred. Basically we would have to revert to something like this:
float result;
int err = tryFloatOp(f0, f1, &result);
if(!succeeded(err)) {
return default();
}
I definitely think there are times where division by zero is expected to be, and should be, undefined for floats, for example in mathematical computations that aren't supposed to deteriorate, where an exception is the correct result on division by zero. But I don't think it's acceptable for an implementation to have it both ways: generating code that expects division by zero to work perfectly fine, while still removing pieces of code that won't be hit unless that division by zero actually takes place.
EDIT: Another example where all if-statements could be removed if we assume the user will never input a zero divisor:
#include <cmath>
#include <iostream>

int main() {
    int i0;
    std::cin >> i0;
    float f0 = static_cast<float>(i0);
    int i1;
    std::cin >> i1;
    float f1 = static_cast<float>(i1);
    float r = f0 / f1;
    std::cout << "Answer is " << r << std::endl;
    if (std::isnan(r))
        std::cout << "Answer is NaN!" << std::endl;
    if (std::isinf(r))
        std::cout << "Answer is Infinity!" << std::endl;
    if (f1 == 0.0f)
        std::cout << "The divisor was zero" << std::endl;
    return 0;
}
On the second set I would expect it to have printed out a NaN for the second division, as that's the correct result for IEEE floats.
Huh, damn, you're right, I wasn't paying attention >_> Good thing I actually copypasted the output to be sure.
sik@host221:~/temp$ gcc -O0 -o fdivtest fdivtest.c
sik@host221:~/temp$ ./fdivtest
2.500000 0.000000 1
inf -nan 0
-2.500000 -0.000000 1
sik@host221:~/temp$ gcc -O3 -o fdivtest fdivtest.c
sik@host221:~/temp$ ./fdivtest
2.500000 0.000000 1
inf -nan 0
-2.500000 -0.000000 1
sik@host221:~/temp$ gcc -O3 -ffast-math -o fdivtest fdivtest.c
sik@host221:~/temp$ ./fdivtest
2.500000 0.000000 1
inf 0.000000 0
-2.500000 0.000000 1
EDIT: huh, now -ffast-math does make a difference? Are GCC parameters really that unstable or what? =| (and yeah, apparently -ffast-math is affected by its position; if I put it at the end I get a linker error!)