Back to General and Gameplay Programming

Just when you thought stupidity had reached its limits (a debugging rant)

spraff · 2008-12-04T18:21:56

I just spent the best part of two hours wondering why my code breaks apparently arbitrarily. Eventually I boiled it down to this: int tmp = -8; unsigned int tmp2 = 16; int tmp3 = tmp2/5; float a = float(tmp + tmp3); float b = float(tmp + tmp2/5); This sets a==-5 and b==4.2949673e+009 *beats forehead against desk* For the last few years I've thought of myself as a fairly competent programmer, yet there's always a new lesson to learn. I just wasn't expecting a lesson like this. It's got something to do with unsigned vs signed integers, I guess. I still don't know if this is a C language subtlety I've overlooked, or maybe a bug in Visual Studio (I usually use gcc). Not sure if I want to find out. Thanks for letting me vent.

General and Gameplay Programming Programming

Started by spraff December 02, 2008 10:59 AM

31 comments, last by Way Walker 15 years, 4 months ago

me22

212

December 02, 2008 05:17 PM

Quote:Original post by Sneftel
The proper place for someone to learn the domain of a function is the function's documentation. Types are a weak substitute.

Fundamental types are a weak substitute.

Types are a better solution than documentation, since they're runtime-verifiable documentation, and in C++, need not impose a runtime cost in release.

In fact, given pervasive use, impose only slight cost at runtime, too, since assignment between instances of the type need not re-verify invariants. They could even lead to less cost at runtime, by moving the precondition check into the construction of instances of the type, which is only done once, rather than every time a function takes the value as an argument.

Imagine something to be the equivalent of GLclampf. sqrt? Stays in range, no checks on either end. Cubic spline sampling? It's in range, no checks. Etc.

visitor

643

December 02, 2008 05:21 PM

Quote:
Because you can't discourage it that way. Negative integers passed into such a function will be invisibly converted into very large numbers, at most with a compiler warning (one which is often disabled, because of the large false-positive rate). The proper place for someone to learn the domain of a function is the function's documentation. Types are a weak substitute.

I still don't get what's bad about doing one test instead of two. And if the "very large unsigned" turns out to be in range, wouldn't it be out of range for signed type too (untestable)?

I was also under the impression that using types that enforce rules (OK perhaps unsigned doesn't do that) is a lot better that putting the rules in the documentation which the compiler doesn't read. Isn't code supposed to be self-documenting?

Sneftel

1,788

December 02, 2008 10:09 PM

Quote:I still don't get what's bad about doing one test instead of two.

As I said, there's often no reasonable maximum to test against.

Quote:And if the "very large unsigned" turns out to be in range, wouldn't it be out of range for signed type too (untestable)?

I have no idea what you're asking here. Sorry.

Quote:I was also under the impression that using types that enforce rules (OK perhaps unsigned doesn't do that) is a lot better that putting the rules in the documentation which the compiler doesn't read. Isn't code supposed to be self-documenting?

As you said, the compiler doesn't enforce that rule. In contrast, assertions are one of the best ways to make code self-documenting.

Sneftel

1,788

December 02, 2008 10:24 PM

Quote:Original post by me22
Compiler warnings test every line. I'm not going to cassert every line, and if I did, I'd probably forget at least once. When was the last time you wrote a program that checked every single printf, or checked for fail on std::cout?

I agree to some extent, but it's a matter of priorities. It's rare for me to write functions without putting assertions on the arguments as my first few lines, because that's an important and useful method of limiting the scope of bugs.

Quote:Types are a better solution than documentation, since they're runtime-verifiable documentation, and in C++, need not impose a runtime cost in release.

It sounds like you're suggesting range-parameterized types. It's a great idea, and I've seen it done a few times, but it never seems to be done elegantly, or in a way that gives the impression that the solution is better than the problem. C++'s morass of casting tools force one to decide between ugly, distracting boilerplate syntax and clever but bug-prone syntax. Also, the usefulness there is limited to when the domain of a given parameter is independent of all other variables. If you want to specify that the first parameter is less than the second, for instance, you'll need to stuff them into a type together, which... this is what I mean about whether the solution is better than the problem.

spraff

100

Author

December 03, 2008 01:00 PM

Wow, I never expected such a flurry of responses!

I'm a bit of a pedant when it comes to signed/unsigned etc. I would never, for example, use ints to store RGB values. It's a similar philosophy as underlies Hungarian notation (REAL Hungarian notation, that is): variables have meaningful names and types because types have meaning! It is important to use unsigned integers for indices (or size_t, really, which I only avoid because I don't like to #include things liberally).

This is not the first time I've been burned by automatic casting, but in the past it's been because I was using variables contrary to their proper semantics, and it was always fixed by actually following the conventions I set for myself more strictly. The const modifier springs to mind -- I make everything const unless there's a good reason not to.

Here's what pissed me off this time round: if you implicitly cast an int to a short int, the compiler complains. If you implicitly cast an int to a float, the compiler complains. Again because of a loss of precision. If you compare signed and unsigned numbers with a < or >, the compiler complains. Somehow, implicit signed/unsigned casts go silent.

Dumb.

DevFred

840

December 03, 2008 01:13 PM

Quote:Original post by spraff
Somehow, implicit signed/unsigned casts go silent.

Because if they didn't, you'd probably be flooded with warnings that are based on harmless casts.

Zahlman

1,682

December 03, 2008 02:43 PM

Quote:Original post by SiCrane
When you do arithmetic between a signed int and an unsigned int, the result is unsigned.

Wow, that's exactly the opposite of what I would expect. Why would they do it that way?

dmail

116

December 03, 2008 03:28 PM

Quote:Original post by Zahlman
Quote:Original post by SiCrane
When you do arithmetic between a signed int and an unsigned int, the result is unsigned.

Wow, that's exactly the opposite of what I would expect. Why would they do it that way?

This is not always the case with all signed and unsigned ints, looking at section 4.5 and 5 in the standard details what will happen.

Quote:
4.5 Integral promotions [conv.prom]
1 An rvalue of type char, signed char, unsigned char, short int, or unsigned short
int can be converted to an rvalue of type int if int can represent all the values of the source type; otherwise,
the source rvalue can be converted to an rvalue of type unsigned int.
2 An rvalue of type wchar_t (3.9.1) or an enumeration type (7.2) can be converted to an rvalue of the first
of the following types that can represent all the values of its underlying type: int, unsigned int,
long, or unsigned long.
3 An rvalue for an integral bit-field (9.6) can be converted to an rvalue of type int if int can represent all
the values of the bit-field; otherwise, it can be converted to unsigned int if unsigned int can represent
all the values of the bit-field. If the bit-field is larger yet, no integral promotion applies to it. If the
bit-field has an enumerated type, it is treated as any other value of that type for promotion purposes.
4 An rvalue of type bool can be converted to an rvalue of type int, with false becoming zero and true
becoming one.
5 These conversions are called integral promotions.

Quote:
5 Expressions
9 Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield
result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
— If either operand is of type long double, the other shall be converted to long double.
— Otherwise, if either operand is double, the other shall be converted to double.
— Otherwise, if either operand is float, the other shall be converted to float.
— Otherwise, the integral promotions (4.5) shall be performed on both operands.54)
— Then, if either operand is unsigned long the other shall be converted to unsigned long.
— Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent
all the values of an unsigned int, the unsigned int shall be converted to a long int;
otherwise both operands shall be converted to unsigned long int.
— Otherwise, if either operand is long, the other shall be converted to long.
— Otherwise, if either operand is unsigned, the other shall be converted to unsigned.

specifically "Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent
all the values of an unsigned int, the unsigned int shall be converted to a long int;"

Sneftel

1,788

December 03, 2008 03:30 PM

Quote:Original post by Zahlman
Quote:Original post by SiCrane
When you do arithmetic between a signed int and an unsigned int, the result is unsigned.

Wow, that's exactly the opposite of what I would expect. Why would they do it that way?

My guess is that the most common case would be summing an unsigned absolute position and a signed offset, where you'd want the result to still be unsigned.

Zahlman

1,682

December 03, 2008 04:00 PM

Quote:Original post by Sneftel
Quote:Original post by Zahlman
Quote:Original post by SiCrane
When you do arithmetic between a signed int and an unsigned int, the result is unsigned.

Wow, that's exactly the opposite of what I would expect. Why would they do it that way?

My guess is that the most common case would be summing an unsigned absolute position and a signed offset, where you'd want the result to still be unsigned.

Hmm. My guess would be that the common case would be summing an unsigned absolute position and a signed offset, and then assigning that result back to the same variable, such that a conversion back to unsigned would be forced anyway. :)

Of course, the existing system does have the dubious advantage of not forcing people to explicitly specify integer literals as unsigned all over the place... I guess...

Eh. This is certainly one of the things I liked about the JVM. (Although it must be noted that certain cell phones have *very irritating* bugs in the implementation...)

Just when you thought stupidity had reached its limits (a debugging rant)

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Just when you thought stupidity had reached its limits (a debugging rant)

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines