using signed int for values that should never be negative

Started by
28 comments, last by SiCrane 16 years, 7 months ago
One thing to be aware of when using unsigned integers is that arithmetic mixing a signed and an unsigned integer of the same width (int and unsigned int, for example) produces an unsigned result. So if you subtract an unsigned integer from a signed one and the result would be negative, you get wraparound instead.

    #include <iostream>

    int main()
    {
        int i = 100;
        unsigned j = 200;
        float k = i - j;
        std::cout << k << std::endl;
    }

Although I use unsigned integers to enforce the constraint that a value will never be negative, when you're doing a lot of arithmetic with a mixture of signed and unsigned integers, it can make sense to make them signed and avoid subtle bugs caused by implicit conversions.
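To make the wrapping concrete, here is a minimal sketch with fixed-width types. The function names naive_difference and signed_difference are made up for illustration, and widening to a 64-bit signed type is just one possible way to avoid the wrap, not the only one.

```cpp
#include <cstdint>

// Naive version: the subtraction is carried out in unsigned arithmetic
// (or the result is converted to unsigned on return), so a "negative"
// difference wraps around to a huge positive value.
constexpr std::uint32_t naive_difference(std::int32_t i, std::uint32_t j) {
    return i - j;
}

// Safer version (an assumption about what you want, not a universal rule):
// widen both operands to a signed 64-bit type first, so the subtraction
// cannot wrap for any pair of 32-bit inputs.
constexpr std::int64_t signed_difference(std::int32_t i, std::uint32_t j) {
    return static_cast<std::int64_t>(i) - static_cast<std::int64_t>(j);
}

static_assert(naive_difference(100, 200u) == 4294967196u, "wraps modulo 2^32");
static_assert(signed_difference(100, 200u) == -100, "keeps the sign");
```

With i = 100 and j = 200 the naive version yields 4294967196, which is why the float in the snippet above prints a value around 4.29e9 rather than -100.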
Quote:Original post by thedustbustr
Quote:Original post by Sneftel
    for (int i = 0; i < 10; i++) {
        a[i] = i; // <-- array index cannot be 4 billion in a sane application either.
    }


er, why would i be 4 billion, and why would having i be -1 make a difference?

Because in the unsigned world, -1 is equal to 4294967295.

So if you had some strange code that set i to -1, it would crash whether you used a signed or unsigned int.
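For anyone wondering where 4294967295 comes from, here is a small sketch assuming a 32-bit unsigned type. The helper name to_unsigned is hypothetical and only wraps the cast so it can be checked at compile time.

```cpp
#include <cstdint>

// Conversion to an unsigned type is defined modulo 2^N, so -1 maps to
// the largest representable value (4294967295 for N = 32).
constexpr std::uint32_t to_unsigned(std::int32_t value) {
    return static_cast<std::uint32_t>(value);
}

static_assert(to_unsigned(-1) == 4294967295u, "-1 wraps to 2^32 - 1");
static_assert(to_unsigned(-2) == 4294967294u, "-2 wraps to 2^32 - 2");
static_assert(to_unsigned(7) == 7u, "non-negative values are unchanged");
```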
Quote:Original post by Sc4Freak
So if you had some strange code that set i to -1, it would crash whether you used a signed or unsigned int.
I would prefer unsigned in such a case. An index of 4294967295 will obviously always crash with 32-bit addressing, whereas an index of -1 might simply access a different variable on the stack/heap.

The only place where using unsigned trips you up (that I can remember) is when doing a decreasing loop that has an ending condition of i>=0, which becomes an infinite loop.
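A sketch of that pitfall and one common way around it. The function indices_descending is a made-up helper that just records the order in which indices are visited, and std::size_t stands in for any unsigned index type.

```cpp
#include <cstddef>
#include <vector>

// Broken pattern (do not use): i >= 0 is always true for an unsigned i,
// so the loop never terminates:
//     for (std::size_t i = n - 1; i >= 0; --i) { ... }

// One common fix: test before decrementing, so the body sees
// n-1, ..., 1, 0 and the loop stops without i wrapping inside the body.
inline std::vector<std::size_t> indices_descending(std::size_t n) {
    std::vector<std::size_t> visited;
    for (std::size_t i = n; i-- > 0; ) {
        visited.push_back(i);
    }
    return visited;
}
```

Calling indices_descending(3) visits 2, 1, 0 in that order, and indices_descending(0) visits nothing, which the broken pattern also gets wrong because n - 1 wraps when n is 0.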
Quote:Original post by thedustbustr
The game engine my company has licensed does this all the time - it will use signed math for a simple for loop. It seems to me that this is bad practice and can, in some circumstances, result in severe security vulnerabilities.

Should I always be using unsigned types for types that are never negative, or am I overreacting and adding complexity where it is completely unnecessary?


Sticking to the original topic of a "simple for loop", why type out anything more than 3 letters?

I *always* use a simple int for "simple for loops".

    // spawn a bunch of random objects
    for (int i = 0; i < 50; ++i)
        spawnRandomObject();


Not really sure how I can be more "safe".
Quote:Original post by iMalc
I would prefer unsigned in such a case. An index of 4294967295 will obviously always crash with 32-bit addressing, whereas an index of -1 might simply access a different variable on the stack/heap.
No, that's the thing. An index of 4294967295 will access exactly the same thing as an index of -1. From the CPU's point of view, there literally is zero difference between the two situations.
Quote:Original post by Sneftel
No, that's the thing. An index of 4294967295 will access exactly the same thing as an index of -1. From the CPU's point of view, there literally is zero difference between the two situations.


My example wasn't the best; the point I was trying to make is that an array index, as a value, cannot be signed. a[-1] isn't allowed, as such it makes no sense to use it. With a variable, it's possible to do it, but it's not sane. Depending on the compiler, the results might even be deterministic, but still not sane. So you might as well enforce this relation.

However, unsigned types may lead to slightly more elegant code (or not, depends).

    template < class T >
    T get( unsigned int index ) {
        if ( index < n_elements ) {
            return ...
        } else {
            // error
        }
    }

compared to:
    template < class T >
    T get( int index ) {
        if ( (index >= 0) && ( index < n_elements )) {
            return ...
        } else {
            // error
        }
    }

since it allows you to establish a hard relation. Even if a negative index is passed, it only requires one test.
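The single-test idea can be sketched as a standalone check. The name in_range is hypothetical, and the sketch assumes n_elements is at most half the range of std::size_t, which holds for any real container.

```cpp
#include <cstddef>

// Hypothetical helper: returns true when index is a valid position in a
// container holding n_elements items. Casting a negative int to
// std::size_t produces a huge value, so one comparison rejects both
// negative and too-large indices.
constexpr bool in_range(int index, std::size_t n_elements) {
    return static_cast<std::size_t>(index) < n_elements;
}

static_assert(in_range(0, 10), "first element is valid");
static_assert(in_range(9, 10), "last element is valid");
static_assert(!in_range(10, 10), "one past the end is rejected");
static_assert(!in_range(-1, 10), "negative index is rejected by the same test");
```

The same trick works when the interface keeps a signed parameter, so the caller's type and the check's type do not have to match.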

The decrementing while statement is tricky as well, since subtle flaws can lead to, as mentioned, infinite loops.

Overall, unsigned types are just fine for things that are unsigned. But they come with their own set of problems and gotchas.
Quote:Original post by Antheus
a[-1] isn't allowed, as such it makes no sense to use it.

    int a[2];
    int *b = a+1;
    b[-1] = 3;

I've actually seen this sort of thing on more than one occasion, usually in code converted from FORTRAN.

Anyways, I think we're both arguing the same side, sooo.
All this info scares me; it looks like these int/unsigned loops are not as inherently safe as they should be. Maybe a suitable solution is to use some new types for loops that check for these conditions, like Astrachan does in some of his classes, something like:

    for( isFirst; !isDone; isNext )
    {
        // Whatever...
    }
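A rough sketch of what such a loop type could look like. This is not taken from Astrachan's classes; Counter, reset, done, next, value and sum_range are all names invented here, and the range is assumed to be half-open, [first, last).

```cpp
// Hypothetical counter that owns its own bounds, so the loop body never
// touches the comparison or the increment directly.
class Counter {
public:
    Counter(int first, int last)
        : first_(first), last_(last), current_(first) {}

    void reset()       { current_ = first_; }           // back to the start
    bool done() const  { return current_ >= last_; }    // true once the range is exhausted
    void next()        { if (!done()) ++current_; }     // never steps past last
    int  value() const { return current_; }

private:
    int first_;
    int last_;
    int current_;
};

// Example use: sums the integers in [first, last).
inline int sum_range(int first, int last) {
    int total = 0;
    Counter c(first, last);
    for (c.reset(); !c.done(); c.next()) {
        total += c.value();
    }
    return total;
}
```

Here sum_range(0, 5) adds 0 through 4, and an empty or reversed range such as sum_range(5, 0) simply performs no iterations.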


--random_thinker
As Albert Einstein said: 'Imagination is more important than knowledge'. Of course, he also said: 'If I had only known, I would have been a locksmith'.
I definitely think it is a mistake to use an unsigned variable just because the values are expected to never be negative. Two reasons:
  • It gives you a false sense of security. As pointed out earlier, it doesn't catch or prevent problems, and the mistaken assumptions in some of the posts above illustrate the point.
  • Mixing signed and unsigned variables results in a lot of casting, which reduces readability and leads to bugs.

I use unsigned values for bit manipulation, for variables that need an extra bit of range, and for compatibility.

Does anyone use unsigned char if the value is expected to always fit in 8 bits? In other words, does anyone advocate doing this?
    for ( unsigned char i = 0; i < 10; ++i )
        ...
For the same reasons described above, I don't use char or short just because the values are expected to be in their ranges. I just use int. In fact I only use char for text and can't think of any reason for ever using short (if I need specific sizes, types such as __int8 and __int16 are more appropriate).

[Edited by - JohnBolton on September 16, 2007 2:06:00 AM]
John Bolton
Locomotive Games (THQ)
Current Project: Destroy All Humans (Wii). IN STORES NOW!
C++ is just an inherently unsafe language, and unbounded loops have the inherent problem (and benefit) of iterating for some unknown number of times that could depend on values being computed within the loop itself. It is up to the programmer to ensure that loops eventually terminate and at the proper time, while not performing undefined operations.

If you want to enforce really strict safety, I've read that Eiffel is designed for that sort of thing. It has ways to insert whatever kinds of checks you like.

Using a signed int on a 32-bit machine probably won't cause any harm, since it is difficult to imagine a loop counting past two billion. In the rare case that such a thing is necessary, it is a simple change to use an unsigned int, a long, or a bigint class.
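For reference, the limits in question can be checked at compile time. The constant names below are made up for illustration, and the fixed-width types are used so the numbers do not depend on how wide int happens to be on a given platform.

```cpp
#include <cstdint>
#include <limits>

// Largest values of the fixed-width types, stored in wider types so the
// comparisons below involve no surprises.
constexpr std::int64_t  kInt32Max  = std::numeric_limits<std::int32_t>::max();
constexpr std::uint64_t kUint32Max = std::numeric_limits<std::uint32_t>::max();
constexpr std::int64_t  kInt64Max  = std::numeric_limits<std::int64_t>::max();

static_assert(kInt32Max == 2147483647, "a 32-bit signed counter tops out near two billion");
static_assert(kUint32Max == 4294967295u, "going unsigned buys one more bit, about four billion");
static_assert(kInt64Max == 9223372036854775807LL, "a 64-bit signed type goes far beyond either");
```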

This topic is closed to new replies.
