nuclear123

int vs unsigned int


Quick question which might seem a little dumb, but... I understand you can represent a number in binary as negative or positive by using two's complement, with one bit as a sign bit. Say I want to use an unsigned int in my program: how does the computer know the sign bit is not being used? Where might this information be stored in the computer, letting the CPU know this number must be positive and unsigned?

It's not stored. It's up to your program to use that data in the correct manner. You can even do something along the lines of:



int value = -10;
int * ptr = &value;
unsigned int * temp = reinterpret_cast<unsigned int *>(ptr);  // reinterpret the same bytes as unsigned



At which point, if you use temp, then the compiler will treat the data in value's memory location as an unsigned int instead of a signed int. The only thing you've done here is tell the compiler "Treat this signed number as an unsigned number now." The memory signature of value still remains the same.
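
For concreteness, here's a minimal sketch of what that looks like when run (assuming a 32-bit int and two's complement representation, which is typical but not something the language guarantees):

#include <iostream>

int main()
{
    int value = -10;
    // Read the same four bytes back out through an unsigned int pointer.
    unsigned int * temp = reinterpret_cast<unsigned int *>(&value);

    std::cout << value << "\n";   // -10
    std::cout << *temp << "\n";   // 4294967286, i.e. 0xFFFFFFF6, given a 32-bit two's complement int
    return 0;
}

The bits never change; only the interpretation the compiler applies to them does.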

Quote:
Original post by nuclear123
Where might this information be stored in the computer, letting the CPU know this number must be positive and unsigned?


Where it matters, the instruction itself specifies how the data is operated on.

E.g. there are different multiply instructions for signed and unsigned operands.
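
As a rough C++ illustration of why separate instructions are needed (this is just what the arithmetic looks like at the source level, not the exact code a compiler would emit): the same bit pattern produces different full-width products depending on whether it is treated as signed or unsigned.

#include <cstdint>
#include <iostream>

int main()
{
    // The same 32-bit pattern, 0xFFFFFFFE: -2 when read as signed, 4294967294 as unsigned.
    std::int32_t  s = -2;
    std::uint32_t u = 0xFFFFFFFEu;

    // Widening to 64 bits before multiplying shows that the full-width
    // products differ even though the low 32 bits of the inputs are identical.
    std::cout << std::int64_t(s) * 3 << "\n";    // -6
    std::cout << std::uint64_t(u) * 3 << "\n";   // 12884901882
    return 0;
}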

Quote:
Original post by Rycross
It's not stored. It's up to your program to use that data in the correct manner. You can even do something along the lines of
To expand on this, whether the sign is used or not depends on the instructions used and not the data itself. So in a way it is stored, but as part of the program itself and not as extra data that the program checks before using the data type.

Quote:
Original post by nuclear123
How does the computer know the sign bit is not being used? Where might this information be stored in the computer, letting the CPU know this number must be positive and unsigned?


The CPU doesn't care. It takes some bits from memory, changes them, and writes some bits back. What those bits mean, the CPU neither knows nor cares; that is up to the programmer.

Therefore, the compiler will generate different opcodes depending on which types the programmer *declared*. If one writes 'unsigned' vs. 'signed', the compiler will generate different instructions for the operations that touch those types.
int a, b;
int c = a*b;
Here, the compiler would generate IMUL.

unsigned int a, b;
unsigned int c = a*b;
Here, it would generate MUL.

But it gets tricky:
unsigned short a;
signed int b;
unsigned long c = a*b;
Here there are rules defined by C++ (the usual arithmetic conversions) which determine how these types get promoted or converted. After the compiler adjusts the types, it generates the actual assembly just like in the previous examples.
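
A small sketch of those conversions in action (the exact value printed depends on the width of unsigned long on the platform; the comment below assumes a 64-bit unsigned long):

#include <iostream>

int main()
{
    unsigned short a = 2;
    signed int     b = -3;

    // Usual arithmetic conversions: 'a' is promoted to int, the multiply is
    // done as signed int (giving -6), and only then is the result converted
    // to unsigned long for the assignment.
    unsigned long c = a * b;

    std::cout << c << "\n";  // 18446744073709551610 with a 64-bit unsigned long
    return 0;
}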


But as far as the CPU is concerned, all of the above is done on exactly the same registers, each of which contains 32 or 64 bits. It's up to the user to interpret what those bits mean. They might be numbers, they might be signed, or they might be something else entirely. The compiler helps a lot with that.

You say: "This variable is unsigned"
The compiler says: "Okay, I'll generate instructions that treat that value as an unsigned number wherever I see it used".

And you can take this even "further" and ask how the computer knows whether those 4 bytes are an integer value or a float. The answer would be the same.
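
For example, a quick sketch of that (assuming the usual 32-bit IEEE 754 float, which is what virtually all current hardware uses): the same four bytes print as completely different values depending on the type you read them through.

#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    float f = 1.0f;
    std::uint32_t bits;

    // Copy the raw bytes of the float into an integer and print both
    // interpretations of the same 32 bits.
    std::memcpy(&bits, &f, sizeof bits);

    std::cout << f << "\n";      // 1
    std::cout << bits << "\n";   // 1065353216, i.e. 0x3F800000, under IEEE 754
    return 0;
}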

Quote:
Original post by nuclear123
I understand you can represent a number in binary as negative or positive by using two's complement, with one bit as a sign bit.

Bit of a nitpick here: two's complement does not use a 1-bit sign. A simple example to illustrate this: positive and negative 1 would look like this in 8-bit binary:

00000001 = +1 decimal
11111111 = -1 decimal

Two's complement negation follows this rule: N = NOT(P) + 1


One's complement does not have a dedicated sign bit either:

00000001 = +1 decimal
11111110 = -1 decimal

One's complement negation follows this rule: N = NOT(P)
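
Both rules are easy to check in code; here's a minimal sketch on an 8-bit value (the casts just keep the arithmetic in 8 bits after integer promotion):

#include <cstdint>
#include <iostream>

int main()
{
    std::uint8_t p = 1;

    // Two's complement negation: invert all bits, then add one.
    std::uint8_t twos = std::uint8_t(~p + 1);   // 11111111, i.e. -1 in two's complement
    // One's complement negation: just invert all bits.
    std::uint8_t ones = std::uint8_t(~p);       // 11111110, i.e. -1 in one's complement

    std::cout << int(twos) << " " << int(ones) << "\n";   // 255 254
    return 0;
}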

Most modern architectures only use two's complement arithmetic. As for signed vs unsigned arithmetic, how does the CPU know which one to use?

Since CPUs assume a two's complement system, most instructions (add, sub, ...) work on both signed and unsigned operands in the same manner, without any special treatment of the data in the registers. There may be a few instructions that are sign specific (such as imul, idiv, ...), but those are exceptions. What really matters is how the arithmetic results are interpreted. For instance, comparing two numbers as unsigned can yield different results than comparing them as signed.

When you declare things in a high level language, such as C/C++, you already specify what kind of integer type you want for your variables. The compiler will pick and choose the right assembly instructions to match your integer type, particularly when it comes to "if" compare statements.

Example:


int a = -1;
int b = 1;
if (a == b) {} /* Compiler will choose signed compare instructions*/

unsigned int c = 2;
unsigned int d = 3;
if (c == d) {} /* Compiler will choose unsigned compare instructions */

int e = -2;
unsigned int f = 3;
if (e == f) {} /* Compiler should generate a warning about sign type mismatch, and falls back to signed compare */


Quote:
Original post by Tachikoma
Bit of a nitpick here, two's complement does not use a 1-bit sign.


To nitpick further, it depends on which definition of "sign bit" one is using.

Under the "bit which gets toggled to change the sign" definition, you are correct.

But under the looser "bit which indicates the sign" definition (which is the more widely-used definition), 2's complement does have a sign bit, as the most significant bit will always be set for negative and clear for positive. This is true even though 2's complement negation is more complicated than merely toggling said bit.

And as another further nitpick, the statement "Most modern architectures only use two's complement arithmetic" is only strictly true for integer math - floating-point numbers are almost universally represented in IEEE 754 format, which is a sign-and-magnitude format in which toggling the sign bit does negate the value (and thus, both definitions above apply). And to further nitpick on that, the floating-point exponent is represented in excess-n, and the specific bias used (127 for float, 1023 for double) means that the exponent actually doesn't have a sign bit under either definition.

Furthermore, strictly speaking the compiler will only issue different instructions for signed vs. unsigned for a limited subset of operations: multiply, divide, and the ordered comparisons (less than, greater than, and so on; equal-to and not-equal-to tests are the same either way).

Addition, subtraction, and comparison (i.e. subtraction that discards the numerical result) produce identical bit patterns for unsigned and 2's complement operands; the only difference is how numerical overflow is indicated (unsigned overflow if there is a carry out of the highest bit, signed overflow if the carry out of the highest bit differs from the carry into the highest bit). Because of this, on most CPUs there is just a single instruction for each of those operations, which sets both the signed AND unsigned overflow flags. 0xFFFF plus 0x000F is always 0x000E (with carry out and no signed overflow), and 0xFFFF minus 0x000F is always 0xFFF0 (with no borrow out and no signed overflow).
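
A quick sketch of that in C++ (not CPU instructions, but it shows the identical bit patterns; the casts force the result back into 16 bits):

#include <cstdint>
#include <iostream>

int main()
{
    // 0xFFFF plus 0x000F, done through unsigned and signed 16-bit types.
    std::uint16_t u = std::uint16_t(0xFFFFu + 0x000Fu);       // 65535 + 15 wraps to 0x000E
    std::int16_t  s = std::int16_t(std::int16_t(-1) + 15);    // -1 + 15 = 14 = 0x000E

    std::cout << std::hex << u << " " << s << "\n";           // e e
    return 0;
}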

This can't be done with multiplication or division, because the signed and unsigned versions produce different bit patterns. For example, 0xFFFF times 0x000F is 0xFFFFFFF1 signed, but 0x000EFFF1 unsigned (the result of a multiplication instruction is twice the width of the operands on almost all CPUs that support multiplication), while 0xFFFF divided by 0x000F is 0x1111 remainder 0x0000 unsigned, but either 0x0000 remainder 0xFFFF or 0xFFFF remainder 0x000E (depending on implementation) signed.

Thus the three comparison examples you gave will all issue the same instruction sequence - "CMP; JNE" on an Intel CPU, for example. (Signed and unsigned equality are indicated in exactly the same way - after the CMP, the zero flag will be set if the two values were equal, and clear otherwise.) For the same reason, if you were testing for inequality using !=, this would still be true - the compiler would issue a "CMP; JE" sequence.

This would be different with, for example, less than. In this case, the signed version would be a "CMP; JGE" sequence, while the unsigned comparison would result in a "CMP; JAE" sequence.
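
Seen from C++ rather than assembly, the same distinction looks like this (a minimal sketch; the compiler picks the signed or unsigned flavor of the comparison purely from the declared types):

#include <cstdint>
#include <iostream>

int main()
{
    // The same 32-bit pattern, 0xFFFFFFFF: -1 as signed, 4294967295 as unsigned.
    std::int32_t  s = -1;
    std::uint32_t u = 0xFFFFFFFFu;

    std::cout << (s < 1) << "\n";    // 1: signed compare, -1 is less than 1
    std::cout << (u < 1u) << "\n";   // 0: unsigned compare, 4294967295 is not less than 1
    return 0;
}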

Quote:
Original post by Anthony Serrano
To nitpick further, it depends on which definition of "sign bit" one is using.

Under the "bit which gets toggled to change the sign" definition, you are correct.

But under the looser "bit which indicates the sign" definition (which is the more widely-used definition), 2's complement does have a sign bit, as the most significant bit will always be set for negative and clear for positive. This is true even though 2's complement negation is more complicated than merely toggling said bit.

Sure, one's and two's complement representations have the MSB set for negative integers, and it can be used as an indicator of negativity. Realistically speaking, though, the sign in such numbers should be regarded as a form of encoding, since negation can affect all bits in the number.

I think the "sign bit" definition should be reserved just for a 1-bit flag exclusively. It is a clarification of the underlying structure of the system.

If you evaluate all the bits in the integer with pen and paper, the MSB suddenly takes on a very different meaning, depending on the number system you are working with.

Therefore, it would be a serious misnomer to regard the MSB as the "sign bit" in a two's complement system, when in reality it is a "weight bit" with special significance. Otherwise, one could imply that flipping this bit would correctly negate the number in two's complement, which we all know is false.
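
As a worked illustration of that weight view: in an 8-bit two's complement number the MSB carries a weight of -128 rather than +128, so 11111111 = -128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = -1, which matches the earlier example. Flipping just the MSB of 00000001 gives 10000001 = -128 + 1 = -127, not -1, so that bit alone does not negate the number.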


Quote:
And as another further nitpick, the statement "Most modern architectures only use two's complement arithmetic" is only strictly true for integer math - floating-point numbers are almost universally represented in IEEE 754 format, which is a sign-and-magnitude format in which toggling the sign bit does negate the value (and thus, both definitions above apply). And to further nitpick on that, the floating-point exponent is represented in excess-n, and the specific bias used (127 for float, 1023 for double) means that the exponent actually doesn't have a sign bit under either definition.

Yep, I'm aware of that, and I was focusing on integers, as per topic.
