agottem

In C#, is an 'int' really always 32 bits?


I'm trying to determine the minimum valid range of a C# int, and all sources seem to say an int is always 32 bits, no matter what. Is this really the case? It seems rather short-sighted of Microsoft to force all architectures supporting C# to virtualize the behavior of an int to be 32 bits. Also, if the size is forced to be 32 bits, is the endianness of the type also forced to be little endian? I really hope this isn't the case, since this seems like it'd be problematic for architectures where the ideal int size is, say, 31 bits.

Quote:

I'm trying to determine the minimum valid range of a C# int, and all sources seem to say an int is always 32 bits, no matter what. Is this really the case?

The C# 'int' is an alias for the CLR type System.Int32 which, as the name implies, is always 32 bits wide.
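
For what it's worth, this is easy to confirm from C# itself. A minimal sketch; nothing in it is implementation-specific:

using System;

class IntSizeDemo
{
    static void Main()
    {
        // 'int' and System.Int32 are the same type; the keyword is just an alias.
        Console.WriteLine(typeof(int) == typeof(Int32)); // True

        // sizeof(int) is a compile-time constant for the built-in types: always 4 bytes (32 bits).
        Console.WriteLine(sizeof(int)); // 4

        // The range is therefore fixed on every conforming implementation.
        Console.WriteLine(int.MinValue); // -2147483648
        Console.WriteLine(int.MaxValue); // 2147483647
    }
}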

Quote:

It seems rather short-sighted of Microsoft to force all architectures supporting C# to virtualize the behavior of an int to be 32 bits.

Why do you think that? The CLR is a virtualized runtime environment. It makes no difference if the defined size of a type is 32 bits or 76 bits. It's basically just as easy to implement.

Quote:

I really hope this isn't the case, since this seems like it'd be problematic for architectures where the ideal int size is, say, 31 bits.

"int" is a name. It's not a fundamental aspect of a computer architecture. The closest you get to something like an "ideal int size" is probably the register size of the GP registers on the chip, and dealing with a mismatch in the size of a fundamental type in the language and the size of a register is a well-known and more-or-less trivially solvable problem (and has been for some time now). In C or C++, for example, the size of the 'int' type doesn't always match the processor's register size, and this causes very little trouble.

Quote:
Original post by jpetrie
Why do you think that? The CLR is a virtualized runtime environment. It makes no difference if the defined size of a type is 32 bits or 76 bits. It's basically just as easy to implement.


Performance-wise, it does make a difference. If I'm running on an architecture that doesn't support a native 32-bit type, I now need to emulate 32-bit behavior such as overflow and underflow.

Why not just take the C approach, and define guaranteed minimums for these types? This would allow the compiler to use a more natural sized int, so long as it meets the minimum range requirements.

Yes, the System.Int32 data type is always represented by 32 bits. Imagine that. It's actually amazingly useful to be able to depend on sizes of the built-in data types. No, it's not forced to any sort of "endianness". The two concepts aren't really related.

If you manage to find an architecture that likes integers to be sized to 31 bits, you're not going to be running common and widely used operating systems and APIs on it anyway, so it's a moot point.
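
If you're curious what byte order the machine you're actually running on uses, BitConverter will tell you at run time; a small sketch:

using System;

class EndianDemo
{
    static void Main()
    {
        // The width of an int is fixed at 32 bits; the byte order is a property
        // of the platform the code happens to be running on.
        Console.WriteLine(BitConverter.IsLittleEndian);

        // The same 32-bit value, laid out in whatever order the hardware uses.
        byte[] bytes = BitConverter.GetBytes(0x01020304);
        Console.WriteLine(BitConverter.ToString(bytes)); // "04-03-02-01" on x86
    }
}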

C/C++ is a systems-level programming language, and it sometimes helps to let the data sizes vary so they better fit the hardware platform the compiler is targeting.

C# is an application-level programming language aimed at rapid development. It runs on a virtual machine, and it co-exists and inter-operates with all the .NET languages under the Common Language Runtime. It's nice to know things like type sizes for sure.

C/C++ is a portable assembly meant to run right on top of the hardware platform.

C# is part of the .NET tech. It runs on top of the CLR, not the hardware. You shouldn't have to care about these details.

--edit NINJA'D x 100

Quote:
Original post by Mike.Popoloski
Yes, the System.Int32 data type is always represented by 32 bits. Imagine that. It's actually amazingly useful to be able to depend on sizes of the built-in data types. No, it's not forced to any sort of "endianness". The two concepts aren't really related.

If you manage to find an architecture that likes integers to be sized to 31 bits, you're not going to be running common and widely used operating systems and APIs on it anyway, so it's a moot point.


You can have the same benefits of 'dependable sizes' by using minimum range guarantees. This provides the compiler with much greater flexibility.

Quote:

Performance-wise, it does make a difference. If I'm running on an architecture that doesn't support a native 32-bit type, I now need to emulate 32-bit behavior such as overflow and underflow.

What you're failing to grasp is that processors don't have "int types"; they have register sizes. Generally very few register sizes -- 32- or 64-bit GP registers and 80-bit floating-point ones, for example. Providing overflow/underflow behavior that the processor wouldn't produce naturally is something almost every language compiler has to deal with anyway. It's not any extra effort.
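
C# even lets you choose which overflow behavior you want per block of code, and the JIT emits whatever the target machine needs to honor it; a small sketch:

using System;

class OverflowDemo
{
    static void Main()
    {
        int x = int.MaxValue;
        int one = 1;

        // Unchecked arithmetic (the usual default) wraps around,
        // which is what most hardware does natively anyway.
        unchecked
        {
            Console.WriteLine(x + one); // -2147483648
        }

        // Checked arithmetic throws instead of silently wrapping.
        try
        {
            checked
            {
                Console.WriteLine(x + one);
            }
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}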

Quote:

You can have the same benefits of 'dependable sizes' by using minimum range guarantees. This provides the compiler with much greater flexibility.

No, minimum sizes do not afford the same benefits. Minimum sizes come at the expense of much more difficult interop (which is a critical aspect of the CLR) and more headaches on the part of the developer because the size is not assured.

Quote:
Original post by agottem
Quote:
Original post by Mike.Popoloski
Yes, the System.Int32 data type is always represented by 32 bits. Imagine that. It's actually amazingly useful to be able to depend on sizes of the built-in data types. No, it's not forced to any sort of "endianness". The two concepts aren't really related.

If you manage to find an architecture that likes integers to be sized to 31 bits, you're not going to be running common and widely used operating systems and APIs on it anyway, so it's a moot point.


You can have the same benefits of 'dependable sizes' by using minimum range guarantees. This provides the compiler with much greater flexibility.


No, you can't. Any situation where you're trying to port or interop with a different bit of code that requires you to match data type sizes is going to require a lot of extra work to ensure that you're using the right sizes as intended by the original writer of that system. For example, a lot of the lesser known file formats are documented by a simple C structure in source form. How exactly do you ensure that you are loading the right sized data from the file when your reference is giving things in ints and longs, which can have ANY size?
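
To make the point concrete: with fixed sizes, reading a binary record is just a matter of counting bytes. A minimal sketch for a hypothetical little-endian format ("record.bin", a 4-byte id followed by a 2-byte flags field, both made up for the example):

using System;
using System.IO;

class RecordReader
{
    static void Main()
    {
        // Hypothetical layout: Int32 id, then Int16 flags, little-endian.
        using (BinaryReader reader = new BinaryReader(File.OpenRead("record.bin")))
        {
            int id = reader.ReadInt32();      // always consumes exactly 4 bytes
            short flags = reader.ReadInt16(); // always consumes exactly 2 bytes
            Console.WriteLine("id={0} flags={1}", id, flags);
        }
    }
}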

Quote:
Original post by jpetrie
What you're failing to grasp is that processors don't have "int types"; they have register sizes. Generally very few register sizes -- 32- or 64-bit GP registers and 80-bit floating-point ones, for example. Providing overflow/underflow behavior that the processor wouldn't produce naturally is something almost every language compiler has to deal with anyway. It's not any extra effort.


I'm well aware of the low-level details. Ideally, an int would map to the architecture's ideal size for integer operations.

Are C# floats also set to a specific size?

Also, most C compilers do not need to provide virtualized overflow/underflow behavior.

Quote:
Original post by Mike.Popoloski
No, you can't. Any situation where you're trying to port or interop with a different bit of code that requires you to match data type sizes is going to require a lot of extra work to ensure that you're using the right sizes as intended by the original writer of that system. For example, a lot of the lesser known file formats are documented by a simple C structure in source form. How exactly do you ensure that you are loading the right sized data from the file when your reference is giving things in ints and longs, which can have ANY size?


If the advantage of forcing the size to be 32 bits is that you can now interact with files without having to worry about the sizes of integers that were stored, wouldn't the C# spec also need to define the endianness of the types?

It's simply a trade-off between programmer convenience and performance. The fewer such implementation-defined behaviors you can get away with, the fewer portability bugs you'll see.

If 31-bit machines were a real concern then Microsoft might well do something about it, but they aren't, so they won't. On the other hand, Java had to relax its floating-point requirements since they ended up being prohibitively expensive on x86 machines.

Quote:

If the advantage of forcing the size to be 32 bits is that you can now interact with files without having to worry about the sizes of integers that were stored, wouldn't the C# spec also need to define the endianness of the types?

Oh, yes, the CLR stores everything in little-endian order (with a few minor exceptions, mainly in the PE file format and in some packed length metadata). I'm fairly certain that applies to the CLR's built-in types (int32, int64, O, F, &, and native unsigned int), but I'd have to review the standard to be sure.

Which brings me to the next point: the CLR's internal type system (which consists of only the types I enumerated above) does include a "native unsigned int" type for those rare scenarios where it actually matters. That type is almost exclusively used (in my experience) as a pointer or address, which makes sense.
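
You can see the pointer-sized types from C# as well: IntPtr and UIntPtr correspond to native int and native unsigned int, and their size follows the process rather than the language; a small sketch:

using System;

class NativeIntDemo
{
    static void Main()
    {
        // int is always 4 bytes...
        Console.WriteLine(sizeof(int)); // 4

        // ...but IntPtr (native int) is 4 or 8 bytes depending on
        // whether this is a 32-bit or a 64-bit process.
        Console.WriteLine(IntPtr.Size);
    }
}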

Quote:
Original post by implicit
It's simply a trade-off between programmer convenience and performance. The fewer such implementation-defined behaviors you can get away with, the fewer portability bugs you'll see.

If 31-bit machines were a real concern then Microsoft might well do something about it, but they aren't, so they won't. On the other hand, Java had to relax its floating-point requirements since they were prohibitively expensive on x86 machines.


I guess I'm willing to leave it at that. Coming from a C mindset, I was kind of shocked to discover this... and kept searching Google thinking this couldn't really be the case.

Quote:

Ideally, an int would map to the architecture's ideal size for integer operations.

Ultimately, C# is so far removed from the actual chip executing the code that this makes little sense. It does not need to make any such mapping (and it would be stupid to), because all the intermediate layers in between can do it just fine for you. That, in fact, is part of the point of all the intermediate layers.

Quote:

Are C# floats also set to a specific size?

System.Single and System.Double are IEC 60559:1989 32- and 64-bit floats. The CLR's F type is a floating-point type of unspecified (native) size.
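
So the floating-point sizes are just as dependable as the integer ones; a minimal sketch:

using System;

class FloatSizeDemo
{
    static void Main()
    {
        // System.Single and System.Double have fixed sizes, regardless of
        // what the hardware's FPU registers happen to look like.
        Console.WriteLine(sizeof(float));  // 4
        Console.WriteLine(sizeof(double)); // 8
    }
}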

Quote:

(which consists of only the types I enumerated above)

I suppose I should clarify this: those are the core types, a CLR can support more but needs only those to be minimally functional.

Quote:
Original post by jpetrie
Ultimately, C# is so far removed from the actual chip executing the code that this makes little sense. It does not need to make any such mapping (and it would be stupid to), because all the intermediate layers in between can do it just fine for you. That, in fact, is part of the point of all the intermediate layers.


I understand C# is very far removed from the actual hardware. Do you agree there is a performance penalty to be paid for this restriction, though? Maybe not on x86, but for argument's sake, let's assume a 31-bit native integer size.

I'm still of the opinion that the C method of handling this is superior. A minimum bound is sufficient; a maximum bound is too restrictive.

Quote:

Do you agree there is a performance penalty to be paid for this restriction, though? Maybe not on x86, but for argument's sake, let's assume a 31-bit native integer size.

Assuming the most efficient implementation (i.e., not a brain-dead compiler or CLR implementation, et cetera) -- no. Maybe the very first time a particular piece of code is run, but that's the cost you pay for the JIT anyhow. After that, you're executing the same native code that a C or C++ compiler would have produced, so there is no difference.

Quote:
Original post by jpetrie
Assuming the most efficient implementation (i.e., not a brain-dead compiler or CLR implementation, et cetera) -- no. Maybe the very first time a particular piece of code is run, but that's the cost you pay for the JIT anyhow. After that, you're executing the same native code that a C or C++ compiler would have produced, so there is no difference.


I don't think that's true. If my native int type is 31 bits, extra assembly would need to be emitted for C# integer operations.

For example, take the following C# code:

int x = 0x7FFFFFFF;
int y = 0x00000001;
int z = x + y;

In x86 assembly, this can easily be mapped to:

mov eax, 0x7FFFFFFF ; x = 0x7FFFFFFF
mov ebx, 0x00000001 ; y = 0x00000001
add ebx, eax        ; z = x + y

Now, imagine that those registers were 31 bits wide instead of 32. That assembly would no longer be compliant with the C# spec. Eventually the C# virtual machine needs to translate its integer instructions into native machine code, and as it stands, for 32-bit architectures, that translation is trivial. It certainly wouldn't be for a 31-bit integer type.

Supporting platform-dependent sizes for int is not a zero cost thing, it has costs in code complexity. It's a little more difficult to write code when you can't just assume what the maximum size of an int is.

So, why would you pay for something that you're not going to use? The number of coders that need to support strange architectures is very small (compared to the rest of the software world). And among those people, the number of people that would choose C# rather than the more appropriate C or C++ is even smaller (perhaps zero). Why would you make life more difficult for the rest of the software community, just to support this extremely rare use case?

Couple of things here:
1. Pretty much any system where you would even consider using C# supports 32 bit integers. All others are most likely embedded systems where there is no CLR and hence running your C# code might pose a slight problem.
2. C# provides the standard integer types expected by most programmers: byte, short, int, and long, with their respective CLR types System.Byte, System.Int16, System.Int32, and System.Int64. This provides all the flexibility needed for any application, while providing the certainty of fixed type sizes. C/C++ type sizes have long posed a problem for library and API implementors, along with compiler vendors. Hence the plethora of compiler extensions such as __int32 and __int64 to support fixed-size types.
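
For reference, the whole keyword-to-CLR-type mapping is easy to enumerate; a minimal sketch:

using System;

class AliasSizes
{
    static void Main()
    {
        // Each C# keyword is an alias for a CLR type with a fixed size.
        Console.WriteLine("byte  -> {0}, {1} byte(s)", typeof(byte), sizeof(byte));   // System.Byte, 1
        Console.WriteLine("short -> {0}, {1} byte(s)", typeof(short), sizeof(short)); // System.Int16, 2
        Console.WriteLine("int   -> {0}, {1} byte(s)", typeof(int), sizeof(int));     // System.Int32, 4
        Console.WriteLine("long  -> {0}, {1} byte(s)", typeof(long), sizeof(long));   // System.Int64, 8
    }
}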

Quote:
Original post by pinacolada
Supporting platform-dependent sizes for int is not a zero cost thing, it has costs in code complexity. It's a little more difficult to write code when you can't just assume what the maximum size of an int is.

So, why would you pay for something that you're not going to use? The number of coders that need to support strange architectures is very small (compared to the rest of the software world). And among those people, the number of people that would choose C# rather than the more appropriate C or C++ is even smaller (perhaps zero). Why would you make life more difficult for the rest of the software community, just to support this extremely rare use case?


Maybe you can provide me with some insight, but the only cases I can think of where code complexity increases without an exact type size are file/network I/O. For those cases, I'd be happy defining an API through which the types being transferred had an exact size.

For nearly everything else, a minimum bound is generally all that's needed for a type.

You know, the only 31 bit system that I know of, the IBM System/370, still used 32 bit registers and arithmetic. 31 bits only referred to the address space.

Quote:
Original post by agottem

I know :( I'm new to C#, and from a performance perspective thus far, C# seems less portable than C.


C# is perfectly portable - it only needs to target a single (virtual) CPU. Then it's just a matter of implementing the CLR for an arbitrary platform. That is, one CLR per CPU (or architecture) supports *all* code ever written in C#.

Compare this to C code, where each source file needs to support and account for every platform it might be compiled for, as well as each compiler's specific behavior.

Quote:
Original post by SiCrane
You know, the only 31 bit system that I know of, the IBM System/370, still used 32 bit registers and arithmetic. 31 bits only referred to the address space.


31 bits is just the hypothetical architecture I am using to discuss what I think is a problematic area of C#. I'd like to make the point that minimum bounds on a type are sufficient; requiring both minimum and maximum bounds is too restrictive and even harmful.

This topic is now closed to further replies.