How reliable are streams?

Started by
24 comments, last by Fruny 17 years, 9 months ago
Quote:Original post by Fruny
Quote:Original post by Lode
So will "value" now be equal to 49 on any platform, no matter what charset it uses, what country it's in, how many bits a char is, and so on?


No, only on platforms that use ASCII characters. If you were on, say, an EBCDIC system, the numeric value of the character '1' would be 241. The stream would still produce a '1', but the actual numeric value of the character is platform-specific.

Quote:Because if it isn't, then my "StringStream" would NOT be a "NIH", since it's then different in that it's independent of charset etc...


No. It is not. Your class would suffer from exactly the same problem in exactly the same way. '1' is not always equal to 49.


If my game would be copied to such an EBCDIC system, then the script file would be copied over too. This script file contains bytes, bytes of ascii characters. Would this file be converted to EBCDIC too?

I'm looking at files as "a list of 8-bit numbers", and those numbers never change and determine what happens when it's parsed, maybe this is a wrong way to look at files?
Quote:Original post by Lode
If my game would be copied to such an EBCDIC system, then the script file would be copied over too. This script file contains bytes, bytes of ascii characters. Would this file be converted to EBCDIC too?


Nope. The bytes would stay the same, but they would be interpreted differently. i.e. your script would be an unreadable mess, which neither your class nor the standard library could make sense of. Unless, that is, your game were copied as a binary (rather than recompiled from source), in which case the interpretation wouldn't change (barring dynamic linking), regardless of whether you used your own class or the standard streams.

Of course, the odds of you running into an EBCDIC system these days are close to nil. [smile]
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
Hmm, if, (say in the odd case where an EBCDIC system would compile and run the game :p), the C++ source code would be converted to EBCDIC, and the script file would also be converted to EBCDIC, would in that case, everything become readable again by the standard streams, and integers from the script file be interpreted correctly, and so on?

Also, is the "default locale" used by the C++ standard always the same, so would it be US ASCII even on an EBCDIC system, and also on a Chinese computer?
Quote:Original post by Lode
Hmm, if, (say in the odd case where an EBCDIC system would compile and run the game :p), the C++ source code would be converted to EBCDIC, and the script file would also be converted to EBCDIC, would in that case, everything become readable again by the standard streams, and integers from the script file be interpreted correctly, and so on?


Yes.

Quote:Also, is the "default locale" used by the C++ standard always the same, so would it be US ASCII even on an EBCDIC system, and also on a Chinese computer?


It would be the same on a US and on a Chinese computer, yes. Things would be different on an EBCDIC machine (since it would need to use different numeric values to encode the characters).

Incidentally, that's why C provides functions like isdigit() and isalpha() to classify characters rather than having you rely on their binary encoding. Aside from convenience, it protects you from differences in character encoding.
The default locale is the "C" locale, which means using '.' as the decimal point, no thousands separator, and generally behaving as if the numbers were interpreted the same way the compiler interprets them in C source code, in the system's native character set encoding.

However, the notion of treating files as streams of 8-bit numbers is not portable. Some platforms use 16-bit bytes; some use 32-bit bytes. Your file loading and reading might end up being very different on those compilers. For that matter, system calls (such as opening files by name) might differ depending on whether the operating system interprets char*s as ANSI character strings or UTF-8 Unicode character strings, even if both systems use 8-bit chars.

Porting programs to different platforms is non-trivial; however, you can best reduce your problems by using standard library functions and portable third party libraries that have already done the hard parts of getting the porting right. If you have to verify your own code for every tiny little bit then your platform porting is going to be a nightmare.
Quote:Original post by SiCrane
Porting programs to different platforms is non-trivial; however, you can best reduce your problems by using standard library functions and portable third party libraries that have already done the hard parts of getting the porting right. If you have to verify your own code for every tiny little bit then your platform porting is going to be a nightmare.


Quoted for emphasis.
