CinoGenX

Extended Ascii

Hi all,

Rather basic question...

I'm writing a little C++ console maze game. The 2D char array that holds the maze display is a mixture of normal and extended ASCII codes (e.g. 219 for walls).

I'm having a little trouble identifying these chars in my collision detection. I've looked around but have not found a definitive reason for [u]why[/u] I can't check for different extended ASCII codes, only workarounds to get it to work. I tried isascii and straight-up comparison operators. I don't think I'm doing anything wrong with the code (please excuse the lack of an example, I'm at work).

Should I look again (i.e. these should work fine), or is there some issue with basic comparison operators and extended ASCII codes?

Many thanks.

Are you using chars or unsigned chars?
I ran the following program using chars and it came up with AB|BA but with unsigned it does the expected AB219BA. (| is meant to be the block symbol)

chars are only 7 bits, because the top bit is always set, if I remember rightly, so 219 is really 91 (open square bracket).

[source lang="cpp"]#include <cstdio>

// Prints codes below 127 as characters and everything else as a number.
int CharacterDetectTest(void)
{
    static const int constMaxCharacters = 5;
    const unsigned char maze[constMaxCharacters] = {65, 66, 219, 66, 65};

    for (int i = 0; i < constMaxCharacters; ++i)
    {
        if (maze[i] < 127)
        {
            std::printf("%c", maze[i]);
        }
        else
        {
            std::printf("%d", static_cast<int>(maze[i]));
        }
    }
    std::printf("\nPress enter to exit\n");
    return std::fgetc(stdin);
}[/source]

It depends on how you are doing it. For example:
[code]
#include <iostream>

int main() {
    char c = 219;
    std::cout << (c == 219) << '\n';
    std::cout << static_cast<int>(c) << '\n';
}
[/code]
On a platform where plain char is signed and 8 bits wide, this yields the following output:
[quote]
0
-37
[/quote]
Moving to a named constant:
[code]
#include <iostream>

int main() {
    const char Wall = 219;
    char c = Wall;
    std::cout << (c == Wall) << '\n';
    std::cout << static_cast<int>(c) << '\n';
}
[/code]
Produces the expected comparison:
[quote]
1
-37
[/quote]

Finally, you can divorce your internal representation of the tiles from the visual representation:
[code]
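#include <iostream>

// Internal representation of a maze tile, independent of how it is drawn.
enum Tile {
    Wall,
    // ...
};

// Maps each tile to the character used to display it.
std::ostream &operator<<(std::ostream &stream, Tile tile) {
    char c = ' ';
    switch(tile) {
    case Wall: c = 219; break; // break so later cases cannot overwrite c
    // ...
    }
    stream << c;
    return stream;
}

int main() {
    Tile tile = Wall;
    std::cout << tile << '\n';
    if(tile == Wall) {
        // ...
    }
}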
[/code] Edited by rip-off

[quote]
Are you using chars or unsigned chars?
[/quote]
There are three types of characters in C++, char, unsigned char and signed char. Unlike the other integral types, the unqualified char is considered to be almost a distinct type, and whether it is signed or unsigned by default varies across compilers.

[quote]
chars are only 7 bits, because the top bit is always set, if I remember rightly, so 219 is really 91 (open square bracket).
[/quote]
Nope. Chars are CHAR_BIT bits (which must be at least 8). The "top" bit is not always set.

[quote name='FredOrJoeBloggs' timestamp='1343309069' post='4963294']
chars are only 7 bits, because the top bit is always set, if I remember rightly, so 219 is really 91 (open square bracket).
[/quote]
Not quite. ASCII uses only 7 bits (extended ASCII was later added to take advantage of the 8th bit). [font=courier new,courier,monospace]char[/font] (the C++ data type) is at least 8 bits, though it could be more on certain systems, and is meant to be usable for UTF-8 strings. The top bit (assuming 8-bit [font=courier new,courier,monospace]char[/font]s) is never set for regular ASCII. The top bit is, however, set for extended ASCII characters (because their codes are 128 or higher).

[edit]

ninja'd Edited by Cornstalks

[quote name='rip-off' timestamp='1343309854' post='4963300']
There are three types of characters in C++, char, unsigned char and signed char. Unlike the other integral types, the unqualified char is considered to be almost a distinct type, and whether it is signed or unsigned by default varies across compilers.
[/quote]
To be exact, [font=courier new,courier,monospace]char[/font] is a distinct type from [font=courier new,courier,monospace]signed char[/font] and [font=courier new,courier,monospace]unsigned char[/font], and whether it is signed or unsigned varies across compilers.

If you need to guarantee it is not a signed [font=courier new,courier,monospace]char[/font], use an [font=courier new,courier,monospace]unsigned char[/font]. That should resolve the OP's problem.

Perfect, thank you guys. Working now.

Still a little dizzy over why, as a char, the 8th bit makes it negative... but I'll study up on that later.

Thanks again.

[quote name='CinoGenX' timestamp='1343420042' post='4963746']
Still a little dizzy over why, as a char, the 8th bit makes it negative... but I'll study up on that later.
[/quote]
Because char is, on your platform, a signed type by default, and in two's complement the top bit effectively acts as a sign indicator: it is set for every negative value.
