# This character in this file...I can't figure out what it is

I downloaded the official Scrabble word list to make a Scrabble-playing program, and there is a character separating the words, but I can't figure out what it is. In Notepad the character shows up as one of those little boxes, and in WordPad it acts as a newline, with each word on its own line. So I wrote a little C++ program to tell me what the ASCII value of the character was, and my program doesn't even recognize it. It acts like it's not even in the file. It only reads the alphabetic characters of the words. Here's my program:

    #include <iostream>
    #include <fstream>
    using namespace std;

    int main(int argc, char * argv[])
    {
        char input;
        int limit = 100;
        int n = 0;
        ifstream infile("ospd.txt");
        for (n = 0; n < 20; n++)
        {
            infile >> input;
            cout << input << " (" << (int)input << ")" << endl;
        }
        return 0;
    }

and the output...
a (97)
a (97)
a (97)
a (97)
h (104)
a (97)
a (97)
h (104)
e (101)
d (100)
a (97)
a (97)
h (104)
i (105)
n (110)
g (103)
a (97)
a (97)
h (104)
s (115) 
and the text from the file...
aa
aah
aahed
aahing
aahs

It prints out only the alphabetic characters, and none of the special characters that separate the words. You can get the file I'm using here, if you care to take a look for yourself. I just need some way to determine where one word ends and the next begins. Thanks for your help. Russell

##### Share on other sites
It's an end-of-line marker.
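A likely reason the program in the question never prints that character: `infile >> input` is formatted extraction, which skips whitespace (including carriage return and line feed) by default. The sketch below contrasts it with unformatted `get()`, which hands back every character. The helper names `codes_with_extraction` and `codes_with_get` are made up for illustration, and a string stands in for the file so the example is self-contained.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Read up to `limit` characters with formatted extraction (operator>>).
// Like the program in the question, this skips whitespace, so CR and LF
// never show up in the result.
std::vector<int> codes_with_extraction(const std::string& text, int limit) {
    std::istringstream in(text);
    std::vector<int> codes;
    char ch;
    while (static_cast<int>(codes.size()) < limit && in >> ch) {
        codes.push_back(static_cast<unsigned char>(ch));
    }
    return codes;
}

// Read up to `limit` characters with unformatted get(), which returns
// every character, whitespace included.
std::vector<int> codes_with_get(const std::string& text, int limit) {
    std::istringstream in(text);
    std::vector<int> codes;
    char ch;
    while (static_cast<int>(codes.size()) < limit && in.get(ch)) {
        codes.push_back(static_cast<unsigned char>(ch));
    }
    return codes;
}
```

Calling `codes_with_get("aa\r\naah\r\n", 4)` should yield 97, 97, 13, 10, whereas `codes_with_extraction` on the same text yields only letter codes.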

##### Share on other sites
Okay, there are lots of easy ways to do this, and they're worth remembering as well.

Under MS/PC-DOS, OS/2, Win32, Win16 and so on, the end-of-line separators are as follows (remember this):
13 (carriage return)
10 (line feed)

These are the two symbols that you couldn't see, a music sign and a playing card.

To see these, go to the command prompt (console), locate the file and type this:
edit /77 ospd.txt
This loads the file in binary mode (meaning it shows you every byte instead of acting on the ASCII formatting characters).

Note that the bottom right of the screen shows the value of the current character.

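To write a program to read and test the file, try this (off the top of my head, so I hope there are no mistakes):

    #include <stdio.h>

    int main(void)
    {
        FILE *in = fopen("ospd.txt", "rb");   /* "rb": read, binary mode */
        int c;

        if (in == NULL)
            return 1;                          /* could not open the file */

        /* Print the numeric value of every byte until end of file. */
        while ((c = fgetc(in)) != EOF)
            printf("%i\n", c);

        fclose(in);
        return 0;
    }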

Beer - the love catalyst
good ol' homepage

##### Share on other sites
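Errr, for your data file use this code instead so you don't have to wait half an hour:

    #include <stdio.h>

    int main(void)
    {
        FILE *in = fopen("ospd.txt", "rb");
        int i;

        if (in == NULL)
            return 1;                          /* could not open the file */

        /* Print the values of the first 20 bytes only. */
        for (i = 0; i < 20; i++)
            printf("%i\n", fgetc(in));

        fclose(in);
        return 0;
    }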

PS - first example I also left off a bracket.


Edited by - Dredge-Master on February 4, 2002 8:04:24 PM

##### Share on other sites
Haha...sweet screenshot and everything. Thanks a lot.

Russell

##### Share on other sites
If you ever run across another such problem, just grab a hex editor and open the file up in that. I'm surprised no one else suggested this, actually.

##### Share on other sites
I did show that.

You don't need a hex editor, just the binary editor.

Edit.com (or edit.exe, depending on version) has a binary mode.

A hex editor is a viewer/editor which has binary and hex on it.

To view a character value, you do not need hex, unless your viewer doesn't support a value display.

edit /77 is also the easiest way to view any file under 8 MB.
It's fast, easy to navigate and change, everyone has it, and unless you need the hex support, it involves less fumbling and you see more. It shows the full ASCII set as well, which a lot of hex editors do not.


##### Share on other sites
Also, one important thing to note is '\n' is the line-break character 10, whereas '\r' is the carriage-return character 13. But when working with files in text mode, the '\n' is automatically converted to '\r\n'.

~CGameProgrammer( );

##### Share on other sites
The most important thing is the "rb" argument in fopen(). The 'b' means open in binary mode, because otherwise the file gets opened in text mode, which doesn't hand every byte to you.
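For the C++ stream version in the original question, the rough equivalent of the 'b' in "rb" is passing `std::ios::binary` when opening the stream. Here is a minimal sketch under that assumption; `first_bytes` is a hypothetical helper name, not something from the thread.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Return the numeric values of the first `count` bytes of the file at `path`.
// std::ios::binary plays the role of the 'b' in fopen's "rb": no newline
// translation is applied, and get() does not skip whitespace.
std::vector<int> first_bytes(const std::string& path, std::size_t count) {
    std::vector<int> values;
    std::ifstream in(path, std::ios::binary);
    char ch;
    while (values.size() < count && in.get(ch)) {
        values.push_back(static_cast<unsigned char>(ch));
    }
    return values;  // stays empty if the file could not be opened
}
```

Something like `first_bytes("ospd.txt", 20)` would then give the same kind of listing as the 20-byte C program earlier in the thread, assuming the file is in the working directory.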


##### Share on other sites
quote:
Original post by CGameProgrammer
Also, one important thing to note is '\n' is the line-break character 10, whereas '\r' is the carriage-return character 13. But when working with files in text mode, the '\n' is automatically converted to '\r\n'.

~CGameProgrammer( );

Careful, that is for Windows only!

Unix uses only a line feed, and Mac (classic Mac OS) uses only a carriage return!

So when people write parsers, they should check for that!

    if (c == 13) {                 /* carriage return */
        if (nextChar == 10) {
            /* Windows new line (CR followed by LF) */
        } else {
            /* Mac new line (CR only) */
        }
    } else if (c == 10) {
        /* Unix new line (LF only) */
    }

Edited by - Gorg on February 5, 2002 2:02:37 AM
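The check described in this post can be turned into a small line splitter, which is one way to find where each word in the list ends. This is only a sketch: `split_lines` is a made-up name, and it assumes the whole file has already been read into a string in binary mode.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split `text` into lines, accepting CR LF (Windows), a lone CR (classic Mac)
// and a lone LF (Unix) as line terminators.
std::vector<std::string> split_lines(const std::string& text) {
    std::vector<std::string> lines;
    std::string current;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\r') {
            // CR followed by LF is a single Windows line ending:
            // consume the LF as well.
            if (i + 1 < text.size() && text[i + 1] == '\n') {
                ++i;
            }
            lines.push_back(current);
            current.clear();
        } else if (c == '\n') {
            lines.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    // Keep a final line that has no terminator.
    if (!current.empty()) {
        lines.push_back(current);
    }
    return lines;
}
```

With the word list from the question, each element of the returned vector would be one word, regardless of which platform's line endings the file uses.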
