

Russell

This character in this file...I can't figure out what it is


I downloaded the official Scrabble word list to make a Scrabble-playing program, and there is a character separating the words, but I can't figure out what it is. In Notepad the character shows up as one of those little boxes, and in WordPad it acts as a newline, with each word on its own line. So I wrote a little C++ program to tell me what the ASCII value of the character was, and my program doesn't even recognize it. It acts like it's not even in the file. It only reads the alphabetic characters of the words. Here's my program:
  
#include <iostream>
#include <fstream>

using namespace std;

int main(int argc, char * argv[]) {

	char	input;
	int		limit = 100;
	int		n = 0;

	ifstream	infile("ospd.txt");

	for(n = 0; n < 20; n++) {
		infile >> input;
		cout << input << " (" << (int)input << ")" << endl;
	}

	return 0;
}  
and the output...
a (97)
a (97)
a (97)
a (97)
h (104)
a (97)
a (97)
h (104)
e (101)
d (100)
a (97)
a (97)
h (104)
i (105)
n (110)
g (103)
a (97)
a (97)
h (104)
s (115) 
and the text from the file...
aa
aah
aahed
aahing
aahs
 
It prints out only the alphabetic characters, and none of the special characters that separate the words. You can get the file I'm using here, if you care to take a look for yourself. I just need some way to determine where one word ends and the next begins. Thanks for your help. Russell

Okay, there are lots of easy ways to do this, and they're worth remembering as well.

Under MS/PC-DOS, OS/2, Win32, Win16 and so on, the end-of-line separator is the following pair of bytes (remember this):
13 (carriage return)
10 (line feed)

These are the two symbols that you couldn't see; in the DOS character set they show up as graphic glyphs (13 is a music note).

To see these, go to the command prompt (console), locate the file and type this:
edit /77 ospd.txt
This loads the file in binary mode (binary file mode meaning it shows you everything instead of acting on the ASCII formatting characters).

Note that the bottom right of the screen shows the value of the current character.


To write a program to read and test the file, try this (off the top of my head so hope there are no mistakes)
  
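#include <stdio.h>

int main(void)
{
 FILE *in;
 int c;

 /* "rb" = read in binary mode, so bytes 13 and 10 are not translated */
 in=fopen("ospd.txt","rb");
 if(in==NULL)
  return 1;

 /* fgetc returns EOF at the end; testing feof() first would print an extra -1 */
 while((c=fgetc(in))!=EOF)
  printf("%i\n",c);

 fclose(in);
 return 0;
}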






Beer - the love catalyst
good ol' homepage

errr, for your data file use this code instead so you don't have to wait for half an hour

    
#include <stdio.h>

int main(void)
{
 FILE *in;
 int i;

 in=fopen("ospd.txt","rb");
 if(in==NULL)
  return 1;

 /* print the values of the first 20 bytes only */
 for(i=0;i<20;i++)
  printf("%i\n",fgetc(in));

 fclose(in);
 return 0;
}


PS - first example I also left off a bracket.





Beer - the love catalyst
good ol' homepage

Edited by - Dredge-Master on February 4, 2002 8:04:24 PM

If you ever run across another such problem, just grab a hex editor and open the file up in that. I'm surprised no one else suggested this, actually.

I did show that.

You don't need a hex editor, just the binary editor.

Edit.com (or edit.exe depending on version) has a binary mode.

A hex editor is a viewer/editor which shows both binary and hex.

To view a character value, you do not need hex, unless your viewer doesn't support a value display.


edit /77 is also the easiest way to view any file under 8 MB.
It's fast, easy to navigate and change, everyone has it, and unless you need the hex support, it involves less fumbling and you see more. It shows the full ASCII set as well, which a lot of hex editors do not.



Beer - the love catalyst
good ol' homepage

Also, one important thing to note is that '\n' is the line-feed character 10, whereas '\r' is the carriage-return character 13. But when working with files in text mode, the '\n' is automatically converted to '\r\n' on writing (and '\r\n' is converted back to '\n' on reading).

~CGameProgrammer( );

The most important thing is the "rb" argument in fopen(). The 'b' means open in binary mode, because otherwise the file gets opened in text mode, which doesn't hand every byte to you.
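
The same idea carries over to C++ streams, where `std::ios::binary` plays the role of the 'b'. A small sketch that counts the raw 13 and 10 bytes in a file follows. The struct `LineEndCounts` and the function `countLineEndBytes` are names invented for this example.

```cpp
#include <fstream>
#include <string>

// Hypothetical result type for this sketch.
struct LineEndCounts {
    long cr = 0;  // bytes with value 13 ('\r')
    long lf = 0;  // bytes with value 10 ('\n')
};

// Counts raw CR and LF bytes. std::ios::binary is the iostream counterpart of
// the 'b' in fopen's "rb": no newline translation is applied.
// If the file cannot be opened, both counts stay 0.
LineEndCounts countLineEndBytes(const std::string& path) {
    LineEndCounts counts;
    std::ifstream in(path, std::ios::binary);
    char c;
    while (in.get(c)) {
        if (c == '\r') ++counts.cr;
        else if (c == '\n') ++counts.lf;
    }
    return counts;
}
```

If the two counts come out equal, the file most likely uses the 13, 10 pair on every line. A count of zero for 13 would suggest plain line feeds.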

quote:
Original post by CGameProgrammer
Also, one important thing to note is '\n' is the line-break character 10, whereas '\r' is the carriage-return character 13. But when working with files in text mode, the '\n' is automatically converted to '\r\n'.

~CGameProgrammer( );




Careful, that is for Windows only!

Unix uses only a line feed and (classic) Mac OS uses only a carriage return!!!

So when people write parsers, they should check for that!

    
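// c is the current byte, nextChar is the byte that follows it
if ( c == 13 )
{
    if ( nextChar == 10 )
    {
        // windows new line (13 followed by 10)
    }
    else
    {
        // mac new line (13 on its own)
    }
}
else if ( c == 10 )
{
    // unix new line!!! (10 on its own)
}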
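
Fleshed out a little, the check above could look like the following sketch, which splits a buffer that has already been read in binary mode. The function name `splitLines` is made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits text on "\r\n", lone '\r', or lone '\n'.
std::vector<std::string> splitLines(const std::string& text) {
    std::vector<std::string> lines;
    std::string current;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\r') {
            // CR LF (Windows): consume the LF too. A lone CR is classic Mac OS.
            if (i + 1 < text.size() && text[i + 1] == '\n') ++i;
            lines.push_back(current);
            current.clear();
        } else if (c == '\n') {
            // Lone LF: Unix.
            lines.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    // Keep a final line that has no terminator.
    if (!current.empty()) lines.push_back(current);
    return lines;
}
```

Treating 13 followed by 10 as a single break is what keeps a Windows file from producing an empty line after every word.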


Edited by - Gorg on February 5, 2002 2:02:37 AM
