GameMasterXL

How do I read names from files?


How do I read names from a file? For example, if my file contained Orange, Banana, Apple, how would I read each individual name? I want to read in the names and check whether each one is recognised; if it is, output it, and if not, ignore it.


Hi,

We might need more information. What language are you using? Is the file plain ASCII text or binary?

Later,

GCS584

However, the premise is the same:
+ You take in the data from the file and dump it into a variable.
+ You then 'tokenize' the data. This means that if you have a string like
"Orange Banana Apple"
and you want to get the individual words, you use whitespace (' ') as the token and run the following loop:

- Find the first instance of the token and split the string there. Copy everything before the token into one variable and everything after it into a second variable, then run the loop again on the second variable.
Well, that's how I do it anyway.

e.g. pseudo code (assuming you have read the data into the var Data):
var Data, String1, String2;
int pos;
Data = FileRead();
String2 = Data;                                       // String2 holds whatever is still unprocessed
loop
{
    pos = String2.Pos(' ');                           // position of the next space
    if(no space was found)
    {
        String1 = String2;                            // the last word has no space after it
        // use String1 here
        break;
    }
    String1 = substr(String2,0,pos-1);                // the word before the space
    String2 = substr(String2,pos+1,String2.Length()); // everything after the space
    // use String1 here
}

Well, that's kinda how you do it, off the top of my head.

Hope that helps in a kinda pseudo way.
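
For anyone doing this in C++, the find-and-split loop described above could be sketched with std::string roughly as follows. This is only a minimal sketch: the function name splitOnSpace is made up for illustration, and it assumes the words are separated by plain spaces.

```cpp
#include <string>
#include <vector>

// Split `data` on spaces using find/substr (hypothetical helper name).
// Empty pieces caused by leading, trailing or doubled spaces are skipped.
std::vector<std::string> splitOnSpace(const std::string& data)
{
    std::vector<std::string> words;
    std::string::size_type start = 0;
    while (start <= data.size())
    {
        std::string::size_type pos = data.find(' ', start);
        if (pos == std::string::npos)
            pos = data.size();                 // no more spaces: take the rest
        if (pos > start)
            words.push_back(data.substr(start, pos - start));
        start = pos + 1;                       // continue after the space
    }
    return words;
}
```

Calling splitOnSpace("Orange Banana Apple") gives back the three words in order. Note that std::string::substr takes a start position and a length, not a start and an end position.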

Quote:
Original post by MotionCoil
- Find the first instance of the token and split the string there. Copy everything before the token into one variable and everything after it into a second variable, then run the loop again on the second variable.
Well, that's how I do it anyway.


Well, I'll make the supposition that the names are in the file with line breaks [smile] as follows:


Apples
Oranges
Bananas



But hey, I could be wrong!

Later,

GCS584

Yeah, the names are on separate lines. It is in C++. I just need to get each individual name. Maybe I could read each character and, when I reach whitespace, exit and go to a new line, but I don't know if that will work. What if it was all on one line, though?

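Ah, this makes it a bit easier. Use IOStream objects:
#include <fstream>
#include <string>
using namespace std;

// ...inside a function that returns int:
string data;
ifstream file;
file.open(Filename, ios_base::in);   // Filename is the path to your text file

if (!file.is_open())
    return -1;                       // the file could not be opened

while(getline(file, data))           // reads one line per pass, stops at end of file
{
    // Do your processing here; data holds the current line without its newline
}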

Quote:

what if it was all on one line though


Then you use the method I stated before but use the above method to get the different lines. You need to do some string processing and tokenize the information to get the individual names out of different lines.

An example of string tokenizing

The String class is your friend[smile]
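
If the names on a line are separated only by whitespace, one alternative to splitting by hand is to let a string stream do it. This is a different technique from the find-and-copy loop described earlier, shown here as a rough sketch; wordsInLine is a made-up name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one line into whitespace-separated words using stream extraction.
std::vector<std::string> wordsInLine(const std::string& line)
{
    std::vector<std::string> words;
    std::istringstream stream(line);
    std::string word;
    while (stream >> word)          // operator>> skips any run of spaces or tabs
        words.push_back(word);
    return words;
}
```

You would call this on each line returned by getline, so it works whether the names are one per line or all on one line.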

Quote:
Original post by MotionCoil
Quote:

what if it was all on one line though


Then you use the method I stated before but use the above method to get the different lines. You need to do some string processing and tokenize the information to get the individual names out of different lines.


Yeah, he's quite right. There is also a nifty function called strtok(..). It makes that somewhat dirty stuff really easy.

A description and example of the function can be found here. In all honesty, this is my favorite function in C.

Later,

GCS584

[EDIT:] So basically, your best bet for my recommendation above (if the fruit is all on one line) would be to read all the characters in a single array and then split.
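
A rough sketch of the "read into a single array and then split" idea with strtok, called from C++. The wrapper name splitWithStrtok and the choice of space and comma as delimiters are assumptions for illustration.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Split `line` on spaces and commas with std::strtok (hypothetical wrapper).
// strtok writes '\0' into the buffer it scans, so work on a private copy.
std::vector<std::string> splitWithStrtok(const std::string& line)
{
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');                      // strtok needs a C string

    std::vector<std::string> names;
    for (char* token = std::strtok(buffer.data(), " ,");
         token != nullptr;
         token = std::strtok(nullptr, " ,"))     // nullptr: continue in the same buffer
    {
        names.push_back(token);
    }
    return names;
}
```

Two things to keep in mind: strtok modifies the array you hand it, and it keeps its position in hidden internal state, so you cannot tokenize two strings at the same time with it.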

Quote:

strtok(..)


Indeed, if you want to just stick with chars, or if you want it to look nice and neat, then definitely use strtok. My method is somewhat brute force [smile] Comes from many years of Borland C++ Builder before Visual C++ [wink]

Anyway, you should now be fully equipped to parse any string, no matter what kind of format it's in.
If you need practice to remember how to do it, try parsing some INI files; they are really easy [smile]

[Edited by - MotionCoil on May 25, 2005 6:10:39 PM]

Thanks for all the help. I was stuck because I am building a parser, and up to now I could do this basic command, P(Hello World), which would display Hello World. But my parser ignored whitespace, so if I put P(Hello World) it would read it in as HelloWorld. So I removed the part which ignored whitespace and it worked, but if I added a space between the function name and its brackets, P (Hello World), I would get a parse error :(. So I think this function will help, because now I will be able to read in the whole string that the function is to output, and hopefully be able to process if, else, for, etc.
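
One possible way to get both behaviours (spaces ignored around the function name, but kept inside the brackets) is to skip whitespace only while outside the brackets. The sketch below assumes the command always has the shape NAME ( ARGUMENT ) with no nested brackets; parseCall is a made-up name, not part of the parser described here.

```cpp
#include <cctype>
#include <string>

// Parse text of the form  NAME ( ARGUMENT ).
// Whitespace around NAME is skipped; whitespace inside the brackets is kept.
// Returns false if the text does not have that shape.
bool parseCall(const std::string& text, std::string& name, std::string& argument)
{
    std::string::size_type i = 0;
    const std::string::size_type n = text.size();

    // Skip whitespace before the name.
    while (i < n && std::isspace(static_cast<unsigned char>(text[i]))) ++i;

    // The name is a run of letters and digits.
    const std::string::size_type nameStart = i;
    while (i < n && std::isalnum(static_cast<unsigned char>(text[i]))) ++i;
    if (i == nameStart) return false;
    name = text.substr(nameStart, i - nameStart);

    // Skip whitespace between the name and the opening bracket.
    while (i < n && std::isspace(static_cast<unsigned char>(text[i]))) ++i;
    if (i == n || text[i] != '(') return false;
    ++i;

    // Everything up to the next ')' is the argument, spaces included.
    const std::string::size_type close = text.find(')', i);
    if (close == std::string::npos) return false;
    argument = text.substr(i, close - i);
    return true;
}
```

With this, both "P(Hello World)" and "P (Hello World)" yield the name P and the argument Hello World.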



[Edited by - GameMasterXL on May 26, 2005 2:44:52 PM]

Sorry about this, but I have run into yet another problem. I want to validate the input from the file, for example this:
if
If I wanted to see whether there is an if in a file, how would I check this when I am reading individual characters? This is what I am having problems with, since I can't read whole words and check whether a word is recognised. I can only read individual characters, so I don't know whether they are going to make up the correct word.

OK, post the file or a section of the file so I can see how to parse it, and I'll walk you through it and get an algorithm sorted out [smile]

Hiya, sorry about this, but my source code is mangled at the moment; I did something and it stopped working, so I need to rewrite it, but I will tell you how it works. It is a recursive descent parser that reads up in a chain until it finds the specified operations for the token it reads in. It starts by reading the tokens into the evaluate function, and then it uses a function that checks what type of token it is and whether it is recognised. So up to now that's it; it uses a class for the parser and all the functions are private class members. I am just confused: how would I make my parser see when an if comes along in my source code when it is reading character by character? I could do this: P, (, then the characters here, ), and these will be my tokens. I don't know how a normal parser works, and I sure can't figure out how to check for statements in the syntax. I hope that helps you figure it out. Thanks.

OK, well, it's still coming down to tokenizing the strings.
From what I can gather, you are going to have to use the flexible CString class. The idea is not to read each character individually but to take a string as a whole and look at each section (this is why syntax is so important in compilers and parsers, because it's the only logical way to do it).
This could get quite complex since I have no idea of the syntax of the source you are trying to parse. Keep in mind that source code parsing can get very, very complex, and even a simple asm-style parser can take days to write.

Say you have the string (avoiding the concept of whitespace trimming for simplicity; it's also a very simple string to parse)

if (x=10); goto blah; ...

You have many different tokens here: whitespace, brackets and the equals operator. You also have a keyword ('if') and a statement break (';').

We are going to use a pseudo-code version of substr() and strlen() quite a bit.
Here's the declaration:

string substr(string data, int Begin, int End)
string data
Here data is the string that is to be broken up
int Begin
Here Begin is the position of the character to start the substring
int End
Here End is the position of the character to stop the substring
Return
The function will return the substring created from the input string data


int strlen(string data)
string data
Here data will be the string that has its length checked.
Return
The function will return the number of characters



First you need to get the statements (in pseudo-code):

string data = "if (x=10); goto blah;";
string CurrentStatement = substr(data, 0, Pos(';')-1);
string Rest = substr(data, Pos(';')+1,strlen(data));

so here we have split the string up into the two statements:
if (x=10)
goto blah; ...

Now we need to break the current statement into even smaller components:

string str1 = substr(CurrentStatement, 0, Pos(' ')-1);
string str2 = substr(CurrentStatement, Pos(' ')+1, strlen(CurrentStatement));


Now we have:
if
(x=10)

We check to see what str1 is. It's 'if', so we go down a switch statement or an if-else chain until we hit the logic processing for 'if'. Since we know what the syntax is for an 'if' statement, it's easy to break it down:

if(str1 == "if")
{
string Statement = substr(str2, 1, strlen(str2)-2); // positions count from 0, as in the calls above

...


Since we know the first character will be a bracket, we can skip it, and since we know the last character is a bracket too, we can create a substring from the second character to the penultimate character.

Now 'Statement' is x=10. It's easy to break this down again:

string Variable = substr(Statement,0,Pos('=')-1); // positions count from 0, as in the calls above
string Value = substr(Statement,Pos('=')+1,strlen(Statement));
}


Now we have:
str1 = if
Variable = x
Value = 10
Rest = goto blah; ...
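
For reference, the breakdown above could look roughly like this with std::string. It is only a sketch under the same simplifying assumptions (one space after the keyword, no whitespace trimming, no nested brackets); the struct IfStatement and the function breakDownIf are made-up names.

```cpp
#include <string>

struct IfStatement              // hypothetical holder for the pieces
{
    std::string keyword;        // "if"
    std::string variable;       // "x"
    std::string value;          // "10"
    std::string rest;           // everything after the first ';'
};

// Break a string such as "if (x=10); goto blah;" into its parts.
// Returns false if the text does not match the simple shape assumed here.
bool breakDownIf(const std::string& data, IfStatement& out)
{
    // 1. The current statement is everything before the first ';'.
    const std::string::size_type semi = data.find(';');
    if (semi == std::string::npos) return false;
    const std::string current = data.substr(0, semi);
    out.rest = data.substr(semi + 1);

    // 2. The keyword is everything before the first space.
    const std::string::size_type space = current.find(' ');
    if (space == std::string::npos) return false;
    out.keyword = current.substr(0, space);
    if (out.keyword != "if") return false;
    const std::string bracketed = current.substr(space + 1);   // "(x=10)"

    // 3. Strip the opening and closing brackets.
    if (bracketed.size() < 2 || bracketed.front() != '(' || bracketed.back() != ')')
        return false;
    const std::string condition = bracketed.substr(1, bracketed.size() - 2);

    // 4. Split the condition around '='.
    const std::string::size_type eq = condition.find('=');
    if (eq == std::string::npos) return false;
    out.variable = condition.substr(0, eq);
    out.value = condition.substr(eq + 1);
    return true;
}
```

The main difference from the pseudo-code is that std::string::substr takes a start position and a length, and takes everything to the end of the string when the length is left out. Since nothing is trimmed, rest still begins with the space that followed the ';'.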

So here we have broken everything down into its component parts. You just loop round and do the same to the variable 'Rest' to get the next statement. You basically read in a line at a time, then a statement at a time, and finally the components of that statement. Try to avoid reading it in a character at a time. If you do, you will have to construct statements from tokens (hence the statement break ';'): you read into an array a character at a time until you reach a ';', at which point you stop reading and break down the statement as I described above. Once you have processed that statement you continue reading until you hit another ';', and so on.
If I were you I would read in one line at a time using C++'s iostream classes; it will make the job at hand a lot easier.

Hope that helps.
I'm more than happy to do a bit more clarification or tackle this another way if it's not what you wanted. If it's really not what you wanted, it would really help if you posted an example bit of the source code that you are trying to parse, so I can make a proof of concept, as it were, for you to read through.

[EDIT]
Sorry, made a few 'pseudo-code' errors that could have caused great confusion. It's all fixed now [smile].
[/EDIT]

[Edited by - MotionCoil on May 26, 2005 8:32:58 PM]

Thanks for the reply. It's just that, lol, I am not well up on strings and string classes; I am not that far along in my book. I can do basic char strings by pointing to all the characters in char *str; or something [lol].
