How do I read names from files?

Started by
12 comments, last by GameMasterXL 18 years, 11 months ago
OK, post the file or a section of the file so I can see how to parse it, and I'll walk you through it and get an algorithm sorted out [smile]
Hiya, sorry about this, but my source code is mangled at the moment: I changed something and it stopped working, so I need to rewrite it, but I will tell you how it works. It is a recursive descent parser that reads up in a chain until it finds the operations specified for the token it has read in. It starts by reading the tokens into the evaluate function, and then it uses a function that checks what type of token it is and whether it is recognised. That's it so far. It uses a class for the parser, and all the functions are private class members. I am just confused: how would I make my parser see when an if comes along in my source code when it is reading character by character? I could do this: P, (, then the characters here, ), and these will be my tokens. I don't know how a normal parser works, and I sure can't figure out how to check for statements in the syntax. I hope that helps you figure it out. Thanks.
OK, well, it's still coming down to tokenizing the strings.
From what I can gather you are going to have to use the flexible CString class. The idea is not to read each character individually but to take a string as a whole and look at each section (this is why syntax is so important in compilers and parsers, because it's the only logical way to do it).
This could get quite complex since I have no idea of the syntax of the source you are trying to parse. Keep in mind that source code parsing can get very complex, and even a simple asm-style parser can take days to write.

Say you have the following string (avoiding the concept of whitespace trimming for simplicity; it's also a very simple string to parse):

if (x=10); goto blah; ...

You have several different tokens here: whitespace, brackets and the equals operator. You also have a keyword ('if') and a statement break (';').
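To give a feel for how those tokens could be picked out in real C++, here is a rough sketch. The names TokenKind, Token and tokenize are made up for illustration, and the keyword list ("if" and "goto") is only an assumption about your language. The point is that letters get grouped into whole words first, so the parser sees "if" as one token rather than 'i' followed by 'f'.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token categories for the example string above.
enum class TokenKind { Keyword, Identifier, Number, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split a source string into tokens, skipping whitespace.
// Letters and digits are grouped into whole words and numbers.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace only separates tokens
        } else if (std::isalpha(c)) {
            std::size_t start = i;
            while (i < src.size() &&
                   std::isalnum(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            // Assumed keyword set: only "if" and "goto".
            bool isKeyword = (word == "if" || word == "goto");
            tokens.push_back(
                {isKeyword ? TokenKind::Keyword : TokenKind::Identifier, word});
        } else if (std::isdigit(c)) {
            std::size_t start = i;
            while (i < src.size() &&
                   std::isdigit(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else {
            // Anything else is treated as a one-character symbol: ( ) = ;
            tokens.push_back({TokenKind::Symbol, std::string(1, src[i])});
            ++i;
        }
    }
    return tokens;
}
```

Once the input is a list like this, checking for a statement is just a matter of looking at the kind and text of the next token.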

We are going to use a pseudo-code version of substr() and strlen() quite a bit.
Here's the declaration:

string substr(string data, int Begin, int End)
string data
Here data is the string that is to be broken up
int Begin
Here Begin is the position of the character to start the substring
int End
Here End is the position of the character to stop the substring
Return
The function will return the substring created from the input string data


int strlen(string data)
string data
Here data will be the string that has its length checked.
Return
The function will return the number of characters
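If you want something close to these pseudo-code functions in actual C++, a minimal sketch might look like the following. The names substrRange and posOf are my own (posOf stands in for the Pos() lookup used further down), and I am assuming positions count from 0 and that End is inclusive. Note that the real std::string::substr takes a start position and a character count, not a start and an end.

```cpp
#include <algorithm>
#include <string>

// Pseudo-code substr(data, Begin, End): characters Begin..End inclusive,
// counting from 0. An End past the last character is clamped.
std::string substrRange(const std::string& data, int begin, int end) {
    if (begin < 0) begin = 0;
    int last = static_cast<int>(data.size()) - 1;
    end = std::min(end, last);
    if (begin > end) return "";
    // std::string::substr takes (position, count), so convert the range.
    return data.substr(begin, end - begin + 1);
}

// Pseudo-code Pos(ch): index of the first occurrence of ch, or -1 if absent.
int posOf(const std::string& data, char ch) {
    std::size_t p = data.find(ch);
    return p == std::string::npos ? -1 : static_cast<int>(p);
}
```

The pseudo-code strlen() needs no helper at all: data.size() does the same job for a std::string.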



First you need to get the statements (in pseudo-code):
string data = "if (x=10); goto blah;";
string CurrentStatement = substr(data, 0, Pos(';')-1);
string Rest = substr(data, Pos(';')+1, strlen(data));

So here we have split the string into the two statements:
if (x=10)
goto blah; ...
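The same split could be sketched with std::string directly. splitStatement is a hypothetical helper name, and returning the two halves as a pair is just one way to do it:

```cpp
#include <string>
#include <utility>

// Returns {current statement, rest}. If there is no ';', the whole input
// is treated as the current statement and the rest is empty.
// No whitespace trimming is done, so the rest keeps any leading space.
std::pair<std::string, std::string> splitStatement(const std::string& data) {
    std::size_t semi = data.find(';');
    if (semi == std::string::npos) return {data, ""};
    // Everything before the ';' and everything after it; the ';' is dropped.
    return {data.substr(0, semi), data.substr(semi + 1)};
}
```

Calling it again on the second half of the result gives you the next statement, which is all the outer loop needs.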

Now we need to break the current statement into even smaller components:
string str1 = substr(CurrentStatement, 0, Pos(' ')-1);
string str2 = substr(CurrentStatement, Pos(' ')+1, strlen(CurrentStatement));


Now we have:
if
(x=10)

We check to see what str1 is. It's 'if', so we go down a switch statement or an if-else chain until we hit the logic processing for 'if'. Since we know what the syntax is for an 'if' statement, it's easy to break it down (positions count from 0, as in the earlier calls):
if(str1 == 'if')
{
    string Statement = substr(str2, 1, strlen(str2)-2);
    ...


Since we know the first character will be a bracket we can skip it, and since we know the last character is a bracket too, we can create a substring running from the second character to the penultimate character.

Now 'Statement' is x=10. It's easy to break this down again, counting positions from 0 as before:
    string Variable = substr(Statement, 0, Pos('=')-1);
    string Value = substr(Statement, Pos('=')+1, strlen(Statement));
}


Now we have:
str1 = if
Variable = x
Value = 10
Rest = goto blah; ...
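Putting the whole breakdown of the 'if' statement into one place, a C++ sketch could look like this. IfParts and parseIf are names I have invented for the example, and it assumes the statement has exactly the shape "if (name=value)" with a single space after the keyword, just like the string above.

```cpp
#include <string>

// Hypothetical result type: the pieces pulled out of "if (name=value)".
struct IfParts {
    bool ok = false;       // false if the statement did not match
    std::string variable;  // text before '='
    std::string value;     // text after '='
};

// Parses a statement of the exact form "if (name=value)".
IfParts parseIf(const std::string& statement) {
    IfParts out;
    // Split the keyword from the rest at the first space.
    std::size_t space = statement.find(' ');
    if (space == std::string::npos) return out;
    std::string keyword = statement.substr(0, space);
    std::string rest = statement.substr(space + 1);
    if (keyword != "if") return out;
    // The rest must be wrapped in brackets; strip the first and last character.
    if (rest.size() < 2 || rest.front() != '(' || rest.back() != ')') return out;
    std::string condition = rest.substr(1, rest.size() - 2);
    // Split the condition at '=' into variable and value.
    std::size_t eq = condition.find('=');
    if (eq == std::string::npos) return out;
    out.variable = condition.substr(0, eq);
    out.value = condition.substr(eq + 1);
    out.ok = true;
    return out;
}
```

Unlike the pseudo-code, this version checks each assumption (a space, the brackets, the '=') before slicing, so a statement that is not an 'if' simply comes back with ok set to false instead of producing garbage.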

So here we have broken everything down into its component parts. You just loop round and do the same to the variable 'Rest' to get the next statement. You basically read in a line at a time, then a statement at a time and finally the components of that statement. Try to avoid reading it in a character at a time. If you do that, you will have to construct statements from tokens (hence the statement break ';'): you read into an array a character at a time until you reach a ';', at which point you stop reading and break down the statement as I described above. Once you have processed that statement you continue reading until you hit another ';', and so on.
If I were you I would read in one line at a time using C++'s iostream classes; it will make the job at hand a lot easier.
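As a rough idea of the line-at-a-time approach with the standard stream classes, something along these lines might do. readStatements is a made-up name, and the sketch assumes that no statement spans more than one line. Unlike the walkthrough above, it also trims the spaces around each statement.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads the stream one line at a time and splits each line at ';'.
// Assumes a statement never continues onto the next line.
std::vector<std::string> readStatements(std::istream& in) {
    std::vector<std::string> statements;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream parts(line);
        std::string piece;
        // std::getline with a delimiter hands back the text up to each ';'.
        while (std::getline(parts, piece, ';')) {
            std::size_t first = piece.find_first_not_of(" \t\r");
            if (first == std::string::npos) continue;  // nothing but whitespace
            std::size_t last = piece.find_last_not_of(" \t\r");
            statements.push_back(piece.substr(first, last - first + 1));
        }
    }
    return statements;
}
```

Because it takes a std::istream&, you can hand it a std::ifstream opened on your source file, or a std::istringstream while you are testing.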

Hope that helps.
I'm more than happy to clarify a bit more or tackle this another way if it's not what you wanted. If it's really not what you wanted, it would help a lot if you posted an example bit of the source code that you are trying to parse so I can make a proof-of-concept, as it were, for you to read through.

[EDIT]
Sorry, made a few 'pseudo-code' errors that could have caused great confusion. It's all fixed now [smile].
[/EDIT]

[Edited by - MotionCoil on May 26, 2005 8:32:58 PM]
Thanks for the reply, it's just that, lol, I am not well up on strings and string classes; I am not that far along in my book. I can do basic char strings by pointing to all the characters in char *str; or something [lol].

