Jump to content
  • Advertisement
Sign in to follow this  
LYH1

Few questions on writing a interpreter

This topic is 4080 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I am writing a pascal interpreter to control a game's event flow. It is the ll(1) and recrusive descent one, for study purpose. It only have ablity to run sum defined functions and access predefined variables. I rewrited the pascal bnf from google, only if-then-else statment part is left, no loop or declaration.. Some problem I cant slove during the parsing phase. First, how to do symmetric analysis at the same time, just do something when the production is true? Second, the lexer, I have no idea on it. Just skiping space the others left for the parser? Third, is it necessary to built a parse tree first, before interpreting?

Share this post


Link to post
Share on other sites
Advertisement
Quote:
Original post by LYH1
First, how to do symmetric analysis at the same time, just do something when the production is true?


This is the first time I hear about symmetric analysis. What is it?

Quote:
Second, the lexer, I have no idea on it. Just skiping space the others left for the parser?


The lexer builds tokens from characters, and "sends" them to the parser (how they are sent depends on your architecture). Write the code for building a token, and move on from there.

Quote:
Third, is it necessary to built a parse tree first, before interpreting?


It's easier, because you don't have to worry about parsing and interpreting at the same time.

Share this post


Link to post
Share on other sites
First:

I am not sure what you mean by symmetric analysis, perhaps I know of it under a different name. Could you describe what you mean by symmetric analysis?


Second:

Well the lexer would usually feed tokens to the parser.

So ":=" would be fed to the parser as ASSIGNMENT token.

variable and procedure names would be fed to the parser as some ID token (of course , a token would be an object containing its type and actual data so the parser can access the actual variable name once it sees that it is a an ID), for example parser could receive a token stream from the lexer like this one:

ID ASSIGNMENT CONSTANT SEMICOLON

that stream could correspond to:" temp:=0; "


Third:

Well, I think it is easier to interpret the code from an Abstract Syntax Tree (Parse Tree), but since you have such a simplified syntax (ll(1)) you probably could just interpret it as you parse it. You might be able to just parse the statements when you match a certain production rule (like variable assignment for example), using some intermediate tables but that might require that you impose some semantic rules (like that the function needs to be fully coded prior to its first use).

Hope i helped.

Share this post


Link to post
Share on other sites
It should be "semantic analysis", not "symmetric" sorry if confused you.

More about the lexer, if the lexer ate a 'identifier' but
the production wanted a 'char_list' only,
the lexer should roll back the charator stream and reproduce a token?
Or the production should tell the lexer it should take a 'char_list' or some types that predefined?

Share this post


Link to post
Share on other sites
Ideally, the lexer should be able to work independently of the parser. This means that, given some input, there is only one possible sequence of tokens possible for that input. The lexer then extracts this unique sequence and gives it to the parser. If the parser can't parse it, the file is invalid, because there is no other possible token sequence for that file.

Another possibility that is gaining popularity is to request from the lexer only a given set of tokens, and create a parsing error if none of these tokens are available, with a rule for deciding which token is chosen if several are possible. However, this greatly increases the complexity of the program.

Share this post


Link to post
Share on other sites
I get the idea of lexer now.
I have a last question, while dealing with the if-then-else statement,
the else part of the statement can be optional.
the else_part production eat a token and check if there is a 'else'.
if no 'else', epsilon move encounter.
My production checking method is first take a token from the lexer.
While epsilon move the unused token should store in a gobal variable or
unput it in lexer? If look ahead a previous token seems break the ll(1) rule.
Or rewrite the grammar, so no epsilon appear?

assume the if-then-else rule is:

if-statment -> 'if' expression 'then' statement else_part
else_part -> 'else' statement | empty

Isn't it a ll(1) grammar?

Share this post


Link to post
Share on other sites
Quote:
Original post by LYH1
I get the idea of lexer now.
If look ahead a previous token seems break the ll(1) rule.
Or rewrite the grammar, so no epsilon appear?

assume the if-then-else rule is:

if-statment -> 'if' expression 'then' statement else_part
else_part -> 'else' statement | empty

Isn't it a ll(1) grammar?


It depends how you defined statement. Because of empty production in the else_part it might be undecidable where the statement ends and where the "empty" else_part starts, if you know what I mean.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

Participate in the game development conversation and more when you create an account on GameDev.net!

Sign me up!