Parsing

Started by
8 comments, last by AliasNotFound 20 years, 8 months ago
I''m currently working through the book "Programming Language Processors in Java, Compilers and Interpreters", attempting to implement a recursive descent parser. I''m having a problem with the way they are implementing expressions. Here are the production rules for an expression: Expression ::= primary-Expression (Operator primary-Expression)* primary-Expression ::= Integer literal | Identifier | Operator primary-Expression | ( Expression ) Using this gramnmer allows an expression such as 5++6 to be legal. In fact, you can string as many operators together as you want to and it parses correctly, which I don''t want. I removed the rule ''Operator primary-Expression'' from the rules for ''primary-Expression'' and that seemed to solved the problem, but now of course I can''t begin an expression with an operator, such as when negating an expression. Just in case anyone asks, this is not a homework assignment, I''m merely trying to teach myself how to write a compiler. Phil
Advertisement
a simple recursive descent grammar for an expression parser could be as follows:

expression ::= term (''+'' term)*
| term (''-'' term)*

term ::= factor (''*'' factor)*
| factor (''/'' factor)*

factor ::= INTEGER
| ''('' expression '')''
| ''-'' factor


If you think about it, this grammar couldn''t possibly allow the expression "5++5", because of the predictive nature of LL grammars. 5 would be parsed as a factor, which in turn can be parsed as a single term. When parsing an expression, the parser expects that an expression is a term followed by an additive operator followed by another term. Thus, when it encounters the second +, instead of the term, it gives an error and parsing fails.
daerid@gmail.com
Thanks dearid, breaking it up into factors and terms makes more sense.
Phil
Although, it does make semantic sense to allow 5++6 since it states what we normally imply (that 6 is a positive number).

It wouldn''t make sense to have 5**6, though.
quote:Original post by flangazor
It wouldn''t make sense to have 5**6, though.


I''m sure I''ve seen that notation to mean "five to the power of six" (or, 5^6) somewhere along the line...too bad I don''t remember where it was, though. Just goes to show, sweeping generalizations are always wrong.

-Odd the Hermit
Python has the syntax 5**6 to mean 56.
quote:Original post by flangazor
Python has the syntax 5**6 to mean 56.


Ah. Thanks for the tip--it reminded me where I ran across it. Perl does the same thing.

-Odd the Hermit
Python even has x//y, floor divide, which is the same as x/y for integers and float(int(x/y)) if x or y is a floating point.
You need to make a distinction between binary and unary operators.
John BoltonLocomotive Games (THQ)Current Project: Destroy All Humans (Wii). IN STORES NOW!
quote:Original post by JohnBolton
You need to make a distinction between binary and unary operators.


And tertiary operators. Amazing how often people forget that operators can take more than two arguments...

-Odd the Hermit

This topic is closed to new replies.

Advertisement