Jump to content
  • Advertisement
Sign in to follow this  
DeadXorAlive

Advice on (learning) parsing

This topic is 4726 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I'm interested in learning how to make and / or use a parser. What would be a good way to go? So far, I have written (from a book) only a very simple recursive descent parser (or so I am told), for a stack based calculator (in C++). Basically I'm looking for the big picture here: - Any useful tools. I've heard about yacc-lex, is it hard? - Any references, links, books. - What is involved in writing / using a parser that is extensible, what are the components? - How difficult is it, how much experience and time needed? What are the required programming skills? (I might drop the whole idea for now and learn those skills first) These are some project ideas that get me excited. Any comments are welcome: - Adventure games like the old Larry (eh excited, not intended). - Simple First Order Logic parser. I had an educational tool called Tarski's world where you could write FOL sentences about some geometric objects, whose truth-value would then be evaluated. In the long run, something like this in a game would be very nice. - Parsing and translating C++ in plain English. I suppose this is too much, but I like the idea.

Share this post


Link to post
Share on other sites
Advertisement
I recently went through and learned at least the basics of boost::spirit, and it was suprisingly easy. If you know regular expressions, at least a little, it's fairly easy to make something quite powerful with only a day or two of learning/experimentation.

Share this post


Link to post
Share on other sites
Quote:
Any useful tools. I've heard about yacc-lex, is it hard?


yacc stands for "yet another compiler compiler", it's used to compile a programming language grammar into a compiler. I don't think that will help.

My recommendation would be to read up on lexical analysis, which helps break a string into a sequence of tokens (or keywords), the tokens are then passed to a parser, that figures out what to do with the tokens.

Share this post


Link to post
Share on other sites
Thanks you both. I recently got boost installed and glanced over the regular expressions. Good Things Happen. Praise boost.
And eh, I really don't want to write yet another compiler indeed...

Share this post


Link to post
Share on other sites
Quote:
Original post by DeadXorAlive
Thanks you both. I recently got boost installed and glanced over the regular expressions. Good Things Happen. Praise boost.
And eh, I really don't want to write yet another compiler indeed...


Eh, just to be clear, I was talking about spirit, not boost's regular expressions. Spirit is something that takes a language definition [which is quite similar to, if not mathematically equivalent to regular expression 'code'] and generates a recursive descent parser from it.

Note that I'm no CS major, so am unsure if I'm using some of these terms right...

Share this post


Link to post
Share on other sites
Quote:
Original post by Telastyn
Note that I'm no CS major, so am unsure if I'm using some of these terms right...


Pretty much, boost spirit is a domain-specific embedded language (DSEL) in library form that allows near Extended Backus-Naur Form (EBNF) context free grammars directly in and with C++ constructs this gives the flexiablity to use the same enivoriment and mix it with other C++ libraries. BNF/EBNF is similar to regular expressions. Boost.Spirit achieves this using techniques such as expression templates, template metaprogramming, etc. Regular expressions can be used to specify a lexer for a language.

[Edited by - snk_kid on October 10, 2005 2:22:06 PM]

Share this post


Link to post
Share on other sites
Quote:
Original post by Telastyn
Eh, just to be clear, I was talking about spirit, not boost's regular expressions. Spirit is something that takes a language definition [which is quite similar to, if not mathematically equivalent to regular expression 'code'] and generates a recursive descent parser from it.


Even better. I understand that it's not neccesary to write the 'low-level' data structures like linked lists yourself, as I was doing with the calculator. I assume regular expressions are near universal, is that correct?
I think Java comes with regular expressions? They sure seem very useful to me.

Thanks for the links smr and snk_kid.

Share this post


Link to post
Share on other sites
Quote:
Original post by DeadXorAlive
Even better. I understand that it's not neccesary to write the 'low-level' data structures like linked lists yourself, as I was doing with the calculator.


Even if you needed a linked-list generally you don't need to write your own as languages typically provide them with there standard library, like C++ there is std::list, back to the point yes nowadays it is generally not necessary to write the structures/internals of a lexer/paser as it is such a mundane, repetitive task usually resulting in much boilerplate code. We have domain specific languages that are declarative and from this we can specify whats to be done as opposed how it should be done.

So we have libraries/tools that can take these specs and the code is automatically generated that does the work. Generally lexers will be specificed with regular expressions and parsers for programming languages will be specificed with context free grammers like BNF/EBNF.

Quote:
Original post by DeadXorAlive
I assume regular expressions are near universal, is that correct?
I think Java comes with regular expressions? They sure seem very useful to me.


I'm pretty sure in the lastest versions they do provide a regular expressions library.

Share this post


Link to post
Share on other sites
Regular expressions are... ubiquitous, though I might not go so far as call them universal. Not a few unix tools deal with them, some windows text editors, a few languages deal with them natively [perl for example], and most languages now can use them to some degree via library.

The actual syntaxes allowed are a little quirky between versions, though that is mostly historical as different implimentations developed independantly.

And yes, they're exceptionally useful; especially in little sys-admin tasks and command line voodoo.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!