gsgeek

Any good introduction for a noob on parsing scripting languages with Python?




I want to start learning how to parse fragments of text (mostly snippets of code in a custom language) using Python. But I'm at a loss, since all the material I find seems quite advanced, or at least its terminology is too foreign to me.

So, is there any "for dummies"-style tutorial on parsing custom languages with Python? Could you point me to any good material? I know there are many packages for parsing (PyParse, modgrammar, etc.), but besides needing some introductory material, I am also not clear as to whether they would work with Python 3.6.

 

Any advice would be welcome. Thank you in advance.


I used Pygments to do some parsing. It's fairly straightforward and will tell you what token each word in a script is. Pygments supports a lot of languages; for example, I used it to figure out the names and arguments of the public functions in an ActionScript 3 class. That took me about 3 hours to write, and most of that time was not spent on integrating Pygments into the script.

Pygments is mostly used for language highlighters in text editors, and it also allows you to define the grammar it matches the string against.
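
To give an idea of what "telling you what token each word is" means, here is a minimal sketch using only the standard `re` module. It is not Pygments and does not use its API; the token names and the tiny expression language are made up for illustration.

```python
import re

# Hypothetical token types for a made-up mini language; adjust to your own.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
    ("ERROR",  r"."),          # anything not matched by the rules above
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(text):
    """Yield (token_type, token_text) pairs for each token in text."""
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue  # whitespace carries no meaning in this language
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at position {match.start()}"
            )
        yield kind, match.group()

tokens = list(tokenize("speed = 3 * (base + 1.5)"))
for kind, value in tokens:
    print(kind, value)
```

Each word of the input comes back labelled with its type, which is the same kind of information a highlighter works from. A real lexer library adds many more token types and per-language rules on top of this idea.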

You might want to look at abstract syntax trees to understand the meaning of the concepts mentioned in parsing.
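
One low-effort way to see what an abstract syntax tree looks like is Python's own standard `ast` module, which parses Python source (not a custom language) into a tree. The snippet below is only an illustration; the input line and its variable names are made up.

```python
import ast

# Parse one line of Python source into a tree of nodes.
tree = ast.parse("total = price * 2 + tax")

assign = tree.body[0]                     # the first (and only) statement
print(type(assign).__name__)              # Assign
print(type(assign.value).__name__)        # BinOp: the expression on the right
print(type(assign.value.op).__name__)     # Add: the outermost operator

# Walk every node in the tree and collect the variable names it mentions.
names = sorted(
    {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
)
print(names)                              # ['price', 'tax', 'total']
```

The tree records structure rather than text: the multiplication sits underneath the addition because it binds tighter, and whitespace is gone entirely. A parser for a custom language would typically build a similar tree out of its own node types.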

Edited by NightCreature83


There is "Text Processing in Python", which looks like a good starting point: http://gnosis.cx/TPiP/

I never read it myself, as a glance through the table of contents suggested I already know most of it.

 

Python versions are not that important here; plain string operations haven't changed much. In addition, newer Python versions typically get new things added and very few (if any) things removed, so you should be OK with any Python 3. In fact, even 2.6 and 2.7 will mostly work, as these versions already moved a lot towards Python 3. The biggest difference between 2 and 3 is that data 'outside' the program (bytes) and data 'inside' it (text strings) are now strictly separated. You run into that when writing or reading files, but those are standard patterns, so once you know them, it's no problem.
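
A small sketch of that 'outside' versus 'inside' split in Python 3, assuming the files are UTF-8 encoded (the file name and contents here are made up):

```python
import os
import tempfile

source = "name = 'café'\n"

# Writing in text mode: the encoding says how str becomes bytes on disk.
path = os.path.join(tempfile.mkdtemp(), "snippet.txt")
with open(path, "w", encoding="utf-8") as handle:
    handle.write(source)

# Reading in binary mode gives the raw 'outside' data: bytes.
with open(path, "rb") as handle:
    raw = handle.read()

# Reading in text mode decodes for you and gives 'inside' data: str.
with open(path, "r", encoding="utf-8") as handle:
    text = handle.read()

print(type(raw).__name__, type(text).__name__)   # bytes str
```

For parsing you normally want the `str` form, so opening files in text mode with an explicit `encoding` is usually all there is to it.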

 

As for parsers, I always use PLY ( http://www.dabeaz.com/ply/ ), which is "Python Lex & Yacc". Lex and Yacc are the de facto standard tools for writing parsers for production compilers. They are old tools, and the C versions are all over the Internet and in compiler construction books.
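
The Lex/Yacc way of working splits the job in two: a lexer turns characters into tokens, and a parser applies grammar rules to those tokens. The sketch below shows that split with a hand-written recursive-descent parser in plain Python. It is not PLY and does not use PLY's API (PLY generates the parser from the rules for you); the little arithmetic grammar and all the names are assumptions made for illustration.

```python
import re

def lex(text):
    """Lexer stage: split the input into number and operator tokens."""
    tokens = re.findall(r"\d+|[+\-*/()]", text)
    # Reject input containing characters the pattern above skipped.
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise SyntaxError(f"cannot tokenize {text!r}")
    return tokens

class Parser:
    """Parser stage for this hypothetical grammar:
        expr   : term (('+' | '-') term)*
        term   : factor (('*' | '/') factor)*
        factor : NUMBER | '(' expr ')'
    """
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Look at the next token without consuming it (None at end of input).
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        token = self.take()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)

def evaluate(text):
    """Run both stages and make sure the whole input was consumed."""
    parser = Parser(lex(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value

print(evaluate("2 + 3 * (4 - 1)"))  # 11
```

Each grammar rule became one method, which is why the docstring and the code line up so closely. With a generator tool you would write down only the rules and the action for each one.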
