cr88192

Posted 20 March 2013 - 02:38 AM

Personally I was never a big fan of flex/bison.  If they don't suit your needs perhaps try Boost::Spirit.  For smaller/simple languages it's clean and succinct.

yeah, agreed.

I have generally held the opinion that flex/bison are often a bit overkill for what one is getting from them.


for very simple "languages", my general preference is basically to have a "split" function which takes a string and splits it up into an array of strings (tokens), and then the first token is used to look up the command (typically with some way to register new commands, if relevant).

or, if the syntax is not line-oriented, to basically just start reading off tokens and matching against various patterns.
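
for the line-oriented case, a rough sketch might look something like this (the helper and command names here are just made up for illustration, and a real version would probably keep the commands in some kind of registration table rather than a hard-coded if/else chain):

#include <stdio.h>
#include <string.h>

/* split a line into whitespace-separated tokens (destructive, strtok-style) */
static int split_tokens(char *line, char **toks, int max)
{
    int n = 0;
    char *p = strtok(line, " \t\r\n");
    while (p && n < max) { toks[n++] = p; p = strtok(NULL, " \t\r\n"); }
    return n;
}

/* an example command handler */
static void cmd_print(char **args, int nargs)
{
    for (int i = 0; i < nargs; i++) printf("%s ", args[i]);
    printf("\n");
}

/* look up the command by the first token and hand it the remaining tokens */
static void dispatch(char *line)
{
    char *toks[64];
    int n = split_tokens(line, toks, 64);
    if (n == 0) return;
    if (!strcmp(toks[0], "print")) cmd_print(toks + 1, n - 1);
    else printf("unknown command: %s\n", toks[0]);
}

int main(void)
{
    char buf[] = "print hello world";
    dispatch(buf);   /* prints: hello world */
    return 0;
}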


for more advanced syntax designs, my general preference has been for hand-written recursive descent parsers.
these scale in a fairly straightforward manner over a wide range of syntactic complexity.

for example, both my scripting language and C parser use recursive descent, ...
likewise, many common compilers (GCC, Clang, ...) use recursive descent, ...

granted, there is a little funkiness when it comes to languages with C-like syntax in a few areas (for example, parsing declarations, where something like "T * x;" can't be distinguished from an expression without knowing whether T names a type), and different languages have addressed this in different ways.


ADD: explanation: a recursive descent parser is basically just code for matching against tokens, typically organized as multiple functions which may call each other recursively to parse things (typically, each type of expression or statement gets its own function).
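
for example, a rough sketch of a recursive descent parser for simple arithmetic expressions, which evaluates directly rather than building a syntax tree (the names are made up, the tokens are assumed to already be split into a NULL-terminated array of strings, and there is very little error checking):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char **toks;   /* token array (NULL-terminated) */
static int pos;       /* current read position */

static const char *peek(void) { return toks[pos]; }
static const char *next(void) { return toks[pos++]; }

static int parse_expr(void);   /* forward declaration, the functions recurse */

/* value: a number or a parenthesized sub-expression */
static int parse_value(void)
{
    if (peek() && !strcmp(peek(), "(")) {
        next();                    /* consume '(' */
        int v = parse_expr();
        next();                    /* consume ')' */
        return v;
    }
    return atoi(next());
}

/* mul: value { '*' value } */
static int parse_mul(void)
{
    int v = parse_value();
    while (peek() && !strcmp(peek(), "*")) { next(); v *= parse_value(); }
    return v;
}

/* expr: mul { '+' mul } */
static int parse_expr(void)
{
    int v = parse_mul();
    while (peek() && !strcmp(peek(), "+")) { next(); v += parse_mul(); }
    return v;
}

int main(void)
{
    char *input[] = { "1", "+", "2", "*", "(", "3", "+", "4", ")", NULL };
    toks = input; pos = 0;
    printf("%d\n", parse_expr());   /* prints 15 */
    return 0;
}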

at a more basic level though, there will be a tokenization function, whose main job is mostly to read off and classify the various kinds of tokens. the tokenizer (or lexer) is typically what deals with the actual input characters: it works by matching various patterns in the character stream, and generally returns either individual tokens or an array of tokens.

(my parsers have most often read tokens one at a time, but many parsers work by splitting the whole input buffer into an array of tokens in advance).
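
a rough sketch of the read-one-token-at-a-time style (again, the interface here, where the caller passes in a buffer and gets back the updated input pointer, is just one possible way to structure it):

#include <ctype.h>
#include <stdio.h>

/* read the next token from 's' into 'tok', returning a pointer just past it,
 * or NULL at end of input. classification is by first character: numbers,
 * identifiers, and single-character punctuation/operators. */
static const char *next_token(const char *s, char *tok)
{
    while (*s && isspace((unsigned char)*s)) s++;   /* skip whitespace */
    if (!*s) return NULL;

    char *t = tok;
    if (isdigit((unsigned char)*s)) {                        /* number */
        while (isdigit((unsigned char)*s)) *t++ = *s++;
    } else if (isalpha((unsigned char)*s) || *s == '_') {    /* identifier */
        while (isalnum((unsigned char)*s) || *s == '_') *t++ = *s++;
    } else {                                                 /* punctuation */
        *t++ = *s++;
    }
    *t = 0;
    return s;
}

int main(void)
{
    const char *src = "foo = (bar + 42) * 3;";
    char tok[64];
    while ((src = next_token(src, tok)) != NULL)
        printf("[%s] ", tok);
    printf("\n");   /* [foo] [=] [(] [bar] [+] [42] [)] [*] [3] [;] */
    return 0;
}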
