Reading a context free grammar in c++?

Started by
27 comments, last by Telastyn 14 years ago
Quote:Original post by Sneftel
Quote:Original post by alvaro
Actually, it would be really useful to have a library to which you give a grammar and a piece of text and it returns a tree.
I'm not sure what that would be useful for. With no semantic backend, you wouldn't be able to go any further than the tree. Nevertheless, you can certainly do it.. just pass the output of yacc to tcc.


It could be templated on a factory and a treenode type. You pass it a factory so that it generates tree nodes with whatever semantics you want.
Advertisement
I've used Lex and Yacc for this sort of thing before, but that's C only. The extremely similar C++ equivalents are Flex and Bison.
Quote:Original post by Malevolence
I've used Lex and Yacc for this sort of thing before, but that's C only. The extremely similar C++ equivalents are Flex and Bison.

Actually, both flex and bison output C code (though I think bison can also do C++). The difference between lex and flex, and between yacc and bison, is that the former are POSIX standards for behavior (based on early actual implementations), whereas the latter are modern implementations of the standard (plus some add-on behavior).
Flex will also output C++ code if you use the -+ command line option or "%option c++".
Quote:Original post by dmatter
Don't think that you're limited to regular grammars though, expressions are allowed to be recursive so it's pretty simple to construct context-free grammars too.


I didn't know this.

Xpressive lets you create an expression that includes itself as a sub-expression? Or a grammar of n mutually recursive expressions?
Quote:Original post by jwezorek
I didn't know this.

Xpressive lets you create an expression that includes itself as a sub-expression? Or a grammar of n mutually recursive expressions?
Certainly.
It's essentially a hybrid between boost::spirit and boost::regex. You can build spirit-style static grammars with it, then mix and match those with the dynamic regex-style expressions. By embedding expressions in one another by-reference they can be made to be recursive, they give the classic example in the docs of matching nested brackets.
Quote:Original post by dmatter
Certainly.
If I remember correctly it's built on top of boost::spirit. You can definitely build spirit-style static grammars with it too, then mix and match dynamic and static expressions together into a complete grammar.

Oh, if it's on top of spirit then it's going to have spirit's big problem: crazy compile times. That's what I was trying to get around.

I guess what I want is something like spirit but that is a less extreme example of template metaprogramming.

Like just have a banal polymorphic hierarchy of parser objects where each parser object has semantics like "parse a disjunction" or "parse a sequence". Have the whole thing templated on a pointer to an output tree node type. Have the output of each parser object be a pointer to the tree node type but have each parser object take a boost::function for creating the output tree node from the composition of the output from the parser object's child parsers. So for example, to instantiate a sequence parser object you'd need to supply it with a function that takes a vector of pointers to tree nodes and returns a new tree node pointer.
I might get on board with a simpler Spirit type library. But I bet that at the beginning, Spirit was meant to be a simple, clean little library too, but to do the kind of stuff it does is just flat-out complicated. You'd have to have a rather cut-down feature list.

Of course, the other benefit of that would be that it would be easier to use. None of this typedefing three levels of iterators, scanners and rules. Just the basics.

Or maybe there should just be a simpler way of writing of plain recursive-descent grammars functional style. A couple support functions for pattern matching and tree generation, and let the user do the rest. Or we could combine them. Crap, idea overload...
Quote:Original post by theOcelot
I might get on board with a simpler Spirit type library. But I bet that at the beginning, Spirit was meant to be a simple, clean little library too, but to do the kind of stuff it does is just flat-out complicated. You'd have to have a rather cut-down feature list.


Yes, you're right ... In the sketch of a parser library that I tried to explain above, I am cutting down features. Namely, I'm saying that the output of each parser object -- where by "parser object" I mean the things that you compose together to construct your target grammar -- has to be a dynamically allocated instantiation of a class that derives from a base class that all of the parser objects take as a template parameter. You would be given a class hierarchy of parser object templates, would construct your grammar by composing instantiations of them, would provide a class hierarchy of output tree nodes, and would tell the parser objects how to create output tree nodes by supplying an appropriate boost::function object when instantiating a parser object. It would be verbose, yes, but, you know, so is spirit.

Spirit is a remarkable achievement and it has no drawbacks if your development box happens to be a very fast computer. But spirit is trying consciously to be a template metaprogramming library -- this was obviously one of the constraints that its designer put on it. What I'm saying is that if you don't mind using old-fashioned polymorphism/inheritance you could write something similar that may not be as clever but that would compile faster.

[Edited by - jwezorek on March 13, 2010 5:02:44 AM]
Verbosity is one of the things I don't like about Spirit, along with the related problem of difficulty of use. If you're not using the default iterator type, Spirit is a pain. Or at least it was; I don't know about the new version, but from what I've seen the new version isn't very easy either... Would you support different iterator types?

This topic is closed to new replies.

Advertisement