rapso

Tool to generate XML as an intermediate format

Hi, I'm looking for a tool, SDK or whatever to generate intermediate XML from a scripting language. The idea is described in http://www.idealliance.org/papers/xml2001/papers/html/03-05-04.html. This way I could add some specific stuff I need to a scripting language (like type checking) without writing the whole parser myself (or using Lex/Yacc etc.). Thanks, rapso

You need a parser (and probably a tokeniser) in the first place to derive the XML expression tree from the language. The example talks about writing a recursive descent parser. You can't avoid that stage - it all depends on the type of XML you want and the grammar you're using.
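
To make that stage concrete, here is a minimal sketch of a tokeniser plus recursive descent parser that emits an XML expression tree. The toy arithmetic grammar and the element names (`binop`, `num`, `var`) are assumptions made up for illustration; they are not taken from the linked paper or from any existing tool.

```python
import re
import xml.etree.ElementTree as ET

# One token per match: an integer, an identifier, or any other single character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")

def tokenize(src):
    """Split source text into (kind, text) pairs, skipping whitespace."""
    tokens = []
    for number, name, op in TOKEN_RE.findall(src):
        if number:
            tokens.append(("num", number))
        elif name:
            tokens.append(("name", name))
        elif op.strip():
            tokens.append(("op", op))
    return tokens

class Parser:
    """Recursive descent parser for a toy expression grammar (hypothetical)."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def binary(self, operand, ops):
        # Left-associative chain: operand (op operand)*
        node = operand()
        while self.peek() in [("op", o) for o in ops]:
            parent = ET.Element("binop", op=self.take()[1])
            parent.append(node)
            parent.append(operand())
            node = parent
        return node

    def expr(self):
        # expr := term (("+" | "-") term)*
        return self.binary(self.term, "+-")

    def term(self):
        # term := atom (("*" | "/") atom)*
        return self.binary(self.atom, "*/")

    def atom(self):
        # atom := number | name | "(" expr ")"
        kind, text = self.take()
        if kind == "num":
            return ET.Element("num", value=text)
        if kind == "name":
            return ET.Element("var", name=text)
        if (kind, text) == ("op", "("):
            node = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

def to_xml(src):
    """Parse src and return the expression tree serialised as an XML string."""
    parser = Parser(tokenize(src))
    tree = parser.expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return ET.tostring(tree, encoding="unicode")

print(to_xml("1 + x * 2"))
```

Everything above the last few lines of `to_xml` is specific to the grammar; only the final `ET.tostring` call has anything to do with XML.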

Really that article is just another example of XML fanatics trying to suggest that XML is useful everywhere, when it's not.

I just came up with that idea by myself (and found that site while googling for a tool). I don't want to be the 1001st guy who writes a parser etc.
That's why I hope there is already a parser/converter to XML. That's why people use XML, as a meta-format, and in this way it's very useful.

XML is a meta-format but you still need to decide the format. That would come from the parser you would write to get your language into XML. Nobody can write an arbitrary XML generator for a language that doesn't exist. Now, if you are referring to a specific scripting language that already exists, that might be different as someone may have written one before - but I doubt it, because generally speaking it would be useless...

Of course I'm talking about an existing language. It's not that useless: XML makes it easy to interchange data, and there is even a patch for g++ that exports the parse tree as XML.
But I only need it for a simple language (something like Lua or Ruby).

The point is that for this purpose, XML is useless to almost everybody, because 99% of applications do not need any 'interchange' between parsing the code and compiling or executing it. You parse the code into a representation that is good for compilation - you don't typically ever want to expand it into a bulky text format like XML, only to have to re-parse the XML back into an expression tree later. Even the paper you cite only claims that "tools for processing XML can reduce the time for compiler implementation", which is unlikely to be true, given that you still have to handle the XML tree at the end anyway.

If you google 'XML intermediate language' you may find some links, but really you're asking for something that nobody really needs or wants, so good luck.

I think the point is that you don't understand my point.

XML parsing is not the problem: there are a lot of libraries for that, and it's already done for other metadata (by many people).

And "tools for processing XML can reduce the time for compiler implementation" is true: implementing a parser, even with the help of a tool like yacc, is very time-consuming and error-prone, but getting the parse tree from an XML tree is very simple.

But I did not find any parser like this, so I will have to write a parser once again :-/

Quote:
Original post by rapso
And "tools for processing XML can reduce the time for compiler implementation" is true: implementing a parser, even with the help of a tool like yacc, is very time-consuming and error-prone, but getting the parse tree from an XML tree is very simple.


Only if you already translated a language to the correct tree structure in the first place, which requires a language specific lexer and parser. Whether you then output that as XML or as some other data makes no difference to anything really. Once you have that tree structure, the hard work is already done. This is why nobody uses XML for this.
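
As a rough illustration of why the output format matters so little once the tree exists, the sketch below writes one and the same in-memory tree out as XML and as JSON. The tuple layout and the names `TREE`, `as_xml` and `as_json` are hypothetical, chosen only for this example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical parse tree for "1 + x": (node kind, attributes, children).
TREE = ("binop", {"op": "+"}, [
    ("num", {"value": "1"}, []),
    ("var", {"name": "x"}, []),
])

def as_xml(node):
    """Convert the tuple tree into an ElementTree element."""
    kind, attrs, children = node
    elem = ET.Element(kind, attrs)
    for child in children:
        elem.append(as_xml(child))
    return elem

def as_json(node):
    """Convert the same tuple tree into plain dicts and lists."""
    kind, attrs, children = node
    return {"kind": kind, **attrs, "children": [as_json(c) for c in children]}

print(ET.tostring(as_xml(TREE), encoding="unicode"))
print(json.dumps(as_json(TREE)))
```

Both writers are a handful of lines; the effort sits in producing `TREE` from source text in the first place.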

Quote:
Original post by Kylotan
Quote:
Original post by rapso
And "tools for processing XML can reduce the time for compiler implementation" is true: implementing a parser, even with the help of a tool like yacc, is very time-consuming and error-prone, but getting the parse tree from an XML tree is very simple.

Only if you already translated a language to the correct tree structure in the first place, which requires a language specific lexer and parser....Once you have that tree structure, the hard work is already done. ..
Exactly, that's my point. I want to avoid this hard, error-prone work, because it has already been done thousands of times by other people.




Quote:

... Whether you then output that as XML or as some other data makes no difference to anything really. ... This is why nobody uses XML for this.

The difference is that XML is a standard format, easy to write and read with several free libraries (and even writing your own XML parser is easier than writing one for human-written scripts/text).
And yes, nobody seems to output their parse tree as XML, but that would make life a lot easier for people who need their own special language, because most of those language-specific features don't apply to the parsing end but to the later processing; e.g. in my case I need type and resource checking at compile time.
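
A minimal sketch of what such a compile-time check over an XML parse tree could look like, assuming some front end has already produced the XML. The element names (`binop`, `num`, `str`, `var`) and the rule that both operands must share a type are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

def check(node, env):
    """Return the type of an expression node, or raise TypeError."""
    if node.tag == "num":
        return "number"
    if node.tag == "str":
        return "string"
    if node.tag == "var":
        name = node.get("name")
        if name not in env:
            raise TypeError(f"undeclared variable {name!r}")
        return env[name]
    if node.tag == "binop":
        # Assumed rule: a binary operator needs exactly two operands of one type.
        left, right = (check(child, env) for child in node)
        if left != right:
            raise TypeError(
                f"cannot apply {node.get('op')!r} to {left} and {right}"
            )
        return left
    raise TypeError(f"unknown node <{node.tag}>")

# Hypothetical parse tree for "1 + x", as an XML-emitting front end might write it.
tree = ET.fromstring('<binop op="+"><num value="1"/><var name="x"/></binop>')

print(check(tree, {"x": "number"}))
try:
    check(tree, {"x": "string"})
except TypeError as err:
    print("type error:", err)
```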

XML is standard and easy to read but there is almost no demand for data exchange at this point in the parsing process, therefore all the benefits of XML are wasted. You're wrong - language specific stuff is absolutely paramount to the parsing stage, as well as the tokenising/lexing stage, and the execution stage. Pass one string to two different languages and you will receive a completely different tree. The tokens are different, the grammars are different. It's all language-dependent so there is little point having a generic data exchange half-way through the chain.

Is the point that there is already an existing parser and bytecode-compiler/interpreter for the language you want to use, but you want to add a bit in the middle to do some extra processing (to add new lint-like warnings, or stricter type checking, or replacing "+" with "String.concat", or whatever)?

If so, then it seems the fundamental problem is that the language implementation is inextensible, and XML doesn't have to be part of the solution - it would be just as good if you could write a module that's compiled in with the rest of the compiler and which is given some data structure representing the parse tree / AST of the program. Then you could do whatever manipulations you'd like, and you would avoid the effort of having to write XML code for your own external program to import the XML-ised tree into an in-memory tree and then write it out again.
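
As one example of an implementation that does expose its tree in memory, Python's standard `ast` module hands the parsed program to ordinary code, so an extra lint-like pass needs no serialisation step. The sketch below is only an illustration of that style of hook; the class `MixedAddLint`, the function `lint` and the warning text are made up for this example.

```python
import ast

class MixedAddLint(ast.NodeVisitor):
    """Warn when a string literal is added to an integer literal."""

    def __init__(self):
        self.warnings = []

    def visit_BinOp(self, node):
        if isinstance(node.op, ast.Add):
            # Collect the literal types on both sides of the "+".
            kinds = {
                type(side.value).__name__
                for side in (node.left, node.right)
                if isinstance(side, ast.Constant)
            }
            if kinds == {"str", "int"}:
                self.warnings.append(f"line {node.lineno}: adding str and int")
        # Keep walking so nested expressions are checked too.
        self.generic_visit(node)

def lint(source):
    """Parse source with the stock parser and run the extra pass over the AST."""
    checker = MixedAddLint()
    checker.visit(ast.parse(source))
    return checker.warnings

print(lint('x = "a" + 1\ny = 1 + 2\n'))
```

The pass works directly on the tree the existing parser built, which is the arrangement described above.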

Compilers ought to already work somewhat like that, passing data through well-defined interfaces to the parser, type-checker, semantic-manipulator, bytecode-generator, interpreter, etc, so you could hook in new stages in the middle. I have no idea if many are actually cleanly designed in practice, so maybe that's impossible.

If you could hook into the compiler like that, then you (or it) could serialise the AST through XML and use an external tool to process it, but I wouldn't see much benefit compared to just writing that tool as part of the compiler itself.

Or I might be misunderstanding the point [smile]

Quote:
Original post by Excors
Is the point that there is already an existing parser and bytecode-compiler/interpreter for the language you want to use, but you want to add a bit in the middle to do some extra processing (to add new lint-like warnings, or stricter type checking, or replacing "+" with "String.concat", or whatever)?

There is just a VM and an assembler-compiler. I don't have a parser yet for a human-readable scripting language. But yes, I want a simple scripting language like those thousands that have been implemented before, but with my special stuff added, like type checking and validation for all kinds of things at compile time, because my game (engine) shouldn't crash due to an unhandled issue at runtime.

Quote:

If you could hook into the compiler like that, then you (or it) could serialise the AST through XML and use an external tool to process it...[smile]
Yes, I already thought about that; this way I could exploit e.g. the Lua parser. But it's somewhat unkind to just rip out their parser for my own benefit without permission (that's the reason why the XML parse-tree exporter for GCC is not part of GCC but just a patch).

thank you for your time :)
